{"id":171,"date":"2017-06-30T11:06:23","date_gmt":"2017-06-30T15:06:23","guid":{"rendered":"https:\/\/dwan.org\/?p=171"},"modified":"2019-10-25T15:16:13","modified_gmt":"2019-10-25T19:16:13","slug":"the-ever-gathering-storm","status":"publish","type":"post","link":"https:\/\/dwan.org\/index.php\/2017\/06\/30\/the-ever-gathering-storm\/","title":{"rendered":"The ever gathering storm"},"content":{"rendered":"<p>It\u2019s summertime \u2013 season of thunderstorms.  Most days are punctuated with ominous clouds and distant thunder.  Actual rain, however, is rare.  The forecast is consistent \u2013 temperatures may spike up to uncomfortably hot in the afternoon, and there are low odds of a thunderstorm.  I carry an umbrella all day, and then water my garden by hand.<\/p>\n<p>It reminds me of our industry-wide set-piece about how genomic data is so terribly huge (and growing so incredibly fast!) that it\u2019s going to overwhelm everything.<\/p>\n<p>We\u2019ve been living in the shadow of a tidal wave of data for more than 10 years.  Honestly, it\u2019s a little awkward that we\u2019re still sounding the alarm.<\/p>\n<p>The first time the phrase \u201cdata tsunami\u201d appeared in my slides was in a presentation from 2007.  That was when the first wave of so-called \u201cnext-gen\u201d DNA sequencing instruments was really coming into its own.  Those instruments increased the velocity of DNA sequencing by around three orders of magnitude.  They <i>also<\/i> reduced the per-base costs of sequencing by an <i>independent<\/i> three orders of magnitude.  Taken together, we experienced about a millionfold increase in the rate of data production.<\/p>\n<p>We observed at the time that this rate increase was in excess of Moore\u2019s Law.  Now, as genomic diagnostics and precision \/ personalized medicines finally make their way into the clinic, we\u2019re making the <i>same<\/i> observation today.  
While it\u2019s flattering to hear brag words like \u201cgenomical,\u201d it\u2019s also a bit misleading.<\/p>\n<p>Because you know what?  We kept up before, and we\u2019ll keep up now.  I think that we\u2019re actually <i>better<\/i> prepared for this decade\u2019s data deluge than we were for the last one.<\/p>\n<p>Sure, there was blood, sweat, and tears \u2013 that\u2019s the <i>job<\/i> of engineering.  We changed and adapted untenable practices \u2013 including choosing to discard the raw output images from the high resolution cameras on the new sequencers.  Instead we stored only the information that was actually useful to the scientists \u2013 at the time it was base pairs and quality scores from all the reads.  That idea was a <i>fight<\/i> at the beginning.  I recall hours of conversation with scientists incredulous that I would suggest that <i>any<\/i> data could <i>ever<\/i> be deleted.  Today, you can\u2019t even get the raw images off of the sequencers.<\/p>\n<p>We upgraded the infrastructure of biology facilities for the genomic age.  We planned and built high performance network connections all the way out to laboratories.  We consolidated data-producing instruments into \u201ccores,\u201d provisioned with infrastructure to handle the network and data storage load.  We shifted servers and storage out of aging lab buildings and into co-located data centers.  We combined independent compute farms into time-shares on integrated high performance computing environments. We worked out cost recovery schemes to make sure that it was sustainable.  As public and private clouds have matured, we\u2019ve continued to evolve, and I\u2019m sure that we will continue to do so.<\/p>\n<p>We also upgraded our human relationships.  We forged partnerships with the technologists who build data storage, network, and computing systems.  
Together, we adapted the tools and techniques already in use in media and entertainment, finance, and other industries to be better fits for the challenges of science.  We sent computer science students to biology journal clubs, and vice-versa, and eventually recognized \u201cbioinformatics\u201d and \u201ccomputational biology\u201d as important specializations in their own right.<\/p>\n<p>We have a decade of trust, education, and mutually beneficial work to build on.<\/p>\n<p>So while it is certainly flattering to hear people proclaim that \u201cgenomical\u201d is a better adjective than \u201castronomical\u201d to describe rapid data growth, I\u2019m not convinced that it\u2019s cause for anything other than enthusiasm.  A decade ago it was Terabytes of genomic sequence data for research.  Now it\u2019s Petabytes, or even Exabytes, of patient records for precision medicine and genomic diagnostics.<\/p>\n<p>We\u2019re gonna be fine, people.  Sure, carry an umbrella, but think of it as \u201crainbow weather.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It\u2019s summertime \u2013 season of thunderstorms. Most days are punctuated with ominous clouds and distant thunder. Actual rain, however, is rare. The forecast is consistent \u2013 temperatures may spike up to uncomfortably hot in the afternoon, and there are low odds of a thunderstorm. 
I&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37,40],"tags":[],"class_list":["post-171","post","type-post","status-publish","format-standard","hentry","category-genomics","category-storage"],"_links":{"self":[{"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/posts\/171","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/comments?post=171"}],"version-history":[{"count":3,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/posts\/171\/revisions"}],"predecessor-version":[{"id":1164,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/posts\/171\/revisions\/1164"}],"wp:attachment":[{"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/media?parent=171"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/categories?post=171"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dwan.org\/index.php\/wp-json\/wp\/v2\/tags?post=171"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}