The Three Musketeers of Genome Sciences and Bioinformatics
The 19th and 20th centuries were an era of scientific discovery. The physical, chemical and mathematical sciences flourished right from the beginning of this era. The biological sciences, however, lagged far behind and were regarded as merely descriptive. By the time biologists had recognized the cell as the building block of every living organism, significant progress had already been made in electromagnetism and thermodynamics. In the early 20th century, when the two most important theories in the biological sciences, Darwin’s theory of natural selection and Mendel’s theory of inheritance, had only just started to be widely recognized, the physical and chemical sciences had already produced Bohr’s atomic model, quantum theory and the theory of relativity.
It was not until 1953, when Watson and Crick worked out the double-helix structure of deoxyribonucleic acid (DNA), that the entire world turned its attention to the mysterious realm of biology. Their model suggested how the hereditary material, DNA, is passed on from parents to offspring and how it is able to carry the information for all the life processes in a living organism. Together with the concept of the Central Dogma, this transformed biology from a descriptive science into a “happening” science. However, humans could never have achieved this without the progress made in the physical, chemical and mathematical sciences.
In fact, it was the human ability to combine different approaches to solve a puzzle that led us from the era of scientific discovery to the era of scientific mastery. Genomics and bioinformatics are two closely related disciplines that have helped us on our journey to understand the life among and around us. Early work in bioinformatics dates back to the late 1960s, but even until the early 1990s experts working in this field were known as “computational molecular biologists” (Bioinformaticist Vs Bioinformatician). One of the early bioinformaticians and a renowned figure in the field, Lincoln Stein, once announced at a conference that bioinformatics would cease to exist as a separate discipline within ten years (i.e. by 2012). He later came to regret his statement and asserted that bioinformatics is not disappearing but is stronger than before and will grow stronger still (Stein, 2008).
There is no doubt that science news from the late 20th century until now has been dominated by the genome sciences. However, it is not just the “pure” biologist who is driving this field: new breeds of biologists are sprouting, each with a distinct set of skills. For example, Dr. Jim Kent, who initially wrote programs for computer games, eventually helped solve the biggest computational hurdle of the Human Genome Project. (Another interesting article relevant to this topic is “How Perl Saved the Human Genome Project”.)
The following three breeds of experts are visible in the field of genome science and bioinformatics (although the differences between their areas of work are subtle).
- Genomicist: These are biologists who focus on “-omics” studies. Genomicists like to divide their time between the wet and dry labs. They are more concerned with high-throughput analysis and have fairly good skills in bioinformatics.
- Bioinformatician: Generally, a bioinformatician has a strong background in molecular biology and genomics as well as good skills in mathematics, statistics and computer science. Bioinformaticians are mostly interested in high-throughput data analysis and exploit bioinformatics tools to solve key biological problems.
- Computational Biologist: These experts have a very strong background in mathematics, statistics and computer science but only a fairly good background in biology. They are mostly involved in algorithm design and the development of various bioinformatics tools.
The scientific landscape is gradually changing from hypothesis-driven science to data-driven science. Since the introduction of 454 pyrosequencing technology in 2005, which marked the beginning of the Next Generation Sequencing (NGS) era, bioinformaticians have been inundated with sequencing data. Just a few days ago, on February 17, 2012, a groundbreaking sequencing technology from Oxford Nanopore was announced at the Advances in Genome Biology and Technology (AGBT) conference. It created a big buzz all over social media, and that morning everyone in our department was talking about it. A miniaturized sequencing device called MinION, which resembles a USB flash drive in size and shape, is expected to bring the cost of sequencing down to $900. This breakthrough could put genome sequencing right on the doctor’s table by the beginning of next year and revolutionize the way we deliver health care. On the other hand, it will push the volume of sequence data to even more unimaginable heights (Blogs on MinION: Pathogens: Genes and Genomes; genomes unzipped).
The bottleneck, however, is not the sequencing technology. Computational challenges such as data analysis, visualization and integration and, above all, the skills of bioinformaticians have become the rate-limiting step (Green ED et al., Nature 2011). Bioinformaticians are still struggling to interpret these huge volumes of data. There are several hardware challenges too. Accessing and analyzing huge datasets requires large amounts of data to be moved frequently across a ‘cluster’, a ‘grid’ or even a ‘cloud’. One noted challenge is readily visible at the world’s largest genome sequencing center, the Beijing Genomics Institute (BGI). BGI has been copying sequence data onto computer disks and mailing them to its collaborators via a courier such as FedEx. Transmitted over the internet, the voluminous data could take weeks to transfer (DNA Sequencing Caught in Deluge of Data).
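To get a feel for why shipping disks can beat the network, here is a rough back-of-the-envelope sketch in Python. The dataset size, link speeds and two-day courier time below are purely illustrative assumptions, not figures reported by BGI, and the function name `transfer_days` is just a hypothetical helper for this post.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(size_terabytes, link_megabits_per_second):
    """Days needed to push a dataset through a network link.

    Assumes decimal units (1 TB = 10**12 bytes, 1 Mbit = 10**6 bits)
    and a link that sustains its full nominal speed, which real
    long-distance connections rarely do.
    """
    total_bits = size_terabytes * 10**12 * 8
    bits_per_second = link_megabits_per_second * 10**6
    return total_bits / bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    dataset_tb = 10    # hypothetical size of one sequencing project
    courier_days = 2   # hypothetical door-to-door shipping time
    for mbps in (20, 100, 1000):  # hypothetical sustained link speeds
        days = transfer_days(dataset_tb, mbps)
        winner = "courier" if days > courier_days else "network"
        print(f"{mbps:>5} Mbit/s: {days:6.1f} days -> {winner} wins")
```

Under these assumed numbers the slowest link needs well over a month for a transfer that a courier completes in a couple of days, which is the trade-off described above. Real transfers are usually slower still, since links are shared and rarely sustain their nominal speed.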
As portrayed in Alexandre Dumas’s famous novel The Three Musketeers, the three friends Athos, Porthos and Aramis live by the motto “all for one, one for all” (“tous pour un, un pour tous”). Similarly, the three musketeers of the genome sciences, the Genomicist, the Bioinformatician and the Computational Biologist, all work for the same cause with different sets of skills, each of which is vital to solving the major problems outlined above.