When data has a life of its own, even the most effective algorithms can fall short in identifying patterns because of its complexity and unpredictability. And there is almost no precedent to draw on, because we are talking about human DNA.
Our next pathbreaker, Avantika Lal, KVPY Fellow and Post-Doc from Stanford, uses cutting-edge deep learning techniques to decipher human DNA through genome sequencing, looking for patterns in patients with diseases such as cancer and for proteins that could serve as effective drug targets.
Avantika talks to Shyam Krishnamurthy from The Interview Portal about her work applying deep learning algorithms to human genomic data to identify biological markers.
Avantika, tell us about your background.
I grew up in Delhi. I remember being interested in the natural world as a child, observing plants, insects, and birds and trying to understand their behavior. As I grew older this became an interest in studying biology at school.
In the 11th standard I studied a unit on genetics which was eye-opening – it explained how the seemingly magical properties of all living things, including humans, are controlled by their DNA through basic molecular and chemical processes. Although we see a huge diversity of living things in the world, science reveals that all plants, animals, and other species actually share the same mechanisms of life through the DNA molecules in their cells.
What did you do for graduation/post graduation?
I studied biochemistry at Sri Venkateswara College, Delhi University. After graduation I was accepted into the Integrated Masters-PhD program in Biology at NCBS (National Centre for Biological Sciences), Bangalore.
How did you end up in such an offbeat, unconventional and uncommon career?
While in school, I made it a point to read news articles about biology. I also used to visit a library to read the journal Nature, which has a very well-written and understandable news section covering the latest developments in research science.
Through this reading I learned that genetics was at a very early stage in its development compared to other sciences. Scientists had worked out the most basic laws of DNA, but its much deeper complexity was only beginning to be understood. At the same time, the applications of this science – for example, in gene therapy and genetic engineering – had just started becoming reality. It seemed to be a scientific field where many great discoveries were ready to be made.
In the 11th standard I was selected for the CSIR CPYLS program, which allowed me to do a summer internship at the Institute for Genomics and Integrative Biology (IGIB), Delhi. This gave me a taste of a research project and of how new discoveries are really made.
While in school I also qualified for another government program, called KVPY, through a nationwide exam. This scholarship funded my college education and also gave me funds to support doing a research internship every year.
Tell us about your career path.
With the KVPY funding, I applied to professors at leading national institutions (IISc, TIFR, NCBS) for summer internships every single summer of my undergraduate degree. I spent the first summer interning at IISc, Bangalore and the next two summers at NCBS, Bangalore. These internships exposed me to different branches of biology, and also allowed me to meet PhD students and postdocs who became my mentor group and helped me plan my next steps. Each internship was more challenging than the previous one. Some of my mentors gave me great independence and allowed me to design and plan my own experiments.
After completing my undergraduate degree I moved to NCBS, Bangalore for my PhD, where I studied the genetics of bacteria. Scientists have discovered that bacteria can survive for years in very harsh environments, like extreme starvation, heat and cold. Using recently developed tools to study the activity of all genes in a bacterial cell at the same time, I was able to discover some of the mechanisms by which bacteria adapt to stressful environments. My work showed that several of these mechanisms act upon the DNA – for example, the structure of DNA itself changes when cells change their behavior.
During my PhD I continued to read widely in the field and keep track of developments in biotechnology and genetics.
While there is some good academic research in India, research there in general lags far behind that in developed countries. There is also very limited opportunity in the biotech industry, in contrast to global hubs like Boston and San Francisco, which are home to many innovative startups.
One reason for this is that in India, educators have an outdated view that biology is not connected to mathematics and computer science, and many students even study biology without mathematics. But globally, genetics was becoming more of a data science and had started to use data analysis techniques such as machine learning. Machine learning is a group of algorithms that allow computers to analyze vast amounts of data and learn patterns and insights that a human could not find. These technologies had started to produce critical new results in biology, with many medical and scientific applications.
At the time, I had no formal training in programming or machine learning and very little in statistics. I taught myself these subjects through books and online courses. Coursera was a great resource, particularly the machine learning and deep learning courses of Andrew Ng.
I moved to the USA after completing my PhD and started a postdoctoral fellowship at Stanford University, where I could explore these interdisciplinary topics. In my postdoc I used machine learning to analyze genetic data from cancer patients. I developed machine learning algorithms that could analyze the genetic data of these patients, and, using what they learned, could predict the aggressiveness of their cancer, the origin of their cancer, and the patient’s likely response to treatment. Amazingly, these algorithms delivered very good results.
How did you get your first break?
Stanford is located near San Francisco and Silicon Valley, which are hubs for tech companies – many of which are now seeking to apply machine learning algorithms to medical and genetic data to develop new products. A little over a year into my postdoc, I was recruited by NVIDIA, a technology company, as one of the first employees on their new biology research team.
What were the challenges? How did you address them?
Finding a job in industry: Finding a good job is not just about your qualifications but also has a lot to do with your connections. As an immigrant without a large network in the USA, I had to work extra hard to make these connections. I spent a lot of time connecting with other scientists on LinkedIn, and at career events at Stanford. I also contacted and met Stanford alumni working in interesting companies.
Visas are one of the greatest challenges that Indians who move to the USA face. I found that once I had relevant research experience, many companies were interested in talking to me – but they could not or would not apply for the visa that would allow me to work for them. I lost count of the number of times I heard, “we would like to hire you if it was not for the visa issue…” Fortunately I had planned for this and had started my job search well in advance. I kept up my search for many months and eventually I found some companies that were willing to help me get a visa. This is a very difficult situation to be in, and many Indians in the USA spend a long time stuck in their careers because of visa issues.
Where do you work now?
I work as a research scientist on the genomics team at NVIDIA. Our team applies machine learning to improve human genome sequencing. Genome sequencing is the process of deciphering the unique DNA molecule of a cell. Using machine learning, we can now sequence DNA from any human being, faster, better and cheaper than was possible before.
I work with large amounts of genetic data describing the DNA of people – both healthy people and people suffering from diseases such as cancer. I then build machine learning models to learn useful patterns in these vast datasets. Using the insights learned by these models, I look for new ways to improve DNA sequencing. In the process, I collaborate with researchers in other companies and leading universities. My work requires expertise not only in biology, but also in statistics and programming.
How does your work benefit society?
Human genome sequencing is now used for diagnosing diseases, treating diseases, and developing new medicines. In agriculture, sequencing the genomes of plants and animals is now routinely done to help develop better crops and livestock. The research I do has the potential to advance all of these fields.
Tell us about a specific, memorable piece of work that is very close to you!
At Stanford, I worked briefly with researchers who were trying to develop better treatments for malaria. Using machine learning, we were able to identify over a hundred proteins that were potential drug targets in malaria. The amazing thing is that we were able to get a solution to this problem in a very short time, just days. Data analysis and artificial intelligence have tremendous potential in biology and it won’t be long before we see new treatments and disease prevention emerging from these technologies.
Your advice to students based on your experience?
Read widely. Read about what is happening in your field all over the world and keep track of new developments. This will help you figure out what you’re interested in, and also help you figure out good career moves.
Internships are an amazing opportunity to learn, in college and even in school. If you are interested in science or engineering, search for an opportunity to spend a few weeks or months in a research lab.
Any theoretical subject you want to learn, be it mathematics, science, statistics, programming – or even English language and writing – you can learn online. There are world-class online courses, many of which are free to students. Some of the leading universities in the world, such as MIT, have started putting their courses and lectures online. So you can learn far beyond what is taught in your school or college.
Do not limit yourself, especially while you are young. Find opportunities to travel and study abroad. It is important to know that it is much easier to go to another country while you are still a student; emigrating after completing your studies is much more difficult. If you are thinking of studying/working in a country, educate yourself about their laws – will you be able to work there after you finish your studies? Will you face restrictions?
Future Plans?
I’m currently working on a blog to share some of my experiences in biology and help non-experts understand current research. Hopefully this will be useful to students as well.