Images communicate a lot more than words. Computer vision has the potential to impact almost every space we live in, by analyzing, processing and deriving usable information from images to address a wide range of applications in Customer Service, Medical Diagnosis, Manufacturing, Monitoring Urban Resources, Green Cover and Pollution Levels.
Mohana Roy Chowdhury, our next pathbreaker, Data Scientist, works on a number of problems related to OCR (Optical Character Recognition) and digitization of documents to make the process of semiconductor manufacturing more foolproof.
Mohana talks to Shyam Krishnamurthy from The Interview Portal about taking up Computer Vision because it brings together all her areas of interest – applied maths, programming and logic, along with a diverse spectrum of real world challenges to solve and learn from.
For students, focus on hands-on experience with technology during your college years, experiment with subjects and domains to get an understanding of what to expect during the later years.
Mohana, tell us about your background?
I grew up in a small town called Shibmandir, located in Darjeeling district. My father was employed with a Cooperative society and my mother was employed as a biology teacher. Up until middle school my performance was fairly average. Around 8th standard, my performance suddenly improved and I started ranking close to the top of my class, probably due to my growing interest in the areas of Physics, Maths and Computer programming (C++ and Java back when I sat for my ICSE). I thoroughly enjoyed the process of writing algorithms and playing around with code, finding alternative approaches for problems and testing them out on my own. Apart from school, I was enrolled in Fine Arts, Hindustani classical music and Girl Scout training camps, and was a regular in annual adventure camps that taught the basics of hiking. After 10th I spent a few years in Kota, preparing for IIT-JEE. After my 12th boards I took a drop year from 2010 to 2011, and I joined NIT Raipur in 2011. My life underwent a lot of changes between 2008 and 2011 but the only thing that remained constant was my interest in Physics, Programming and Math.
What did you do for graduation/post graduation?
I graduated with a degree in Biomedical engineering. I was fairly well informed about the multidisciplinary nature and opportunities in the branch before taking it up as my field of education. The subjects I studied in college allowed me to explore different subdomains in the field and my co-curricular activities in my early years of college included participating in technical competitions and tech fests to get more hands-on experience as an engineering student. I started with combat robotics, got my hands dirty with embedded systems and biomedical instrumentation related activities, and finally was introduced to Machine Learning and Computer Vision by my Professor Dr Bikesh Singh who invited me to work along with him on a Classification project for Breast Cancer Mammograms, which sparked my interest in the field. After studying more about Medical Image processing as a part of my curriculum, I made up my mind about pursuing this field.
What made you choose such an offbeat, unconventional and rare career?
My career choice has been driven by a lot of factors. One would be the fact that it puts together all my areas of interest: it includes a lot of interesting math, and there is a lot of programming and logic, which have been a constant throughout my life. Based on what topic I am learning, I bounce between subjects like probability, optics and math and it is something I thoroughly enjoy learning more about. I was really fortunate to have mentors and teachers who always kept these subjects fun and interesting right from school to JEE coaching classes and college. I was able to find mentors who encouraged me to explore my ideas, and colleagues and seniors from whom I have learnt a lot and who helped me stay interested.
Another thing that probably moved me to the field of computer vision and ML during college was accessibility. For robotics and embedded systems, sourcing components and finding workshops with appropriate equipment were quite difficult. The availability of open source resources in the field of Computer vision helped a lot in getting started with things.
Tell us about your career path
From the time I decided to make a career in Computer vision, I tried to get involved in as many projects as I could. The industry wants people who not only understand the concepts but are also able to solve problems efficiently. Courses are definitely important but the problems that you solve there do not address the full spectrum of the challenges you encounter while building an end-to-end solution. While participating in competitions or working on papers and projects, the problem started right from data collection and ended with building an end-to-end, demo-able pipeline. Apart from the competitions and side projects with faculty members, I also freelanced during my college years, building solutions that needed Computer vision, ML or embedded systems related knowledge. This taught me to envision a full solution and plan out the processes independently, and this learning is still useful after almost 6 years of working as a Data Scientist in the industry.
The first set of problems I worked on were based on use cases that required real time human detection and tracking, as well as expression detection, gender detection and age detection, and the implementations were deployed on edge devices like the Raspberry Pi. Many of the problem statements revolved around the understanding of customer behaviour. For instance, the human detection and tracking based project was designed with the idea of understanding the average time spent by people in queues. Age and gender detection were meant as tools to understand demographics, and the expression detection module, for example, could be leveraged as a means to understand customer satisfaction levels.
After my first switch, I joined a startup that focuses on solutions for smart cities. I initially worked on surveillance-based problems and later moved to satellite image processing, which involves concepts of GIS and image processing. There were a lot of interesting problems to solve and the learning curve was very steep. Some of the problem statements we were working on involved crowd counting, theft detection, waste management and restricted area trespassing alerts using CCTV footage. Data quality was often an issue since there are challenges with respect to the setting up of cameras, camera angles and image quality. Real time results were also a necessity in surveillance-based problems, especially when theft detection and monitoring restricted areas were the objectives.
The satellite imagery based use cases were meant for profiling urban resources. Understanding satellite data requires a lot of domain specific knowledge and there are multiple processing steps required to translate the raw satellite data into usable information. The objective of our work was to automate the internal processing steps and allow a user without domain knowledge to use the tool for urban area profiling or their decision-making processes.
In my third organization, I worked as a senior data scientist, overseeing different teams involved in client POCs. The projects we were working on revolved around the application of technology to tackle some of the prevailing problems in agriculture, such as pest and disease identification and yield estimation/prediction.
Currently I have been working on a number of OCR related problems and semiconductor design and fabrication related problems. Solutions range from the application of pure Computer vision methods to Deep Learning.
While a lot of these topics might seem unrelated at first, it all boils down to the same concept: deriving usable information from image data. Throughout my work experience, and even during my freelance and project work during college, the only things that varied were the exact problem statement and the domain specific information. At the core of it, it was always about making sense of what is being represented in the image and how to make use of it.
How did you get your first break?
I joined my first job through campus placement. I joined as an analytics professional but I had an idea about ML/CV (Machine Learning/Computer Vision) based opportunities because I was asked some related questions during the final interview round and I had also discussed the opportunities in that line. Post joining, I interviewed for a position with the R&D team, where the work focused on exploring innovative solutions using different subdomains of Data Science like ML, CV, NLP etc. Here I ramped up on my Python skills and started applying them to build end-to-end solutions for a lot of interesting problems like human detection and tracking, facial expression and gender detection, heart rate monitoring etc. I was incorporating a lot of learning from my college curriculum and I was also learning about the process of solving real world problems where data quality was a major challenge.
What were the challenges you faced? How did you address them?
Lacking a formal education in CSE was not really a challenge, but more of a minor setback. It was easily compensated for, as all the resources for learning and ramping up on any subject are readily available online.
Sometimes we have to deliver on a lot of unrealistic expectations with limited availability of data. Often startups demand a much quicker output, which is practically very difficult due to hardware and resource related challenges. Such problems can often be addressed by having clear discussions and working on smaller subsets of the project to fully understand the limitations and explore possible alternative solutions that can function as a feasible replacement for the original requirement.
Where do you work now? Tell us about your work
Currently, the projects I have been working on cover a wide spectrum, from digitization of documents to measuring and inspecting defects in semiconductor parts. A good understanding of Computer Vision and Deep Learning is a must to work on such problems. Since we work on end-to-end processes, it is important to understand the whole process: doing exploratory work and small experiments to map out a range of possible solutions, identifying possible limitations and challenges and ways to address them, and building a pipeline for the solution right up to the final phase of deployment.
What I love about my job is the wide range of different problems that fall on my plate. Irrespective of the number of projects I have worked on previously, there is always some new learning and I don’t feel stagnant with the problem statements I have to tackle.
How does your work benefit society?
Covid has taught us the importance of technology in our day to day lives. And even in pre-covid days, electronic equipment has been the backbone of our current lifestyles. Right from the medical industry to smart homes, the semiconductor industry has played a central role in revolutionizing the way we function. In my current line of work, I get to participate in the process of making the manufacturing of semiconductors more foolproof. Some of the direct and indirect impact of the work we do involves reducing wastage of resources and improving the output of manufacturing units.
Besides this, computer vision has the potential to impact almost every space we live in. Medical image processing can help save lives. Remote sensing and satellite image processing play a major role in monitoring urban resources, green cover and pollution levels. Application of computer vision in agriculture can help predict pests and diseases and help farmers in estimating the yield of their farms and taking necessary steps to tackle problems like poor harvest and crop diseases. Surveillance based solutions can aid the crime investigation process. Driving assistance systems can help in preventing road accidents. So I feel that this is a field with a wide spectrum of really interesting problem statements that need a lot of exploration and can really impact lives for the better.
Tell us an example of a specific memorable work you did that is very close to you!
My most memorable work was the satellite image processing experience for smart cities. We were working on demos for vegetation layer, water level, built-up area etc., and on another problem statement that involved using Deep learning for land use and land cover analysis. When we were exploring this, the usage of Computer vision and deep learning on low resolution satellite images was not that common and the resources available online were slightly limited. The analysis we generated let us actually see the impact of vegetation loss in a lot of cities globally. Reading about it in news articles is one thing, but actually working on a problem statement like this and witnessing the actual situation with your own eyes was a different experience in itself. We also experimented quite a bit for the Deep learning based “land use/land cover” problem, which was very exciting since it involved building the model from scratch.
Your advice to students based on your experience?
Be very honest with yourself about your interests and goals in life. There are always opportunities in whatever line you want to pursue. We spend a major portion of our day working so I feel it is very important to pick a domain that you really enjoy learning and working on. If you do something you enjoy, you’ll always find new opportunities opening up for you and you will always find places to learn and grow. Do a good amount of research before picking a college subject, because if you join with prior knowledge, you’ll be able to use your learning and experience from your college years further down the line. And focus on hands-on experience with technology during the college years themselves; experiment with subjects and domains to get an understanding of what to expect during the later years.
Future Plans?
I intend to explore the other subdomains in the field of Computer vision as well. Each subdomain needs a good understanding of the core concept along with domain specific learnings which are really interesting and blur the borders that we have created to separate out subjects.