Conversational AI is a game changer, thanks to speech assistants that are getting better and better at understanding the nuances and intricacies of the language spoken by humans.

Pavankumar Dubagunta, our next pathbreaker, Staff AI Scientist at Uniphore, builds and improves speech-based solutions in the Conversational AI space, with a primary focus on Speech Recognition.

Pavan talks to Shyam Krishnamurthy from The Interview Portal about his PhD (Electrical Engineering) at Idiap Research Institute and EPFL (Switzerland) on various aspects of speech that a human listener can judge apart from the spoken message.

For students, there is tremendous potential in developing speech recognition & assessment technologies for several Indian and world languages that are serving people across the globe.

Pavan, your background?

I was born and brought up in Nellore, a town in Andhra Pradesh. At school, I was good at studies and co-curricular activities, but not very good at sports and physical activities. I had a keen interest in Maths and Sciences, and had the freedom to explore my interests without pressure to perform in exams. After secondary school, I ended up at a coaching centre cum junior college, where competitive exams were the sole focus. I skipped a large part of the classes, studied on my own and managed to get a decent score in the state-level engineering entrance test.

What did you do for graduation/post-graduation?

I chose Electronics and Communication Engineering out of interest. College gave me the resources and opportunities to develop my inquisitiveness. The areas of Signal Processing and Communication Systems fascinated me. Towards the end of graduation, I chose to pursue higher studies and explore my interests rather than start a career in the industry. Through entrance tests and interviews, I got admitted to a research-based master's programme at IIT Madras.

What were some of the influences that led you to such an offbeat, unconventional and unique career?

I was introduced to Speech Processing during my master's. This is a cross-disciplinary field that involves Signal Processing combined with a newly discovered favourite, Machine Learning. I worked towards a thesis on Speech Recognition. Specifically, my thesis focused on approaches that improve speech recognition performance in the presence of background noise. My advisor and seniors helped me develop the required skills throughout this journey. Pursuing interesting work and being surrounded by kind and creative people made me continue in this career direction.

How did you plan the steps to get into the career you wanted? Tell us about your career path.

My career has been in Speech Processing ever since my master's.

In my first industry position at Samsung R&D, I investigated noise-robustness and other aspects of speech recognition in their smartphone assistants, a topic to which I had some academic exposure. I was amazed to experience large-scale projects organized into numerous components, each overseen by a dedicated team of engineers, with the teams coordinating to develop the final solution. As a junior engineer working at this scale, getting opportunities to explore each individual component of the project and see the overall picture was often challenging.

I moved to Interactive Intelligence, a smaller organization now merged with Genesys Telecom Labs, where I focused on speech recognition for call-centre automation. Although this scenario was at a smaller scale, the projects were equally challenging. I felt more connected with my work as my contributions directly impacted their products. Over time, I grew interested in enrolling in a part-time PhD to sharpen my technical skills while continuing in my position.

When a full-time PhD opportunity from Idiap Research Institute and EPFL came my way, I could not let it go. During my PhD, I worked mainly on Speech Assessment, an emerging research field that aims to automatically predict, from a person’s speech, various aspects that a human listener can judge apart from the spoken message, such as the spoken dialect, characteristics such as fluency, and the social and emotional states of the speaker. The thesis focuses on novel methods to incorporate prior knowledge for diverse data-driven speech assessment problems.

Finally, I returned to the industry and continued working with my former manager at my current organisation on real-world problems.

How did you get your first break?

I believe my first and major break was when I got into my master's programme. I learned most of the basic skills that I regularly use today during my master's: brainstorming ideas, carrying out research experiments, managing time and resources, being efficient at work and aiming at perfection in every piece of work.

My break into the industry came initially through a senior who connected me with my manager at Samsung R&D, after which I applied and cleared their interview process. Similarly, towards the end of my PhD, I reached out to my former manager at Interactive Intelligence, who had by then moved to Uniphore. I later cleared their interview process.

What were some of the challenges you faced? How did you address them?

Being mediocre at work: My initial work at the start of my masters was mediocre. Instead of showing disappointment, my professor chose to appreciate the positive aspects of my work and indicated the areas I could improve on. Without me realising it, he steadily made me reach the high standards that he envisioned for each of his students. A lot of credit goes to him.

Being physically inactive: I ignored physical activity in high school and wanted to use the time to do things that I liked. A few years down the line, I realised I was weaker than most of my friends. Attending classes on large campuses forced me to cover long distances on foot. The more active I was, the better my body and mind were. I gradually improved my physical health over the years. I realised that taking time out to exercise every day makes the rest of the day much more productive.

Dealing with negativity: We often come across people who spread negativity and make our work environment unhealthy and sometimes toxic. I’ve always limited my interactions with such people and associated myself with positive-minded and collaborative people. Being polite and respectful to everyone, including those I couldn’t identify with, helped me grow both personally and professionally.

Where do you work now? Tell us about your current role.

I work at Uniphore Software Systems, where my work primarily focuses on building and improving speech-based solutions in the Conversational AI space. My job requires technical and practical knowledge of Speech Recognition. I design solutions for real-world problems that reach millions of customers, which is both exciting and fulfilling.

How does your work benefit society? 

The solutions we deliver are deployed at call centres and other clients, who use our Speech Recognition technology to capture their customers’ speech and intent with more ease than conventional approaches allow. We have also built systems in several Indian and world languages that are serving people across the globe.

Tell us an example of a specific memorable work you did that is very close to you!

I recently worked on a crucial large-scale project that required developing a solution with a major technological change in a very short span of time. The project brought several of us together as one big team, putting aside team boundaries, seniority levels and conflicting ideologies, to make things happen. I became a key member of the project, and I received guidance and support from mentors to coordinate and get the work done. This was the first time I realised the true potential of teamwork. The project has now reached millions of users and has clearly become the most impactful work that I have ever done. This will remain close to my heart.

Your advice to students based on your experience?

Identify your passion and learn to build a genuine interest in the field – this will take you a long way in your career. Of course, the field you choose should both interest you and meet your financial requirements. You cannot sustain yourself in a field that you dislike or have no passion for. Be sincere and truthful in everything that you do, and always stay hungry to learn new things.

Future Plans?

I plan to continue with my journey, grow as a leader and make bigger and more impactful contributions to technology. I am also open to ideas for contributing to speech solutions for social causes. I would also like to stay relevant and remain passionate about research and innovation.