layout | title |
---|---|
page | Speaker profile |
Desh is a PhD student at Johns Hopkins University, working in the Center for Language and Speech Processing (CLSP), advised by Sanjeev Khudanpur and Dan Povey. His research interests lie in applying machine learning methods to speech and language tasks. He is currently working on speech recognition (a.k.a. speech-to-text) and speaker diarization (a.k.a. "who spoke when") for multi-party conversations. He wants to build systems that can identify the speaker and transcribe their speech in real time, and do this for all of the world's languages.
He spent the summer of 2021 building end-to-end multi-talker ASR systems at Microsoft. In the summer of 2022, he was an intern in the AI Speech team at Meta, where he explored target-speaker ASR models for improving transducer models in the presence of background speech. Desh has several publications at ICASSP, Interspeech, SLT, and ASRU, with over 450 citations.
Previously, he graduated from the Indian Institute of Technology Guwahati in 2017 with a major in Computer Science, where his thesis was on deep learning methods for biomedical text. He has worked on building smart assistants at Samsung Research (India) and was an intern at Microsoft India, devising statistics APIs for the Enterprise Commerce team.
When he is not doing ML, he likes to work out, climb boulders, play guitar, and read fiction.