Shrimai Prabhumoye


Senior Research Scientist at NVIDIA          
Adjunct Professor at Boston University          

I am a Senior Research Scientist with the Applied Deep Learning Research Group at NVIDIA and an Adjunct Professor at Boston University. My research aims to advance the state of the art in large language models (LLMs) by enhancing their reasoning capabilities and improving their safety through rigorous mitigation of toxicity and bias. As the lead contributor to the Nemotron family of models, I have worked extensively on data curation, pretraining, and scaling. My current work optimizes pretraining pipelines, with an emphasis on data selection, blending, and ordering strategies that maximize downstream model accuracy. I am particularly interested in improving reasoning in LLMs, including generating synthetic data for advanced mathematical reasoning and enabling models to handle longer, more complex reasoning tasks. My work has been featured in media outlets including VentureBeat, Forbes, and TechCrunch.

Before that, I received my PhD from the School of Computer Science at Carnegie Mellon University, where I was fortunate to be advised by Prof. Alan W. Black and Prof. Ruslan Salakhutdinov. My thesis addressed controllable text generation, with a focus on style, content, and structure, as well as its ethical considerations. I co-designed the Computational Ethics for NLP course, which was first offered at CMU in Spring 2018. I graduated with a Master's in Language Technologies in Aug 2017. During that time, I led the CMU Magnus team in the Amazon Alexa Prize competition. I completed my undergraduate degree at the National Institute of Technology, Karnataka, India.

News

Jan 2023 Our new paper, "Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models", was accepted at EACL 2023.
Jul 2022 I was a finalist for the Rising Star award at VentureBeat Women in AI Awards, 2022.
Jul 2021 I joined the Applied Deep Learning Research Group at NVIDIA as a Research Scientist!
Apr 2021 I successfully defended my thesis!
Sep 2020 Invited talk at Grace Hopper Conference 2020 on Text Generation: Should machines reflect the way humans interact in society.
Jul 2020 Our work on politeness transfer was featured in SCS CMU News, TechCrunch, CNET, Pittsburgh Post-Gazette, MSN, Hindustan Times, and Axios.