I am a Research Scientist with the Applied Deep Learning Research Group at Nvidia, where I work on building large language models (LLMs). I also work on interesting applications of LLMs such as dialogue systems and QA systems, as well as on reducing the bias and toxicity of LLMs. Before that, I graduated with a PhD from the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. At CMU, I was fortunate to be advised by Prof. Alan W. Black and Prof. Ruslan Salakhutdinov. My thesis addressed controllable text generation, with a focus on style, content, and structure, as well as its ethical considerations. I co-designed the Computational Ethics for NLP course, which was offered for the first time in Spring 2018 at CMU.
I graduated with a Master's in Language Technologies in Aug 2017. During that time, I led the CMU Magnus team in the Amazon Alexa Prize competition. I completed my undergraduate degree at the National Institute of Technology, Karnataka, India.
Jan 2023 | New paper, "Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models", was accepted at EACL 2023. |
Jul 2022 | I was a finalist for the Rising Star award at VentureBeat Women in AI Awards, 2022. |
Jul 2021 | I joined the Applied Deep Learning Research Group at Nvidia as a Research Scientist! |
Apr 2021 | I successfully defended my thesis! |
Sep 2020 | Invited talk at Grace Hopper Conference 2020 on Text Generation: Should machines reflect the way humans interact in society. |
Jul 2020 | Our work on politeness transfer is featured in SCS CMU News, TechCrunch, CNET, Pittsburgh Post-Gazette, msn, Hindustan Times, and Axios. |
Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, R Michael Alvarez, Anima Anandkumar.
Published on arXiv, 2023.
Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro.
In the proceedings of the European Chapter of the Association for Computational Linguistics (EACL) 2023.
Dan Su, Mostofa Patwary, Shrimai Prabhumoye, Peng Xu, Ryan Prenger, Mohammad Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro.
In Findings of the European Chapter of the Association for Computational Linguistics (EACL) 2023.
Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, Michael Alvarez, Anima Anandkumar.
In Transfer Learning for Natural Language Processing Workshop at NeurIPS 2022.
Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan J Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2022.
Zihan Liu, Mostofa Patwary, Ryan Prenger, Shrimai Prabhumoye, Wei Ping, Mohammad Shoeybi, Bryan Catanzaro.
In Findings of the Association for Computational Linguistics (ACL) 2022.
Shaden Smith*, Mostofa Patwary*, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro.
Published on arXiv, 2022.
Shrimai Prabhumoye*, Rafal Kocielnik*, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro.
Published on arXiv, 2023.
Dirk Hovy, Shrimai Prabhumoye.
Language and Linguistics Compass, 2021.
Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov.
In the proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2021.
Shrimai Prabhumoye*, Brendon Boldt*, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2021.
Shrimai Prabhumoye, Alan W Black, Ruslan Salakhutdinov.
Proceedings of the 28th International Conference on Computational Linguistics (COLING) 2020.
Selected for oral presentation
Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2020.
Aman Madaan*, Amrith Setlur*, Tanmay Parekh*, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2020.
Shrimai Prabhumoye*, Margaret Li*, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam.
arXiv:2002.02878 [cs.AI]
Angela Fan*, Jack Urbanek*, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktaschel, Arthur Szlam, Jason Weston.
In the Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence.
Shrimai Prabhumoye, Elijah Mayfield, Alan W Black.
Widening NLP Workshop at ACL 2019.
Shrimai Prabhumoye*, Khyathi Chandu*, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Storytelling Workshop at ACL 2019.
Elijah Mayfield, Michael Madaio, Shrimai Prabhumoye, David Gerritsen, Brittany McLaughlin, Ezekiel Dixon-Román, Alan W Black.
In the Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications at ACL 2019.
Shrimai Prabhumoye, Chris Quirk, Michel Galley.
In the proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2019.
Selected for oral presentation
Kangyan Zhou, Shrimai Prabhumoye, Alan W Black.
In the proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018.
Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2018.
Selected for oral presentation
Shrimai Prabhumoye*, Samridhi Choudhary*, Evangelia Spiliopoulou, Christopher Bogart, Carolyn Penstein Rose, Alan W Black.
In the proceedings of Workshop on NLP+CSS at ACL 2017.
Shrimai Prabhumoye*, Fadi Botros*, Khyathi Chandu*, Samridhi Choudhary*, Esha Keni*, Chaitanya Malaviya*, Thomas Manzini*, Rama Pasumarthi*, Shivani Poddar*, Abhilasha Ravichander*, Zhou Yu, Alan Black.
In the proceedings of Alexa Prize 2017.
Deep Learning: Classics and Trends, Oct 2020.
Allen Institute for Artificial Intelligence (AI2), Aug 2020.
Salesforce, Jul 2020.
Montreal Institute for Learning Algorithms (Mila), Jul 2020.
Apple, Seattle, Jul 2020.
The LTI Summer Seminar, Jul 2020.
University of Massachusetts Amherst, October 2019.
Google AI Research, NYC, June 2019.
This work introduces a new task, politeness transfer, which involves converting non-polite sentences into polite sentences while preserving their meaning. We also provide a dataset of more than 1.39 million instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag-and-generate pipeline that identifies stylistic attributes and subsequently generates a sentence in the target style while preserving most of the source content.
Associated Publication: Politeness Transfer: A Tag and Generate Approach at ACL 2020
Downstream tasks are influenced by the demographic skew of their training sets: for example, sentiment analysis is affected by the gender confound, and part-of-speech (POS) tagging is affected by the age confound. By building a generation engine that preserves content while controlling for style, we can produce demographically balanced datasets for these NLP tasks. We are also looking at using these downstream tasks to automatically evaluate style transfer models.
This work introduces a document grounded dataset for conversations using Wikipedia articles on movies. The dataset contains 4112 conversations with an average of 21.43 turns per conversation. We describe two neural architectures that provide benchmark performance on the task of generating the next response.
Associated Publication: A Dataset for Document Grounded Conversations at EMNLP 2018
Machine Translation and Sequence-to-sequence Models
CS 11-731, Carnegie Mellon University, Fall 2018
Computational Ethics in NLP
CS 11-830, Carnegie Mellon University, Spring 2018, Spring 2019 and Spring 2020
Speech Processing
CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017, Fall 2018, and Fall 2019
Chatting with Computers Workshop
OurCS, Carnegie Mellon University, Fall 2017.
CS 11-830, Carnegie Mellon University, Spring 2018
CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017