Preetha Chatterjee


Office: 3675 Market Street, Office 1160, Philadelphia, PA 19104
Email: preetha[dot]chatterjee[at]drexel[dot]edu


I am an Assistant Professor in the Department of Computer Science at Drexel University, where I lead the SOftware Engineering and Analytics Research (SOAR) Lab. My research focuses on Software Engineering (SE), with an emphasis on developing tools, knowledge sources, and strategies to support software maintenance and improve developer productivity. I incorporate evidence from mining software repositories, conducting empirical studies, and adapting state-of-the-art techniques from the fields of Natural Language Processing and Machine Learning to solve problems in Software Engineering. Through my research, I intend to enable advances in areas including LLM-assisted software development and maintenance, developer collaboration in distributed software teams, and knowledge extraction from large-scale software artifacts. Before joining Drexel University, I earned my M.S. and Ph.D. in Computer Science from the University of Delaware, advised by Dr. Lori Pollock. Prior to that, I worked in industry as a Software Engineer for 5+ years.

I am currently NOT hiring PhD students. However, MS or undergraduate students at Drexel interested in working with me are encouraged to reach out via email [Instructions here].


Latest News!

  • Jan 2025: Paper accepted at MSR 2025, Technical track
  • Jan 2025: Paper accepted at JMIR Infodemiology
  • Dec 2024: Paper accepted at NLBSE 2025, Full paper
  • Dec 2024: Congratulations to SOAR lab members Amirali Sajadi and Ramtin Ehsani on successfully passing their Ph.D. candidacy exams
  • Jan 2024: Paper accepted at MSR 2024, Data Showcase track
  • Dec 2023: Two papers accepted at ICSE 2024, Research Track
  • Nov 2023: Received Distinguished Reviewer Award at FSE 2023
  • Nov 2023: Paper accepted at ICSE 2024, NIER track
  • Jul 2023: Two papers accepted at ESEC/FSE 2023, Ideas, Visions and Reflections Track
  • Mar 2023: Invited to talk about my research on Emotion Awareness in SE at the "It Will Never Work in Theory (NWiT)" April 2023 series


Media/Blog Coverage

  • Recognition at FSE 2024: Drexel CCI news
  • Our research on morality in open source: Compassionate Coding Newsletter
  • Our research on emotion mining in SE texts: Drexel CCI news
  • Our research on mining developer chat communications: ABB news
  • Starting ACM-W Chapter at University of Delaware: UDaily article

Research

    Improving LLM-Assisted Bug Resolution

    In spite of wide adoption of LLMs in code generation, developers struggle to effectively use these models for automated program repair tasks such as bug localization, patch generation, and patch validation. This project aims to understand such challenges and build tools for reliable and efficient LLM-assisted bug resolution.

    [MSR'25]

    Security Assessment of LLM-Generated Code and Code Descriptions

    Code generated by large language models often contains security vulnerabilities. This project investigates the security risks of using LLM-generated code and information in various software maintenance tasks such as bug resolution. We build tools to automatically uncover such risks and suggest potential solutions.

    [Preprint]

    Mining Emotions from Software Engineering Communication

    Emotions strongly impact collaborative tasks such as software development. Positive emotions (e.g., Joy) can enhance productivity and job satisfaction, whereas negative emotions (e.g., Frustration) can lead to reduced motivation and team attrition. This project aims to mine emotions in software-related text to improve collaboration and productivity.

    Mining Information from Developer Chat Conversations

    Public chat communities (e.g., Slack) contain valuable software development discussions, including API descriptions, best practices, and debugging solutions. This project develops techniques for automatically identifying and extracting useful information to enhance software maintenance tools.

    [MSR'22], [TOSEM'21], [ICSE'21], [MSR'20], [MSR'19]

    Studying Developer Focus on Q&A Forums

    While platforms like Stack Overflow serve as rich knowledge resources, finding relevant answers can be time-consuming. This project develops methods to help developers quickly identify useful code and text within Q&A forums.

    [NLBSE'22], [JSS'19]

    Learning About Code Snippet Characteristics in Software Artifacts

    Large collections of software-related documents (e.g., blogs, bug reports, emails, research articles) offer opportunities to extract knowledge from discussions about code snippets. This project examines how developers describe code in various contexts to enhance documentation, search, and recommendation systems.

    [SANER'17], [MSR'17]


    Selected Talks

    Emotion Awareness in Software Engineering

    Lightning Talk, NWiT April 2023.

    Opinion-based Q&A from Developer Chats

    ICSE 2021.

    Finding Help with Programming Errors

    JSS 2021 Happy Hour.

    Slack Chats with Disentangled Conversations

    MSR 2020.


    Publications

    2025:

  • Towards Detecting Prompt Knowledge Gaps for Improved LLM‐guided Issue Resolution
    Ramtin Ehsani, Sakshi Pathak, and Preetha Chatterjee
    The 22nd International Conference on Mining Software Repositories (MSR), Technical Track, Apr 2025.

    Preprint

  • Shifting Narratives in Media Coverage: A Decade of Drug Discourse in The Philadelphia Inquirer
    Layla Bouzoubaa, Ramtin Ehsani, Preetha Chatterjee, and Rezvaneh (Shadi) Rezapour
    Journal of Medical Internet Research (JMIR) Infodemiology

    Preprint

  • Analyzing Toxicity in Open Source Software Communications Using Psycholinguistics and Moral Foundations Theory
    Ramtin Ehsani, Rezvaneh (Shadi) Rezapour, and Preetha Chatterjee
    The 4th International Workshop on Natural Language‐based Software Engineering (NLBSE), co‐located with ICSE, Apr 2025.

    Preprint

    2024:

  • Incivility in Open Source Projects: A Comprehensive Annotated Dataset of Locked GitHub Issue Threads
    Ramtin Ehsani, Mia Mohammad Imran, Robert Zita, Kostadin Damevski, and Preetha Chatterjee
    The 21st International Conference on Mining Software Repositories (MSR), Data Showcase Track, Apr 2024.

    Preprint Dataset Slides DOI

  • Exploring ChatGPT for Toxicity Detection in GitHub
    Shyamal Mishra and Preetha Chatterjee
    The 46th International Conference on Software Engineering (ICSE), New Ideas and Emerging Results Track, Apr 2024.

    Preprint DOI Slides

  • Shedding Light on Software Engineering-specific Metaphors and Idioms
    Mia Mohammad Imran, Preetha Chatterjee, and Kostadin Damevski
    The 46th International Conference on Software Engineering (ICSE), Research Track, Apr 2024.

    Preprint DOI Slides

  • Uncovering the Causes of Emotions in Software Developer Communication Using Zero-shot LLMs
    Mia Mohammad Imran, Preetha Chatterjee, and Kostadin Damevski
    The 46th International Conference on Software Engineering (ICSE), Research Track, Apr 2024.

    Preprint DOI Slides Blog Post

    2023:

  • Exploring Moral Principles Exhibited in OSS: A Case Study on GitHub Heated Issues
    Ramtin Ehsani, Rezvaneh Rezapour, and Preetha Chatterjee
    The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Ideas, Visions and Reflections Track, Dec 2023.

    Preprint DOI

  • Towards Understanding Emotions in Informal Developer Interactions: A Gitter Chat Study
    Amirali Sajadi, Kostadin Damevski, and Preetha Chatterjee
    The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Ideas, Visions and Reflections Track, Dec 2023.

    Preprint DOI

  • Interpersonal Trust in OSS: Exploring Dimensions of Trust in GitHub Pull Requests
    Amirali Sajadi, Kostadin Damevski, and Preetha Chatterjee
    The 45th International Conference on Software Engineering (ICSE), New Ideas and Emerging Results Track, May 2023.

    Preprint DOI Slides

  • The Evolution of Substance Use Coverage in the Philadelphia Inquirer
    Layla Bouzoubaa, Ramtin Ehsani, Preetha Chatterjee, and Rezvaneh Rezapour
    The 17th International AAAI Conference On Web And Social Media (ICWSM), Data Challenge, Jun 2023.

    Preprint DOI

    2022:

  • Data Augmentation for Improving Emotion Recognition in Software Engineering Communication
    Mia Mohammad Imran, Yashasvi Jain, Preetha Chatterjee, and Kostadin Damevski
    The 37th IEEE/ACM International Conference on Automated Software Engineering (ASE), Research Track, Oct 2022.

    Preprint DOI Slides

  • DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research
    Keerthana Muthu Subash, Lakshmi Prasanna Kumar, Sri Lakshmi Vadlamani, Preetha Chatterjee, and Olga Baysal
    The 19th International Conference on Mining Software Repositories (MSR), Data Showcase Track, May 2022.

    Preprint DOI Dataset

  • Automatic Identification of Informative Code in Stack Overflow Posts
    Preetha Chatterjee
    The 1st International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, May 2022.

    Preprint DOI Slides Talk

  • Empirical Standards for Repository Mining
    Preetha Chatterjee, Tushar Sharma, and Paul Ralph
    The 19th International Conference on Mining Software Repositories (MSR), Tutorial, May 2022.

    Preprint DOI Empirical Standards

    2021:

  • Automatic Extraction of Opinion-based Q&A from Online Developer Chats
    Preetha Chatterjee, Kostadin Damevski, and Lori Pollock
    The 43rd International Conference on Software Engineering (ICSE), Technical Track, May 2021.

    Preprint DOI Slides Talk

  • Automatically Identifying the Quality of Developer Chats for Post Hoc Use
    Preetha Chatterjee, Kostadin Damevski, Nicholas A. Kraft, and Lori Pollock
    Transactions on Software Engineering and Methodology (TOSEM), Feb 2021.

    Preprint DOI Slides Talk

  • Mining Information from Developer Chats Towards Building Software Maintenance Tools (Ph.D. Thesis)
    Preetha Chatterjee
    University of Delaware

    Manuscript

    2020:

  • Software-related Slack Chats with Disentangled Conversations
    Preetha Chatterjee, Kostadin Damevski, Nicholas A. Kraft, and Lori Pollock
    The 17th International Conference on Mining Software Repositories (MSR), Data Showcase Track, Oct 2020. Seoul, South Korea

    Preprint DOI Dataset Slides Talk

  • Extracting Archival-Quality Information from Software-Related Chats
    Preetha Chatterjee
    The 42nd International Conference on Software Engineering (ICSE), Doctoral Symposium Track, Oct 2020. Seoul, South Korea

    Preprint DOI Slides

  • Finding Help with Programming Errors: An Exploratory Study of Novice Software Engineers’ Focus in Stack Overflow Posts
    Preetha Chatterjee, Minji Kong, and Lori Pollock
    Journal of Systems and Software (JSS), Research Paper, Jan 2020.

    Preprint DOI Slides Talk

    2019:

  • Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools
    Preetha Chatterjee, Kostadin Damevski, Lori Pollock, Vinay Augustine, and Nicholas A. Kraft
    The 16th International Conference on Mining Software Repositories (MSR), Research Track, May 2019. Montreal, Canada

    Preprint DOI Slides Press Coverage

    2017:

  • Extracting Code Segments and Their Descriptions from Research Articles
    Preetha Chatterjee, Benjamin Gause, Hunter Hedinger, and Lori Pollock
    The 14th International Conference on Mining Software Repositories (MSR), Research Track, May 2017. Buenos Aires, Argentina

    Preprint DOI Slides

  • What Information about Code Snippets Is Available in Different Software-Related Documents? An Exploratory Study
    Preetha Chatterjee, Manziba Akanda Nishi, Kostadin Damevski, Vinay Augustine, Lori Pollock, and Nicholas A. Kraft
    The 24th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), Early Research Achievements Track, Feb 2017. Klagenfurt, Austria

    Preprint DOI

    2015:

  • Exploring the Generality of a Java-based Loop Action Model for the Quorum Programming Language (Ph.D. Preliminary Project)
    Preetha Chatterjee
    University of Delaware

    Manuscript


Teaching

  • Fall 2024: Introduction to Software Engineering and Development (SE 201) @ Drexel University [Instructor]
  • Fall 2023: Introduction to Software Engineering and Development (SE 181) @ Drexel University [Instructor]
  • Spring 2023: Software Analytics (CS T680) @ Drexel University [Instructor]
  • Fall 2022: Introduction to Software Engineering and Development (SE 181) @ Drexel University [Instructor]
  • Spring 2022: Introduction to Software Engineering and Development (SE 181) @ Drexel University [Instructor]
  • Fall 2021: Introduction to Software Engineering and Development (SE 181) @ Drexel University [Instructor]
  • Summer 2019: Introduction to Computer Science II (CISC 181) @ University of Delaware [Instructor]
  • Fall 2018: Intro to Computer Science Research (CISC 367) @ University of Delaware [Substitute Instructor]
  • Spring 2018: Communication Skills for CS Researchers (CISC 667) @ University of Delaware [Substitute Instructor]
  • Fall 2017: Advanced Software Systems: Text Analysis for Software Engineering (CISC 879) @ University of Delaware [Substitute Instructor]
  • Spring 2016: Advanced Web Technologies (CISC 474) @ University of Delaware [Teaching Assistant]
  • Fall 2015: Web Applications using Computer Science (CISC 103) @ University of Delaware [Teaching Assistant]
  • Spring 2015: General Computer Science for Engineers (CISC 106) @ University of Delaware [Teaching Assistant]
  • Fall 2014: Introduction to Computer Science II (CISC 181) @ University of Delaware [Teaching Assistant]

Service

    Academic Service

    At Drexel, along with Dr. Colin Gordon, I co-lead the Drexel Programming Systems seminar.
    Beyond Drexel, a subset of my academic service is available as a snapshot on my conf.researchr.org profile.

    Below is a brief listing of my service to the Software Engineering research community.

  • Editorial Board:
    • Journal of Systems and Software (2021-2024)
  • Organizing Committee:
    • Tutorials Co-chair, 22nd Intl. Conf. on Mining Software Repositories (MSR 2025)
    • Journal‐first Co‐Chair, 32nd IEEE/ACM Intl. Conf. on Program Comprehension (ICPC 2024)
    • Mining Challenge Co‐Chair, 21st Intl. Conf. on Mining Software Repositories (MSR 2024)
    • NIER PC Co‐Chair, 23rd IEEE Intl. Conf. on Source Code Analysis and Manipulation (SCAM 2023)
    • PC Co-Chair, 3rd International Workshop on Software Engineering and AI for Data Quality in Cyber‐Physical Systems/Internet of Things (SEA4DQ 2023)
    • Diversity and Inclusion Co-Chair, International Conference on Mining Software Repositories (MSR 2023)
    • PC Co-Chair, 1st International Workshop on Recruiting Participants for Empirical SE (RoPES 2022)
    • Editorial Board, Journal of Systems and Software (JSS), 2021 - Present
    • Social Media Chair, International Conference on Mining Software Repositories (MSR 2020, 2022)
  • Program Committee Member:
    • International Conference on Software Engineering (ICSE 2026 - Technical Track)
    • International Conference on Automated Software Engineering (ASE 2025 - Technical Track)
    • International Conference on Software Engineering (ICSE 2025 - Technical Track)
    • The Foundations of Software Engineering (FSE 2025 - Technical Track)
    • International Conference on Software Engineering (ICSE 2024 - Technical Track)
    • The Foundations of Software Engineering (ESEC/FSE 2024 - Technical Track)
    • International Conference on Mining Software Repositories (MSR 2024 - Technical Track)
    • The Foundations of Software Engineering (ESEC/FSE 2023 - Technical Track) -- Distinguished Reviewer
    • International Conference on Software Engineering (ICSE 2023 - SEIP Track)
    • International Conference on Mining Software Repositories (MSR 2023 - Technical Track)
    • International Workshop on Natural Language-based Software Engineering (NLBSE 2023)
    • International Conference on Mining Software Repositories (MSR 2022 - Technical Track)
    • IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2022 - ERA Track)
    • International Conference on Software Engineering (ICSE 2022 - SEET Track)
    • International Conference on Software Maintenance and Evolution (ICSME 2021 - Tool Demo Track)
    • International Conference on Mining Software Repositories (MSR 2021 - Mining Challenge Track)
    • IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2021 - ERA Track)
  • Journal Reviewer:
    • Empirical Software Engineering (EMSE)
    • Transactions on Software Engineering and Methodology (TOSEM)
    • Transactions on Software Engineering (TSE)
    • Journal of Systems and Software (JSS)
    • Automated Software Engineering (ASE)
    • Information and Software Technology (IST)

    Professional Memberships and Affiliations

  • Lifetime Member, Association for Computing Machinery (ACM)
  • Member, Institute of Electrical and Electronics Engineers (IEEE)
  • Member, Association for Computing Machinery, Special Interest Group on Software Engineering (ACM-SIGSOFT)
  • Member, Association for Computing Machinery, Women (ACM-W)

Students

    Current Students

  • Ramtin Ehsani (2023-Present), Ph.D. Student, Drexel University
  • Amirali Sajadi (2022-Present), Ph.D. Student, Drexel University
  • Sakshi Pathak (WI'24-Present), M.S. Student, Drexel University
  • Shriya Rawal (WI'25-Present), M.S. Student, Drexel University
  • Hitashi Kalra (SP'25), Undergraduate Student, Drexel University

    Former Students

  • Taqi Tahmid (SU'24), Undergraduate Student, Drexel University
  • Binh Le (SP'24-SU'24), Undergraduate Student, Drexel University
  • Anh Nguyen (SP'24-SU'24), Undergraduate Student, Drexel University
  • Giles Odigwe (SP'24), Undergraduate Student, Drexel University
  • Mustafa Bookwala (WI'24), Undergraduate Student, Drexel University
  • Shyamal Mishra (SU'23), M.S. Student, Drexel University
  • Vanessa Martinez (WI'23), Undergraduate Student, Drexel University
  • Thomas Do (WI'22), Undergraduate Student, Drexel University
  • Yashasvi Jain (2021-2022), Undergraduate Student, Drexel University
  • Brian Phillips (2019-2020), Undergraduate Student, University of Delaware
  • Humpher Owusu (2019-2020), Undergraduate Student, University of Delaware
  • Kevin Mason (2019-2020), Undergraduate Student, University of Delaware
  • Minji Kong (2018), Undergraduate Student, University of Delaware
  • Qilin Ma (2017), Undergraduate Student, University of Delaware
  • Benjamin Gause (2016), Undergraduate Student, University of Delaware
  • Hunter Hedinger (2016), Undergraduate Student, University of Delaware

    Prospective Students

    I am generally on the lookout for highly motivated students (at all levels: undergrad, MS, and PhD) with a strong academic background to work at the intersection of software engineering, machine learning, and natural language processing at Drexel University.

    Qualifications: An ideal candidate has strong programming skills, communication/writing skills, and a willingness to learn. Experience in software engineering and machine learning research is a plus.

    How to Apply: Please submit the following documents via email to preetha.chatterjee@drexel.edu under the subject “Potential Student Application”.
  • Brief cover letter including: your research interests, outline of previous research experience, preferred start date
  • Your current resume/CV (including major accomplishments e.g., projects, publications, awards, etc.)
  • One or two references that I can contact for a letter of reference (e.g., previous supervisors, instructors)
  • Unofficial Transcripts
  • Sample publications (if any)

    I encourage you to include links to any projects/software that you have worked on. The review of applications will begin immediately and will continue until the positions are filled. I will carefully go through all the applications and contact potentially eligible candidates for a brief interview (via Zoom).

    Resources For Applying To Drexel University: All PhD students in the Computer Science PhD program at Drexel University are fully supported with an assistantship. Assistantships may take the form of research, teaching, or a combination of the two. These assistantships carry an appropriate stipend, tuition remission, and subsidized health insurance. PhD admissions are rolling until the department closes review (no hard deadlines). If you are thinking about applying to the Ph.D. program at Drexel University, I have included some resources:
  • Drexel University Graduate Program Admissions
  • Drexel University PhD in Computer Science Admissions and Requirements

    If you are already a student at Drexel University, feel free to email me to discuss potential research opportunities.