Sayma Sultana

I am a PhD student in the Department of Computer Science at Wayne State University. I work in the SEAL lab under the supervision of Dr. Amiangshu Bosu. My research focuses on diversity and inclusion in software engineering. My work also spans code review, social network analysis, software security, and empirical software engineering. I earned a BS in Computer Science & Engineering from Bangladesh University of Engineering & Technology.

I pronounce my name as SY-muh sul-TAH-nuh.

Education

Ph.D. in Computer Science at Wayne State University (Fall 2020 - Spring/Summer 2025)

MS in Computer Science at Wayne State University (Fall 2020 - December 2023)

B.Sc. in Computer Science and Engineering at Bangladesh University of Engineering and Technology (2012 - 2017)

Publications

Journal & Conference Papers

S Sultana, J Sarker, F Israt, R Paul, A Bosu. Automated Identification of Sexual Orientation and Gender Identity Discriminatory Texts from Issue Comments. ACM Transactions on Software Engineering and Methodology (TOSEM), 2025 (Under major revision)

AK Turzo, S Sultana, A Bosu. From First Patch to Long-Term Contributor: Evaluating Onboarding Recommendations for OSS Newcomers. IEEE Transactions on Software Engineering (TSE), 2025

J Sarker, S Sultana, SR Wilson, A Bosu. ToxiSpanSE: An Explainable Toxicity Detection in Code Review Comments. ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 2023

S Sultana, AK Turzo, A Bosu. Code Reviews in Open Source Projects: How Do Gender Biases Affect Participation and Outcomes? Empirical Software Engineering (EMSE), 2023

Short Papers & Posters

S Sultana, MB Kali. Exploring ChatGPT for identifying sexism in the communication of software developers. 17th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA), 2024

S Sultana, G Uddin, A Bosu. Assessing the Influence of Toxic and Gender Discriminatory Communication on Perceptible Diversity in OSS Projects, 21st International Conference on Mining Software Repositories (MSR), Registered Report Track, 2024

S Sultana. Identification and Mitigation of Gender Biases to Promote Diversity and Inclusion among Open Source Communities, Doctoral Symposium, 37th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2022, Oakland Center, Michigan, United States

S Sultana. Identifying Sexism and Misogyny in Pull Request Comments, Student Research Competition, 37th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2022, Oakland Center, Michigan, United States

S Sultana, J Sarker, A Bosu. A Rubric to Identify Misogynistic and Sexist Texts from Software Developer Communications, Emerging Results and Vision paper, 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) 2021, Bari, Italy

S Sultana, A Bosu. Are Code Review Processes Influenced by the Genders of the Participants?, Registered Report, 37th International Conference on Software Maintenance and Evolution (ICSME) 2021 (Virtual Event), September 27 - October 1

S Sultana, LA Cavaletto, A Bosu. Identifying the Prevalence of Gender Biases among the Computing Organizations, 8th ACM Celebration of Women in Computing: womENcourage™ 2021 (coordinated from Prague, Czech Republic), 22-24 September, 2021

Dey R., Sultana S., Razi A., Wisniewski P.J. Exploring Smart Home Device Use by Airbnb Hosts. CHI Conference on Human Factors in Computing Systems Extended Abstracts, Honolulu, USA, April 2020

Research Projects

Analyzing Toxicity and Gender Bias Impact on OSS Diversity: We are currently gathering a GitHub dataset of pull requests and issue comments to explore the effects of toxic and gender-biased content on diversity in gender, ethnicity, and tenure. Using two state-of-the-art SE domain-specific tools for detecting toxic and gender-derogatory text, we will automatically identify toxic and gender-discriminatory texts and train multivariate regression models in which three diversity indices, measured using the Blau/Simpson index, are the dependent variables and the ratio of toxic/gender-discriminatory texts is one of the independent variables. To account for confounding factors, we will also include project characteristics as independent variables: the age of the project, the number of contributors, the number of total commits, the number of issues, the number of releases, the number of pull requests, and whether the project has a code of conduct. We will assess the association between perceptible diversity and the prevalence of toxic/gender-derogatory texts using regression coefficients and their significance (i.e., p-value < 0.05). [MSR'24]
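As a rough illustration of the diversity measure named above, the Blau (Gini-Simpson) index for one attribute of a project can be computed as 1 minus the sum of squared category proportions. The sketch below is not taken from the study; the function name `blau_index` and the example labels are hypothetical.

```python
from collections import Counter


def blau_index(labels):
    """Blau index: 1 - sum(p_i^2), where p_i is the share of category i.

    0.0 means every contributor falls into the same category; the value
    grows as contributors are spread more evenly across categories.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("labels must be non-empty")
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical project with four contributors (illustrative labels only).
contributors = ["woman", "man", "man", "man"]
print(blau_index(contributors))  # 1 - (0.25^2 + 0.75^2) = 0.375
```

In a setup like the one described, one such value per project and per attribute (gender, ethnicity, tenure) would serve as the dependent variable of a regression model.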

Investigating LLMs for Detecting Sexist Content: We are assessing ChatGPT's capability to detect various forms of gender-based derogatory language. As an exploratory study, we collected a dataset from a previous study whose authors identified instances of gender-based maternal insults, commonly known as "mom jokes", and stereotypical remarks about women in an instant messaging chat environment. Using the ChatGPT API, we tested whether these comments could be accurately identified as mom jokes or stereotyping with both zero-shot and one-shot approaches. The model successfully identified 98.2% of texts containing maternal insults and 71.12% of texts containing stereotyping as positive instances. These results suggest that ChatGPT is promising for identifying specific forms of sexism. [PETRA'24]

Examining Gender Bias Among Software Industry Professionals: Based on the Social Identity Theory (SIT) framework, we selected four dimensions of bias for this study: bias in career development, bias in task selection, unwanted attention, and identity attacks. Through a vignette-based survey, we investigated the frequency of these biases, the association between demographic factors and being a victim, the consequences of biases, and the motivations behind biased behaviors. Our findings suggest that bias in career development and bias in task selection are the two most prevalent of the four dimensions examined. More than two-thirds of the victims encounter biases multiple times. Demographic factors such as age, years of development experience, project experience, race, and location, in addition to gender, influence the likelihood of being a bias victim. Women are more than three times as likely as men to become victims of a bias. Additionally, workplace culture, along with stereotypical beliefs about women, often influences individuals to make biased decisions. [womEncourage'24]

Automated Tool for Identifying SGID Content: We developed a rubric and annotated a dataset of 11,007 GitHub pull requests and issue comments to build SGID4SE, a tool that automatically identifies sexual and gender identity-based derogatory content. SGID4SE incorporates six preprocessing steps and ten state-of-the-art algorithms and implements six different strategies to improve the performance of the minority class. We empirically evaluated each strategy and identified an optimum configuration for each algorithm. In our ten-fold cross-validation-based evaluations, a BERT-based model achieves the best performance with 85.9% precision, 80.0% recall, and 82.9% F1-Score for the SGID class. This model achieves 95.7% accuracy and 80.4% Matthews Correlation Coefficient. Our dataset and tool establish a foundation for further research in this direction. [Rubric] [SGID4SE tool] [ESEM'21] [ASE'22] [TOSEM'25]
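For readers unfamiliar with the metrics reported above, the following minimal sketch shows how precision, recall, F1-score, accuracy, and the Matthews Correlation Coefficient follow from a binary confusion matrix. It is not the SGID4SE evaluation code; the function name `binary_metrics` and the counts are hypothetical and do not reproduce the reported results. It assumes every denominator is non-zero.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are counts for the positive (minority) class.
    Assumes no denominator below is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical imbalanced test fold: 100 positive and 900 negative comments.
for name, value in binary_metrics(tp=80, fp=20, fn=20, tn=880).items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data such as this, accuracy stays high even when the minority class is handled poorly, which is why minority-class precision, recall, F1, and MCC are reported alongside it.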

Gender Bias in OSS Code Review: To identify whether the outcomes of participation in code reviews (or pull requests) are influenced by the gender of a developer, we conducted regression analyses on pull request approvals, review times, and reviewer participation from 1,010 GitHub and Gerrit projects divided into 14 datasets. Our results show significant gender biases during code acceptance and significant differences between men and women in terms of code review intervals, with women encountering longer delays than men in three cases and the opposite in seven. Moreover, our results indicate reviewer selection as one of the most gender-biased aspects, with 12 out of 14 datasets exhibiting bias. Since most of the review assignments are based on invitations, this result suggests possible affinity biases among the developers. [Replication package] [ICSME'21] [EMSE'23]

Studying Smart Home Device Use by Airbnb Hosts: An increasing number of Airbnb hosts are using smart home devices to manage their properties; as a result, Airbnb guests are expressing concerns about their privacy. To reconcile the tensions between hosts and guests, we interviewed 10 Airbnb hosts to understand what smart home devices they use, for what purposes, their concerns, and their unmet needs regarding smart home device usage. Overall, hosts used smart home devices to give remote access to their home to guests and safeguard their investment properties against misuse. They were less concerned about guest privacy and felt that smart home devices provided unique value to guests and, thus, a competitive advantage over other Airbnb properties. [Replication package] [CHI'20]

Experience

Teaching Experience

CSC 4996 (Senior Capstone Project): Fall 2022, Winter 2024.

CSC 4421 (Computer Operating Systems Lab): Winter 2022, Winter 2023, Fall 2023.

CSC 2201 (Computer Science 2 Lab): Spring/Summer 2022.

CSC 2200 (Computer Science 2): Spring/Summer 2021, Spring/Summer 2022.

CSC 1101 (Problem Solving & Programming Lab): Fall 2020, Winter 2021, Spring/Summer 2021, Fall 2021, Spring/Summer 2023.

CSC 1100 (Problem Solving & Programming): Spring/Summer 2023.

Professional Experience

Software Engineer Intern (Summer 2024), Meta, Menlo Park, CA, USA

Software Engineer (March 2017 - April 2019), Reve Systems, Dhaka, Bangladesh

Contact

Email: sayma@wayne.edu

Office: 5057 Woodward Ave., Suite #3105, Detroit, MI 48202