Pooja Rani

@uzh.ch

Postdoctoral Researcher, Department of Informatics (IFI)
University of Zurich

https://researchid.co/poojaruhal

I have been a postdoctoral researcher in the Zurich Empirical Software Engineering Team (Prof. Alberto Bacchelli) since September 2022. Previously, I was a postdoctoral researcher in the Software Engineering Group (Prof. Timo Kehrer). I completed my PhD in the Software Composition Group at the University of Bern under the supervision of Prof. Oscar Nierstrasz and Dr. Sebastiano Panichella. My work mainly focuses on supporting developers in comprehending source code via comments, and on the maintenance and evolution of those comments. My research interests lie in the general areas of Software Engineering (SE), Natural Language Processing (NLP), Program Comprehension, Object-Oriented Programming, and Machine Learning applied to SE. I have co-reviewed papers for various international conferences (e.g., ICSME, SANER, SATTOSE) and journals (e.g., TOSEM, IST).

Scopus Publications: 15
Scholar Citations: 135
Scholar h-index: 6
Scholar i10-index: 6

Scopus Publications

  • Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
    Christos Tsigkanos, Pooja Rani, Sebastian Müller, and Timo Kehrer

    Springer Nature Switzerland

  • Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?
    Christos Tsigkanos, Pooja Rani, Sebastian Müller, and Timo Kehrer

    IEEE
    Metamorphic testing involves reasoning on necessary properties that a program under test should exhibit regarding multiple input and output variables. A general approach consists of extracting metamorphic relations from auxiliary artifacts such as user manuals or documentation, a strategy particularly fitting to testing scientific software. However, such software typically has large input-output spaces, and the fundamental prerequisite – extracting variables of interest – is an arduous and non-scalable process when performed manually. To this end, we devise a workflow around an autoregressive transformer-based Large Language Model (LLM) towards the extraction of variables from user manuals of scientific software. Our end-to-end approach, besides a prompt specification consisting of few-shot examples by a human user, is fully automated, in contrast to current practice requiring human intervention. We showcase our LLM workflow over a real case, and compare variables extracted to ground truth manually labelled by experts. Our preliminary results show that our LLM-based workflow achieves an accuracy of 0.87, while successfully deriving 61.8% of variables as partial matches and 34.7% as exact matches.

  • The NLBSE'23 Tool Competition
    Rafael Kallis, Maliheh Izadi, Luca Pascarella, Oscar Chaparro, and Pooja Rani

    IEEE
    We report on the organization and results of the second edition of the tool competition from the International Workshop on Natural Language-based Software Engineering (NLBSE'23). As in the prior edition, we organized the competition on automated issue report classification, with a larger dataset. This year, we featured an extra competition on automated code comment classification. In this tool competition edition, five teams submitted multiple classification models to automatically classify issue reports and code comments. The submitted models were fine-tuned and evaluated on a benchmark dataset of 1.4 million issue reports or 6.7 thousand code comments, respectively. The goal of the competition was to improve the classification performance of the baseline models that we provided. This paper reports details of the competition, including the rules, the teams and contestant models, and the ranking of models based on their average classification performance across issue report and code comment types.

  • A decade of code comment quality assessment: A systematic literature review
    Pooja Rani, Arianna Blasi, Nataliia Stulova, Sebastiano Panichella, Alessandra Gorla, and Oscar Nierstrasz

    Elsevier BV

  • Can We Automatically Generate Class Comments in Pharo?


  • How to identify class comment types? A multi-language approach for class comment classification
    Pooja Rani, Sebastiano Panichella, Manuel Leuenberger, Andrea Di Sorbo, and Oscar Nierstrasz

    Elsevier BV

  • What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk
    Pooja Rani, Sebastiano Panichella, Manuel Leuenberger, Mohammad Ghafari, and Oscar Nierstrasz

    Springer Science and Business Media LLC
    Context: Previous studies have characterized code comments in various programming languages, showing how high quality of code comments is crucial to support program comprehension activities, and to improve the effectiveness of maintenance tasks. However, very few studies have focused on understanding developer practices to write comments. None of them has compared such developer practices to the standard comment guidelines to study the extent to which developers follow the guidelines. Objective: Therefore, our goal is to investigate developer commenting practices and compare them to the comment guidelines. Method: This paper reports the first empirical study investigating commenting practices in Pharo Smalltalk. First, we analyze class comment evolution over seven Pharo versions. Then, we quantitatively and qualitatively investigate the information types embedded in class comments. Finally, we study the adherence of developer commenting practices to the official class comment template over Pharo versions. Results: Our results show that there is a rapid increase in class comments in the initial three Pharo versions, while in subsequent versions developers added comments to both new and old classes, thus maintaining a similar code to comment ratio. We furthermore found three times as many information types in class comments as those suggested by the template. However, the information types suggested by the template tend to be present more often than other types of information. Additionally, we find that a substantial proportion of comments follow the writing style of the template in writing these information types, but they are written and formatted in a non-uniform way. Conclusion: The results suggest the need to standardize the commenting guidelines for formatting the text, and to provide headers for the different information types to ensure a consistent style and to identify the information easily. Given the importance of high-quality code comments, we draw numerous implications for developers and researchers to improve the support for comment quality assessment tools.

  • Entropy based enhanced particle swarm optimization on multi-objective software reliability modelling for optimal testing resources allocation
    Pooja Rani and G. S. Mahapatra

    Wiley
    This paper proposes a generalization of the exponential software reliability model to characterize several factors including fault introduction and time-varying fault detection rate. The software life cycle is designed based on module structure, such as testing effort spent during module testing and detected software faults. The resource allocation problem is a critical phase in the testing stage of software reliability modelling. It is required to make decisions for optimal resource allocation among the modules to achieve the desired level of reliability. We formulate a multi-objective software reliability model of testing resources for a new generalized exponential reliability function to characterize dynamic allocation of total expected cost and testing effort. An enhanced particle swarm optimization (EPSO) is proposed to maximize software reliability and minimize allocation cost. We perform experiments with randomly generated testing-resource sets and vary the performance using the entropy function. The multi-objective model is compared with modules according to weighted cost function and testing effort measures in a typical modular testing environment.

  • Speculative Analysis for Quality Assessment of Code Comments
    Pooja Rani

    IEEE
    Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of quality assessment tools for all aspects of comments make their evaluation and maintenance a non-trivial problem. To achieve high-quality comments, we need a deeper understanding of code comment characteristics and the practices developers follow. In this thesis, we approach the problem of assessing comment quality from three different perspectives: what developers ask about commenting practices, what they write in comments, and how researchers support them in assessing comment quality. Our preliminary findings show that developers embed various kinds of information in class comments across programming languages. Still, they face problems in locating relevant guidelines to write consistent and informative comments, verifying the adherence of their comments to the guidelines, and evaluating the overall state of comment quality. To help developers and researchers in building comment quality assessment tools, we provide: (i) an empirically validated taxonomy of comment convention-related questions from various community forums, (ii) an empirically validated taxonomy of comment information types from various programming languages, (iii) a language-independent approach to automatically identify the information types, and (iv) a comment quality taxonomy prepared from a systematic literature review.

  • Makar: A Framework for Multi-source Studies based on Unstructured Data
    Mathias Birrer, Pooja Rani, Sebastiano Panichella, and Oscar Nierstrasz

    IEEE
    To perform various development and maintenance tasks, developers frequently seek information on various sources such as mailing lists, Stack Overflow (SO), and Quora. Researchers analyze these sources to understand developer information needs in these tasks. However, extracting and preprocessing unstructured data from various sources, building and maintaining a reusable dataset is often a time-consuming and iterative process. Additionally, the lack of tools for automating this data analysis process complicates the task to reproduce previous results or datasets. To address these concerns we propose Makar, which provides various data extraction and preprocessing methods to support researchers in conducting reproducible multi-source studies. To evaluate Makar, we conduct a case study that analyzes code comment related discussions from SO, Quora, and mailing lists. Our results show that Makar is helpful for preparing reproducible datasets from multiple sources with little effort, and for identifying the relevant data to answer specific research questions in a shorter time compared to state-of-the-art tools, which is of critical importance for studies based on unstructured data. Tool webpage: https://github.com/maethub/makar

  • Do Comments follow Commenting Conventions? A Case Study in Java and Python
    Pooja Rani, Suada Abukar, Nataliia Stulova, Alexandre Bergel, and Oscar Nierstrasz

    IEEE

  • What Do Developers Discuss about Code Comments?
    Pooja Rani, Mathias Birrer, Sebastiano Panichella, Mohammad Ghafari, and Oscar Nierstrasz

    IEEE

  • A neuro-particle swarm optimization logistic model fitting algorithm for software reliability analysis
    Pooja Rani and GS Mahapatra

    SAGE Publications
    This article develops a particle swarm optimization algorithm based on a feed-forward neural network architecture to fit software reliability growth models. We employ adaptive inertia weight within the proposed particle swarm optimization in consideration of learning algorithm. The dynamic adaptive nature of proposed prior best particle swarm optimization prevents the algorithm from becoming trapped in local optima. These neuro-prior best particle swarm optimization algorithms were applied to a popular flexible logistic growth curve as the [Formula: see text] model based on the weights derived by the artificial neural network learning algorithm. We propose the prior best particle swarm optimization algorithm to train the network for application to three different software failure data sets. The new search strategy improves the rate of convergence because it retains information on the prior particle, thereby enabling better predictions. The results are verified through testing approaches of constant, modified, and linear inertia weight. We assess the fitness of each particle according to the normalized root mean squared error which updates the best particle and velocity to accelerate convergence to an optimal solution. Experimental results demonstrate that the proposed [Formula: see text] model based prior best Particle Swarm Optimization based on Neural Network (pPSONN) improves predictive quality over the [Formula: see text], [Formula: see text], and existing model.

  • A single change point hazard rate software reliability model with imperfect debugging
    Pooja Rani and G.S. Mahapatra

    IEEE
    Software reliability models with imperfect debugging can characterize the quality of software fault removal, in addition to the rate of faults discovered. Change points are therefore useful to quantify such changes in the fault discovery rate. To address these issues, this paper develops a hazard rate model based on the widely studied Jelinski-Moranda model, introducing two additional parameters, namely an imperfect debugging parameter and a single change point parameter. We compare the proposed model with simpler models to show these additional parameters are justifiable. Both information theoretic and predictive measures of goodness of fit are applied to demonstrate these additional parameters are appropriate on some data sets.
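
To make the model structure described in "A single change point hazard rate software reliability model with imperfect debugging" more concrete, the following is a minimal Python sketch of a Jelinski-Moranda-style hazard rate extended with an imperfect-debugging probability and a single change point. The parameter names (N, phi1, phi2, p, tau) and the exact form are illustrative assumptions, not the paper's formulation.

    # Toy Jelinski-Moranda-style hazard rate with an imperfect-debugging
    # probability and a single change point. Parameter names are illustrative
    # assumptions, not the paper's exact formulation.

    def hazard_rate(i, N, phi1, phi2, p, tau):
        """Hazard rate before observing the i-th failure (i = 1, 2, ...).

        N    : assumed initial number of faults
        phi1 : per-fault detection rate before the change point
        phi2 : per-fault detection rate after the change point
        p    : probability that a detected fault is actually removed
        tau  : failure index at which the detection rate changes
        """
        phi = phi1 if i <= tau else phi2
        remaining = N - p * (i - 1)        # expected faults still in the software
        return phi * max(remaining, 0.0)

    # Example: hazard for the first ten failure intervals under arbitrary values.
    for i in range(1, 11):
        print(i, round(hazard_rate(i, N=50, phi1=0.05, phi2=0.03, p=0.9, tau=6), 4))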

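Similarly, the particle swarm optimization papers above ("Entropy based enhanced particle swarm optimization..." and "A neuro-particle swarm optimization logistic model fitting algorithm...") build on PSO with an adaptive inertia weight. The sketch below shows only that general idea: a plain PSO with a linearly decreasing inertia weight fitting a three-parameter logistic growth curve to invented cumulative-failure data by minimizing normalized RMSE. It is not the pPSONN or EPSO algorithm from the papers; all data, bounds, and constants are assumptions for illustration.

    # Minimal PSO with a linearly decreasing (adaptive) inertia weight, fitting a
    # logistic growth curve mu(t) = a / (1 + b*exp(-c*t)) to toy failure data.
    import math
    import random

    random.seed(1)

    t = list(range(1, 21))                           # observation times (toy data)
    y = [3, 6, 10, 15, 21, 28, 34, 40, 45, 49,
         52, 55, 57, 58, 59, 60, 60, 61, 61, 62]     # cumulative failures (toy data)

    def logistic(params, ti):
        a, b, c = params
        return a / (1.0 + b * math.exp(-c * ti))

    def nrmse(params):
        mse = sum((logistic(params, ti) - yi) ** 2 for ti, yi in zip(t, y)) / len(y)
        return math.sqrt(mse) / (max(y) - min(y))    # normalized root mean squared error

    def pso(n_particles=30, iters=200, w_max=0.9, w_min=0.4, c1=2.0, c2=2.0):
        bounds = [(1.0, 200.0), (0.1, 100.0), (0.01, 2.0)]   # search ranges for a, b, c
        pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
        vel = [[0.0, 0.0, 0.0] for _ in range(n_particles)]
        pbest = [p[:] for p in pos]                  # personal best positions
        pbest_val = [nrmse(p) for p in pos]
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        gbest, gbest_val = pbest[g][:], pbest_val[g] # global best

        for it in range(iters):
            w = w_max - (w_max - w_min) * it / iters # inertia decreases over iterations
            for i in range(n_particles):
                for d in range(3):
                    r1, r2 = random.random(), random.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    lo, hi = bounds[d]
                    pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
                val = nrmse(pos[i])
                if val < pbest_val[i]:
                    pbest[i], pbest_val[i] = pos[i][:], val
                    if val < gbest_val:
                        gbest, gbest_val = pos[i][:], val
        return gbest, gbest_val

    best, err = pso()
    print("fitted (a, b, c):", [round(x, 3) for x in best], "NRMSE:", round(err, 4))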

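Finally, the metamorphic-testing papers above ("Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software" and "Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?") rely on few-shot prompting of an LLM to extract variables from user manuals. The sketch below only illustrates how such a few-shot prompt could be assembled and its answer parsed; the example excerpts, variable names, and the llm callable are hypothetical and not taken from the papers' pipeline.

    # Illustrative few-shot prompt assembly for pulling variable names out of a
    # user-manual excerpt. Example excerpts, variable names, and the `llm`
    # callable are hypothetical; this is not the pipeline used in the papers.

    FEW_SHOT_EXAMPLES = [
        {"manual": "The solver reads the grid spacing DX and the time step DT "
                   "from the configuration file and writes the field TEMP.",
         "variables": ["DX", "DT", "TEMP"]},
        {"manual": "Set RHO0 to the reference density; the program reports the "
                   "mass flux QMASS at every output interval.",
         "variables": ["RHO0", "QMASS"]},
    ]

    def build_prompt(manual_excerpt):
        """Assemble a few-shot prompt asking the model to list variable names."""
        parts = ["Extract the input/output variable names mentioned in each excerpt."]
        for ex in FEW_SHOT_EXAMPLES:
            parts.append("Excerpt: " + ex["manual"] + "\nVariables: " + ", ".join(ex["variables"]))
        parts.append("Excerpt: " + manual_excerpt + "\nVariables:")
        return "\n\n".join(parts)

    def extract_variables(manual_excerpt, llm):
        """`llm` is any callable mapping a prompt string to a completion string."""
        completion = llm(build_prompt(manual_excerpt))
        return [v.strip() for v in completion.split(",") if v.strip()]

    # Usage: extract_variables(excerpt, llm=my_model_call), then compare the
    # returned list against expert-labelled ground truth (exact/partial matches).
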
RECENT SCHOLAR PUBLICATIONS

  • How does Simulation-based Testing for Self-driving Cars match Human Perception?
    C Birchler, TK Mohammed, P Rani, T Nechita, T Kehrer, S Panichella
    arXiv preprint arXiv:2401.14736 2024

  • Energy Patterns for Web: An Exploratory Study
    P Rani, J Zellweger, V Kousadianos, L Cruz, T Kehrer, A Bacchelli
    arXiv preprint arXiv:2401.06482 2024

  • Beyond Code: Is There a Difference between Comments in Visual and Textual Languages?
    A Boll, P Rani, A Schultheiß, T Kehrer
    Available at SSRN 4650661 2024

  • Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
    C Tsigkanos, P Rani, S Müller, T Kehrer
    International Conference on Computational Science, 321-335 2023

  • Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?
    C Tsigkanos, P Rani, S Müller, T Kehrer
    2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2023

  • LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
    J Niklaus, V Matoshi, P Rani, A Galassi, M Stürmer, I Chalkidis
    arXiv preprint arXiv:2301.13126 2023

  • A decade of code comment quality assessment: A systematic literature review
    P Rani, A Blasi, N Stulova, S Panichella, A Gorla, O Nierstrasz
    Journal of Systems and Software 195, 111515 2023

  • Assessing Comment Quality in Object-Oriented Languages
    P Rani, O Nierstrasz
    Institute of Informatics 2022

  • Can We Automatically Generate Class Comments in Pharo?
    P Rani, A Bergel, L Hess, T Kehrer, O Nierstrasz
    2022

  • What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk
    P Rani, S Panichella, M Leuenberger, M Ghafari, O Nierstrasz
    Empirical Software Engineering 26 (6), 112 2021

  • How to identify class comment types? A multi-language approach for class comment classification
    P Rani, S Panichella, M Leuenberger, A Di Sorbo, O Nierstrasz
    Journal of Systems and Software 181, 111047 2021

  • Do Comments follow Commenting Conventions? A Case Study in Java and Python
    P Rani, S Abukar, N Stulova, A Bergel, O Nierstrasz
    2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM) 2021

  • What do developers discuss about code comments?
    P Rani, M Birrer, S Panichella, M Ghafari, O Nierstrasz
    2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM) 2021

  • Adherence of class comments to style guidelines
    P Rani, S Abukar, O Nierstrasz
    Bachelor's thesis, University of Bern 2021

  • Tool Support for Commenting Conventions
    N Stulova, M Dooley, P Rani, O Nierstrasz
    Bachelor's thesis, University of Bern 2021

  • Generating automatically class comments in Pharo
    P Rani, L Hess, O Nierstrasz
    Bachelor's thesis, University of Bern 2021

  • Speculative Analysis for Quality Assessment of Code Comments
    P Rani
    2021 IEEE/ACM 43rd International Conference on Software Engineering 2021

  • Makar: A Framework for Multi-source Studies based on Unstructured Data
    M Birrer, P Rani, S Panichella, O Nierstrasz
    2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2021

  • Analysis of Developer Information Needs on Collaborative Platforms
    M Birrer, O Nierstrasz, P Rani
    Master's thesis, University of Bern 2020

  • Software Developers’ Information Needs
    J Richner, P Rani, O Nierstrasz
    2019

MOST CITED SCHOLAR PUBLICATIONS

  • How to identify class comment types? A multi-language approach for class comment classification
    P Rani, S Panichella, M Leuenberger, A Di Sorbo, O Nierstrasz
    Journal of Systems and Software 181, 111047 2021
    Citations: 36

  • LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
    J Niklaus, V Matoshi, P Rani, A Galassi, M Stürmer, I Chalkidis
    arXiv preprint arXiv:2301.13126 2023
    Citations: 23

  • What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk
    P Rani, S Panichella, M Leuenberger, M Ghafari, O Nierstrasz
    Empirical Software Engineering 26 (6), 112 2021
    Citations: 17

  • A decade of code comment quality assessment: A systematic literature review
    P Rani, A Blasi, N Stulova, S Panichella, A Gorla, O Nierstrasz
    Journal of Systems and Software 195, 111515 2023
    Citations: 13

  • What do developers discuss about code comments?
    P Rani, M Birrer, S Panichella, M Ghafari, O Nierstrasz
    2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM) 2021
    Citations: 13

  • Do Comments follow Commenting Conventions? A Case Study in Java and Python
    P Rani, S Abukar, N Stulova, A Bergel, O Nierstrasz
    2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM) 2021
    Citations: 12

  • Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?
    C Tsigkanos, P Rani, S Müller, T Kehrer
    2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2023
    Citations: 6

  • Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
    C Tsigkanos, P Rani, S Müller, T Kehrer
    International Conference on Computational Science, 321-335 2023
    Citations: 5

  • Speculative Analysis for Quality Assessment of Code Comments
    P Rani
    2021 IEEE/ACM 43rd International Conference on Software Engineering 2021
    Citations: 3

  • Generating automatically class comments in Pharo
    P Rani, L Hess, O Nierstrasz
    Bachelor's thesis, University of Bern 2021
    Citations: 2

  • Makar: A Framework for Multi-source Studies based on Unstructured Data
    M Birrer, P Rani, S Panichella, O Nierstrasz
    2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2021
    Citations: 2

  • Analysis of Developer Information Needs on Collaborative Platforms
    M Birrer, O Nierstrasz, P Rani
    Master's thesis, University of Bern 2020
    Citations: 2

  • Software Developers’ Information Needs
    J Richner, P Rani, O Nierstrasz
    2019
    Citations: 1