Sawan Rai

@srmap.edu.in

Assistant Professor CSE
SRM University-AP

https://researchid.co/sawan777

RESEARCH, TEACHING, or OTHER INTERESTS

Software, Artificial Intelligence, Computer Science, Computer Science Applications

Scopus Publications

179

Scholar Citations

Scholar h-index

Scholar i10-index

Scopus Publications

A Review of Existing Conversational Recommendation Systems
Subiya Zaidi, Sawan Rai, and Kapil Juneja
IEEE
ChatGPT, Alexa, Siri, and Okay Google are an indispensable part of our lives today. These assistants are referred to as Digital Assistants and enable users to communicate their choices through natural language. Digital Assistants ease the customer's task of selecting items in applications such as movies and songs. A system that supports making a choice through natural language conversation is known as a Conversational Recommender System (CoRS). A CoRS is a dialogue-based model that aims to provide the customer with accurate, high-quality recommendations. This interaction-oriented method gives the customer an edge over the traditional way of seeking recommendations. Traditional recommendation systems are static in nature and derive information from the customer's past history. A CoRS mitigates challenges faced by earlier recommendation methods, such as cold start, wherein a new user is often given inaccurate recommendations. Other common issues are data sparsity and a lack of diversity caused by outdated content to choose from. A CoRS is dynamic in nature: it works toward delivering high-quality choices by interpreting the customer's demands one dialogue at a time. This comprehensive survey aims to give an overview of ongoing research that uses conversation as a means to achieve better results for recommendation systems.

Large scale annotated dataset for code-mix abusive short noisy text
Paras Tiwari, Sawan Rai, and C. Ravindranath Chowdary
Springer Science and Business Media LLC

Accurate module name prediction using similarity based and sequence generation models
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC

Extractive text summarization using clustering-based topic modeling
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Springer Science and Business Media LLC

Is the Corpus Ready for Machine Translation? A Case Study with Python to Pseudo-Code Corpus
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC

Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text
Sawan Rai, Ramesh Chandra Belwal, and Abhinav Sharma
Springer Nature Singapore

A Mathematical Model for the Effect of Vaccination on COVID-19 Epidemic Spread
Avaneesh Singh, Sawan Rai, and Manish Kumar Bajpai
Springer Nature Singapore

Advanced Hierarchical Topic Labeling for Short Text
Paras Tiwari, Ashutosh Tripathi, Avaneesh Singh, and Sawan Rai
Institute of Electrical and Electronics Engineers (IEEE)
Hierarchical Topic Modeling is a probabilistic approach for discovering latent topics distributed hierarchically among documents. The distributed topics are represented by their respective topic terms. Drawing an unambiguous conclusion from the topic-term distribution is a challenge for readers. Hierarchical topic labeling eases this challenge by providing an individual, appropriate label for each topic at every level. In this work, we propose a BERT-embedding inspired methodology for labeling hierarchical topics in short text corpora. Short texts have gained significant popularity on multiple platforms in diverse domains. The limited information available in a short text makes it difficult to deal with. In our work, we have used three diverse short text datasets that include both structured and unstructured instances. Such diversity ensures the broad application scope of this work. Considering the relevancy factor of the labels, the proposed methodology has been compared against both automatic and human annotators. Our proposed methodology outperformed the benchmark with an average score of 0.4185, 49.50, and 49.16 for cosine similarity, exact match, and partial match, respectively.

Generating class name in sequential manner using convolution attention neural network
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Elsevier BV

A Review on Source Code Documentation
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Association for Computing Machinery (ACM)
Context: Coding is an incremental activity in which a developer may need to understand code before making suitable changes to it. Code documentation is considered one of the best practices in software development but requires significant effort from developers. Recent advances in natural language processing and machine learning have provided enough motivation to devise automated approaches for source code documentation at multiple levels. Objective: The review aims to study current code documentation practices and analyze the existing literature to provide a perspective on their preparedness to address the stated problem and the challenges that lie ahead. Methodology: We provide a detailed account of the literature in the area of automated source code documentation at different levels and critically analyze the effectiveness of the proposed approaches. This also allows us to infer gaps and challenges to address the problem at different levels. Findings: (1) The research community has focused on method-level summarization. (2) Deep learning has dominated the past five years of this research field. (3) Researchers are regularly proposing bigger corpora for source code documentation. (4) Java and Python are the programming languages widely used for corpora. (5) Bilingual Evaluation Understudy is the most favored evaluation metric among researchers.

Effect of Identifier Tokenization on Automatic Source Code Documentation
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC

A new graph-based extractive text summarization using keywords or topic modeling
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Springer Science and Business Media LLC

Text summarization using topic-based vector space model and semantic measure
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Elsevier BV

Development of web browser prototype with embedded classification capability for mitigating Cross-Site Scripting attacks
Vikas K. Malviya, Sawan Rai, and Atul Gupta
Elsevier BV

Mind Your Tweet: Abusive Tweet Detection
Paras Tiwari and Sawan Rai
Springer International Publishing

Development of a plugin based extensible feature extraction framework
Vikas Malviya, Sawan Rai, and Atul Gupta
ACM
An important ingredient of a successful recipe for solving machine learning problems is the availability of a suitable dataset. However, such a dataset may have to be extracted from large unstructured and semi-structured data such as programming code, scripts, and text. In this work, we propose a plug-in based, extensible feature extraction framework, which we have prototyped as a tool. The proposed framework is demonstrated by extracting features from two different sources of semi-structured and unstructured data. The semi-structured data comprised web page and script based data, whereas the unstructured data was taken from email data for spam filtering. The usefulness of the tool was also assessed in terms of ease of programming.

Method Level Text Summarization for Java Code Using Nano-Patterns
Sawan Rai, Tejaswini Gaikwad, Sparshi Jain, and Atul Gupta
IEEE
Rapid growth in automated solutions has resulted in large code bases being quickly developed and consumed. However, maintaining code and its subsequent reuse pose some challenges. One of the best practices used to handle such issues is to provide a suitable text summary of the code so that human developers can comprehend it easily, but this can be quite a time-consuming and costly affair. A few efforts have been made in this direction, where the text summary of the code is generated either from the method signature or from its body. In this paper, we propose a text summarization approach for Java code that identifies code-level nano-patterns to obtain a text summary. The approach also looks for associations between these nano-patterns in a Java method and then uses template-based text generation to obtain the final text summary of the Java method. We evaluated the summaries generated by the proposed approach in a controlled experiment against three other existing approaches. Our results suggested that the summary generated by our approach was better on the completeness and correctness criteria. The feedback obtained during the experimental validation suggested additional inputs to improve the generated text summary on the other two criteria as well.

RECENT SCHOLAR PUBLICATIONS

Large scale annotated dataset for code-mix abusive short noisy text
P Tiwari, S Rai, CR Chowdary
Language Resources and Evaluation, 1-28 2024

Accurate module name prediction using similarity based and sequence generation models
S Rai, RC Belwal, A Gupta
Journal of Ambient Intelligence and Humanized Computing 14 (9), 11531-11543 2023

A Mathematical Model for the Effect of Vaccination on COVID-19 Epidemic Spread
A Singh, S Rai, MK Bajpai
Machine Vision and Augmented Intelligence: Select Proceedings of MAI 2022 2023

Advanced hierarchical topic labeling for short text
P Tiwari, A Tripathi, A Singh, S Rai
IEEE Access 2023

Extractive text summarization using clustering-based topic modeling
RC Belwal, S Rai, A Gupta
Soft Computing 27 (7), 3965-3982 2023

Is the corpus ready for machine translation? A case study with Python to pseudo-code corpus
S Rai, RC Belwal, A Gupta
Arabian Journal for Science and Engineering 48 (2), 1845-1858 2023

Investigating the Application of Multi-lingual Transformer in Graph-Based Extractive Text Summarization for Hindi Text
S Rai, RC Belwal, A Sharma
International Conference on Data Management, Analytics & Innovation, 393-403 2023

Generating class name in sequential manner using convolution attention neural network
S Rai, RC Belwal, A Gupta
Expert Systems with Applications 199, 116854 2022

A review on source code documentation
S Rai, RC Belwal, A Gupta
ACM Transactions on Intelligent Systems and Technology (TIST) 13 (5), 1-44 2022

Effect of identifier tokenization on automatic source code documentation
S Rai, RC Belwal, A Gupta
Arabian Journal for Science and Engineering 47 (2), 2141-2157 2022

A new graph-based extractive text summarization using keywords or topic modeling
RC Belwal, S Rai, A Gupta
Journal of Ambient Intelligence and Humanized Computing 12 (10), 8975-8990 2021

Mind your tweet: Abusive tweet detection
P Tiwari, S Rai
International Conference on Speech and Computer, 704-715 2021

Text summarization using topic-based vector space model and semantic measure
RC Belwal, S Rai, A Gupta
Information Processing & Management 58 (3), 102536 2021

Development of web browser prototype with embedded classification capability for mitigating Cross-Site Scripting attacks
VK Malviya, S Rai, A Gupta
Applied Soft Computing 102, 106873 2021

Generation of pseudo code from the python source code using rule-based machine translation
S Rai, A Gupta
arXiv preprint arXiv:1906.06117 2019

Development of a plugin based extensible feature extraction framework
V Malviya, S Rai, A Gupta
Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 1840-1847 2018

Method level text summarization for java code using nano-patterns
S Rai, T Gaikwad, S Jain, A Gupta
2017 24th Asia-Pacific Software Engineering Conference (APSEC), 199-208 2017

Method level text summarization for java code using nano-patterns. In 2017 24th Asia-Pacific Software Engineering Conference (APSEC)
S Rai, T Gaikwad, S Jain, A Gupta
IEEE, 199-208 2017

MOST CITED SCHOLAR PUBLICATIONS

Text summarization using topic-based vector space model and semantic measure
RC Belwal, S Rai, A Gupta
Information Processing & Management 58 (3), 102536 2021
Citations: 60

A new graph-based extractive text summarization using keywords or topic modeling
RC Belwal, S Rai, A Gupta
Journal of Ambient Intelligence and Humanized Computing 12 (10), 8975-8990 2021
Citations: 41

Development of web browser prototype with embedded classification capability for mitigating Cross-Site Scripting attacks
VK Malviya, S Rai, A Gupta
Applied Soft Computing 102, 106873 2021
Citations: 16

A review on source code documentation
S Rai, RC Belwal, A Gupta
ACM Transactions on Intelligent Systems and Technology (TIST) 13 (5), 1-44 2022
Citations: 15

Method level text summarization for java code using nano-patterns
S Rai, T Gaikwad, S Jain, A Gupta
2017 24th Asia-Pacific Software Engineering Conference (APSEC), 199-208 2017
Citations: 11

Generation of pseudo code from the python source code using rule-based machine translation
S Rai, A Gupta
arXiv preprint arXiv:1906.06117 2019
Citations: 7

Extractive text summarization using clustering-based topic modeling
RC Belwal, S Rai, A Gupta
Soft Computing 27 (7), 3965-3982 2023
Citations: 6

Method level text summarization for java code using nano-patterns. In 2017 24th Asia-Pacific Software Engineering Conference (APSEC)
S Rai, T Gaikwad, S Jain, A Gupta
IEEE, 199-208 2017
Citations: 5

Is the corpus ready for machine translation? A case study with Python to pseudo-code corpus
S Rai, RC Belwal, A Gupta
Arabian Journal for Science and Engineering 48 (2), 1845-1858 2023
Citations: 4

Advanced hierarchical topic labeling for short text
P Tiwari, A Tripathi, A Singh, S Rai
IEEE Access 2023
Citations: 3

Mind your tweet: Abusive tweet detection
P Tiwari, S Rai
International Conference on Speech and Computer, 704-715 2021
Citations: 3

Development of a plugin based extensible feature extraction framework
V Malviya, S Rai, A Gupta
Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 1840-1847 2018
Citations: 3

Effect of identifier tokenization on automatic source code documentation
S Rai, RC Belwal, A Gupta
Arabian Journal for Science and Engineering 47 (2), 2141-2157 2022
Citations: 2

Accurate module name prediction using similarity based and sequence generation models
S Rai, RC Belwal, A Gupta
Journal of Ambient Intelligence and Humanized Computing 14 (9), 11531-11543 2023
Citations: 1

Generating class name in sequential manner using convolution attention neural network
S Rai, RC Belwal, A Gupta
Expert Systems with Applications 199, 116854 2022
Citations: 1