@srmap.edu.in
Assistant Professor CSE
SRM University-AP
Software, Artificial Intelligence, Computer Science, Computer Science Applications
Subiya Zaidi, Sawan Rai, and Kapil Juneja
IEEE
ChatGPT, Alexa, Siri, and Okay Google are an indispensable part of our lives today. These assistants, referred to as Digital Assistants, enable users to communicate their choices through natural language. Digital Assistants ease the customer's task of selecting items in applications such as movies and songs. A system that supports this process of making a choice through natural language conversation is known as a Conversational Recommender System (CoRS). CoRS is a dialogue-based model that aims to provide the customer with accurate, high-quality recommendations. The interaction-oriented method gives the customer an edge over the traditional way of seeking recommendations. Traditional recommendation systems are static in nature and derive information from the customer's past history. A CoRS mitigates challenges faced by earlier recommendation methods, such as cold start, wherein a new user is often given inaccurate recommendations. Other common issues include data sparsity and a lack of diversity caused by outdated content to choose from. CoRS is dynamic in nature: it delivers high-quality choices by interpreting the customer's demands one dialogue at a time. This comprehensive survey gives an overview of ongoing research that uses conversation as a means to achieve better results for recommendation systems.
Paras Tiwari, Sawan Rai, and C. Ravindranath Chowdary
Springer Science and Business Media LLC
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Springer Science and Business Media LLC
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC
Sawan Rai, Ramesh Chandra Belwal, and Abhinav Sharma
Springer Nature Singapore
Avaneesh Singh, Sawan Rai, and Manish Kumar Bajpai
Springer Nature Singapore
Paras Tiwari, Ashutosh Tripathi, Avaneesh Singh, and Sawan Rai
Institute of Electrical and Electronics Engineers (IEEE)
Hierarchical Topic Modeling is a probabilistic approach for discovering latent topics distributed hierarchically among documents. The distributed topics are represented by their respective topic terms. Drawing an unambiguous conclusion from the topic term distribution is a challenge for readers. Hierarchical topic labeling eases this challenge by providing an individual, appropriate label for each topic at every level. In this work, we propose a BERT-embedding-inspired methodology for labeling hierarchical topics in short text corpora. Short texts have gained significant popularity on multiple platforms in diverse domains, but the limited information available in a short text makes it difficult to deal with. In our work, we have used three diverse short text datasets that include both structured and unstructured instances. Such diversity ensures the broad application scope of this work. To assess the relevance of the labels, the proposed methodology has been compared against both automatic and human annotators. Our proposed methodology outperformed the benchmark with average scores of 0.4185, 49.50, and 49.16 for cosine similarity, exact match, and partial match, respectively.
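The three scores named in the abstract can be illustrated with a minimal sketch. The abstract does not define exact match or partial match, so the definitions below (case-insensitive string equality and at least one shared word) are assumptions made for illustration, and the function names are hypothetical. The vectors passed to `cosine_similarity` stand in for label embeddings; no BERT model is loaded here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def exact_match(predicted, reference):
    """Assumed definition: labels are equal ignoring case and outer spaces."""
    return predicted.strip().lower() == reference.strip().lower()


def partial_match(predicted, reference):
    """Assumed definition: labels share at least one word."""
    predicted_words = set(predicted.lower().split())
    reference_words = set(reference.lower().split())
    return bool(predicted_words & reference_words)
```

Under these assumed definitions, a generated label such as "deep learning" would count as a partial match but not an exact match against the reference label "machine learning".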
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Elsevier BV
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Association for Computing Machinery (ACM)
Context: Coding is an incremental activity in which a developer may need to understand code before making suitable changes to it. Code documentation is considered one of the best practices in software development but requires significant effort from developers. Recent advances in natural language processing and machine learning have provided enough motivation to devise automated approaches for source code documentation at multiple levels. Objective: The review aims to study current code documentation practices and analyze the existing literature to provide a perspective on their preparedness to address the stated problem and the challenges that lie ahead. Methodology: We provide a detailed account of the literature in the area of automated source code documentation at different levels and critically analyze the effectiveness of the proposed approaches. This also allows us to infer gaps and challenges in addressing the problem at different levels. Findings: (1) The research community has focused on method-level summarization. (2) Deep learning has dominated the past five years of this research field. (3) Researchers are regularly proposing bigger corpora for source code documentation. (4) Java and Python are the programming languages most widely used in corpora. (5) Bilingual Evaluation Understudy (BLEU) is the evaluation metric most favored by researchers.
Sawan Rai, Ramesh Chandra Belwal, and Atul Gupta
Springer Science and Business Media LLC
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Springer Science and Business Media LLC
Ramesh Chandra Belwal, Sawan Rai, and Atul Gupta
Elsevier BV
Vikas K. Malviya, Sawan Rai, and Atul Gupta
Elsevier BV
Paras Tiwari and Sawan Rai
Springer International Publishing
Vikas Malviya, Sawan Rai, and Atul Gupta
ACM
An important ingredient in a successful recipe for solving machine learning problems is the availability of a suitable dataset. However, such a dataset may have to be extracted from large volumes of unstructured and semi-structured data such as programming code, scripts, and text. In this work, we propose a plug-in based, extensible feature extraction framework, which we have prototyped as a tool. The proposed framework is demonstrated by extracting features from two different sources of semi-structured and unstructured data. The semi-structured data comprised web page and script-based data, whereas the unstructured data was taken from emails for spam filtering. The usefulness of the tool was also assessed in terms of ease of programming.
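The idea of a plug-in based, extensible feature extractor can be sketched as a small registry to which new extractors are added without changing the core loop. This is only an illustration of the general pattern, not the tool described in the abstract; every name here (`register`, `extract_features`, and the three sample features) is hypothetical.

```python
from typing import Callable, Dict

# Each plug-in maps raw text to one numeric feature value.
FeatureExtractor = Callable[[str], float]

_REGISTRY: Dict[str, FeatureExtractor] = {}


def register(name: str):
    """Decorator that adds a feature extractor to the registry."""
    def decorator(func: FeatureExtractor) -> FeatureExtractor:
        _REGISTRY[name] = func
        return func
    return decorator


@register("length")
def length(text: str) -> float:
    return float(len(text))


@register("digit_count")
def digit_count(text: str) -> float:
    return float(sum(ch.isdigit() for ch in text))


@register("url_count")
def url_count(text: str) -> float:
    return float(text.count("http://") + text.count("https://"))


def extract_features(text: str) -> Dict[str, float]:
    """Run every registered plug-in and collect one feature row."""
    return {name: func(text) for name, func in _REGISTRY.items()}
```

Adding a feature for a new data source then amounts to writing one more decorated function; `extract_features` itself stays unchanged.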
Sawan Rai, Tejaswini Gaikwad, Sparshi Jain, and Atul Gupta
IEEE
Rapid growth in automated solutions has resulted in large code bases being quickly developed and consumed. However, maintaining code and subsequently reusing it pose challenges. One of the best practices for handling such issues is to provide a suitable text summary of the code so that developers can comprehend it easily, but this can be a time-consuming and costly affair. A few efforts have been made in this direction, where the text summary of the code is generated either from the method signature or from its body. In this paper, we propose a text summarization approach for Java code that identifies code-level nano-patterns to obtain a text summary. The approach also looks for associations between these nano-patterns in a Java method and then uses template-based text generation to obtain the final text summary of the method. We evaluated the summaries generated by the proposed approach in a controlled experiment against three other existing approaches. Our results suggested that the summaries generated by our approach were better on the completeness and correctness criteria. The feedback obtained during the experimental validation suggested additional inputs to improve the generated text summary on the other two accounts as well.
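The template-based generation step can be pictured with a small sketch. It assumes the nano-patterns of a method have already been detected and are supplied as a list of names; the pattern names, the phrases in `TEMPLATES`, and the `summarize` function are illustrative assumptions, not the templates or detection logic used in the paper.

```python
# Hypothetical phrase templates keyed by nano-pattern name.
TEMPLATES = {
    "NoParams": "takes no parameters",
    "NoReturn": "returns nothing",
    "Looping": "iterates in a loop",
    "FieldWriter": "updates an object field",
    "Exceptions": "may throw an exception",
}


def summarize(method_name, patterns):
    """Join the phrases of the detected nano-patterns into one sentence."""
    phrases = [TEMPLATES[p] for p in patterns if p in TEMPLATES]
    if not phrases:
        return f"The method {method_name} has no recognized nano-patterns."
    if len(phrases) == 1:
        body = phrases[0]
    else:
        body = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return f"The method {method_name} {body}."


print(summarize("reset", ["NoParams", "FieldWriter"]))
```

Pattern names that have no template are skipped in this sketch, so an unknown pattern never breaks the sentence.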