I have worked on dialogue systems as a research scholar in the Human-Computer Interaction department of the Indian Institute of Information Technology Allahabad (IIITA), India, under Prof. U.S. Tiwary, and I am also a member of the SILP (Speech, Image and Language Processing) Lab.
My work focuses on natural language understanding and dialogue state tracking. I am enthusiastic about working in artificial intelligence, machine learning, applied mathematics and statistics.
I have been working as an AI Scientist at Uniphore Private Limited since Sept 2022.
EDUCATION
BTech, CSE department, Guru Ghasidas Vishwavidyalaya (Central University)
MTech-PhD, IT department, Indian Institute of Information Technology Allahabad
Soft clustering and interval type-2 fuzzy set based inference strategy for I.T. personnel selection. Rohit Mishra, Shrikant Malviya, Rudra Chandra Ghosh, Uma Shanker Tiwary. Journal of Intelligent and Fuzzy Systems, 2022. Impreciseness and uncertainty are the fabrics that make life interesting. For decades, human beings have developed strategies to cope with uncertainties and automate them. In personnel selection for the I.T. field, selectors often find it very difficult to select candidates by going through a set of resumes listing similar skills. Hence, the selection task becomes a fuzzy decision-making problem under uncertainty. A combination of fuzzy clustering and Interval Type-2 fuzzy sets (IT2FS) is proposed for such scenarios. An experiment is conducted over a dataset of fifteen hundred resumes for a particular job description. First, Fuzzy C-means clustering (FCM) is applied for selective clustering, while decision-making under uncertainty is carried out through IT2FS. The candidates in the selected cluster are given a score for ranking as per the skillset criteria. The final decision for shortlisting the resumes is carried out through IT2FS. The model shows an average accuracy of 88.2% with an F1-score of 0.76, compared to the K-means + IT2FS model with an F1-score of 0.72. Thus, the proposed model performs better at decision-making under uncertainty.
Detection of Semantically Equivalent Question Pairs. Reetu Kumari, Rohit Mishra, Shrikant Malviya, Uma Shanker Tiwary. Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2021.
Analysis of Word-level Embeddings for Indic Languages on AI4Bharat-IndicNLP Corpora. Dipam Goswami, Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary. 2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON 2021), 2021. This paper presents an analysis of non-contextual word embeddings trained on the AI4Bharat-IndicNLP corpus, which contains 2.7 billion words covering 10 Indian languages. We share the pre-trained embeddings for research and development in Indic languages. These embeddings are evaluated on several tasks, such as word similarity, analogy evaluation, and classification on multiple datasets. The analysis of word embeddings is expected to give researchers a better understanding of the Indic languages. We show that Word2Vec skip-gram and FastText skip-gram embeddings are the best performing models for NLP tasks on Indic languages. All the embeddings are made freely available.
HDRS: Hindi Dialogue Restaurant Search Corpus for Dialogue State Tracking in Task-Oriented Environment. Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, Uma Shanker Tiwary. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021. Due to the rapid increase in the development of task-oriented dialogue systems, the need for labelled dialogue corpora has become inevitable. For the Hindi language, no such dialogue corpus is yet available. As a first attempt, we release a Hindi Dialogue Restaurant Search (HDRS) corpus and compare various state-of-the-art dialogue state tracking (DST) models on it. The corpus consists of 1.4k human-to-human typed dialogues collected using the Wizard-of-Oz paradigm. The paper starts with a brief description of the corpus, providing details of its features, the corpus collection process and a statistical analysis; then the performance of baseline NLU and DST models is investigated. Further, we experimented with two categories of state-of-the-art belief state trackers: (1) non-contextual pre-trained word embedding based DST models; (2) contextual pre-trained BERT based DST models. All belief trackers follow a three-layered generic architecture. The category-1 models use a static domain ontology, while category-2 models can handle a dynamic ontology. The DST models are compared on joint-goal and turn-request accuracy. The Global encoder and Slot-ATtentive decoders (GSAT) model outperforms all other models with 83.25% joint-goal accuracy, followed by SUMBT.
RNN Based Language Generation Models for a Hindi Dialogue System. Sumit Singh, Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, Uma Shanker Tiwary. Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2020.
Structural analysis of Hindi phonetics and a method for extraction of phonetically rich sentences from a very large Hindi text corpus. Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary. 2016 Conference of the Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA 2016), 2017. Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction (HCI) nowadays. A set of phonetically rich sentences is important for developing these two interactive modules of HCI. Essentially, the set of phonetically rich sentences has to cover all possible phone units, distributed uniformly. Selecting such a set from a big corpus while maintaining similarity in phonetic characteristics is still a challenging problem. The major objective of this paper is to devise a criterion for selecting a set of sentences that encompasses all phonetic aspects of a corpus while being as small as possible. First, this paper presents a statistical analysis of Hindi phonetics by observing its structural characteristics. Further, a two-stage algorithm is proposed to extract phonetically rich sentences with a high variety of triphones from the EMILLE Hindi corpus. The algorithm uses a distance-measuring criterion to select a sentence so as to improve the triphone distribution. Moreover, a special preprocessing method is proposed that scores each triphone in terms of inverse probability in order to speed up the algorithm. The results show that the approach efficiently builds a uniformly distributed, phonetically rich corpus with an optimum number of sentences.
RECENT SCHOLAR PUBLICATIONS
A framework for quality assessment of synthesised speech using learning-based objective evaluation. S Malviya, R Mishra, SK Barnwal, US Tiwary. International Journal of Speech Technology 26 (1), 221-243, 2023. Citations: 2
Multi-attribute decision making application using hybridly modelled Gaussian interval type-2 fuzzy sets with uncertain mean. R Mishra, S Malviya, S Singh, V Singh, US Tiwary. Multimedia Tools and Applications 82 (4), 4913-4940, 2023. Citations: 14
Soft clustering and interval type-2 fuzzy set based inference strategy for IT personnel selection. R Mishra, S Malviya, RC Ghosh, US Tiwary. Journal of Intelligent & Fuzzy Systems 42 (6), 5351-5359, 2022. Citations: 1
Analysis of word-level embeddings for Indic Languages on AI4Bharat-IndicNLP Corpora. D Goswami, S Malviya, R Mishra, US Tiwary. 2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical …, 2021. Citations: 5
HDRS: Hindi dialogue restaurant search corpus for dialogue state tracking in task-oriented environment. S Malviya, R Mishra, SK Barnwal, US Tiwary. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 2517-2528, 2021. Citations: 12
Detection of Semantically Equivalent Question Pairs. R Kumari, R Mishra, S Malviya, US Tiwary. 12th International Conference on Intelligent Human Computer Interaction …, 2021. Citations: 5
Computing with Words Through Interval Type-2 Fuzzy Sets for Decision Making Environment. R Mishra, SK Barnwal, S Malviya, V Singh, P Singh, S Singh, US Tiwary. 11th International Conference Intelligent Human Computer Interaction (IHCI …, 2020. Citations: 3
RNN Based Language Generation Models for a Hindi Dialogue System. S Singh, S Malviya, R Mishra, S Kumar Barnwal, U Shanker Tiwary. International Conference on Intelligent Human Computer Interaction (IHCI …, 2020. Citations: 8
Prosodic feature selection of personality traits for job interview performance. R Mishra, SK Barnwal, S Malviya, P Mishra, US Tiwary. International conference on intelligent systems design and applications, 673-682, 2018. Citations: 7
Sentiment analysis: An empirical comparative study of various machine learning approaches. S Jain, S Malviya, R Mishra, US Tiwary. Proceedings of the 14th International Conference on Natural Language …, 2017. Citations: 17
Structural analysis of Hindi phonetics and a method for extraction of phonetically rich sentences from a very large Hindi text corpus. S Malviya, R Mishra, US Tiwary. 2016 Conference of The Oriental Chapter of International Committee for …, 2016. Citations: 18
MOST CITED SCHOLAR PUBLICATIONS
Structural analysis of Hindi phonetics and a method for extraction of phonetically rich sentences from a very large Hindi text corpus. S Malviya, R Mishra, US Tiwary. 2016 Conference of The Oriental Chapter of International Committee for …, 2016. Citations: 18
Sentiment analysis: An empirical comparative study of various machine learning approaches. S Jain, S Malviya, R Mishra, US Tiwary. Proceedings of the 14th International Conference on Natural Language …, 2017. Citations: 17
Multi-attribute decision making application using hybridly modelled Gaussian interval type-2 fuzzy sets with uncertain mean. R Mishra, S Malviya, S Singh, V Singh, US Tiwary. Multimedia Tools and Applications 82 (4), 4913-4940, 2023. Citations: 14
HDRS: Hindi dialogue restaurant search corpus for dialogue state tracking in task-oriented environment. S Malviya, R Mishra, SK Barnwal, US Tiwary. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 2517-2528, 2021. Citations: 12
RNN Based Language Generation Models for a Hindi Dialogue System. S Singh, S Malviya, R Mishra, S Kumar Barnwal, U Shanker Tiwary. International Conference on Intelligent Human Computer Interaction (IHCI …, 2020. Citations: 8
Prosodic feature selection of personality traits for job interview performance. R Mishra, SK Barnwal, S Malviya, P Mishra, US Tiwary. International conference on intelligent systems design and applications, 673-682, 2018. Citations: 7
Analysis of word-level embeddings for Indic Languages on AI4Bharat-IndicNLP Corpora. D Goswami, S Malviya, R Mishra, US Tiwary. 2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical …, 2021. Citations: 5
Detection of Semantically Equivalent Question Pairs. R Kumari, R Mishra, S Malviya, US Tiwary. 12th International Conference on Intelligent Human Computer Interaction …, 2021. Citations: 5
Computing with Words Through Interval Type-2 Fuzzy Sets for Decision Making Environment. R Mishra, SK Barnwal, S Malviya, V Singh, P Singh, S Singh, US Tiwary. 11th International Conference Intelligent Human Computer Interaction (IHCI …, 2020. Citations: 3
A framework for quality assessment of synthesised speech using learning-based objective evaluation. S Malviya, R Mishra, SK Barnwal, US Tiwary. International Journal of Speech Technology 26 (1), 221-243, 2023. Citations: 2
Soft clustering and interval type-2 fuzzy set based inference strategy for IT personnel selection. R Mishra, S Malviya, RC Ghosh, US Tiwary. Journal of Intelligent & Fuzzy Systems 42 (6), 5351-5359, 2022. Citations: 1
INDUSTRY EXPERIENCE
1 year and 4 months (current) at Uniphore Private Limited, Bangalore