Sukomal Pal

@iitbhu.ac.in

Assistant Professor, CSE
IIT (BHU), Varanasi, India



              

https://researchid.co/sukomalpal

RESEARCH INTERESTS

Information Retrieval, Recommender Systems, Text Mining, Data Science

74

Scopus Publications

  • Sentiment analysis on Hindi tweets during COVID-19 pandemic
    Anita Saroj, Akash Thakur, and Sukomal Pal

    Wiley
    Abstract A lack of social interaction has created a gap among people. The physical void has led to an increase in online interaction among users on social media platforms. Sentiment analysis of such interactions can help us analyze general public psychology during the pandemic. However, the lack of data in non‐English and low‐resource languages like ‘Hindi’ makes it difficult to study among native and non‐English speaking masses. Here, we create a small collection of 10,011 ‘Hindi’ tweets on COVID‐19 posted during the pandemic for sentiment analysis, named Sentiment Analysis for Hindi (SAFH). In this article, we describe the process of collecting, creating, and annotating the corpus, and of sentiment classification. The claims have been verified using different word embeddings with a deep learning classifier through the proposed model. The proposed model achieves an accuracy of up to 90.9%.

  • Diversified recommendation using implicit content node embedding in heterogeneous information network
    Naina Yadav, Sukomal Pal, and Anil Kumar Singh

    Springer Science and Business Media LLC

  • Water chicken swarm optimization-based deep segmental neural network for spoken term detection using bayesian filtering
    Sushil Venkatesh Kulkarni and Sukomal Pal

    Springer Science and Business Media LLC

  • Effect of Stopwords and Stemming Techniques in Urdu IR
    Siba Sankar Sahu, Debrup Dutta, Sukomal Pal, and Imran Rasheed

    Springer Science and Business Media LLC

  • The Effect of Stopword Removal on Information Retrieval for Code-Mixed Data Obtained Via Social Media
    Supriya Chanda and Sukomal Pal

    Springer Science and Business Media LLC

  • A Study on Corpus-based Stopword Lists in Indian Language IR
    Siba Sankar Sahu and Sukomal Pal

    Association for Computing Machinery (ACM)
    We explore and evaluate the effect of different stopword lists (non-corpus-based and corpus-based) in information retrieval (IR) tasks with different Indian languages such as Bengali, Marathi, Gujarati, and Hindi, as well as English. The issue was investigated from three viewpoints. Is there any performance difference between non-corpus-based and corpus-based stopword removal in the chosen Indian languages? Can corpus-based stopword lists improve performance in Indian language IR? If yes, to what extent? Among the different corpus-based stopword lists, which stopword list provides the best IR performance? Does the length of a corpus-based stopword list affect the retrieval performance in Indian languages? If yes, to what extent? It was observed that a corpus-based stopword list provides better retrieval performance than a non-corpus-based stopword list in different Indian languages. Among the different corpus-based stopword lists generated and experimented with, the Zipf’s law-based stopword list (the idf-based one) provides the best retrieval performance in various Indian languages. The aggregation1-based stopword list provides better retrieval than the aggregation2-based list in Indian languages, but in English, the aggregation2-based stopword list performs better than the aggregation1-based list. The best performing idf-based stopword list improves MAP score by 5.43% in Bengali, 1.91% in Marathi, 5.4% in Gujarati, 1.5% in Hindi, and 2.12% in English, respectively, over their baseline counterparts. The probabilistic retrieval models (BM25 and TF-IDF) perform best in different Indian languages. Shorter corpus-based stopword lists perform better than longer non-corpus-based stopword lists for all the Indian languages considered. The proposed schemes demonstrate that a stopword list can be heuristically generated by a language-independent statistical method and effectively used for IR tasks with performance comparable to, or even better than, non-corpus-based stopword lists.


  • Domain Aligned Prefix Averaging for Domain Generalization in Abstractive Summarization


  • Ensemble-based domain adaptation on social media posts for irony detection
    Anita Saroj and Sukomal Pal

    Springer Science and Business Media LLC


  • Clus-DR: Cluster-based pre-trained model for diverse recommendation generation
    Naina Yadav, Sukomal Pal, Anil Kumar Singh, and Kartikey Singh

    Elsevier BV




  • Effect of stopwords in Indian language IR
    Siba Sankar Sahu and Sukomal Pal

    Springer Science and Business Media LLC

  • Extractive Text Summarization using Meta-heuristic Approach


  • Sentiment Analysis and Homophobia detection of Code-Mixed Dravidian Languages leveraging pre-trained model and word-level language tag


  • Coarse and Fine-Grained Conversational Hate Speech and Offensive Content Identification in Code-Mixed Languages using Fine-Tuned Multilingual Embedding


  • Applying an Information Retrieval Approach to Retrieve Relevant Articles in the Legal Domain
    Ambedkar Kanapala, Sukomal Pal, Suresh Dara, and Srikanth Jannu

    Springer Science and Business Media LLC

  • Query Expansion for Transliterated Text Retrieval
    Dinesh Kumar Prabhakar, Sukomal Pal, and Chiranjeev Kumar

    Association for Computing Machinery (ACM)
    With Web 2.0, there has been exponential growth in the number of Web users and the volume of Web content. Most of these users are not only consumers of the information but also generators of it. People express themselves here in colloquial languages, but using Roman script (transliteration). These texts are mostly informal and casual, and therefore seldom follow grammar rules. Also, there does not exist any prescribed set of spelling rules in transliterated text. This freedom leads to large-scale spelling variations, which is a major challenge in mixed script information processing. This article studies different existing phonetic algorithms to handle the issue of spelling variation, points out their limitations, and proposes a novel phonetic encoding approach with two different flavors in the light of Hindi transliteration. Experiments performed over Hindi song lyrics retrieval in the mixed script domain with three different retrieval models show that the proposed approaches outperform the existing techniques in a majority of the cases (sometimes statistically significantly) for a number of metrics like nDCG@1, nDCG@5, nDCG@10, MAP, MRR, and Recall.

  • CLAVER: An integrated framework of convolutional layer, bidirectional LSTM with attention mechanism based scholarly venue recommendation
    Tribikram Pradhan, Prashant Kumar, and Sukomal Pal

    Elsevier BV
    Abstract Scholarly venue recommendation is an emerging field due to a rapid surge in the number of scholarly venues concomitant with exponential growth in interdisciplinary research and cross collaboration among researchers. Finding appropriate publication venues is regarded as one of the most challenging aspects of paper publication, as a large proportion of manuscripts face rejection due to a disjunction between the scope of the venue and the field of research pursued by the research article. We present CLAVER, an integrated framework of Convolutional Layer, bi-directional LSTM with an Attention mechanism-based scholarly VEnue Recommender system. The system is the first of its kind to integrate multiple deep learning-based concepts, requiring only the abstract and title of a manuscript to identify academic venues. An extensive and exhaustive set of experiments conducted on the DBLP dataset certifies that the postulated model CLAVER performs better than most of the modern techniques, as established by standard metrics such as stability, accuracy, MRR, average venue quality, precision@k, nDCG@k and diversity.

  • A proactive decision support system for reviewer recommendation in academia
    Tribikram Pradhan, Suchit Sahoo, Utkarsh Singh, and Sukomal Pal

    Elsevier BV
    Abstract Peer review is an essential part of scientific communications to ensure the quality of publications and a healthy scientific evaluation process. Assigning appropriate reviewers poses a great challenge for program chairs and journal editors for many reasons, including relevance, fair judgment, no conflict of interest, and qualified reviewers in terms of scientific impact. With a steady increase in the number of research domains, scholarly venues, researchers, and papers in academia, manually selecting and accessing adequate reviewers is becoming a tedious and time-consuming task. Traditional approaches for reviewer selection mainly focus on the matching of research relevance by keywords or disciplines. However, in real-world systems, various factors often need to be considered. Therefore, we propose a multilayered approach integrating Topic Network, Citation Network, and Reviewer Network into a reviewer Recommender System (TCRRec). We explore various aspects, including relevance between reviewer candidates and submission, authority, expertise, diversity, and conflict of interest, and integrate them into the proposed framework TCRRec. The proposed system also considers the temporal changes of reviewers’ interests and the stability of reviewers’ interest trends to enhance its performance. The paper also addresses cold start issues for researchers having unique areas of interest or for isolated researchers. Experiments based on the NIPS and AMiner datasets demonstrate that the proposed TCRRec outperforms state-of-the-art recommendation techniques in terms of standard metrics of precision@k, MRR, nDCG@k, authority, expertise, diversity, and coverage.

  • A deep neural architecture based meta-review generation and final decision prediction of a scholarly article
    Tribikram Pradhan, Chaitanya Bhatia, Prashant Kumar, and Sukomal Pal

    Elsevier BV
    Abstract Peer reviews form an essential part of scientific communications. Research papers and proposals are reviewed by several peers before they are finally accepted or rejected for publication and funding, respectively. With the steady increase in the number of research domains, scholarly venues (journal and/or conference), researchers, and papers, managing the peer review process is becoming a daunting task. Application of recommender systems to assist peer reviewing is, therefore, being explored and becoming an emerging research area. In this paper, we present a deep learning network based Meta-Review Generation considering peer review prediction of the scholarly article (MRGen). MRGen is able to provide solutions for: (i) Peer review prediction (Task 1) and (ii) Meta-review generation (Task 2). First, the system takes the peer reviews as input and produces a draft meta-review. Then it employs an integrated framework of convolution layer, long short-term memory (LSTM) model, Bi-LSTM model, and attention mechanism to predict the final decision (accept/reject) of the scholarly article. Based on the final decision, the proposed model MRGen incorporates Pointer Generator Network-based abstractive summarization to generate the final meta-review. The focus of our approach is to give a concise meta-review that maximizes information coverage, coherence, readability and also reduces redundancy. Extensive experiments conducted on the PeerRead dataset demonstrate good consistency between the recommended decisions and original decisions. We also compare the performance of MRGen with some of the existing state-of-the-art multi-document summarization methods. The system also outperforms a few existing models based on accuracy, Rouge scores, readability, non-redundancy, and cohesion.

  • Is Meta Embedding better than pre-trained word embedding to perform Sentiment Analysis for Dravidian Languages in Code-Mixed Text?


  • Fine-tuning Pre-Trained Transformer based model for Hate Speech and Offensive Content Identification in English, Indo-Aryan and Code-Mixed (English-Hindi) languages


RECENT SCHOLAR PUBLICATIONS