Jan Kocon

@pwr.edu.pl

Department of Artificial Intelligence
Wroclaw University of Science and Technology



                             

https://researchid.co/jankocon

I'm an Assistant Professor in the Department of Artificial Intelligence at the Wroclaw University of Science and Technology, where I earned both my Ph.D. in computer science (2018) and M.Sc. Eng. degree (2012). I serve as the AI/ML Team Leader and Senior ML/NLP Data Scientist for the CLARIN-BIZ project. My passion for natural language processing (NLP) has spanned over a decade, with a keen interest in machine learning techniques. I've penned over 60 scientific papers, showcased at prominent conferences including ACL, ICDM, and more. My current endeavors revolve around pioneering deep learning models for subjective tasks such as emotion and sentiment analysis. I'm also delving into cross-lingual knowledge transfer and language-agnostic models. My contributions have been integral to projects like CrisisDetector, StockBrief, Sentimenti, and CLARIN-PL, among others. I enjoy imparting knowledge on data science, AI's role in NLP, and building sophisticated deep neural network models.

RESEARCH, TEACHING, or OTHER INTERESTS

Computer Science, Artificial Intelligence, Computer Science Applications, Signal Processing

Scopus Publications: 61
Scholar Citations: 3160
Scholar h-index: 21
Scholar i10-index: 39

Scopus Publications


  • Improving Training Dataset Balance with ChatGPT Prompt Engineering
    Mateusz Kochanek, Igor Cichecki, Oliwier Kaszyca, Dominika Szydło, Michał Madej, Dawid Jędrzejewski, Przemysław Kazienko, and Jan Kocoń

    MDPI AG
    The rapid evolution of large language models, in particular OpenAI’s GPT-3.5-turbo and GPT-4, indicates a growing interest in advanced computational methodologies. This paper proposes a novel approach to synthetic data generation and knowledge distillation through prompt engineering. The potential of large language models (LLMs) is used to address the problem of unbalanced training datasets for other machine learning models. This is not only a common issue but also a crucial determinant of the final model quality and performance. Three prompting strategies have been considered: basic, composite, and similarity prompts. Although the initial results do not match the performance of comprehensive datasets, the similarity prompts method exhibits considerable promise, thus outperforming other methods. The investigation of our rebalancing methods opens pathways for future research on leveraging continuously developed LLMs for the enhanced generation of high-quality synthetic data. This could have an impact on many large-scale engineering applications.

  • ChatGPT: Jack of all trades, master of none
    Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, Mateusz Kochanek, Dominika Szydło, Joanna Baran, Julita Bielaniewicz, Marcin Gruza, Arkadiusz Janz, Kamil Kanclerz, et al.

    Elsevier BV

  • Migrants vs. stayers in the pandemic – A sentiment analysis of Twitter content
    Olga Czeranowska, Karol Chlasta, Piotr Miłkowski, Izabela Grabowska, Jan Kocoń, Krzysztof Hwaszcz, Jan Wieczorek, and Agata Jastrzębowska

    Elsevier BV

  • Human-centered neural reasoning for subjective content processing: Hate speech, emotions, and humor
    Przemysław Kazienko, Julita Bielaniewicz, Marcin Gruza, Kamil Kanclerz, Konrad Karanowski, Piotr Miłkowski, and Jan Kocoń

    Elsevier BV

  • Deep Emotions Across Languages: A Novel Approach for Sentiment Propagation in Multilingual WordNets
    Jan Kocoń

    IEEE
    Sentiment analysis involves using WordNets enriched with emotional metadata, which are valuable resources. However, manual annotation is time-consuming and expensive, resulting in only a few WordNet Lexical Units being annotated. This paper introduces two new techniques for automatically propagating sentiment annotations from a partially annotated WordNet to its entirety and to a WordNet in a different language: Multilingual Structured Synset Embeddings (MSSE) and Cross-Lingual Deep Neural Sentiment Propagation (CLDNS). We evaluated the proposed MSSE+CLDNS method extensively using Princeton WordNet and Polish WordNet, which have many inter-lingual relations. Our results show that the MSSE+CLDNS method outperforms existing propagation methods, indicating its effectiveness in enriching WordNets with emotional metadata across multiple languages. This work provides a solid foundation for large-scale, multilingual sentiment analysis and is valuable for academic research and practical applications.

  • From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis
    Stanisław Woźniak and Jan Kocoń

    IEEE
    In the era of artificial intelligence, data is gold but costly to annotate. The paper demonstrates a groundbreaking solution to this dilemma using ChatGPT for text augmentation in sentiment analysis. We leverage ChatGPT’s generative capabilities to create synthetic training data that significantly improves the performance of smaller models, making them competitive with, or even outperforming, their larger counterparts. This innovation enables models to be both efficient and effective, thereby reducing computational cost, inference time, and memory usage without compromising on quality. Our work marks a key advancement in the cost-effective development and deployment of robust sentiment analysis models.

  • Towards Model-Based Data Acquisition for Subjective Multi-Task NLP Problems
    Kamil Kanclerz, Julita Bielaniewicz, Marcin Gruza, Jan Kocoń, Stanisław Woźniak, and Przemysław Kazienko

    IEEE
    Data annotated by humans is a source of knowledge by describing the peculiarities of the problem and therefore fueling the decision process of the trained model. Unfortunately, the annotation process for subjective natural language processing (NLP) problems like offensiveness or emotion detection is often very expensive and time-consuming. One of the inevitable risks is to spend some of the funds and annotator effort on annotations that do not provide any additional knowledge about the specific task. To minimize these costs, we propose a new model-based approach that allows the selection of tasks annotated individually for each text in a multi-task scenario. The experiments carried out on three datasets, dozens of NLP tasks, and thousands of annotations show that our method allows up to 40% reduction in the number of annotations with negligible loss of knowledge. The results also emphasize the need to collect a diverse amount of data required to efficiently train a model, depending on the subjectivity of the annotation task. We also focused on measuring the relation between subjective tasks by evaluating the model in single-task and multi-task scenarios. Moreover, for some datasets, training only on the labels predicted by our model improved the efficiency of task selection as a self-supervised learning regularization technique.

  • Modeling Uncertainty in Personalized Emotion Prediction with Normalizing Flows
    Piotr Miłkowski, Konrad Karanowski, Patryk Wielopolski, Jan Kocoń, Przemysław Kazienko, and Maciej Zięba

    IEEE
    Designing predictive models for subjective problems in natural language processing (NLP) remains challenging. This is mainly due to its non-deterministic nature and different perceptions of the content by different humans. It may be solved by Personalized Natural Language Processing (PNLP), where the model exploits additional information about the reader to make more accurate predictions. However, current approaches require complete information about the recipients to be straight embedded. Besides, the recent methods focus on deterministic inference or simple frequency-based estimations of the probabilities. In this work, we overcome this limitation by proposing a novel approach to capture the uncertainty of the forecast using conditional Normalizing Flows. This allows us to model complex multimodal distributions and to compare various models using negative log-likelihood (NLL). In addition, the new solution allows for various interpretations of possible reader perception thanks to the available sampling function. We validated our method on three challenging, subjective NLP tasks, including emotion recognition and hate speech. The comparative analysis of generalized and personalized approaches revealed that our personalized solutions significantly outperform the baseline and provide more precise uncertainty estimates. The impact on the text interpretability and uncertainty studies are presented as well. The information brought by the developed methods makes it possible to build hybrid models whose effectiveness surpasses classic solutions. In addition, an analysis and visualization of the probabilities of the given decisions for texts with high entropy of annotations and annotators with mixed views were carried out.

  • PALS: Personalized Active Learning for Subjective Tasks in NLP


  • RWKV: Reinventing RNNs for the Transformer Era


  • Capturing Human Perspectives in NLP: Questionnaires, Annotations, and Biases


  • CLARIN-Emo: Training Emotion Recognition Models Using Human Annotation and ChatGPT
    Bartłomiej Koptyra, Anh Ngo, Łukasz Radliński, and Jan Kocoń

    Springer Nature Switzerland

  • Differential Dataset Cartography: Explainable Artificial Intelligence in Comparative Personalized Sentiment Analysis
    Jan Kocoń, Joanna Baran, Kamil Kanclerz, Michał Kajstura, and Przemysław Kazienko

    Springer Nature Switzerland

  • Personalized Models Resistant to Malicious Attacks for Human-centered Trusted AI


  • Emotion norms for 6000 Polish word meanings with a direct mapping to the Polish wordnet
    Małgorzata Wierzba, Monika Riegel, Jan Kocoń, Piotr Miłkowski, Arkadiusz Janz, Katarzyna Klessa, Konrad Juszczyk, Barbara Konat, Damian Grimling, Maciej Piasecki, et al.

    Springer Science and Business Media LLC
    Emotion lexicons are useful in research across various disciplines, but the availability of such resources remains limited for most languages. While existing emotion lexicons typically comprise words, it is a particular meaning of a word (rather than the word itself) that conveys emotion. To mitigate this issue, we present the Emotion Meanings dataset, a novel dataset of 6000 Polish word meanings. The word meanings are derived from the Polish wordnet (plWordNet), a large semantic network interlinking words by means of lexical and conceptual relations. The word meanings were manually rated for valence and arousal, along with a variety of basic emotion categories (anger, disgust, fear, sadness, anticipation, happiness, surprise, and trust). The annotations were found to be highly reliable, as demonstrated by the similarity between data collected in two independent samples: unsupervised (n = 21,317) and supervised (n = 561). Although we found the annotations to be relatively stable for female, male, younger, and older participants, we share both summary data and individual data to enable emotion research on different demographically specific subgroups. The word meanings are further accompanied by the relevant metadata, derived from open-source linguistic resources. Direct mapping to Princeton WordNet makes the dataset suitable for research on multiple languages. Altogether, this dataset provides a versatile resource that can be employed for emotion research in psychology, cognitive science, psycholinguistics, computational linguistics, and natural language processing.

  • Multi-Modal Personalized Hate Speech Analysis using Differential Dataset Cartography


  • Towards a contextualised spatial-diachronic history of literature: mapping emotional representations of the city and the country in Polish fiction from 1864 to 1939


  • Linguistic Knowledge Application to Neuro-Symbolic Transformers in Sentiment Analysis
    Joanna Baran and Jan Kocon

    IEEE
    Neuro-symbolic approaches explore ways to combine neural networks with traditional symbolic knowledge. These methods are gaining attention due to their efficiency and the requirement of fewer data compared to currently used deep models. This work investigated several neuro-symbolic models for sentiment analysis focusing on a variety of ways to add linguistic knowledge to the transformer-based architecture. English and Polish WordNets were used as a knowledge source with their polarity extensions (SentiWordNet, plWordNet Emo). The neuro-symbolic methods using knowledge during fine-tuning were not better or worse than the baseline model. However, a statistically significant gain of about three percentage points in the F1-macro was obtained for the SentiLARE model that applied domain data - word sentiment labels - already at the pretraining stage. It was the most visible for medium-sized training sets. Therefore, developing an effective neuro-symbolic model is not trivial. The conclusions drawn from this work indicate a further need for a detailed study of these approaches, especially in natural language processing. In the context of sentiment classification, it could help design more efficient AI systems that can be deployed in business or marketing.

  • MultiAspectEmo: Multilingual and Language-Agnostic Aspect-Based Sentiment Analysis
    Joanna Szolomicka and Jan Kocon

    IEEE
    The paper addresses the important problem of multilingual and language-agnostic approaches to the aspect-based sentiment analysis (ABSA) task, using modern approaches based on transformer models. We propose a new dataset based on automatic translation of the Polish AspectEmo dataset together with cross-lingual transfer of tags describing aspect polarity. The result is a MultiAspectEmo dataset translated into five other languages: English, Czech, Spanish, French and Dutch. In this paper, we also present the original TrAsp (Transformer-based Aspect Extraction and Classification) method, which is significantly better than methods from the literature in the ABSA task. In addition, we present multilingual and language-agnostic variants of this method, evaluated on the MultiAspectEmo and also the SemEval2016 datasets. We also test various language models for the ABSA task, including compressed models that give promising results while significantly reducing inference time and memory usage.

  • Compression Methods for Transformers in Multidomain Sentiment Analysis
    Wojciech Korczynski and Jan Kocon

    IEEE
    Transformer models like BERT have significantly improved performance on many NLP tasks, e.g., sentiment analysis. However, their large number of parameters makes real-world applications difficult because of computational costs and latency. Many compression methods have been proposed to solve this problem using quantization, weight pruning, and knowledge distillation. In this work, we explore some of these task-specific and task-agnostic methods by comparing their effectiveness and quality on the MultiEmo sentiment analysis dataset. Additionally, we analyze their ability to generalize and capture sentiment features by conducting domain-sentiment experiments. The results show that the compression methods reduce the model size by 8.6 times and the inference time by 6.9 times compared to the original model while maintaining unimpaired quality. Smaller models perform better on tasks with fewer data and retain more remarkable generalization ability after fine-tuning because they are less prone to overfitting. The best trade-off is obtained using the task-agnostic XtremeDistil model.

  • Deep-SHEEP: Sense of Humor Extraction from Embeddings in the Personalized Context
    Julita Bielaniewicz, Kamil Kanclerz, Piotr Milkowski, Marcin Gruza, Konrad Karanowski, Przemyslaw Kazienko, and Jan Kocon

    IEEE
    As humans, we experience a wide range of feelings and reactions. One of these is laughter, often related to a personal sense of humor and the perception of funny content. Due to its subjective nature, recognizing humor in NLP is a very challenging task. Here, we present a new approach to the task of predicting humor in the text by applying the idea of a personalized approach. It takes into account both the text and the context of the content receiver. For that purpose, we proposed four Deep-SHEEP learning models that take advantage of user preference information differently. The experiments were conducted on four datasets: Cockamamie, HUMOR, Jester, and Humicroedit. The results have shown that the application of an innovative personalized approach and user-centric perspective significantly improves performance compared to generalized methods. Moreover, even for random text embeddings, our personalized methods outperform the generalized ones in the subjective humor modeling task. We also argue that the user-related data reflecting an individual sense of humor has similar importance as the evaluated text itself. Different types of humor were investigated as well.

  • StudEmo: A Non-aggregated Review Dataset for Personalized Emotion Recognition


  • What if Ground Truth is Subjective? Personalized Deep Neural Hate Speech Detection


  • Multi-module Natural Language Search Engine for Travel Offers
    Karol Gawron, Konrad Wojtasik, Bartłomiej Bojanowski, Arkadiusz Janz, Jan Kocoń, Tomasz Krupa, Agnieszka Kukałowicz, Piotr Miłkowski, Maciej Piasecki, Michał Pogoda, et al.

    Springer International Publishing

RECENT SCHOLAR PUBLICATIONS

  • Fortifying NLP models against poisoning attacks: The power of personalized prediction architectures
    T Ferdinan, J Kocoń
    Information Fusion 114, 102692 2025

  • Eagle and finch: Rwkv with matrix-valued states and dynamic recurrence
    B Peng, D Goldstein, Q Anthony, A Albalak, E Alcaide, S Biderman, ...
    arXiv preprint arXiv:2404.05892 2024

  • Personalized large language models
    S Woźniak, B Koptyra, A Janz, P Kazienko, J Kocoń
    arXiv preprint arXiv:2402.09269 2024

  • Into the Unknown: Self-Learning Large Language Models
    T Ferdinan, J Kocoń, P Kazienko
    arXiv preprint arXiv:2402.09147 2024

  • Improving Training Dataset Balance with ChatGPT Prompt Engineering
    M Kochanek, I Cichecki, O Kaszyca, D Szydło, M Madej, D Jędrzejewski, ...
    Electronics 13 (12), 2255 2024

  • Beyond Human Review: Leveraging ChatGPT for Label Noise Detection
    I Cichecki, J Kocon, P Kazienko, O Kaszyca, D Szydło, M Kochanek
    https://www.techrxiv.org/doi/full/10.36227/techrxiv.170326715.56351742/v1 2023

  • Can innovative prompt engineering with ChatGPT address imbalances in machine learning datasets?
    M Kochanek, P Kazienko, J Kocon, I Cichecki, O Kaszyca, D Szydło
    Authorea Preprints 2023

  • Is it possible for ChatGPT to mimic human annotator?
    O Kaszyca, P Kazienko, J Kocoń, I Cichecki, M Kochanek, D Szydło
    Authorea Preprints 2023

  • Deep Emotions Across Languages: A Novel Approach for Sentiment Propagation in Multilingual WordNets
    J Kocoń
    2023 IEEE International Conference on Data Mining Workshops (ICDMW), 744-749 2023

  • Modeling Uncertainty in Personalized Emotion Prediction with Normalizing Flows
    P Miłkowski, K Karanowski, P Wielopolski, J Kocoń, P Kazienko, M Zięba
    2023 IEEE International Conference on Data Mining Workshops (ICDMW), 757-766 2023

  • From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis
    S Woźniak, J Kocoń
    2023 IEEE International Conference on Data Mining Workshops (ICDMW), 799-808 2023

  • Towards Model-Based Data Acquisition for Subjective Multi-Task NLP Problems
    K Kanclerz, J Bielaniewicz, M Gruza, J Kocoń, S Woźniak, P Kazienko
    2023 IEEE International Conference on Data Mining Workshops (ICDMW), 726-735 2023

  • Beyond Human Review: Leveraging ChatGPT for Label Noise Detection
    I Cichecki, J Kocoń, P Kazienko, O Kaszyca, D Szydło, M Kochanek
    2023

  • PALS: Personalized Active Learning for Subjective Tasks in NLP
    K Kanclerz, K Karanowski, J Bielaniewicz, M Gruza, P Miłkowski, J Kocoń, ...
    Proceedings of the 2023 Conference on Empirical Methods in Natural Language 2023

  • ChatGPT: Jack of all trades, master of none
    J Kocoń, I Cichecki, O Kaszyca, M Kochanek, D Szydło, J Baran, ...
    Information Fusion 99, 101861 2023

  • Clarin-emo: Training emotion recognition models using human annotation and chatgpt
    B Koptyra, A Ngo, Ł Radliński, J Kocoń
    International Conference on Computational Science, 365-379 2023

  • Differential dataset cartography: Explainable artificial intelligence in comparative personalized sentiment analysis
    J Kocoń, J Baran, K Kanclerz, M Kajstura, P Kazienko
    International Conference on Computational Science, 148-162 2023

  • Migrants vs. stayers in the pandemic–A sentiment analysis of Twitter content
    O Czeranowska, K Chlasta, P Miłkowski, I Grabowska, J Kocoń, ...
    Telematics and Informatics Reports 10, 100059 2023

  • Human-centered neural reasoning for subjective content processing: Hate speech, emotions, and humor
    P Kazienko, J Bielaniewicz, M Gruza, K Kanclerz, K Karanowski, ...
    Information Fusion 94, 43-65 2023

  • Rwkv: Reinventing rnns for the transformer era
    B Peng, E Alcaide, Q Anthony, A Albalak, S Arcadinho, S Biderman, ...
    arXiv preprint arXiv:2305.13048 2023

MOST CITED SCHOLAR PUBLICATIONS

  • Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
    A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
    arXiv preprint arXiv:2206.04615 2022
    Citations: 1114

  • ChatGPT: Jack of all trades, master of none
    J Kocoń, I Cichecki, O Kaszyca, M Kochanek, D Szydło, J Baran, ...
    Information Fusion 99, 101861 2023
    Citations: 553

  • Rwkv: Reinventing rnns for the transformer era
    B Peng, E Alcaide, Q Anthony, A Albalak, S Arcadinho, S Biderman, ...
    arXiv preprint arXiv:2305.13048 2023
    Citations: 411

  • Offensive, aggressive, and hate speech analysis: From data-centric to human-centered approach
    J Kocoń, A Figas, M Gruza, D Puchalska, T Kajdanowicz, P Kazienko
    Information Processing & Management 58 (5), 102643 2021
    Citations: 111

  • Liner2–a customizable framework for proper names recognition for Polish
    M Marcińczuk, J Kocoń, M Janicki
    Intelligent Tools for Building a Scientific Information Platform: Advanced 2013
    Citations: 60

  • Multi-level sentiment analysis of PolEmo 2.0: Extended corpus of multi-domain consumer reviews
    J Kocoń, P Miłkowski, M Zaśko-Zielińska
    Proceedings of the 23rd Conference on Computational Natural Language 2019
    Citations: 56

  • Cross-lingual deep neural transfer learning in sentiment analysis
    K Kanclerz, P Miłkowski, J Kocoń
    Procedia Computer Science 176, 128-137 2020
    Citations: 48

  • Learning personal human biases and representations for subjective tasks in natural language processing
    J Kocoń, M Gruza, J Bielaniewicz, D Grimling, K Kanclerz, P Miłkowski, ...
    2021 IEEE International Conference on Data Mining (ICDM), 1168-1173 2021
    Citations: 43

  • Personal bias in prediction of emotions elicited by textual opinions
    P Miłkowski, M Gruza, K Kanclerz, P Kazienko, D Grimling, J Kocoń
    Proceedings of the 59th annual meeting of the association for computational 2021
    Citations: 38

  • Controversy and conformity: from generalized to personalized aggressiveness detection
    K Kanclerz, A Figas, M Gruza, T Kajdanowicz, J Kocoń, D Puchalska, ...
    Proceedings of the 59th Annual Meeting of the Association for Computational 2021
    Citations: 36

  • Human-centered neural reasoning for subjective content processing: Hate speech, emotions, and humor
    P Kazienko, J Bielaniewicz, M Gruza, K Kanclerz, K Karanowski, ...
    Information Fusion 94, 43-65 2023
    Citations: 33

  • Neuro-symbolic models for sentiment analysis
    J Kocoń, J Baran, M Gruza, A Janz, M Kajstura, P Kazienko, W Korczyński, ...
    International conference on computational science, 667-681 2022
    Citations: 26

  • Mapping WordNet onto human brain connectome in emotion processing and semantic similarity recognition
    J Kocoń, M Maziarz
    Information Processing & Management 58 (3), 102530 2021
    Citations: 26

  • plWordNet as a basis for large emotive lexicons of Polish
    A Janz, J Kocon, M Piasecki, M Zasko-Zielinska
    Proceedings of Human Language Technologies as a Challenge for Computer 2017
    Citations: 26

  • Inforex-a web-based tool for text corpus management and semantic annotation.
    M Marcinczuk, J Kocon, B Broda
    LREC, 224-230 2012
    Citations: 26

  • What if ground truth is subjective? personalized deep neural hate speech detection
    K Kanclerz, M Gruza, K Karanowski, J Bielaniewicz, P Miłkowski, J Kocoń, ...
    Proceedings of the 1st Workshop on Perspectivist Approaches to NLP@ LREC2022 2022
    Citations: 25

  • Evaluating KGR10 Polish word embeddings in the recognition of temporal expressions using BiLSTM-CRF
    J Kocoń, M Gawor
    arXiv preprint arXiv:1904.04055 2019
    Citations: 25

  • Recognition of emotions, valence and arousal in large-scale multi-domain text reviews
    J Kocoń, A Janz, P Miłkowski, M Riegel, M Wierzba, A Marchewka, ...
    9th Language & Technology Conference (LTC 2019): Human Language Technologies 2019
    Citations: 23

  • Eagle and finch: Rwkv with matrix-valued states and dynamic recurrence
    B Peng, D Goldstein, Q Anthony, A Albalak, E Alcaide, S Biderman, ...
    arXiv preprint arXiv:2404.05892 2024
    Citations: 22

  • Multiemo: Multilingual, multilevel, multidomain sentiment analysis corpus of consumer reviews
    J Kocoń, P Miłkowski, K Kanclerz
    International Conference on Computational Science, 297-312 2021
    Citations: 22