Department of Artificial Intelligence
Wroclaw University of Science and Technology
I'm an Assistant Professor in the Department of Artificial Intelligence at the Wroclaw University of Science and Technology, where I earned both my Ph.D. in computer science (2018) and my M.Sc. Eng. degree (2012). I serve as the AI/ML Team Leader and Senior ML/NLP Data Scientist for the CLARIN-BIZ project. I have worked on natural language processing (NLP) for over a decade, with a particular interest in machine learning techniques. I have authored over 60 scientific papers, presented at prominent conferences including ACL and ICDM. My current work focuses on deep learning models for subjective tasks such as emotion and sentiment analysis, as well as cross-lingual knowledge transfer and language-agnostic models. My contributions have been integral to projects such as CrisisDetector, StockBrief, Sentimenti, and CLARIN-PL. I enjoy teaching data science, AI's role in NLP, and the construction of sophisticated deep neural network models.
Computer Science, Artificial Intelligence, Computer Science Applications, Signal Processing
Teddy Ferdinan and Jan Kocoń
Elsevier BV
Mateusz Kochanek, Igor Cichecki, Oliwier Kaszyca, Dominika Szydło, Michał Madej, Dawid Jędrzejewski, Przemysław Kazienko, and Jan Kocoń
MDPI AG
The rapid evolution of large language models (LLMs), in particular OpenAI's GPT-3.5-turbo and GPT-4, indicates a growing interest in advanced computational methodologies. This paper proposes a novel approach to synthetic data generation and knowledge distillation through prompt engineering. The potential of LLMs is used to address the problem of unbalanced training datasets for other machine learning models, a common issue that is also a crucial determinant of final model quality and performance. Three prompting strategies are considered: basic, composite, and similarity prompts. Although the initial results do not match the performance of models trained on comprehensive datasets, the similarity-prompt method shows considerable promise and outperforms the other strategies. Our investigation of these rebalancing methods opens pathways for future research on leveraging continuously developed LLMs for the generation of high-quality synthetic data, which could have an impact on many large-scale engineering applications.
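The abstract above does not spell out the three prompting strategies; as a loose illustration of the general rebalancing idea only, here is a minimal pure-Python sketch in which a stubbed `generate()` stands in for an LLM call and `build_similarity_prompt()` is a hypothetical helper, not the paper's actual prompts.

```python
from collections import Counter

def build_similarity_prompt(examples, label):
    # Show existing minority-class texts and ask for one more like them.
    shown = "\n".join(f"- {t}" for t in examples)
    return (f"Here are examples labeled '{label}':\n{shown}\n"
            f"Write one new, similar sentence with the same label.")

def rebalance(dataset, generate):
    """Upsample every minority class to the majority-class size.

    `dataset` is a list of (text, label) pairs; `generate` maps a prompt
    to one synthetic text (an LLM API call in the real setting).
    """
    counts = Counter(label for _, label in dataset)
    target = max(counts.values())
    synthetic = []
    for label, n in counts.items():
        seeds = [t for t, lab in dataset if lab == label]
        for _ in range(target - n):
            prompt = build_similarity_prompt(seeds, label)
            synthetic.append((generate(prompt), label))
    return dataset + synthetic

# Stub in place of a real LLM call.
fake_llm = lambda prompt: "synthetic example"
data = [("great film", "pos"), ("loved it", "pos"), ("awful", "neg")]
balanced = rebalance(data, fake_llm)
```

With a real model, `generate` would wrap an API call and the synthetic texts would be filtered for quality before training; the sketch only shows the class-count bookkeeping.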
Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, Mateusz Kochanek, Dominika Szydło, Joanna Baran, Julita Bielaniewicz, Marcin Gruza, Arkadiusz Janz, Kamil Kanclerz, et al.
Elsevier BV
Olga Czeranowska, Karol Chlasta, Piotr Miłkowski, Izabela Grabowska, Jan Kocoń, Krzysztof Hwaszcz, Jan Wieczorek, and Agata Jastrzębowska
Elsevier BV
Przemysław Kazienko, Julita Bielaniewicz, Marcin Gruza, Kamil Kanclerz, Konrad Karanowski, Piotr Miłkowski, and Jan Kocoń
Elsevier BV
Jan Kocoń
IEEE
Sentiment analysis involves using WordNets enriched with emotional metadata, which are valuable resources. However, manual annotation is time-consuming and expensive, resulting in only a few WordNet Lexical Units being annotated. This paper introduces two new techniques for automatically propagating sentiment annotations from a partially annotated WordNet to its entirety and to a WordNet in a different language: Multilingual Structured Synset Embeddings (MSSE) and Cross-Lingual Deep Neural Sentiment Propagation (CLDNS). We evaluated the proposed MSSE+CLDNS method extensively using Princeton WordNet and Polish WordNet, which have many inter-lingual relations. Our results show that the MSSE+CLDNS method outperforms existing propagation methods, indicating its effectiveness in enriching WordNets with emotional metadata across multiple languages. This work provides a solid foundation for large-scale, multilingual sentiment analysis and is valuable for academic research and practical applications.
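The MSSE and CLDNS methods above are neural; as a simplified, hypothetical illustration of the underlying idea of spreading sentiment annotations through WordNet relations, here is a toy label-propagation sketch (the function, graph, and scores are made up for illustration, not the paper's method).

```python
def propagate_sentiment(edges, seeds, iterations=20):
    """Propagate polarity scores from seed synsets over relation edges.

    `edges`: dict mapping each synset to its list of related synsets
    (e.g. hypernymy or inter-lingual links);
    `seeds`: dict mapping annotated synsets to a polarity in [-1, 1].
    Seeds stay fixed; each unlabeled node repeatedly takes the mean
    polarity of its neighbors until the scores settle.
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in edges.items():
            if node in seeds:
                updated[node] = seeds[node]  # annotated units are fixed
            elif nbrs:
                updated[node] = sum(scores[m] for m in nbrs) / len(nbrs)
            else:
                updated[node] = scores[node]
        scores = updated
    return scores

# Tiny chain a - b - c: b sits between a positive and a negative seed.
scores = propagate_sentiment(
    {"a": ["b"], "b": ["a", "c"], "c": ["b"]},
    {"a": 1.0, "c": -1.0},
)
```

In this toy graph the unannotated node "b" averages its two seed neighbors and settles at 0.0, i.e. neutral; the actual MSSE+CLDNS pipeline learns embeddings and a deep classifier rather than averaging.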
Stanisław Woźniak and Jan Kocoń
IEEE
In the era of artificial intelligence, data is gold but costly to annotate. This paper demonstrates a groundbreaking solution to this dilemma using ChatGPT for text augmentation in sentiment analysis. We leverage ChatGPT's generative capabilities to create synthetic training data that significantly improves the performance of smaller models, making them competitive with, and sometimes superior to, their larger counterparts. This enables models to be both efficient and effective, reducing computational cost, inference time, and memory usage without compromising quality. Our work marks a key advancement in the cost-effective development and deployment of robust sentiment analysis models.
Kamil Kanclerz, Julita Bielaniewicz, Marcin Gruza, Jan Kocoń, Stanisław Woźniak, and Przemysław Kazienko
IEEE
Data annotated by humans is a source of knowledge: it describes the peculiarities of the problem and thereby fuels the decision process of the trained model. Unfortunately, the annotation process for subjective natural language processing (NLP) problems like offensiveness or emotion detection is often very expensive and time-consuming. One inevitable risk is spending part of the funds and annotator effort on annotations that provide no additional knowledge about the specific task. To minimize these costs, we propose a new model-based approach that selects the tasks to be annotated individually for each text in a multi-task scenario. Experiments carried out on three datasets, dozens of NLP tasks, and thousands of annotations show that our method allows up to a 40% reduction in the number of annotations with negligible loss of knowledge. The results also emphasize that the diversity of data required to train a model efficiently depends on the subjectivity of the annotation task. We also measured the relations between subjective tasks by evaluating the model in single-task and multi-task scenarios. Moreover, for some datasets, training only on the labels predicted by our model improved the efficiency of task selection, acting as a self-supervised learning regularization technique.
Piotr Miłkowski, Konrad Karanowski, Patryk Wielopolski, Jan Kocoń, Przemysław Kazienko, and Maciej Zięba
IEEE
Designing predictive models for subjective problems in natural language processing (NLP) remains challenging, mainly because of their non-deterministic nature and the different ways different humans perceive the same content. This can be addressed by Personalized Natural Language Processing (PNLP), where the model exploits additional information about the reader to make more accurate predictions. However, current approaches require complete information about the recipients to be embedded directly, and recent methods focus on deterministic inference or simple frequency-based estimations of the probabilities. In this work, we overcome these limitations by proposing a novel approach that captures the uncertainty of the forecast using conditional normalizing flows. This allows us to model complex multimodal distributions and to compare various models using negative log-likelihood (NLL). In addition, the available sampling function allows for various interpretations of possible reader perception. We validated our method on three challenging, subjective NLP tasks, including emotion recognition and hate speech detection. A comparative analysis of generalized and personalized approaches revealed that our personalized solutions significantly outperform the baseline and provide more precise uncertainty estimates. We also present the impact on text interpretability and uncertainty studies. The information brought by the developed methods makes it possible to build hybrid models whose effectiveness surpasses classic solutions. Finally, we analyze and visualize the decision probabilities for texts with a high entropy of annotations and for annotators with mixed views.
Bartłomiej Koptyra, Anh Ngo, Łukasz Radliński, and Jan Kocoń
Springer Nature Switzerland
Jan Kocoń, Joanna Baran, Kamil Kanclerz, Michał Kajstura, and Przemysław Kazienko
Springer Nature Switzerland
Małgorzata Wierzba, Monika Riegel, Jan Kocoń, Piotr Miłkowski, Arkadiusz Janz, Katarzyna Klessa, Konrad Juszczyk, Barbara Konat, Damian Grimling, Maciej Piasecki, et al.
Springer Science and Business Media LLC
Emotion lexicons are useful in research across various disciplines, but the availability of such resources remains limited for most languages. While existing emotion lexicons typically comprise words, it is a particular meaning of a word (rather than the word itself) that conveys emotion. To mitigate this issue, we present the Emotion Meanings dataset, a novel dataset of 6000 Polish word meanings. The word meanings are derived from the Polish wordnet (plWordNet), a large semantic network interlinking words by means of lexical and conceptual relations. The word meanings were manually rated for valence and arousal, along with a variety of basic emotion categories (anger, disgust, fear, sadness, anticipation, happiness, surprise, and trust). The annotations were found to be highly reliable, as demonstrated by the similarity between data collected in two independent samples: unsupervised (n = 21,317) and supervised (n = 561). Although we found the annotations to be relatively stable for female, male, younger, and older participants, we share both summary data and individual data to enable emotion research on different demographically specific subgroups. The word meanings are further accompanied by the relevant metadata, derived from open-source linguistic resources. Direct mapping to Princeton WordNet makes the dataset suitable for research on multiple languages. Altogether, this dataset provides a versatile resource that can be employed for emotion research in psychology, cognitive science, psycholinguistics, computational linguistics, and natural language processing.
Joanna Baran and Jan Kocoń
IEEE
Neuro-symbolic approaches explore ways to combine neural networks with traditional symbolic knowledge. These methods are gaining attention due to their efficiency and their need for less data compared to currently used deep models. This work investigated several neuro-symbolic models for sentiment analysis, focusing on a variety of ways to add linguistic knowledge to a transformer-based architecture. English and Polish WordNets were used as a knowledge source, together with their polarity extensions (SentiWordNet, plWordNet Emo). The neuro-symbolic methods using knowledge during fine-tuning were neither better nor worse than the baseline model. However, a statistically significant gain of about three percentage points in F1-macro was obtained for the SentiLARE model, which applied domain data (word sentiment labels) already at the pretraining stage; the gain was most visible for medium-sized training sets. Developing an effective neuro-symbolic model is therefore not trivial. The conclusions drawn from this work indicate a further need for a detailed study of these approaches, especially in natural language processing. In the context of sentiment classification, this could help design more efficient AI systems that can be deployed in business or marketing.
Joanna Szolomicka and Jan Kocoń
IEEE
The paper addresses the important problem of multilingual and language-agnostic approaches to the aspect-based sentiment analysis (ABSA) task, using modern transformer-based approaches. We propose a new dataset based on automatic translation of the Polish AspectEmo dataset, together with cross-lingual transfer of tags describing aspect polarity. The result is the MultiAspectEmo dataset, translated into five other languages: English, Czech, Spanish, French, and Dutch. We also present the original TrAsp (Transformer-based Aspect Extraction and Classification) method, which significantly outperforms methods from the literature on the ABSA task. In addition, we present multilingual and language-agnostic variants of this method, evaluated on the MultiAspectEmo and SemEval2016 datasets. We also test various language models for the ABSA task, including compressed models that give promising results while significantly reducing inference time and memory usage.
Wojciech Korczynski and Jan Kocoń
IEEE
Transformer models like BERT have significantly improved performance on many NLP tasks, e.g., sentiment analysis. However, their large number of parameters makes real-world applications difficult because of computational costs and latency. Many compression methods based on quantization, weight pruning, and knowledge distillation have been proposed to solve this problem. In this work, we explore several of these task-specific and task-agnostic methods by comparing their effectiveness and quality on the MultiEmo sentiment analysis dataset. Additionally, we analyze their ability to generalize and capture sentiment features by conducting domain-sentiment experiments. The results show that the compression methods reduce model size by 8.6 times and inference time by 6.9 times compared to the original model while maintaining unimpaired quality. Smaller models perform better on tasks with less data and retain greater generalization ability after fine-tuning because they are less prone to overfitting. The best trade-off is obtained with the task-agnostic XtremeDistil model.
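As a toy illustration of one of the compression families mentioned above (quantization), here is a pure-Python sketch of symmetric 8-bit weight quantization; the helper names are made up for illustration, and real pipelines rely on library support with integer kernels rather than Python lists.

```python
def quantize_int8(weights):
    """Map float weights to int8 values with a single shared scale.

    Symmetric quantization: the largest absolute weight maps to 127,
    so every weight is stored as round(w / scale) with |q| <= 127.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.03, 1.0]
q, s = quantize_int8(w)
recovered = dequantize(q, s)

# Rounding guarantees each recovered weight is within half a step of the original.
assert all(abs(a - b) <= s / 2 + 1e-12 for a, b in zip(w, recovered))
```

Storing `q` as int8 takes a quarter of the memory of float32 weights; the quality-preserving results reported in the abstract come from far more careful schemes (per-channel scales, distillation), not from this naive version.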
Julita Bielaniewicz, Kamil Kanclerz, Piotr Miłkowski, Marcin Gruza, Konrad Karanowski, Przemysław Kazienko, and Jan Kocoń
IEEE
As humans, we experience a wide range of feelings and reactions. One of these is laughter, often related to a personal sense of humor and the perception of funny content. Due to its subjective nature, recognizing humor is a very challenging NLP task. Here, we present a new, personalized approach to predicting humor in text that takes into account both the text and the context of the content receiver. For that purpose, we proposed four Deep-SHEEP learning models that exploit user preference information in different ways. The experiments were conducted on four datasets: Cockamamie, HUMOR, Jester, and Humicroedit. The results show that applying an innovative personalized, user-centric perspective significantly improves performance compared to generalized methods. Moreover, even for random text embeddings, our personalized methods outperform the generalized ones in the subjective humor modeling task. We also argue that user-related data reflecting an individual sense of humor is as important as the evaluated text itself. Different types of humor were investigated as well.
Karol Gawron, Konrad Wojtasik, Bartłomiej Bojanowski, Arkadiusz Janz, Jan Kocoń, Tomasz Krupa, Agnieszka Kukałowicz, Piotr Miłkowski, Maciej Piasecki, Michał Pogoda, et al.
Springer International Publishing