The language of discrimination: assessing attention discrimination by Hungarian local governments. Jakab Buda, Renáta Németh, Bori Simonovits, Gábor Simonovits. Language Resources and Evaluation, 2023. In our study we assess the responsiveness of Hungarian local governments to requests for information by Roma and non-Roma clients, relying on a nationwide correspondence study. Our paper has both methodological and substantive relevance. The methodological novelty is that we treat discrimination as a classification problem and study to what extent emails written to Roma and non-Roma clients can be distinguished, which in turn serves as a metric of discrimination in general. We show that it is possible to detect discrimination in textual data in an automated way without human coding, and that machine learning (ML) may detect features of discrimination that human coders may not recognize. To the best of our knowledge, our study is the first attempt to assess discrimination using ML techniques. From a substantive point of view, our study focuses on the linguistic features the algorithm detects behind the discrimination. Our models performed significantly better than random classification (the accuracy of the best of our models was 61%), confirming the differential treatment of Roma clients. The most important predictors showed that the answers sent to ostensibly Roma clients are not only shorter, but their tone is also less polite and more reserved, supporting the idea of attention discrimination, in line with the results of Bartos et al. (2016). A higher level of attention discrimination is detectable against male senders and in smaller settlements. Our results can also be interpreted as digital discrimination in the sense in which Edelman and Luca (2014) use this term.
The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities. Domonkos Sik, Márton Rakovics, Jakab Buda, Renáta Németh. Journal of Computational Social Science, 2023. While depression is globally on the rise, the mental health sector struggles with handling the increased number of cases, especially since the pandemic. These circumstances have resulted in an increased interest in the e-mental health sector. The dataset consists of 67,857 posts from the most popular English-language online health forums between 15 February 2016 and 15 February 2019. The posts were first automatically labelled (biomedical vs. psy framing) via deep learning; second, the time series of framing types of recurring forum users were analysed; third, the clusters of biomedical and psy patterns were analysed; fourth, the discursive characteristics of each cluster were analysed with the help of topic modelling. Five ideal-typical patterns of forum socialization are described: the first and the second clusters express the development of a ‘recovery helper’ role, either by opposing expert discourses or by identifying with the psy discourses; the third cluster expresses the acquisition of a substantively diffuse, uncertain role; the fourth and fifth clusters refer to a trajectory leading to the incorporation of a biomedically framed patient role, or a therapeutic psy subjectivity. Representativeness is potentially undermined by several elements of the data collection: online forum users, open and public forums, and keyword search. The trajectories identified in our study represent various phases of a general forum socialization process: newcomers (cluster 3); settled patient role (cluster 4) or psy subjectivity (cluster 5); recovery helpers (clusters 1 and 2).
Trust in the household. Endre Sik, Jakab Buda. Szociologiai Szemle, 2023. In this study we analyse a model that includes every household member individually but examines them jointly. Our question: how are the level and the inequality of trust among household members related to the members' sociological characteristics, to the inequalities between them, and to the characteristics of the household as a whole? The analysis is based on the 2015 EU-SILC database. To estimate the level of trust we used institutional and generalized trust, as well as the mean of the variable constructed from them; to estimate the inequality of trust we used the dispersion of these variables. We found that a higher level of trust is associated with higher social status (a good residential environment, a good financial situation, higher educational attainment and more friendships) and with a larger share of women living in the household. Among households burdened by the difficulties of everyday life (the presence of a household member with a limitation, poor housing and a poor residential environment), the level of trust is lower. Trust is distributed more unequally among the members of a household when educational attainment is high and friendships are numerous. With the exception of educational attainment, every inequality between household members increases the heterogeneity of trust, which suggests that a kind of "inequality package" operates within the household and that trust is part of it.
Using N-grams and Statistical Features to Identify Hate Speech Spreaders on Twitter. CEUR Workshop Proceedings, 2021
An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter. Notebook for PAN at CLEF 2020. CEUR Workshop Proceedings, 2020
Bot or Not: A Two-Level Approach in Author Profiling. Notebook for PAN at CLEF 2019. CEUR Workshop Proceedings, 2019
RECENT SCHOLAR PUBLICATIONS
From Polarization to Consensus? A Comparative Analysis of Refugee and Migrant Discourse in Belgium and Hungary Across Parliamentary, Media, and Social Media Layers (2015/16 and … S Kiyak, J Buda, M Gosztonyi, C Meeusen, R Németh, I Barna, ... Etmaal 2025, Date: 2025/02/03-2025/02/04, Location: Bruges, 2025
A felügyelt gépi tanulás alkalmazási lehetőségei szöveges adatokon. A magyar országgyűlésben 1998–2018 között elhangzott beszédek elemzése = The application of supervised … JM Buda, R Németh STATISZTIKAI SZEMLE 102 (11), 1087-1103, 2024
The language of discrimination: assessing attention discrimination by Hungarian local governments J Buda, R Németh, B Simonovits, G Simonovits Language Resources and Evaluation 57 (4), 1547-1570, 2023. Citations: 5
The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities D Sik, M Rakovics, J Buda, R Németh Journal of Computational Social Science 6 (2), 781-802, 2023. Citations: 11
The language of discrimination: assessing attention discrimination by Hungarian local governments using machine learning R Nemeth, J Buda, B Simonovits XX ISA World Congress of Sociology (June 25-July 1, 2023), 2023. Citations: 1
Using N-grams and Statistical Features to Identify Hate Speech Spreaders on Twitter. E Katona, J Buda, F Bolonyai CLEF (Working Notes), 2025-2034, 2021. Citations: 11
An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter. J Buda, F Bolonyai CLEF (Working Notes), 2020. Citations: 64
Bot Or Not: A Two-Level Approach In Author Profiling. F Bolonyai, J Buda, E Katona CLEF (Working Notes), 2019. Citations: 2