Rui Manuel Sousa Silva

@sigarra.up.pt

CLUP, Faculty of Arts and Humanities, University of Porto
Faculty of Arts and Humanities, University of Porto

Rui Sousa-Silva is an assistant professor at the Faculty of Arts and Humanities and a researcher and Scientific Coordinator of the Centre for Linguistics (CLUP) of the University of Porto, where he conducts research in Forensic Linguistics, notably authorship analysis, plagiarism detection and analysis, and cybercrime. He is also a member of the Scientific Committee of the Master's in Translation and Language Services and Coordinator of the Specialisation Course in Forensic Linguistics. He has a first degree in Translation and a Master's in Terminology and Translation, both awarded by the Faculty of Arts and Humanities of the University of Porto, and a PhD in Applied Linguistics from Aston University (Birmingham, UK), where he submitted his thesis on Forensic Linguistics. He is co-editor of the journal Language and Law and co-editor of the Routledge Handbook of Forensic Linguistics. He was LITHME WG1-Computational Linguistics Chair (2020-2024).

EDUCATION

PhD in Applied Linguistics - Forensic Linguistics, Aston University, Birmingham, UK

RESEARCH, TEACHING, or OTHER INTERESTS

Linguistics and Language, Arts and Humanities
Scopus Publications (24)

  • "The facts speak for themselves": dismantling conspiracy theories as disinformation
    Rui Sousa-Silva
    Linguistics Vanguard, 2026
    Disinformation has been commonly approached as fake news, i.e., news that does not comply with the principles of factuality, objectivity, and neutrality. However, not all pieces of disinformation are damaging (e.g., satire) or rely on lack of factuality. Rather, it is the combination of lack of factuality and intention to deceive that embodies the most serious form of disinformation. By taking advantage of echo chambers and filter bubbles, disinformers use distorted facts to disseminate alternative forms of disinformation and manipulate readers’ views. This article discusses the relevance of establishing a sociolect of conspiracy theories (CTs) as alternative sources of disinformation. It builds on a small corpus of CTs published in Portuguese to explore the use of forensic linguistics methods to assist the detection of disinformation. A holistic linguistics approach is adopted, which operates by scrutinizing metadata, structure, and discourse to identify which linguistic features depart from mainstream sources and which ones overlap, and thus understanding the linguistic materializations of CTs. The findings reveal promising results in detecting CTs as alternative sources of disinformation, with the additional advantage of substantiating judgements of disinformation with linguistic evidence. This article concludes with a discussion of the limitations of this exploratory research.
  • Disinformation in the Age of Permacrisis: The Route to Lawlessness?
    Rui Sousa-Silva
    International Journal for the Semiotics of Law, 2025
  • Function words as possible style markers: an application to the forensic authorship analysis of Getúlio Vargas’ suicide note (carta-testamento)
    Viviane Costa, Rui Sousa-Silva
    Delta Documentacao De Estudos Em Linguistica Teorica E Aplicada, 2025
    In August 1954, the then president of Brazil, Getúlio Vargas, took his own life with a gunshot to the chest. Next to the body a letter was found, which became known as the carta-testamento and whose authorship was contested at the time. People close to Vargas attributed the authorship of the letter to his close friend and speechwriter, José Soares Maciel Filho. Vargas' suicide and the message left in his last letter changed the course of Brazilian history and, for that reason, we believe it is important to analyse his suicide note from the perspective of forensic authorship analysis. Our analysis focuses, however, only on function words. To that end, we also analyse other texts of known authorship, by both Getúlio Vargas and Maciel Filho. The results, part of a broader research project, show that there are considerable differences between the carta-testamento and the other suicide notes written by Vargas. In addition, the results of the analysis suggest that the use of certain words seems to be more closely linked to the relationship between speaker and addressee than to genre and textual topics.
  • Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context
    Felermino D. M. A. Ali, Henrique Lopes Cardoso, Rui Sousa-Silva
    EMNLP 2025: 2025 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, 2025
  • Evaluating WMT 2025 Metrics Shared Task Submissions on the SSA-MTE African Challenge Set
    Senyu Li, Felermino Dario Mario Ali, Jiayi Wang, Rui Sousa-Silva, Henrique Lopes Cardoso, et al.
    Conference on Machine Translation Proceedings, 2025
  • SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?
    Senyu Li, Jiayi Wang, Felermino D. M. A. Ali, Colin Cherry, Daniel Deutsch, et al.
    EMNLP 2025: 2025 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, 2025
  • ‘We Attempted to Deliver Your Package’: Forensic Translation in the Fight Against Cross-Border Cybercrime
    Rui Sousa-Silva
    International Journal for the Semiotics of Law, 2024
    Cybercrime has increased significantly, recently, as a result of both individual and group criminal practice, and is now a threat to individuals, organisations, and democratic systems worldwide. However, cybercrime raises two main challenges for legal systems: firstly, because cybercriminals operate online, cybercrime spans beyond the boundaries of specific jurisdictions, which constrains the operation of the police and, subsequently, the conviction of the perpetrators; secondly, since cybercriminals can operate from anywhere in the world, law enforcement agencies struggle to identify the origin of the communications, especially when obfuscation strategies are used, e.g. dark web fora. Nevertheless, cybercriminals inherently use language to communicate, so the linguistic analysis of suspect communications is particularly helpful in deterring cybercriminal practice. This article reports the potential of forensic translation in the fight against cybercrime. Although the term ‘forensic translation’ is typically understood as a synonym of ‘legal translation’, it is argued that the implications of forensic translation span beyond those of legal translation, to include analyses of language rights, of the right to interpretation and translation in legal procedures (in the EU), or even investigative and intelligence practices. Translation is a pervasive activity that is conducted, not only by professional translators, but also by lay speakers of language, often using machine translation systems. The ease of use of the latter makes it particularly suitable for cross-border criminal (e.g. extortion or fraud) and cybercriminal communications (e.g. cybertrespass, cyberfraud, cyberpiracy, cyberporn or child online porn, cyberviolence or cyberstalking). This article presents the results of the analysis of cybercriminal communications from a forensic translation perspective. It demonstrates that translation is frequently used to spread cybercriminal communications, and that reverse-engineering the translational procedure will assist law enforcement agencies in narrowing down their pool of suspects and, consequently, deter cybercriminal threats.
  • Cyber Hate Speech Detection and Analysis: An Evidence-Based Forensic Linguistics Approach
    Rui Sousa-Silva
    Law and Visual Jurisprudence, 2024
  • Building Resources for Emakhuwa: Machine Translation and News Classification Benchmarks
    Felermino D. M. A. Ali, Henrique Lopes Cardoso, Rui Sousa-Silva
    EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, 2024
  • Expanding FLORES+ Benchmark for more Low-Resource Settings: Portuguese-Emakhuwa Machine Translation Evaluation
    Conference on Machine Translation Proceedings, 2024
  • Detecting Loanwords in Emakhuwa: An Extremely Low-Resource Bantu Language Exhibiting Significant Borrowing From Portuguese
    2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Main Conference Proceedings, 2024
  • Introduction: Understanding Language in the Human-Machine Era
    Proceedings of the 1st LUHME Workshop, 2024
  • Network-based Approach for Stopwords Detection
    Proceedings of the 16th International Conference on Computational Processing of the Portuguese Language (PROPOR 2024), 2024
  • Argumentation models and their use in corpus annotation: Practice, prospects, and challenges
    Henrique Lopes Cardoso, Rui Sousa-Silva, Paula Carvalho, Bruno Martins
    Natural Language Engineering, 2023
  • Fighting the Fake: A Forensic Linguistic Analysis to Fake News Detection
    Rui Sousa-Silva
    International Journal for the Semiotics of Law, 2022
  • Annotating Arguments in a Corpus of Opinion Articles
    2022 Language Resources and Evaluation Conference (LREC 2022), 2022
  • Predicting Argument Density from Multiple Annotations
    Gil Rocha, Bernardo Leite, Luís Trigo, Henrique Lopes Cardoso, Rui Sousa-Silva, et al.
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022
  • Automated Fake News Detection Using Computational Forensic Linguistics
    Ricardo Moura, Rui Sousa-Silva, Henrique Lopes Cardoso
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2021
  • Knowledge Organization in the New Era Using DIY Corpora as Writing Assistants
    Advances in Knowledge Organization, 2020
  • Biased Language Detection in Court Decisions
    Alexandra Guedes Pinto, Henrique Lopes Cardoso, Isabel Margarida Duarte, Catarina Vaz Warrot, Rui Sousa-Silva
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2020
  • Plagiarism across languages and cultures: A (forensic) linguistic analysis
    Rui Sousa-Silva
    Handbook of the Changing World Language Map, 2019
  • Team Fernando-Pessa at SemEval-2019 task 4: Back to basics in hyperpartisan news detection
    André Cruz, Gil Rocha, Rui Sousa-Silva, Henrique Lopes Cardoso
    NAACL HLT 2019: International Workshop on Semantic Evaluation (SemEval 2019), Proceedings of the 13th Workshop, 2019
  • 'twazn me!!! ;(' automatic authorship analysis of micro-blogging messages
    Rui Sousa Silva, Gustavo Laboreiro, Luís Sarmento, Tim Grant, Eugénio Oliveira, et al.
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
  • Comparing sentence-level features for authorship analysis in Portuguese
    Rui Sousa-Silva, Luís Sarmento, Tim Grant, Eugénio Oliveira, Belinda Maia
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010

CONSULTANCY

Forensic Linguistic Analysis