Stefan Rebrikov

@hse.ru

Theoretical, applied and comparative linguistics, Faculty of Humanities
National Research University Higher School of Economics

RESEARCH, TEACHING, or OTHER INTERESTS

Language and Linguistics, Artificial Intelligence

Scopus Publications: 3
Scholar Citations: 5
Scholar h-index: 2

Scopus Publications

  • RuCAM: Comparative Argumentative Machine for the Russian Language
    Maria Maslova, Stefan Rebrikov, Anton Artsishevski, Sebastian Zaczek, Chris Biemann, Irina Nikishina
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2024
  • The Effect of (Historical) Language Variation on the East Slavic Lects Lemmatisers Performance
    Ilia Afanasev, Olga Lyashevskaya, Stefan Rebrikov, Yana Shishkina, Igor Trofimov, Natalia Vlasova
    Jazykovedny Casopis, 2023
    The need to develop tools for historical and regional language varieties is becoming more urgent in natural language processing. In this paper, we present two candidate systems for lemmatising historical East Slavic lects (Late Old East Slavic and Middle Russian), as well as modern regional East Slavic lects (Belogornoje and Megra): a BERT-based end-to-end pipeline with language-specific heuristics and a sequence-to-sequence BART-based encoder-decoder. To evaluate their predictions, we use accuracy and string similarity measures such as Levenshtein distance. The BERT-based model is better suited to the regional data, achieving 85% accuracy there but only 74% on the historical data. The BART-based model reaches 92.6% accuracy on the historical data, yet only 80% on the regional data. We provide an error analysis and discuss ways to enhance the models, such as dictionary lookup and a spellchecker.
  • Disambiguation in context in the Russian National Corpus: 20 years later
    Olga Lyashevskaya, Ilya Afanasev, Stefan Rebrikov, Yana Shishkina, Elena Suleymanova, Igor Trofimov, Natalia Vlasova
    Komp'juternaja Lingvistika i Intellektual'nye Tehnologii, 2023
    An updated annotation of the Main, Media, and some other corpora of the Russian National Corpus (RNC) features the part-of-speech and other morphological information, lemmas, dependency structures, and constituency types. Transformer-based architectures are used to resolve the homonymy in context according to a schema based on the manually disambiguated subcorpus of the Main corpus (morphology and lexicon) and UD-SynTagRus (syntax). The paper discusses the challenges in applying the models to texts of different registers, orthographies, and time periods, on the one hand, and making the new version convenient for users accustomed to the old search practices, on the other. The re-annotated corpus data form the basis for the enhancement of the RNC tools such as word and n-gram frequency lists, collocations, corpus comparison, and Word at a glance.

RECENT SCHOLAR PUBLICATIONS

  • RuCAM: comparative argumentative machine for the Russian language
    M Maslova, S Rebrikov, A Artsishevski, S Zaczek, C Biemann, I Nikishina
    International Conference on Analysis of Images, Social Networks and Texts, 78-91, 2023
    Citations: 3
  • Disambiguation in context in the Russian National Corpus: 20 years later
    MTS AI
    Proceedings of the International Conference “Dialogue 2023”, 2023
  • The effect of (historical) language variation on the East Slavic lects lemmatisers performance
    I Afanasev, O Lyashevskaya, S Rebrikov, Y Shishkina, I Trofimov, ...
    Jazykovedny Casopis 74 (1), 225-233, 2023
    Citations: 2

MOST CITED SCHOLAR PUBLICATIONS

  • RuCAM: comparative argumentative machine for the Russian language
    M Maslova, S Rebrikov, A Artsishevski, S Zaczek, C Biemann, I Nikishina
    International Conference on Analysis of Images, Social Networks and Texts, 78-91, 2023
    Citations: 3
  • The effect of (historical) language variation on the East Slavic lects lemmatisers performance
    I Afanasev, O Lyashevskaya, S Rebrikov, Y Shishkina, I Trofimov, ...
    Jazykovedny Casopis 74 (1), 225-233, 2023
    Citations: 2
  • Disambiguation in context in the Russian National Corpus: 20 years later
    MTS AI
    Proceedings of the International Conference “Dialogue 2023”, 2023