Maarten Janssen

@cuni.cz

Researcher
UFAL, MFF, Charles University

Scopus Publications: 20
Scholar Citations: 927
Scholar h-index: 17
Scholar i10-index: 25

Scopus Publications

  • Advancing African NLP: UDMorph and flexiPipe
    Maarten Janssen
    AfricaNLP 2026 7th Workshop on African Natural Language Processing Proceedings of the Workshop, 2026
  • Alignment of Historical Manuscript Transcriptions and Translations
    Maarten Janssen, Piroska Lendvai, Anna Jouravel
    International Conference Recent Advances in Natural Language Processing RANLP, 2025
  • UDMorph: Morphosyntactically Tagged UD Corpora
    2024 Joint International Conference on Computational Linguistics Language Resources and Evaluation LREC-COLING 2024 Main Conference Proceedings, 2024
  • ParlaMint in TEITOK
    ParlaCLARIN 4th Workshop on Creating Analysing and Increasing Accessibility of Parliamentary Corpora at LREC-COLING 2024 Proceedings, 2024
  • UDWiki: Guided creation and exploitation of UD treebanks
    UDW 2021 5th Workshop on Universal Dependencies Proceedings to Be Held as Part of SyntaxFest 2021, 2021
  • A Corpus with Wavesurfer and TEI: Speech and Video in TEITOK
    Maarten Janssen
    Lecture Notes in Computer Science, 2021
  • TEITOK as a tool for dependency grammar
    Maarten Janssen
    Procesamiento del Lenguaje Natural, 2018
    TEITOK is an online platform for visualizing, searching, and editing TEI-based corpora. TEITOK has a modular set-up, and amongst other things provides an interface to easily add dependency relations to a tokenized TEI document, and provides the option to visualize the dependency trees, edit them, and search through the corpus using dependency relations.
  • Adding words to manuscripts: From pagesxml to TEITOK
    Maarten Janssen
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2018
  • Dependency Graphs and TEITOK: Exploiting Dependency Parsing
    Maarten Janssen
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2018
  • Technical Implementation of the Vocabulário Ortográfico Comum da Língua Portuguesa
    Maarten Janssen, José Pedro Ferreira
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2018
  • The CPLP Corpus: A pluricentric corpus for the common Portuguese spelling dictionary (VOC)
    EURALEX Proceedings, 2018
  • TEITOK: Text-faithful annotated corpora
    Proceedings of the 10th International Conference on Language Resources and Evaluation LREC 2016, 2016
  • POS tagging and less resources languages individuated features in Corpus Wiki
    Maarten Janssen
    Lecture Notes in Computer Science, 2016
  • The COPLE2 corpus: A learner corpus for Portuguese
    Proceedings of the 10th International Conference on Language Resources and Evaluation LREC 2016, 2016
  • A rule based pronunciation generator and regional accent databank for Portuguese
    13th Annual Conference of the International Speech Communication Association 2012 Interspeech 2012, 2012
  • NeoTag: A POS tagger for grammatical neologism detection
    Proceedings of the 8th International Conference on Language Resources and Evaluation LREC 2012, 2012
  • The common orthographic vocabulary of the Portuguese language: a set of open lexical resources for a pluricentric language
    Proceedings of the 8th International Conference on Language Resources and Evaluation LREC 2012, 2012
  • Combining resources: Taxonomy extraction from multiple dictionaries
    Proceedings of the 7th International Conference on Language Resources and Evaluation LREC 2010, 2010
  • Spock - A spoken corpus client
    Proceedings of the 6th International Conference on Language Resources and Evaluation LREC 2008, 2008
  • Translatable terminology and knowledge representation: Usage of hyponymic relations
    Maarten Janssen, Marc Van Campenhoudt
    Langages, 2005

RECENT SCHOLAR PUBLICATIONS

  • Advancing African NLP: UDMorph and flexiPipe
    M Janssen
    Proceedings of the 7th Workshop on African Natural Language Processing …, 2026
  • Dynamic arthroscopy in retropatellar cartilage defect: identifying trochlear ridge impingement and guiding towards mini-trochleoplasty
    N Merkelbach, HD Veldman, MPF Janssen, PJ Emans
    BMJ Case Reports CP 19 (1), e270085, 2026
  • Alignment of historical manuscript transcriptions and translations
    M Janssen, P Lendvai, A Jouravel
    Proceedings of the 15th International Conference on Recent Advances in …, 2025
    Citations: 1
  • Searchable Language Documentation Corpora: DoReCo meets TEITOK
    M Janssen, F Seifart
    Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics …, 2025
  • Forecasting the value of innovation in total knee arthroplasty care: a headroom approach
    TM Otten, SE Grimm, B Ramaekers, A Roth, P Emans, T Boymans, ...
    Journal of Experimental Orthopaedics 11 (4), e70096, 2024
    Citations: 2
  • ParlaMint in TEITOK
    M Janssen, M Kopp
    Proceedings of the IV Workshop on Creating, Analysing, and Increasing …, 2024
    Citations: 3
  • UDMorph: Morphosyntactically Tagged UD Corpora
    M Janssen
    Proceedings of the 2024 Joint International Conference on Computational …, 2024
    Citations: 2
  • CLARIN in Training and Education
    K De Smedt, I Van der Lek, H Van den Heuvel, A Balvet, M Janssen, ...
    Linköping ECP, 2024
    Citations: 2
  • Dynamically chaining APIs: From Dracor to TEITOK
    M Janssen
    CLARIN Annual Conference Proceedings, 116-119, 2023
    Citations: 1
  • Deliverable/Document Information
    S Cinková, JM Birkholz, I Börner, T Dejaeghere, S Heiden, M Janssen, ...
    2023
  • Topvorm: lessen voor de prestatiemaatschappij
    M Janssen, K Putters
    Prometheus, 2022
  • Endochondral ossification in the damaged joint
    MPF Janssen
    2022
  • Twenty-two-year outcome of cartilage repair surgery by perichondrium transplantation
    MPF Janssen, EGM van der Linden, TAEJ Boymans, TJM Welting, ...
    Cartilage 13 (1_suppl), 860S-867S, 2021
    Citations: 18
  • UDWiki: guided creation and exploitation of UD treebanks
    M Janssen
    Proceedings of the Fifth Workshop on Universal Dependencies (UDW, SyntaxFest …, 2021
    Citations: 1
  • 7-Tesla MRI evaluation of the knee, 25 years after cartilage repair surgery: the influence of intralesional osteophytes on biochemical quality of cartilage
    MPF Janssen, MJM Peters, EGM Steijvers-Peeters, P Szomolanyi, ...
    Cartilage 13 (1_suppl), 767S-779S, 2021
    Citations: 2
  • Response to Comment by Filardo et al.
    MPF Janssen, TAEJ Boymans, TJM Welting, LW van Rhijn, SK Bulstra, ...
    Cartilage 13 (Suppl 1), 1829S-1829S, 2021
  • A corpus with wavesurfer and TEI: Speech and video in TEITOK
    M Janssen
    International Conference on Text, Speech, and Dialogue, 261-268, 2021
    Citations: 6
  • Integrating TEITOK and KonText/PMLTQ at LINDAT
    M Janssen
    CLARIN Annual Conference, 104-110, 2020
    Citations: 1
  • Da edición dixital á análise lingüística. A creación de corpus históricos na plataforma TEITOK
    M Janssen, G Vaamonde dos Santos
    2020
    Citations: 15
  • Corpus de produções escritas de aprendentes de PL2 (PEAPL2): Subcorpus Português língua estrangeira
    C Martins, T Ferreira, M Sitoe, C Abrantes, M Janssen, A Fernandes, ...
    Coimbra: CELGA-ILTEC, 2019
    Citations: 16

MOST CITED SCHOLAR PUBLICATIONS

  • TEITOK: Text-faithful annotated corpora
    M Janssen
    Proceedings of the Tenth International Conference on Language Resources and …, 2016
    Citations: 136
  • The COPLE2 corpus: a learner corpus for Portuguese
    A Mendes, S Antunes, M Janssen, A Gonçalves
    Proceedings of the Tenth International Conference on Language Resources and …, 2016
    Citations: 67
  • Multilingual Lexical Databases, Lexical Gaps, and SIMuLLDA
    M Janssen
    International Journal of Lexicography 17 (2), 137-154, 2004
    Citations: 60
  • NeoTag: a POS Tagger for Grammatical Neologism Detection.
    M Janssen
    LREC, 2118-2124, 2012
    Citations: 53
  • Orthographic neologisms selection criteria and semi-automatic detection
    M Janssen
    Instituto de Linguística Teórica e Computacional 21, 1-13, 2011
    Citations: 51
  • Open source lexical information network
    M Janssen
    Third international workshop on generative approaches to the lexicon, 400-401, 2005
    Citations: 33
  • 6.4 The codification of usage by labels
    HJ Verkuyl, M Janssen, F Jansen
    A practical guide to lexicography 6, 297, 2003
    Citations: 31
  • SIMuLLDA: a Multilingual Lexical Database Application Using a Structured Interlingua
    M Janssen
    PQDT-Global, 2018
    Citations: 29
  • Towards error annotation in a learner corpus of Portuguese
    IDR Gayo, S Antunes, A Mendes, M Janssen
    Proceedings of the joint workshop on NLP for Computer Assisted Language …, 2016
    Citations: 26
  • Working together towards an ideal infrastructure for language learner corpora
    EW Stemle, A Boyd, M Janssen, T Lindström Tiedemann, ...
    Widening the scope of learner corpus research: selected papers from the …, 2019
    Citations: 24
  • The Common Orthographic Vocabulary of the Portuguese Language: a set of open lexical resources for a pluricentric language.
    JP Ferreira, M Janssen, GB de Oliveira, M Correia, GM De Oliveira
    LREC, 1071-1075, 2012
    Citations: 23
  • Detección de Neologismos: una perspectiva computacional
    M Janssen
    Debate Terminológico, 2009
    Citations: 23
  • Terminologie traductive et représentation des connaissances: l'usage des relations hyponymiques
    M Janssen, M Van Campenhoudt
    Langages 157 (1), 63-80, 2005
    Citations: 22
  • TEITOK-a Tokenized TEI environment
    M Janssen
    Lisboa: Centro de Linguística da Universidade de Lisboa. http://alfclul. clul …, 2014
    Citations: 20
  • Between inflection and derivation: Paradigmatic lexical functions in morphological databases
    M Janssen
    East-West encounter: Second International Conference on Meaning and Text …, 2005
    Citations: 20
  • Twenty-two-year outcome of cartilage repair surgery by perichondrium transplantation
    MPF Janssen, EGM van der Linden, TAEJ Boymans, TJM Welting, ...
    Cartilage 13 (1_suppl), 860S-867S, 2021
    Citations: 18
  • NeoTrack: semi-automatic neologism detection
    M Janssen
    APL XXI, Porto, Portugal, 2005
    Citations: 17
  • Corpus de produções escritas de aprendentes de PL2 (PEAPL2): Subcorpus Português língua estrangeira
    C Martins, T Ferreira, M Sitoe, C Abrantes, M Janssen, A Fernandes, ...
    Coimbra: CELGA-ILTEC, 2019
    Citations: 16
  • Da edición dixital á análise lingüística. A creación de corpus históricos na plataforma TEITOK
    M Janssen, G Vaamonde dos Santos
    2020
    Citations: 15
  • Governance of local care & social service: An evaluation of the implementation of the Wmo in the Netherlands
    K Putters, K Grit, M Janssen, D Schmidt, P Meurs
    Erasmus University Rotterdam. Institute of Health Policy & Management. https …, 2010
    Citations: 15