Rubén Solera-Ureña

@inesc-id.pt

INESC-ID Lisboa

14

Scopus Publications

363

Scholar Citations

8

Scholar h-index

7

Scholar i10-index

Scopus Publications

  • On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
    David Gimeno-Gómez, Rubén Solera-Ureña, Anna Pompili, Carlos-D. Martínez-Hinarejos, Rita Cardoso, et al.
    Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025
  • CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
    Carlos Carvalho, Francisco Teixeira, Catarina Botelho, Anna Pompili, Rubén Solera-Ureña, et al.
    2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2025), 2025
  • Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
    Catarina Botelho, David Gimeno-Gómez, Francisco Teixeira, John Mendonça, Patrícia Pereira, et al.
    Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025
  • Transfer learning-based cough representations for automatic detection of COVID-19
    Rubén Solera-Ureña, Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, et al.
    Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2021
  • Assessment of Parkinson's disease medication state through automatic speech analysis
    Anna Pompili, Rubén Solera-Ureña, Alberto Abad, Rita Cardoso, Isabel Guimarães, et al.
    Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2020
    Parkinson's disease (PD) is a progressive degenerative disorder of the central nervous system characterized by motor and non-motor symptoms. As the disease progresses, patients alternate periods in which motor symptoms are mitigated due to medication intake (ON state) and periods with motor complications (OFF state). The time that patients spend in the OFF condition is currently the main parameter employed to assess pharmacological interventions and to evaluate the efficacy of different active principles. In this work, we present a system that combines automatic speech processing and deep learning techniques to classify the medication state of PD patients by leveraging personal speech-based biomarkers. We devise a speaker-dependent approach and investigate the relevance of different acoustic-prosodic feature sets. Results show an accuracy of 90.54% in a test task with mixed speech and an accuracy of 95.27% in a semi-spontaneous speech task. Overall, the experimental assessment shows the potential of this approach for the development of reliable, remote daily monitoring and scheduling of medication intake of PD patients.
  • Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
    Francisco S. Melo, Alberto Sardinha, David Belo, Marta Couto, Miguel Faria, et al.
    Artificial Intelligence in Medicine, 2019
    This paper describes the INSIDE system, a networked robot system designed to allow the use of mobile robots as active players in the therapy of children with autism spectrum disorders (ASD). While a significant volume of work has explored the impact of robots in ASD therapy, most such work comprises remotely operated robots and/or well-structured interaction dynamics. In contrast, the INSIDE system allows for complex, semi-unstructured interaction in ASD therapy while featuring a fully autonomous robot. In this paper we describe the hardware and software infrastructure that supports such a rich form of interaction, as well as the design methodology that guided the development of the INSIDE system. We also present some results on the use of our system both in a pilot and in a long-term study comprising multiple therapy sessions with children at Hospital Garcia de Orta, in Portugal, highlighting the robustness and autonomy of the system as a whole.
  • A semi-supervised learning approach for acoustic-prosodic personality perception in under-resourced domains
    Rubén Solera-Ureña, Helena Moniz, Fernando Batista, Vera Cabarrão, Anna Pompili, et al.
    Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2017
    Automatic personality analysis has gained attention in recent years as a fundamental dimension in human-to-human and human-to-machine interaction. However, it still suffers from the limited number and size of speech corpora for specific domains, such as the assessment of children’s personality. This paper investigates a semi-supervised training approach to tackle this scenario. We devise an experimental setup with age and language mismatch and two training sets: a small labeled training set from the Interspeech 2012 Personality Sub-challenge, containing French adult speech labeled with personality OCEAN traits, and a large unlabeled training set of Portuguese children’s speech. As the test set, a corpus of Portuguese children’s speech labeled with OCEAN traits is used. Based on this setting, we investigate a weak supervision approach that iteratively refines an initial model trained with the labeled dataset using the unlabeled dataset. We also investigate knowledge-based features, which leverage expert knowledge in acoustic-prosodic cues and thus need no extra data. Results show that, despite the large mismatch imposed by language and age differences, it is possible to attain improvements with these techniques, pointing to the benefits of both weak supervision and expert-based acoustic-prosodic features across age and language.
  • Acoustic-prosodic automatic personality trait assessment for adults and children
    Rubén Solera-Ureña, Helena Moniz, Fernando Batista, Ramón F. Astudillo, Joana Campos, et al.
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2016
    This paper investigates the use of heterogeneous speech corpora for automatic assessment of personality traits in terms of the Big-Five OCEAN dimensions. The motivation for this work is twofold: the need to develop methods to overcome the lack of children’s speech corpora, particularly severe when targeting personality traits, and the interest in cross-age comparisons of acoustic-prosodic features to build robust paralinguistic detectors. For this purpose, we devise an experimental setup with age mismatch utilizing the Interspeech 2012 Personality Sub-challenge, containing adult speech, as training data. As test data, we use a corpus of children’s European Portuguese speech. We investigate various feature sets such as the Sub-challenge baseline features, the recently introduced eGeMAPS features and our own knowledge-based features. The preliminary results bring insights into cross-age and -language detection of personality traits in spontaneous speech, pointing to a stable set of acoustic-prosodic features for Extraversion and Agreeableness in both adult and child speech.
  • Real-time robust automatic speech recognition using compact support vector machines
    R. Solera-Ureña, A. I. García-Moral, C. Peláez-Moreno, Manel Martínez-Ramón, F. Díaz-de-María
    IEEE Transactions on Audio Speech and Language Processing, 2012
    In recent years, support vector machines (SVMs) have shown excellent performance in many applications, especially in the presence of noise. In particular, SVMs offer several advantages over artificial neural networks (ANNs) that have attracted the attention of the speech processing community. Nevertheless, their high computational requirements prevent them from being used in practice in automatic speech recognition (ASR), where ANNs have proven to be successful. The high complexity of SVMs in this context arises from the use of huge speech training databases with millions of samples and highly overlapped classes. This paper suggests the use of a weighted least squares (WLS) training procedure that makes it possible to impose a compact semiparametric model on the SVM, which results in a dramatic complexity reduction. Such a complexity reduction with respect to conventional SVMs, which is between two and three orders of magnitude, allows the proposed hybrid WLS-SVC/HMM system to perform real-time speech decoding on a connected-digit recognition task (SpeechDat Spanish database). The experimental evaluation of the proposed system shows encouraging performance levels in clean and noisy conditions, although further improvements are required to reach the maturity level of current context-dependent HMM-based recognizers.
  • Data balancing for efficient training of hybrid ANN/HMM automatic speech recognition systems
    Ana Isabel García-Moral, Rubén Solera-Ureña, Carmen Peláez-Moreno, Fernando Díaz-de-María
    IEEE Transactions on Audio Speech and Language Processing, 2011
    Hybrid speech recognizers, in which artificial neural networks (ANNs) replace the Gaussian mixture models (GMMs) usually used to estimate the emission pdfs of hidden Markov model (HMM) states, have several advantages over the classical systems. However, to obtain performance improvements, the computational requirements increase heavily because of the need to train the ANN. Starting from the observation of the remarkable skewness of speech data, this paper proposes sifting out the training set and balancing the number of samples per class. With this method, the training time has been reduced by a factor of 18 while obtaining performances similar to or even better than those with the whole database, especially in noisy environments. However, the application of these reduced sets is not straightforward. To avoid the mismatch between training and testing conditions created by the modification of the distribution of the training data, a proper scaling of the a posteriori probabilities obtained and a resizing of the context window need to be performed, as demonstrated in this paper.
  • UC3M high level feature extraction at TRECVID 2008
    2008 TREC Video Retrieval Evaluation Notebook Papers, 2008
  • Robust ASR using Support Vector Machines
    R. Solera-Ureña, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, F. Díaz-de-María
    Speech Communication, 2007
  • Hybrid models for automatic speech recognition: A comparison of classical ANN and kernel based methods
    Ana I. García-Moral, Rubén Solera-Ureña, Carmen Peláez-Moreno, Fernando Díaz-de-María
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2007
  • SVMs for automatic speech recognition: A survey
    R. Solera-Ureña, J. Padrell-Sendra, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, et al.
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2007

RECENT SCHOLAR PUBLICATIONS

  • CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
    C Carvalho, F Teixeira, C Botelho, A Pompili, R Solera-Ureña, S Paulo, ...
    arXiv preprint arXiv:2508.19721, 2025
    Citations: 2
  • Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
    C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
    Proc. Interspeech 2025, 1418-1422, 2025
    Citations: 4
  • On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
    D Gimeno-Gómez, R Solera-Ureña, A Pompili, CD Martínez-Hinarejos, ...
    Proc. Interspeech 2025, 2025-793, 2025
    Citations: 1
  • Tackling cognitive impairment detection from speech: A submission to the PROCESS Challenge
    C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
    arXiv preprint arXiv:2501.00145, 2024
    Citations: 3
  • Accelerat.AI: INESC-ID/IST-Universidade de Lisboa contributions towards improved conversational agents in European Portuguese
    A Abad, S Paulo, R Solera-Ureña, A Pompili
    Proc. IberSPEECH 2024, 293-296, 2024
  • Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech
    J Mendonça, R Solera-Ureña, A Abad, I Trancoso
    arXiv preprint arXiv:2107.00112, 2021
    Citations: 3
  • Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19
    R Solera-Ureña, C Botelho, F Teixeira, T Rolland, A Abad, I Trancoso
    Proceedings of INTERSPEECH 2021, 436-440, 2021
    Citations: 21
  • Assessment of Parkinson's Disease Medication State through Automatic Speech Analysis
    A Pompili, R Solera-Ureña, A Abad, R Cardoso, I Guimarães, M Fabbri, ...
    Proceedings of INTERSPEECH 2020, 4591-4595, 2020
    Citations: 20
  • Uma abordagem de aprendizagem semissupervisionada para a classificação automática de personalidade baseada em pistas acústico-prosódicas
    R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
    Revista da Associação Portuguesa de Linguística, 348-364, 2019
  • Affective analysis of customer service calls
    V Cabarrão, M Julião, R Solera-Ureña, H Moniz, F Batista, I Trancoso, ...
    10th International Conference of Experimental Linguistics (ExLing 2019), 37-40, 2019
    Citations: 3
  • Affective computing based on acoustic-prosodic cues
    H Moniz, R Solera-Ureña, V Cabarrão, M Julião, F Batista, I Trancoso
    14th Annual INGRoup Conference (Interdisciplinary Network for Group Research …, 2019
  • Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
    FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gambôa, ...
    Artificial Intelligence in Medicine 96, 198-216, 2019
    Citations: 65
  • Uma abordagem de aprendizagem semi-supervisionada para a percepção automática de personalidade, baseada em pistas acústico-prosódicas em domínios com poucos recursos
    R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
    XXXIV Encontro Nacional da Associação Portuguesa de Linguística, 2018
  • A Semi-Supervised Learning Approach for Acoustic-Prosodic Personality Perception in Under-Resourced Domains
    R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
    Proceedings of INTERSPEECH 2017, 929-933, 2017
    Citations: 8
  • Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children
    R Solera-Ureña, H Moniz, F Batista, R Fernández Astudillo, J Campos, ...
    Advances in Speech and Language Technologies for Iberian Languages …, 2016
    Citations: 8
  • Human-Robotic Agent Speech Interaction
    R Solera-Ureña, H Moniz
    2016
  • Real-time Robust Automatic Speech Recognition Using Compact Support Vector Machines
    R Solera-Ureña, AI García-Moral, C Peláez-Moreno, M Martínez-Ramón, ...
    IEEE Transactions on Audio, Speech, and Language Processing 20 (4), 1347-1361, 2012
    Citations: 39
  • Máquinas de vectores soporte para reconocimiento robusto de habla
    R Solera-Ureña
    Universidad Carlos III de Madrid, Spain, 2011
    Citations: 4
  • Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems
    AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
    IEEE Transactions on Audio, Speech, and Language Processing 19 (3), 468-481, 2011
    Citations: 36
  • UC3M high level feature extraction at TRECVID 2008
    I González-Díaz, D García-García, R Solera-Ureña, J Madrid-Sánchez, ...
    2008 TREC Video Retrieval Evaluation Workshop (TRECVID 2008), 2008

MOST CITED SCHOLAR PUBLICATIONS

  • Robust ASR using Support Vector Machines
    R Solera-Ureña, D Martín-Iglesias, A Gallardo-Antolín, C Peláez-Moreno, ...
    Speech Communication 49 (4), 253-267, 2007
    Citations: 68
  • SVMs for Automatic Speech Recognition: A Survey
    R Solera-Ureña, J Padrell-Sendra, D Martín-Iglesias, A Gallardo-Antolín, ...
    Progress in Nonlinear Speech Processing 4391, 190-216, 2007
    Citations: 67
  • Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
    FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gambôa, ...
    Artificial Intelligence in Medicine 96, 198-216, 2019
    Citations: 65
  • Real-time Robust Automatic Speech Recognition Using Compact Support Vector Machines
    R Solera-Ureña, AI García-Moral, C Peláez-Moreno, M Martínez-Ramón, ...
    IEEE Transactions on Audio, Speech, and Language Processing 20 (4), 1347-1361, 2012
    Citations: 39
  • Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems
    AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
    IEEE Transactions on Audio, Speech, and Language Processing 19 (3), 468-481, 2011
    Citations: 36
  • Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19
    R Solera-Ureña, C Botelho, F Teixeira, T Rolland, A Abad, I Trancoso
    Proceedings of INTERSPEECH 2021, 436-440, 2021
    Citations: 21
  • Assessment of Parkinson's Disease Medication State through Automatic Speech Analysis
    A Pompili, R Solera-Ureña, A Abad, R Cardoso, I Guimarães, M Fabbri, ...
    Proceedings of INTERSPEECH 2020, 4591-4595, 2020
    Citations: 20
  • A Semi-Supervised Learning Approach for Acoustic-Prosodic Personality Perception in Under-Resourced Domains
    R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
    Proceedings of INTERSPEECH 2017, 929-933, 2017
    Citations: 8
  • Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children
    R Solera-Ureña, H Moniz, F Batista, R Fernández Astudillo, J Campos, ...
    Advances in Speech and Language Technologies for Iberian Languages …, 2016
    Citations: 8
  • Hybrid models for automatic speech recognition: a comparison of classical ANN and kernel based methods
    AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
    Advances in Nonlinear Speech Processing (NOLISP 2007) 4885, 152-160, 2007
    Citations: 6
  • Estimación de probabilidades a posteriori en SVMs multiclase para reconocimiento de habla continua
    R Solera-Ureña, F Pérez-Cruz, F Díaz-de-María
    IV Jornadas en Tecnologías del Habla 2006 (JTH 2006), 2006
    Citations: 5
  • Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
    C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
    Proc. Interspeech 2025, 1418-1422, 2025
    Citations: 4
  • Máquinas de vectores soporte para reconocimiento robusto de habla
    R Solera-Ureña
    Universidad Carlos III de Madrid, Spain, 2011
    Citations: 4
  • Tackling cognitive impairment detection from speech: A submission to the PROCESS Challenge
    C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
    arXiv preprint arXiv:2501.00145, 2024
    Citations: 3
  • Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech
    J Mendonça, R Solera-Ureña, A Abad, I Trancoso
    arXiv preprint arXiv:2107.00112, 2021
    Citations: 3
  • Affective analysis of customer service calls
    V Cabarrão, M Julião, R Solera-Ureña, H Moniz, F Batista, I Trancoso, ...
    10th International Conference of Experimental Linguistics (ExLing 2019), 37-40, 2019
    Citations: 3
  • CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
    C Carvalho, F Teixeira, C Botelho, A Pompili, R Solera-Ureña, S Paulo, ...
    arXiv preprint arXiv:2508.19721, 2025
    Citations: 2
  • On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
    D Gimeno-Gómez, R Solera-Ureña, A Pompili, CD Martínez-Hinarejos, ...
    Proc. Interspeech 2025, 2025-793, 2025
    Citations: 1
  • Accelerat.AI: INESC-ID/IST-Universidade de Lisboa contributions towards improved conversational agents in European Portuguese
    A Abad, S Paulo, R Solera-Ureña, A Pompili
    Proc. IberSPEECH 2024, 293-296, 2024
  • Uma abordagem de aprendizagem semissupervisionada para a classificação automática de personalidade baseada em pistas acústico-prosódicas
    R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
    Revista da Associação Portuguesa de Linguística, 348-364, 2019
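
As a rough cross-check, the Scholar figures in the header (363 citations, h-index 8, i10-index 7) can be recomputed from the "Citations:" values in the most-cited list above. The sketch below assumes the usual definitions of the two indices and omits entries that show no citation count. The function names are illustrative, not part of any Scholar API, and Scholar's own computation may draw on records not shown on this page.

```python
# Citation counts copied from the "Citations:" lines of the most-cited list.
# Entries without a "Citations:" line are left out (assumed to have none).
citations = [68, 67, 65, 39, 36, 21, 20, 8, 8, 6, 5, 4, 4, 3, 3, 3, 2, 1]


def h_index(counts):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(counts):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in counts if count >= 10)


total = sum(citations)
print(total, h_index(citations), i10_index(citations))  # 363 8 7
```

The three printed values match the header, so the listed counts are at least consistent with the summary metrics.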