Rubén Solera-Ureña
@inesc-id.pt
INESC-ID Lisboa
Scopus Publications
- On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
David Gimeno-Gómez, Rubén Solera-Ureña, Anna Pompili, Carlos-D. Martínez-Hinarejos, Rita Cardoso, et al.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025
- CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
Carlos Carvalho, Francisco Teixeira, Catarina Botelho, Anna Pompili, Rubén Solera-Ureña, et al.
2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2025), 2025
- Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
Catarina Botelho, David Gimeno-Gómez, Francisco Teixeira, John Mendonça, Patrícia Pereira, et al.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025
- Transfer learning-based cough representations for automatic detection of COVID-19
Rubén Solera-Ureña, Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, et al.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2021
- Assessment of Parkinson's disease medication state through automatic speech analysis
Anna Pompili, Rubén Solera-Ureña, Alberto Abad, Rita Cardoso, Isabel Guimarães, et al.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2020
Parkinson's disease (PD) is a progressive degenerative disorder of the central nervous system characterized by motor and non-motor symptoms. As the disease progresses, patients alternate between periods in which motor symptoms are mitigated due to medication intake (ON state) and periods with motor complications (OFF state). The time that patients spend in the OFF condition is currently the main parameter employed to assess pharmacological interventions and to evaluate the efficacy of different active principles. In this work, we present a system that combines automatic speech processing and deep learning techniques to classify the medication state of PD patients by leveraging personal speech-based biomarkers. We devise a speaker-dependent approach and investigate the relevance of different acoustic-prosodic feature sets. Results show an accuracy of 90.54% in a test task with mixed speech and an accuracy of 95.27% in a semi-spontaneous speech task. Overall, the experimental assessment shows the potential of this approach for the development of reliable, remote daily monitoring and scheduling of medication intake of PD patients.
- Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
Francisco S. Melo, Alberto Sardinha, David Belo, Marta Couto, Miguel Faria, et al.
Artificial Intelligence in Medicine, 2019
This paper describes the INSIDE system, a networked robot system designed to allow the use of mobile robots as active players in the therapy of children with autism spectrum disorders (ASD). While a significant volume of work has explored the impact of robots in ASD therapy, most such work comprises remotely operated robots and/or well-structured interaction dynamics. In contrast, the INSIDE system allows for complex, semi-unstructured interaction in ASD therapy while featuring a fully autonomous robot. In this paper we describe the hardware and software infrastructure that supports such a rich form of interaction, as well as the design methodology that guided the development of the INSIDE system. We also present some results on the use of our system both in a pilot and in a long-term study comprising multiple therapy sessions with children at Hospital Garcia de Orta, in Portugal, highlighting the robustness and autonomy of the system as a whole.
- A semi-supervised learning approach for acoustic-prosodic personality perception in under-resourced domains
Rubén Solera-Ureña, Helena Moniz, Fernando Batista, Vera Cabarrão, Anna Pompili, et al.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2017
Automatic personality analysis has gained attention in recent years as a fundamental dimension in human-to-human and human-to-machine interaction. However, it still suffers from the limited number and size of speech corpora for specific domains, such as the assessment of children’s personality. This paper investigates a semi-supervised training approach to tackle this scenario. We devise an experimental setup with age and language mismatch and two training sets: a small labeled training set from the Interspeech 2012 Personality Sub-challenge, containing French adult speech labeled with personality OCEAN traits, and a large unlabeled training set of Portuguese children’s speech. As the test set, a corpus of Portuguese children’s speech labeled with OCEAN traits is used. Based on this setting, we investigate a weak supervision approach that iteratively refines an initial model trained with the labeled data-set using the unlabeled data-set. We also investigate knowledge-based features, which leverage expert knowledge in acoustic-prosodic cues and thus need no extra data. Results show that, despite the large mismatch imposed by language and age differences, it is possible to attain improvements with these techniques, pointing to the benefits of using both weak supervision and expert-based acoustic-prosodic features across age and language.
- Acoustic-prosodic automatic personality trait assessment for adults and children
Rubén Solera-Ureña, Helena Moniz, Fernando Batista, Ramón F. Astudillo, Joana Campos, et al.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016
This paper investigates the use of heterogeneous speech corpora for automatic assessment of personality traits in terms of the Big-Five OCEAN dimensions. The motivation for this work is twofold: the need to develop methods to overcome the lack of children’s speech corpora, particularly severe when targeting personality traits, and the interest in cross-age comparisons of acoustic-prosodic features to build robust paralinguistic detectors. For this purpose, we devise an experimental setup with age mismatch utilizing the Interspeech 2012 Personality Sub-challenge, containing adult speech, as training data. As test data, we use a corpus of children’s European Portuguese speech. We investigate various feature sets such as the Sub-challenge baseline features, the recently introduced eGeMAPS features and our own knowledge-based features. The preliminary results bring insights into cross-age and cross-language detection of personality traits in spontaneous speech, pointing to a stable set of acoustic-prosodic features for Extraversion and Agreeableness in both adult and child speech.
- Real-time robust automatic speech recognition using compact support vector machines
R. Solera-Ureña, A. I. García-Moral, C. Peláez-Moreno, Manel Martínez-Ramón, F. Díaz-de-María
IEEE Transactions on Audio Speech and Language Processing, 2012
In recent years, support vector machines (SVMs) have shown excellent performance in many applications, especially in the presence of noise. In particular, SVMs offer several advantages over artificial neural networks (ANNs) that have attracted the attention of the speech processing community. Nevertheless, their high computational requirements prevent them from being used in practice in automatic speech recognition (ASR), where ANNs have proven to be successful. The high complexity of SVMs in this context arises from the use of huge speech training databases with millions of samples and highly overlapped classes. This paper suggests the use of a weighted least squares (WLS) training procedure that makes it possible to impose a compact semiparametric model on the SVM, which results in a dramatic complexity reduction. Such a complexity reduction with respect to conventional SVMs, which is between two and three orders of magnitude, allows the proposed hybrid WLS-SVC/HMM system to perform real-time speech decoding on a connected-digit recognition task (SpeechDat Spanish database). The experimental evaluation of the proposed system shows encouraging performance levels in clean and noisy conditions, although further improvements are required to reach the maturity level of current context-dependent HMM-based recognizers.
- Data balancing for efficient training of hybrid ANN/HMM automatic speech recognition systems
Ana Isabel García-Moral, Rubén Solera-Ureña, Carmen Peláez-Moreno, Fernando Díaz-de-María
IEEE Transactions on Audio Speech and Language Processing, 2011
Hybrid speech recognizers, in which the estimation of the emission pdf of the states of hidden Markov models (HMMs), usually carried out using Gaussian mixture models (GMMs), is performed instead by artificial neural networks (ANNs), have several advantages over the classical systems. However, to obtain performance improvements, the computational requirements are heavily increased because of the need to train the ANN. Starting from the observation of the remarkable skewness of speech data, this paper proposes sifting the training set and balancing the number of samples per class. With this method, the training time has been reduced by a factor of 18 while obtaining performance similar to or even better than that obtained with the whole database, especially in noisy environments. However, the application of these reduced sets is not straightforward. To avoid the mismatch between training and testing conditions created by the modification of the distribution of the training data, a proper scaling of the a posteriori probabilities obtained and a resizing of the context window need to be performed, as demonstrated in this paper.
- UC3M high level feature extraction at TRECVID 2008
2008 TREC Video Retrieval Evaluation Notebook Papers, 2008
- Robust ASR using Support Vector Machines
R. Solera-Ureña, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, F. Díaz-de-María
Speech Communication, 2007
- Hybrid models for automatic speech recognition: A comparison of classical ANN and kernel based methods
Ana I. García-Moral, Rubén Solera-Ureña, Carmen Peláez-Moreno, Fernando Díaz-de-María
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2007
- SVMs for automatic speech recognition: A survey
R. Solera-Ureña, J. Padrell-Sendra, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, et al.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2007
RECENT SCHOLAR PUBLICATIONS
- CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
C Carvalho, F Teixeira, C Botelho, A Pompili, R Solera-Ureña, S Paulo, ...
arXiv preprint arXiv:2508.19721, 2025
Citations: 2
- Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
Proc. Interspeech 2025, 1418-1422, 2025
Citations: 4
- On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
D Gimeno-Gómez, R Solera-Ureña, A Pompili, CD Martínez-Hinarejos, ...
Proc. Interspeech 2025, 2025-793, 2025
Citations: 1
- Tackling cognitive impairment detection from speech: A submission to the process challenge
C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
arXiv preprint arXiv:2501.00145, 2024
Citations: 3
- Accelerat. AI: INESC-ID/IST-Universidade de Lisboa contributions towards improved conversational agents in European Portuguese
A Abad, S Paulo, R Solera-Ureña, A Pompili
Proc. IberSPEECH 2024, 293-296, 2024
- Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech
J Mendonça, R Solera-Ureña, A Abad, I Trancoso
arXiv preprint arXiv:2107.00112, 2021
Citations: 3
- Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19
R Solera-Ureña, C Botelho, F Teixeira, T Rolland, A Abad, I Trancoso
Proceedings of INTERSPEECH 2021, 436-440, 2021
Citations: 21
- Assessment of Parkinson's Disease Medication State through Automatic Speech Analysis
A Pompili, R Solera-Ureña, A Abad, R Cardoso, I Guimarães, M Fabbri, ...
Proceedings of INTERSPEECH 2020, 4591-4595, 2020
Citations: 20
- Uma abordagem de aprendizagem semissupervisionada para a classificação automática de personalidade baseada em pistas acústico-prosódicas
R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
Revista da Associação Portuguesa de Linguística, 348-364, 2019
- Affective analysis of customer service calls
V Cabarrão, M Julião, R Solera-Ureña, H Moniz, F Batista, I Trancoso, ...
10th International Conference of Experimental Linguistics (ExLing 2019), 37-40, 2019
Citations: 3
- Affective computing based on acoustic-prosodic cues
H Moniz, R Solera-Ureña, V Cabarrão, M Julião, F Batista, I Trancoso
14th Annual INGRoup Conference (Interdisciplinary Network for Group Research …, 2019
- Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gambôa, ...
Artificial Intelligence in Medicine 96, 198-216, 2019
Citations: 65
- Uma abordagem de aprendizagem semi-supervisionada para a percepção automática de personalidade, baseada em pistas acústico-prosódicas em domínios com poucos recursos
R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
XXXIV Encontro Nacional da Associação Portuguesa de Linguística, 2018
- A Semi-Supervised Learning Approach for Acoustic-Prosodic Personality Perception in Under-Resourced Domains
R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
Proceedings of INTERSPEECH 2017, 929-933, 2017
Citations: 8
- Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children
R Solera-Ureña, H Moniz, F Batista, R Fernández Astudillo, J Campos, ...
Advances in Speech and Language Technologies for Iberian Languages …, 2016
Citations: 8
- Human-Robotic Agent Speech Interaction
R Solera-Ureña, H Moniz
2016
- Real-time Robust Automatic Speech Recognition Using Compact Support Vector Machines
R Solera-Ureña, AI García-Moral, C Peláez-Moreno, M Martínez-Ramón, ...
IEEE Transactions on Audio, Speech, and Language Processing 20 (4), 1347-1361, 2012
Citations: 39
- Máquinas de vectores soporte para reconocimiento robusto de habla
R Solera-Ureña
Universidad Carlos III de Madrid, Spain, 2011
Citations: 4
- Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems
AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
IEEE Transactions on Audio, Speech, and Language Processing 19 (3), 468-481, 2011
Citations: 36
- UC3M high level feature extraction at TRECVID 2008
I González-Díaz, D García-García, R Solera-Ureña, J Madrid-Sánchez, ...
2008 TREC Video Retrieval Evaluation Workshop (TRECVID 2008), 2008
MOST CITED SCHOLAR PUBLICATIONS
- Robust ASR using Support Vector Machines
R Solera-Ureña, D Martín-Iglesias, A Gallardo-Antolín, C Peláez-Moreno, ...
Speech Communication 49 (4), 253-267, 2007
Citations: 68
- SVMs for Automatic Speech Recognition: A Survey
R Solera-Ureña, J Padrell-Sendra, D Martín-Iglesias, A Gallardo-Antolín, ...
Progress in Nonlinear Speech Processing 4391, 190-216, 2007
Citations: 67
- Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gambôa, ...
Artificial Intelligence in Medicine 96, 198-216, 2019
Citations: 65
- Real-time Robust Automatic Speech Recognition Using Compact Support Vector Machines
R Solera-Ureña, AI García-Moral, C Peláez-Moreno, M Martínez-Ramón, ...
IEEE Transactions on Audio, Speech, and Language Processing 20 (4), 1347-1361, 2012
Citations: 39
- Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems
AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
IEEE Transactions on Audio, Speech, and Language Processing 19 (3), 468-481, 2011
Citations: 36
- Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19
R Solera-Ureña, C Botelho, F Teixeira, T Rolland, A Abad, I Trancoso
Proceedings of INTERSPEECH 2021, 436-440, 2021
Citations: 21
- Assessment of Parkinson's Disease Medication State through Automatic Speech Analysis
A Pompili, R Solera-Ureña, A Abad, R Cardoso, I Guimarães, M Fabbri, ...
Proceedings of INTERSPEECH 2020, 4591-4595, 2020
Citations: 20
- A Semi-Supervised Learning Approach for Acoustic-Prosodic Personality Perception in Under-Resourced Domains
R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
Proceedings of INTERSPEECH 2017, 929-933, 2017
Citations: 8
- Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children
R Solera-Ureña, H Moniz, F Batista, R Fernández Astudillo, J Campos, ...
Advances in Speech and Language Technologies for Iberian Languages …, 2016
Citations: 8
- Hybrid models for automatic speech recognition: a comparison of classical ANN and kernel based methods
AI García-Moral, R Solera-Ureña, C Peláez-Moreno, F Díaz-de-María
Advances in Nonlinear Speech Processing (NOLISP 2007) 4885, 152-160, 2007
Citations: 6
- Estimación de probabilidades a posteriori en SVMs multiclase para reconocimiento de habla continua
R Solera-Ureña, F Pérez-Cruz, F Díaz-de-María
IV Jornadas en Tecnologías del Habla 2006 (JTH 2006), 2006
Citations: 5
- Acoustic and Linguistic Biomarkers for Cognitive Impairment Detection from Speech
C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
Proc. Interspeech 2025, 1418-1422, 2025
Citations: 4
- Máquinas de vectores soporte para reconocimiento robusto de habla
R Solera-Ureña
Universidad Carlos III de Madrid, Spain, 2011
Citations: 4
- Tackling cognitive impairment detection from speech: A submission to the process challenge
C Botelho, D Gimeno-Gómez, F Teixeira, J Mendonça, P Pereira, ...
arXiv preprint arXiv:2501.00145, 2024
Citations: 3
- Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech
J Mendonça, R Solera-Ureña, A Abad, I Trancoso
arXiv preprint arXiv:2107.00112, 2021
Citations: 3
- Affective analysis of customer service calls
V Cabarrão, M Julião, R Solera-Ureña, H Moniz, F Batista, I Trancoso, ...
10th International Conference of Experimental Linguistics (ExLing 2019), 37-40, 2019
Citations: 3
- CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
C Carvalho, F Teixeira, C Botelho, A Pompili, R Solera-Ureña, S Paulo, ...
arXiv preprint arXiv:2508.19721, 2025
Citations: 2
- On the Relevance of Clinical Assessment Tasks for the Automatic Detection of Parkinson's Disease Medication State from Speech
D Gimeno-Gómez, R Solera-Ureña, A Pompili, CD Martínez-Hinarejos, ...
Proc. Interspeech 2025, 2025-793, 2025
Citations: 1
- Accelerat. AI: INESC-ID/IST-Universidade de Lisboa contributions towards improved conversational agents in European Portuguese
A Abad, S Paulo, R Solera-Ureña, A Pompili
Proc. IberSPEECH 2024, 293-296, 2024
- Uma abordagem de aprendizagem semissupervisionada para a classificação automática de personalidade baseada em pistas acústico-prosódicas
R Solera-Ureña, H Moniz, F Batista, V Cabarrão, A Pompili, ...
Revista da Associação Portuguesa de Linguística, 348-364, 2019