Psychological, economic, and ethical factors in human feedback for a chatbot-based smoking cessation intervention Nele Albers, Francisco S. Melo, Mark A. Neerincx, Olya Kudina, Willem-Paul Brinkman npj Digital Medicine, 2025 Integrating human support with chatbot-based behavior change interventions raises three challenges: (1) attuning the support to an individual’s state (e.g., motivation) for enhanced engagement, (2) limiting the use of the human resources involved for enhanced efficiency, and (3) optimizing outcomes on ethical aspects (e.g., fairness). Therefore, we conducted a study in which 679 smokers and vapers had a 20% chance of receiving human feedback between five chatbot sessions. We find that having received feedback increases retention and effort spent on preparatory activities. However, analyzing a reinforcement learning (RL) model fit on the data shows there are also states where not providing feedback is better. Even this “standard” benefit-maximizing RL model is value-laden. It not only prioritizes people who would benefit most, but also those who are already doing well and want feedback. We show how four other ethical principles can be incorporated to favor other smoker subgroups, yet interdependencies exist.
Regularization and Two Time Scales for Convergence of Reinforcement Learning Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo Applied Mathematics and Optimization, 2025 Reinforcement learning algorithms aim at solving discrete-time stochastic control problems with unknown underlying dynamical systems by an iterative process of interaction. The process is formalized as a Markov decision process, where at each time step, a control action is given, the system provides a reward, and the state changes stochastically. The objective of the controller is to maximize the expected sum of rewards obtained throughout the interaction. When the set of states and/or actions is large, it is necessary to use some form of function approximation. But even if the function approximation set is simply a linear span of fixed features, the reinforcement learning algorithms may diverge. In this work, we propose and analyze regularized two-time-scale variations of the algorithms, and prove that they are guaranteed to converge almost surely to a unique solution to the reinforcement learning problem.
Networked Agents in the Dark: Team Value Learning under Partial Observability Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025
A Comparative Study of Continual Backpropagation Jacopo Silvestrin, Francisco S. Melo, Manuel Lopes Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2025
The Number of Trials Matters in Infinite-Horizon General-Utility Markov Decision Processes Proceedings of Machine Learning Research, 2025
Distributed Value Decomposition Networks with Networked Agents: Extended Abstract Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025
Implicit Repair with Reinforcement Learning in Emergent Communication Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025
Preface Steven Davy, Danyal Aftab Frontiers in Artificial Intelligence and Applications, 2024
NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks Advances in Neural Information Processing Systems, 2024
Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024
Learning to Perceive in Deep Model-Free Reinforcement Learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023
Entropic Risk-Aware Monte Carlo Tree Search PP Santos, J Silvestrin, A Sardinha, FS Melo arXiv preprint arXiv:2601.17667, 2026
Regularization and Two Time Scales for Convergence of Reinforcement Learning DS Carvalho, PA Santos, FS Melo Applied Mathematics & Optimization 92 (2), 30, 2025
Reinforcement learning in convergently non-stationary environments: Feudal hierarchies and learned representations DS Carvalho, PA Santos, FS Melo Artificial Intelligence 347, 104382, 2025 Citations: 7
Optimizing 2D Packing Strategies for Autoclave Loading Using Deep Reinforcement Learning VU Pugliese, DS Carvalho, OF Ferreira, FA Faria, FS Melo EPIA Conference on Artificial Intelligence, 41-53, 2025
"Teammates, Am I Clear?": Analysing Legible Behaviours in Teams M Faria, FS Melo, A Paiva arXiv preprint arXiv:2507.21631, 2025
RecBayes: Recurrent Bayesian Ad Hoc Teamwork in Large Partially Observable Domains JG Ribeiro, Y Oren, A Sardinha, M Spaan, FS Melo arXiv preprint arXiv:2506.15756, 2025
Psychological, economic, and ethical factors in human feedback for a chatbot-based smoking cessation intervention N Albers, FS Melo, MA Neerincx, O Kudina, WP Brinkman npj Digital Medicine 8 (1), 326, 2025 Citations: 1
Solving General-Utility Markov Decision Processes in the Single-Trial Regime with Online Planning PP Santos, A Sardinha, FS Melo arXiv preprint arXiv:2505.15782, 2025
Optimize and coordinate multiple DMPs under constraints to achieve a collaborative manipulation task AH Kordia, FS Melo 2025 IEEE International Conference on Robotics and Automation (ICRA), 1-7, 2025
Implicit repair with reinforcement learning in emergent communication F Vital, A Sardinha, FS Melo arXiv preprint arXiv:2502.12624, 2025 Citations: 1
Distributed Value Decomposition Networks with Networked Agents GS Varela, A Sardinha, FS Melo arXiv preprint arXiv:2502.07635, 2025
Networked agents in the dark: Team value learning under partial observability GS Varela, A Sardinha, FS Melo arXiv preprint arXiv:2501.08778, 2025 Citations: 4
NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks B Esteves, M Vasco, FS Melo Advances in Neural Information Processing Systems 37, 3458-3498, 2024
Combining active learning and learning to reject for anomaly detection L Stradiotti, L Perini, J Davis 27th European Conference on Artificial Intelligence, 19–24 October 2024 …, 2024 Citations: 5
The number of trials matters in infinite-horizon general-utility Markov decision processes PP Santos, A Sardinha, FS Melo arXiv preprint arXiv:2409.15128, 2024 Citations: 2
A comparative study of continual backpropagation J Silvestrin, FS Melo, M Lopes EPIA Conference on Artificial Intelligence, 324-334, 2024 Citations: 1
The impact of data distribution on Q-learning with function approximation PP Santos, DS Carvalho, A Sardinha, FS Melo Machine Learning 113 (9), 6141-6163, 2024 Citations: 7
When a robot is your teammate F Correia, FS Melo, A Paiva Topics in Cognitive Science 16 (3), 527-553, 2024 Citations: 18
HOTSPOT: An ad hoc teamwork platform for mixed human-robot teams JG Ribeiro, LM Henriques, S Colcher, JC Duarte, FS Melo, RL Milidiú, ... PLOS ONE 19 (6), e0305705, 2024 Citations: 2
“Guess what I'm doing”: Extending legibility to sequential decision tasks M Faria, FS Melo, A Paiva Artificial Intelligence 330, 104107, 2024 Citations: 7
MOST CITED SCHOLAR PUBLICATIONS
An analysis of reinforcement learning with function approximation FS Melo, SP Meyn, MI Ribeiro Proceedings of the 25th international conference on Machine learning, 664-671, 2008 Citations: 369
Active learning for reward estimation in inverse reinforcement learning M Lopes, F Melo, L Montesano Joint European conference on machine learning and knowledge discovery in …, 2009 Citations: 268
Affordance-based imitation learning in robots M Lopes, FS Melo, L Montesano 2007 IEEE/RSJ international conference on intelligent robots and systems …, 2007 Citations: 177
Q-Learning with Linear Function Approximation FS Melo, MI Ribeiro International Conference on Computational Learning Theory, 308-322, 2007 Citations: 160
Exploring the impact of fault justification in human-robot trust F Correia, C Guerra, S Mascarenhas, FS Melo, A Paiva Proceedings of the 17th international conference on autonomous agents and …, 2018 Citations: 140
Decentralized MDPs with sparse interactions FS Melo, M Veloso Artificial Intelligence 175 (11), 1757-1789, 2011 Citations: 139
Empathic robot for group learning: A field study P Alves-Oliveira, P Sequeira, FS Melo, G Castellano, A Paiva ACM Transactions on Human-Robot Interaction (THRI) 8 (1), 1-34, 2019 Citations: 129
Geometric multimodal contrastive representation learning P Poklukar, M Vasco, H Yin, FS Melo, A Paiva, D Kragic International Conference on Machine Learning, 17782-17800, 2022 Citations: 117
Learning of coordination: Exploiting sparse interactions in multiagent systems FS Melo, M Veloso Proceedings of The 8th International Conference on Autonomous Agents and …, 2009 Citations: 117
Interaction-driven Markov games for decentralized multiagent planning under uncertainty MTJ Spaan, FS Melo Proceedings of the 7th international joint conference on Autonomous agents …, 2008 Citations: 112
Group-based emotions in teams of humans and robots F Correia, S Mascarenhas, R Prada, FS Melo, A Paiva Proceedings of the 2018 ACM/IEEE international conference on human-robot …, 2018 Citations: 109
Just follow the suit! Trust in human-robot interactions during card game playing F Correia, P Alves-Oliveira, N Maia, T Ribeiro, S Petisca, FS Melo, ... 2016 25th IEEE international symposium on robot and human interactive …, 2016 Citations: 76
Personalized assistance for dressing users SD Klee, BQ Ferreira, R Silva, JP Costeira, FS Melo, M Veloso International Conference on Social Robotics, 359-369, 2015 Citations: 71
An empathic robotic tutor for school classrooms: Considering expectation and satisfaction of children as end-users P Alves-Oliveira, T Ribeiro, S Petisca, E Di Tullio, FS Melo, A Paiva International Conference on Social Robotics, 21-30, 2015 Citations: 66
Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gamboa, ... Artificial Intelligence in Medicine 96, 198-216, 2019 Citations: 65
Monte Carlo tree search experiments in Hearthstone A Santos, PA Santos, FS Melo 2017 IEEE conference on computational intelligence and games (CIG), 272-279, 2017 Citations: 65
Discovering social interaction strategies for robots from restricted-perception Wizard-of-Oz studies P Sequeira, P Alves-Oliveira, T Ribeiro, E Di Tullio, S Petisca, FS Melo, ... 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI …, 2016 Citations: 63
Emotion-based intrinsic motivation for reinforcement learning agents P Sequeira, FS Melo, A Paiva International conference on affective computing and intelligent interaction …, 2011 Citations: 63
Exploring prosociality in human-robot teams F Correia, SF Mascarenhas, S Gomes, P Arriaga, I Leite, R Prada, ... 2019 14th ACM/IEEE international conference on human-robot interaction (HRI …, 2019 Citations: 62
Abstraction levels for robotic imitation: Overview and computational approaches M Lopes, F Melo, L Montesano, J Santos-Victor From Motor Learning to Interaction Learning in Robots, 313-355, 2010 Citations: 61