Associate Professor, Department of Computer Science
INESC-ID and Instituto Superior Técnico, University of Lisbon
Artificial Intelligence; Machine Learning; Reinforcement Learning
Miguel Faria, Francisco S. Melo, and Ana Paiva
Elsevier BV
Rustam Zayanov, Francisco Melo, and Manuel Lopes
SCITEPRESS - Science and Technology Publications
João G. Ribeiro, Gonçalo Rodrigues, Alberto Sardinha, and Francisco S. Melo
Elsevier BV
Diogo S. Carvalho, Francisco S. Melo, and Pedro A. Santos
IOS Press
Hierarchical reinforcement learning is an increasingly popular approach for learning to make sequential decisions toward long-term goals, and feudal hierarchies are among the most widely deployed frameworks. However, few theoretical results exist for hierarchical structures. In this work, we formalize the common two-level feudal hierarchy as two Markov decision processes, where the high-level process depends on the policy executed at the low level. Despite the non-stationarity introduced by this dependency, we show that each process exhibits stable behavior. Building on this result, we show that, regardless of the convergent learning algorithm used at the low level, convergence of both prediction and control algorithms at the high level is guaranteed. Our results provide theoretical support for combining feudal hierarchies with standard reinforcement learning methods at each level.
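The two-level structure described in this abstract can be illustrated with a toy sketch. This is an assumed setup, not the paper's construction: on a short corridor, a high-level learner picks a subgoal cell, a low-level learner runs Q-learning to reach whatever subgoal it is handed (with an intrinsic reward on arrival), and the high level runs Q-learning over subgoal choices with external reward only at the far end. Duration discounting of the temporally extended high-level actions is omitted for simplicity.

```python
import random

random.seed(0)

N = 5                      # corridor cells 0..N-1; external reward at N-1
SUBGOALS = [2, N - 1]      # hypothetical subgoal set for the high level
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1
q_low, q_high = {}, {}     # (s, g, a) -> value and (s, g) -> value

def pick(values, eps=EPS):
    """Epsilon-greedy index selection with random tie-breaking."""
    if random.random() < eps:
        return random.randrange(len(values))
    m = max(values)
    return random.choice([i for i, v in enumerate(values) if v == m])

def low_step(s, g):
    """One low-level Q-learning step toward subgoal g (actions: 0=left, 1=right)."""
    qs = [q_low.get((s, g, a), 0.0) for a in (0, 1)]
    a = pick(qs)
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == g else 0.0                      # intrinsic reward on arrival
    target = r + GAMMA * max(q_low.get((s2, g, b), 0.0) for b in (0, 1))
    q_low[(s, g, a)] = qs[a] + ALPHA * (target - qs[a])
    return s2, s2 == g

for _ in range(3000):
    s = 0
    for _ in range(4):                               # high-level decisions per episode
        g = SUBGOALS[pick([q_high.get((s, h), 0.0) for h in SUBGOALS])]
        s0, steps, done = s, 0, False
        while not done and steps < 15:               # low level runs to subgoal or timeout
            s, done = low_step(s, g)
            steps += 1
        r = 1.0 if s == N - 1 else 0.0               # external reward at the far end
        target = r + GAMMA * max(q_high.get((s, h), 0.0) for h in SUBGOALS)
        q_high[(s0, g)] = q_high.get((s0, g), 0.0) + ALPHA * (target - q_high.get((s0, g), 0.0))

# The high level should come to prefer the rewarding subgoal from the start cell.
best = max(SUBGOALS, key=lambda g: q_high.get((0, g), 0.0))
print(best)
```

Note how the low level is non-stationary from the high level's perspective while it is still learning, yet both levels settle, matching the stability result the abstract describes.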
João G. Ribeiro, Cassandro Martinho, Alberto Sardinha, and Francisco S. Melo
IOS Press
This paper introduces a formal definition of the setting of ad hoc teamwork under partial observability and proposes a first principled model-based approach that relies only on prior knowledge and partial observations of the environment to perform ad hoc teamwork. We make three distinct assumptions that set our approach apart from previous work: i) the state of the environment is always partially observable, ii) the actions of the teammates are always unavailable to the ad hoc agent, and iii) the ad hoc agent has no access to a reward signal that could be used to learn the task from scratch. Our results in 70 POMDPs from 11 domains show that our approach is not only effective in assisting unknown teammates in solving unknown tasks but also robust in scaling to more challenging problems. Supplementary material is available at https://github.com/jmribeiro/adhoc-teamwork-under-partial-observability.
Bernardo Esteves, Miguel Vasco, and Francisco S. Melo
Springer Nature Switzerland
Filipa Correia, Joana Campos, Francisco S. Melo, and Ana Paiva
Springer Science and Business Media LLC
Kim Baraka, Marta Couto, Francisco S. Melo, Ana Paiva, and Manuela Veloso
Frontiers Media SA
Social robots have been shown to be promising tools for delivering therapeutic tasks for children with Autism Spectrum Disorder (ASD). However, their efficacy is currently limited by a lack of flexibility of the robot’s social behavior to successfully meet therapeutic and interaction goals. Robot-assisted interventions are often based on structured tasks where the robot sequentially guides the child towards the task goal. Motivated by a need for personalization to accommodate a diverse set of child profiles, this paper investigates the effect of different robot action sequences in structured socially interactive tasks targeting attention skills in children with different ASD profiles. Based on an autism diagnostic tool, we devised a robotic prompting scheme on a NAO humanoid robot, aimed at eliciting goal behaviors from the child, and integrated it in a novel interactive storytelling scenario involving screens. We programmed the robot to operate in three different modes: diagnostic-inspired (Assess), personalized therapy-inspired (Therapy), and random (Explore). Our exploratory study with 11 young children with ASD highlights the usefulness and limitations of each mode according to different possible interaction goals, and paves the way towards more complex methods for balancing short-term and long-term goals in personalized robot-assisted therapy.
Miguel Vasco, Hang Yin, Francisco S. Melo, and Ana Paiva
Elsevier BV
This work addresses the problem of cross-modality inference (CMI), i.e., inferring missing data of unavailable perceptual modalities (e.g., sound) using data from available perceptual modalities (e.g., image). We overview single-modality variational autoencoder methods and discuss three problems of computational cross-modality inference, arising from recent developments in multimodal generative models. Inspired by neural mechanisms of human recognition, we contribute the Nexus model, a novel hierarchical generative model that can learn a multimodal representation of an arbitrary number of modalities in an unsupervised way. By exploiting hierarchical representation levels, Nexus is able to generate high-quality, coherent data of missing modalities given any subset of available modalities. To evaluate CMI in a natural scenario with a high number of modalities, we contribute the "Multimodal Handwritten Digit" (MHD) dataset, a novel benchmark dataset that combines image, motion, sound and label information from digit handwriting. We assess the key role of hierarchy in enabling high-quality samples during cross-modality inference and discuss how a novel training scheme enables Nexus to learn a multimodal representation robust to missing modalities at test time. Our results show that Nexus outperforms current state-of-the-art multimodal generative models in their cross-modality inference capabilities.
Fabio Vital, Miguel Vasco, Alberto Sardinha, and Francisco Melo
IEEE
We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a series of instructions, to a sequence of movements to be executed by a robot. In the first stage, we perceive and preprocess the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage, we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage, we convert the latent samples into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution by a robotic manipulator. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives a word as input through different perceptual modalities (e.g., image, sound) and generates the corresponding motion trajectory to write it, creating coherent and high-quality handwritten words.
Filipa Correia, Francisco S. Melo, and Ana Paiva
Wiley
Creating effective teamwork between humans and robots involves not only addressing their performance as a team but also sustaining the quality and sense of unity among teammates, also known as cohesion. This paper explores the following research problem: how can we endow robotic teammates with social capabilities to improve the cohesive alliance with humans? Defining the concept of a human-robot cohesive alliance in light of the multidimensional construct of cohesion from the social sciences, we propose to address this problem through the idea of multifaceted human-robot cohesion. We present our preliminary efforts from previous work to examine each of the five dimensions of cohesion: social, collective, emotional, structural, and task. We finish the paper with a discussion of how human-robot cohesion contributes to the key questions and ongoing challenges of creating robotic teammates. Overall, cohesion in human-robot teams may be a key factor in propelling team performance, and it should be considered in the design, development, and evaluation of robotic teammates.
Carla Guerra, Francisco S. Melo, and Manuel Lopes
Springer International Publishing
Ramona Merhej, Fernando P. Santos, Francisco S. Melo, and Francisco C. Santos
AI Access Foundation
We examine how wealth inequality and diversity in the perception of risk of a collective disaster impact cooperation levels in the context of a public goods game with uncertain and non-linear returns. In this game, individuals face a collective-risk dilemma where they may contribute or not to a common pool to reduce their chances of future losses. We draw our conclusions from social simulations with populations of independent reinforcement learners with diverse levels of risk and wealth. We find that both wealth inequality and diversity in risk assessment can hinder cooperation and augment collective losses. Additionally, wealth inequality further exacerbates long-term inequality, causing rich agents to become richer and poor agents to become poorer. On the other hand, diversity in risk only amplifies inequality when combined with bias in group assortment, i.e., a high probability that agents from the same risk class play together. Our results also suggest that taking wealth inequality into account can help in designing effective policies aimed at fostering cooperation in large groups, a configuration where collective action is harder to achieve. Finally, we characterize the circumstances under which risk perception alignment is crucial and those under which reducing wealth inequality constitutes the deciding factor for collective welfare.
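The collective-risk dilemma described above can be made concrete with a minimal sketch of one round. This is illustrative only; the function name, parameters, and values are assumptions, not the paper's exact setup. Contributors pay a fraction of their wealth into a common pool; if the pool misses the target, each agent loses its remaining wealth with its own perceived probability of disaster.

```python
import random

def play_round(wealths, contributes, threshold, risks, cost=0.1, rng=random):
    """One round: contributors pay cost * wealth; if the pool falls short of
    the threshold, agent i loses everything with probability risks[i]."""
    pool = sum(w * cost for w, c in zip(wealths, contributes) if c)
    new = [w * (1 - cost) if c else w for w, c in zip(wealths, contributes)]
    if pool < threshold:                     # collective target missed
        new = [0.0 if rng.random() < r else w for w, r in zip(new, risks)]
    return new

random.seed(1)
wealths = [1.0, 1.0, 4.0, 4.0]               # unequal endowments
risks = [0.2, 0.9, 0.2, 0.9]                 # heterogeneous risk perception

# If everyone contributes, the target is met and losses are avoided for sure.
safe = play_round(wealths, [True] * 4, threshold=0.5, risks=risks)
print(safe)

# If only the poor contribute, the pool falls short and disaster may strike.
risky = play_round(wealths, [True, True, False, False], threshold=0.5, risks=risks)
print(risky)
```

Pairing a payoff function like this with independent reinforcement learners choosing whether to contribute is the kind of social simulation the abstract refers to.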
Francisco Melo and Plinio Moreno
IEEE
This work incorporates socially acceptable behaviors into traditional reactive navigation systems, allowing a robot to approach a group of humans in a socially acceptable manner by considering both the personal space and the group space. In contrast to fixed social-distancing parameters, this work presents an adaptive model; that is, the parameters of the personal and group space cost functions adapt to the arrangement of the group and to space constraints, avoiding the need to choose initial parameters. A socially aware navigation system capable of approaching groups is implemented for a general-purpose mobile robot. The adaptive personal and group space algorithm is integrated with the standard ROS navigation system, representing its information in a costmap layer. The adaptation of spaces is tested using fixed and adaptive parameters for different groups drawn from three datasets. The navigation system is evaluated through simulation experiments, demonstrating that the robot is capable of approaching groups while providing a more realistic space model adapted to the context.
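A personal-space cost of the kind that feeds such a costmap layer can be sketched as an asymmetric 2-D Gaussian around each person, larger in the facing direction. The adaptation rule below (capping the spreads by the available free space) is a hypothetical stand-in for the paper's adaptive model, and all parameter values are assumptions.

```python
import math

def personal_space_cost(px, py, person, free_radius):
    """Cost in [0, 1] at query point (px, py) around a person pose
    (x, y, theta); spread is larger in front and shrinks when the
    free space around the group (free_radius) is tight."""
    x, y, theta = person
    s_front = min(1.2, free_radius)           # spread ahead of the person (m)
    s_side = min(0.8, free_radius)            # lateral / rear spread (m)
    dx, dy = px - x, py - y
    # Rotate the query point into the person's frame.
    fx = math.cos(theta) * dx + math.sin(theta) * dy
    fy = -math.sin(theta) * dx + math.cos(theta) * dy
    s = s_front if fx > 0 else s_side         # asymmetric: front vs. back
    return math.exp(-0.5 * ((fx / s) ** 2 + (fy / s_side) ** 2))

# Cost peaks at the person and decays with distance; the front lobe is wider,
# so the cost extends further ahead than behind.
at_person = personal_space_cost(0.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
in_front = personal_space_cost(1.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
behind = personal_space_cost(-1.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
print(at_person, in_front, behind)
```

In a ROS deployment, values like these would be rasterized into a costmap layer so the planner keeps the robot out of high-cost regions while still allowing an approach.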
Ana Salta, Rui Prada, and Francisco S. Melo
IEEE
Game artificial intelligence (AI) competitions are important to foster research and development on game AI and AI in general. These competitions supply different challenging problems that can be translated into other contexts, virtual or real. They provide frameworks and tools to facilitate research on their core topics and provide means for comparing and sharing results. A competition is also a way to motivate new researchers to study these challenges. In this article, we present the Geometry Friends game AI competition. Geometry Friends is a two-player cooperative physics-based puzzle platformer computer game. The concept of the game is simple, though solving it has proven difficult. While the main and apparent focus of the game is cooperation, it also relies on other AI-related problems such as planning, plan execution, and motion control, all connected to situational awareness, and all of which must be solved in real time. We discuss the competition and the challenges it brings, and present an overview of the current solutions.
Francisco S. Melo and Manuel Lopes
Frontiers Media SA
In this paper, we propose the first machine teaching algorithm for multiple inverse reinforcement learners. As our initial contribution, we formalize the problem of optimally teaching a sequential task to a heterogeneous class of learners. We then contribute a theoretical analysis of this problem, identifying conditions under which it is possible to conduct such teaching using the same demonstration for all learners. Our analysis shows that, contrary to other teaching problems, teaching a sequential task to a heterogeneous class of learners with a single demonstration may become impossible as the differences between individual agents increase. We then contribute two algorithms that address the main difficulties identified by our theoretical analysis. The first algorithm, which we dub SplitTeach, starts by teaching the class as a whole until all students have learned all that they can learn as a group; it then teaches each student individually, ensuring that all students perfectly acquire the target task. The second, which we dub JointTeach, selects a single demonstration to be provided to the whole class so that all students learn the target task as well as a single demonstration allows. While SplitTeach ensures optimal teaching at the cost of a larger teaching effort, JointTeach ensures minimal effort, although the learners are not guaranteed to perfectly recover the target task. We conclude by illustrating our methods in several simulation domains. The simulation results agree with our theoretical findings, showing that teaching a heterogeneous class with a single demonstration is indeed not always possible. At the same time, they illustrate the main properties of our proposed algorithms: in all domains, SplitTeach guarantees perfect teaching and, in terms of teaching effort, is always at least as good as individualized teaching (often better); JointTeach, on the other hand, attains minimal teaching effort in all domains, even if it sometimes compromises teaching performance.
Miguel Faria, Francisco S. Melo, and Ana Paiva
IEEE
In this work we explore implicit communication between humans and robots, through movement, in multi-party (or multi-user) interactions. In particular, we investigate how a robot can move to better convey its intentions using legible movements in multi-party interactions. Current research on legible movement has focused on single-user interactions, leaving a gap in knowledge regarding the impact of such movements in multi-party interactions. We propose a novel approach that extends the notion of legible motion to multi-party settings by considering that legibility depends on all human users involved in the interaction and should take into account how each of them perceives the robot's movements from their respective points of view. We show, through simulation and a user study, that our proposed model of multi-user legibility leads to movements that, on average, optimize the legibility of the motion as perceived by the group of users. Our model creates movements that allow each human to more quickly and confidently understand the robot's intentions, thus creating safer, clearer, and more efficient interactions and collaborations.
Carla Guerra, Francisco S. Melo, and Manuel Lopes
Springer International Publishing
Pedro Ildefonso, Pedro Remédios, Rui Silva, Miguel Vasco, Francisco S. Melo, Ana Paiva, and Manuela Veloso
Springer International Publishing