Francisco S Melo

@inesc-id.pt

Associate Professor, Department of Computer Science
INESC-ID and Instituto Superior Técnico, University of Lisbon



                    

https://researchid.co/fmelo

RESEARCH INTERESTS

Artificial Intelligence; Machine Learning; Reinforcement Learning

135

Scopus Publications

3223

Scholar Citations

29

Scholar h-index

62

Scholar i10-index

Scopus Publications

  • “Guess what I'm doing”: Extending legibility to sequential decision tasks
    Miguel Faria, Francisco S. Melo, and Ana Paiva

    Elsevier BV

  • Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback
    Rustam Zayanov, Francisco Melo, and Manuel Lopes

    SCITEPRESS - Science and Technology Publications

  • TEAMSTER: Model-based reinforcement learning for ad hoc teamwork
    João G. Ribeiro, Gonçalo Rodrigues, Alberto Sardinha, and Francisco S. Melo

    Elsevier BV

  • Theoretical Remarks on Feudal Hierarchies and Reinforcement Learning
    Diogo S. Carvalho, Francisco S. Melo, and Pedro A. Santos

    IOS Press
    Hierarchical reinforcement learning is an increasingly demanded resource for learning to make sequential decisions towards long term goals. Feudal hierarchies are among the most deployed frameworks. However, there are few theoretical results for hierarchical structures. In this work, we formalize the common two-level feudal hierarchy as two Markov decision processes, with the one on the high level being dependent on the policy executed at the low level. Despite the non-stationarity raised by the dependency, we show that each of the processes presents stable behavior. We then build on the first result to show that, regardless of the convergent learning algorithm used for the low level, convergence of both prediction and control algorithms at the high level is guaranteed. Our results provide theoretical support for the use of feudal hierarchies in combination with standard reinforcement learning methods at each level.

  • Making Friends in the Dark: Ad Hoc Teamwork Under Partial Observability
    João G. Ribeiro, Cassandro Martinho, Alberto Sardinha, and Francisco S. Melo

    IOS Press
    This paper introduces a formal definition of the setting of ad hoc teamwork under partial observability and proposes a first-principled model-based approach which relies only on prior knowledge and partial observations of the environment in order to perform ad hoc teamwork. We make three distinct assumptions that set it apart from previous works, namely: i) the state of the environment is always partially observable, ii) the actions of the teammates are always unavailable to the ad hoc agent and iii) the ad hoc agent has no access to a reward signal which could be used to learn the task from scratch. Our results in 70 POMDPs from 11 domains show that our approach is not only effective in assisting unknown teammates in solving unknown tasks but is also robust in scaling to more challenging problems. Supplementary material is available at https://github.com/jmribeiro/adhoc-teamwork-under-partial-observability.

  • Pre-training with Augmentations for Efficient Transfer in Model-Based Reinforcement Learning
    Bernardo Esteves, Miguel Vasco, and Francisco S. Melo

    Springer Nature Switzerland

  • Learning to Perceive in Deep Model-Free Reinforcement Learning


  • Robotic Gaze Responsiveness in Multiparty Teamwork
    Filipa Correia, Joana Campos, Francisco S. Melo, and Ana Paiva

    Springer Science and Business Media LLC

  • “Sequencing Matters”: Investigating Suitable Action Sequences in Robot-Assisted Autism Therapy
    Kim Baraka, Marta Couto, Francisco S. Melo, Ana Paiva, and Manuela Veloso

    Frontiers Media SA
    Social robots have been shown to be promising tools for delivering therapeutic tasks for children with Autism Spectrum Disorder (ASD). However, their efficacy is currently limited by a lack of flexibility of the robot’s social behavior to successfully meet therapeutic and interaction goals. Robot-assisted interventions are often based on structured tasks where the robot sequentially guides the child towards the task goal. Motivated by a need for personalization to accommodate a diverse set of children profiles, this paper investigates the effect of different robot action sequences in structured socially interactive tasks targeting attention skills in children with different ASD profiles. Based on an autism diagnostic tool, we devised a robotic prompting scheme on a NAO humanoid robot, aimed at eliciting goal behaviors from the child, and integrated it in a novel interactive storytelling scenario involving screens. We programmed the robot to operate in three different modes: diagnostic-inspired (Assess), personalized therapy-inspired (Therapy), and random (Explore). Our exploratory study with 11 young children with ASD highlights the usefulness and limitations of each mode according to different possible interaction goals, and paves the way towards more complex methods for balancing short-term and long-term goals in personalized robot-assisted therapy.

  • Leveraging hierarchy in multimodal generative models for effective cross-modality inference
    Miguel Vasco, Hang Yin, Francisco S. Melo, and Ana Paiva

    Elsevier BV
    This work addresses the problem of cross-modality inference (CMI), i.e., inferring missing data of unavailable perceptual modalities (e.g., sound) using data from available perceptual modalities (e.g., image). We overview single-modality variational autoencoder methods and discuss three problems of computational cross-modality inference, arising from recent developments in multimodal generative models. Inspired by neural mechanisms of human recognition, we contribute the Nexus model, a novel hierarchical generative model that can learn a multimodal representation of an arbitrary number of modalities in an unsupervised way. By exploiting hierarchical representation levels, Nexus is able to generate high-quality, coherent data of missing modalities given any subset of available modalities. To evaluate CMI in a natural scenario with a high number of modalities, we contribute the "Multimodal Handwritten Digit" (MHD) dataset, a novel benchmark dataset that combines image, motion, sound and label information from digit handwriting. We assess the key role of hierarchy in enabling high-quality samples during cross-modality inference and discuss how a novel training scheme enables Nexus to learn a multimodal representation robust to missing modalities at test time. Our results show that Nexus outperforms current state-of-the-art multimodal generative models with regard to their cross-modality inference capabilities.

  • Geometric Multimodal Contrastive Representation Learning


  • Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories
    Fabio Vital, Miguel Vasco, Alberto Sardinha, and Francisco Melo

    IEEE
    We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a series of instructions, to a sequence of movements to be executed by a robot. In the first stage, we perceive and preprocess the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage we convert the latent samples into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution by a robotic manipulator. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives as input a word through different perceptual modalities (e.g., image, sound), and generates the corresponding motion trajectory to write it, creating coherent and high-quality handwritten words.

  • When a Robot Is Your Teammate
    Filipa Correia, Francisco S. Melo, and Ana Paiva

    Wiley
    Creating effective teamwork between humans and robots involves not only addressing their performance as a team but also sustaining the quality and sense of unity among teammates, also known as cohesion. This paper explores the research problem of: how can we endow robotic teammates with social capabilities to improve the cohesive alliance with humans? By defining the concept of a human-robot cohesive alliance in the light of the multidimensional construct of cohesion from the social sciences, we propose to address this problem through the idea of multifaceted human-robot cohesion. We present our preliminary effort from previous works to examine each of the five dimensions of cohesion: social, collective, emotional, structural, and task. We finish the paper with a discussion on how human-robot cohesion contributes to the key questions and ongoing challenges of creating robotic teammates. Overall, cohesion in human-robot teams might be a key factor to propel team performance and it should be considered in the design, development, and evaluation of robotic teammates.

  • Preface


  • FIT: Using Feature Importance to Teach Classification Tasks to Unknown Learners
    Carla Guerra, Francisco S. Melo, and Manuel Lopes

    Springer International Publishing

  • Cooperation and Learning Dynamics under Wealth Inequality and Diversity in Individual Risk Perception
    Ramona Merhej, Fernando P. Santos, Francisco S. Melo, and Francisco C. Santos

    AI Access Foundation
    We examine how wealth inequality and diversity in the perception of risk of a collective disaster impact cooperation levels in the context of a public goods game with uncertain and non-linear returns. In this game, individuals face a collective-risk dilemma where they may contribute or not to a common pool to reduce their chances of future losses. We draw our conclusions based on social simulations with populations of independent reinforcement learners with diverse levels of risk and wealth. We find that both wealth inequality and diversity in risk assessment can hinder cooperation and augment collective losses. Additionally, wealth inequality further exacerbates long term inequality, causing rich agents to become richer and poor agents to become poorer. On the other hand, diversity in risk only amplifies inequality when combined with bias in group assortment—i.e., high probability that agents from the same risk class play together. Our results also suggest that taking wealth inequality into account can help to design effective policies aiming at leveraging cooperation in large group sizes, a configuration where collective action is harder to achieve. Finally, we characterize the circumstances under which risk perception alignment is crucial and those under which reducing wealth inequality constitutes a deciding factor for collective welfare.

  • How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents


  • Cooperation and Learning Dynamics under Risk Diversity and Financial Incentives


  • Socially Reactive Navigation Models for Mobile Robots
    Francisco Melo and Plinio Moreno

    IEEE
    This work considers socially acceptable behaviors in traditional reactive navigation systems, allowing a robot to approach a group of humans in a socially acceptable manner by considering the personal space and the group space. In contrast to the fixed parameters of social distancing, this work presents an adaptive model; that is, the parameters of the personal and group space’s cost functions adapt according to the arrangement of the group and space constraints, avoiding the choice of initial parameters. A socially aware navigation system capable of approaching groups is implemented for a general-purpose mobile robot. The adaptive personal and group space algorithm is integrated with the standard navigation system of ROS, representing their information in a costmap layer. The adaptation of spaces is tested using fixed and adaptive parameters for different groups provided by three datasets. The navigation system is evaluated through simulation experiments, demonstrating that the robot is capable of approaching groups and, at the same time, provides a more realistic space modeling adapted to the context.

  • A Game AI Competition to Foster Collaborative AI Research and Development
    Ana Salta, Rui Prada, and Francisco S. Melo

    Institute of Electrical and Electronics Engineers (IEEE)
    Game artificial intelligence (AI) competitions are important to foster research and development on Game AI and AI in general. These competitions supply different challenging problems that can be translated into other contexts, virtual or real. They provide frameworks and tools to facilitate the research on their core topics and provide means for comparing and sharing results. A competition is also a way to motivate new researchers to study these challenges. In this article, we present the Geometry Friends game AI competition. Geometry Friends is a two-player cooperative physics-based puzzle platformer computer game. The concept of the game is simple, though its solving has proven to be difficult. While the main and apparent focus of the game is cooperation, it also relies on other AI-related problems such as planning, plan execution, and motion control, all connected to situational awareness. All of these must be solved in real-time. In this article, we discuss the competition and the challenges it brings, and present an overview of the current solutions.

  • Teaching Multiple Inverse Reinforcement Learners
    Francisco S. Melo and Manuel Lopes

    Frontiers Media SA
    In this paper, we propose the first machine teaching algorithm for multiple inverse reinforcement learners. As our initial contribution, we formalize the problem of optimally teaching a sequential task to a heterogeneous class of learners. We then contribute a theoretical analysis of such problem, identifying conditions under which it is possible to conduct such teaching using the same demonstration for all learners. Our analysis shows that, contrary to other teaching problems, teaching a sequential task to a heterogeneous class of learners with a single demonstration may not be possible, as the differences between individual agents increase. We then contribute two algorithms that address the main difficulties identified by our theoretical analysis. The first algorithm, which we dub SplitTeach, starts by teaching the class as a whole until all students have learned all that they can learn as a group; it then teaches each student individually, ensuring that all students are able to perfectly acquire the target task. The second approach, which we dub JointTeach, selects a single demonstration to be provided to the whole class so that all students learn the target task as well as a single demonstration allows. While SplitTeach ensures optimal teaching at the cost of a bigger teaching effort, JointTeach ensures minimal effort, although the learners are not guaranteed to perfectly recover the target task. We conclude by illustrating our methods in several simulation domains. The simulation results agree with our theoretical findings, showcasing that indeed class teaching is not possible in the presence of heterogeneous students. At the same time, they also illustrate the main properties of our proposed algorithms: in all domains, SplitTeach guarantees perfect teaching and, in terms of teaching effort, is always at least as good as individualized teaching (often better); on the other hand, JointTeach attains minimal teaching effort in all domains, even if sometimes it compromises the teaching performance.

  • Understanding robots: Making robots more legible in multi-party interactions
    Miguel Faria, Francisco S. Melo, and Ana Paiva

    IEEE
    In this work we explore implicit communication between humans and robots, through movement, in multi-party (or multi-user) interactions. In particular, we investigate how a robot can move to better convey its intentions using legible movements in multi-party interactions. Current research on the application of legible movements has focused on single-user interactions, causing a vacuum of knowledge regarding the impact of such movements in multi-party interactions. We propose a novel approach that extends the notion of legible motion to multi-party settings, by considering that legibility depends on all human users involved in the interaction, and should take into consideration how each of them perceives the robot’s movements from their respective points of view. We show, through simulation and a user study, that our proposed model of multi-user legibility leads to movements that, on average, optimize the legibility of the motion as perceived by the group of users. Our model creates movements that allow each human to more quickly and confidently understand what the robot’s intentions are, thus creating safer, clearer and more efficient interactions and collaborations.

  • Interactive Teaching with Groups of Unknown Bayesian Learners
    Carla Guerra, Francisco S. Melo, and Manuel Lopes

    Springer International Publishing

  • Preface


  • Exploiting Symmetry in Human Robot-Assisted Dressing Using Reinforcement Learning
    Pedro Ildefonso, Pedro Remédios, Rui Silva, Miguel Vasco, Francisco S. Melo, Ana Paiva, and Manuela Veloso

    Springer International Publishing

RECENT SCHOLAR PUBLICATIONS

  • TEAMSTER: Model-Based Reinforcement Learning for Ad Hoc Teamwork (Abstract Reprint)
    JG Ribeiro, G Rodrigues, A Sardinha, FS Melo
    Proceedings of the AAAI Conference on Artificial Intelligence 38 (20), 22708 2024

  • “Guess what I'm doing”: Extending legibility to sequential decision tasks
    M Faria, FS Melo, A Paiva
    Artificial Intelligence, 104107 2024

  • NeuralThink: Algorithm Synthesis that Extrapolates in General Tasks
    B Esteves, M Vasco, FS Melo
    arXiv preprint arXiv:2402.15393 2024

  • TEAMSTER: Model-based reinforcement learning for ad hoc teamwork
    JG Ribeiro, G Rodrigues, A Sardinha, FS Melo
    Artificial Intelligence 324, 104013 2023

  • HOTSPOT: An Ad Hoc Teamwork Platform for Mixed Human-Robot Teams
    JG Ribeiro, LM Henriques, S Colcher, JC Duarte, FS Melo, RL Milidi, ...
    Authorea Preprints 2023

  • Emergent Robust Communication for Multi-Round Interactions in Noisy Environments
    F Vital, A Sardinha, FS Melo
    2023

  • Making Friends in the Dark: Ad Hoc Teamwork Under Partial Observability
    JG Ribeiro, C Martinho, A Sardinha, FS Melo
    arXiv preprint arXiv:2310.01439 2023

  • Multi-Bellman operator for convergence of Q-learning with linear function approximation
    DS Carvalho, PA Santos, FS Melo
    arXiv preprint arXiv:2309.16819 2023

  • Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback
    R Zayanov, FS Melo, M Lopes
    arXiv preprint arXiv:2309.09095 2023

  • Pre-training with Augmentations for Efficient Transfer in Model-Based Reinforcement Learning
    B Esteves, M Vasco, FS Melo
    EPIA Conference on Artificial Intelligence, 133-145 2023

  • Learning to Perceive in Deep Model-Free Reinforcement Learning
    G Querido, A Sardinha, FS Melo
    arXiv preprint arXiv:2301.03730 2023

  • Theoretical remarks on feudal hierarchies and reinforcement learning
    DS Carvalho, FS Melo, PA Santos
    ECAI 2023, 351-356 2023

  • Robotic gaze responsiveness in multiparty teamwork
    F Correia, J Campos, FS Melo, A Paiva
    International Journal of Social Robotics 15 (1), 27-36 2023

  • When a robot is your teammate
    F Correia, FS Melo, A Paiva
    Topics in Cognitive Science 2022

  • Autonomous Agents and Multiagent Systems. Best and Visionary Papers: AAMAS 2022 Workshops, Virtual Event, May 9–13, 2022, Revised Selected Papers
    FS Melo, F Fang
    Springer Nature 2022

  • Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories
    F Vital, M Vasco, A Sardinha, F Melo
    2022 IEEE/RSJ International Conference on Intelligent Robots and Systems 2022

  • Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning
    PP Santos, DS Carvalho, M Vasco, A Sardinha, PA Santos, A Paiva, ...
    arXiv preprint arXiv:2210.06274 2022

  • Q-learning with regularization converges with non-linear non-stationary features
    DS Carvalho, FS Melo, PA Santos
    2022

  • FIT: Using Feature Importance to Teach Classification Tasks to Unknown Learners
    C Guerra, FS Melo, M Lopes
    EPIA Conference on Artificial Intelligence, 440-451 2022

  • Geometric multimodal contrastive representation learning
    P Poklukar, M Vasco, H Yin, FS Melo, A Paiva, D Kragic
    International Conference on Machine Learning, 17782-17800 2022

MOST CITED SCHOLAR PUBLICATIONS

  • An analysis of reinforcement learning with function approximation
    FS Melo, SP Meyn, MI Ribeiro
    Proceedings of the 25th international conference on Machine learning, 664-671 2008
    Citations: 311

  • Active learning for reward estimation in inverse reinforcement learning
    M Lopes, F Melo, L Montesano
    Joint European conference on machine learning and knowledge discovery in 2009
    Citations: 239

  • Affordance-based imitation learning in robots
    M Lopes, FS Melo, L Montesano
    2007 IEEE/RSJ international conference on intelligent robots and systems 2007
    Citations: 173

  • Q-Learning with Linear Function Approximation
    FS Melo, MI Ribeiro
    International Conference on Computational Learning Theory, 308-322 2007
    Citations: 138

  • Decentralized MDPs with sparse interactions
    FS Melo, M Veloso
    Artificial Intelligence 175 (11), 1757-1789 2011
    Citations: 121

  • Interaction-driven Markov games for decentralized multiagent planning under uncertainty
    MTJ Spaan, FS Melo
    Proceedings of the 7th international joint conference on Autonomous agents 2008
    Citations: 106

  • Learning of coordination: Exploiting sparse interactions in multiagent systems
    FS Melo, M Veloso
    Proceedings of The 8th International Conference on Autonomous Agents and 2009
    Citations: 103

  • Exploring the impact of fault justification in human-robot trust
    F Correia, C Guerra, S Mascarenhas, FS Melo, A Paiva
    Proceedings of the 17th international conference on autonomous agents and 2018
    Citations: 93

  • Group-based emotions in teams of humans and robots
    F Correia, S Mascarenhas, R Prada, FS Melo, A Paiva
    Proceedings of the 2018 ACM/IEEE international conference on human-robot 2018
    Citations: 84

  • Empathic robot for group learning: A field study
    P Alves-Oliveira, P Sequeira, FS Melo, G Castellano, A Paiva
    ACM Transactions on Human-Robot Interaction (THRI) 8 (1), 1-34 2019
    Citations: 75

  • Just follow the suit! trust in human-robot interactions during card game playing
    F Correia, P Alves-Oliveira, N Maia, T Ribeiro, S Petisca, FS Melo, ...
    2016 25th IEEE international symposium on robot and human interactive 2016
    Citations: 62

  • Personalized assistance for dressing users
    SD Klee, BQ Ferreira, R Silva, JP Costeira, FS Melo, M Veloso
    Social Robotics: 7th International Conference, ICSR 2015, Paris, France 2015
    Citations: 62

  • Emotion-based intrinsic motivation for reinforcement learning agents
    P Sequeira, FS Melo, A Paiva
    Affective Computing and Intelligent Interaction: 4th International 2011
    Citations: 56

  • Abstraction levels for robotic imitation: Overview and computational approaches
    M Lopes, F Melo, L Montesano, J Santos-Victor
    From Motor Learning to Interaction Learning in Robots, 313-355 2010
    Citations: 56

  • Monte carlo tree search experiments in hearthstone
    A Santos, PA Santos, FS Melo
    2017 IEEE conference on computational intelligence and games (CIG), 272-279 2017
    Citations: 52

  • Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy
    FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gamba, ...
    Artificial intelligence in medicine 96, 198-216 2019
    Citations: 50

  • An empathic robotic tutor for school classrooms: Considering expectation and satisfaction of children as end-users
    P Alves-Oliveira, T Ribeiro, S Petisca, E Di Tullio, FS Melo, A Paiva
    Social Robotics: 7th International Conference, ICSR 2015, Paris, France 2015
    Citations: 45

  • A computational model of social-learning mechanisms
    M Lopes, FS Melo, B Kenward, J Santos-Victor
    Adaptive behavior 17 (6), 467-483 2009
    Citations: 45

  • Convergence of Q-learning with linear function approximation
    FS Melo, MI Ribeiro
    2007 European Control Conference (ECC), 2671-2678 2007
    Citations: 45

  • Exploring prosociality in human-robot teams
    F Correia, SF Mascarenhas, S Gomes, P Arriaga, I Leite, R Prada, ...
    2019 14th ACM/IEEE international conference on human-robot interaction (HRI 2019
    Citations: 44