Associate Professor, Department of Computer Science
INESC-ID and Instituto Superior Técnico, University of Lisbon
Artificial Intelligence; Machine Learning; Reinforcement Learning
Miguel Faria, Francisco S. Melo, and Ana Paiva
Elsevier BV
Rustam Zayanov, Francisco Melo, and Manuel Lopes
SCITEPRESS - Science and Technology Publications
João G. Ribeiro, Gonçalo Rodrigues, Alberto Sardinha, and Francisco S. Melo
Elsevier BV
Diogo S. Carvalho, Francisco S. Melo, and Pedro A. Santos
IOS Press
Hierarchical reinforcement learning is an increasingly popular approach for learning to make sequential decisions toward long-term goals, and feudal hierarchies are among the most widely deployed frameworks. However, few theoretical results exist for hierarchical structures. In this work, we formalize the common two-level feudal hierarchy as two Markov decision processes, where the high-level process depends on the policy executed at the low level. Despite the non-stationarity introduced by this dependency, we show that each process exhibits stable behavior. Building on this result, we show that, regardless of the convergent learning algorithm used at the low level, convergence of both prediction and control algorithms at the high level is guaranteed. Our results provide theoretical support for combining feudal hierarchies with standard reinforcement learning methods at each level.
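The two-level structure described in this abstract can be illustrated with a toy sketch. This is an assumed setup, not the paper's construction: on a short corridor, a high-level learner picks a subgoal cell, a low-level learner runs Q-learning to reach whatever subgoal it is handed (with an intrinsic reward on arrival), and the high level runs Q-learning over subgoal choices with external reward only at the far end. Duration discounting of the temporally extended high-level actions is omitted for simplicity.

```python
import random

random.seed(0)

N = 5                      # corridor cells 0..N-1; external reward at N-1
SUBGOALS = [2, N - 1]      # hypothetical subgoal set for the high level
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1
q_low, q_high = {}, {}     # (s, g, a) -> value and (s, g) -> value

def pick(values, eps=EPS):
    """Epsilon-greedy index selection with random tie-breaking."""
    if random.random() < eps:
        return random.randrange(len(values))
    m = max(values)
    return random.choice([i for i, v in enumerate(values) if v == m])

def low_step(s, g):
    """One low-level Q-learning step toward subgoal g (actions: 0=left, 1=right)."""
    qs = [q_low.get((s, g, a), 0.0) for a in (0, 1)]
    a = pick(qs)
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == g else 0.0                      # intrinsic reward on arrival
    target = r + GAMMA * max(q_low.get((s2, g, b), 0.0) for b in (0, 1))
    q_low[(s, g, a)] = qs[a] + ALPHA * (target - qs[a])
    return s2, s2 == g

for _ in range(3000):
    s = 0
    for _ in range(4):                               # high-level decisions per episode
        g = SUBGOALS[pick([q_high.get((s, h), 0.0) for h in SUBGOALS])]
        s0, steps, done = s, 0, False
        while not done and steps < 15:               # low level runs to subgoal or timeout
            s, done = low_step(s, g)
            steps += 1
        r = 1.0 if s == N - 1 else 0.0               # external reward at the far end
        target = r + GAMMA * max(q_high.get((s, h), 0.0) for h in SUBGOALS)
        q_high[(s0, g)] = q_high.get((s0, g), 0.0) + ALPHA * (target - q_high.get((s0, g), 0.0))

# The high level should come to prefer the rewarding subgoal from the start cell.
best = max(SUBGOALS, key=lambda g: q_high.get((0, g), 0.0))
print(best)
```

Note how the low level is non-stationary from the high level's perspective while it is still learning, yet both levels settle, matching the stability result the abstract describes.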
João G. Ribeiro, Cassandro Martinho, Alberto Sardinha, and Francisco S. Melo
IOS Press
This paper introduces a formal definition of the setting of ad hoc teamwork under partial observability and proposes a first principled model-based approach that relies only on prior knowledge and partial observations of the environment to perform ad hoc teamwork. We make three distinct assumptions that set our approach apart from previous work: i) the state of the environment is always partially observable, ii) the actions of the teammates are always unavailable to the ad hoc agent, and iii) the ad hoc agent has no access to a reward signal that could be used to learn the task from scratch. Our results in 70 POMDPs from 11 domains show that our approach is not only effective in assisting unknown teammates in solving unknown tasks but also robust in scaling to more challenging problems. Supplementary material is available at https://github.com/jmribeiro/adhoc-teamwork-under-partial-observability.
Bernardo Esteves, Miguel Vasco, and Francisco S. Melo
Springer Nature Switzerland
Filipa Correia, Joana Campos, Francisco S. Melo, and Ana Paiva
Springer Science and Business Media LLC
Kim Baraka, Marta Couto, Francisco S. Melo, Ana Paiva, and Manuela Veloso
Frontiers Media SA
Social robots have been shown to be promising tools for delivering therapeutic tasks for children with Autism Spectrum Disorder (ASD). However, their efficacy is currently limited by a lack of flexibility of the robot’s social behavior to successfully meet therapeutic and interaction goals. Robot-assisted interventions are often based on structured tasks where the robot sequentially guides the child towards the task goal. Motivated by a need for personalization to accommodate a diverse set of child profiles, this paper investigates the effect of different robot action sequences in structured socially interactive tasks targeting attention skills in children with different ASD profiles. Based on an autism diagnostic tool, we devised a robotic prompting scheme on a NAO humanoid robot, aimed at eliciting goal behaviors from the child, and integrated it in a novel interactive storytelling scenario involving screens. We programmed the robot to operate in three different modes: diagnostic-inspired (Assess), personalized therapy-inspired (Therapy), and random (Explore). Our exploratory study with 11 young children with ASD highlights the usefulness and limitations of each mode according to different possible interaction goals, and paves the way towards more complex methods for balancing short-term and long-term goals in personalized robot-assisted therapy.
Miguel Vasco, Hang Yin, Francisco S. Melo, and Ana Paiva
Elsevier BV
This work addresses the problem of cross-modality inference (CMI), i.e., inferring missing data of unavailable perceptual modalities (e.g., sound) using data from available perceptual modalities (e.g., image). We overview single-modality variational autoencoder methods and discuss three problems of computational cross-modality inference, arising from recent developments in multimodal generative models. Inspired by neural mechanisms of human recognition, we contribute the Nexus model, a novel hierarchical generative model that can learn a multimodal representation of an arbitrary number of modalities in an unsupervised way. By exploiting hierarchical representation levels, Nexus is able to generate high-quality, coherent data of missing modalities given any subset of available modalities. To evaluate CMI in a natural scenario with a high number of modalities, we contribute the "Multimodal Handwritten Digit" (MHD) dataset, a novel benchmark dataset that combines image, motion, sound and label information from digit handwriting. We assess the key role of hierarchy in enabling high-quality samples during cross-modality inference and discuss how a novel training scheme enables Nexus to learn a multimodal representation robust to missing modalities at test time. Our results show that Nexus outperforms current state-of-the-art multimodal generative models in their cross-modality inference capabilities.
Fabio Vital, Miguel Vasco, Alberto Sardinha, and Francisco Melo
IEEE
We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a series of instructions, to a sequence of movements to be executed by a robot. In the first stage, we perceive and preprocess the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage, we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage, we convert the latent samples into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution by a robotic manipulator. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives a word as input through different perceptual modalities (e.g., image, sound) and generates the corresponding motion trajectory to write it, creating coherent and high-quality handwritten words.
Filipa Correia, Francisco S. Melo, and Ana Paiva
Wiley
Creating effective teamwork between humans and robots involves not only addressing their performance as a team but also sustaining the quality and sense of unity among teammates, also known as cohesion. This paper explores the following research problem: how can we endow robotic teammates with social capabilities to improve the cohesive alliance with humans? Defining the concept of a human-robot cohesive alliance in light of the multidimensional construct of cohesion from the social sciences, we propose to address this problem through the idea of multifaceted human-robot cohesion. We present our preliminary efforts from previous work to examine each of the five dimensions of cohesion: social, collective, emotional, structural, and task. We finish the paper with a discussion of how human-robot cohesion contributes to the key questions and ongoing challenges of creating robotic teammates. Overall, cohesion in human-robot teams may be a key factor in propelling team performance, and it should be considered in the design, development, and evaluation of robotic teammates.
Carla Guerra, Francisco S. Melo, and Manuel Lopes
Springer International Publishing
Ramona Merhej, Fernando P. Santos, Francisco S. Melo, and Francisco C. Santos
AI Access Foundation
We examine how wealth inequality and diversity in the perception of risk of a collective disaster impact cooperation levels in the context of a public goods game with uncertain and non-linear returns. In this game, individuals face a collective-risk dilemma where they may contribute or not to a common pool to reduce their chances of future losses. We draw our conclusions from social simulations with populations of independent reinforcement learners with diverse levels of risk and wealth. We find that both wealth inequality and diversity in risk assessment can hinder cooperation and augment collective losses. Additionally, wealth inequality further exacerbates long-term inequality, causing rich agents to become richer and poor agents to become poorer. On the other hand, diversity in risk only amplifies inequality when combined with bias in group assortment, i.e., a high probability that agents from the same risk class play together. Our results also suggest that taking wealth inequality into account can help in designing effective policies aimed at fostering cooperation in large groups, a configuration where collective action is harder to achieve. Finally, we characterize the circumstances under which risk perception alignment is crucial and those under which reducing wealth inequality constitutes the deciding factor for collective welfare.
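The collective-risk dilemma described above can be made concrete with a minimal sketch of one round. This is illustrative only; the function name, parameters, and values are assumptions, not the paper's exact setup. Contributors pay a fraction of their wealth into a common pool; if the pool misses the target, each agent loses its remaining wealth with its own perceived probability of disaster.

```python
import random

def play_round(wealths, contributes, threshold, risks, cost=0.1, rng=random):
    """One round: contributors pay cost * wealth; if the pool falls short of
    the threshold, agent i loses everything with probability risks[i]."""
    pool = sum(w * cost for w, c in zip(wealths, contributes) if c)
    new = [w * (1 - cost) if c else w for w, c in zip(wealths, contributes)]
    if pool < threshold:                     # collective target missed
        new = [0.0 if rng.random() < r else w for w, r in zip(new, risks)]
    return new

random.seed(1)
wealths = [1.0, 1.0, 4.0, 4.0]               # unequal endowments
risks = [0.2, 0.9, 0.2, 0.9]                 # heterogeneous risk perception

# If everyone contributes, the target is met and losses are avoided for sure.
safe = play_round(wealths, [True] * 4, threshold=0.5, risks=risks)
print(safe)

# If only the poor contribute, the pool falls short and disaster may strike.
risky = play_round(wealths, [True, True, False, False], threshold=0.5, risks=risks)
print(risky)
```

Pairing a payoff function like this with independent reinforcement learners choosing whether to contribute is the kind of social simulation the abstract refers to.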
Francisco Melo and Plinio Moreno
IEEE
This work incorporates socially acceptable behaviors into traditional reactive navigation systems, allowing a robot to approach a group of humans in a socially acceptable manner by considering both the personal space and the group space. In contrast to fixed social-distancing parameters, this work presents an adaptive model; that is, the parameters of the personal and group space cost functions adapt to the arrangement of the group and to space constraints, avoiding the need to choose initial parameters. A socially aware navigation system capable of approaching groups is implemented for a general-purpose mobile robot. The adaptive personal and group space algorithm is integrated with the standard ROS navigation system, representing its information in a costmap layer. The adaptation of spaces is tested using fixed and adaptive parameters for different groups drawn from three datasets. The navigation system is evaluated through simulation experiments, demonstrating that the robot is capable of approaching groups while providing a more realistic space model adapted to the context.
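A personal-space cost of the kind that feeds such a costmap layer can be sketched as an asymmetric 2-D Gaussian around each person, larger in the facing direction. The adaptation rule below (capping the spreads by the available free space) is a hypothetical stand-in for the paper's adaptive model, and all parameter values are assumptions.

```python
import math

def personal_space_cost(px, py, person, free_radius):
    """Cost in [0, 1] at query point (px, py) around a person pose
    (x, y, theta); spread is larger in front and shrinks when the
    free space around the group (free_radius) is tight."""
    x, y, theta = person
    s_front = min(1.2, free_radius)           # spread ahead of the person (m)
    s_side = min(0.8, free_radius)            # lateral / rear spread (m)
    dx, dy = px - x, py - y
    # Rotate the query point into the person's frame.
    fx = math.cos(theta) * dx + math.sin(theta) * dy
    fy = -math.sin(theta) * dx + math.cos(theta) * dy
    s = s_front if fx > 0 else s_side         # asymmetric: front vs. back
    return math.exp(-0.5 * ((fx / s) ** 2 + (fy / s_side) ** 2))

# Cost peaks at the person and decays with distance; the front lobe is wider,
# so the cost extends further ahead than behind.
at_person = personal_space_cost(0.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
in_front = personal_space_cost(1.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
behind = personal_space_cost(-1.0, 0.0, (0.0, 0.0, 0.0), free_radius=2.0)
print(at_person, in_front, behind)
```

In a ROS deployment, values like these would be rasterized into a costmap layer so the planner keeps the robot out of high-cost regions while still allowing an approach.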
Ana Salta, Rui Prada, and Francisco S. Melo
IEEE
Game artificial intelligence (AI) competitions are important to foster research and development on game AI and AI in general. These competitions supply different challenging problems that can be translated into other contexts, virtual or real. They provide frameworks and tools to facilitate research on their core topics and provide means for comparing and sharing results. A competition is also a way to motivate new researchers to study these challenges. In this article, we present the Geometry Friends game AI competition. Geometry Friends is a two-player cooperative physics-based puzzle platformer computer game. The concept of the game is simple, though solving it has proven difficult. While the main and apparent focus of the game is cooperation, it also relies on other AI-related problems such as planning, plan execution, and motion control, all connected to situational awareness, and all of which must be solved in real time. We discuss the competition and the challenges it brings, and present an overview of the current solutions.
Francisco S. Melo and Manuel Lopes
Frontiers Media SA
In this paper, we propose the first machine teaching algorithm for multiple inverse reinforcement learners. As our initial contribution, we formalize the problem of optimally teaching a sequential task to a heterogeneous class of learners. We then contribute a theoretical analysis of this problem, identifying conditions under which it is possible to conduct such teaching using the same demonstration for all learners. Our analysis shows that, contrary to other teaching problems, teaching a sequential task to a heterogeneous class of learners with a single demonstration may become impossible as the differences between individual agents increase. We then contribute two algorithms that address the main difficulties identified by our theoretical analysis. The first algorithm, which we dub SplitTeach, starts by teaching the class as a whole until all students have learned all that they can learn as a group; it then teaches each student individually, ensuring that all students perfectly acquire the target task. The second, which we dub JointTeach, selects a single demonstration to be provided to the whole class so that all students learn the target task as well as a single demonstration allows. While SplitTeach ensures optimal teaching at the cost of a larger teaching effort, JointTeach ensures minimal effort, although the learners are not guaranteed to perfectly recover the target task. We conclude by illustrating our methods in several simulation domains. The simulation results agree with our theoretical findings, showing that teaching a heterogeneous class with a single demonstration is indeed not always possible. At the same time, they illustrate the main properties of our proposed algorithms: in all domains, SplitTeach guarantees perfect teaching and, in terms of teaching effort, is always at least as good as individualized teaching (often better); JointTeach, on the other hand, attains minimal teaching effort in all domains, even if it sometimes compromises teaching performance.
Miguel Faria, Francisco S. Melo, and Ana Paiva
IEEE
In this work we explore implicit communication between humans and robots, through movement, in multi-party (or multi-user) interactions. In particular, we investigate how a robot can move to better convey its intentions using legible movements in multi-party interactions. Current research on legible movement has focused on single-user interactions, leaving a gap in knowledge regarding the impact of such movements in multi-party interactions. We propose a novel approach that extends the notion of legible motion to multi-party settings by considering that legibility depends on all human users involved in the interaction and should take into account how each of them perceives the robot's movements from their respective points of view. We show, through simulation and a user study, that our proposed model of multi-user legibility leads to movements that, on average, optimize the legibility of the motion as perceived by the group of users. Our model creates movements that allow each human to more quickly and confidently understand the robot's intentions, thus creating safer, clearer, and more efficient interactions and collaborations.
Carla Guerra, Francisco S. Melo, and Manuel Lopes
Springer International Publishing
Pedro Ildefonso, Pedro Remédios, Rui Silva, Miguel Vasco, Francisco S. Melo, Ana Paiva, and Manuela Veloso
Springer International Publishing