Foundations of Formal Reasoning over Knowledge Bases Combining Symbolic and Sub-Symbolic Knowledge Gianluca Cima, Marco Console, Laura Papi Proceedings of the AAAI Conference on Artificial Intelligence, 2026 More and more organizations are relying on Machine Learning (ML) models to support internal decision-making processes. To better support such processes, it would be highly beneficial to contextualize the inductively acquired knowledge encoded in these models and enable formal reasoning over it. Despite significant progress in Neural-Symbolic AI, this specific challenge remains largely under-explored. We propose a framework that makes it possible to integrate the knowledge induced by ML classifiers with the knowledge specified by logic-based formalisms. The framework is based on the novel notion of Hybrid Knowledge Base (HKB), consisting of two components: an ontology and a set of ML binary classifiers. As usual, the ontology provides an intensional representation of the modeled domain through logic-based axioms, while the binary classifiers implicitly encode the extensional knowledge. Specifically, an HKB associates with each concept and role mentioned in the ontology a classifier based on a set of features deemed to be relevant for the application domain, thereby virtually populating the concepts and roles with the instances and pairs of instances from the feature space. Besides the definition of the new framework, as a more technical contribution we show how to reason in this framework by studying query answering over HKBs. In particular, we investigate the computational complexity of query answering in a rich language over HKBs in which the ontology is specified in (the Description Logic counterpart of) RDFS, while the binary classifiers are represented by Multi-Layer Perceptrons.
Expressive Recursive Answers for Ontological Knowledge Bases Luca Andolfi, Gianluca Cima, Marco Console, Maurizio Lenzerini Proceedings of the AAAI Conference on Artificial Intelligence, 2026 A fundamental use of knowledge bases (KBs) is query answering, i.e., retrieving the information entailed by the KB in response to a user query. When both the KB and the query are specified as logical formulae, the standard form of answer provided to users is the set of all certain answers (CAs): tuples of constants that satisfy the formula defining the query in every model of the logical theory defining the KB. Despite their wide adoption, CAs are known to be just a lossy representation of the information that a KB and a query provide. While several alternative answer languages have been proposed in the literature, no general consensus has emerged on the most suitable approach to query answering over ontological KBs, as each language comes with its own limitations. To address some of these issues, we introduce Regularly Recurrent Answers (RRAs), a novel answer language for queries over ontological KBs based on regular expressions. RRAs support the representation of infinite sets of tuples of constants via a simple (and arguably well understood) generation mechanism. We show that RRAs can capture a fundamental fragment of the certain information entailed by unions of conjunctive queries and DL-Lite KBs, making them a strong candidate for informative query answering settings. Our contribution includes the formal definition of RRAs, a proof of their informativeness, and a study of the computational complexity of the query answering problem using RRAs.
Ontology-Based Schema-Level Data Quality: The Case of Consistency Gianluca Cima, Marco Console, Maurizio Lenzerini Journal of Data and Information Quality, 2025 The quality of metadata plays a crucial role in many data FAIRification processes. So much so, in fact, that all four main principles of data FAIRification prescribe the use of high-quality metadata. One of the main data management paradigms where metadata is a first-class citizen is Ontology-Based Data Management (OBDM). The goal of OBDM is to provide users with a reconciled view of a set of heterogeneous data sources by means of a semantic metadata layer comprising an ontology and a mapping. The former is a high-level, declarative representation of the domain of interest written in terms of a logical theory, and the latter is a formal description of the relation between the symbols in the ontology and the data at the sources. In this article, we introduce a novel data quality framework based on OBDM and specifically tailored for metadata analysis. The target of this framework is one of the most common forms of metadata currently in circulation, i.e., the integrity constraints defined by a database schema. Specifically, we will focus on the data quality dimension known as Consistency, i.e., the property of data that is free of contradictions and incoherence. In this context, our techniques provide a set of tools to compare the integrity constraints defined by a database schema against the knowledge encoded in an ontology and check whether these constraints are strict enough for such knowledge (i.e., protect it) and not too strict (i.e., are faithful to it). The contribution of the article is the presentation of the framework and the study of the related computational problems. We will present a detailed computational complexity analysis of such problems and show that they are decidable for classes of OBDM specifications and integrity constraints that are very popular in practice.
Enhancing cooperativity in controlled query evaluation over ontologies Piero Bonatti, Gianluca Cima, Domenico Lembo, Francesco Magliocca, Lorenzo Marconi, Riccardo Rosati, Luigi Sauro, Domenico Fabio Savo Artificial Intelligence, 2025 Controlled Query Evaluation (CQE) is a methodology designed to maintain confidentiality by either rejecting specific queries or adjusting responses to safeguard sensitive information. In this investigation, our focus centers on CQE within Description Logic ontologies, aiming to ensure that queries are answered truthfully as long as possible before resorting to deceptive responses, a cooperativity property which is called the “longest honeymoon”. Our work introduces new semantics for CQE, denoted as MC-CQE, which enjoys the longest honeymoon property and outperforms previous methodologies in terms of cooperativity. We study the complexity of query answering in this new framework for ontologies expressed in the Description Logic DL-Lite_R. Specifically, we establish data complexity results under different maximally cooperative semantics and for different classes of queries. Our results identify both tractable and intractable cases. In particular, we show that the evaluation of Boolean unions of conjunctive queries is the same under all the above semantics and its data complexity is in AC^0. This result makes query answering amenable to SQL query rewriting. However, this favorable property does not extend to open queries, even with a restricted query language limited to conjunctions of atoms. While, in general, answering open queries in the MC-CQE framework is intractable, we identify a sub-family of semantics under which answering full conjunctive queries is tractable.
Recent Advances in Logic-Based Entity Resolution Meghyn Bienvenu, Gianluca Cima, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Zhiliang Xiang SIGMOD Record, 2025 Entity resolution (ER) is a central task in data quality, which is concerned with identifying pairs of distinct constants or tuples that refer to the same real-world entity. Declarative approaches, based upon logical rules and constraints, are a natural choice for tackling complex, collective ER tasks involving the joint resolution of multiple entity types across multiple tables. This paper provides an overview of recent advances in logic-based entity resolution, with a particular focus on the Lace framework, first introduced at PODS'22 and subsequently extended with additional features (IJCAI'23, KR'23) and equipped with an answer set programming-based implementation (KR'24, KR'25).
Answering Conjunctive Queries with Safe Negation and Inequalities over RDFS Knowledge Bases Gianluca Cima, Marco Console, Roberto Maria Delfino, Maurizio Lenzerini, Antonella Poggi Proceedings of the AAAI Conference on Artificial Intelligence, 2025 Expressing negative conditions is a crucial feature of query languages for knowledge bases (KBs). Answering such queries over ontological KBs, however, is a very challenging task that becomes undecidable even for lightweight Description Logic (DL) ontologies. Such negative results hold even for Conjunctive Queries (CQs) equipped with basic forms of negative conditions such as the so-called safe negation or inequality atoms. One ontology language that is seemingly unaffected by these results is (the DL counterpart of) RDFS even if equipped with disjointness axioms. Answering CQs with inequalities over such ontologies is known to be Pi^p_2-complete, if the number of inequality atoms is unbounded, and NP-complete if we limit this number to one. Notably, these results leave open the cases of CQs with a fixed number greater than two of inequality atoms. Additionally, such a thorough analysis is missing for CQs with safe negation. In this paper, we embark on a refined analysis of the combined complexity of answering CQs with inequality atoms and safe negation over RDFS ontologies augmented with disjointness axioms. Firstly, we provide a unified Pi^p_2 query answering algorithm for the general problem. Secondly, we confirm the generally held conjecture according to which answering CQs with two inequality atoms over such ontologies is already Pi^p_2-hard. This result closes an important gap in the current literature and has an impact on the widely influential problem of query containment. Lastly, for CQs with safe negation, we prove a behavior similar to that of CQs with inequality atoms. Specifically, we show that answering CQs with at most one negated atom can be done in NP, while allowing at most two negated atoms is sufficient to obtain Pi^p_2-hardness.
Advances in Logic-Based Entity Resolution: Enhancing ASPEN with Local Merges and Optimality Criteria Zhiliang Xiang, Meghyn Bienvenu, Gianluca Cima, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García Proceedings of the International Conference on Knowledge Representation and Reasoning, 2025 In this paper, we present ASPEN+, which extends an existing ASP-based system, ASPEN, for collective entity resolution with two important functionalities: support for local merges and new optimality criteria for preferred solutions. Indeed, ASPEN only supports so-called global merges of entity-referring constants (e.g. author ids), in which all occurrences of matched constants are treated as equivalent and merged accordingly. However, it has been argued that when resolving data values, local merges are often more appropriate, as e.g. some instances of ‘J. Lee’ may refer to ‘Joy Lee’, while others should be matched with ‘Jake Lee’. In addition to allowing such local merges, ASPEN+ offers new optimality criteria for selecting solutions, such as minimizing rule violations or maximising the number of rules supporting a merge. Our main contributions are thus (1) the formalisation and computational analysis of various notions of optimal solution, and (2) an extensive experimental evaluation on real-world datasets, demonstrating the effect of local merges and the new optimality criteria on both accuracy and runtime.
Indistinguishability in controlled query evaluation over prioritized description logic ontologies Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo Journal of Web Semantics, 2025 In this paper we study Controlled Query Evaluation (CQE), a declarative approach to privacy-preserving query answering over databases, knowledge bases, and ontologies. CQE is based on the notion of censor, which defines the answers to each query posed to the data/knowledge base. We investigate both semantic and computational properties of CQE in the context of OWL ontologies, and specifically in the description logic DL-Lite_R, which underpins the OWL 2 QL profile. In our analysis, we focus on semantics of CQE based on censors (called optimal GA censors) that enjoy the so-called indistinguishability property, analyzing the trade-off between maximizing the amount of data disclosed by query answers and minimizing the computational cost of privacy-preserving query answering. We first study the data complexity of skeptical entailment of unions of conjunctive queries under all the optimal GA censors, showing that the computational cost of query answering in this setting is intractable. To overcome this computational issue, we then define a different semantics for CQE centered around the notion of intersection of all the optimal GA censors. We show that query answering over OWL 2 QL ontologies under the new intersection-based semantics for CQE enjoys tractability and is first-order rewritable, i.e. amenable to be implemented through SQL query rewriting techniques and the use of standard relational database systems; on the other hand, this approach shows limitations in terms of amount of data disclosed. To improve this aspect, we add preferences between ontology predicates to the CQE framework, and identify a semantics under which query answering over OWL 2 QL ontologies maintains the same computational properties of the intersection-based approach without preferences.
Controlled Query Evaluation in DL-Lite through Epistemic Protection Policies (Extended Abstract) CEUR Workshop Proceedings, 2025
Controlled Query Evaluation in Description Logic Ontologies Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2025
Data Augmentation for Data-Centric AI Through the Lens of Semantic Technologies: A Position Paper CEUR Workshop Proceedings, 2025
Combining Global and Local Merges in Logic-based Entity Resolution Proceedings of the International Conference on Knowledge Representation and Reasoning, 2023
A review of data abstraction Gianluca Cima, Marco Console, Maurizio Lenzerini, Antonella Poggi Frontiers in Artificial Intelligence, 2023
Controlled Query Evaluation in OWL 2 QL: A “Longest Honeymoon” Approach Piero Bonatti, Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Luigi Sauro, Domenico Fabio Savo Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2022
Abstraction in Data Integration Gianluca Cima, Marco Console, Maurizio Lenzerini, Antonella Poggi Proceedings Symposium on Logic in Computer Science, 2021
Privacy preserving query answering in description logics through instance indistinguishability CEUR Workshop Proceedings, 2021
Non-monotonic ontology-based abstractions of data services 17th International Conference on Principles of Knowledge Representation and Reasoning KR 2020, 2020
Answering conjunctive queries with inequalities in DL-LiteR AAAI 2020 34th AAAI Conference on Artificial Intelligence, 2020
Controlled query evaluation in description logics through instance indistinguishability IJCAI International Joint Conference on Artificial Intelligence, 2020
Controlled query evaluation in description logics through instance indistinguishability CEUR Workshop Proceedings, 2020
Ontology-based explanation of classifiers CEUR Workshop Proceedings, 2020
Controlled Query Evaluation in Ontology-Based Data Access Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2020
Reverse engineering of data services CEUR Workshop Proceedings, 2019
Exploiting ontologies for explaining data sources semantics CEUR Workshop Proceedings, 2019
On queries with inequalities in DL-LiteR≠ CEUR Workshop Proceedings, 2019
Bag Semantics of DL-Lite with Functionality Axioms Gianluca Cima, Charalampos Nikolaou, Egor V. Kostylev, Mark Kaminski, Bernardo Cuenca Grau, Ian Horrocks Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2019
Bagging the DL-lite family further CEUR Workshop Proceedings, 2019
Preliminary results on ontology-based open data publishing CEUR Workshop Proceedings, 2017
Querying OWL 2 QL ontologies under the SPARQL Metamodeling Semantics Entailment Regime CEUR Workshop Proceedings, 2017
RECENT SCHOLAR PUBLICATIONS
Foundations of Formal Reasoning over Knowledge Bases Combining Symbolic and Sub-Symbolic Knowledge G Cima, M Console, L Papi Fortieth AAAI Conference on Artificial Intelligence (AAAI 2026) 40 (23 …, 2026
Expressive Recursive Answers for Ontological Knowledge Bases L Andolfi, G Cima, M Console, M Lenzerini Fortieth AAAI Conference on Artificial Intelligence (AAAI 2026) 40 (23 …, 2026
Data Augmentation for Data-Centric AI Through the Lens of Semantic Technologies: A Position Paper L Cabibbo, D Bertillo, G Cima, V Crescenzi, M Console, RM Delfino, ... CEUR Workshop Proceedings 4182, 2026
Ontology-based schema-level data quality: The case of consistency G Cima, M Console, M Lenzerini ACM Journal of Data and Information Quality 17 (4), 1-25, 2025 Citations: 2
Recent advances in logic-based entity resolution M Bienvenu, G Cima, V Gutiérrez-Basulto, Y Ibáñez-García, Z Xiang SIGMOD Record 54 (3), 7-21, 2025
Answering MetaQueries over RDFS Ontologies Under the SPARQL Metamodeling Semantics Entailment Regime D Calvanese, G Cima, J Corman, RM Delfino, M Lenzerini, L Marconi, ... Fourteenth International Joint Conference on Knowledge Graphs (IJCKG 2025 …, 2025
Assessing the exposure to public knowledge in policy-protected description logic ontologies G Cima, D Lembo, L Marconi, R Rosati, DF Savo Thirty-Fourth International Joint Conference on Artificial Intelligence …, 2025
Advances in logic-based entity resolution: Enhancing ASPen with local merges and optimality criteria Z Xiang, M Bienvenu, G Cima, V Gutiérrez-Basulto, Y Ibáñez-García Twenty-Second International Conference on Principles of Knowledge …, 2025 Citations: 1
Enhancing Cooperativity in Controlled Query Evaluation over Ontologies P Bonatti, G Cima, D Lembo, F Magliocca, L Marconi, R Rosati, L Sauro, ... Artificial Intelligence 348, 104402, 2025
Answering Queries with Negation and Inequalities over RDFS Knowledge Bases G Cima, M Console, RM Delfino, M Lenzerini, A Poggi Proceedings of The International Research and Industry Symposium on …, 2025
Answering Conjunctive Queries with Safe Negation and Inequalities over RDFS Knowledge Bases G Cima, M Console, RM Delfino, M Lenzerini, A Poggi Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2025) 39 (14 …, 2025 Citations: 9
Proceedings of the 33rd Symposium on Advanced Database Systems Ischia, Italy, June 16th to 19th, 2025 I Bartolini, G Cima, D Firmani, D Lembo, D Martinenghi, M Mecella, ... CEUR-WS, 2025
Answering Expressive Conjunctive Queries over RDFS Knowledge Bases G Cima, M Console, RM Delfino, M Lenzerini, A Poggi Thirty-Eighth International Workshop on Description Logics (DL 2025) 4091, 2025
Indistinguishability in controlled query evaluation over prioritized description logic ontologies G Cima, D Lembo, L Marconi, R Rosati, DF Savo Journal of Web Semantics 84, 100841, 2025 Citations: 4
Separability and its approximations in ontology-based data management G Cima, F Croce, M Lenzerini Semantic Web 15 (4), 1021-1056, 2024 Citations: 5
Controlled query evaluation in description logics through consistent query answering G Cima, D Lembo, R Rosati, DF Savo Artificial Intelligence 334, 104176, 2024 Citations: 5
ASPEN: ASP-based system for collective entity resolution Z Xiang, M Bienvenu, G Cima, V Gutiérrez-Basulto, Y Ibáñez-García Twenty-First International Conference on Principles of Knowledge …, 2024 Citations: 2
Controlled query evaluation through epistemic dependencies G Cima, D Lembo, L Marconi, R Rosati, DF Savo arXiv preprint arXiv:2405.02458, 2024 Citations: 1
A gentle introduction to controlled query evaluation in DL-Lite ontologies G Cima, D Lembo, L Marconi, R Rosati, DF Savo SN Computer Science 5 (4), 335, 2024 Citations: 2
What Does a Query Answer Tell You? Informativeness of Query Answers for Knowledge Bases L Andolfi, G Cima, M Console, M Lenzerini Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2024) 38 (9 …, 2024 Citations: 3
MOST CITED SCHOLAR PUBLICATIONS
Predictive analytic techniques to identify hidden relationships between training load, fatigue and muscle strains in young soccer players M Mandorino, AJ Figueiredo, G Cima, A Tessitore Sports 10 (1), 3, 2022 Citations: 37
A data mining approach to predict non-contact injuries in young soccer players M Mandorino, AJ Figueiredo, G Cima, A Tessitore International Journal of Computer Science in Sport 20 (2), 147-163, 2021 Citations: 31
Controlled Query Evaluation in Description Logics Through Instance Indistinguishability G Cima, D Lembo, R Rosati, DF Savo Twenty-Ninth International Joint Conference on Artificial Intelligence …, 2020 Citations: 30
Controlled Query Evaluation in Ontology-Based Data Access G Cima, D Lembo, L Marconi, R Rosati, DF Savo Nineteenth International Semantic Web Conference (ISWC 2020) 12506, 128-146, 2020 Citations: 25
On the SPARQL metamodeling semantics entailment regime for OWL 2 QL ontologies G Cima, G De Giacomo, M Lenzerini, A Poggi Seventh International Conference on Web Intelligence, Mining and Semantics …, 2017 Citations: 23
Semantic Characterization of Data Services through Ontologies G Cima, M Lenzerini, A Poggi Twenty-Eighth International Joint Conference on Artificial Intelligence …, 2019 Citations: 22
Preliminary Results on Ontology-based Open Data Publishing G Cima Thirtieth International Workshop on Description Logics (DL 2017) 1879, 2017 Citations: 21
Answering Conjunctive Queries with Inequalities in DL-Lite ℛ G Cima, M Lenzerini, A Poggi Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020) 34 (03 …, 2020 Citations: 15
Monotone Abstractions in Ontology-based Data Management G Cima, M Console, M Lenzerini, A Poggi Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022) 36 (5 …, 2022 Citations: 14
Abstraction in Data Integration G Cima, M Console, M Lenzerini, A Poggi Thirty-Sixth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS …, 2021 Citations: 14
Controlled Query Evaluation over Prioritized Ontologies with Expressive Data Protection Policies G Cima, D Lembo, L Marconi, R Rosati, DF Savo Twentieth International Semantic Web Conference (ISWC 2021) 12922, 374-391, 2021 Citations: 13
Analysis of relationship between training load and recovery status in adult soccer players: a machine learning approach M Mandorino, AJ Figueiredo, G Cima, A Tessitore International Journal of Computer Science in Sport 21 (2), 1-16, 2022 Citations: 12
Query Definability and Its Approximations in Ontology-based Data Management G Cima, F Croce, M Lenzerini Thirtieth ACM International Conference on Information and Knowledge …, 2021 Citations: 12
LACE: A Logical Approach to Collective Entity Resolution M Bienvenu, G Cima, V Gutiérrez-Basulto Forty-First ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database …, 2022 Citations: 11
Non-Monotonic Ontology-based Abstractions of Data Services G Cima, M Lenzerini, A Poggi Seventeenth International Conference on Principles of Knowledge …, 2020 Citations: 11
Abstraction in Ontology-Based Data Management G Cima IOS Press, FAIA series, 2022 Citations: 10
Controlled Query Evaluation in OWL 2 QL: A “Longest Honeymoon” Approach P Bonatti, G Cima, D Lembo, L Marconi, R Rosati, L Sauro, DF Savo Twenty-First International Semantic Web Conference (ISWC 2022) 13489, 428-444, 2022 Citations: 10
Answering Conjunctive Queries with Safe Negation and Inequalities over RDFS Knowledge Bases G Cima, M Console, RM Delfino, M Lenzerini, A Poggi Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2025) 39 (14 …, 2025 Citations: 9
The notion of abstraction in ontology-based data management G Cima, A Poggi, M Lenzerini Artificial Intelligence 323, 103976, 2023 Citations: 9
Ontology-based explanation of classifiers F Croce, G Cima, M Lenzerini, T Catarci Second International Workshop on Processing Information Ethically (PIE 2020 …, 2020 Citations: 8