\"We are living in an era where an unprecedented amount of data isavailable to companies, scientists, and individuals. Making use ofthis data is increasingly becoming crucial for economic success andfor scientific progress, and even to take informed decisions ineveryday life...
\"We are living in an era where an unprecedented amount of data is
available to companies, scientists, and individuals. Making use of
this data is increasingly becoming crucial for economic success and
for scientific progress, and even to take informed decisions in
everyday life. However, a large amount of data from many different
sources can be very difficult to handle and to interpret and, in
particular, such data will often be highly incomplete and very
heterogeneous in representation. How can we provide support for
dealing with this situation? Imagine that a computer program which
processes data has at its disposal a large amount of encyclopedic
domain knowledge, of the kind provided by Wikipedia for human
users. This is the purpose of an ontology and indeed ontologies have
proved to be very useful for supporting the processing of incomplete
and heterogeneous data. But the convenience comes at a price:
processing the data and the ontology at the same time is
computationally expensive. The CODA project sets out to provide an
ultimately fine-grained analysis of the computational costs of
ontology-mediated querying, identifying and studying \"\"islands of
tractability\"\", that is, classes of ontologies which can be processed
efficiently and have other desirable computational properties. The
overall aim of this approach is to enable ontology engineers to design
and customize ontologies that strike an ideal balance between the
knowledge provided and efficient processing, in this way developing
ontology-mediated querying to its full potential.\"
The work carried out in the first reporting phase has mainly
concentrated on fundamental questions of complexity and rewritability:
when can an ontology-mediated query (OMQ), constituted by an ontology
and an actual query, be answered in polynomial time (thus efficiently)
or in AC0 (thus efficiently in parallel) and related questions; when
can it be rewritten into a more traditional database query such as SQL
(represented by first-order logic or by linear datalog) and into
Datalog? In the first phase, we have provided elegant characterizations
for many relevant cases. We have also obtained many important results
on the computational complexity of the question whether a given OMQ
has one of the mentioned desirable properties, thus significantly
advancing our knowledge about several fundamental islands of
tractability.
So far, the knowledge about many islands of tractability in
ontology-based data access has been rather incomplete and
scattered. In the first phase of CODA, we have made a significant step
towards a more holistic understanding of such islands, in particular
those pertaining to tractability and rewritability. For ontologies
from the EL family of description logics (DLs), we have brought to the
picture also inverse roles and conjunctive queries (instead of only
atomic ones), resulting in a rather comprehensive knowledge of the
important island of FO-rewitability in the context of such DLs. For
ontologies from the ALC family, we have brought to the picture
conjunctive queries (instead of only atomic ones), which results in an
increase of expressive power from CSP to MMSNP. We have established
decidability and obtained results on the complexity of rewritability
into FO and monadic Datalog, and we have made first important
observations on rewritability into Datalog. On top of that, we have
studied OMQ containment as a more fundamental problem, proving
decidability and tight complexity bounds, and have analyzed the shape
of rewritings. In the case of quantified queries, we have pushed the
frontier beyond description logics by implementing a very fine-grained
study of ontologies formulated in the guarded fragment, obtaining a
host of results on islands of tractability defined in terms of complexity
and rewritability and winning the prestigious PODS best paper award in
2017. We have also pushed the state-of-the-art regarding query
inseparability, automatic termination of the chase, description logics
with relations of higher arity, and first-order logics of incomplete
information.