The lack of accurate methods for predicting meaning representations of texts is the key bottleneck for many NLP applications such as question answering, text summarization and information retrieval. Such methods would, for example, enable users to pose questions to systems relying on knowledge contained in Web texts rather than in hand-crafted ontologies.
Although state-of-the-art semantic analyzers work fairly well on closed domains (e.g., interpreting natural language queries to databases), accurately predicting even shallow forms of semantic representations (e.g., frame-semantic parsing) for less restricted texts remains a challenge. The reason for this unsatisfactory performance is the reliance on supervised learning: the amount of annotation required for accurate open-domain parsing exceeds what is practically feasible.
In this project, we address this challenge by defining expressive statistical models which not only exploit annotated data but also leverage information contained in unannotated texts (e.g., on the Web). Rather than modeling sentences in isolation, we will model relations between facts, both within and across different texts, and also exploit links to facts present in knowledge bases. This ‘linked’ setting will allow us both to discover inference rules (i.e., to learn that one fact implies another) and to induce semantic representations more appropriate for applications requiring reasoning.
During the first 12 months of the project, we considered both unsupervised and supervised modeling, and primarily focused on developing crucial components of our methods that will be used throughout the project.
First, we developed an accurate and simple shallow semantic parser (i.e. a semantic role labeller). Given its simplicity and computational efficiency, it can be directly applied to large amounts of unannotated text, as required by the next stages of the project.
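To make the task concrete, the toy example below (purely illustrative, not the output of our actual system) shows the kind of predicate-argument structure a semantic role labeller produces; the role labels follow PropBank-style conventions.

```python
# Toy illustration of semantic role labeling (SRL) output.
# For the predicate "sold" (sense sell.01), the labeller identifies who did
# the selling, what was sold, and to whom.
sentence = "Mary sold the book to John"
predicate = "sold"            # predicate under analysis (sense: sell.01)
roles = {
    "Mary":     "A0",         # seller (agent)
    "the book": "A1",         # thing sold
    "to John":  "A2",         # buyer (recipient)
}

for span, role in roles.items():
    print(f"{role}: {span}")
```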
Second, we developed an effective method for linking mentions of the same concepts / referents across sentences in a text (co-reference resolution). This step is necessary for modeling relations between facts in a document and for training the model to 'read between the lines'.
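To illustrate the intended output of this component, the toy example below (again purely illustrative, not our actual system) shows how mentions referring to the same entity are grouped into clusters, so that facts stated in different sentences can be connected.

```python
# Toy illustration of co-reference resolution output: mentions that refer to
# the same real-world entity are grouped into clusters of character spans.
text = ("Acme Corp. released a new parser. "
        "The company claims it is the fastest available.")

clusters = [
    [(0, 10), (34, 45)],   # "Acme Corp." <-> "The company"
    [(20, 32), (53, 55)],  # "a new parser" <-> "it"
]

for cluster in clusters:
    print([text[start:end] for start, end in cluster])
```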
Additionally, we developed methods for integrating linguistic structure (e.g., semantic representations) into otherwise linguistically uninformed neural models. Specifically, we introduced a class of graph neural networks suitable for encoding semantic properties of sentences while relying on prior knowledge represented as labeled directed graphs. This method, or variations of it, will be used in the next stages of the project (including jointly modeling knowledge bases and text).
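To illustrate the general idea, the sketch below shows one possible formulation of a graph-convolutional update over a labeled directed graph (a minimal numpy sketch under simplifying assumptions of our own; the actual architecture, edge labels and parameterization used in the project differ in detail). Each word vector is updated with messages from its graph neighbours, using separate parameters for each edge label and direction.

```python
import numpy as np

def gcn_layer(node_states, edges, weights, biases):
    """One graph-convolution layer over a labeled directed graph.

    node_states: (n_nodes, dim) array with one vector per word.
    edges:       list of (source, target, label) triples; 'label' is a string
                 such as 'ARG0' or 'ARG0-inv' encoding edge label + direction.
    weights:     dict mapping each label to a (dim, dim) parameter matrix.
    biases:      dict mapping each label to a (dim,) bias vector.
    """
    new_states = np.zeros_like(node_states)
    for source, target, label in edges:
        # Message from the source word to the target word, transformed by
        # label- and direction-specific parameters.
        new_states[target] += node_states[source] @ weights[label] + biases[label]
    # Self-loops keep each word's own representation in the update.
    new_states += node_states @ weights["self"] + biases["self"]
    return np.maximum(new_states, 0.0)  # ReLU non-linearity

# Toy usage: 3 words, one labeled edge in each direction, plus self-loops.
dim = 4
rng = np.random.default_rng(0)
labels = ["ARG0", "ARG0-inv", "self"]
W = {label: rng.normal(scale=0.1, size=(dim, dim)) for label in labels}
b = {label: np.zeros(dim) for label in labels}
x = rng.normal(size=(3, dim))
edges = [(0, 1, "ARG0"), (1, 0, "ARG0-inv")]
print(gcn_layer(x, edges, W, b).shape)  # (3, 4)
```

Stacking several such layers lets information propagate between words that are several edges apart in the graph, which is what makes this kind of encoder attractive for sentence-level semantic tasks.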
Besides these novel contributions, we focused on preparing data and establishing benchmarks for evaluating semantic parsers.
The key areas where we achieved progress beyond the state of the art are the following:
1. The semantic role labeling model we introduced is simple and fast, and it surpasses comparable methods on standard benchmarks across multiple languages. Moreover, it substantially outperforms all previous methods in the arguably more realistic out-of-domain setting (i.e., when a model is tested on data sufficiently different from the data it was trained on). Semantic role labeling (SRL) is a crucial natural language processing (NLP) problem, and the availability of a simple multilingual method is an important step toward even broader use of SRL in NLP applications (and hence toward the creation of even more intelligent text processing tools).
2. Besides producing an effective co-reference system, our work in this direction has wider implications. In fact, the method we introduced (specifically the loss function) can potentially be used to improve any co-reference system, for any language.
3. Deep learning, and specifically recent classes of recurrent neural networks (LSTMs), has had a large impact on NLP. One of the remaining challenges is the lack of simple and effective methods for incorporating structured information (e.g., syntax or semantics) into such models. Our work on incorporating graph structure into neural models is an important step towards resolving this challenge.