T-PAS

Typed Predicate Argument Structures for Italian

T-PAS is a linguistic resource of Typed Predicate Argument Structures (t-pass) for Italian, acquired from corpora by manual clustering of distributional information about Italian verbs.

T-pass are corpus-derived verb patterns with specification of the expected semantic type (ST) for each argument slot. T-pass are semantically motivated. We discover the most salient verbal patterns using a lexicographic procedure called Corpus Pattern Analysis (CPA, Hanks 2004), which relies on the analysis of co-occurrence statistics of syntactic slots in concrete examples found in corpora.

The example below reports three t-pass of the verb divorare (Eng. 'to devour') and their sense description.

The semantic type (ST) labels used in the T-PAS resource are semantic classes discovered by generalizing over sets of lexical items found in the argument positions in the corpus. In the example below, [[Animate]] generalizes over e.g. belve, marinaio, mozzo (Eng. 'beasts', 'sailor', 'cabin boy') and [[Food]] generalizes over e.g. briciole, focacce (Eng. 'crumbs', 'focaccia bread').
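The relation between lexical sets and semantic types can be sketched in a few lines of Python. This is only an illustration, not the format in which T-PAS is distributed: the TPas class, the slot names "subject" and "object", and the assignment of [[Animate]] and [[Food]] to those slots are assumptions made for the example. Only the lexical items and the two ST labels come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Lexical sets quoted in the text; storing them as a dict is an assumption.
LEXICAL_SETS = {
    "Animate": {"belve", "marinaio", "mozzo"},
    "Food": {"briciole", "focacce"},
}


@dataclass
class TPas:
    """Hypothetical container for one typed predicate argument structure."""
    verb: str
    slots: Dict[str, str]  # slot name -> semantic type label


def semantic_type_of(word: Optional[str]) -> Optional[str]:
    """Return the ST whose lexical set contains the word, or None."""
    for st, words in LEXICAL_SETS.items():
        if word in words:
            return st
    return None


def matches(tpas: TPas, fillers: Dict[str, str]) -> bool:
    """True if every slot filler belongs to the lexical set of the slot's ST."""
    return all(
        semantic_type_of(fillers.get(slot)) == st
        for slot, st in tpas.slots.items()
    )


# Assumed slot assignment, for illustration only.
pattern = TPas("divorare", {"subject": "Animate", "object": "Food"})
print(matches(pattern, {"subject": "belve", "object": "briciole"}))
print(matches(pattern, {"subject": "briciole", "object": "belve"}))
```

In the resource itself the lexical sets are observed in corpus examples rather than listed by hand, so a lookup table like this one stands in for the result of that manual generalization.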

STs are drawn from an inventory organized in a hierarchy; they are language-driven, and reflect how we predicate about entities in the world. Despite the obvious correlations, they differ from categories of entities defined on the basis of ontological axioms.
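A hierarchical inventory of this kind can be pictured as a child-to-parent mapping with a subsumption check. The sketch below is a toy fragment written for illustration: apart from [[Animate]] and [[Food]], the labels and all the parent links are assumptions and do not reproduce the actual T-PAS inventory.

```python
# Toy fragment of an ST hierarchy: child label -> parent label.
# The links are illustrative assumptions, not the T-PAS inventory.
PARENT = {
    "Human": "Animate",
    "Animal": "Animate",
    "Animate": "Entity",
    "Food": "Entity",
}


def ancestors(st):
    """Return the chain of labels above st, nearest parent first."""
    chain = []
    while st in PARENT:
        st = PARENT[st]
        chain.append(st)
    return chain


def subsumes(general, specific):
    """True if general is the same label as specific or lies above it."""
    return general == specific or general in ancestors(specific)


print(subsumes("Animate", "Human"))
print(subsumes("Food", "Human"))
```

A check like this would let a slot typed with a general label accept fillers tagged with any of its more specific descendants.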

T-pass are sense-stable objects, i.e. expressions where all the words are disambiguated; they provide the exact context carrying the relevant information for word senses. This has important consequences for the use of T-PAS in NLP tasks.