Research

Research at the ItaliaNLP Lab focuses on the development of computational models, linguistic resources and AI technologies for Natural Language Processing, with a particular emphasis on the Italian language.

Our work combines linguistically grounded approaches with machine learning and neural language models to analyze, understand and generate natural language. A key goal of the laboratory is to transform large collections of texts into structured linguistic and semantic knowledge, while developing computational models that capture language variation, linguistic complexity and text quality.

Our research spans the following main areas:

Linguistic Analysis

We develop computational methods and tools for the automatic linguistic analysis of texts across multiple levels of representation, including morpho-syntactic, syntactic and semantic processing. Research in this area includes the development of NLP pipelines, the creation of annotated corpora and benchmark datasets, and techniques for adapting NLP systems to domain-specific and non-canonical language varieties.
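As a toy illustration of the multi-level annotation such pipelines and corpora provide, the sketch below reads a sentence in CoNLL-U, the tab-separated format used by many annotated corpora (including Universal Dependencies treebanks), into per-token records carrying lemma, part-of-speech and dependency information. The Italian example sentence and its annotation are invented here for illustration, not drawn from any lab resource.

```python
# Toy sketch: reading one CoNLL-U annotated sentence into token records.
# The fragment below is a hand-annotated invented example; real corpora
# contain many sentences and additional fields (features, misc, etc.).

CONLLU = """\
1\tMaria\tMaria\tPROPN\t_\t_\t2\tnsubj\t_\t_
2\tlegge\tleggere\tVERB\t_\t_\t0\troot\t_\t_
3\tun\tuno\tDET\t_\t_\t4\tdet\t_\t_
4\tlibro\tlibro\tNOUN\t_\t_\t2\tobj\t_\t_
"""

def parse_conllu(block: str):
    """Return a list of dicts with the morpho-syntactic and syntactic
    fields: surface form, lemma, UPOS tag, dependency head and relation."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "head": int(cols[6]),   # 0 marks the syntactic root
            "deprel": cols[7],
        })
    return tokens

tokens = parse_conllu(CONLLU)
print([(t["form"], t["upos"], t["deprel"]) for t in tokens])
```

Layered records like these are the common currency between pipeline stages: a tagger fills the lemma and UPOS columns, a parser the head and relation columns.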

Knowledge Extraction

We study methods for extracting and structuring knowledge from large collections of textual data. Our work includes the identification of entities, terminology and complex expressions, the extraction of events and relations, and the development of techniques for semantic annotation, knowledge organization and knowledge graph construction.
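At its simplest, knowledge graph construction means organizing extracted (subject, relation, object) triples so they can be queried. The minimal sketch below does exactly that with an adjacency map; the triples themselves are invented examples standing in for the output of an extraction pipeline.

```python
from collections import defaultdict

# Toy sketch: organizing extracted relations as a tiny knowledge graph.
# These triples are invented illustrations; in practice they would be
# produced by entity and relation extraction over a text collection.

triples = [
    ("Dante Alighieri", "author_of", "Divina Commedia"),
    ("Dante Alighieri", "born_in", "Firenze"),
    ("Divina Commedia", "written_in", "Italian"),
]

# Index triples by subject for simple outgoing-edge lookups.
graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))

def query(subject: str, relation: str):
    """Return every object linked to `subject` by `relation`."""
    return [obj for rel, obj in graph[subject] if rel == relation]

print(query("Dante Alighieri", "author_of"))
```

Real knowledge graphs add entity disambiguation, typed schemas and provenance on top, but the triple remains the basic unit of structured knowledge.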

Linguistic Profiling and Language Variation

We investigate computational models for analyzing language variation, linguistic complexity and text quality across domains, genres and sociolinguistic contexts. Research includes linguistic profiling of texts, genre and register identification, the study of dialectal and learner language variation, and the development of readability assessment and text simplification methods aimed at improving accessibility and supporting educational applications.
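A classical baseline for readability assessment in Italian is the Gulpease index, computed as 89 + (300 × sentences − 10 × letters) / words; higher scores indicate easier text. The sketch below implements the formula with a deliberately naive sentence and word splitter, so its scores are indicative only; modern approaches instead combine many linguistic features or neural models.

```python
import re

def gulpease(text: str) -> float:
    """Gulpease readability index for Italian text:
    89 + (300 * sentences - 10 * letters) / words.
    Higher scores mean easier text; very short texts can score above 100.
    Sentence and word splitting here is naive, for illustration only."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"\w+", text)          # \w matches accented letters too
    letters = sum(len(w) for w in words)
    return 89 + (300 * sentences - 10 * letters) / max(1, len(words))

# Short, simple sentence -> high (easy) score.
print(round(gulpease("Il gatto dorme sul divano."), 1))
```

Formula-based indices like this are useful reference points, but they ignore lexical and syntactic factors, which is precisely where feature-based linguistic profiling goes further.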

Natural Language Generation

We develop computational approaches to automatic text generation using neural and large language models. Research in this area includes controllable text generation, methods for guiding the linguistic properties and communicative goals of generated texts, and techniques for producing texts tailored to specific users, contexts or tasks.

Evaluation of Language Models

We design methodologies and benchmarks for the evaluation and analysis of neural and large language models. Research focuses on assessing their linguistic competence, robustness and reliability, as well as on understanding the linguistic knowledge and representations learned by neural models. Our work also investigates explainability and interpretability of large language models, with the goal of making their behavior more transparent.

Human-Centered NLP

We develop human-centered language technologies that support interaction between people and AI systems. Research includes NLP for education and accessibility, the analysis of narrative and experiential texts, and the development of trustworthy, explainable and socially aware language technologies.