Automatic Data and documents Analysis to enhance human-based processes – ADA.
A 2-year project (2018-2020) funded by Regione Toscana (Bando POR FESR 2014-2020) in collaboration with the IT companies Hyperborea s.r.l., Erre Quadro Engineering and NETRESULTS Srl, the company SUPEREVO Srl and the Multimedia Information Retrieval (MIR) group at the Institute of Information Science and Technologies (ISTI) of CNR Pisa.
The aim of this project is to develop a platform driving innovation in the production process within the framework of Industry 4.0. The platform will rely on technologies based on artificial intelligence and big data analysis. It will allow the collection, organization and smart retrieval of information from technical text and images at all stages of the production process. The main innovative functionalities provided by ADA will be: assisted document drafting, multimodal analysis including text and images, automatic extraction of information from technical documents, blockchain technology to secure certification processes, testing and predictive maintenance.
Personalizzazione di pERcorsi FORMativi Avanzati – PERFORMA.
A two-year project (2017-2019) funded by Regione Toscana (Progetti Congiunti di Alta Formazione – POR FSE 2014-2020 Asse A – Occupazione) in collaboration with the company Meta srl.
The project will develop innovative methodologies for the creation and personalization of e-learning courses by integrating NLP-based functionalities that assess the adequacy of educational materials with respect to the language skills of each learner profile and to the characteristics of different reading devices.
UBIquitous Massive Open Learning – UBIMOL.
A 2-year project (2017-2019) funded by Regione Toscana (Bando POR FESR 2014-2020) in collaboration with the companies M.E.T.A. Srl, 01Sistemi Srl, VIDITRUST Srl and PERSAFE Srl, and the CoLing Lab of the Department of Philology, Literature, and Linguistics (University of Pisa).
The project aims at developing an e-learning platform enriched with innovative technologies able to offer language courses personalized with respect to the level of language skills specific to each learner profile. Advanced Natural Language Processing techniques will enable learners to self-assess their developmental growth over time, both in terms of the new contents learned and of the written language competences acquired during the course.
ENgaging Content Object for Reuse and Exploitation of cultural resources – ENCORE.
A 24-month project (2017-2019) in collaboration with M.E.T.A. Srl company, F2 Glocal Innovation company and the Università degli Studi di Salerno.
The project will develop a system based on an innovative approach for the production, access and reuse of cultural resources offering users personalized narrative pathways to access cultural and touristic heritage contents.
NLP-based technologies for the educational domain.
A 1-year project (2016-2017) funded by CNR DUS.AD016.037 in collaboration with the INDIRE institute (National Institute for Documentation, Innovation and Research on Education) of the Ministry of Education.
The project aims at exploiting state-of-the-art NLP tools and Information Extraction technologies to classify, organize and semantically index the content of different typologies of documents provided by the INDIRE institute which are relevant in the educational domain (such as projects of work-related learning, reports of newly recruited teachers, etc.).
Cultural Heritage Resources Orienting Multimodal Experience – CHROME.
A 36-month project (2017-2020) funded by the Italian Ministry of Education, University and Research (PRIN 2015), in collaboration with the Università degli Studi di NAPOLI “Federico II”, Università degli Studi Roma Tre, Università degli Studi di Salerno, Istituto di Scienze Applicate e Sistemi Intelligenti “Eduardo Caianiello” (CNR).
The main output of the CHROME project will be a methodology to collect, represent and analyse multimodal cultural heritage contents and present them through artificial agents whose behaviour is inspired by accurate analysis of expert guides, museum curators and tour operators. Jointly carried out by humanists and computer scientists, the project will make it possible to model the behaviour that gatekeepers adopt when presenting cultural heritage. Such a model will be used to control a humanoid robot designed to follow similar presentation strategies.
Voci della Grande Guerra.
An 18-month project (2016-2018) funded by Presidenza Consiglio dei Ministri in the framework of the First World War Centenary events, in collaboration with the CoPhiLab of the Institute for Computational Linguistics “A. Zampolli” (ILC), the CoLing Lab of the Department of Philology, Literature, and Linguistics (University of Pisa), Accademia della Crusca and the Interuniversity Center for Historical-Military Research (University of Siena).
The project aims at building a corpus of different types of documents (letters, war bulletins, journals, diaries) to investigate how Italian people perceived and narrated the First World War and how this war contributed to changing the Italian language.
The project includes: i) the digitization of the corpus; ii) the development and application of NLP-based modules for event extraction and georeferencing of the war locations; iii) the design and development of the Web search interface.
ItaliaNLP-WAFI. Cyber Intelligence.
A 2-year project (2016-2018) funded by the Institute of Informatics and Telematics (IIT) in collaboration with Web Application for the Future Internet (WAFI) Laboratory at the Institute of Informatics and Telematics (IIT) of CNR Pisa.
The project aims at developing and adapting advanced Natural Language Processing (NLP) tools and techniques for automatic linguistic analysis and domain-knowledge extraction from social media texts in the field of Cyber Intelligence.
Recent past projects
Collaborative Research Agreement with M.E.T.A. srl company.
A 1-year agreement (2015-2016) with M.E.T.A. srl company aimed at developing NLP-based technologies for Educational Applications.
SCRIBE – Scritture Brevi, Semplificazione Linguistica, Inclusione Sociale: Modelli e Applicazioni. SCRIBE – Short Writings, Linguistic Simplification, Social Inclusion: Models and Applications.
A 3-year project (2013-2016) funded by the Italian Ministry of Education, University and Research (PRIN 2010FWM3B4 – Area 10). Project partners: Università di Tor Vergata, Università “L’Orientale” di NAPOLI, Università ROMA TRE, Università di MACERATA, Università di PISA, ILC-CNR.
The project aims at studying, from both a synchronic and a diachronic perspective, the production of concise and shortened messages, from its contemporary expressions (short writings used in e-mails, SMS and chats) to the other abbreviation strategies peculiar to Italian and dialectal graphic and linguistic systems. The goal of the ItaliaNLP Lab is to develop advanced computational linguistics methods for the analysis of these varieties of the Italian language.
iSLe – intelligent Semantic Liquid eBook
A 2-year project funded by Regione Toscana (POR CReO 2007 – 2013) in collaboration with IT companies (M.E.T.A SRL, 01Servizi SRL, VIDITRUST SRL, SPACE SPA).
The aim of the project is to develop an innovative software platform for digital educational publishing augmented with NLP-based functionalities for knowledge management and readability assessment.
L’amministrazione della giustizia in Italia: il caso della neurogenetica e delle neuroscienze, un approccio multidisciplinare. The Administration of Justice in Italy: the case of neurogenetics and neuroscience, a multidisciplinary approach.
Progetto Premiale MIUR. Prize Project MIUR.
A one-year CNR research project (2013-2014) funded by the Italian Ministry of Education, University and Research (MIUR) and proposed by Istituto di Studi Giuridici Internazionali (ISGI), Istituto di Linguistica Computazionale «Antonio Zampolli» (ILC), Istituto di Ricerche sulla Popolazione e le Politiche Sociali (IRPPS) and Istituto di Scienze e Tecnologie della Cognizione (ISTC).
The project addresses a focused set of closely-related problems at the intersection of neuroscience and criminal justice from a multidisciplinary perspective: in particular, it aims at assessing whether, and if so how, neuroscientific evidence is starting to be admitted and evaluated in individual cases in Italy. The goal of the ItaliaNLP Lab within the project is to develop advanced technologies for the textual and linguistic analysis of cases with the final aim of building a terminological resource of Neuroscience in Law.
Legal Text Mining: costruzione di reti semantico-concettuali finalizzate a una navigazione intelligente di corpora di testi giuridici (JURNET). Legal Text Mining: building semantic networks to support advanced queries in legal textual corpora (JURNET)
A 2-year project (2013-2014) funded by Regione Toscana (POR CRO FSE 2007-2013 Asse IV – Capitale Umano). Project partners: ILC-CNR, Istituto di Teoria e Tecniche dell’Informazione Giuridica (ITTIG-CNR), Scuola Superiore Sant’Anna (Pisa), European Center for Law, Science and New Technologies (ECLT) – Università degli Studi di Pavia.
The project aims at creating the prerequisites for advanced access to the knowledge contained in case law corpora: in particular, NLP methods and techniques will be used to build the semantic network of concepts and/or citations linking case law texts.
Collaboration ILC – Vodafone Omnitel N.V.
A one-year contract (2011-2012) aimed at developing and specializing NLP tools to be used within the Vodafone drafting platform Right&Clear for assessing text readability and supporting, whenever required, text simplification.
Paisà – Piattaforma per l’Apprendimento dell’Italiano Su corpora Annotati. Paisà – Platform for Corpus-Assisted Italian Language Learning
A 3-year project (2009-2012) funded by the Italian Ministry of Education, University and Research (Firb 2007), in collaboration with University of Bologna (Project Director Sergio Scalise), ILC-CNR, University of Trento and Eurac (Bolzano). The project has built a large, freely available, richly annotated corpus of Italian, and lexical databases that will be automatically acquired from it.
PORTALITA – Piattaforma di servizi integrati per l’accesso semantico e plurilingue ai contenuti culturali italiani nel web. PORTALITA – An integrated services platform for semantic and multilingual access to Italian cultural contents on the web
A 3-year project (2009-2012) funded by the Italian Ministry of Education, University and Research (Firb 2007, prot. RBNE07C4R9). Project partners: Dip. di Studi italianistici dell’Università di Pisa (coordinator); Dip. di Informatica dell’Università di Pisa; Dip. di Storia delle Arti dell’Università di Pisa; Dip. di Italianistica e Spettacolo dell’Università di Roma “La Sapienza”; Consorzio ICoN – Italian Culture on the Net; Direzione Generale per i Beni Librari e gli Istituti Culturali (DGBLIC); CAP s.p.a.
The project has built a web platform offering advanced tools for semantic and multilingual access to Italian cultural contents on the web, including NLP-based knowledge extraction and management tools.