Papers by Gabriella Pardelli
"Biblos": Historical, Philosophical and Philological Digital Library of the Italian National Research Council
This paper illustrates methods and tools to study the diachronic development of research topics in the TEI community across the years. For this purpose, technologies for automatic terminology extraction from domain corpora were exploited.

Open Grey for Natural Language Processing: A ride on the network
The aim of this paper is to introduce the Open Access movement for Natural Language Processing (NLP) by means of a wide range of open access Grey Literature documentation available on the web. In 2008 Robert Dale, in the last issue of volume 35 of Computational Linguistics, said: "There are a number of definitions of the term 'open access' in circulation, but almost all share the key principle that scientific literature should be freely available for all to read, download, copy, distribute, and use (with appropriate attribution) without restriction". At first glance it might seem that the Open Access movement has gradually become more influential in the field of language technology by building repositories accessible through the network. Today's digital archives are niches of intellectual production spread by means of a wide range of documents (such as journal articles and proceedings) which, paradoxically, the search engines do not always reach. The use of inap...

Marine Planning and Service Platform (MAPS) An Advanced Research Engine for Grey Literature in Marine Science
The MAPS (Marine Planning and Service Platform) project is a development of the Marine project (Ricerca Industriale e Sviluppo Sperimentale Regione Liguria 2007-2013) aiming at building a computer platform for supporting a Marine Information and Knowledge System, as part of the data management activities. One of the main objectives of the project is to develop a repository that gathers, classifies and structures marine scientific literature and data, thus guaranteeing their accessibility to researchers and institutions by means of standard protocols. We will present the scenario of Operative Oceanography together with the technologies used to develop an advanced search engine which aims at providing rapid and efficient access to a Digital Library of oceanographic data. The case study also highlights how the retrieval of grey literature from this specific marine community could be reproduced for similar communities as well, thus revealing the great impact that the processin...
Providing access to Grey Literature: the CLARIN infrastructure
The idea is to use the VLO for inquiring about the quantity of grey literature which can actually be found in CLARIN. A mapping has been performed between the terminological resources contained in the GreyNet International 1992-2017 website (in particular in the GreySource Index > Document Types in Grey Literature) and those retrievable from the VLO. This work will provide a map of the documentation archived in the CLARIN infrastructure, whose purpose is to share language resources that are produced and managed in the various European countries and then merged into the CLARIN data centers, allowing access, interoperability, reuse and preservation of scientific documentation as well as Grey Literature.
A terminological "journey" in the Grey Literature domain
This article presents the first comparative study of four years of Italian conferences in the fields of Digital Humanities and Computational Linguistics. Specifically, a corpus was built from the contributions presented between 2014 and 2017 at the AIUCD and CLiC-it conferences, and a multidimensional analysis was applied to it, taking into account: (i) the study of collaborations between authors using social network analysis techniques, (ii) the automatic extraction of terminology and information, and (iii) the examination of citation practices. Combining both qualitative and quantitative methods of investigation, this work aims to shed light on convergences and discrepancies between two research areas that historically share common origins.

This work analyses a corpus made of the titles of research projects belonging to the last four European Commission Framework Programmes (FP4, FP5, FP6, FP7) during a time span of nearly two decades (1994-2012). The starting point is the idea of creating a corpus of titles which would constitute a terminological niche, a sort of "cluster map" offering an overall vision of the terms used and the links between them. Moreover, by performing a terminological comparison over a period of time it is possible to trace the presence of obsolete words in outdated research areas as well as of neologisms in the most recent fields. Within this scenario, the minimal purpose is to build a corpus of titles of European projects belonging to the various Framework Programmes in order to obtain a terminological mapping of relevant words in the various research areas: particularly significant would be those terms spread across different domains or those closely tied to a specific domain. A term could ac...

After 8 years we revisit the LRE Map of Language Resources, introduced at LREC 2010, to try to get a picture of the field and its evolution as reflected by the creation and use of Language Resources. The purpose of the Map was in fact "to shed light on the vast amount of resources that represent the background of the research presented at LREC". It also aimed at a "change of culture in the field, actively engaging each researcher in the documentation task about resources". The data analysed here have been provided by the authors of several conferences during the phase of submission of papers, and contain information about ca. 7500 resources. We analysed the LRE Map data from many different viewpoints and the paper reports on the global picture, on different trends emerging from the diachronic perspective and finally on some comparisons between the two major conferences present in the Map: LREC and COLING.
This paper describes a serialization of the LRE Map database according to the RDF model. Due to the peculiar nature of the LRE Map, many ontologies are necessary to model the map in RDF, including newly created and reused ontologies. The importance of having the LRE Map in RDF and its connections to other open resources is also addressed.
This proposal describes a new way to visualise resources in the LREMap, a community-built repository of language resource descriptions and uses. The LREMap is represented as a force-directed graph, where resources, papers and authors are nodes. The analysis of the visual representation of the underlying graph is used to study how the community gathers around LRs and how LRs are used in research.
This work presents an overview of the research presented at the LREC workshops over the years 1998-2016, with the aim of shedding light on the community represented by workshop participants in terms of country of origin, type of affiliation and gender. An effort has also been made to identify the major topics dealt with, as well as the terminological variations noticed in this time span. Data have been retrieved from the portal of the European Language Resources Association (ELRA), which organizes the conference, and the resulting corpus, made up of workshop titles and of the related presentations, has then been processed using a term extraction tool developed at ILC-CNR.

The LRE Map: what does it tell us about the last decade of our field?
Language Resources and Evaluation, 2021
The LRE Map of Language Resources was introduced at LREC 2010. Its intended purpose was "to shed light on the vast amount of resources that represent the background of the research presented at LREC". It also aimed at a change of culture in the field, actively engaging each researcher both in the documentation task about resources and in sharing resources. When we started to use it regularly at other conferences as well, it became clear that it was an innovative instrument able to provide a picture of the field and its evolution as reflected by the creation and use of Language Resources. After 9 years we revisit the Map, considerably extending the data analysed in an LREC 2018 paper. The LRE Map data analysed here have been provided by the authors of 21 conferences during the phase of submission of papers, and contain information about 9405 resources. We analyse the LRE Map data from many different viewpoints and the paper reports on the global picture, along the many Map dimensions, on different trends emerging from a diachronic perspective and finally on some comparisons between five editions of the two major conferences present in the Map: LREC and COLING.
La littérature grise des projets de recherche européens [Grey literature from European research projects]
I2D - Information, données & documents, 2015
Scientific projects funded by the European Commission produce grey literature. A study carried out in 2013 on 226 CNR projects of the 7th Framework Programme (2007-2013) analysed the typology, format and availability of the documents listed on the Cordis server (research reports and scientific articles) and on the project websites (containing partner lists, brochures, press releases,... Consiglio Nazionale delle Ricerche CNR, ILC, Italy

Proceedings of the Second Italian Conference on Computational Linguistics CLiC-it 2015
This paper aims to provide a first snapshot of Italian Language Resources (LRs) and their uses by the community, as documented by the papers presented at two different conferences, LREC2014 and CLiC-it 2014. The data of the former were drawn from the LOD version of the LRE Map, while those of the latter come from manually analyzing the proceedings. The results are presented and analysed in the form of visual graphs and confirm the initial hypothesis that Italian LRs require concrete actions to enhance their visibility.
TAL Bibliography (1951-2002). Parte I

SCIRES-IT : SCIentific RESearch and Information Technology, Mar 20, 2013
Since the end of the Second World War, linguistic research has evolved and expanded very rapidly, thanks in part to newly introduced methods of analysis, such as the use of statistical or quantitative methods in the study of languages and literary works. New fields of application emerged, linguistics met other sciences, and interdisciplinarity was practised more and more until it became a necessity. The introduction of automated systems into linguistic analysis gave rise to Computational Linguistics (CL), which connected the study of language with the aid of the electronic computer. From the late 1940s to the early 1960s, the uses of electronic computing for processing linguistic data developed along two main lines: i) the electronic indexing of texts, which gave impetus to computational lexicography, begun by Father Roberto Busa in 1951 with the compilation of the concordances of Thomas Aquinas; ii) attempts at machine translation (MT), begun by Weaver in 1949 with the publication of the memorandum "Translation". Machine translation immediately became the core and driving force of Computational Linguistics, using the computer to transfer a text from one natural language to another. This article summarises the rapid course of CL and the need to quickly develop a terminology suited to the newborn discipline. It also provides pointers for retrieving documentation in the field. The appendix gives a tabular representation (Tables 1, 2 and 3) of the terms extracted from the titles of the papers of the International Conferences on Computational Linguistics (1965-2010), which shows the relevance of the topics characteristic of this disciplinary field.
The system used to process these data is available at the Istituto di Linguistica Computazionale "A. Zampolli" of the CNR in Pisa.