4th Workshop on Linked Data in Linguistics (LDL-2015):

Resources and Applications

Co-located with ACL-IJCNLP 2015
Submission is now open

The substantial growth in the quantity, diversity and complexity of linguistic data accessible on the Web has led to many new and interesting research areas in natural language processing and linguistics. However, the lack of interoperability between linguistic and language resources remains a major challenge, in particular if information from heterogeneous sources such as machine-readable dictionaries, corpus data and terminological repositories is to be combined. We particularly encourage contributions discussing the application of the Linked Open Data paradigm to linguistic data, as it provides an important step towards making linguistic data interoperable, uniformly queryable and sharable over the web using existing standards such as HTTP and RDF.

The Linked Data in Linguistics workshop series provides a forum to discuss the creation of resources on the web using linked data principles, as well as issues of interoperability, distribution protocols, access and integration of language resources and natural language processing pipelines developed on this basis. Building on the development of a Linked Open Data (sub-)cloud of linguistic resources that accompanied the three preceding workshops, LDL-2015 will specifically welcome papers addressing the use of Linked Data and related technologies in Natural Language Processing and related disciplines (e.g., Digital Humanities). Our workshop pursues the following goals:

  1. Provide researchers in Natural Language Processing and the Semantic Web with a platform to present and discuss how to exploit linguistic resources on the web in content analytics systems handling text. The key challenge here is the application of linked data principles and the use of semantic technologies to enable NLP pipelines that are efficient, scalable and portable. In particular, the semantic technologies used should be able to handle the increasing scale of big data being produced and utilized for a large number of tasks, including high-quality machine translation, opinion analysis and content management.
  2. Support the publishing and linking of mono- and multilingual linguistic and knowledge data collections, including corpora, grammars, dictionaries, wordnets, translation memories, domain-specific ontologies and so forth. In addition to research papers, we thus invite dataset descriptions, following the example of journals such as the Semantic Web Journal, to allow researchers to present clear and complete descriptions of resources. These are evaluated according to the quality and usefulness of the dataset rather than the novelty of the methodology.

Organized by the interdisciplinary Open Linguistics Working Group (OWLG), the LDL workshop series has already been successful at attracting researchers from a wide range of disciplines, including not only computational linguistics and Natural Language Processing, but also the Semantic Web, linguistic typology, corpus linguistics, terminology and lexicography. In 2015, we plan to increase the involvement of the LIDER project and the LD4LT community group, to build on their efforts to facilitate the use of linked data and language resources for commercial applications, and to continue LIDER's successful series of roadmapping workshops in engagement with enterprise.

Background and History

The workshop continues a series of workshops on the application of the Linked Data paradigm to linguistic data that has been initiated and organized by the Open Linguistics Working Group of the Open Knowledge Foundation (OWLG). The First Workshop on Linked Data in Linguistics (LDL-2012) was held in March 2012 at the University of Frankfurt am Main, Germany, co-located with the 34th Annual Meeting of the German Linguistics Society (DGfS-2012). The Workshop on Multilingual Linked Open Data for Enterprises (MLODE-2012) was held in September 2012 at the University of Leipzig, Germany, co-located with the 3rd Conference on Software Agents and Services for Business, Research and E-Science (SABRE-2012). The Second Workshop on Linked Data in Linguistics (LDL-2013) was held in September 2013 at CNR in Pisa, Italy, co-located with the 6th International Conference on the Generative Lexicon (GL2013). The third edition of the workshop (LDL-2014) was held at the 9th Language Resources and Evaluation Conference (LREC-2014) in Reykjavik, Iceland. Finally, the second workshop on Multilingual Linked Open Data for Enterprises (MLODE-2014) was co-located with the SEMANTiCS 2014 Conference in Leipzig, Germany.


This workshop is supported by two EU projects. The first is LIDER (Linked Data as an enabler of cross-media and multilingual content analytics for enterprises across Europe), which aims to provide an ecosystem for the establishment of linguistic linked open data, as well as media resources metadata, for a free and open exploitation of such resources in multilingual, cross-media content analytics across Europe. The second is QTLeap (Quality Translation with Deep Language Engineering Approaches), which explores novel ways of attaining higher-quality machine translation that are opened up by a new generation of increasingly sophisticated semantic datasets (including Linked Open Data) and by recent advances in deep language processing.

In addition to the Open Linguistics Working Group of the Open Knowledge Foundation (OWLG), several community groups are also directly supported by this workshop: the W3C Ontology-Lexica Community Group, the W3C Best Practices on Multilingual Linked Open Data Community Group and the W3C Linked Data for Language Technology Community Group.


      Christian Chiarcos (Goethe-Universität Frankfurt am Main, Germany)
      John Philip McCrae (Universität Bielefeld, Germany)
      Petya Osenova (Sofia University and IICT-BAS, Bulgaria)
      Philipp Cimiano (Universität Bielefeld, Germany)
      Nancy Ide (Vassar College, USA)