Updating and Expanding a Retro-digitised Dictionary: Some Insights from the Dictionary of the Irish Language

presenter: Sharon J Arbuthnot, Queen’s University
published: July 27, 2018, recorded: July 2018




The Dictionary of the Irish Language (DIL), which covers the language from the earliest evidence up to around 1650, was originally published in hard copy between 1913 and 1976. In 2007, the entire contents of the dictionary were digitised, tagged in TEI-conformant XML and published online free of charge. The resultant web resource (www.dil.ie) currently receives an average of around 18,000 unique visitors and 200,000 page downloads each month. Since 2007, DIL has been the focus of two small-scale research projects, which have aimed (a) to expand the dictionary by incorporating words, senses, idiomatic usages, and diagnostic and other evidence which came to light after the original text was published, and (b) to improve the integrity of the contents by correcting errors of interpretation, inconsistencies and questionable decisions regarding orthography and presentation in the original text. A partially revised version of DIL was released online in 2013 and a further set of revisions will be incorporated in 2019. In all, about 10,000 separate corrections and additions will have been made in the course of the two projects, affecting around 20% of the entries contained in the original dictionary.

As might be expected, the work of updating material has presented more complex challenges than that of drafting entirely new entries or inserting new citations, senses, and the like. Constrained by issues of finance and time, and with only two full-time members of staff, the project teams have had to work with (and sometimes around) the original text and to devise strategies for dealing with entries which are problematic in themselves but also tightly bound into the larger dictionary through cross-references. Deleting or emending such material can necessitate a series of secondary changes and lead to a situation where scholarly literature of the last century makes reference to DIL entries which no longer exist.
Based on experience, and on a working method of tackling problematic entries case by case, this paper takes a close look at some of the issues which arose when the original dictionary was found to be potentially misleading or in error, and outlines the solutions adopted in the revised version. Topics covered include: whether ghost-words originating within the dictionary or in published texts of manuscript materials ought to be deleted; whether it is advisable to standardise headwords, even if examples are attested only in inflected or case forms; and how to treat material which was tentatively placed under two or more different headwords in the original dictionary and is still not properly understood. Steps taken to emend definitions which are now either factually or culturally obsolete will also be discussed, and there will be some consideration of the more general complexities associated with revising a paper-born dictionary in which entries have little formal or consistent structure.

See Also:

Slides: euralex2018_arbuthnot_irish_01.pdf (2.8 MB)
