SOAS Research Online

A Free Database of the Latest Research by SOAS Academics and PhD Students

Lugli, Ligeia (2019) 'Smart lexicography for low-resource languages: lessons learned from Sanskrit and Tibetan.' In: Kosem, Iztok and Zingano Kuhn, Tanara and Correia, Margarita and Ferreira, José Pedro and Jansen, Maarten and Pereira, Isabel and Kallas, Jelena and Jakubíček, Miloš and Krek, Simon and Tiberius, Carole, (eds.), Electronic lexicography in the 21st century: Smart lexicography. Brno: Lexical Computing CZ, pp. 198-212.

Text - Published Version
Available under License Creative Commons Attribution-ShareAlike 4.0 (CC-BY-SA 4.0).

Abstract

Traditional lexicography requires titanic efforts and enormous resources. For many languages, such resources have never been available. As a result, they have received only limited lexicographic coverage. Today, these languages can take advantage of many of the same digital tools and strategies that have simplified and expedited dictionary-making for mainstream languages. However, the resource gap remains evident even in the digital era, with basic corpus processing tasks that lie at the foundation of contemporary ‘smart lexicography’ still constituting a challenge for many under-resourced languages. Drawing on my own experience in Sanskrit and Tibetan lexicography, this paper aims to offer some guidance as to the advantages and limitations of the application of smart lexicography to under-resourced languages. In particular, this paper suggests that in order to optimize resources, it may be advisable to prioritize high-quality lexical annotation of the corpus over highly curated dictionary entries, and to let digital tools take care of the lexicographic representation of the annotated linguistic information.

Item Type: Book Chapters
Additional Information: Proceedings of the eLex 2019 conference (Sintra, Portugal, 1–3 October 2019)
Keywords: automated lexicography; GDEX; Buddhist Hybrid Sanskrit; Tibetan
SOAS Departments & Centres: Regional Centres and Institutes > Centre of South East Asian Studies
ISSN: 2533-5626
Copyright Statement: This work is licensed under the Creative Commons Attribution ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-sa/4.0/
Date Deposited: 04 Nov 2019 10:00
URI: https://eprints.soas.ac.uk/id/eprint/31849
Funders: Arts and Humanities Research Council
