SOAS Research Online

A Free Database of the Latest Research by SOAS Academics and PhD Students


Lugli, Ligeia (2019) 'Smart lexicography for low-resource languages: lessons learned from Sanskrit and Tibetan.' In: Kosem, Iztok, Zingano Kuhn, Tanara, Correia, Margarita, Ferreira, José Pedro, Jansen, Maarten, Pereira, Isabel, Kallas, Jelena, Jakubíček, Miloš, Krek, Simon and Tiberius, Carole (eds.), Electronic lexicography in the 21st century: Smart lexicography. Brno: Lexical Computing CZ, pp. 198-212.

Text - Published Version
Available under License Creative Commons Attribution-ShareAlike 4.0 (CC-BY-SA 4.0).



Traditional lexicography requires titanic efforts and enormous resources. For many languages, such resources have never been available. As a result, they have received only limited lexicographic coverage. Today, these languages can take advantage of many of the same digital tools and strategies that have simplified and expedited dictionary-making for mainstream languages. However, the resource gap remains evident even in the digital era, with basic corpus processing tasks that lie at the foundation of contemporary ‘smart lexicography’ still constituting a challenge for many under-resourced languages. Drawing on my own experience in Sanskrit and Tibetan lexicography, this paper aims to offer some guidance as to the advantages and limitations of the application of smart lexicography to under-resourced languages. In particular, this paper suggests that in order to optimize resources, it may be advisable to prioritize high-quality lexical annotation of the corpus over highly curated dictionary entries, and to let digital tools take care of the lexicographic representation of the annotated linguistic information.

Item Type: Book Chapters
Additional Information: Proceedings of the eLex 2019 conference (Sintra, Portugal, 1–3 October 2019)
Keywords: automated lexicography; GDEX; Buddhist Hybrid Sanskrit; Tibetan
SOAS Departments & Centres: Regional Centres and Institutes > Centre of South East Asian Studies
ISSN: 2533-5626
Copyright Statement: This work is licensed under the Creative Commons Attribution ShareAlike 4.0 International License.
Date Deposited: 04 Nov 2019 10:00
Funders: Arts and Humanities Research Council
