SOAS Research Online

A Free Database of the Latest Research by SOAS Academics and PhD Students

O’Neill, Alexander James and Hill, Nathan W. (2022) 'Text Recognition for Nepalese Manuscripts in Pracalit Script.' Journal of Open Humanities Data, 8 (26).

Text - Published Version
Available under License Creative Commons Attribution 4.0 (CC-BY 4.0).

Alternative Location: https://doi.org/10.5334/johd.90

Abstract

This dataset is a model for handwritten text recognition (HTR) of Sanskrit and Newar Nepalese manuscripts in Pracalit script. This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. It explains our methodology for developing the requisite ground truth, consisting of manuscript images and corresponding transcriptions, and for training our model with a PyLAia engine, and it discusses the model’s limitations. The dataset, shared on Zenodo, can be used by anyone working with manuscripts in Pracalit script, which will benefit the fields of Indology and Newar studies, as well as historical and linguistic analysis.

Item Type: Journal Article
Keywords: handwritten text recognition; PyLAia; Transkribus; Sanskrit; Newar; Manuscripts
SOAS Departments & Centres: Departments and Subunits > Department of East Asian Languages & Cultures
ISSN: 2059-481X
DOI (Digital Object Identifier): https://doi.org/10.5334/johd.90
Date Deposited: 07 Aug 2023 17:28
URI: https://eprints.soas.ac.uk/id/eprint/39996
Funders: Arts and Humanities Research Council
