Evaluation of part of speech tagging in Uzbek language: problems and proposals

Authors

Keywords:

Tag, markup, annotation, tagset, NLP, corpus, CLAWS

Abstract

In building a language corpus, constructing the linguistic database is
a central concern because it is both complex and important. Assigning
appropriate identifiers to speech fragments in corpus texts is
problematic, since language modeling depends on the tagging rules and
patterns of the language in question. Tagging, especially grammatical or
part-of-speech (PoS) tagging, is also a topical issue for Uzbek corpus
linguistics, because a special “encoded” symbol system serves as the
primary key to solving NLP problems related to the Uzbek language. The
article reviews studies of tagging and PoS tagging in world linguistics
and considers the current tagging process in Uzbek linguistics. Based on
the rules of the Uzbek language, it proposes an alternative tagset that
takes widely used international tagsets into consideration.

Published

2023-01-03

How to Cite

Элов, Б., Hamroyeva, S., Abdullayeva, O., & Uzoqova, M. (2023). Evaluation of part of speech tagging in uzbek language: problems and proposals. Uzbekistan Language and Culture, 5(2), 51–68. Retrieved from https://aphil.tsuull.uz/index.php/language-and-culture/article/view/33