O‘zbek-ingliz tillarining teglangan parallel korpusini yaratish bosqichlari

Ботир Элов; Ma’rufjon Amirqulov

Authors

Ботир Элов
Ma’rufjon Amirqulov, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi. https://orcid.org/0000-0002-4025-8466

Keywords:

Corpus, tagging, alignment, parallel corpora, POS tagging, XML encoding.

Abstract

This article discusses corpus linguistics, one of the main branches of
computational linguistics, along with monolingual and parallel corpora. It
describes the stages of creating an Uzbek-English parallel corpus, drawing
on international experience with parallel corpora. It also covers priority
tasks such as establishing the software and linguistic principles of the
Uzbek-English parallel corpus, linguistic and extralinguistic tagging of the
selected units, and developing an algorithm for building the corpus. The
article considers how to select data for the parallel corpus, the
requirements that the data must meet, and the opportunities that the
Uzbek-English parallel corpus offers to researchers and users.

The article also addresses the linguistic and methodological problems that
arise in this process, such as material selection, as well as the
software-related difficulties of creating a parallel corpus.

Author Biography

Ma’rufjon Amirqulov, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi.

Second-year master's student in Computer Linguistics at Alisher Navoi Tashkent State University of Uzbek Language and Literature.

Steps of creating a tagged parallel corpus of the Uzbek-English languages

