Nilufar Abdurakhmonova | National University of Uzbekistan
Papers by Nilufar Abdurakhmonova
Data in brief, Apr 1, 2024
Lecture notes in networks and systems, 2024
DOAJ (DOAJ: Directory of Open Access Journals), Nov 30, 2021
Zenodo (CERN European Organization for Nuclear Research), Aug 15, 2023
Editorial board: Yakub Umar oglu, Doctor of Philology, Professor (Turkey); Almaz Ulvi Binnatova, Doctor of Philology, Professor (Azerbaijan); Bakieva Gulandom, Doctor of Philology, Professor (Uzbekistan); Minnullin Kim, Doctor of Philology, Professor (Tatarstan); Makhmudov Nizomiddin, Doctor of Philology, Professor (Uzbekistan); Kerimov Ismail, Doctor of Philology, Professor (Russia); Dzhuraev Mamatkul, Doctor of Philology, Professor (Uzbekistan); Kurenov Rakhymmamed, Candidate of Philology (Turkmenistan); Christopher James Fort, University of Michigan (USA); Umarkhadzhaev Mukhtar, Doctor of Philology, Professor (Uzbekistan); Mirzaev Ibodullo, Doctor of Philology, Professor (Uzbekistan); Baltabaev Khamidulla, Doctor of Philology, Professor (Uzbekistan); Dustmukhammedov Khurshid, Doctor of Philology, Professor (Uzbekistan); Likhodzievsky A.S., Doctor of Philology, Professor (Uzbekistan); Siddikova Iroda, Doctor of Philology, Professor (Uzbekistan); Shiukashvili Tamar, Doctor of Philology (Georgia); Yusupov Oybek, executive secretary, Associate Professor (Uzbekistan)
International journal of systems engineering, 2018
Building a corpus of Uzbek texts and developing linguistic databases and search-engine systems are among the crucial tasks of computational linguistics today. Electronic dictionary-thesauruses and semantic dictionaries are part of this work. The paper addresses the structure of a dictionary-thesaurus for Uzbek, the transfer of a terminological dictionary into electronic form, and rules for establishing semantic relations between words. Together these make it possible to automate the linguistic processing of dictionary-thesauruses, which are the foundation of linguistic databases. Analyzing the logical structure of paper-based dictionary-thesauruses made it possible to formalize that structure and to create rules, expressed in predicate language, for converting dictionary-thesaurus entries to an electronic version. A system of descriptors is proposed as a set of PROLOG rules for constructing the electronic dictionary entries.
2022 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON)
2020 International Conference on Information Science and Communications Technologies (ICISCT), 2020
This paper describes the development of the Turkic Morpheme web portal, a toolkit that takes into account core features of Turkic languages and meets the requirements for research activities in computational linguistics and typology. The portal was created on the basis of the structural-parametric functional model of the Turkic morpheme and contains special linguistic databases that describe the categories of Turkic languages at different levels: morphological, syntactic, and semantic. The portal can also be used in the educational process as a reference system for Turkic languages.
2022 7th International Conference on Computer Science and Engineering (UBMK)
The aim of the research is to analyze theories on forming the linguistic database of a program for translating simple texts from English into Uzbek, and to create the foundations of that program. The object of the research is word combinations, simple sentences, and grammatical expressions of English and Uzbek. The scientific novelty of the work is as follows: the conclusions drawn support translation accuracy when creating a linguistic database for machine translation; a linguistic database of phrasal verbs, a morphological lexicon, and affixes of English and Uzbek was created, together with their morphological and syntactic models; the coordination of simple-sentence models for automatic translation was identified; principles for the morphological and syntactic-semantic analysis of texts in machine translation were substantiated; and recommendations were worked out on coordinating paradigmatic relations when creating principles for an electronic dictionary and software for linguistic data...
2022 7th International Conference on Computer Science and Engineering (UBMK)
This article analyzes scholars' views on the creation of learner dictionaries. It examines theoretically the principles of creating such a dictionary: defining the lexical minimum, studying audience demand, and creating entry content with computer tools and corpus-based statistical analysis. The specificity, structure, goals, and objectives of these dictionaries are important for establishing the theoretical and methodological foundations of Uzbek educational lexicography, drawing on practical research in this field for Russian and English.
Proceedings of the 11th International Global Wordnet Conference (GWC-2021), 2021
The results reported in this paper aim to increase the presence of the Uzbek language on the Internet and its usability within IT applications. We describe the initial development of a “word-net” for the Uzbek language compatible with Princeton WordNet. We called it UZWORDNET. In the current version, UZWORDNET contains 28140 synsets, 64389 senses and 20683 words; its estimated accuracy is 75.98%. To the best of our knowledge, it is the largest wordnet for Uzbek existing to date, and the second wordnet developed overall.
The paper is devoted to the structure of the Uzbek Corpus and its linguistic annotation. Linguistic annotation, metadata, and a corpus manager, taken together as the formal-functional model of the corpus, are important for its use for many purposes. The platform allows users to address language and literature issues and can be used online. The Uzbek corpus, based on structural and subcorpus models that are partially presented in this paper, is being developed to advance Uzbek language technology.
2021 International Conference on Information Science and Communications Technologies (ICISCT)
Syntactic parsing is a crucial stage among the different types of parsing in the field of NLP. It helps identify the sentence type and the word combinations that represent grammatical relations between words. Although languages differ in their grammatical features, almost all languages follow common linguistic rules. Uzbek is an agglutinative language with free constituent order in syntax. Our investigations show that the morphological aspect of word forms plays an essential role in identifying and composing syntactic relations in Uzbek. Morphological and lexical information can also solve some of the problems connected with syntactic parsing. Our article presents the main points of the stages of parsing in CoNLL-U format, based on Uzbek corpus analysis. Uzbek version of the abstract (translated): In natural language processing, syntactic analysis is considered important among the various methods of analysis. Sintaktik analiz tilning g...
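As an illustration of the CoNLL-U format mentioned in the abstract above, here is a minimal Python sketch that reads one annotated sentence and lists its dependency relations. The ten column names follow the CoNLL-U format; the Uzbek sentence and its annotation are hypothetical, written for this example and not taken from the paper or its corpus. The reader skips comment lines and any row whose ID is not a plain integer (multiword-token ranges, empty nodes).

```python
# Column names of the ten tab-separated CoNLL-U fields.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]

# Hypothetical annotation of "Men kitob o'qidim." ("I read a book."),
# for illustration only; not taken from the paper's corpus.
SAMPLE = "\n".join([
    "# text = Men kitob o'qidim.",
    "1\tMen\tmen\tPRON\t_\tCase=Nom|Number=Sing|Person=1\t3\tnsubj\t_\t_",
    "2\tkitob\tkitob\tNOUN\t_\tCase=Nom|Number=Sing\t3\tobj\t_\t_",
    "3\to'qidim\to'qi\tVERB\t_\tNumber=Sing|Person=1|Tense=Past\t0\troot\t_\tSpaceAfter=No",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
])


def parse_feats(feats):
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_sentence(block):
    """Parse one CoNLL-U sentence block into a list of token dicts."""
    tokens = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line or sentence-level comment
        values = line.split("\t")
        if not values[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        token = dict(zip(FIELDS, values))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])
        token["feats"] = parse_feats(token["feats"])
        tokens.append(token)
    return tokens


def dependencies(tokens):
    """Return (dependent, relation, head) triples; head 0 is shown as ROOT."""
    forms = {t["id"]: t["form"] for t in tokens}
    return [(t["form"], t["deprel"], forms.get(t["head"], "ROOT"))
            for t in tokens]


tokens = parse_sentence(SAMPLE)
deps = dependencies(tokens)
for dep in deps:
    print(dep)
```

The morphological features sit in the same row as the syntactic head and relation, which is what lets a parser for an agglutinative language use case and person marking when deciding attachments.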
The Journal of social sciences and humanities, 2018
Objective: In the paper we describe the development process of the dictionary of Uzbek names and surnames. Methodology: The dictionary is created to support the identification of personal names in Uzbek texts, and to aid the spell-checking of texts written in Uzbek. Results: Apart from discussing the development process, we also evaluate the dictionary by performing a set of experiments. Conclusion: We verify whether the information collected in the dictionary can be successfully used to find and, if needed, correct misspelled names and surnames.
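The idea of using a name dictionary to find and correct misspelled names can be sketched as follows. This is a minimal illustration, not the method evaluated in the paper: the five-entry `NAMES` set, the `suggest` function, and the 0.8 similarity cutoff are all assumptions made for the example, and the matching relies on the similarity ratio of Python's standard `difflib` module.

```python
import difflib

# Tiny hypothetical dictionary; the real resource described in the
# abstract is far larger and is not reproduced here.
NAMES = {"Nilufar", "Abdurakhmonova", "Alisher", "Dilnoza", "Karimov"}


def suggest(word, names=NAMES, cutoff=0.8):
    """Return the word itself if it is a known name, the closest
    dictionary entry if one is similar enough, or None otherwise."""
    if word in names:
        return word
    # sorted() only makes the candidate order deterministic.
    matches = difflib.get_close_matches(word, sorted(names), n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest("Nilufar"))   # already in the dictionary
print(suggest("Nilufer"))   # one letter off from a known name
print(suggest("Xyzzy"))     # nothing similar enough
```

A high cutoff keeps the checker from "correcting" legitimate names that simply are not in the dictionary; where to set it is exactly the kind of trade-off an evaluation like the one described would have to measure.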
The idea of the “TurkLang-7” project is to create datasets and neural machine translation systems for a set of Russian-Turkic low-resource language pairs. It is planned to achieve this goal through a hybrid approach to the creation of a multilingual parallel corpus between Russian and Turkic languages, studying the applicability and effectiveness of neural network learning methods (transfer learning, multi-task learning, back-translation, dual learning) in the context of the selected language pairs, as well as the development of specialized methods for the unification of parallel data in different languages, based on the agglutinative nature of the selected Turkic languages (structural and functional model of the Turkic morpheme). In this paper, we describe the main stages of work on this project and the results of the first year: we developed a semiautomatic process for creating parallel corpora, collected data from several sources on 7 Turkic languages, and conducted the first exp...
Applied Sciences
The Uzbek sector of the Internet is currently developing actively. In it, as in other national sectors, the most common form of presenting textual information is semi-structured documents; working with them presupposes reliable algorithms for text analysis, including analysis of lexical characteristics. The article offers an intelligent web application developed for morphological analysis of words in the Uzbek language. The web application is based on the concept of generation and stem analysis of Uzbek word forms. The well-known Porter algorithm was chosen as the basis for stemming. The morphoanalyzer generates Uzbek word forms based on the division of words into certain classes, taking into account the specifics and structure of the language. For example, nouns can be classified by meaning (related, nominal), by number (singular and plural), by case, and by possessive endings.
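The kind of staged suffix stripping that Porter-style stemmers perform can be sketched for Uzbek nouns as below. This is an illustration under stated assumptions, not the article's algorithm or rule set: the suffix lists are a small hypothetical sample, the fixed stage order (case, then possessive, then plural, working from the end of the word) is a simplification, and `MIN_STEM` is an arbitrary guard against stripping a word down to nothing.

```python
# Small illustrative suffix inventory for Uzbek nouns; far from complete
# and not taken from the article.
CASE = ["ning", "dan", "ni", "ga", "da"]
POSSESSIVE = ["imiz", "ingiz", "im", "ing", "si", "i"]
PLURAL = ["lar"]

# Suffixes are removed from the end of the word inward:
# stem + plural + possessive + case.
STAGES = [("case", CASE), ("possessive", POSSESSIVE), ("plural", PLURAL)]
MIN_STEM = 3  # never leave fewer than this many characters


def strip_one(word, suffixes):
    """Remove the longest matching suffix, if the remainder is long enough."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[:-len(suffix)], suffix
    return word, None


def stem(word):
    """Return (stem, [(stage, suffix), ...]) after one pass over all stages."""
    found = []
    for name, suffixes in STAGES:
        word, suffix = strip_one(word, suffixes)
        if suffix:
            found.append((name, suffix))
    return word, found


print(stem("kitoblarimizdan"))  # "from our books"
print(stem("maktabga"))         # "to school"
print(stem("kitob"))            # bare stem, nothing to strip
```

A purely rule-based pass like this over-stems words whose stem happens to end in a suffix-like string (for example a noun ending in "i"), which is one reason practical analyzers combine such rules with word classes or a stem lexicon.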
Globe, 2023
This textbook is intended for master's students in the specialty 70230801 – Computational Linguistics. It helps develop, in theory and in practice, the knowledge and skills needed to: study international experience in the conceptual and structural design of an electronic corpus of Uzbek; build a linguistic algorithm and transfer linguistic models into machine language by applying automatic methods of morphological and syntactic tagging and analysis, such as FST and UdPipe, to Uzbek when creating a linguistic corpus; develop the linguistic and software support of a text corpus for analyzing the representativeness of text fragments and search units (lemma and token); implement technologies and methods for creating an Uzbek corpus with the help of linguistic tools; and shape a corpus interface based on formal-functional models of a corpus manager.