Violetta Cavalli-Sforza | Al Akhawayn University in Ifrane
Papers by Violetta Cavalli-Sforza
Example Based Machine Translation (EBMT) is limited by the quantity and scope of its training data. Even with a reasonably large corpus, we will not have examples that cover everything we want to translate. This problem is especially severe in Arabic due to its rich morphology. We demonstrate a novel method that exploits the regular nature of Arabic morphology to increase the quality and coverage of machine translation. Through the use of generalization and rewrite rules, we are able to recover the English translation of phrases that do not exist in the training corpora. Furthermore, this system shows improvement in BLEU even with a training corpus of 1.4 million sentence pairs.
This research paper presents Imikode, a virtual reality (VR)-based learning game to support the teaching and learning of object-oriented programming (OOP) concepts in computing education. The system was implemented using the Google Cardboard SDK and works with any Cardboard-compatible headset. The teacher can use Imikode to introduce an OOP class in an immersive and engaging way. This early prototype provides basic features to help introduce OOP concepts. We aim to extend it using Artificial Intelligence (AI) by adding a conversational agent able to parse erroneous statements and provide adequate instructions on how to fix them. CCS Concepts: Social and professional topics → Computer science education.
North American Chapter of the Association for Computational Linguistics, Apr 29, 2000
Arabic inflectional morphology requires infixation, prefixation and suffixation, giving rise to a large space of morphological variation. In this paper we describe an approach to reducing the complexity of Arabic morphology generation using discrimination trees and transformational rules. By decoupling the problem of stem changes from that of prefixes and suffixes, we gain a significant reduction in the number of rules required, as much as a factor of three for certain verb types. We focus on hollow verbs but discuss the wider applicability of the approach.
Psychology Press eBooks, Dec 22, 2020
Springer eBooks, Oct 2, 2021
New members entering productive organizations require considerable training. Computer tools can support such training by providing an opportunity to learn while engaging in authentic activities and receiving appropriate coaching. We describe two tools that incorporate this approach. Sherlock, an existing computer coach, is an effective environment for learning how to troubleshoot complex electronic devices. A newer research effort focuses on tools for supporting knowledge-building argumentation and scientific theory evaluation in post-elementary school science education. Both tools offer users opportunities for reflecting on their own performance and support individual as well as collaborative learning.
Procedia Computer Science, 2017
Readability assessment for Arabic is still largely underserved in both research and software development. We believe that improved usability of the few tools currently released will attract a larger user base and, in doing so, garner more interest in this topic from the research community. With that in mind, we examine recently developed readability tools with a graphical component, formulate recommendations, and propose visual enhancements to the way readability scores are reported to improve usability and informativeness.
IEEE Access
Text-to-graphics systems encompass three types of tools: text-to-picture, text-to-scene and text-to-animation. They are an artificial intelligence application wherein users can create 2D and 3D scenes or animations and, more recently, immersive environments from natural language. These complex tasks require the collaboration of various fields, such as natural language processing, computational linguistics and computer graphics. Text-to-animation systems have received more interest than their counterparts, and have been developed for various domains, including theatrical pre-production, education and training. In this survey we focus on text-to-animation systems, discussing their requirements and challenges and proposing solutions, and investigate the natural language understanding approaches adopted in previous research to solve the challenge of animation generation. We review text-to-animation systems developed over the period 2001-2021, and investigate their recent trends in order to paint the current landscape of the field. Index Terms: Natural language interface, natural language understanding, computer graphics, semantic parsing, visual semantics.
2023 3rd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET)
Advanced Intelligent Systems for Sustainable Development (AI2SD’2020), 2022
Procedia Computer Science, 2018
We provide a perspective on the current state of Arabic readability assessment research with the objective of considering directions and opportunities for future research. We review and assess the current state of progress in Arabic readability assessment, briefly surveying research that has been performed on texts targeted at different populations: readers of Arabic as L1, adult readers in nonacademic settings, and readers of Arabic as L2. Arabic readability assessment has followed trends in other languages, primarily English, but has faced challenges due to the specificities of Arabic and the relative scarcity of available corpora and tools, compared to languages with richer resources. We also consider whether readability assessment for Arabic should take into consideration the special situation of diglossia that exists in all Arab countries.
Reading is the most critical skill for satisfactory progress in school, as well as being highly important for access to information throughout one’s life. For this reason, readability is one of the main challenges when choosing academic texts for learners or for readers in general, and especially with materials containing important information, such as newspapers and medical or legal articles. Readability refers to the ability of a text to be understood by the reader. Readability level prediction is an important measure in several domains, but primarily in education. In the current paper we present our approach to readability prediction for Modern Standard Arabic. This method is based on 170 features measuring different types of text characteristics. We have used a corpus of 230 Arabic texts, annotated with the Interagency Language Roundtable (ILR) scale, and a frequency dictionary obtained using Tashkeela corpora. The results obtained are very encouraging and better than for pre...
We describe an exploratory study whose goal is to build a predictive model to automatically assess whether a given text is suitable for a learner of Modern Standard Arabic as a foreign language studying at the intermediate level. Before describing the initial version of the model, which focuses on vocabulary content of the learners’ curriculum and the texts, and on other word-related text characteristics, we review the literature on text readability for first and second language reading to extract some text features already found useful for other languages, and examine their appropriateness for our task. The resulting model, tested on different collections of unseen texts, has good predictive accuracy for clustered curriculum stages, reaching 86.95% and 91.3% respectively for k-means clusters and balanced clusters. Accuracy is poorer in some cases, around 60%, when trying to predict the exact stage in the curriculum for which a text should be used.