N. Mamede - Academia.edu
Papers by N. Mamede
In this paper, we focus on the most frequent errors that occurred during the implementation of a rule-based module for semantic relation extraction, which has been integrated in STRING, a hybrid statistical and rule-based Natural Language Processing chain for Portuguese. We focus on whole-part relations (meronymy), that is, a semantic relation between an entity that is perceived as a constituent part of another entity, or a member of a set. In this case, we target the type of meronymy involving human entities and body-part nouns. We describe in some detail the decisions that were made to overcome the errors produced by the system and the solutions adopted to improve its performance.
In Brazil, there is generally a lack of information flow between scientists and policy makers. Here, we address 29 ecological questions of high policy relevance in Brazil, in an attempt to encourage the participation of science in decision-making. Similar initiatives have been undertaken by British and American scientists (Fleishman et al., 2011; Sutherland et al., 2006), and it is our goal to bring such matters to the Brazilian context. We held several meetings during an Ecosystem Ecology seminar at the Federal University of Juiz de Fora and categorised our questions into five different topics. Brief introductory paragraphs are shown below each topic, and the questions are then presented in no particular order of relevance.
This paper shows how to use 5P's arrows formalism to go from a text with its surface structure to a chosen semantic representation. We emphasize the flexibility of this formalism both to connect models and to define sets of conditions (over those connections) that trigger semantic functions.
Portuguese adverbs ending in -mente constitute a large, morphologically homogeneous, but syntactically and semantically diverse lexical set. When coordinated, the first adverb loses the adverbial suffix and takes the shape of the base adjective, in the feminine-singular form. This raises the issue of its part-of-speech (POS) classification (adverb or adjective?), and especially of its adequate parsing, since it may then be incorrectly analyzed as a modifier of a preceding noun. However, POS tagging cannot be adequately performed prior to some minimal syntactic analysis. The size of the lexicon involved (more than 7,000 adverbs) and the scarcity of instances, even in large corpora, make it ineffective to leave to the POS tagger alone the task of resolving this adjective/reduced adverbial form ambiguity. This paper proposes an integrated solution, where a rule-based disambiguation module and a statistical POS tagger combine to produce more accurate tagging and better parsing results for this non-trivial empirical problem. The system was evaluated on a large corpus.
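The coordination pattern described in this abstract can be illustrated with a minimal Python sketch. It is not the rule-based module from the paper: the regular expression, the list of conjunctions, and the function name `find_reduced_adverb_candidates` are assumptions made for illustration. It only flags candidates of the form "word ending in -a + conjunction + word ending in -mente"; deciding whether the first word is a reduced adverb or an adjective modifying a preceding noun is exactly the ambiguity that, according to the abstract, needs syntactic analysis.

```python
import re

# Hypothetical surface pattern (an assumption, not the paper's rules):
# a word ending in "-a", a coordinating conjunction, and a "-mente" adverb.
COORDINATION = re.compile(
    r"\b(\w+a)\s+(e|ou|mas)\s+(\w+mente)\b",
    re.IGNORECASE,
)


def find_reduced_adverb_candidates(sentence):
    """Return (first_form, conjunction, full_adverb) tuples for each match.

    The first form is only a candidate: it may be a reduced adverb or a
    plain feminine-singular adjective.
    """
    return [match.groups() for match in COORDINATION.finditer(sentence)]


if __name__ == "__main__":
    print(find_reduced_adverb_candidates("Ele falou clara e objetivamente."))
    print(find_reduced_adverb_candidates("Ela fala rapidamente."))
```

A sentence such as "uma casa bonita e certamente cara" would also be flagged, which shows why a surface pattern alone cannot settle the adjective/adverb question.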
This paper presents AsdeCopas, a module designed to interface syntax and semantics. AsdeCopas is based on hierarchically organised semantic rules that output formulas in a flat language. In this paper, we show how this system can be used in the following applications: a) semantic disambiguation; b) logical formula construction (in Minimal Recursion Semantics); c) question interpretation.
This paper describes a task repository, a device semantic interface to express device capabilities, and an advice algorithm that suggests the best task-device pair to satisfy a request. The purpose of the task repository is the adaptation of a pervasive environment (Ambient Intelligence) to support natural language applications, such as a natural language interface. The task repository has a predefined group of concepts linked to linguistic and semantic resources and is updated, at runtime, with task descriptors associated with a set of heterogeneous devices. We assume that each device belonging to the pervasive environment holds its own semantic interface, essentially composed of task descriptors. This approach tries to reach the ubiquitous essence of natural language. Because the coverage of handmade lexical resources is limited, coverage problems remain for applications involving specific domains or multiple languages. Furthermore, we reduce the interface device problem to a database access problem. An environment simulator with the respective set of devices is depicted.
ASdeCopas is a syntactic-semantic parser, implemented in Prolog, which uses hierarchically organized, order-independent rules. This paper focuses on the formalization of semantic rules, presenting the concepts of well-formed semantic rule and rule hierarchy, and the conditions for rule application. If two rules can apply, only the most specific one does so. Examples are given and some properties of the system are pointed out.
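The "only the most specific rule applies" principle can be sketched as follows. This is a hedged illustration in Python, not the Prolog implementation the abstract refers to: the `Rule` class, the representation of conditions as sets of features, and the definition of "more specific" as "has a strict superset of conditions" are all assumptions, since the abstract does not define the hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical semantic rule: applies when all its conditions hold."""
    name: str
    conditions: frozenset


def applicable(rules, features):
    """Rules whose conditions are all present in the input features."""
    return [rule for rule in rules if rule.conditions <= features]


def most_specific(rules, features):
    """Keep applicable rules not subsumed by a more specific applicable rule.

    Assumption: rule A is more specific than rule B when B's conditions
    are a strict subset of A's. The result does not depend on rule order.
    """
    candidates = applicable(rules, features)
    return [
        rule
        for rule in candidates
        if not any(rule.conditions < other.conditions for other in candidates)
    ]


if __name__ == "__main__":
    general = Rule("noun", frozenset({"noun"}))
    specific = Rule("plural-noun", frozenset({"noun", "plural"}))
    unrelated = Rule("verb", frozenset({"verb"}))
    features = frozenset({"noun", "plural"})
    print([r.name for r in most_specific([general, specific, unrelated], features)])
```

Under this assumption, two applicable rules with unrelated condition sets would both fire; a stricter hierarchy would need an explicit ordering between rules.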
This paper describes our contribution to letting end users configure mixed-initiative spoken dialogue systems to suit their personalized goals. The main problem that we want to address is the reconfiguration of spoken language dialogue systems to deal with generic plug-and-play artifacts. Such reconfiguration can be seen as a portability problem and is a critical research issue. To solve this problem, we describe a hybrid approach to designing ubiquitous domain models that allows the dialogue system to recognize available tasks on the fly. Our approach considers two kinds of domain knowledge: global knowledge and local knowledge. The global knowledge, which is modeled using a top-down approach, is associated at design time with the dialogue system itself. The local knowledge, which is modeled using a bottom-up approach, is defined with each one of the artifacts. When an artifact is activated or deactivated, a bilateral process, supported by a broker, updates the domain knowledge considering the artifact's local knowledge. We assume that everyday artifacts are augmented with computational capabilities and semantic descriptions supported by their own knowledge model. A case study focusing on a microwave oven is depicted.
A pervasive computing environment typically consists of a large heterogeneous collection of networked devices. This paper describes the use of lexical knowledge to improve a pervasive computing environment. In an ongoing research project, we are exploring ways to enable non-technical users to manage and control their home environment, which is particularly hostile. We assume that each device belonging to the pervasive environment has its own knowledge model, linked to lexical resources, with the purpose of defining a semantic interface. This approach tries to reach the pervasive essence of natural language. Because the coverage of handmade lexical resources is limited, coverage problems remain for applications involving specific domains or multiple languages. Our recent efforts are directed towards technology development, focusing on devices that are household appliances. This work is a contribution to facilitating, in particular, the generation of multilingual device descriptions, the automatic construction of devices' graphical user interfaces, and the on-the-fly adaptation of a spoken dialogue system to the pervasive environment.
Autonomous agents are being increasingly used in a wide range of applications. The agents operate in common environments and, over time, conflicts inevitably occur among them. Negotiation is the predominant process for solving conflicts. Recent growing interest in electronic commerce has also given increased importance to negotiation. This paper presents a generic negotiation mechanism that handles multiparty, multi-issue and single or repeated rounds and introduces a set of negotiation tactics that express the initial attitude of the agents and generate counterproposals either by making or not making concessions.
Autonomous agents are being used in an increasing number of applications. The agents operate in complex environments and, over time, conflicts inevitably occur among them. Negotiation is the predominant process for resolving conflicts. This paper presents a generic negotiation model for autonomous agents that handles multi-party, multi-issue and single or repeated rounds. The model is based on computationally tractable assumptions and accounts for a tight integration of the individual capability of planning and the social capability of negotiation. This paper also describes an experiment conducted to evaluate the model in different types of situations. The experimental results confirmed a number of well-documented conclusions about human negotiation.
Fill-in-the-blank questions are one of the main assessment devices in the REAP.PT tutoring system. The problem of automatically generating the stems, i.e. the sentences that serve as the basis for this type of question, has been studied mostly for English, and it remains a challenge for a language as morphologically rich as European Portuguese (EP), for which additional data scarcity problems arise. To address this problem, a supervised classification technique is used to build a classifier that decides whether a given sentence is suitable to be used as a stem in a cloze question. The major focus is on the feature engineering task, describing both the development of new criteria and the adaptation to EP of features already explored in the literature. The resulting classifier filters out inadequate stems, allowing experts to build and personalize their instruction focusing on a set of potentially good sentences.
Automated negotiation systems with software agents representing individuals or organizations and capable of reaching agreements through negotiation are becoming increasingly important and pervasive. Examples, to mention a few, include the industrial trend toward agent-based supply chain management, the business trend toward virtual enterprises, and the pivotal role that electronic commerce is increasingly assuming in many organizations. Artificial intelligence (AI) researchers have paid a great deal of attention to automated negotiation over the past decade and a number of prominent models have been proposed in the literature. These models exhibit fairly different features, make use of a diverse range of concepts, and show performance characteristics that vary significantly depending on the negotiation context. As a consequence, assessing and relating individual research contributions is a difficult task. Currently, there is a need to build a framework to define and characterize the essential features that are necessary to conduct automated negotiation and to compare the usage of key concepts in different publications. Furthermore, the development of such a framework can be an important step to identify the core elements of autonomous negotiating agents, to provide a coherent set of concepts related to automated negotiation, to assess progress in the field, and to highlight new research directions. Accordingly, this paper introduces a generic framework for automated negotiation. It describes, in detail, the components of the framework, assesses the sophistication of the majority of work in the AI literature on these components, and discusses a number of prominent models of negotiation. This paper also highlights some of the major challenges for future automated negotiation research.
Language learning resources are constantly evolving alongside technology. One such resource is REAP.PT, a system which aims to raise users' proficiency in the Portuguese language in an interactive manner. Its current features include text-based exercises. This paper describes the evolution of REAP.PT, focusing on locative prepositions used to describe the spatial position between objects. In this new REAP.PT, exercises take place in a 3D environment, and are complemented with gaming aspects to make them more appealing to students. The game scenario is an office, where each room contains different exercises. Completing exercises awards the student with points that unlock harder rooms. This provides the student with a challenge and a sense of progression. We believe that with these gaming aspects, students will have increased motivation to complete exercises.
A listening comprehension serious game for European Portuguese learning
European Portuguese (EP) L2 learners often state that their listening skills cannot cope with spontaneous speech. In fact, one well-known characteristic of EP, which distinguishes it from Brazilian Portuguese in particular, is the strong use of vowel reduction and simplification of consonantal clusters, both within words and across word boundaries [1]. Vowel reduction can result in vowel devoicing, voicing assimilation, deletion and coalescence. Common contractions involve partial or full syllable truncation and vowel coalescence. For example, in 'campo pequeno', the coalescence of the two plosives 'p' would be a standard realization. Hence, in our effort to develop Computer-Assisted Language Learning (CALL) tools to help EP learners, the practice of listening comprehension appeared to be a very important feature to explore. The growing interest in using serious games in the CALL community to motivate learners led us to develop a game dedicated to the practice of listening comprehension. The game consists of identifying the words actually pronounced in real speech utterances. The learner is asked to listen to an utterance using only audio or along with a video clip, and then the sentence should be reconstructed by choosing words from lists containing the correct words and some distractors. Our main objective is to provide learners with realistic speech so that they get used to the sounds and the pronunciation of EP. The game introduces two innovative aspects: the multimedia curriculum itself, comprised of broadcast news videos, and the fully automatic generation of the exercises, made possible by the use of our automatic speech recognition (ASR) engine to transcribe the videos and the filtering chain to select appropriate sentences.
REAP.PT - An Online Portuguese Tutor
Learning Portuguese with Speech Technologies
This set of demos is intended to illustrate different applications of speech technologies for Computer-Assisted Language Learning. Although the most typical application in this context seems to be pronunciation training, the emphasis here is on vocabulary learning and perception. The latter is especially important for European Portuguese as a second language, which is the target of our research. The first demos are aimed at beginner level, and consist of serious games based on 3D and on speech recognition and synthesis technologies, for learning vocabulary and the use of prepositions. The second set of demos is aimed at a higher level, and uses recent multimedia documents, such as TV broadcast news of the preceding week, as training materials. Many speech and language processing technologies are involved in this demo, such as audio segmentation, speech recognition, capitalization, punctuation, topic segmentation and indexation. The generation of the exercises using this data is fully automatic. This may be very valuable for teachers, saving them time in searching for motivating materials of appropriate quality, level and topic.
Pareceres sobre Linguateca: Relatório relativo ao período 2000-2003
The ultimate goal of the poetry assistant currently under development in our lab is an application to be used either as a poetry game or as a teaching tool for both poetry and grammar, including the complex relationships between sound and meaning. So far, we have focused on the automatic classification of poems and the suggestion of the ending word for a verse. The classification module is based on poetic concepts that take into account structure and metrics. The prediction module uses several criteria to select the ending word: the structural constraints of the poem, the grammatical category of the words, and the statistical language models obtained from a text corpus. The first version of the system, rather than being self-contained, is still based on the use of different heterogeneous modules. We are currently working on a second version based on a modular architecture that facilitates the reuse of the linguistic processing modules already developed within the lab.
This paper describes the integration of multimedia documents in the Portuguese version of REAP, a tutoring system for vocabulary learning. The documents result from the pipeline processing of Broadcast News videos that automatically segments the audio files, transcribes them, adds punctuation and capitalization, and breaks them into stories classified by topics. The integration of these materials in REAP was done in a way that tries to decrease the impact of potential errors of the automatic chain in the learning process.
In this paper, we focus on the most frequent errors that occurred during the implementation of a ... more In this paper, we focus on the most frequent errors that occurred during the implementation of a rule-based module for semantic relations extraction, which has been integrated in STRING, a hybrid statistical and rule-based Natural Language Processing chain for Portuguese. We focus on whole-part relations (meronymy), that is, a semantic relation between an entity that is perceived as a constituent part of another entity, or a member of a set. In this case, we target the type of meronymy involving human entities and body-part nouns. We describe with some detail the decisions that were made in order to overcome the errors produced by the system and the solutions adopted to improve its performance.
In Brazil, there is generally a lack of information flow between scientists and policy makers. He... more In Brazil, there is generally a lack of information flow between scientists and policy makers. Here, we address 29 ecological questions of high policy relevance in Brazil, in an attempt to encourage the participation of science in decision-making. Similar addresses have been made by British and American scientists (Fleishman et al., 2011; Sutherland et al., 2006), and it is our goal to bring such matters to the Brazilian reality. We had several meetings during an Ecosystem Ecology seminar held at the Federal University of Juiz de Fora and categorised our questions into five different topics. Brief introductory paragraphs are shown below each topic and the questions are presented subsequently without any order of relevance.
This paper states how to profit from 5P's arrows formalism to go from a text with its surface str... more This paper states how to profit from 5P's arrows formalism to go from a text with its surface structure to a chosen semantic representation. We emphasize on this formalism flexibility both to connect models and to define sets of conditions (over that connections) that will trigger semantic functions.
Portuguese-mente ending adverbs constitute a large, morphologically homogenous, but syntactically... more Portuguese-mente ending adverbs constitute a large, morphologically homogenous, but syntactically and semantically diverse lexical set. When coordinated, the first adverb loses the adverbial suffix and takes the shape of the base adjective, in the feminine-singular form. This raises the issue of its partof-speech (POS) classification (adverb or adjective?), but especially its adequate parsing, since it may then be incorrectly analyzed as a modifier of a preceding noun. However, the POS tagging can not be adequately performed prior to some minimal syntactic analysis. The size of the lexicon involved (more than 7,000 adverbs) and the scarcity of instances, even in large corpora, make it ineffective to leave only for the POS tagger the task of solving this adjective/reduced adverbial form ambiguity. This paper proposes an integrated solution, where a rule-base disambiguating module and a POS statistical tagger combine to produce more accurate tagging and better parsing results to this non-trivial empirical problem. The system was evaluated on a large-sized corpus.
This paper presents AsdeCopas, a module designed to interface syntax and semantics. Asde-Copas is... more This paper presents AsdeCopas, a module designed to interface syntax and semantics. Asde-Copas is based on hierarchically organised semantic rules, that output formulas in a flat language. In this paper, we show how this system can be used in the following applications: a) semantic disambiguation; b) logical formulas construction (in Minimal Recursion Semantics); c) question interpretation.
This paper describes a task repository, a device semantic interface to express device capabilitie... more This paper describes a task repository, a device semantic interface to express device capabilities, and an advice algorithm that suggests the best task-device pair to satisfy a request. The purpose of the task repository is the adaptation of a pervasive environment (Ambient Intelligence) to support natural language applications, such as a natural language interface. The task repository has a predefined group of concepts linked to linguistic and semantic resources and is updated, at runtime, with task descriptors associated with a set of heterogeneous devices. We assume that each device, belonging to the pervasive environment, holds its own semantic interface essentially composed of task descriptors. This approach tries to reach the ubiquitous essence of natural language, because the coverage of handmade lexical resources is limited, coverage problems remain for applications involving specific domains or involving multiple languages. Furthermore, we reduce the interface device problem to a database access problem. An environment simulator with the respective set of devices is depicted.
ASdeCopas is a syntactic-semantic parser, implemented in Prolog, which uses hierarchically organi... more ASdeCopas is a syntactic-semantic parser, implemented in Prolog, which uses hierarchically organized order-independent rules. This paper focuses on the formalization of semantic rules, presenting the concepts of well-formed semantic rule, rules hierarchy, and the conditions for rules application. If two rules can apply, only the most specific one does so. Examples are given and some properties of the system are pointed out.
This paper describes our contribution to let end users configure mixed-initiative spoken dialogue... more This paper describes our contribution to let end users configure mixed-initiative spoken dialogue systems to suit their personalized goals. The main problem that we want to address is the reconfiguration of spoken language dialogue systems to deal with generic plug and play artifacts. Such reconfiguration can be seen as a portability problem and is a critical research issue. In order to solve this problem we describe a hybrid approach to design ubiquitous domain models that allows the dialogue system to perform recognition of available tasks on the fly. Our approach considers two kinds of domain knowledge: the global knowledge and the local knowledge. The global knowledge, that is modeled using a top-down approach, is associated at design time with the dialogue system itself. The local knowledge, that is modeled using a bottom-up approach, is defined with each one of the artifacts. When an artifact is activated or deactivated, a bilateral process, supported by a broker, updates the domain knowledge considering the artifact local knowledge. We assume that everyday artifacts are augmented with computational capabilities and semantic descriptions supported by their own knowledge model. A case study focusing a microwave oven is depicted.
A pervasive computing environment consists typically of a large heterogeneous collection of netwo... more A pervasive computing environment consists typically of a large heterogeneous collection of networked devices. This paper describes the use of lexical knowledge to improve a pervasive computing environment. In an ongoing research project, we are exploring ways to enable non-technical users to manage and control their home environment that is particularly hostile. We assume that each device belonging to the pervasive environment has its own knowledge model, linked to lexical resources, with the purpose of defining a semantic interface. This approach tries to reach the pervasive essence of the natural language. The coverage of handmade lexical resources is limited, coverage problems remain for applications involving specific domains or involving multiple languages. Our recent efforts are directed towards the technology development, focusing on devices that are household appliances. This work is a contribution to facilitate, specially: the generation of multilingual device descriptions, the automatic build of device's graphical user interfaces, and on the fly adaptation of a spoken dialogue system to the pervasive environment.
Autonomous agents are being increasingly used in a wide range of applications. The agents operate... more Autonomous agents are being increasingly used in a wide range of applications. The agents operate in common environments and, over time, conflicts inevitably occur among them. Negotiation is the predominant process for solving conflicts. Recent growing interest in electronic commerce has also given increased importance to negotiation. This paper presents a generic negotiation mechanism that handles multiparty, multi-issue and single or repeated rounds and introduces a set of negotiation tactics that express the initial attitude of the agents and generate counterproposals either by making or not making concessions.
Autonomous agents are being used in an increasing number of applications. The agents operate in c... more Autonomous agents are being used in an increasing number of applications. The agents operate in complex environments and, over time, conflicts inevitably occur among them. Negotiation is the predominant process for resolving conflicts. This paper presents a generic negotiation model for autonomous agents that handles multi-party, multi-issue and single or repeated rounds. The model is based on computationally tractable assumptions and accounts for a tight integration of the individual capability of planning and the social capability of negotiation. This paper also describes an experiment conducted to evaluate the model in different types of situations. The experimental results confirmed a number of well-documented conclusions about human negotiation.
Fill-in-the-blank questions are one of the main assessment devices in REAP.PT tutoring system. Th... more Fill-in-the-blank questions are one of the main assessment devices in REAP.PT tutoring system. The problem of automatically generating the stems, i.e. the sentences that serve as basis to this type of question, has been studied mostly for English, and it remains a challenge for a language as morphologically rich as European Portuguese (EP), for which additional data scarcity problems arise. To address this problem, a supervised classification technique is used to model a classifier that decides whether a given sentence is suitable to be used as a stem in a cloze question. The major focus is put in the feature engineering task, describing both the development of new criteria, and the adaptation to EP of features already explored in the literature. The resulting classifier filters out inadequate stems, allowing experts to build and personalize their instruction focusing on a set of potentially good sentences.
Automated negotiation systems with software agents representing individuals or organizations and ... more Automated negotiation systems with software agents representing individuals or organizations and capable of reaching agreements through negotiation are becoming increasingly important and pervasive. Examples, to mention a few, include the industrial trend toward agent-based supply chain management, the business trend toward virtual enterprises, and the pivotal role that electronic commerce is increasingly assuming in many organizations. Artificial intelligence (AI) researchers have paid a great deal of attention to automated negotiation over the past decade and a number of prominent models have been proposed in the literature. These models exhibit fairly different features, make use of a diverse range of concepts, and show performance characteristics that vary significantly depending on the negotiation context. As a consequence, assessing and relating individual research contributions is a difficult task. Currently, there is a need to build a framework to define and characterize the essential features that are necessary to conduct automated negotiation and to compare the usage of key concepts in different publications. Furthermore, the development of such a framework can be an important step to identify the core elements of autonomous negotiating agents, to provide a coherent set of concepts related to automated negotiation, to assess progress in the field, and to highlight new research directions. Accordingly, this paper introduces a generic framework for automated negotiation. It describes, in detail, the components of the framework, assesses the sophistication of the majority of work in the AI literature on these components, and discusses a number of prominent models of negotiation. This paper also highlights some of the major challenges for future automated negotiation research.
Language learning resources are constantly evolving alongside technology. One of such resources i... more Language learning resources are constantly evolving alongside technology. One of such resources is REAP.PT, a system which aims to raise users proficiency in the Portuguese language in an interactive manner. Its current features include text-based exercises. This paper describes the evolution of REAP.PT aiming on locative prepositions used to describe the spatial position between objects. In this new REAP.PT, exercises take place in a 3D environment, and are complemented with gaming aspects to make them more appealing to students. The game scenario is an o ce, where each room contains di↵erent exercises. Completing exercises awards the student with points that unlock harder rooms. This provides the student with a challenge and a sense of progression. We believe that with these gaming aspects, students will have increased motivation to complete exercises.
A listening comprehension serious game for European Portuguese learning
European Portuguese (EP) L2 learners often state that their listening skills cannot cope with spontaneous speech. In fact, one well-known characteristic of EP, which distinguishes it from Brazilian Portuguese in particular, is the strong use of vowel reduction and simplification of consonantal clusters, both within words and across word boundaries [1]. Vowel reduction can result in vowel devoicing, voicing assimilation, deletion and coalescence. Common contractions can be partial or full syllable truncation and vowel coalescence. For example, in ’campo pequeno’, the coalescence of the two plosives ’p’ would be a standard realization. Hence, in our effort to develop Computer-Assisted Language Learning (CALL) tools to help EP learners, the practice of listening comprehension appeared to be a very important feature to explore. The growing interest in using serious games in the CALL community to motivate learners led us to develop a game dedicated to the practice of listening comprehension. The game consists of identifying the words actually pronounced in real speech utterances. The learner is asked to listen to an utterance, using only audio or along with a video clip, and then to reconstruct the sentence by choosing words from lists containing the correct words and some distractors. Our main objective is to provide learners with realistic speech so that they get used to the sounds and the pronunciation of EP. The game introduces two innovative aspects: the multimedia curriculum itself, comprised of broadcast news videos, and the fully automatic generation of the exercises, made possible by using our automatic speech recognition (ASR) engine to transcribe the videos and a filtering chain to select appropriate sentences.
REAP.PT - An Online Portuguese Tutor
Learning Portuguese with Speech Technologies
This set of demos intends to illustrate different applications of speech technologies for Computer-Assisted Language Learning. Although the most typical application in this context seems to be pronunciation training, the emphasis here is on vocabulary learning and perception. The latter is especially important for European Portuguese as a second language, which is the target of our research. The first demos are aimed at beginner level and consist of serious games based on 3D and on speech recognition and synthesis technologies, for learning vocabulary and the use of prepositions. The second set of demos is aimed at a higher level and uses recent multimedia documents, such as TV broadcast news of the preceding week, as training materials. Many speech and language processing technologies are involved in this demo, such as audio segmentation, speech recognition, capitalization, punctuation, topic segmentation and indexation. The generation of the exercises using this data is fully automatic. This may be very valuable for teachers, saving them time searching for motivating materials of appropriate quality, level and topic.
Pareceres sobre Linguateca: Relatório relativo ao período 2000-2003
The ultimate goal of the poetry assistant currently under development in our lab is an application to be used either as a poetry game or as a teaching tool for both poetry and grammar, including the complex relationships between sound and meaning. Until now we have focused on the automatic classification of poems and the suggestion of the ending word for a verse. The classification module is based on poetic concepts that take into account structure and metrics. The prediction module uses several criteria to select the ending word: the structural constraints of the poem, the grammatical category of the words, and the statistical language models obtained from a text corpus. The first version of the system, rather than being self-contained, is still based on the use of different heterogeneous modules. We are currently working on a second version based on a modular architecture that facilitates the reuse of the linguistic processing modules already developed within the lab.
This paper describes the integration of multimedia documents in the Portuguese version of REAP, a tutoring system for vocabulary learning. The documents result from the pipeline processing of Broadcast News videos that automatically segments the audio files, transcribes them, adds punctuation and capitalization, and breaks them into stories classified by topics. The integration of these materials in REAP was done in a way that tries to decrease the impact of potential errors of the automatic chain in the learning process.