Georgios Petasis - Academia.edu
Papers by Georgios Petasis
Methodology and Architecture for Multimedia Ontology Evolution
Workpackage:
Ontology Population and Enrichment: State of the Art
Knowledge-Driven Multimedia Information Extraction and Ontology Evolution, 2011
Ontology learning is the process of acquiring (constructing or integrating) an ontology (semi-) automatically. Being a knowledge acquisition task, it is a complex activity, which becomes even more complex in the context of the BOEMIE project, due to the management of multimedia resources and the multi-modal semantic interpretation that they require. The purpose of this chapter is to present a survey of the most relevant methods, techniques and tools used for the task of ontology learning. Adopting a practical perspective, an overview ...
Argument Extraction from Social Media
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2008
Concept-to-Speech systems include Natural Language Generators that produce linguistically enriched text descriptions which can lead to significantly improved quality of speech synthesis. There are cases, however, where either the generator modules produce pieces of non-analyzed, non-annotated plain text, or such modules are not available at all. Moreover, the language analysis is restricted by the usually limited domain coverage of the generator due to its embedded grammar. This work reports on a language-independent framework basis, linguistic resources and language analysis procedures (word/sentence identification, part-of-speech, prosodic feature annotation) for text annotation/processing for plain or enriched text corpora. It aims to produce an automated XML-annotated enriched prosodic markup for English and Greek texts, for improved synthetic speech. The markup includes information both for training the synthesizer and for actual input for synthesising. Depending on the domain and target, different methods may be used for automatic classification of entities (words, phrases, sentences) to one or more preset categories such as "emphatic event", "new/old information", "second argument to verb", "proper noun phrase", etc. The prosodic features are classified according to the analysis of the speech-specific characteristics for their role in prosody modelling and passed through to the synthesizer via an extended SOLE-ML description. Evaluation results show that high accuracy is achieved by using selectable hybrid methods for part-of-speech tagging. Annotation of a large generated text corpus containing 50% enriched text and 50% canned plain text produces a fully annotated uniform SOLE-ML output containing all prosodic features found in the initial enriched source. Furthermore, additional automatically-derived prosodic feature annotation and speech synthesis related values are assigned, such as word-placement in sentences and phrases, previous and next word entity relations, emphatic phrases containing proper nouns, and more.
Proceedings of the 8th …, 2001
Abstract: This paper presents a large-scale Greek morphological lexicon, developed by the Software & Knowledge Engineering Laboratory (SKEL) of NCSR "Demokritos". The paper describes the lexicon architecture, the procedure followed to develop it, as well as the provided ...
Proceedings of the 4th Workshop on Argument Mining
In this paper we present a new unsupervised approach, "Attraction to Topics" (A2T), for the detection of argumentative units, a sub-task of argument mining. Motivated by the importance of topic identification in manual annotation, we examine whether topic modeling can be used for performing unsupervised detection of argumentative sentences, and to what extent topic modeling can be used to classify sentences as claims and premises. Preliminary evaluation results suggest that topic information can be successfully used for the detection of argumentative sentences, at least for the corpora used in the evaluation. Our approach has been evaluated on two English corpora, the first of which contains 90 persuasive essays, while the second is a collection of 340 documents from user-generated content.
Proceedings of the Third Workshop on Argument Mining (ArgMining2016), 2016
In this paper we examine the application of an unsupervised extractive summarisation algorithm, TextRank, on a different task, the identification of argumentative components. Our main motivation is to examine whether there is any potential overlap between extractive summarisation and argument mining, and whether approaches used in summarisation (which typically model a document as a whole) can have a positive effect on tasks of argument mining. Evaluation has been performed on two corpora containing user posts from an on-line debating forum and persuasive essays. Evaluation results suggest that graph-based approaches and approaches targeting extractive summarisation can have a positive effect on tasks related to argument mining.
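As a rough illustration of the kind of graph-based sentence ranking that TextRank performs, the following is a minimal sketch and not the authors' implementation. It assumes a simple regex tokenizer, the word-overlap sentence similarity commonly associated with TextRank, and a fixed number of power iterations; the function names (`tokenize`, `similarity`, `textrank`) and the parameter defaults are illustrative choices. How the resulting scores would be mapped to argumentative components is not specified in the abstract and is left out here.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Word overlap between two token lists, normalised by log sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:  # both sentences contain a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    """Return one score per sentence from a weighted PageRank-style iteration."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Symmetric edge weights between every pair of distinct sentences.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


if __name__ == "__main__":
    example = [
        "Cats chase mice in the garden",
        "Mice hide from cats in the garden",
        "Stock prices fell sharply today",
    ]
    for sentence, score in zip(example, textrank(example)):
        print(f"{score:.3f}  {sentence}")
```

In this sketch a sentence that shares no words with any other sentence receives only the baseline score `1 - damping`, while sentences that overlap with many others accumulate higher scores.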
Ontology Dynamics with Multimedia Information: The BOEMIE Evolution Methodology
... it 2 Hamburg University of Technology, Institute of Software Systems, 21079 Hamburg, Germany, {sofia. espinosa, at. ... Sport Jumper ⊑ Athlete, Pole ⊑ SportEquipment, Horizontal Bar ⊑ SportEquipment, Foam Mat ⊑ SportEquipment, Jumping Event ⊑ Event ⊓ ∃≤1 hasParticipant. ...
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
LREC, 2010
This paper presents a large-scale Greek morphological lexicon, developed by the Software & Knowledge Engineering Laboratory (SKEL) of NCSR "Demokritos". The paper describes the lexicon architecture, the procedure followed to develop it, as well as the provided functionalities to update it. The morphological lexicon was used to develop a lemmatiser and a morphological analyser that were included in a controlled language checker for Greek. The paper discusses the current coverage of the lexicon, as well as remaining issues and how we plan to address them. Our long-term goal is to produce a wide-coverage morphological lexicon of Greek that can be easily exploited in several natural language processing applications.
Machine Learning and Named-Entity Recognition
TkDND: a cross-platform drag'n'drop package
This paper is about TkDND, a Tcl/Tk extension that aims to add cross-application drag and drop support to Tk, for popular operating systems, such as Microsoft Windows, Apple OS X and GNU/Linux. Being in its second rewrite, TkDND 2.x has a stable implementation for Windows and OS X, while support for Linux and the XDND protocol is still under development.
TkRibbon: Windows Ribbons for Tk
This paper is about TkRibbon, a Tcl/Tk extension that aims to introduce support for the Windows Ribbon Framework in the Tk toolkit. The Windows Ribbon is a graphical interface where a set of toolbars are placed on tabs in a notebook widget, aiming to substitute traditional menus and toolbars. This paper briefly describes the Windows Ribbon Framework and the TkRibbon Tk extension, and presents some examples of how TkRibbon can be used by Tk applications.
TileQt and TileGtk: current status
This paper is about two Tile and Ttk themes, TileQt and TileGtk. Despite being two distinct and very different extensions, the motivation for their development was common: making Tk applications look as native as possible under the Linux operating system.
Proceedings of the MultiLing 2013 Workshop on Multilingual Multi-document Summarization