Andreas Dengel - Academia.edu

Papers by Andreas Dengel

Circ-LocNet: A Computational Framework for Circular RNA Sub-Cellular Localization Prediction

International Journal of Molecular Sciences

Circular ribonucleic acids (circRNAs) are novel non-coding RNAs that emanate from alternative splicing of precursor mRNA in reversed order across exons. Despite the abundant presence of circRNAs in human genes and their involvement in diverse physiological processes, the functionality of most circRNAs remains a mystery. As with other non-coding RNAs, knowledge of the sub-cellular localization of circRNAs can help demystify their influence on protein synthesis, degradation, and destination, their association with different diseases, and their potential for drug development. To date, wet-lab experimental approaches are used to detect the sub-cellular locations of circular RNAs. These approaches help to elucidate the role of circRNAs as protein scaffolds, RNA-binding protein (RBP) sponges, micro-RNA (miRNA) sponges, parental gene expression modifiers, alternative splicing regulators, and transcription regulators. To complement wet-lab experiments, considering the progress made by machin...

Search and Learn: Improving Semantic Coverage for Data-to-Text Generation

Proceedings of the AAAI Conference on Artificial Intelligence

Data-to-text generation systems aim to generate text descriptions based on input data (often represented in the tabular form). A typical system uses huge training samples for learning the correspondence between tables and texts. However, large training sets are expensive to obtain, limiting the applicability of these approaches in real-world scenarios. In this work, we focus on few-shot data-to-text generation. We observe that, while fine-tuned pretrained language models may generate plausible sentences, they suffer from the low semantic coverage problem in the few-shot setting. In other words, important input slots tend to be missing in the generated text. To this end, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve the semantic coverage. We further finetune our system based on the search results to smooth out the search noise, yielding better-quality text and improving inference efficiency to a large exte...
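The notion of semantic coverage described in this abstract can be illustrated with a small sketch. This is not the paper's metric or search procedure. It assumes, for illustration only, that a slot counts as covered when its value appears verbatim (case-insensitively) in the generated text. The function names `slot_coverage` and `missing_slots` and the example table are hypothetical.

```python
def slot_coverage(slots, text):
    """Fraction of slot values that appear verbatim in the text (assumed matching rule)."""
    values = [str(v).lower() for v in slots.values()]
    if not values:
        return 1.0  # nothing to cover
    lowered = text.lower()
    covered = sum(1 for v in values if v in lowered)
    return covered / len(values)


def missing_slots(slots, text):
    """Names of the slots whose values do not appear in the text."""
    lowered = text.lower()
    return [name for name, value in slots.items() if str(value).lower() not in lowered]


# Hypothetical input table and generated sentence.
table = {"name": "Blue Spice", "food": "Italian", "area": "riverside"}
generated = "Blue Spice is a restaurant by the riverside."

print(slot_coverage(table, generated))
print(missing_slots(table, generated))
```

In this toy case the "food" slot is absent from the sentence, which is the kind of gap a slot-insertion search would try to close.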

TimeREISE: Time Series Randomized Evolving Input Sample Explanation

Sensors

Deep neural networks are one of the most successful classifiers across different domains. However, their use is limited in safety-critical areas due to their limitations concerning interpretability. The research field of explainable artificial intelligence addresses this problem. However, most interpretability methods align to the imaging modality by design. The paper introduces TimeREISE, a model agnostic attribution method that shows success in the context of time series classification. The method applies perturbations to the input and considers different attribution map characteristics such as the granularity and density of an attribution map. The approach demonstrates superior performance compared to existing methods concerning different well-established measurements. TimeREISE shows impressive results in the deletion and insertion test, Infidelity, and Sensitivity. Concerning the continuity of an explanation, it showed superior performance while preserving the correctness of th...

Leveraging Context-Aware Recommender Systems for Improving Personal Knowledge Assistants by Introducing Contextual States

During the last decades, recommender systems have played a remarkable role in making content platforms more intelligent in a wide variety of domains ranging from music and movies to books and documents. Notwithstanding the various applications of recommender systems, not many contributions have been made regarding their potential capabilities in the domain of personal knowledge management. Hence, this study attempts to shed new light on an innovative application of recommender systems to improve personal knowledge assistants by making them capable of providing knowledge workers with useful information in every situation during their daily work. This paper provides a comprehensive research tree involving the key information about state-of-the-art approaches with a focus on the three categories most relevant to this research: knowledge-based, sequential, and session-based recommender systems. Furthermore, the idea of the...

If You Like It, GAN It—Probabilistic Multivariate Times Series Forecast with GAN

The 7th International conference on Time Series and Forecasting, 2021

ExAID: A multimodal explanation framework for computer-aided diagnosis of skin lesions

Computer Methods and Programs in Biomedicine, 2022

Background and Objectives: One principal impediment in successful deployment of Artificial Intelligence (AI)-based Computer-Aided Diagnosis (CAD) systems in everyday clinical workflow is their lack of transparent decision making. Although commonly used eXplainable AI (XAI) methods provide some insight into these largely opaque algorithms, such explanations are usually convoluted and not readily comprehensible except by highly trained AI experts. The explanation of decisions regarding the malignancy of skin lesions from dermoscopic images demands particular clarity, as the underlying medical problem definition is itself ambiguous. This work presents and evaluates ExAID (Explainable AI for Dermatology), a novel XAI framework for biomedical image analysis, providing multi-modal concept-based explanations consisting of easy-to-understand textual explanations supplemented by visual maps justifying the predictions. Methods: Our framework relies on Concept Activation Vectors (CAVs) to map human-understandable concepts to those learnt by an arbitrary Deep Learning based algorithm in its latent space, and Concept Localisation Maps (CLMs) to highlight concepts in the input space. This identification of relevant concepts is then used to construct fine-grained textual explanations supplemented by concept-wise location information to provide comprehensive and coherent multi-modal explanations. All decision-related information is comprehensively presented in a diagnostic interface for use in clinical routines. Moreover, the framework includes an educational mode providing dataset-level explanation statistics and tools for data and model exploration to aid medical research and education processes.
Results: Through rigorous quantitative and qualitative evaluation of our framework on a range of dermoscopic image datasets such as SkinL2, Derm7pt, PH2 and ISIC, we show the utility of multimodal explanations for CAD-assisted scenarios even in case of wrong disease predictions. Conclusions: We present a new multi-modal explanation framework for biomedical image analysis on the example use-case of Melanoma classification from dermoscopic images and evaluate its utility on a range of datasets. Since comprehensible explanation is one of the cornerstones of any CAD system, we believe that ExAID will provide dermatologists an effective screening tool that they both understand and trust. Moreover, ExAID will be the basis for similar applications in other biomedical imaging fields.

Turbulence forecasting via Neural ODE

arXiv: Computational Physics, 2019

Fluid turbulence is characterized by strong coupling across a broad range of scales. Furthermore, besides the usual local cascades, such coupling may extend to interactions that are non-local in scale-space. As such, the computational demands associated with explicitly resolving the full set of scales and their interactions, as in the Direct Numerical Simulation (DNS) of the Navier-Stokes equations, are so high in most problems of practical interest that reduced modeling of scales and interactions is required before further progress can be made. While popular reduced models are typically based on phenomenological modeling of relevant turbulent processes, recent advances in machine learning techniques have energized efforts to further improve the accuracy of such reduced models. In contrast to such efforts that seek to improve an existing turbulence model, we propose a machine learning (ML) methodology that captures, de novo, underlying turbulence phenomenology without a pre-specified ...

A Hybrid Approach and Unified Framework for Bibliographic Reference Extraction

IEEE Access, 2020

Publications are an integral part of a scientific community. Bibliographic reference extraction from scientific publications is a challenging task due to diversity in referencing styles and document layout. Existing methods perform sufficiently on one dataset; however, applying these solutions to a different dataset proves to be challenging. Therefore, a generic solution was anticipated which could overcome the limitations of the previous approaches. The contribution of this paper is three-fold. First, it presents a novel approach called DeepBiRD which is inspired by human visual perception and exploits layout features to identify individual references in a scientific publication. Second, we release a large dataset for image-based reference detection with 2401 scans containing 38863 references, all manually annotated for individual reference. Third, we present a unified and highly configurable end-to-end automatic bibliographic reference extraction framework called BRExSys which emplo...

Interpreting Deep Models through the Lens of Data

2020 International Joint Conference on Neural Networks (IJCNN), 2020

Identification of input data points relevant for the classifier (i.e., those that serve as support vectors) has recently spurred the interest of researchers for both interpretability and dataset debugging. This paper presents an in-depth analysis of the methods which attempt to identify the influence of these data points on the resulting classifier. To quantify the quality of the influence, we curated a set of experiments where we debugged and pruned the dataset based on the influence information obtained from different methods. To do so, we provided the classifier with mislabeled examples that hampered the overall performance. Since the classifier is a combination of both the data and the model, it is essential to also analyze these influences for the interpretability of deep learning models. Analysis of the results shows that some interpretability methods can detect mislabels better than a random approach; however, contrary to the claim of these methods, the sample ...

Recognizable units in Pashto language for OCR

2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015

Atomic segmentation of cursive scripts into constituent characters is one of the most challenging problems in pattern recognition. To avoid segmentation in cursive script, concrete shapes are considered as recognizable units. Therefore, the objective of this work is to find alternative recognizable units in Pashto cursive script. These alternatives are ligatures and primary ligatures. However, we need sound statistical analysis to find the appropriate numbers of ligatures and primary ligatures in Pashto script. In this work, a corpus of 2,313,736 Pashto words is extracted from large-scale, diversified web sources, and a total of 19,268 unique ligatures have been identified in Pashto cursive script. Analysis shows that only 7000 ligatures represent 91% of the overall corpus of unique Pashto words. Similarly, about 7,681 primary ligatures are also identified, which represent the basic shapes of all the ligatures.
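The kind of coverage statistic quoted in this abstract (a subset of ligatures accounting for most of a corpus) can be sketched as follows. This is an illustrative sketch and not the authors' analysis. It assumes coverage is weighted by occurrence count, and the function names and toy counts are hypothetical.

```python
from collections import Counter


def coverage_of_top_k(ligature_counts, k):
    """Share of all ligature occurrences accounted for by the k most frequent ligatures."""
    total = sum(ligature_counts.values())
    if total == 0:
        return 0.0
    top = sorted(ligature_counts.values(), reverse=True)[:k]
    return sum(top) / total


def smallest_k_for_coverage(ligature_counts, target):
    """Smallest number of most-frequent ligatures whose share reaches the target."""
    total = sum(ligature_counts.values())
    if total == 0:
        return 0
    running = 0
    ordered = sorted(ligature_counts.values(), reverse=True)
    for k, count in enumerate(ordered, start=1):
        running += count
        if running / total >= target:
            return k
    return len(ordered)


# Toy counts standing in for ligature frequencies (not data from the paper).
toy = Counter({"lig_a": 50, "lig_b": 30, "lig_c": 15, "lig_d": 5})

print(coverage_of_top_k(toy, 2))
print(smallest_k_for_coverage(toy, 0.9))
```

With real counts, the second function would answer how many ligatures an OCR system needs to recognize to reach a chosen share of the corpus.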

Visual appearance based document classification methods: Performance evaluation and benchmarking

2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015

We study the existence of positive solutions to nonlinear higher-order nonlocal boundary value problems corresponding to a fractional differential equation of the type ${}^{c}D^{\delta}_{0+} u(t) + f(t, u(t)) = 0$, $t \in (0, 1)$, with boundary conditions $u(1) = \beta u(\eta) + \lambda_2$, $u'(0) = \alpha u'(\eta) - \lambda_1$, $u''(0) = 0$, $u'''(0) = 0, \dots, u^{(n-1)}(0) = 0$, where $n - 1 < \delta < n$, $n \ge 3$, $n \in \mathbb{N}$, $0 < \eta, \alpha, \beta < 1$, the boundary parameters $\lambda_1, \lambda_2 \in \mathbb{R}$, and ${}^{c}D^{\delta}_{0+}$ is the Caputo fractional derivative. We use the classical tools from functional analysis to obtain sufficient conditions for the existence and uniqueness of positive solutions to the boundary value problems. We also obtain conditions for the nonexistence of positive solutions to the problem. We include examples to show the applicability of our results.

Der Computer lernt ”Lesen”

A Hybrid Approach for Document Image Segmentation and Encoding

ALV: Lesende Systeme für die Unterstützung von Bürovorgängen

Document Analysis Systems, Series in Machine Perception, Artificial Intelligence

Diary generation from personal information models to support contextual remembering and reminiscence

2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2015

Contents: 1) Motivation / Background (Vision; PIMO & Semantic Desktop) 2) Technical Realization (User Interface (Client); Diary Generation (Server)) 3) Early Evaluation 4) Conclusion & Outlook

Motivation: Can you name five things you were concerned with the most for an arbitrarily chosen period of your life,

Wissensrepräsentation

Semantische Technologien, 2012

Semantische Suche

Semantische Technologien, 2012

Posters-Believing Finite-State Cascades in Knowledge-Based Information Extraction

Lecture Notes in Computer Science, 2008

Generating Affective Captions using Concept And Syntax Transition Networks

Proceedings of the 24th ACM international conference on Multimedia, 2016

The area of image captioning, i.e., the automatic generation of short textual descriptions of images, has experienced much progress recently. However, image captioning approaches often focus only on describing the content of the image, without the emotional or sentimental dimension that is common in human captions. This paper presents an approach for image captioning designed specifically to incorporate emotions and feelings into the caption generation process. The presented approach consists of a Deep Convolutional Neural Network (CNN) for detecting Adjective Noun Pairs in the image and a graphical network architecture called "Concept And Syntax Transition (CAST)" network for generating sentences from these detected concepts.
