D. Borth - Academia.edu

Papers by D. Borth

Content analysis meets viewers: linking concept detection with demographics on YouTube

International Journal of Multimedia Information Retrieval, 2012

Tag suggestion on YouTube by personalizing content-based auto-annotation

Proceedings of the ACM multimedia 2012 workshop on Crowdsourcing for multimedia - CrowdMM '12, 2012

Linking visual concept detection with viewer demographics

Proceedings of the 2nd ACM International Conference on Multimedia Retrieval - ICMR '12, 2012

The estimation of demographic target groups for web videos -- with applications in ad targeting -- poses a challenging problem, as the textual description and view statistics available for many clips are extremely sparse. Therefore, the goal of this paper is to link a clip's popularity across different viewer ages and genders on the one hand with the video content on the other: Employing user comments and user profiles on YouTube, we show that there is a strong correlation between demographic target groups and semantic concepts appearing in the video (like "teenage male" and "skateboarding"). Based on this observation, we suggest two approaches: First, the demographic target group of a clip is predicted automatically via content-based concept detection. Second, should sufficient view statistics already give a good impression of a video's audience, we show that this information can serve as a valuable additional signal to disambiguate concept detection. Our experimental results on a dataset of 14,000 YouTube clips commented on by 1 million users show that -- though content-based viewership estimation is a challenging problem -- suitable demographic groups can be suggested by concept detection. Also, a combination with demographic information as an additional signal leads to relative improvements of concept detection accuracy by 47%.

Large-Scale Deep Learning on the YFCC100M Dataset

Smart Video Buddy–Content-Based Live Recommendation

DeepSentiBank: Visual Sentiment Concept Classification with Deep Convolutional Neural Networks

Automatic Detection of CSA Media by Multi-modal Feature Fusion for Law Enforcement Support

Proceedings of International Conference on Multimedia Retrieval - ICMR '14, 2014

Abstract: The growing amounts of multimedia data being made available and shared via the Internet pose an increasing problem for law enforcement to investigate the distribution and possession of child sexual abuse (CSA) media. In this paper we address the automatic detection of CSA material in image and video data by multi-modal feature description. Instead of analyzing hash sums or file names, we propose content-based analysis of visual and, in the case of videos, also audio features. To this end, we apply multiple low-level features as well as SentiBank, a novel mid-level representation of visual content. In collaboration with police partners and European cyber crime units, we conducted experiments on several datasets, including real-world CSA media. Our quantitative evaluation reveals the challenging nature of this detection task, especially in the joint presence of non-illegal pornographic data, which renders skin detection, a popular feature for detecting pornography, less discriminative. Further, the utilization of SentiBank features shows high potential for detection and explainability of such content. Overall, multi-modal feature fusion can achieve an improved detection accuracy, reducing equal error rate from 17% to 10% for images and from 16% to 8% for videos as compared to the best single-feature performance for the challenging task of classifying CSA content from adult media.

Pornography detection in video benefits (a lot) from a multi-modal approach

Proceedings of the 2012 ACM international workshop on Audio and multimedia methods for large-scale video analysis - AMVA '12, 2012

Keyframe Extraktion für Video-Annotation und Video-Zusammenfassung [Keyframe extraction for video annotation and video summarization]

Informatik-Spektrum, 2008

Audio-Concept Features and Hidden Markov Models for Multimedia Event Detection

Multimedia event detection (MED) on user-generated content is the task of finding an event, e.g., a Flash mob or Attempting a bike trick, using its content characteristics. Recent research has focused on approaches that use semantically defined "concepts" trained with annotated audio clips. Using audio concepts allows us to show semantic evidence of their relationship to events, by looking at the probability distribution of the audio concepts per event. However, while the concept-based approach has been useful in image detection, audio concepts have generally not surpassed the performance of low-level audio features like Mel Frequency Cepstral Coefficients (MFCCs) in addressing the unstructured acoustic composition of video events. Such audio-concept-based systems could benefit from temporal information, due to one of the intrinsic characteristics of audio: it occurs across a time interval. This paper presents a multimedia event detection system that uses audio concepts; it exploits the temporal correlation of audio characteristics for each particular event at two levels. The first level involves analyzing the short- and long-term surrounding context information for the audio concepts, through an implementation of a Hierarchical Deep Neural Network (H-DNN), to determine engineered audio-concept features. At the second level, we use Hidden Markov Models (HMMs) to describe the continuous and non-stationary characteristics of the audio signal throughout the video. Experiments using the TRECVID MED 2013 corpus show that an HMM system based on audio-concept features can perform competitively when compared with an MFCC-based system.
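To illustrate the second level described above, the following is a minimal sketch of per-event HMM scoring. It is not the authors' system: it assumes discrete observations (each audio frame already quantized to an audio-concept index), whereas the abstract describes continuous features, and every name and parameter value below (`forward_log_likelihood`, `classify_event`, the toy models) is hypothetical. The sketch trains nothing; it only shows how one HMM per event can score a sequence with the forward algorithm, with the best-scoring event taken as the prediction.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    # Safe logarithm: zero probabilities map to -inf instead of raising.
    return math.log(p) if p > 0 else NEG_INF


def _log_sum_exp(values):
    # Numerically stable log(sum(exp(v))) over a list of log-values.
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) via the forward algorithm in log space.

    obs:   list of observation indices (assumed: quantized audio concepts)
    start: start[s]       = P(first state is s)
    trans: trans[p][s]    = P(next state is s | current state is p)
    emit:  emit[s][o]     = P(observation o | state s)
    """
    n = len(start)
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            _log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][o])
            for s in range(n)
        ]
    return _log_sum_exp(alpha)


def classify_event(obs, models):
    """Pick the event whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Toy, hand-set models (illustrative values only, not from the paper).
# Observation 0 might stand for a "crowd noise" concept, 1 for "wheel noise".
MODELS = {
    "flash_mob": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.7, 0.3]]),
    "bike_trick": ([0.5, 0.5], [[0.8, 0.2], [0.3, 0.7]], [[0.2, 0.8], [0.1, 0.9]]),
}

predicted = classify_event([0, 0, 1, 0, 0], MODELS)
```

In a real system the model parameters would be estimated from training data (for example with Baum-Welch), and the emissions would be densities over continuous feature vectors instead of lookup tables.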

The placing task: A large-scale geo-estimation challenge for social-media videos and images

Contributors, Nov. 1977

IEEE Transactions on Microwave Theory and Techniques, 2000

An Inventory Evaluation Approach for Common Use Stock Range Units in British Columbia

Canadian Journal of Agricultural Economics/Revue canadienne d'agroeconomie, 1988

Lookapp for Ads–Content-based Advertising by Visual Concept Detection

DFKI and University of Kaiserslautern Participation at TRECVID 2010-Semantic Indexing Task

Run No. | Run ID | Run Description | infMAP (%)
Training on IACC data:
1 | F A DFKI-MADM 3 | SIFT visual words, Color Correlograms and Face-Detection separately trained, late fusion of SVM scores | 5.0
2 | F A DFKI-MADM 4 | SIFT visual words with SVMs | 4.4
Training on YouTube:
3 | F D DFKI-MADM 1 | SIFT visual words, Color Correlograms and Face-Detection separately trained, late fusion of SVM scores | 2.1
4 | F B DFKI-MADM 2 | SIFT visual words with SVMs | 1.3
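The run descriptions mention late fusion of SVM scores without saying how the scores are combined. As a hedged illustration only, the sketch below assumes the simplest common variant: each feature-specific classifier outputs one score per video shot on a comparable scale, and the fused score is a weighted average. The function name `late_fusion`, the feature keys and the score values are all hypothetical.

```python
def late_fusion(scores_by_feature, weights=None):
    """Fuse per-feature classifier scores by weighted averaging.

    scores_by_feature: dict mapping a feature name to a list of scores,
                       one per item, all lists the same length.
    weights:           optional dict of per-feature weights (default: equal).
    Returns a list with one fused score per item.
    """
    names = sorted(scores_by_feature)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    n_items = len(scores_by_feature[names[0]])
    fused = []
    for i in range(n_items):
        weighted_sum = sum(weights[name] * scores_by_feature[name][i] for name in names)
        fused.append(weighted_sum / total_weight)
    return fused


# Illustrative scores for two shots from three separately trained classifiers.
scores = {
    "sift_visual_words": [0.9, 0.2],
    "color_correlogram": [0.5, 0.4],
    "face_detection": [0.1, 0.6],
}
fused = late_fusion(scores)

# Rank shots by fused score, highest first, as a retrieval system would.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

Averaging presumes the individual scores are calibrated to a shared range; raw SVM decision values would typically need normalization (for example a sigmoid fit) before fusion.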

Video Copy Detection providing Localized Matches

Gesellschaft für Informatik (GI) publishes this series in order to make available to a broad public recent findings in informatics (i.e., computer science and information systems), to document conferences that are organized in co-operation with GI and to publish the annual GI Award dissertation.

Abstract: With the availability of large-scale online video platforms like YouTube, copyright infringement becomes a severe problem, such that the demand for robust copy detection systems is growing. Such a system must find multiple occurrences of copyright-protected material within video clips that are created, modified, remixed and uploaded by the user. A particular challenge is to find the exact position of a copy in a potentially huge reference database. For this purpose, this paper presents a Content Based Copy Detection system ...

Introduction to spread-spectrum communications

... Appendix A: Summary of Phase-Locked Theory. A-1 Introduction. A-2 Phase-locked loop models and characteristics of operation. A-2.1 ...

Participation at TRECVID 2011 semantic indexing & content-based copy detection tasks

Run ID Run Description infMAP (%)

Analysis and forecasting of trending topics in online media streams

Among the vast information available on the web, social media streams capture what people currently pay attention to and how they feel about certain topics. Awareness of such trending topics plays a crucial role in multimedia systems such as trend-aware recommendation and automatic vocabulary selection for video concept detection systems.

Sentibank: Large-scale ontology and classifiers for detecting sentiment and emotions in visual content
