Supervised multimedia categorization

Storage and Retrieval for Media Databases 2003, 2003

Static multimedia on the Web is already hard to structure manually. Although unavoidable and necessary, manual annotation of dynamic multimedia becomes even less feasible as multimedia rapidly grows in complexity, i.e. in volume, modality, and usage context. The latter context may be set by learning or other purposes of the multimedia material. This multimedia dynamics calls for categorisation systems that index, query, and retrieve multimedia objects on the fly, much as a human expert would. We present and demonstrate such a supervised dynamic multimedia object categorisation system. Our categorisation system is built by continuously gauging it against a group of human experts who annotate raw multimedia for a given domain ontology and usage context. In effect, our system learns the categorisation behaviour of human experts. By inducing supervised, multimodal, content- and context-dependent potentials, our categorisation system associates field strengths of raw dynamic multimedia object categorisations with those human experts would assign. After a sufficiently long period of supervised machine learning we arrive at automated, robust, and discriminative multimedia categorisation. We demonstrate the usefulness and effectiveness of our multimedia categorisation system in retrieving semantically meaningful soccer-video fragments, in particular by exploiting multimodal and domain-specific information and knowledge supplied by human experts.
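
The expert-gauging idea can be pictured as plain supervised learning over multimodal feature vectors. Below is a minimal sketch with invented toy data (feature names, labels, and values are all illustrative, not the paper's actual potentials): a classifier is "gauged" to expert annotations by learning per-category centroids, then assigning new fragments to the nearest centroid.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, expert_label) pairs."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def categorise(centroids, vec):
    """Assign the category whose centroid is closest in squared distance."""
    def dist(lab):
        return sum((a - b) ** 2 for a, b in zip(centroids[lab], vec))
    return min(centroids, key=dist)

# Toy multimodal features: (visual-motion, audio-energy), labelled by "experts".
expert_annotations = [
    ([0.9, 0.8], "goal"), ([0.8, 0.9], "goal"),
    ([0.1, 0.2], "midfield-play"), ([0.2, 0.1], "midfield-play"),
]
centroids = train_centroids(expert_annotations)
print(categorise(centroids, [0.85, 0.7]))  # → goal
```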

Semi-automatic Categorization of Videos on VideoLectures.net

Lecture Notes in Computer Science, 2009

Automatic or semi-automatic categorization of items (e.g. documents) into a taxonomy is an important and challenging machine-learning task. In this paper, we present a module for semi-automatic categorization of video-recorded lectures. Properly categorized lectures provide the user with a better browsing experience, making access to the desired content more efficient. Our categorizer combines information found in texts associated with lectures with information extracted from various links between lectures in a unified machine-learning framework. By taking not only the texts but also the links into account, classification accuracy is increased by 12-20%.
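
The text-plus-links combination can be sketched very simply (this is an illustration of the general idea, not the paper's actual framework; category names, weights, and term vectors are invented): a lecture's category score blends cosine similarity of its text with a vote from the categories of linked lectures.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse term-weight dicts."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text_vec, linked_categories, prototypes, alpha=0.5):
    """Blend the text cue (weight alpha) with link votes (weight 1-alpha)."""
    scores = {}
    for cat, proto in prototypes.items():
        link_vote = linked_categories.count(cat) / max(len(linked_categories), 1)
        scores[cat] = alpha * cosine(text_vec, proto) + (1 - alpha) * link_vote
    return max(scores, key=scores.get)

prototypes = {
    "machine-learning": {"svm": 1.0, "training": 1.0},
    "physics": {"quantum": 1.0, "energy": 1.0},
}
lecture = {"svm": 2.0, "energy": 0.5}
links = ["machine-learning", "machine-learning", "physics"]
print(classify(lecture, links, prototypes))  # → machine-learning
```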

Towards bridging the semantic gap in multimedia annotation and retrieval

2006

We present a systematic approach to the design of multimedia ontologies based on the MPEG-7 standard and domain-specific vocabularies. The limitations of MPEG-7 in describing the semantics of highly structured domains like sports or medicine have led to an upsurge of interest in adopting an integrated approach to the design of ontologies. We follow suit and use MPEG-7 to model structural and low-level aspects of multimedia documents. The high-level semantics are modeled using a domain-specific ontology designed for soccer games. The integration of these ontologies is achieved by providing appropriate links between the individual ontologies. As a proof of concept, we describe a video annotation tool implemented as a plugin for the widely used Protégé ontology editor. The advantage of our methodology lies in the fact that we use semantic web compliant languages and tools, which alleviates the interoperability issues currently plaguing the multimedia and semantic web communities.
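
The ontology-linking idea can be pictured as RDF-style triples connecting a structural, MPEG-7-level video segment to a concept in a domain-specific soccer ontology. The sketch below is purely illustrative: the namespace prefixes, property names, and identifiers are invented for the example and are not actual MPEG-7 or paper vocabulary.

```python
# Triples: (subject, predicate, object). The hypothetical soccer:depicts
# property is the "link" between the structural and domain ontologies.
triples = {
    ("mpeg7:VideoSegment_12", "mpeg7:mediaTimePoint", "00:31:05"),
    ("mpeg7:VideoSegment_12", "soccer:depicts", "soccer:Goal"),
    ("soccer:Goal", "rdfs:subClassOf", "soccer:Event"),
}

def concepts_for(segment, graph):
    """Follow the linking property from a structural segment to domain concepts."""
    return {o for s, p, o in graph if s == segment and p == "soccer:depicts"}

print(concepts_for("mpeg7:VideoSegment_12", triples))  # → {'soccer:Goal'}
```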

MediaTable: Interactive Categorization of Multimedia Collections

IEEE Computer Graphics and Applications, 2010

Many multimedia collections include only metadata such as date created and file size and remain largely unannotated. So, browsing them is cumbersome. Automatic content-analysis techniques yield metadata in the form of high-level content-based descriptors. However, these techniques' accuracy is insufficient to automate collection categorization. A human is essential to validate and organize automated techniques' results. MediaTable helps users efficiently categorize ...

Personalized retrieval of sports video

2007

There has been a growing demand for effective access to video information from media archives in recent years. Personalized video retrieval is one of the most challenging issues and has spurred significant interest in many research communities. In this paper, a novel approach is proposed to achieve personalized retrieval of sports video, which comprises two research tasks: semantic annotation of sports video and acquisition of the user's preference. For semantic annotation, a multi-modal framework is employed to detect sports events and index the sports video content. Web-casting text, as external information, is utilized to detect semantic events in sports videos. The semantic concepts and keywords included in the web-casting text are extracted to annotate and index the sports event segments automatically. For preference acquisition, relevance feedback is applied to model the user's preference and non-preference, and re-ranking is used to refine the results. First, the user is asked to label some video segments as desirable or undesirable. Then, we use these labels to infer the user's points of interest (e.g. the player, the event type, the team, etc.) by analysis of text keywords; low-level video features are also adopted as a supplement to reflect the user's preference. The overall new ranking of the results combines the user's high-level and low-level preferences. Experiments conducted on real-world soccer game videos show that the proposed method has an encouraging performance.
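
The relevance-feedback re-ranking step can be sketched as follows, under simplifying assumptions (field names, weights, and the toy feature distance are invented for illustration): keywords from segments the user labelled desirable or undesirable yield a high-level preference score, a low-level feature distance supplements it, and the combination re-ranks the result list.

```python
def preference_score(segment, liked_keywords, disliked_keywords):
    """High-level cue: liked keyword matches minus disliked matches."""
    kws = set(segment["keywords"])
    return len(kws & liked_keywords) - len(kws & disliked_keywords)

def feature_score(segment, liked_feature):
    """Low-level cue: negative squared distance to the liked feature vector."""
    return -sum((a - b) ** 2 for a, b in zip(segment["feature"], liked_feature))

def rerank(segments, liked_kws, disliked_kws, liked_feat, w=0.8):
    """Combine high-level (weight w) and low-level (weight 1-w) preferences."""
    def score(s):
        return (w * preference_score(s, liked_kws, disliked_kws)
                + (1 - w) * feature_score(s, liked_feat))
    return sorted(segments, key=score, reverse=True)

segments = [
    {"id": "clip-1", "keywords": ["goal", "TeamA"], "feature": [0.9, 0.8]},
    {"id": "clip-2", "keywords": ["foul", "TeamB"], "feature": [0.2, 0.3]},
]
ranked = rerank(segments, {"goal"}, {"foul"}, [0.9, 0.8])
print([s["id"] for s in ranked])  # → ['clip-1', 'clip-2']
```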

Semantic indexing of multimedia content using textual and visual information

International Journal of Advanced Media and Communication, 2014

We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
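
The late-fusion scheme can be sketched minimally (the weights and scores below are invented for illustration; a weighted average stands in for the paper's Bayesian-network/SVM fusion): each unimodal detector outputs a confidence for a concept, and their combination produces the final multimodal score.

```python
def late_fusion(scores, weights):
    """Weighted average of unimodal detector confidences, keyed by modality."""
    total = sum(weights.values())
    return sum(weights[m] * scores[m] for m in scores) / total

# Hypothetical per-modality confidences for one concept in one shot.
unimodal = {"audio": 0.4, "visual": 0.9, "text": 0.7}
weights = {"audio": 1.0, "visual": 2.0, "text": 1.0}
fused = late_fusion(unimodal, weights)
print(round(fused, 3))  # → 0.725
```

Here the visually dominant cue pulls the fused score above the audio detector's weak confidence, which is the intuition behind fusing modalities rather than trusting any single detector.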

Semantic representation of multimedia content: Knowledge representation and semantic indexing

Multimedia Tools and Applications, 2008

In this paper we present a framework for unified, personalized access to heterogeneous multimedia content in distributed repositories. Focusing on semantic analysis of multimedia documents, metadata, user queries, and user profiles, it contributes to bridging the gap between the semantic nature of user queries and raw multimedia documents. The proposed approach utilizes visual content analysis results as input, as well as analyzing and exploiting associated textual annotation, in order to extract the underlying semantics, construct a semantic index, and classify documents into topics, based on a unified knowledge and semantics representation model. It may then accept user queries, and, carrying out semantic interpretation and expansion, retrieve documents from the index and rank them according to user preferences, similarly to text retrieval. All processes are based on a novel semantic processing methodology employing fuzzy algebra and principles of taxonomic knowledge representation. The first part of this work, presented in this paper, deals with data and knowledge models, manipulation of multimedia content annotations, and semantic indexing, while the second part continues with the use of the extracted semantic information for personalized retrieval.
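
The semantic-expansion-and-ranking step can be sketched in the spirit of the fuzzy, taxonomy-based indexing described above (the toy taxonomy, membership degrees, and min-combination rule below are illustrative assumptions, not the paper's actual algebra): a query concept is expanded along subclass links with a fuzzy degree, and documents are ranked by their best-matching index term.

```python
# Hypothetical taxonomy: concept -> [(broader concept, membership degree)].
taxonomy = {"goal": [("soccer-event", 0.9)], "soccer-event": [("sport", 0.7)]}

def expand(concept, degree=1.0, out=None):
    """Fuzzy expansion: propagate degrees upward, min-combined along the path."""
    out = {} if out is None else out
    if degree > out.get(concept, 0.0):
        out[concept] = degree
        for broader, d in taxonomy.get(concept, []):
            expand(broader, min(degree, d), out)
    return out

def rank(docs, query_concept):
    """Score each document by the max degree over its matching index terms."""
    exp = expand(query_concept)
    def score(doc):
        return max((exp.get(t, 0.0) for t in doc["terms"]), default=0.0)
    return sorted(docs, key=score, reverse=True)

docs = [{"id": "d1", "terms": ["sport"]}, {"id": "d2", "terms": ["goal"]}]
print([d["id"] for d in rank(docs, "goal")])  # → ['d2', 'd1']
```

An exact match ("goal", degree 1.0) outranks a document indexed only under the broader "sport" concept (degree min(0.9, 0.7) = 0.7), so taxonomic expansion broadens recall without flattening the ranking.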