Sports Video Analysis: Semantics Extraction, Editorial Content Creation and Adaptation

Multi-level Semantic Analysis for Sports Video

Lecture Notes in Computer Science, 2005

Video has seen a huge increase in use as one of the most preferred types of media, owing to its content richness, in many significant applications including sports. To sustain the ongoing rapid growth of sports video, there is an emerging demand for sophisticated content-based indexing systems. Users recall video content as high-level abstractions, whereas video is generally stored as an arbitrary sequence of audio-visual tracks. To bridge this gap, this paper demonstrates the use of domain knowledge and domain characteristics to design the extraction of high-level concepts directly from audio-visual features. In particular, we propose a multi-level semantic analysis framework that optimizes the sharing of domain characteristics.
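
To make the layering concrete, here is a minimal Python sketch of the idea of sharing mid-level semantics across sports; all feature names, thresholds, and rules are illustrative assumptions, not the paper's actual framework:

```python
from dataclasses import dataclass

# Hypothetical low-level features for one video segment; the names are
# illustrative, not taken from the paper.
@dataclass
class SegmentFeatures:
    crowd_noise: float      # normalized audio energy in the crowd band
    motion: float           # average motion-vector magnitude
    field_ratio: float      # fraction of frame covered by play-field color

# Mid-level semantic labels that many sports share; thresholds are placeholders.
def mid_level_labels(f: SegmentFeatures) -> set[str]:
    labels = set()
    if f.crowd_noise > 0.7:
        labels.add("excited_audience")
    if f.motion > 0.6:
        labels.add("fast_play")
    if f.field_ratio < 0.3:
        labels.add("close_up")
    return labels

# Sport-specific high-level concepts reuse the shared mid-level layer;
# only this top rule set changes per sport.
SOCCER_RULES = {
    "goal_candidate": {"excited_audience", "close_up"},
    "attack": {"fast_play"},
}

def high_level_concepts(f: SegmentFeatures, rules=SOCCER_RULES) -> list[str]:
    labels = mid_level_labels(f)
    return [concept for concept, required in rules.items() if required <= labels]

print(high_level_concepts(SegmentFeatures(0.9, 0.4, 0.2)))  # ['goal_candidate']
```

Swapping in a different top-level rule table (say, for tennis or basketball) leaves the shared mid-level layer untouched, which is one reading of "optimizing the sharing of domain characteristics".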

Semantic indexing of sports program sequences by audio-visual analysis

Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429), 2003

Semantic indexing of sports videos is a subject of great interest to researchers working on multimedia content characterization. Sports programs appeal to large audiences, and their efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper, we propose a semantic indexing algorithm for soccer programs which uses both audio and visual information for content characterization. The video signal is processed first by extracting low-level visual descriptors from the MPEG compressed bit-stream. The temporal evolution of these descriptors during a semantic event is assumed to be governed by a controlled Markov chain. This makes it possible to determine, based on the maximum likelihood criterion, a list of video segments where a semantic event of interest is likely to be found. The audio information is then used to refine the results of the video classification procedure by ranking the candidate video segments so that the segments associated with the event of interest appear in the very first positions of the ordered list. The proposed method is applied to goal detection. Experimental results show the effectiveness of the proposed cross-modal approach.
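
A rough Python sketch of this two-pass idea, with a toy transition matrix and made-up quantized states standing in for the paper's trained controlled Markov chain:

```python
import math

# Toy transition model for a "goal" event over three quantized visual
# states (0: long shot, 1: close-up, 2: replay). The numbers are
# illustrative, not the paper's trained parameters.
TRANS = [[0.5, 0.4, 0.1],
         [0.2, 0.5, 0.3],
         [0.1, 0.3, 0.6]]
INIT = [0.6, 0.3, 0.1]

def log_likelihood(states: list[int]) -> float:
    """Log-probability of a state sequence under the event model."""
    ll = math.log(INIT[states[0]])
    for prev, cur in zip(states, states[1:]):
        ll += math.log(TRANS[prev][cur])
    return ll

def rank_segments(segments, audio_energy):
    """Shortlist segments by the ML criterion, then re-rank by audio.

    segments: dict segment_id -> quantized visual state sequence
    audio_energy: dict segment_id -> loudness score (crowd/commentator)
    """
    # Visual pass: maximum-likelihood shortlist.
    scored = sorted(segments, key=lambda s: log_likelihood(segments[s]),
                    reverse=True)
    shortlist = scored[:3]
    # Audio pass: push loud segments to the front of the ordered list.
    return sorted(shortlist, key=lambda s: audio_energy[s], reverse=True)

segs = {"a": [0, 1, 1, 2], "b": [2, 0, 0, 0], "c": [0, 1, 2, 2]}
audio = {"a": 0.4, "b": 0.9, "c": 0.8}
print(rank_segments(segs, audio))
```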

Intermodal collaboration: a strategy for semantic content analysis for broadcasted sports video

2003

This paper presents intermodal collaboration: a strategy for semantic content analysis of broadcast sports video. Broadcast video can be viewed as a set of multimodal streams, such as visual, auditory, text (closed caption) and graphics streams. Collaborative analysis of the multimodal streams is performed based on the temporal dependency between these streams, in order to improve the reliability and efficiency of semantic content analysis tasks such as extracting highlight scenes from sports video and automatically generating annotations for specific scenes. Two case studies experimentally confirm the effectiveness of intermodal collaboration.
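
A minimal sketch of such temporal-dependency fusion, assuming a simple proximity window; the window size and event names are ours, not the paper's:

```python
# Illustrative fusion by temporal proximity: a visual highlight
# candidate is kept only if a supporting caption keyword occurs
# within a tolerance window. Timestamps are in seconds; the window
# size is a placeholder.
WINDOW = 10.0

def fuse(visual_hits: list[float], caption_hits: list[float]) -> list[float]:
    confirmed = []
    for t in visual_hits:
        if any(abs(t - c) <= WINDOW for c in caption_hits):
            confirmed.append(t)
    return confirmed

# The visual stream flags excitement at 120 s and 300 s; the caption
# stream mentions "GOAL" at 125 s only, so only the first survives.
print(fuse([120.0, 300.0], [125.0]))   # [120.0]
```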

Automatic annotation of sport video content

2005

Automatic semantic annotation of video streams makes it possible to extract significant clips for archiving and retrieval of video content. In this paper, we present a system that performs automatic annotation of soccer videos, detecting principal highlights and recognizing the identity of players. Highlight detection is carried out by means of finite state machines that encode domain knowledge, while player identification is based on face detection and on the analysis of contextual information such as jersey numbers and superimposed text captions. Results obtained on actual soccer videos show overall highlight detection rates of about 90%. Lower, but still promising, accuracy is achieved on the very difficult player identification task.
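
As an illustration of encoding domain knowledge in a finite state machine, here is a toy Python version; the states, observations, and transition topology are invented for the sketch and are not the paper's actual machine:

```python
from enum import Enum, auto

class State(Enum):
    PLAY = auto()
    EXCITED = auto()
    REPLAY = auto()
    GOAL = auto()       # accepting state

# Shot-class observations drive the transitions; this topology is a
# simplified guess at the kind of domain knowledge such an FSM encodes.
TRANSITIONS = {
    (State.PLAY, "crowd_cheer"): State.EXCITED,
    (State.EXCITED, "close_up"): State.EXCITED,
    (State.EXCITED, "replay_logo"): State.REPLAY,
    (State.REPLAY, "long_shot"): State.GOAL,
}

def detect_goal(observations: list[str]) -> bool:
    state = State.PLAY
    for obs in observations:
        state = TRANSITIONS.get((state, obs), state)  # stay put otherwise
        if state is State.GOAL:
            return True
    return False

print(detect_goal(["long_shot", "crowd_cheer", "close_up",
                   "replay_logo", "long_shot"]))   # True
```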

Features extraction for soccer video semantic analysis: current achievements and remaining issues

Artificial Intelligence Review, 2014

This paper presents a state-of-the-art review of feature extraction for soccer video summarization research. All existing approaches to event detection, video summarization based on the video stream, and the application of text sources to event detection are surveyed. Regarding the current challenges for automatic and real-time provision of summary videos, different computer vision approaches are discussed and compared. Audio and video feature extraction methods, and their combination with textual methods, are investigated. Available commercial products are presented to better clarify the boundaries of this domain, and future directions for the improvement of existing systems are suggested.

Semantic annotation of sports videos

2002

Taking into consideration the unique qualities of sports videos, we propose a system that semantically annotates them at different layers of semantic significance, using different elements of visual content. We decompose each shot into its visual and graphic content elements and, by combining several different low-level visual primitives, capture the semantic content at a higher level of significance.

A novel framework for semantic annotation of soccer sports video sequences

2008

A novel framework is presented for semantic labeling of video clips, automatically segmented from broadcast video of soccer (football) games, as highlights, excitement clips, etc. The proposed framework provides a generalizable method for linking low-level video features with high-level semantic concepts defined in a commonly understood sports lexicon. Three important contributions are made to automatic annotation of sports video, as follows. First, domain knowledge combined with an event-lexicon and a four-level hierarchical classifier based on low-level video features is used to label video segments. Second, a priori event mining is used to establish probabilistic event-associations that are used to assign a concept-lexicon, such as goals and saves, to each highlight video segment. And, finally, the collection of highlight video clips is summarized using concept- and event-lexicons to facilitate highlight browsing, video skimming, indexing and retrieval.

1 Introduction

Manual annotation is both impractical and very expensive, due to the vast amount of data generated at a rapid rate by such videos. However, automatic annotation is a very demanding and extremely challenging computer vision task, as it involves high-level scene interpretation. In [ ], the authors presented a web-casting text based annotation scheme. In [1], the authors proposed Finite State Machine based annotation of soccer video. Barnard et al. [3] proposed an HMM based framework to fuse audio and video features to recognize play and break scenes in soccer video sequences. Li et al. proposed a rule based algorithm using low-level audio/video features for football video summarization. Babaguchi et al. proposed event detection by recognizing the textual overlays in football video. There have been many successful works in soccer video analysis, as mentioned above. But most of these works fail to respond to action-based queries, such as "extract the goal clips out of this soccer sequence", "extract the saves from this soccer video", "extract goals scored by team-A", or "extract all the red card events from the collection of FIFA 2006 World Cup matches". At a higher level, the user may ask specific queries such as "extract the replay segment from the goal clip".
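
To illustrate the a priori event-mining step, here is a small Python sketch that estimates event-to-concept associations from labeled clips; the data and confidence threshold are invented, and this is one plausible reading of the technique rather than the authors' implementation:

```python
from collections import Counter
from itertools import combinations

# Each annotated highlight is a set of detected events plus a known
# concept label; the data below is made up for illustration.
CLIPS = [
    ({"crowd_cheer", "replay", "goal_mouth"}, "goal"),
    ({"crowd_cheer", "replay", "goal_mouth"}, "goal"),
    ({"crowd_cheer", "goal_mouth"}, "save"),
    ({"replay", "free_kick"}, "save"),
]

def mine_associations(clips, min_conf=0.6):
    """Estimate P(concept | event subset) from labeled clips."""
    pair_counts, set_counts = Counter(), Counter()
    for events, concept in clips:
        for r in (1, 2):   # single events and event pairs
            for subset in combinations(sorted(events), r):
                set_counts[subset] += 1
                pair_counts[(subset, concept)] += 1
    rules = {}
    for (subset, concept), n in pair_counts.items():
        confidence = n / set_counts[subset]
        if confidence >= min_conf:
            rules[(subset, concept)] = confidence
    return rules

for (events, concept), conf in sorted(mine_associations(CLIPS).items(),
                                      key=lambda kv: -kv[1]):
    print(f"{set(events)} -> {concept}  (conf={conf:.2f})")
```

The mined rules can then assign a concept label such as "goal" to a new highlight segment whose detected events match a high-confidence antecedent.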

A unified approach to the generation of semantic cues for sports video annotation

Signal Processing, 2005

The use of video and audio features for automated annotation of audio-visual data is becoming widespread. A major limitation of many current methods is that the stored indexing features are too low-level: they relate directly to properties of the data. In this work we apply a further stage of processing that associates the feature measurements with real-world objects or events. The outputs, which we call “cues”, are combined to enable us to directly compute the probability of the object being present in the scene. An additional advantage of this approach is that the cues from different types of features are presented in a homogeneous way.
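
A minimal sketch of combining such cues homogeneously into an event probability, here via a naive-Bayes log-odds update; the cue names, probabilities, and independence assumption are ours, not necessarily the paper's exact combination rule:

```python
import math

# Each cue reports (P(cue fires | event present), P(cue fires | event
# absent)); the numbers are invented for illustration.
CUES = {
    "whistle":    (0.8, 0.10),
    "crowd_roar": (0.9, 0.30),
    "scoreboard": (0.7, 0.05),
}
PRIOR = 0.05   # prior probability that the event is present

def posterior(fired: set[str]) -> float:
    """P(event | observed cues), treating every cue the same way."""
    log_odds = math.log(PRIOR / (1 - PRIOR))
    for name, (p_hit, p_false) in CUES.items():
        if name in fired:
            log_odds += math.log(p_hit / p_false)
        else:
            log_odds += math.log((1 - p_hit) / (1 - p_false))
    return 1 / (1 + math.exp(-log_odds))

print(f"{posterior({'whistle', 'crowd_roar'}):.3f}")
```

Because every cue enters through the same log-odds update, audio-derived and video-derived cues are combined in a homogeneous way, which is the property the abstract highlights.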

Multi-modal summarization of key events and top players in sports tournament videos

2011 IEEE Workshop on Applications of Computer Vision (WACV), 2011

To detect and annotate the key events of live sports videos, we need to tackle the semantic gap in audio-visual information. Previous work has successfully extracted semantics from time-stamped web match reports, which are synchronized with the video contents. However, web and social media articles without time-stamps have not been fully leveraged, even though they are increasingly used to complement the coverage of major sporting tournaments. This paper aims to address this limitation with a novel multimodal summarization framework based on sentiment analysis and players' popularity. It uses audio-visual content, web articles, blogs, and commentators' speech to automatically annotate and visualize the key events and key players in a sports tournament coverage. The experimental results demonstrate that the automatically generated video summaries are aligned with the events identified from the official website match reports.
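
One way such sentiment- and popularity-weighted event selection could look, as a toy Python sketch; the scoring weights and data are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Event:
    minute: int
    player: str
    mentions: int          # how often web/blog articles mention it
    sentiment: float       # mean sentiment intensity in [-1, 1]

# Popularity proxy: share of tournament coverage mentioning the player.
# Both the weighting scheme and the numbers are illustrative.
POPULARITY = {"Player A": 0.8, "Player B": 0.3}

def score(e: Event, w_sent=0.6, w_pop=0.4) -> float:
    return w_sent * abs(e.sentiment) * e.mentions + w_pop * POPULARITY[e.player]

def summarize(events: list[Event], k: int = 2) -> list[Event]:
    """Pick the k highest-scoring events for the video summary."""
    return sorted(events, key=score, reverse=True)[:k]

events = [Event(23, "Player A", mentions=14, sentiment=0.9),
          Event(51, "Player B", mentions=3, sentiment=-0.7),
          Event(88, "Player A", mentions=20, sentiment=0.95)]
for e in summarize(events):
    print(e.minute, e.player)
```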