Fully Automated Learning for Application-Specific Web Video Classification
Related papers
Boosting web video categorization with contextual information from social web
World Wide Web, 2012
Web video categorization is a fundamental task for web video search. In this paper, we explore web video categorization from a new perspective, integrating model-based and data-driven approaches to boost performance. The boost comes from two aspects. The first is improved text classification through query expansion from related videos and user videos: the model-based classifiers are built on text features extracted from titles and tags, while related videos and user videos act as external resources that compensate for these limited and noisy text features. The second improvement derives from integrating model-based classification with data-driven majority voting over related videos and user videos, which are treated as voting sources from the perspectives of video relevance and user interest, respectively. Semantic meaning from text, video relevance from related videos, and user interest induced from user videos are combined to robustly determine the video category, and this combination of semantics, relevance, and interest further improves the performance of web video categorization. Experiments on YouTube videos demonstrate the significant improvement of the proposed approach over traditional text-based classifiers.
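The fusion described above lends itself to a compact sketch: a text classifier's category probabilities are blended with majority votes from related videos and the uploader's other videos. This is a minimal illustration assuming a simple weighted combination; the function name, weights, and toy inputs are invented, not the paper's exact formulation.

```python
# Hypothetical sketch: blend a text classifier's category probabilities with
# majority votes from related videos (relevance) and the uploader's other
# videos (user interest). Weights and names are assumptions for illustration.
from collections import Counter

def fuse_prediction(text_probs, related_cats, user_cats, w=(0.5, 0.3, 0.2)):
    """text_probs: {category: probability} from the model-based classifier;
    related_cats / user_cats: category labels of related and user videos."""
    rel_votes, usr_votes = Counter(related_cats), Counter(user_cats)
    scores = {}
    for cat, p in text_probs.items():
        rel = rel_votes[cat] / max(len(related_cats), 1)
        usr = usr_votes[cat] / max(len(user_cats), 1)
        scores[cat] = w[0] * p + w[1] * rel + w[2] * usr
    return max(scores, key=scores.get)

# the data-driven votes can overturn a weak text-only decision
print(fuse_prediction({"Music": 0.4, "Comedy": 0.6},
                      related_cats=["Music", "Music", "Comedy"],
                      user_cats=["Music"]))   # -> Music
```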
Metadata Based Classification and Analysis of Large Scale Web Videos
With the astonishing growth of videos on Internet platforms such as YouTube, Yahoo Screen, and Facebook, organizing videos into categories is of paramount importance for improving user experience and website utilization. In this information age, video content is rapidly shared through social media websites, and different categories of web video are used by billions of users all over the world. Classifying and partitioning web videos by attributes such as video length, ratings, video age, and number of comments, and analyzing web video as unstructured complex data, is a challenging task. In this work we propose an effective classification model that assigns each web video to a category (e.g. 'Entertainment', 'People and Blogs', 'Sports', 'News and Politics', 'Science and Technology') using other web metadata attributes as splitting criteria. An attempt is made to extract metadata from web videos; based on the extracted metadata, web videos are classified into different categories by applying data mining classification algorithms, namely the Random Tree and J48 classification models. The classification results are compared and analyzed using cost/benefit analysis. The results also demonstrate that classification of web videos depends largely on the available metadata and on the accuracy of the classification model. Classification of web videos is an important task with many applications in video search and information retrieval; however, collecting the metadata required for a classification model may be prohibitively expensive, and experimental difficulties arise from large data diversity within a category and the poor quality of web video metadata.
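As a rough illustration of this metadata-only pipeline, the sketch below trains a decision tree on attributes like those the paper names (duration, rating, video age, comment count). scikit-learn's DecisionTreeClassifier stands in for Weka's J48/Random Tree, and all rows are invented.

```python
# Sketch of metadata-only categorization: a decision tree over duration,
# rating, video age, and comment count. DecisionTreeClassifier is a stand-in
# for Weka's J48/Random Tree; all data below is invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# columns: duration_sec, rating, video_age_days, num_comments
X = [[212, 4.5, 900, 1500],
     [3600, 3.9, 120, 40],
     [95, 4.8, 30, 5200],
     [1400, 4.1, 600, 310]]
y = ["Music", "Education", "Entertainment", "News and Politics"]

clf = DecisionTreeClassifier(criterion="entropy").fit(X, y)  # C4.5-like splits
print(clf.predict([[180, 4.6, 45, 4000]]))
```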
Semi-automatic Categorization of Videos on VideoLectures.net
Lecture Notes in Computer Science, 2009
Automatic or semi-automatic categorization of items (e.g. documents) into a taxonomy is an important and challenging machine-learning task. In this paper, we present a module for semi-automatic categorization of video-recorded lectures. Properly categorized lectures provide the user with a better browsing experience, making it more efficient to access the desired content. Our categorizer combines information found in the texts associated with lectures with information extracted from various links between lectures in a unified machine-learning framework. By taking not only the texts but also the links into account, classification accuracy is increased by 12-20%.
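One plausible reading of the text-plus-links combination is simple feature concatenation: TF-IDF text features joined with link-derived features such as the category counts of linked lectures. The sketch below assumes that design; the paper's actual framework may differ.

```python
# Hedged sketch: concatenate TF-IDF features of lecture texts with
# link-derived features (here, per-category counts over linked lectures).
# The feature design is an assumption, not the paper's exact method.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

texts = ["introduction to machine learning", "bayesian networks lecture",
         "european history overview"]
# one row per lecture: votes from linked lectures for ["CS", "History"]
link_votes = np.array([[3, 0], [2, 0], [0, 4]])
labels = ["CS", "CS", "History"]

X = hstack([TfidfVectorizer().fit_transform(texts), csr_matrix(link_votes)])
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```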
Know your data: understanding implicit usage versus explicit action in video content classification
Multimedia on Mobile Devices 2011; and Multimedia Content Access: Algorithms and Systems V, 2011
In this paper, we present a method for video category classification using only social metadata from websites like YouTube. In place of content analysis, we utilize the communicative and social contexts surrounding videos as a means to determine a categorical genre, e.g. Comedy or Music. We hypothesize that video clips belonging to different genre categories have distinct signatures and patterns that are reflected in their collected metadata. In particular, we define and describe social metadata as usage or action to aid in classification. We trained a Naive Bayes classifier to predict categories from a sample of 1,740 YouTube videos representing the top five genre categories. Using just a small number of the available metadata features, we compare the classifications produced by our Naive Bayes classifier with those provided by the uploader of each video. Compared to random predictions on the YouTube data (21% accurate), our classifier attained a mediocre 33% accuracy in predicting video genres. However, we found that the accuracy of the classifier improves significantly with nominal factoring of the explicit data features: by factoring the ratings of the videos in the dataset, the classifier was able to accurately predict the genres of 75% of the videos. We argue that the patterns of social activity found in the metadata are not just meaningful in their own right, but are indicative of the meaning of the shared video content. The results presented by this project represent a first step in investigating the potential meaning and significance of social metadata and its relation to the media experience.
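The "nominal factoring" step that lifts accuracy from 33% to 75% amounts to discretizing a continuous feature into categorical levels before the Naive Bayes model sees it. A minimal sketch, assuming invented bin edges and a made-up second feature:

```python
# Minimal sketch of nominal factoring: bin continuous ratings into ordinal
# levels, then fit a categorical Naive Bayes model. Bin edges, the second
# feature, and all data are assumptions for illustration.
import numpy as np
from sklearn.naive_bayes import CategoricalNB

ratings = np.array([4.9, 3.2, 4.5, 2.1, 4.8, 3.7])
binned = np.digitize(ratings, bins=[3.0, 4.5])  # 0 = low, 1 = mid, 2 = high
views_level = np.array([2, 0, 1, 0, 2, 1])      # invented usage feature
X = np.column_stack([binned, views_level])
y = ["Music", "Howto", "Music", "Howto", "Music", "Comedy"]

clf = CategoricalNB().fit(X, y)
print(clf.predict([[2, 2]]))  # high rating, high usage
```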
Enhancing multi-class web video categorization model using machine and deep learning approaches
International Journal of Electrical and Computer Engineering (IJECE), 2022
With today's digital revolution, many people communicate and collaborate in cyberspace. Users rely on social media platforms, such as Facebook, YouTube, and Twitter, all of which exert a considerable impact on human lives. In particular, watching videos has become preferable to simply browsing the internet for many reasons. However, difficulties arise when searching accurately for specific videos within the same domains, such as entertainment, politics, education, video, and TV shows. This problem can be addressed through web video categorization (WVC) approaches that utilize video textual information, visual features, or audio. However, retrieving videos with similar content at high accuracy is challenging. Therefore, this paper proposes a novel model for enhancing WVC based on user comments and weighted features from video descriptions. Specifically, the model uses supervised learning with machine learning classifiers (MLCs) and deep learning (DL) models. Two experiments are conducted on the proposed balanced dataset using the two proposed algorithms over multiple classes, namely education, politics, health, and sports. The model achieves high accuracy rates of 97% and 99% using MLCs and DL models based on an artificial neural network (ANN) and long short-term memory (LSTM), respectively.
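As a hedged sketch of the deep-learning branch, the model below runs an LSTM over tokenized comment/description text for four-way classification. Vocabulary size, sequence length, and layer sizes are assumptions, not the paper's configuration.

```python
# Hedged sketch of an LSTM text classifier for the four assumed classes
# (education, politics, health, sports). Architecture details are invented.
import numpy as np
import tensorflow as tf

vocab_size, seq_len, n_classes = 5000, 50, 4
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# toy batch: random token ids stand in for preprocessed comments/descriptions
X = np.random.randint(1, vocab_size, size=(8, seq_len))
y = np.random.randint(0, n_classes, size=(8,))
model.fit(X, y, epochs=1, verbose=0)
print(model.predict(X[:1], verbose=0).round(2))
```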
2020
Every day, people around the world upload 1.2 million videos to YouTube, or more than 100 hours of video per minute, and this number is increasing. This continuous stream of data is useless if it is not utilized. To dig information out of large-scale data, a technique called data mining can be a solution. One of the techniques in data mining is classification. For many YouTube users, the video titles returned by a search do not match the desired video category. Therefore, this research classifies YouTube data based on its search text. This article focuses on comparing three algorithms for classifying YouTube data into the Kesenian (Arts) and Sains (Science) categories. Data collection in this study uses scraping techniques applied to the YouTube website, gathering links, titles, descriptions, and searches. The method used in this research is experimental, covering data collection, data processing, the proposed models, testing, and model evaluation. The models applied...
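The excerpt cuts off before naming the three algorithms, so the comparison below is a sketch with three common text classifiers as stand-ins, over TF-IDF features of scraped titles; the toy documents are invented.

```python
# Sketch of a three-way classifier comparison over scraped YouTube text.
# The actual algorithms are not named in the excerpt; these three are stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

docs = ["tari tradisional jawa", "eksperimen kimia sederhana",
        "pameran lukisan modern", "fisika gerak parabola"]
labels = ["Kesenian", "Sains", "Kesenian", "Sains"]

X = TfidfVectorizer().fit_transform(docs)
for clf in (MultinomialNB(), LinearSVC(), KNeighborsClassifier(n_neighbors=1)):
    clf.fit(X, labels)
    print(type(clf).__name__, clf.score(X, labels))  # training accuracy only
```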
Supervised multimedia categorization
Storage and Retrieval for Media Databases 2003, 2003
Static multimedia on the Web can already hardly be structured manually. Although unavoidable and necessary, manual annotation of dynamic multimedia becomes even less feasible as multimedia quickly grows in complexity, i.e. in volume, modality, and usage context. The latter context could be set by learning or other purposes of the multimedia material. This multimedia dynamics calls for categorisation systems that index, query and retrieve multimedia objects on the fly, in a similar way to a human expert. We present and demonstrate such a supervised dynamic multimedia object categorisation system. Our categorisation system comes about by continuously gauging it against a group of human experts who annotate raw multimedia for a certain domain ontology given a usage context; thus, effectively, our system learns the categorisation behaviour of human experts. By inducing supervised multi-modal content- and context-dependent potentials, our categorisation system associates field strengths of raw dynamic multimedia object categorisations with those human experts would assign. After a sufficiently long period of supervised machine learning, we arrive at automated, robust and discriminative multimedia categorisation. We demonstrate the usefulness and effectiveness of our multimedia categorisation system in retrieving semantically meaningful soccer-video fragments, in particular by taking advantage of multimodal and domain-specific information and knowledge supplied by human experts.
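The "potentials" and "field strengths" above are abstract; one simplified reading is weighted late fusion of per-modality category scores, calibrated so the fused decision tracks expert annotations. This sketch assumes that reading and invents all names and numbers.

```python
# Simplified late-fusion reading of the paper's multimodal "potentials":
# combine per-modality category scores into one decision. Weights would be
# learned from agreement with expert annotations; here they are assumed.
def fuse_modalities(modality_scores, weights):
    """modality_scores: one {category: score} dict per modality."""
    cats = modality_scores[0].keys()
    return max(cats, key=lambda c: sum(w * s[c]
                                       for w, s in zip(weights, modality_scores)))

visual = {"goal": 0.7, "foul": 0.2, "other": 0.1}
audio  = {"goal": 0.6, "foul": 0.3, "other": 0.1}  # e.g. crowd-noise spike
text   = {"goal": 0.5, "foul": 0.4, "other": 0.1}  # e.g. commentary keywords
print(fuse_modalities([visual, audio, text], weights=[0.5, 0.3, 0.2]))  # goal
```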
A System That Learns to Tag Videos by Watching Youtube
Lecture Notes in Computer Science, 2008
We present a system that automatically tags videos, i.e. detects high-level semantic concepts, such as objects or actions, in them. To do so, our system does not rely on datasets manually annotated for research purposes. Instead, we propose to use videos from online portals like youtube.com as a novel source of training data, with the tags provided by users during upload serving as ground-truth annotations. This allows our system to learn autonomously by automatically downloading its training set.
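The core of that idea, treating uploader tags as noisy labels, fits in a few lines. The sketch below is hypothetical scaffolding: fetch_videos is an assumed download helper, not an API the paper names.

```python
# Sketch of weakly supervised dataset construction: the uploader's tag is the
# label, so no manual annotation is needed. fetch_videos is a hypothetical
# helper (tag, n) -> list of downloaded clips, not an actual named API.
def build_training_set(concept_tags, fetch_videos, per_tag=100):
    dataset = []
    for tag in concept_tags:
        for clip in fetch_videos(tag, per_tag):
            dataset.append((clip, tag))  # tag serves as a noisy ground truth
    return dataset
```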