ADNet: A Deep Network for Detecting Adverts

Exploring Online Ad Images Using a Deep Convolutional Neural Network Approach

The 3rd IEEE International Conference on Smart Data (SmartData), 2017

Online advertising is a huge, rapidly growing market in today's world. One common form of online advertising is image ads. A decision is made (often in real time) every time a user sees an ad, and the advertiser is eager to determine the best ad to display. Consequently, many algorithms have been developed to calculate the optimal ad to show to the current user at the present time. Typically, these algorithms focus on variations of the ad, optimizing among different properties such as background color, image size, or set of images. However, there is a more fundamental layer. Our study looks at new qualities of ads that can be determined before an ad is shown (rather than through online optimization) and identifies which ads are most likely to be successful. We present a set of novel algorithms that utilize deep-learning image processing, machine learning, and graph theory to investigate online advertising and to construct prediction models that can foresee an image ad's success. We evaluated our algorithms on a dataset with over 260,000 ad images, as well as a smaller dataset specifically related to the automotive industry, and we succeeded in constructing regression models for ad image click rate prediction. The obtained results emphasize the great potential of using deep-learning algorithms to effectively and efficiently analyze image ads and to create better and more innovative online ads. Moreover, the algorithms presented in this paper can help predict ad success and can be applied to analyze other large-scale image corpora.

AD or Non-AD: A Deep Learning Approach to Detect Advertisements from Magazines

Entropy, 2018

Processing and analyzing multimedia data has become a popular research topic due to the evolution of deep learning. Deep learning has played an important role in addressing many challenging problems, such as computer vision, image recognition, and image detection, which can be useful in many real-world applications. In this study, we analyzed visual features of images to detect advertising images from scanned images of various magazines. The aim is to identify key features of advertising images and to apply them to real-world applications. The proposed work will eventually help improve marketing strategies, which require the classification of advertising images from magazines. We employed convolutional neural networks to classify scanned images as either advertisements or non-advertisements (i.e., articles). The results show that the proposed approach outperforms other classifiers and related work in terms of accuracy.

Automatic Understanding of Image and Video Advertisements

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer ("What should I do according to this ad, and why should I do it?"), and symbolic references ads make (e.g. a dove symbolizes peace). We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies. We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads.

Deep Learning Based Modeling in Computational Advertising: A Winning Formula

Industrial Engineering & Management, 2018

Targeting the correct customers is critical to computational advertising. Once a potential customer is found, the advertiser wants to know who can make the most suitable offer at the right time. Deep learning offers additional ways to profile the ideal customer, and we developed a method to target and attract the best online customers and to learn what they want before selling begins.

The CASE Dataset of Candidate Spaces for Advert Implantation

2019

With the advent of faster internet services and the growth of multimedia content, we observe a massive growth in the number of online videos. Users generate this video content at an unprecedented rate, owing to the use of smartphones and other hand-held video capturing devices. This creates immense potential for advertising and marketing agencies to create personalized content for users. In this paper, we attempt to assist video editors in generating augmented video content by proposing candidate spaces in video frames. We propose and release a large-scale dataset of outdoor scenes, along with manually annotated maps for candidate spaces. We also benchmark several deep-learning based semantic segmentation algorithms on this proposed dataset.

Multimodal Content Analysis for Effective Advertisements on YouTube

2017 IEEE International Conference on Data Mining (ICDM), 2017

The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exist that analyze user features and browsing patterns to recommend appealing advertisements to users. In this work, we seek to study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production processes of commercial advertisements. We analyze the temporal patterns from the multimedia content of advertisement videos, including auditory, visual, and textual components, and study their individual roles and synergies in the success of an advertisement. The objective of this work is then to measure the effectiveness of an advertisement and to recommend a useful set of features to advertisement designers to make it more successful and approachable to users. Our proposed framework employs the signal processing technique of cross-modality feature learning, where data streams from different components are employed to train separate neural network models and are then fused together to learn a shared representation. Subsequently, a neural network model trained on this joint feature embedding representation is utilized as a classifier to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric of the ratio of the Likes and Views received by each advertisement from an online platform.
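The cross-modality fusion the abstract describes can be illustrated with a toy sketch: each modality (audio, visual, text) is encoded by its own small network, the embeddings are concatenated into a joint representation, and a shared classifier scores effectiveness. This is a minimal untrained stand-in, not the paper's architecture; all dimensions and weights below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    """One-layer stand-in for a per-modality neural network."""
    return np.tanh(x @ w)

# Per-modality feature vectors for one advertisement (toy sizes).
audio = rng.normal(size=40)
visual = rng.normal(size=64)
text = rng.normal(size=30)

# Separate encoder weights, each projecting its modality to 16 dims.
w_a, w_v, w_t = (rng.normal(size=(d, 16)) * 0.1 for d in (40, 64, 30))

# Shared ("fused") representation: concatenation of modality embeddings.
joint = np.concatenate([encoder(audio, w_a),
                        encoder(visual, w_v),
                        encoder(text, w_t)])

# Classifier on the joint embedding: probability the ad is effective.
w_c = rng.normal(size=48) * 0.1
p_effective = 1.0 / (1.0 + np.exp(-joint @ w_c))
print(round(float(p_effective), 3))
```

In a real system the encoders and classifier would be trained jointly so that the shared embedding captures synergies between modalities, which is the point of learning a joint representation rather than three independent predictors.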

Localizing Adverts in Outdoor Scenes

2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2019

Online videos have witnessed unprecedented growth over the last decade, owing to a wide range of content creation. This provides advertising and marketing agencies a plethora of opportunities for targeted advertisements. Such techniques involve replacing an existing advertisement in a video frame with a new advertisement. However, such post-processing of online videos is mostly done manually by video editors, which is cumbersome and time-consuming. In this paper, we propose DeepAds, a deep neural network based on a simple encoder-decoder architecture that can accurately localize the position of an advert in a video frame. Our approach of localizing billboards in outdoor scenes using neural nets is the first of its kind and achieves the best performance. We benchmark our proposed method against other semantic segmentation algorithms on a public dataset of outdoor scenes with manually annotated billboard binary maps.
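The encoder-decoder idea behind this kind of billboard localization can be sketched in a few lines: downsample the frame to a compact representation (encoder), then upsample back to full resolution (decoder) to produce a per-pixel binary map. The pooling and nearest-neighbour upsampling below are untrained stand-ins for the convolutional layers of a real segmentation network such as DeepAds, not the network itself.

```python
import numpy as np

def encode(frame, factor=2):
    """2x2 max pooling as a stand-in for convolutional downsampling."""
    h, w = frame.shape
    return frame[:h - h % factor, :w - w % factor] \
        .reshape(h // factor, factor, w // factor, factor).max(axis=(1, 3))

def decode(feat, factor=2):
    """Nearest-neighbour upsampling as a stand-in for transposed convolutions."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1)

# Toy 8x8 "frame" with a bright rectangular region playing the billboard.
frame = np.zeros((8, 8))
frame[2:6, 3:7] = 1.0

# Per-pixel localization map at the input resolution.
mask = decode(encode(frame)) > 0.5
print(mask.astype(int))
```

A trained network would learn which image features indicate a billboard; the sketch only shows how the encoder-decoder shape lets a coarse internal representation still yield a full-resolution binary map.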

SalAds: A Multimodal Approach for Contextual Video Advertising

The explosive growth of multimedia data on the Internet creates huge opportunities for online video advertising. In this paper, we propose a novel advertising system called SalAds, which utilizes textual information, visual content, and webpage saliency to automatically associate the most appropriate companion ads with online videos. Unlike most existing approaches that only focus on selecting the most relevant ads, SalAds further considers the saliency of selected ads to reduce intentional ignorance. SalAds consists of three basic steps. Given an online video and a set of advertisements, we first roughly identify a set of relevant advertisements based on textual information matching. We then carefully select a set of candidates based on visual content matching. In this regard, our selected ads are contextually relevant to the online video content in terms of both textual information and visual content. We finally select the most salient ad among the relevant ads as the most appropriate one. To demonstrate the effectiveness of our method, we conduct a rigorous eye-tracking experiment on two ad datasets. Our experimental results show that our method enhances user engagement with the ad content while maintaining users' video viewing experience, when compared with existing approaches.
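The three-step selection described in this abstract (coarse textual filtering, visual re-ranking, then picking the most salient candidate) can be sketched as a simple cascade. The scoring functions and cutoff sizes below are hypothetical stand-ins for the paper's actual matching and saliency models.

```python
def select_ad(video, ads, text_score, visual_score, saliency_score,
              text_k=3, visual_k=2):
    """Cascade selection: text relevance -> visual relevance -> saliency."""
    # Step 1: coarse filter by textual relevance.
    by_text = sorted(ads, key=lambda a: text_score(video, a), reverse=True)[:text_k]
    # Step 2: refine by visual-content relevance.
    by_visual = sorted(by_text, key=lambda a: visual_score(video, a), reverse=True)[:visual_k]
    # Step 3: among contextually relevant ads, pick the most salient one.
    return max(by_visual, key=saliency_score)

# Toy candidate ads with precomputed (made-up) scores.
ads = [
    {"id": "a", "text": 0.9, "vis": 0.2, "sal": 0.1},
    {"id": "b", "text": 0.8, "vis": 0.9, "sal": 0.4},
    {"id": "c", "text": 0.7, "vis": 0.8, "sal": 0.9},
    {"id": "d", "text": 0.1, "vis": 0.9, "sal": 0.9},  # salient but off-topic
]
best = select_ad("video", ads,
                 text_score=lambda v, a: a["text"],
                 visual_score=lambda v, a: a["vis"],
                 saliency_score=lambda a: a["sal"])
print(best["id"])  # "d" is eliminated at the text stage despite high saliency
```

The cascade order matters: saliency only breaks ties among ads that already passed both relevance filters, which mirrors the abstract's claim that SalAds selects the most salient ad *among* the relevant ones.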

Enhancing Dynamic Image Advertising with Vision-Language Pre-training

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

In the multimedia era, images are an effective medium in search advertising. Dynamic Image Advertising (DIA), a system that matches queries with ad images and generates multimodal ads, is introduced to improve user experience and ad revenue. The core of DIA is a query-image matching module performing ad image retrieval and relevance modeling. Current query-image matching suffers from limited and inconsistent data and insufficient cross-modal interaction. Also, the separate optimization of retrieval and relevance models affects overall performance. To address this issue, we propose a vision-language framework consisting of two parts. First, we train a base model on large-scale image-text pairs to learn general multimodal representation. Then, we fine-tune the base model
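The query-image matching at the core of DIA-style systems typically ranks ad images by similarity to the query in a shared vision-language embedding space. The sketch below uses random vectors as stand-ins for learned embeddings; real systems train the text and image encoders on large image-text corpora.

```python
import numpy as np

rng = np.random.default_rng(1)

def normalize(v):
    """L2-normalize so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-ins for an encoded search query and a pool of encoded ad images.
query_emb = normalize(rng.normal(size=8))
ad_image_embs = normalize(rng.normal(size=(5, 8)))

# Retrieval: rank ad images by cosine similarity to the query.
scores = ad_image_embs @ query_emb
best = int(np.argmax(scores))
print(best, round(float(scores[best]), 3))
```

In practice the same similarity score can serve both retrieval (top-k ranking over a large index) and relevance modeling (thresholding or calibrating the score), which is why jointly optimizing the two, as the abstract argues, can help.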

A tool for analyzing audio and video advertising in real time

Recognizing an audio sequence among a set of n previously stored ones is a standard task in signal processing if (i) n is small and (ii) we have as much time as desired to perform the task. Even in these cases, the parameters of the process (number of coefficients in the decompositions used, sampling rates, etc.) must be set "high" enough if the error probabilities are to be kept very small. This paper points out that these standard techniques can also handle the case of very large values of n and very small processing times along two lines: first, by combining several different standard techniques and, second, by controlling the techniques and their combination using a powerful optimization process.