Computer Vision through Learning

Machine Learning for Object Recognition and Scene Analysis

International Journal of Pattern Recognition and Artificial Intelligence, 1994

Learning is a critical research field for autonomous computer vision systems. It can bring solutions to the knowledge acquisition bottleneck of image understanding systems. Recent developments of machine learning for computer vision are reported in this paper. We describe several different approaches for learning at different levels of the image understanding process, including learning 2-D shape models, learning strategic knowledge for optimizing model matching, learning for adaptive target recognition systems, knowledge acquisition of constraint rules for labelling and automatic parameter optimization for vision systems. Each approach will be commented on and its strong and weak points will be underlined. In conclusion we will suggest what could be the “ideal” learning system for vision.

Computer vision and artificial intelligence

XRDS: Crossroads, The ACM Magazine for Students, 1996

One of the monolithic goals of computer vision is to automatically interpret general digital images of arbitrary scenes. This goal has produced a vast array of research over the last 35 years, yet a solution to this general problem still remains out of reach. A reason for this is that the problem of visual perception is typically under-constrained. Information like absolute scale and depth is lost when the scene is projected onto an image plane. In fact, there are an infinite number of scenes that can produce the exact same image, which makes direct computation of scene geometry from a single image impossible. The difficulty of this "traditional goal" of computer vision has caused the field to focus on smaller, more constrained pieces of the problem. The hope is that when the pieces are put back together, a successful scene interpreter will have been created. Digital filtering, motion analysis, image registration, segmentation, and model matching schemes are all examples of areas where progress has been made in the field. Other research has focused on the general problem through the use of knowledge and context. The use of external knowledge both about the world and about the current visual task reduces the number of plausible scene interpretations and may make the problem solvable. This approach is referred to as knowledge-based vision. Work in the area of knowledge-based vision incorporates methods from the field of AI in order to focus on the influence of context on scene understanding, the role of high level knowledge, and appropriate knowledge representations for visual tasks. The importance of computer vision to the field of AI is fairly obvious: intelligent agents need to acquire knowledge of the world through a set of sensors. What is not so obvious is the importance that AI has to the field of computer vision. Indeed, I believe that the study of perception and intelligence are necessarily intertwined.
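The loss of scale and depth under projection can be made concrete with a small sketch. It assumes an ideal pinhole camera at the origin looking along the Z axis; the function names `project` and `scale`, the focal length, and the sample coordinates are illustrative choices, not taken from the article.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3-D point (X, Y, Z) onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def scale(point, s):
    """Scale a 3-D point about the camera centre by the factor s."""
    return tuple(s * c for c in point)


# A nearby point and the same point pushed eight times farther away
# (and made eight times larger) along the same viewing ray.
near = (1.0, 2.0, 4.0)
far = scale(near, 8.0)

image_of_near = project(near)
image_of_far = project(far)
```

Both points land on the same image coordinates, and the same holds for any positive scale factor, which is why a single image cannot determine absolute size or depth without additional knowledge.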
This article will look at the role that knowledge plays in computer vision and how the use of reasoning, context, and knowledge in visual tasks reduces the complexity of the general problem. The importance of context and knowledge in vision has been pointed out by psychologists many times, and these ideas have driven computer vision studies as well. Take the example in Figure 1.

Machine learning in computer vision

Applied Artificial Intelligence, 2001

Over the last few years, computer applications have undergone a dramatic transformation from simple data processing to machine learning, thanks to the availability and accessibility of huge volumes of data collected through sensors and the internet. The idea behind machine learning is that a computer can improve its own performance over time. Western countries have shown great interest in machine learning, computer vision, and pattern recognition by organizing conferences, workshops, collective discussions, experiments, and real-life implementations. This study on machine learning and computer vision explores and analytically evaluates machine learning applications in computer vision and predicts future prospects. The study found that the machine learning strategies used in computer vision are supervised, unsupervised, and semi-supervised. The commonly used algorithms are neural networks, k-means clustering, and support vector machines. The most recent applications of machine learning in computer vision are object detection, object classification, and extraction of relevant information from images, graphic documents, and videos. Additionally, TensorFlow, the Faster-RCNN-Inception-V2 model, and the Anaconda software development environment were used to identify cars and persons in images.

A Novel Vision-Based Classification System for Explosion Phenomena

Journal of Imaging, 2017

The need for the proper design and implementation of an adequate surveillance system for detecting and categorizing explosion phenomena is rising as part of development planning for risk reduction processes, including mitigation and preparedness. In this context, we introduce state-of-the-art explosion classification using pattern recognition techniques. We define seven patterns for some explosion and non-explosion phenomena: pyroclastic density currents, lava fountains, lava and tephra fallout, nuclear explosions, wildfires, fireworks, and sky clouds. Towards the classification goal, we collected a new dataset of 5327 2D RGB images that are used for training the classifier. Furthermore, in order to achieve high reliability in the proposed explosion classification system and to provide multiple analyses of the monitored phenomena, we propose employing multiple approaches for feature extraction on images, including texture features, features in the spatial domain, and features in the transform domain. Texture features are measured on intensity levels using the Principal Component Analysis (PCA) algorithm to obtain the highest 100 eigenvectors and eigenvalues. Moreover, features in the spatial domain are calculated using amplitude features such as the YCbCr color model; then, PCA is used to reduce the vectors' dimensionality to 100 features. Lastly, features in the transform domain are calculated using the Radix-2 Fast Fourier Transform (Radix-2 FFT), and PCA is then employed to extract the highest 100 eigenvectors. These texture, amplitude, and frequency features are combined in an input vector of length 300, which provides valuable insight into the images under consideration. Accordingly, these features are fed into a combiner to map the input frames to the desired outputs and divide the space into regions or categories.
Thus, we propose to employ a one-against-one multi-class Support Vector Machine (SVM) with a degree-3 polynomial kernel. The efficiency of the proposed research methodology was evaluated on a total of 980 frames that were retrieved from multiple YouTube videos. These videos were taken in real outdoor environments for the seven scenarios of the respective defined classes. As a result, we obtained an accuracy of 94.08%, and the total time for categorizing one frame was approximately 0.12 s.
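The one-against-one scheme with a degree-3 polynomial kernel can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the kernel parameters `gamma` and `coef0`, the one-hot `PROTOTYPES`, the `make_decider` helper, and the 7-dimensional `sample` are all assumptions made for the sake of a runnable toy. In the paper's setting the input would be the 300-dimensional feature vector and each pairwise decider would be a trained binary SVM.

```python
from collections import Counter
from itertools import combinations

# The seven classes named in the abstract.
CLASSES = [
    "pyroclastic density currents",
    "lava fountains",
    "lava and tephra fallout",
    "nuclear explosions",
    "wildfires",
    "fireworks",
    "sky clouds",
]


def poly_kernel(u, v, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel (gamma * <u, v> + coef0) ** degree.

    degree=3 comes from the abstract; gamma and coef0 are assumed values.
    """
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


def class_pairs(classes):
    """All unordered class pairs; one binary classifier is needed per pair."""
    return list(combinations(classes, 2))


def one_vs_one_predict(x, deciders):
    """Let every pairwise decider vote and return the most-voted class."""
    votes = Counter(decide(x) for decide in deciders.values())
    return votes.most_common(1)[0][0]


# Toy stand-in for trained SVMs (an assumption): one one-hot prototype per
# class, and each pairwise decider picks the class whose prototype has the
# larger kernel similarity to x.
PROTOTYPES = {
    name: [1.0 if i == j else 0.0 for j in range(len(CLASSES))]
    for i, name in enumerate(CLASSES)
}


def make_decider(a, b):
    def decide(x):
        score_a = poly_kernel(x, PROTOTYPES[a])
        score_b = poly_kernel(x, PROTOTYPES[b])
        return a if score_a >= score_b else b
    return decide


DECIDERS = {pair: make_decider(*pair) for pair in class_pairs(CLASSES)}

# Hypothetical feature vector lying closest to the "wildfires" prototype.
sample = [0.1, 0.0, 0.1, 0.0, 0.9, 0.2, 0.0]
predicted = one_vs_one_predict(sample, DECIDERS)
```

With seven classes the scheme needs 7 × 6 / 2 = 21 pairwise classifiers, and the predicted label is the class that wins the most pairwise votes.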

Artificial Intelligence and the Science of Image Understanding

Advanced automation promises to increase productivity, improve working conditions and assure product quality. Some computer-based systems perform tasks blindly, without elaborate sensor-based feedback. In many applications, however, visual input can speed up an automated system by eliminating search or the need for costly fixtures that maintain exact alignment of parts. In still other situations, many inspection jobs for example, there may be no alternative to machine vision. Unfortunately, image analysis turned out not to involve just a simple extension of some well-known subfield of computer science or optics. A long history of frustrations with techniques borrowed from other domains, mixed with clever special-case solutions based on ad hoc techniques, has brought the field to a point where it is finally considered worthy of serious attention. The foundations of a science of image understanding are beginning to be built on the ancestral paradigms of image processing, pattern recognition and scene analysis. One component of this new thrust is an improved understanding of the physics of image formation. Understanding how the measurements obtained from the vision input device are determined by the lighting, shape and surface material of the objects being imaged helps one develop methods for "inverting" the imaging process, that is, exploiting physical constraints to allow one to build internal symbolic descriptions of the scene being viewed. Another ingredient of the renewed optimism is a better understanding of the computations underlying early stages of the processing of visual information in biological systems. Aside from providing existence proofs that certain aspects of a scene can be understood from an image, this also suggests computational architectures for performing such tasks in real time. A focal point of recent work is the choice of a representation for the objects being ...

Visual Learning in Surveillance Systems

Learning paths of the 3D scene, BMVA meeting on understanding visual behaviour, London, UK, 2001

Nowadays, surveillance cameras are common in all public areas in the UK, from small off-licence stores to train stations, large buildings, motorways and park areas. A traditional security surveillance system can be described as a set of CCTV cameras that send their video signals to display monitors and perhaps at the same time to analogue recording devices. Human personnel are required to monitor the display devices in real time, or to check the recorded videos off-line. Monitoring and analysis of surveillance videos can be ...

Application of Machine Learning in Computer Vision: A Brief Review

International Journal for Research in Applied Science and Engineering Technology IJRASET, 2020

This paper presents a study of the importance of machine learning and its applications in the field of computer vision. Recent advances in artificial intelligence, deep learning, and computing resources, together with the availability of large training datasets, have made tasks such as computer vision and natural language processing extremely fast and accurate. Artificial intelligence is therefore a trending topic in the field of computing. Deep learning is a subcategory of machine learning within artificial intelligence. Image processing tasks can be performed efficiently using machine learning methods, so machine learning can provide a better understanding of complex images. Object detection, recognition and tracking are fields related to computer vision. Convolutional neural network-based algorithms such as YOLO and R-CNN have enabled a big leap in this field. Algorithms based on machine learning models are excellent at recognizing patterns but typically require enormous amounts of data and a great deal of computational power. Generally, neural networks require a graphics processing unit for faster execution of machine learning models. This review paper gives a brief overview of real-time object detection and machine learning algorithms implemented by various researchers around the world. It also surveys the various methodologies used to detect and recognize a particular object in an image. Real-time object detection algorithms are going to play a vital role in the field of computer vision.