Low-Power Tracking Image Sensor Based on Biological Models of Attention

Computational sensor for visual tracking with attention

IEEE Journal of Solid-State Circuits, 1998

This paper presents a VLSI embodiment of an optical tracking computational sensor which focuses attention on a salient target in its field of view. Using both low-latency massively parallel processing and top-down sensory adaptation, the sensor suppresses interference from features irrelevant to the task at hand, and tracks a target of interest at speeds of up to 7000 pixels/s. The sensor locks onto the target to continuously provide control for the execution of a perceptually guided activity. The sensor prototype, a 24 × 24 array of cells, is built in 2-μm CMOS technology. Each cell occupies 62 μm × 62 μm of silicon and contains a photodetector and processing electronics.

A Computationally Efficient Visual Saliency Algorithm Suitable for an Analog CMOS Implementation

Neural Computation, 2018

Computer vision algorithms are often limited in their application by the large amount of data that must be processed. Mammalian vision systems mitigate this high bandwidth requirement by prioritizing certain regions of the visual field with neural circuits that select the most salient regions. This work introduces a novel and computationally efficient visual saliency algorithm for performing this neuromorphic attention-based data reduction. The proposed algorithm has the added advantage that it is compatible with an analog CMOS design while still achieving comparable performance to existing state-of-the-art saliency algorithms. This compatibility allows for direct integration with the analog-to-digital conversion circuitry present in CMOS image sensors. This integration leads to power savings in the converter by quantizing only the salient pixels. Further system-level power savings are gained by reducing the amount of data that must be transmitted and processed in the digital domain...
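
The data-reduction idea described here is easy to illustrate in software, even though the actual design operates in analog CMOS ahead of the converter. Below is a minimal numpy sketch, not the paper's circuit: the function name, threshold, and bit depth are illustrative assumptions, and NaN stands in for a pixel whose conversion is skipped.

```python
import numpy as np

def saliency_gated_quantize(image, saliency, threshold=0.5, bits=8):
    """Quantize only the pixels whose saliency exceeds a threshold.

    Non-salient pixels are left unconverted (NaN here), mimicking the
    power saving of skipping their analog-to-digital conversion.
    """
    levels = 2 ** bits
    mask = saliency >= threshold                      # salient-pixel gate
    quantized = np.full(image.shape, np.nan)          # skipped pixels
    quantized[mask] = np.round(image[mask] * (levels - 1)) / (levels - 1)
    return quantized, mask

# Example: a random frame with a synthetic saliency map.
rng = np.random.default_rng(0)
frame = rng.random((32, 32))
sal = rng.random((32, 32))
digital, converted = saliency_gated_quantize(frame, sal, threshold=0.8)
print(f"converted {converted.mean():.0%} of pixels")
```

The power saving follows directly from the fraction of pixels that pass the gate: only those positions exercise the converter and the digital link.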

Active vision using an analog VLSI model of selective attention

IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 2001

Detailed processing of sensory information is a computationally demanding task. This is especially true for vision, where the amount of information provided by the sensors typically exceeds the processing capacity of the system. Rather than attempting to process all the sensory data simultaneously, an effective strategy is to focus on subregions of the input space, shifting from one subregion to the other in a serial fashion. This strategy is commonly referred to as selective attention. We present a neuromorphic active-vision system that implements a saliency-based model of selective attention. Visual data are sensed and preprocessed in parallel by a transient imager chip and transmitted to a selective-attention chip. This chip sequentially selects the spatial locations of salient regions in the vision sensor's field of view. A host computer uses the output of the selective-attention chip to drive the motors on which the imager is mounted, and to orient it toward the selected regions. The system's design framework is modular and allows the integration of multiple sensors and multiple selective-attention chips. We present experimental results showing the performance of a two-chip system in response to well-controlled test stimuli and to natural stimuli.
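
The chip performs this select-and-shift behavior with analog winner-take-all circuitry; a rough software analogue of the sequential selection uses a plain argmax plus inhibition of return. Everything below (names, suppression radius, shift count) is an assumption for illustration, not the chip's mechanism.

```python
import numpy as np

def attention_scanpath(saliency, n_shifts=5, ior_radius=3):
    """Sequentially select the most salient locations, suppressing each
    selected region (inhibition of return) so attention shifts onward."""
    sal = saliency.copy()
    ys, xs = np.mgrid[0:sal.shape[0], 0:sal.shape[1]]
    path = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)  # winner-take-all
        path.append((y, x))
        sal[(ys - y) ** 2 + (xs - x) ** 2 <= ior_radius ** 2] = 0.0
    return path

rng = np.random.default_rng(1)
print(attention_scanpath(rng.random((16, 16))))
```

In the actual system, each selected location would be handed to the host computer to re-aim the motorized imager before the next shift.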

A Focal-Plane Image Processor for Low Power Adaptive Capture and Analysis of the Visual Stimulus

2007 IEEE International Symposium on Circuits and Systems, 2007

Portable applications of artificial vision are limited by the fact that conventional processing schemes fail to meet the specifications under a tight power budget. A bio-inspired approach, based on the goal-directed organization of sensory organs found in nature, has been employed to implement a focal-plane image processor for low-power vision applications. The prototype contains a multi-layered CNN structure concurrent with 32×32 photosensors with locally programmable integration time for adaptive image capture, with on-chip local and global adaptation mechanisms. A more robust and linear multiplier block has been employed to reduce the irregular analog wave propagation owing to asymmetric synapses. The predicted computing power per unit of power consumption, 142 MOPS/mW, is orders of magnitude above that rendered by conventional architectures.

A software-hardware selective attention system

2003

Selective attention is a biological mechanism for processing salient subregions of the sensory input space while suppressing non-salient inputs. We present a hardware selective attention system, implemented using a neuromorphic VLSI chip interfaced to a workstation via a custom PCI board and based on an address-event (spike-based) representation of signals. The chip selects salient inputs and sequentially shifts from one salient input to the other. The PCI board acts as an interface between the chip and an algorithm that generates saliency maps. We present experimental data showing the system's response to saliency maps generated from natural scenes.

Real-Time Visual Saliency Architecture for FPGA With Top-Down Attention Modulation

IEEE Transactions on Industrial Informatics, 2014

Biological vision uses attention to reduce the visual bandwidth, simplifying higher-level processing. This paper presents a model and its hardware real-time architecture on a field-programmable gate array (FPGA), to be integrated in a robotic system that emulates this powerful biological process. It is based on the combination of bottom-up saliency and top-down task-dependent modulation. The bottom-up stream is deployed including local energy, orientation, color opponency, and motion maps. The most novel part of this work is the saliency modulation by two high-level features: 1) optical flow and 2) disparity. Furthermore, the influence of the features may be adjusted depending on the application. The proposed system reaches 180 fps. Finally, an example shows its modulation potential for driving assistance systems.
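
A simple way to picture the bottom-up/top-down combination described above is a weighted sum of normalized feature maps, with per-feature gains standing in for the task-dependent modulation. The numpy sketch below is an interpretation of that idea, not the paper's FPGA datapath; the map names and weight values are illustrative.

```python
import numpy as np

def modulated_saliency(feature_maps, top_down_weights):
    """Combine bottom-up feature maps, each scaled by a top-down gain.

    feature_maps: dict of name -> 2-D array (e.g. energy, orientation,
    color opponency, motion, optical flow, disparity).
    top_down_weights: dict of name -> scalar gain set by the task.
    """
    def normalize(m):                       # map each feature to [0, 1]
        span = m.max() - m.min()
        return (m - m.min()) / span if span > 0 else np.zeros_like(m)

    maps = {k: normalize(v) for k, v in feature_maps.items()}
    sal = sum(top_down_weights.get(k, 1.0) * m for k, m in maps.items())
    return normalize(sal)

rng = np.random.default_rng(2)
features = {k: rng.random((24, 24))
            for k in ("energy", "orientation", "color", "motion",
                      "optical_flow", "disparity")}
# Driving-assistance style task: emphasize motion-related cues.
weights = {"optical_flow": 2.0, "disparity": 1.5}
s = modulated_saliency(features, weights)
```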

A real-time implementation of the saliency-based model of visual attention on a SIMD architecture

Pattern Recognition, 2002

Visual attention is the ability to rapidly detect the visually salient parts of a given scene. Inspired by biological vision, the saliency-based algorithm efficiently models the visual attention process. Due to its complexity, the saliency-based model of visual attention needs, for a real-time implementation, more computational resources than are available in conventional processors. This work reports a real-time implementation of this attention model on a highly parallel Single Instruction Multiple Data (SIMD) architecture called ProtoEye. Tailored for low-level image processing, ProtoEye consists of a 2D array of mixed analog-digital processing elements (PE). The operations required for visual attention computation are optimally distributed between the analog and digital parts. The analog diffusion network is used to implement the spatial-filtering-based transformations, such as the conspicuity operator and the competitive normalization of conspicuity maps, whereas the digital part of ProtoEye allows the implementation of logical and arithmetical operations, for instance the integration of the normalized conspicuity maps into the final saliency map. Using 64 × 64 gray-level images, the attention process implemented on ProtoEye operates in real time, running at a frequency of 14 images per second.
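
The normalize-then-integrate step that ProtoEye splits between its analog and digital parts can be sketched in a few lines of numpy. The weighting function below is a coarse stand-in for Itti & Koch-style competitive normalization: the exact operator weights a map by (M - mean of its local maxima)^2, whereas this sketch uses the global mean for brevity.

```python
import numpy as np

def normalize_map(m, M=1.0):
    """Rescale a conspicuity map to the fixed range [0, M]."""
    span = m.max() - m.min()
    return (m - m.min()) / span * M if span > 0 else np.zeros_like(m)

def competitive_weight(m, M=1.0):
    """Promote maps with one dominant peak, demote maps whose activity
    is spread out: (M - mean)^2, a coarse stand-in for the
    (M - mean of local maxima)^2 term in Itti & Koch's N(.) operator."""
    return (M - m.mean()) ** 2

def integrate_saliency(conspicuity_maps):
    """Competitively normalize each conspicuity map, then sum them
    into the final saliency map (the digital integration step)."""
    maps = [normalize_map(m) for m in conspicuity_maps]
    return sum(competitive_weight(m) * m for m in maps)

rng = np.random.default_rng(3)
saliency = integrate_saliency([rng.random((64, 64)) for _ in range(3)])
```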

Sensory Attention: Computational Sensor Paradigm for Low-Latency Adaptive Vision

The need for robust, self-contained, and low-latency vision systems is growing in applications such as high-speed visual servoing and vision-based human-computer interfaces. Conventional vision systems can hardly meet this need because 1) latency is incurred in data-transfer and computational bottlenecks, and 2) there is no top-down feedback to adapt sensor performance for improved robustness. In this paper we present a tracking computational sensor: a VLSI implementation of sensory attention. The tracking sensor focuses attention on a salient feature in its receptive field and maintains this attention in world coordinates. Using both low-latency massively parallel processing and top-down sensory adaptation, the sensor reliably tracks features of interest while suppressing other irrelevant features that may interfere with the task at hand.
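
A software caricature of this attend-and-lock behavior is to restrict the winner-take-all search to a window around the last attended position, so distant distractors cannot capture attention. The window size and function names below are assumptions for illustration, not the sensor's circuit-level mechanism.

```python
import numpy as np

def track_attended_feature(frames, init_pos, window=5):
    """Follow a salient feature by restricting the winner-take-all
    search to a window around the last attended position, suppressing
    the influence of bright features elsewhere in the field of view."""
    y, x = init_pos
    trajectory = [(y, x)]
    for frame in frames:
        y0, y1 = max(0, y - window), y + window + 1
        x0, x1 = max(0, x - window), x + window + 1
        patch = frame[y0:y1, x0:x1]
        dy, dx = np.unravel_index(np.argmax(patch), patch.shape)
        y, x = y0 + dy, x0 + dx          # re-center attention on the winner
        trajectory.append((y, x))
    return trajectory

rng = np.random.default_rng(6)
clip = [rng.random((24, 24)) for _ in range(10)]
print(track_attended_feature(clip, init_pos=(12, 12)))
```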

Saliency-driven image acuity modulation on a reconfigurable silicon array of spiking neurons

2005

We have constructed a system that uses an array of 9,600 spiking silicon neurons, a fast microcontroller, and digital memory to implement a reconfigurable network of integrate-and-fire neurons. The system is designed for rapid prototyping of spiking neural networks that require high-throughput communication with external address-event hardware. Arbitrary network topologies can be implemented by selectively routing address-events to specific internal or external targets according to a memory-based projective field mapping. The utility and versatility of the system are demonstrated by configuring it as a three-stage network that accepts input from an address-event imager, detects salient regions of the image, and performs spatial acuity modulation around a high-resolution fovea that is centered on the location of highest salience.
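
The acuity-modulation stage can be pictured as foveation in software: keep full resolution near the most salient point and block-average the periphery. This numpy sketch is only an approximation of what the spiking network computes; the fovea radius and block size are arbitrary choices.

```python
import numpy as np

def foveate(image, center, fovea_radius=8, block=4):
    """Keep full resolution inside a fovea centered on the most salient
    location; average block-sized neighborhoods elsewhere to emulate
    reduced peripheral acuity."""
    h, w = image.shape
    out = image.copy()
    for by in range(0, h, block):
        for bx in range(0, w, block):
            yb, xb = min(by + block, h), min(bx + block, w)
            cy, cx = (by + yb) / 2, (bx + xb) / 2
            if (cy - center[0]) ** 2 + (cx - center[1]) ** 2 > fovea_radius ** 2:
                out[by:yb, bx:xb] = image[by:yb, bx:xb].mean()
    return out

rng = np.random.default_rng(4)
img = rng.random((96, 100))
center = np.unravel_index(np.argmax(img), img.shape)   # highest "salience"
low_acuity = foveate(img, center)
```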

Hardware accelerated visual attention algorithm

Conference on Information Sciences and Systems (CISS), 2011

We present a hardware-accelerated implementation of a bottom-up visual attention algorithm. This algorithm generates a multi-scale saliency map from differences in image intensity, color, presence of edges, and presence of motion. The visual attention algorithm is computed on a custom-designed FPGA-based dataflow computer for general-purpose state-of-the-art vision algorithms. The vision algorithm is accelerated by our hardware platform and achieves a 4× speedup compared to a standard laptop with a 2.26 GHz Intel dual-core processor for image sizes of 480 × 480 pixels. We developed a real-time demo application capable of >12 frames per second with images of the same size. We also compared the results of the hardware implementation of the algorithm to the eye fixation points of different subjects on six video sequences. We find that our implementation achieves fixation-prediction precisions of up to 1/14th of the size of the video frames.
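
The intensity and color maps such an algorithm builds are typically center-surround differences computed across scales. A toy single-scale version (block-averaged surround, red/green opponency only; the edge and motion cues the paper uses are omitted) could read as follows; all names here are illustrative.

```python
import numpy as np

def center_surround(channel, scale=4):
    """Center-surround difference: the fine-scale channel minus a
    coarse, block-averaged version of itself, upsampled back."""
    h, w = channel.shape
    hc, wc = h // scale * scale, w // scale * scale
    c = channel[:hc, :wc]
    coarse = c.reshape(hc // scale, scale, wc // scale, scale).mean((1, 3))
    surround = np.repeat(np.repeat(coarse, scale, 0), scale, 1)
    return np.abs(c - surround)

def simple_saliency(intensity, red, green):
    """Toy multi-cue saliency: intensity contrast plus a red/green
    color-opponency contrast, averaged."""
    return (center_surround(intensity) + center_surround(red - green)) / 2.0

rng = np.random.default_rng(5)
sal = simple_saliency(rng.random((480, 480)), rng.random((480, 480)),
                      rng.random((480, 480)))
```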