Steve Omohundro - Academia.edu
Papers by Steve Omohundro
Searching for objects in scenes is a natural task for people and has been extensively studied by psychologists. In this paper we examine this task from a connectionist perspective. Computational complexity arguments suggest that parallel feed-forward networks cannot perform this task efficiently. One difficulty is that, in order to distinguish the target from distractors, a combination of features must be associated with a single object. Often called the binding problem, this requirement presents a serious hurdle for connectionist models of visual processing when multiple objects are present. Psychophysical experiments suggest that people use covert visual attention to get around this problem. In this paper we describe a psychologically plausible system which uses a focus of attention mechanism to locate target objects. A strategy that combines top-down and bottom-up information is used to minimize search time. The behavior of the resulting system matches the reaction time behavior of people in several interesting tasks.
A technique for representing and learning smooth nonlinear manifolds is presented and applied to several lip reading tasks. Given a set of points drawn from a smooth manifold in an abstract feature space, the technique is capable of determining the structure of the surface and of finding the closest manifold point to a given query point. We use this technique to learn the "space of lips" in a visual speech recognition task. The learned manifold is used for tracking and extracting the lips, for interpolating between frames in an image sequence, and for providing features for recognition. We describe a system based on Hidden Markov Models and this learned lip manifold that significantly improves the performance of acoustic speech recognizers in degraded environments. We also present preliminary results on a purely visual lip reader.
In this paper we explore the problem of dynamically computing visual relations in a connectionist system. The task of detecting equilateral triangles from clusters of points is used to test our architecture. We argue that this is a difficult task for traditional feed-forward architectures although it is a simple task for people. Our solution implements a biologically inspired network which uses an efficient focus of attention mechanism and cluster detectors to sequentially extract the locations of the vertices.
Neural Information Processing Systems, Oct 1, 1990
A new class of data structures called "bumptrees" is described. These structures are useful for efficiently implementing a number of neural network related operations. An empirical comparison with radial basis functions is presented on a robot arm mapping learning task. Applications to density estimation, classification, and constraint representation and learning are also outlined.
Neural Information Processing Systems, Nov 27, 1995
"Family discovery" is the task of learning the dimension and structure of a parameterized family of stochastic models. It is especially appropriate when the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning and show that they significantly improve performance over two alternatives on a parameterized classification task.
We explore multimodal recognition by combining visual lipreading with acoustic speech recognition. We show that combining visual and acoustic speech information improves the recognition performance significantly, especially in noisy environments. This is achieved with a hybrid speech recognition architecture, consisting of a new visual learning and tracking mechanism, a channel-robust acoustic front end, a connectionist phone classifier, and an HMM-based sentence classifier. Our focus in this paper is on the visual subsystem based on "surface-learning" and active vision models. Our bimodal hybrid speech recognition system has already been applied to a multi-speaker spelling task, and work is in progress to apply it to a speaker-independent spontaneous speech task, the "Berkeley Restaurant Project (BeRP)".
Informatik-Fachberichte, 1991
This paper describes “bumptrees”, a new approach to improving the computational efficiency of a wide variety of connectionist algorithms. We describe the use of these structures for representing, learning, and evaluating smooth mappings, smooth constraints, classification regions, and probability densities. We present an empirical comparison of a bumptree approach to more traditional connectionist approaches for learning the mapping between the kinematic and visual representations of the state of a 3-joint robot arm. Simple networks based on backpropagation with sigmoidal units are unable to perform the task at all. Radial basis function networks perform the task, but with bumptrees the learning rate is hundreds of times faster at reasonable error levels and the retrieval time is over fifty times faster with 10,000 samples. Bumptrees are a natural generalization of oct-trees, k-d trees, balltrees and boxtrees and are useful in a variety of circumstances. We describe both the underlying ideas and extensions to constraint and classification learning that are under current investigation.
Chapman and Hall/CRC eBooks, Jul 27, 2018
WORLD SCIENTIFIC eBooks, Oct 1, 1986
The MIT Press eBooks, 1993
Sather is an object-oriented language derived from Eiffel which is particularly well suited for the needs of scientific research groups. It is designed to be very efficient and simple while supporting strong typing, garbage collection, object-oriented dispatch, multiple inheritance, parameterized types, and a clean syntax. It compiles into portable C code and easily links with existing C code. The compiler, debugger and several hundred library classes are freely available by anonymous FTP. This paper describes aspects of the language design, implementation and libraries.
MIT Press eBooks, Jun 1, 1991
Physica D: Nonlinear Phenomena, Aug 1, 1984
For a wide class of period doubling flows on R^3, we analyze the global structure of the invariant manifolds and the topology of the bifurcating periodic orbits. We emphasize aspects of the dynamics which are not visible in an analysis of the associated Poincaré return map. The global manifold structure implies constraints for the subsequent bifurcational behavior of the flow. The period doubled orbits are classified using the theory of iterated torus knots. This classification reveals an infinite number of topologically distinct period doubling flows. This is of experimental interest because distinct flows can generate qualitatively different power spectra. Possible implications for the universality theory of period doubling flows are discussed.
ACM Transactions on Programming Languages and Systems, 1996
Sather extends the notion of an iterator in a powerful new way. We argue that iteration abstractions belong in class interfaces on an equal footing with routines. Sather iterators were derived from CLU iterators but are much more flexible and better suited for object-oriented programming. We retain the property that iterators are structured, i.e., strictly bound to a controlling structured statement. We motivate and describe the construct along with several simple examples. We compare it with iteration based on CLU iterators, cursors, riders, streams, series, generators, coroutines, blocks, closures, and lambda expressions. Finally, we describe experiences with iterators in the Sather compiler and libraries.
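As a loose illustration of the idea that an iteration abstraction can sit in a class interface alongside ordinary routines, here is a minimal Python sketch. It uses Python generators, one of the constructs the abstract lists for comparison, and does not reproduce Sather syntax or semantics. The class `Range`, its method `upto`, and the function `sum_of_squares` are hypothetical names chosen for this example.

```python
class Range:
    """Hypothetical container that exposes iteration as part of its interface."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def upto(self):
        """Yield lo, lo + 1, ..., hi (inclusive), one value per loop pass."""
        value = self.lo
        while value <= self.hi:
            yield value
            value += 1


def sum_of_squares(r):
    """Consume the iteration abstraction from an ordinary loop."""
    total = 0
    for x in r.upto():  # the generator is driven by this controlling loop
        total += x * x
    return total
```

The caller never sees the loop counter inside `Range`; it only sees the values that `upto` yields, which is the sense in which the iteration logic belongs to the class rather than to the client code.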
Balltrees are simple geometric data structures with a wide range of practical applications to geometric learning tasks. In this report we compare 5 different algorithms for constructing ball trees from data. We study the trade-off between construction time and the quality of the constructed tree. Two of the algorithms are on-line, two construct the structures from the data set in a top down fashion, and one uses a bottom up approach. We empirically study the algorithms on random data drawn from eight different probability distributions representing smooth, clustered, and curve distributed data in different ambient space dimensions. We find that the bottom up approach usually produces the best trees but has the longest construction time. The other approaches have uses in specific circumstances.
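To make the notion of top down construction concrete, the following is a minimal Python sketch of one possible scheme: split the points at the median along the coordinate with the largest spread, and recurse. It is not claimed to be any of the five algorithms compared in the report. The names `Ball`, `bounding_ball`, `build`, and `collect` are hypothetical, and as a simplifying assumption each ball is centered on the centroid of its points, which gives a valid but generally not minimal enclosing ball.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

Point = Sequence[float]


@dataclass
class Ball:
    center: List[float]
    radius: float
    points: Optional[List[Point]] = None  # set only on leaves
    left: Optional["Ball"] = None
    right: Optional["Ball"] = None


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bounding_ball(points: List[Point]):
    """Centroid-centered ball containing all points (not the minimal ball)."""
    dim = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    radius = max(distance(center, p) for p in points)
    return center, radius


def build(points: List[Point], leaf_size: int = 2) -> Ball:
    """Top down build: median split on the widest coordinate (leaf_size >= 1)."""
    center, radius = bounding_ball(points)
    if len(points) <= leaf_size:
        return Ball(center, radius, points=list(points))
    dim = len(points[0])
    spreads = [max(p[i] for p in points) - min(p[i] for p in points)
               for i in range(dim)]
    axis = spreads.index(max(spreads))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Ball(center, radius,
                left=build(ordered[:mid], leaf_size),
                right=build(ordered[mid:], leaf_size))


def collect(node: Ball) -> List[Point]:
    """Return every point stored beneath a node."""
    if node.points is not None:
        return list(node.points)
    return collect(node.left) + collect(node.right)
```

Because every node's ball encloses all points beneath it, a search can skip an entire subtree whenever the query is farther from the ball than the best candidate found so far. How tight the balls are is one plausible measure of tree quality, which is where the trade-off against construction time comes from.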
Neural Information Processing Systems, Dec 2, 1991
"Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both learning and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access.