Mohan Trivedi | University of California, San Diego

Papers by Mohan Trivedi

On Salience-Sensitive Sign Classification in Autonomous Vehicle Path Planning: Experimental Explorations with a Novel Dataset

2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)

Forced Spatial Attention for Driver Foot Activity Classification

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)

Hand Gesture Recognition in Real Time for Automotive Interfaces: A Multimodal Vision-Based Approach and Evaluations

IEEE Transactions on Intelligent Transportation Systems, 2014

No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs

IEEE Transactions on Intelligent Vehicles

Scene Induced Multi-Modal Trajectory Forecasting via Planning

We address multi-modal trajectory forecasting of agents in unknown scenes by formulating it as a planning problem. We present an approach consisting of three models: a goal prediction model to identify potential goals of the agent, an inverse reinforcement learning model to plan optimal paths to each goal, and a trajectory generator to obtain future trajectories along the planned paths. Analysis of predictions on the Stanford Drone Dataset shows the generalizability of our approach to novel scenes.

Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans

In this paper, we address the problem of forecasting agent trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is a challenging problem due to the large variation in scene structure and the multi-modal nature of the distribution of future trajectories. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid-based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals and paths to those goals on a coarse 2-D grid defined over an unknown scene. We propose an attention-based trajectory generator that generates continuous-valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publicly avail...

Multi-person interaction and activity analysis: a synergistic track- and body-level analysis framework

Keywords: Two-level framework · Personal space · Context

Vehicle Occupant Posture Analysis Using Voxel Data

Analysis of the vehicle occupant posture is a key problem in designing "smart airbag" systems. Vision-based technology could enable the use of precise information about the occupant's size, position, and posture in making airbag deployment decisions. In this paper, we present our experiments in estimating the body posture of a sitting person using a motion capture system that performs analysis of voxel data. We will show that this system is capable of extracting the posture of the body and the head with good accuracy.

Calibration of a reconfigurable array of omnidirectional cameras using a moving person

Reconfigurable arrays of omnidirectional cameras are useful for applications where multiple cameras working together are to be deployed at short notice. This paper addresses the important issue of calibration of such arrays in terms of the relative camera positions and orientations. The location of a one-dimensional object moving parallel to itself, such as a moving person, is used to establish correspondences between multiple cameras. In such a case, the non-linear 3-D problem of calibration can be approximated by a 2-D problem in plan view. This enables an initial solution using a factorization method. A non-linear optimization stage is then used to account for the approximations, as well as to minimize the geometric error between the observed and projected omni pixel coordinates. Experimental results with simulated and real data illustrate the effectiveness of the method.

A Novel Graphical Interface and Context Aware Map for

This paper focuses on the development of and experiments with a visualization and telepresence environment that together present an interactive, immersive, and context-preserving display of outdoor information. The developed visualization integrates a variety of types of information about a section of interstate, such as aerial photographs, maps, and live outdoor videos, into a single interactive environment. The persistence of context in the visualization aids the remote viewer in more easily comprehending and exploring the outdoor space. All the environments, systems, and algorithms are integrated and are being extensively tested in a real-world ITS application context using several novel testbeds.

3-D Posture and Gesture Recognition for Interactivity in Smart Spaces

Automatic perception of human posture and gesture from vision input has an important role in developing intelligent video systems. In this paper, we present a novel gesture recognition approach for human-computer interactivity based on marker-less upper body pose tracking in 3-D with multiple cameras. To achieve the robustness and real-time performance required for practical applications, the idea is to break the exponentially large search problem of upper body pose into two steps: first, the 3-D movements of upper body extremities (i.e., head and hands) are tracked. Then, using knowledge of upper body model constraints, these extremity movements are used to infer the whole 3-D upper body motion as an inverse kinematics problem. Since the head and hand regions are typically well defined and undergo less occlusion, tracking is more reliable and could enable more robust upper body pose determination. Moreover, by breaking the problem of upper body pose tracking into two step...

Active learning for on-road vehicle detection: a comparative study

Machine Vision and Applications, Special Issue Paper, DOI 10.1007/s00138-011-0388-y

In recent years, active learning has emerged as a powerful tool in building robust systems for object detection using computer vision. Indeed, active learning approaches to on-road vehicle detection have achieved impressive results. While active learning approaches for object detection have been explored and presented in the literature, few studies have been performed to comparatively assess costs and merits. In this study, we provide a cost-sensitive analysis of three popular active learning methods for on-road vehicle detection. The generality of active learning findings is demonstrated via learning experiments performed with detectors based on histogram of oriented gradient features and SVM classification (HOG–SVM), and Haar-like features and Adaboost classification (Haar–Adaboost). Experimental evaluation has been performed on static images and real-world on-road vehicle datasets. Learning approaches are assessed in terms of the time spent annotating, data required, recall, and ...

Digital Object Identifier (DOI) 10.1007/s00138-002-0109-7

...s and furniture. Video and audio signals are analyzed in real time for a wide range of low-level tasks, including person identification, localization and tracking, and gesture and voice recognition [6]. Combining the analysis tasks with human face and body synthesis enables efficient interactions with remote observers, effectively merging disjoint spaces into a single intelligent environment. We are currently embedding distributed video networks in rooms, laboratories, museums, and even outdoor public spaces in support of experimental research in this domain. This involves the development of new frameworks, architectures, and algorithms for audio and video processing as well as for the control of various functions associated with proper execution of a transaction within such intelligent spaces. These test beds are also helping to identify novel applications of such systems in distance learning, teleconferencing, entertainment, and smart hom... (Correspondence to: mtrivedi@ucsd.edu)

Workshop Committee

computer.org

Organizing Committee: Swarup Medasani, HRL Laboratories, LLC ... Claus Bahlmann, Siemens Corporate Research, USA; George Bebis, Univ. of Nevada, USA; Alberto Broggi, Univ. di Parma, Italy; Larry Davis, Univ. of Maryland, USA; Ernst Dickmanns, Univ. Bundeswehr Muenchen, Germany; Thorsten Graf, Volkswagen AG, Germany; Kikuo Fujimura, Honda Research, USA; Riad Hammoud, Delphi Corporation, USA; Anil Jain, Michigan State University, USA; Thorsten Koehler, Siemens VDO, Germany; Yoshiki Ninomiya, Toyota, ...

Design and evaluation of a multistage object detection approach

M. V. Shirvaikar, M. M. Trivedi. SPIE Conference on Applications of Artificial Intelligence VIII, Part 1, 14-22, 1990. Accurate detection of unique objects with minimal ...

Towards Semantic Understanding of Surrounding Vehicular Maneuvers: A Panoramic Vision-Based Framework for Real-World Highway Studies

2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Jun 1, 2016

Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans

ArXiv, 2020

In this paper, we address the problem of forecasting agent trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is a challenging problem due to the large variation in scene structure and the multi-modal nature of the distribution of future trajectories. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid-based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals and paths to those goals on a coarse 2-D grid defined over an unknown scene. We propose an attention-based trajectory generator that generates continuous-valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publi...

Automated Drive Analysis of Naturalistic Driving Studies with Looking-out Video

Problem: Large volumes of data from multiple sensors are captured in Naturalistic Driving Studies (NDS) such as the Strategic Highway Research Program 2 (SHRP2). In order to extract and characterize distraction events leading to crashes and near-crashes, visual data from multiple cameras coupled with other sensory data are analyzed by human data reductionists. This not only requires tremendous human effort and time but is also subject to human error and interpretation. The research aims at developing novel computer vision and machine learning techniques that analyze the visual data of the surrounding environment outside the vehicle, and the interactions and movements of the driver and passengers inside the vehicle. In this paper, we present research methods and results for automated drive analysis using looking-out videos. Our ongoing research is focused on robust and efficient algorithms for looking-in videos as well.

Traffic sign detection and analysis: Recent studies and emerging

Aalborg Universitet

Traffic sign recognition (TSR) is a research field that has seen much activity in the past decade. This paper introduces the problem and presents 4 recent papers on traffic sign detection and 4 recent papers on traffic sign classification. It attempts to extract recent trends in the field and touch upon unexplored areas, especially the lack of research into integrating TSR with a driver-in-the-loop system and some of the problems that presents. TSR is an exciting field with great promise for integration in driver assistance systems, and that particular area deserves to be explored further.

Guest Editorial on the Special Issue on Automation and Engineering for Ambient Intelligence

The Ambient Intelligence community is working towards a future where humans live in autonomous and responsive environments, a goal shared by many researchers, above all by the Intelligent and Smart Environments’ research proponents. This special issue will bring together solutions from research and engineering for the automatic understanding of a complex scene via multi-modal arrays of sensors, with automation of adaptation as the theme connecting various areas of interest. Manuscripts of scientific results from technical experts from academe and industry are solicited, and topics to be covered include, but are not limited to:
