Learning to Recognize, Observe, Analyze and Drive Through Work Zones
Robotics Institute, Carnegie Mellon University
The ROADWork dataset contains annotated images and sequences captured by driving through nearly 5,000 different work zones in 18 U.S. cities.
Abstract
Perceiving and navigating through work zones is challenging and under-explored, even with major strides in self-driving research. An important reason is the lack of open datasets for developing new algorithms to address this long-tailed scenario. We propose the ROADWork dataset to learn how to recognize, observe, analyze, and drive through work zones. We find that state-of-the-art foundation models perform poorly on work zones. With our dataset,
- We improve work zone object detection (+26.2 AP) compared to open-vocabulary models.
- We discover work zones with higher precision (+32.5%) and at a much higher discovery rate (12.8x) than open-vocabulary models.
- We significantly improve detection (+23.9 AP) and reading (+14.2% 1 - NED; see the sketch after this list) of work zone signs compared to state-of-the-art methods.
- We improve upon poorly performing foundation models for describing work zones (+36.7 SPICE).
- We also compute drivable paths from work zone navigation videos and show that it is possible to predict 53.6% of navigational goals (+9.9% over baseline) and 75.3% of pathways (+8.1%) with angular error (AE) < 0.5 degrees.
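For concreteness, here is a minimal sketch of how the 1 - NED (normalized edit distance) sign-reading score above can be computed; the helper names are illustrative and not part of any released ROADWork toolkit.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def one_minus_ned(pred: str, gt: str) -> float:
    """1 - NED: 1.0 is an exact match, 0.0 means no character overlap."""
    if not pred and not gt:
        return 1.0
    return 1.0 - edit_distance(pred, gt) / max(len(pred), len(gt))

print(one_minus_ned("ROAD WORK AHEAD", "ROAD WORK AHEAD"))  # 1.0
print(one_minus_ned("ROAD W0RK AHEAD", "ROAD WORK AHEAD"))  # ~0.93
```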
Understanding and Navigating Work Zones is Difficult
Why are work zones so hard for self-driving cars? No two work zones are truly alike, and objects like barriers and work vehicles vary widely with the type, status, location, and geography of the work zone. Navigational aids (like signs) are customized to the particular work zone and require fine-grained understanding. Spatial configurations of work zone objects do not conform to standard lane, road, and sidewalk layouts. Often, normal traffic rules are suspended and new rules are enforced that may change over time. All these reasons make work zone understanding and navigation difficult.
Work zones are dynamic. For example, in this situation, a self-driving car is expected to read the STOP sign held by the workers, wait until the car coming from the opposite direction passes, observe that the workers have flipped the sign to SLOW, and then proceed. This requires the car to combine global scene context, fine-grained observations like signs, and the larger context of the work zone and its workers, which makes navigation challenging.
ROADWork Dataset Overview
Work zone images and sequences from 18 U.S. cities. We provide instance segmentations for 15 object categories such as workers, vehicles, and barriers. We provide scene-level and object-level attributes (for signs and arrow boards) to enable fine-grained understanding. Work zone scene descriptions support analyzing the scene globally, and a passable trajectory, automatically estimated from each video, teaches how to drive through work zones.
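As a rough illustration of how such annotations might be consumed, here is a minimal sketch assuming a COCO-style instance-segmentation JSON; the file name and fields are hypothetical placeholders, so consult the released dataset for the actual schema.

```python
import json

# Hypothetical annotation file; the actual release may use a different
# name and schema.
with open("roadwork_annotations.json") as f:
    data = json.load(f)

# COCO-style lookup from category id to human-readable name.
categories = {c["id"]: c["name"] for c in data["categories"]}

# Print a few instance annotations: category name and bounding box.
for ann in data["annotations"][:10]:
    name = categories[ann["category_id"]]
    x, y, w, h = ann["bbox"]
    print(f"{name:>15s}  bbox=({x:.0f}, {y:.0f}, {w:.0f}, {h:.0f})")
```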
Recognize Work Zones
Using detectors trained on the ROADWork dataset, we discovered work zones around the world in the Mapillary and BDD datasets.
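A common way to implement this kind of discovery, sketched below under the assumption of a generic detection interface (not the authors' exact pipeline), is to run a ROADWork-trained detector over every image and flag those with enough confident work-zone detections.

```python
from pathlib import Path

CONF_THRESH = 0.5
MIN_DETECTIONS = 2   # require a couple of confident work-zone objects

def run_detector(image_path):
    """Placeholder: return a list of (label, score) detections."""
    raise NotImplementedError("plug in your ROADWork-trained detector here")

def discover_work_zones(image_dir):
    """Return images that likely contain a work zone."""
    hits = []
    for img in sorted(Path(image_dir).glob("*.jpg")):
        confident = [d for d in run_detector(img) if d[1] >= CONF_THRESH]
        if len(confident) >= MIN_DETECTIONS:
            hits.append(img)
    return hits
```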
Observe Work Zones
The ROADWork dataset improves fine-grained understanding of work zone signs, arrow boards, and other rare objects, which are poorly detected by pre-trained foundation models like Detic and OpenSeeD.
Our sign attributes contain 62 types of graphics and 360 different text annotations.
We annotate rare and diverse object instances like police cars, tubular markers, barriers and work vehicles.
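To make the fine-grained sign task concrete, below is a hedged two-stage sketch of sign reading: detect sign regions, crop them, and run OCR on the crops. The paper trains its own detection and reading models; this sketch substitutes a placeholder detector and off-the-shelf Tesseract OCR (via pytesseract) purely to illustrate the shape of the pipeline.

```python
from PIL import Image
import pytesseract  # pip install pytesseract (requires the tesseract binary)

def detect_signs(image):
    """Placeholder: return a list of (x, y, w, h) sign boxes."""
    raise NotImplementedError("plug in a ROADWork-trained sign detector")

def read_signs(image_path):
    """Detect sign regions, crop them, and OCR the text in each crop."""
    image = Image.open(image_path)
    texts = []
    for (x, y, w, h) in detect_signs(image):
        crop = image.crop((x, y, x + w, y + h))
        texts.append(pytesseract.image_to_string(crop).strip())
    return texts
```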
Analyze Work Zones
Pre-trained foundation models like LLaVA understand work zones poorly. Using the ROADWork dataset, we improved their performance with ground-truth descriptions and work zone objects as context.
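One way to realize the "objects as context" idea is sketched below: list the detected work-zone objects in the prompt before asking a vision-language model for a description. The `query_vlm` wrapper and the prompt wording are illustrative placeholders, not LLaVA's actual API or the paper's exact recipe.

```python
def build_prompt(detected_objects):
    """Prepend detected work-zone objects as textual context for the VLM."""
    obj_list = ", ".join(detected_objects)
    return (
        f"The following work zone objects were detected in the image: {obj_list}. "
        "Describe the work zone scene and how a driver should navigate it."
    )

def query_vlm(image_path, prompt):
    """Placeholder: call your vision-language model (e.g., a LLaVA wrapper)."""
    raise NotImplementedError

prompt = build_prompt(["worker", "traffic cone", "arrow board", "work vehicle"])
# description = query_vlm("work_zone.jpg", prompt)
```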
Drive Through Work Zones
Using drivable paths from the ROADWork dataset, we can learn navigational goals and pathways for driving through work zones.
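As a concrete reading of the angular-error criterion reported in the abstract, the sketch below measures the angle between the predicted and ground-truth directions from the ego position and counts a prediction as correct when AE < 0.5 degrees; the paper's exact evaluation protocol may differ.

```python
import math

def angular_error_deg(pred_xy, gt_xy, ego_xy=(0.0, 0.0)):
    """Angle (degrees) between the ego->pred and ego->gt direction vectors."""
    vp = (pred_xy[0] - ego_xy[0], pred_xy[1] - ego_xy[1])
    vg = (gt_xy[0] - ego_xy[0], gt_xy[1] - ego_xy[1])
    norm = math.hypot(*vp) * math.hypot(*vg)
    if norm == 0.0:
        return 0.0  # degenerate case: a point coincides with the ego position
    cos = max(-1.0, min(1.0, (vp[0] * vg[0] + vp[1] * vg[1]) / norm))
    return math.degrees(math.acos(cos))

ae = angular_error_deg(pred_xy=(1.00, 20.0), gt_xy=(1.05, 20.0))
print(f"AE = {ae:.3f} deg, correct = {ae < 0.5}")
```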
Acknowledgments
This work was supported by a research contract from General Motors Research-Israel, NSF Grant CNS-2038612, US DOT grant 69A3551747111 through the Mobility21 UTC, and US DOT grants 69A3552344811 and 69A3552348316 through the Safety21 UTC. We thank N. Dinesh Reddy, Khiem Vuong, Shefali Srivastava, Neha Boloor, and Tiffany Ma for insightful discussions.