Multi-view Dynamic Human Datasets (original) (raw)

This dataset collects five sequences by a multi-camera dome system with up to 80 cameras arranged on a cylinder. All cameras are synchronized, calibrated and captured at 25 frames per second. Performers are in different clothing and perform different actions. All sequences have a length between 8 to 24 seconds. Specifically, `sport1`, `sport2`, `sport3` correspond to dumbbell lifting with relatively tight clothing, `dance` contains complex and highly deformable clothing, and `basketball` involves interactions between a player and a ball.

Each sequence contains RGB images, foreground masks, RGB point cloud sequence and camera calibration. We use chrome key based matting followed by manual fixing to extract the ground truth masks for all views. We use one of the best commercial SfM software Metashape to compute all frames' initial 3D point clouds.

Datasets are now released for non-commercial purposes. Please see here
File structure is described in here