GitHub - ken2576/multiview_preprocessing
Multiview Camera Preprocessing Scripts
Requirements
- Run
pip install -r requirements.txt
- (Optional for checking camera pose) Install PyTorch (1.7.1, 1.9.0 tested)
- (Optional for generating camera poses) COLMAP
- (For foreground image generation) BackgroundMattingV2
Usage
- Download the data and extract it
- Run
sh extract_frame.sh [data_root_directory] [output_directory] [intrinsics_folder]
- Use the precalibrated camera poses or generate your own (see below).
(Optional) Acquire camera poses with COLMAP using
python get_pose.py [scene_directory] --scene_ids [desired scene ids]
For example, run
python get_pose.py [output_directory]/test/2_7k/2_8/ --scene_ids 18
to process only scene018, or omit --scene_ids to process all scenes.
You may need to check the COLMAP results to see whether the camera poses are reasonable (they should look like a grid). If not, consider supplying a pose prior as follows.
Run
python get_pose_with_prior.py --root_dir [images_folder] --prior_path [poses_bounds_npy] --extension [jpg_or_png]
For example,
python get_pose_with_prior.py --root_dir [output_directory]/test/2_7k/2_8/scene018/images --prior_path scene018_pb.npy --extension .jpg
acquires the pose for scene018 using a pose prior stored in the npy file.
- Generate background images with
sh gen_bg.sh [output_directory]
- Generate foreground images (requires BackgroundMattingV2)
Drop gen_fg.py and gen_fg.sh into BackgroundMattingV2's folder and run
sh gen_fg.sh [ckpt_path] [resnet101 | resnet50] [output_directory]
- Compress images with
sh compress_data.sh [output_directory] [compressed_output_directory]
- Collect the poses with
sh collect_poses.sh [output_directory] [compressed_output_directory]
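For the pose sanity check mentioned in the COLMAP step above, one rough option is to look at the camera centers directly. The sketch below is not part of the repository: `camera_centers` and `planarity_ratio` are hypothetical helpers, and it assumes the poses are available in the (#views, 17) layout described under Dataset usage, with the fourth column of each 3x5 pose holding the camera center (the usual camera-to-world reading of LLFF poses). It also assumes the camera grid is roughly planar, which may not hold for every rig. A synthetic 3x4 grid stands in for real data.

```python
import numpy as np


def camera_centers(poses_bounds):
    """Return (N, 3) camera centers from an (N, 17) poses/bounds array.

    Assumption: each row is a flattened 3x5 pose followed by 2 bounds, and
    column 3 of the pose is the camera center in world coordinates.
    """
    poses = poses_bounds[:, :-2].reshape(-1, 3, 5)
    return poses[:, :, 3]


def planarity_ratio(centers):
    """Smallest over largest singular value of the mean-centered centers.

    A value near 0 means the centers are close to coplanar.
    """
    centered = centers - centers.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    if singular_values[0] == 0:
        # All cameras coincide; treat as degenerate but planar.
        return 0.0
    return float(singular_values[-1] / singular_values[0])


# Synthetic stand-in for a real poses file: a 3x4 grid of cameras in the z=0 plane.
rows = []
for gy in range(3):
    for gx in range(4):
        pose = np.zeros((3, 5))
        pose[:, :3] = np.eye(3)                   # placeholder rotation
        pose[:, 3] = [0.1 * gx, 0.1 * gy, 0.0]    # camera center
        pose[:, 4] = [480.0, 640.0, 500.0]        # height, width, focal length
        rows.append(np.concatenate([pose.ravel(), [0.5, 10.0]]))
poses_bounds = np.stack(rows)

centers = camera_centers(poses_bounds)
ratio = planarity_ratio(centers)
print(centers.shape, ratio)
```

With real data, the synthetic array would be replaced by the loaded npy file. A ratio far from zero, or centers that do not form regular rows and columns when plotted, would suggest trying the pose prior route.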
Dataset usage
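The camera poses are stored in LLFF convention in npy format.
For example, the camera poses are stored as an array of shape (#views, 17), where each row holds the flattened camera parameters: a 3x5 pose matrix (15 values) followed by 2 depth bounds.
Each row can be converted to an intrinsic matrix and a world-to-camera matrix with: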
import numpy as np


def pose2mat(pose):
    """Convert pose matrix (3x5) to extrinsic matrix (4x4) and intrinsic matrix (3x3)

    Args:
        pose: 3x5 pose matrix
    Returns:
        Extrinsic matrix (4x4) and intrinsic matrix (3x3)
    """
    extrinsic = np.eye(4)
    extrinsic[:3, :] = pose[:, :4]
    # The last column holds image height, width and focal length.
    h, w, focal_length = pose[:, 4]
    # The principal point is taken to be the image center.
    intrinsic = np.array([[focal_length, 0, w/2],
                          [0, focal_length, h/2],
                          [0, 0, 1]])
    return extrinsic, intrinsic


def convert_llff(pose):
    """Convert LLFF poses to OpenCV convention (w2c extrinsic and hwf)
    """
    hwf = pose[:3, 4:]
    ext = np.eye(4)
    ext[:3, :4] = pose[:3, :4]
    # Swap the first two rotation columns and negate the third
    # to change the camera-axis convention.
    ext = np.concatenate([ext[:, 1:2],
                          ext[:, 0:1],
                          -ext[:, 2:3],
                          ext[:, 3:4]], axis=1)
    # Invert camera-to-world to obtain world-to-camera.
    mat = np.linalg.inv(ext)
    return np.concatenate([mat[:3, :4], hwf], -1)


input_poses = np.load('scene000_pb.npy')
pose = input_poses[0, :-2].reshape([3, 5])  # first view
bounds = input_poses[0, -2:]
w2c, k = pose2mat(convert_llff(pose))
print(w2c.shape)
print(k.shape)
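To illustrate how the resulting matrices might be used, the sketch below projects a 3D world point into an image with a standard pinhole model. It is not taken from the repository: `llff_to_w2c_and_k` is a hypothetical helper that folds the same steps as `convert_llff` and `pose2mat` into one function, `project` is likewise hypothetical, and a synthetic two-view array stands in for `scene000_pb.npy` so that the block runs on its own.

```python
import numpy as np


def llff_to_w2c_and_k(pose):
    """Turn one 3x5 LLFF pose into a 4x4 world-to-camera matrix and 3x3 intrinsics."""
    c2w = np.eye(4)
    c2w[:3, :4] = pose[:3, :4]
    # Same column reorder as convert_llff: swap columns 0 and 1, negate column 2.
    c2w = np.concatenate([c2w[:, 1:2], c2w[:, 0:1], -c2w[:, 2:3], c2w[:, 3:4]], axis=1)
    w2c = np.linalg.inv(c2w)
    h, w, focal = pose[:, 4]
    k = np.array([[focal, 0, w / 2],
                  [0, focal, h / 2],
                  [0, 0, 1]])
    return w2c, k


def project(point_world, w2c, k):
    """Project a 3D world point to pixel coordinates (u, v)."""
    p_cam = w2c[:3, :3] @ point_world + w2c[:3, 3]
    uvw = k @ p_cam
    return uvw[:2] / uvw[2]


# Synthetic stand-in for np.load('scene000_pb.npy'): two identical views.
rot_llff = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, -1.0]])     # becomes identity after the column reorder
center = np.array([1.0, 2.0, 3.0])          # camera center in world coordinates
hwf = np.array([480.0, 640.0, 500.0])       # height, width, focal length
pose = np.concatenate([rot_llff, center[:, None], hwf[:, None]], axis=1)  # 3x5
row = np.concatenate([pose.ravel(), [0.5, 10.0]])                         # 17 values
poses_bounds = np.stack([row, row])                                       # (2, 17)

# Split every view at once into 3x5 poses and per-view bounds.
poses = poses_bounds[:, :-2].reshape(-1, 3, 5)
bounds = poses_bounds[:, -2:]

w2c, k = llff_to_w2c_and_k(poses[0])
# A point 5 units straight ahead of the camera should land at the image center.
uv = project(center + np.array([0.0, 0.0, 5.0]), w2c, k)
print(uv)
```

A point on the optical axis landing at (width/2, height/2) is a quick way to confirm that the axis convention and the inversion have been applied in the expected order.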
The image data after compression is stored in h5 format with keys:
- rgb: original RGB frames in shape (#views, #frames, height, width, 3)
- fg_rgb: processed foreground RGB frames in shape (#views, #frames, height, width, 4)
- bg_rgb: median-filtered background RGB images in shape (#views, height, width, 3)
Example script:
import h5py

with h5py.File('scene000.h5', 'r') as hf:
    rgb = hf['rgb']  # (#views, #frames, height, width, 3)
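As a sketch of how the `fg_rgb` and `bg_rgb` arrays might be combined, the example below blends one foreground frame over the background of the same view. The README does not say what the fourth channel of `fg_rgb` contains, so this assumes it is a straight (non-premultiplied) uint8 alpha matte; `composite_over_background` is a hypothetical helper, and small synthetic arrays with the documented shapes stand in for data read from the h5 file.

```python
import numpy as np


def composite_over_background(fg_rgba, bg_rgb):
    """Alpha-blend one (H, W, 4) foreground frame over one (H, W, 3) background.

    Assumption: channel 3 of the foreground is a uint8 alpha matte (0 to 255)
    and the color channels are not premultiplied by alpha.
    """
    color = fg_rgba[..., :3].astype(np.float32)
    alpha = fg_rgba[..., 3:4].astype(np.float32) / 255.0
    background = bg_rgb.astype(np.float32)
    blended = alpha * color + (1.0 - alpha) * background
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


# Synthetic stand-ins using the documented layouts.
views, frames, height, width = 2, 3, 4, 5
fg_rgb = np.zeros((views, frames, height, width, 4), dtype=np.uint8)
bg_rgb = np.full((views, height, width, 3), 100, dtype=np.uint8)

# Mark the top two rows of view 0, frame 1 as opaque foreground.
fg_rgb[0, 1, :2, :, :3] = 200
fg_rgb[0, 1, :2, :, 3] = 255

# Index by view and frame for the foreground, by view only for the background.
frame = composite_over_background(fg_rgb[0, 1], bg_rgb[0])
print(frame.shape)
```

With real data, `fg_rgb[0, 1]` and `bg_rgb[0]` would be replaced by the corresponding slices of `hf['fg_rgb']` and `hf['bg_rgb']`; if the stored alpha turns out to be premultiplied, the blend would need to drop the `alpha * color` multiplication.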
Troubleshooting
Sometimes the last frame is corrupted during frame extraction. Rerun undistort_opencv.py with the argument --fix_last to fix it.