MeshAvatar: Learning High-quality Triangular
Human Avatars from Multi-view Videos
Yushuo Chen, Zerong Zheng, Zhe Li, Chao Xu, Yebin Liu
Tsinghua University, NNKosmos Technology
ECCV 2024
Given multi-view videos of a specific subject, our method learns a triangular avatar of that subject with intrinsic material decomposition. After training, the avatar not only supports synthesis under novel poses and novel lighting conditions, but also enables texture editing and material manipulation.
Abstract
We present a novel pipeline for learning high-quality triangular human avatars from multi-view videos. Recent methods for avatar learning are typically based on neural radiance fields (NeRF), which are not compatible with the traditional graphics pipeline and pose great challenges for operations such as editing or synthesis under different environments. To overcome these limitations, our method represents the avatar with an explicit triangular mesh extracted from an implicit SDF field, complemented by an implicit material field conditioned on the given pose. Leveraging this triangular avatar representation, we incorporate physics-based rendering to accurately decompose geometry and texture. To enhance both geometric and appearance details, we further employ a 2D UNet as the network backbone and introduce pseudo normal ground truth as additional supervision. Experiments show that our method can learn triangular avatars with high-quality geometry reconstruction and plausible material decomposition, inherently supporting editing, manipulation, and relighting operations.
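To make this representation concrete, below is a minimal sketch in PyTorch of the two components the abstract describes: an implicit SDF from which an explicit triangle mesh is extracted via marching cubes, and a pose-conditioned material field queried at surface points. The plain MLPs (the paper uses a 2D UNet backbone), network sizes, and all names here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: plain MLPs stand in for the paper's 2D UNet backbone.
import torch
import torch.nn as nn
from skimage import measure  # marching cubes for mesh extraction


class SDFField(nn.Module):
    """Implicit signed distance field: 3D point -> signed distance."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):          # x: (N, 3) points
        return self.net(x)         # (N, 1) signed distances


class MaterialField(nn.Module):
    """Pose-conditioned material field: (point, pose) -> albedo, roughness."""
    def __init__(self, pose_dim=72, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 albedo channels + 1 roughness
        )

    def forward(self, x, pose):    # x: (N, 3); pose: (1, pose_dim)
        out = self.net(torch.cat([x, pose.expand(x.shape[0], -1)], dim=-1))
        return torch.sigmoid(out[:, :3]), torch.sigmoid(out[:, 3:])


@torch.no_grad()
def extract_mesh(sdf, resolution=128, bound=1.0):
    """Extract an explicit triangle mesh from the SDF via marching cubes."""
    grid = torch.linspace(-bound, bound, resolution)
    pts = torch.stack(torch.meshgrid(grid, grid, grid, indexing="ij"), dim=-1)
    vals = sdf(pts.reshape(-1, 3)).reshape(resolution, resolution, resolution)
    verts, faces, _, _ = measure.marching_cubes(vals.numpy(), level=0.0)
    verts = verts / (resolution - 1) * 2 * bound - bound  # index -> world coords
    return verts, faces
```

In the full method, the extracted mesh is skinned to the target pose and shaded with physics-based ray tracing using the predicted materials, which is what makes relighting and material editing possible.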
Method Overview
Our pipeline learns a hybrid human avatar represented as (a) an explicit skinned mesh and (b) implicit pose-dependent material fields. Such a representation inherently supports (c) physics-based ray tracing and can be trained end to end using (d) normal estimation as an additional supervision signal.
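As a rough illustration of the normal supervision in (d), the sketch below compares normals rendered from the current mesh against pseudo ground-truth normals predicted by an off-the-shelf estimator. The cosine-plus-L1 form and the 0.1 weight are assumptions chosen for illustration; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F


def normal_loss(rendered_normals, pseudo_gt_normals, mask):
    """Supervise rendered normals with estimated pseudo ground truth.

    rendered_normals:  (H, W, 3) normals rasterized from the current mesh
    pseudo_gt_normals: (H, W, 3) normals from an off-the-shelf estimator
    mask:              (H, W) boolean foreground mask of valid pixels
    """
    n_render = F.normalize(rendered_normals[mask], dim=-1)   # (N, 3)
    n_pseudo = F.normalize(pseudo_gt_normals[mask], dim=-1)  # (N, 3)
    # Cosine term penalizes angular error; L1 term stabilizes early training.
    cos = 1.0 - (n_render * n_pseudo).sum(dim=-1)
    l1 = (n_render - n_pseudo).abs().sum(dim=-1)
    return cos.mean() + 0.1 * l1.mean()
```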
Video Presentation
Comparisons
Quantitative Results
| | Ours | AvatarReX | AnimatableGaussians | AnimatableGaussians* | Xu et al. | Lin et al. | IntrinsicAvatar |
|---|---|---|---|---|---|---|---|
| Representation | hybrid | SDF | 3DGS | 3DGS | SDF | SDF | SDF |
| Relightable? | ✔ | | | ✔ | ✔ | ✔ | ✔ |
| Training Time (~100 frames) | ~3h | | | | | 2.5 days | 4h (mono.) |
| Training Time (~1000 frames) | ~16h | 2 days | 2 days (RTX 4090) | 2 days (RTX 4090) | 30h | | |
| Inference Time (per image) | 180ms | 30s | 100ms | 4~10s | 5s | 40s | 20s |
Comparisons with recent state-of-the-art methods on neural avatars. Our method achieves roughly 20× faster inference than previous relightable methods.
Qualitative Results
We evaluate our method on the AvatarReX [1] and ActorsHQ [2] datasets. Our method reconstructs fine-grained dynamic human geometry.
References
[1] Zheng, Zerong, et al. "AvatarReX: Real-time Expressive Full-body Avatars." ACM Transactions on Graphics (TOG) 42.4 (2023): 1-19.
[2] Işık, Mustafa, et al. "HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion." ACM Transactions on Graphics (TOG) 42.4 (2023): 1-12.
BibTeX
@misc{chen2024meshavatar,
  title={MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos},
  author={Yushuo Chen and Zerong Zheng and Zhe Li and Chao Xu and Yebin Liu},
  year={2024},
  eprint={2407.08414},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2407.08414},
}