Meta 3D AssetGen

Abstract

We present Meta 3D AssetGen (AssetGen), a significant advancement in text-to-3D generation that produces faithful, high-quality meshes with texture and material control. Compared to works that bake shading into the 3D object’s appearance, AssetGen outputs physically-based rendering (PBR) materials, supporting realistic relighting. AssetGen first generates several views of the object with factored shaded and albedo appearance channels, and then reconstructs colours, metalness and roughness in 3D, using a deferred shading loss for efficient supervision. It also uses a signed distance function (SDF) to represent 3D shape more reliably and introduces a corresponding loss for direct shape supervision. This is implemented using fused kernels for high memory efficiency. After mesh extraction, a texture refinement transformer operating in UV space significantly improves sharpness and details. AssetGen achieves a 17% improvement in Chamfer Distance and a 40% improvement in LPIPS over the best concurrent work for few-view reconstruction, and a human preference of 72% over the best industry competitors of comparable speed, including those that support PBR.
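For readers unfamiliar with the geometry metric quoted above, here is a minimal NumPy sketch of a symmetric Chamfer Distance between two point sets. It is an illustration only: the function name `chamfer_distance` is ours, and the exact variant used in the paper's evaluation (squared or unsquared distances, number of sampled surface points) is not stated on this page, so the squared, mean-reduced form below is an assumption.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged in each direction.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # For every point, distance to its nearest neighbour in the other set.
    a_to_b = d2.min(axis=1).mean()
    b_to_a = d2.min(axis=0).mean()
    return float(a_to_b + b_to_a)
```

Lower is better. The dense (N, M) distance matrix is fine for a few thousand points; larger sets would normally use a spatial index instead.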

Relightability of Assets

Meta 3D AssetGen can generate assets with varying material properties, which allows faithful modelling of how the object surface interacts with the environment lighting as it changes. Here, we show assets generated with the prompt "A cat made of MATERIAL".

Shiny Plastic

Rock

Shiny Silver

Rusted Iron
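To make concrete why albedo, metalness and roughness are enough to relight an asset, the sketch below shades a single surface point under one directional light using a standard metallic-roughness model (GGX distribution, Schlick Fresnel, Smith-Schlick geometry term). This is a common real-time approximation, not AssetGen's renderer or its deferred shading loss; the function `shade_pbr` and all of its parameters are hypothetical names chosen for illustration.

```python
import numpy as np


def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)


def shade_pbr(albedo, metalness, roughness, normal, view_dir, light_dir, light_rgb):
    """Outgoing RGB radiance at one surface point for one directional light.

    Assumptions: directions point away from the surface, and view_dir is not
    exactly opposite to light_dir (the half vector would be undefined).
    """
    n, v, l = normalize(normal), normalize(view_dir), normalize(light_dir)
    h = normalize(v + l)  # half vector between view and light
    n_l = max(float(n @ l), 0.0)
    n_v = max(float(n @ v), 1e-4)
    n_h = max(float(n @ h), 0.0)
    v_h = max(float(v @ h), 0.0)

    # Dielectrics reflect about 4% at normal incidence; metals tint by albedo.
    f0 = 0.04 * (1.0 - metalness) + albedo * metalness
    fresnel = f0 + (1.0 - f0) * (1.0 - v_h) ** 5  # Schlick approximation

    # GGX normal distribution: low roughness gives a tight, bright highlight.
    a2 = (roughness ** 2) ** 2
    ggx = a2 / (np.pi * (n_h ** 2 * (a2 - 1.0) + 1.0) ** 2)

    # Smith-Schlick shadowing/masking term for direct lighting.
    k = (roughness + 1.0) ** 2 / 8.0
    geom = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))

    specular = ggx * geom * fresnel / (4.0 * n_l * n_v + 1e-6)
    # Metals have no diffuse lobe; energy not reflected specularly is diffused.
    diffuse = (1.0 - fresnel) * (1.0 - metalness) * albedo / np.pi
    return (diffuse + specular) * light_rgb * n_l
```

Because the material parameters are stored separately from the lighting, changing `light_dir` or `light_rgb` re-evaluates the same asset under new illumination, which is not possible when shading is baked into the texture.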


Method Overview

Given a text prompt, AssetGen generates a 3D mesh with PBR materials in two stages. The first, text-to-image stage (blue) predicts a 6-channel image depicting 4 views of the object with shaded and albedo colors. The second, image-to-3D stage has two steps. First, a 3D reconstructor (dubbed MetaILRM) outputs a triplane-supported SDF field, which is converted into a mesh with textured PBR materials (orange). Then, the PBR materials are enhanced by our texture refiner, which recovers missing details from the input views (green).
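As an illustration of the interface between the two stages, the sketch below splits a 6-channel multi-view image into per-view shaded and albedo RGB images. The layout is an assumption made for this example, not something stated on this page: the 4 views are taken to be tiled in a 2x2 grid, with channels 0-2 holding the shaded colors and channels 3-5 the albedo. The function name `split_views` is hypothetical.

```python
import numpy as np


def split_views(grid: np.ndarray):
    """Split an (H, W, 6) image holding a 2x2 grid of views.

    Returns (shaded, albedo), each of shape (4, H/2, W/2, 3), with views
    ordered top-left, top-right, bottom-left, bottom-right.
    """
    h, w, c = grid.shape
    if c != 6 or h % 2 or w % 2:
        raise ValueError("expected an (H, W, 6) array with even H and W")
    # (H, W, 6) -> (row, y, col, x, 6) -> (row, col, y, x, 6) -> (view, y, x, 6)
    views = (
        grid.reshape(2, h // 2, 2, w // 2, 6)
        .transpose(0, 2, 1, 3, 4)
        .reshape(4, h // 2, w // 2, 6)
    )
    # Assumed channel order: shaded RGB first, then albedo RGB.
    return views[..., :3], views[..., 3:]
```

A reconstructor in the second stage would then consume the four shaded/albedo pairs together with their camera poses; how MetaILRM actually ingests them is not described here.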

BibTeX

@inproceedings{siddiqui2024assetgen,
 author = {Siddiqui, Yawar and Monnier, Tom and Kokkinos, Filippos and Kariya, Mahendra and Kleiman, Yanir and Garreau, Emilien and Gafni, Oran and Neverova, Natalia and Vedaldi, Andrea and Shapovalov, Roman and Novotny, David},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {9532--9564},
 publisher = {Curran Associates, Inc.},
 title = {Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/123cfe7d8b7702ac97aaf4468fc05fa5-Paper-Conference.pdf},
 volume = {37},
 year = {2024}
}

Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials

NeurIPS 2024
