DreamFusion: Text-to-3D using 2D Diffusion

    Ben Poole
    Google Research
    Ajay Jain
    UC Berkeley
    Jonathan T. Barron
    Google Research
    Ben Mildenhall
    Google Research

    Abstract

    Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D assets and efficient architectures for denoising 3D data, neither of which currently exist. In this work, we circumvent these limitations by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis. We introduce a loss based on probability density distillation that enables the use of a 2D diffusion model as a prior for optimization of a parametric image generator. Using this loss in a DeepDream-like procedure, we optimize a randomly-initialized 3D model (a Neural Radiance Field, or NeRF) via gradient descent such that its 2D renderings from random angles achieve a low loss. The resulting 3D model of the given text can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment. Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.

    Given a caption, DreamFusion generates relightable 3D objects with high-fidelity appearance, depth, and normals. Objects are represented as a Neural Radiance Field and leverage a pretrained text-to-image diffusion prior such as Imagen.



    Example generated objects

    DreamFusion generates objects and scenes from diverse captions. Search through hundreds of generated assets in our full gallery.


    Composing objects into a scene


    Mesh exports

    Our generated NeRF models can be exported to meshes using the marching cubes algorithm for easy integration into 3D renderers or modeling software.
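
    The sketch below illustrates one way such an export could look, assuming a hypothetical query_density function that evaluates the trained NeRF's volume density at 3D points; the grid resolution and density threshold are illustrative choices, not the exact settings used for the gallery assets.

    # Minimal sketch: extract a mesh from a trained NeRF via marching cubes.
    import numpy as np
    from skimage import measure
    import trimesh

    def nerf_to_mesh(query_density, resolution=256, bound=1.0, threshold=25.0):
        # Sample the NeRF's density field on a regular grid inside [-bound, bound]^3.
        xs = np.linspace(-bound, bound, resolution)
        grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)  # (R, R, R, 3)
        density = query_density(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)

        # Extract the isosurface at the chosen density threshold.
        verts, faces, normals, _ = measure.marching_cubes(density, level=threshold)

        # Rescale vertices from grid indices back to world coordinates.
        verts = verts / (resolution - 1) * 2 * bound - bound
        return trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals)

    The resulting triangle mesh can then be saved to standard formats (e.g., OBJ or GLB) and loaded directly into renderers or modeling tools.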


    How does DreamFusion work?

    Given a caption, DreamFusion uses a text-to-image generative model called Imagen to optimize a 3D scene. We propose Score Distillation Sampling (SDS), a way to generate samples from a diffusion model by optimizing a loss function rather than running the usual sampling procedure. SDS allows us to optimize samples in an arbitrary parameter space, such as the parameters of a 3D scene, as long as we can map back to images differentiably. We use a 3D scene parameterization similar to Neural Radiance Fields, or NeRFs, to define this differentiable mapping. SDS alone produces reasonable scene appearance, but DreamFusion adds additional regularizers and optimization strategies to improve geometry. The resulting trained NeRFs are coherent, with high-quality normals, surface geometry, and depth, and are relightable with a Lambertian shading model.
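
    As a rough illustration, here is a minimal PyTorch-style sketch of a single SDS update, assuming a hypothetical noise-prediction network unet, a differentiable renderer render mapping NeRF parameters and a camera to an image, and a precomputed alphas_cumprod noise schedule; the timestep range and weighting are illustrative rather than Imagen's exact configuration.

    # Minimal sketch of one Score Distillation Sampling (SDS) step.
    import torch

    def sds_loss(unet, render, nerf_params, camera, text_embedding, alphas_cumprod):
        x = render(nerf_params, camera)                      # rendered image, carries gradients to nerf_params
        t = torch.randint(20, 980, (1,), device=x.device)    # random diffusion timestep
        alpha_bar = alphas_cumprod[t]
        noise = torch.randn_like(x)
        x_t = alpha_bar.sqrt() * x + (1 - alpha_bar).sqrt() * noise  # forward-diffuse the rendering

        with torch.no_grad():                                # never backprop through the frozen diffusion model
            noise_pred = unet(x_t, t, text_embedding)

        w = 1 - alpha_bar                                    # timestep-dependent weight
        grad = w * (noise_pred - noise)
        # The gradient of this surrogate loss w.r.t. nerf_params is w * (eps_hat - eps) * dx/dtheta,
        # i.e., the SDS gradient: the diffusion model scores the rendering, and the NeRF follows that score.
        return (grad.detach() * x).sum()

    In practice, the noise prediction uses classifier-free guidance with a large guidance weight, and gradients flow only through the rendered image, never through the pretrained diffusion model.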


    Citation

    @article{poole2022dreamfusion,
      author = {Poole, Ben and Jain, Ajay and Barron, Jonathan T. and Mildenhall, Ben},
      title = {DreamFusion: Text-to-3D using 2D Diffusion},
      journal = {arXiv},
      year = {2022},
    }