SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images

¹KAUST, ²Technical University of Munich
(Work was done during Abdullah’s internship at TUM as well as at KAUST.)

32³ Voxels | 128³ Voxels | 512³ Voxels


SPARF is a large-scale sparse radiance field dataset consisting of ~1 million SRFs at multiple voxel resolutions (32, 128, and 512) and 17 million posed images at a resolution of 400 × 400. Furthermore, we propose SuRFNet, a pipeline that generates SRFs conditioned on input images, achieving state-of-the-art results on ShapeNet novel view synthesis from one or a few input images.

Abstract

Recent advances in Neural Radiance Fields (NeRFs) treat the problem of novel view synthesis as Sparse Radiance Field (SRF) optimization using sparse voxels for efficient and fast rendering (Plenoxels, InstantNGP). To enable machine learning on SRFs and promote their adoption as a 3D representation, we present SPARF, a large-scale ShapeNet-based synthetic dataset for novel view synthesis consisting of ~17 million images rendered from nearly 40,000 shapes at high resolution (400 × 400 pixels). The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis and includes more than one million 3D-optimized radiance fields with multiple voxel resolutions. Furthermore, we propose a novel pipeline (SuRFNet) that learns to generate sparse voxel radiance fields from only a few views. This is done by using the densely collected SPARF dataset and 3D sparse convolutions. SuRFNet employs partial SRFs built from one or a few images and a specialized SRF loss to learn to generate high-quality sparse voxel radiance fields that can be rendered from novel views. Our approach achieves state-of-the-art results on the task of unconstrained novel view synthesis from few views on ShapeNet compared to recent baselines.

Video

Multi-Resolution Sparse Radiance Fields

SPARF is a large dataset of Sparse Radiance Fields (SRFs) at multiple voxel resolutions: 32³, 128³, and 512³.

32³ Voxels | 128³ Voxels | 512³ Voxels
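To make the idea of a sparse voxel field concrete, here is a minimal Python sketch of one way such a structure could be stored. It is an illustration only, not the SPARF file format: the class name `SparseRadianceField`, its methods, and the per-voxel payload (a density plus a single RGB color) are assumptions made for this example. Plenoxels-style fields store spherical harmonic coefficients per voxel rather than one color.

```python
from dataclasses import dataclass, field

# Value returned for voxels that are not stored (assumed: zero density, black).
EMPTY = (0.0, (0.0, 0.0, 0.0))


@dataclass
class SparseRadianceField:
    """Hypothetical sparse voxel field: only occupied voxels are stored."""
    resolution: int
    voxels: dict = field(default_factory=dict)  # (i, j, k) -> (density, rgb)

    def set_voxel(self, ijk, density, rgb):
        # Reject coordinates outside the resolution^3 grid.
        if not all(0 <= c < self.resolution for c in ijk):
            raise IndexError(f"voxel {ijk} is outside a {self.resolution}^3 grid")
        self.voxels[ijk] = (density, rgb)

    def get_voxel(self, ijk):
        # Unstored voxels are treated as empty space.
        return self.voxels.get(ijk, EMPTY)

    def occupancy(self):
        # Fraction of the dense grid that is actually stored.
        return len(self.voxels) / self.resolution ** 3


def dense_cell_count(resolution):
    """Number of cells a dense grid of this resolution would need."""
    return resolution ** 3


if __name__ == "__main__":
    for r in (32, 128, 512):
        print(r, dense_cell_count(r))
```

A dense grid grows cubically with resolution (32,768 cells at 32³, 2,097,152 at 128³, and 134,217,728 at 512³), which is why storing only the occupied voxels matters at the higher resolutions.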

SPARF: Posed Multi-View Dataset

SPARF is a large, high-resolution posed multi-view dataset compared to other shape datasets.

SuRFNet: Learning to Generate Sparse Radiance Fields

The SuRFNet pipeline trains a sparse convolutional network conditioned on partial SRFs. The pipeline uses a combination of losses on the generated SRFs and a perceptual loss on the images rendered from the output SRF.
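The two-part objective described above can be sketched as a weighted sum of a voxel-space term and an image-space term. The sketch below is a simplified illustration under stated assumptions, not the SuRFNet loss itself: the function names, the mean-squared-error SRF term, and the `image_weight` value are hypothetical, and the image term is a plain L1 pixel difference standing in for a perceptual loss, which would normally compare features from a pretrained network.

```python
import numpy as np


def srf_loss(pred_srf, target_srf):
    """Assumed SRF term: mean squared error over per-voxel features."""
    return float(np.mean((pred_srf - target_srf) ** 2))


def image_loss_standin(rendered, reference):
    """Stand-in for a perceptual loss: mean absolute pixel difference."""
    return float(np.mean(np.abs(rendered - reference)))


def total_loss(pred_srf, target_srf, rendered, reference, image_weight=0.1):
    """Weighted sum of the voxel-space and image-space terms.

    image_weight is a hypothetical hyperparameter chosen for illustration.
    """
    return srf_loss(pred_srf, target_srf) + image_weight * image_loss_standin(
        rendered, reference
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.random((100, 4))      # 100 occupied voxels, 4 features each
    target = rng.random((100, 4))
    rendered = rng.random((8, 8, 3))  # tiny rendered image
    reference = rng.random((8, 8, 3))
    print(total_loss(pred, target, rendered, reference))
```

Supervising in both spaces means the network is penalized for errors in the voxel features and for errors that only become visible once the field is rendered.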

Mesh Extraction from SRFs

Since Sparse Radiance Fields (SRFs) are explicit 3D structures, obtaining high-quality 3D meshes is straightforward.
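Because the density values live on an explicit grid, a surface can be read off by thresholding them. This page does not state which extraction method is used; marching cubes is the usual choice for density grids. The sketch below uses a deliberately simpler stand-in that keeps voxels above a density threshold and emits every voxel face that borders empty space. The function names and the threshold value are assumptions for illustration.

```python
# The six axis-aligned neighbor directions of a voxel.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def occupied_voxels(densities, level):
    """Keep voxel coordinates whose density exceeds the threshold."""
    return {ijk for ijk, density in densities.items() if density > level}


def boundary_faces(occupied):
    """Return (voxel, outward direction) pairs for faces next to empty space.

    This yields a blocky surface, not the smooth one marching cubes produces.
    """
    faces = []
    for (i, j, k) in sorted(occupied):
        for (di, dj, dk) in DIRECTIONS:
            if (i + di, j + dj, k + dk) not in occupied:
                faces.append(((i, j, k), (di, dj, dk)))
    return faces


if __name__ == "__main__":
    # Two adjacent dense voxels plus one low-density voxel that is discarded.
    densities = {(0, 0, 0): 5.0, (1, 0, 0): 4.0, (2, 0, 0): 0.01}
    solid = occupied_voxels(densities, level=0.5)
    print(len(boundary_faces(solid)))
```

Each exposed face can be turned into two triangles to form a watertight, if blocky, mesh; a marching cubes implementation would replace `boundary_faces` when a smooth surface is needed.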

Novel View Synthesis from a Single View

SuRFNet naturally enables novel view synthesis from a single view by conditioning the output SRF on the partial SRFs built from the few input views. This pipeline achieves state-of-the-art results on novel view synthesis for unseen shapes compared to recent baselines (PixelNeRF and VisionNeRF).

Related Links

For more work on 3D radiance field generation, please also check out:

DreamFusion: Text-to-3D using 2D Diffusion performs text-guided NeRF generation with 2D diffusion. The authors propose Score Distillation Sampling to optimize samples via diffusion, which could potentially also be applied to modalities other than text.

LION: Latent Point Diffusion Models for 3D Shape Generation introduces a hierarchical approach to learn high-quality point cloud synthesis that can be augmented with modern surface reconstruction techniques to generate smooth 3D meshes.

DiffRF: Rendering-guided 3D Radiance Field Diffusion is a denoising diffusion probabilistic model directly operating on 3D radiance fields and trained with an additional volumetric rendering loss. This enables learning strong radiance priors with high rendering quality and accurate geometry.

BibTeX

@InProceedings{hamdi2022sparf, 
    author = {Hamdi, Abdullah and Ghanem, Bernard and Nie{\ss}ner, Matthias}, 
    title = {SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images}, 
    publisher = {arxiv}, 
    year = {2022},
}