3D point cloud

TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks

Neural radiance fields (NeRFs) generally require many images with accurate poses for accurate novel view synthesis, which does not reflect realistic setups where views can be sparse and poses can be noisy. Previous solutions for learning NeRFs with …

GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering

Advancements in 3D Gaussian Splatting (GS) have significantly accelerated 3D reconstruction and generation. However, GS may require a very large number of Gaussians, which can result in a substantial memory footprint. This paper introduces GES …

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

We present `Magic123`, a two-stage coarse-to-fine solution for generating high-quality, textured 3D meshes from a single unposed image in the wild using both 2D and 3D priors. In the first stage, we optimize a neural radiance field to produce a …

Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding

While Transformers have achieved impressive success in natural language processing and computer vision, their performance on 3D point clouds is relatively poor. This is mainly due to the limitation of Transformers: a demanding need for extensive …

EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

With the recent advances in video and 3D understanding, novel 4D spatio-temporal methods fusing both concepts have emerged. Towards this direction, the Ego4D Episodic Memory Benchmark proposed a task for Visual Queries with 3D Localization (VQ3D). …

SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images

Recent advances in Neural Radiance Fields (NeRFs) treat the problem of novel view synthesis as Sparse Radiance Field (SRF) optimization using sparse voxels for efficient and fast rendering (Plenoxels, InstantNGP). In order to leverage machine learning …

VARS: Video Assistant Referee System for Automated Soccer Decision Making From Multiple Views

The Video Assistant Referee (VAR) has revolutionized association football, enabling referees to review incidents on the pitch, make informed decisions, and ensure fairness. However, due to the lack of referees in many countries and the high cost of …

Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding

Multi-view projection methods have demonstrated promising performance on 3D understanding tasks like 3D classification and segmentation. However, it remains unclear how to combine such multi-view methods with the widely available 3D point clouds. …

MVTN: Multi-View Transformation Network for 3D Shape Recognition

Multi-view projection methods have demonstrated their ability to reach state-of-the-art performance on 3D shape recognition. Those methods learn different ways to aggregate information from multiple views. However, the camera viewpoints for those …

AdvPC: Transferable Adversarial Perturbations on 3D Point Clouds

Deep neural networks are vulnerable to adversarial attacks, in which imperceptible perturbations to their input lead to erroneous network predictions. This phenomenon has been extensively studied in the image domain, and has only recently been …