2nd Workshop for Learning 3D with Multi-View Supervision (3DMV)

at CVPR 2024




Call for Papers:   February 10th

Submission Deadline:   March 15th (11:59 pm Anywhere on Earth)

Workshop Day:   June 17th, 2024

Location:   Seattle Convention Center, WA (room: Summit 331)


Second Workshop for Learning 3D with Multi-View Supervision @ CVPR 2024

Following the success of the first Workshop for Learning 3D with Multi-View Supervision, held at CVPR 2023, we are excited to bring the second iteration of this workshop to CVPR 2024. With the growing interest and rapid advances in the field, this year's workshop offers more depth, more diverse topics, and broader participation. It covers topics involving multi-view deep learning for core 3D understanding tasks (recognition, detection, segmentation), as well as methods that use posed or unposed multi-view images for 3D reconstruction and generation. New topics of interest have been added, such as dynamic multi-view datasets and generative 4D models that leverage multi-view representations. The detailed topics covered in the workshop include the following (a toy sketch of the multi-view supervision signal follows the list):

  • Multi-View for 3D Understanding
  • Deep Multi-View Stereo
  • Multi-View for 3D Generation and Novel View Synthesis
  • Dynamic Multi-View Datasets and 4D Generative Models
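
To make the shared theme concrete, below is a minimal sketch (NumPy only) of the geometry behind multi-view supervision: a 3D estimate is projected into several posed views, and its disagreement with 2D observations provides the training signal. All function names and inputs are illustrative placeholders, not code from any submission or baseline.

```python
# Toy multi-view supervision: penalize the reprojection error of a
# 3D point hypothesis against 2D observations in several posed views.
import numpy as np

def project(points_w, K, R, t):
    """Project Nx3 world points into one pinhole camera -> Nx2 pixels."""
    cam = points_w @ R.T + t         # world -> camera coordinates
    uv = cam @ K.T                   # apply 3x3 intrinsics K
    return uv[:, :2] / uv[:, 2:3]    # perspective divide

def reprojection_loss(points_w, cameras, observations):
    """Mean squared pixel error of points_w across all posed views."""
    errs = [np.mean((project(points_w, K, R, t) - obs) ** 2)
            for (K, R, t), obs in zip(cameras, observations)]
    return float(np.mean(errs))
```

Minimizing such a loss over the 3D estimate (or over network weights that predict it) is the basic mechanism that posed multi-view images provide; unposed settings must additionally recover the per-view (R, t).
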
Submission Timeline

  • Paper submission opens: February 10th
  • Paper submission deadline: March 15th (11:59 pm Anywhere on Earth)
  • Review period: March 15th - April 1st
  • Decision to authors: April 1st
  • Camera-ready papers: April 7th
Call For Papers

We are soliciting papers that use multi-view deep learning to address problems in 3D understanding and 3D generation, including but not limited to the following topics (an illustrative code sketch follows the list):

  • Bird's-Eye View for 3D Object Detection
  • Multi-View Fusion for 3D Object Detection
  • Indoor/Outdoor Scene Segmentation
  • 3D Diffusion Models for 3D Generation
  • Diffusion Models for 4D Generation
  • 4D Understanding
  • Object Part Segmentation
  • Language + 3D
  • Medical 3D Segmentation and Analysis
  • 3D Shape Generation
  • Deep Multi-View Stereo
  • Inverse Graphics from Multi-View Images
  • Indoor/Outdoor Scene Generation and Reconstruction
  • Volumetric Multi-View Representations for 3D Generation and Novel View Synthesis
  • NeRFs and Gaussian Splatting
  • 3D Shape Classification
  • 3D Shape Retrieval
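
As a brief illustration of the recognition-oriented topics above (multi-view fusion, 3D shape classification), here is a minimal PyTorch sketch of MVCNN-style view pooling: per-view features from a shared backbone are max-pooled across views before classification. The backbone, layer sizes, and view count are hypothetical placeholders, not any paper's configuration.

```python
# Minimal MVCNN-style classifier: shared 2D backbone per view,
# cross-view max-pooling, then a linear classification head.
import torch
import torch.nn as nn

class MultiViewClassifier(nn.Module):
    def __init__(self, num_classes=40, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(          # toy stand-in for a real CNN
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, views):                   # views: (B, V, 3, H, W)
        b, v = views.shape[:2]
        feats = self.backbone(views.flatten(0, 1))       # (B*V, feat_dim)
        pooled = feats.view(b, v, -1).max(dim=1).values  # fuse across views
        return self.head(pooled)                # (B, num_classes)

logits = MultiViewClassifier()(torch.randn(2, 12, 3, 64, 64))  # 12 views
```

Max-pooling is only one fusion choice; attention- or transformer-based cross-view fusion falls equally within scope.
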
Paper Submission Guidelines

  • We accept both archival and non-archival paper submissions.
  • Archival submissions should be at most 8 pages (excluding references) and on the aforementioned and related topics.
  • Non-archival submissions can be previously published works in major venues (in the last two years or at CVPR 2024) or based on new works (max 8 pages as well).
  • Archival accepted papers will be included in the CVPR 2024 proceedings, while non-archival accepted papers will not.
  • Submitted manuscripts should follow the CVPR 2024 paper template (if they have not been published previously).
  • All submissions (except previously published works) will be peer-reviewed under a double-blind policy; authors should not include their names in submissions.
  • PDFs need to be submitted online through the link.
  • Authors of accepted papers will be notified to prepare camera-ready papers and posters, to be uploaded according to the schedule above.
  • Every accepted paper will have the opportunity to host a poster presentation at the workshop.
  • Some accepted papers will be selected for oral presentations at the workshop.
  • There will be a Best Poster Award, announced during the workshop, with a sponsored cash prize.
Schedule (June 17th, 2024) [room: Summit 331]

  • Opening Remarks (Abdullah Hamdi)
  • AI Generation of Immersive 3D Worlds (Matthias Niessner)
  • 3DTopia: Foundation Ecosystem for 3D Generative Models (Ziwei Liu)
  • Coffee Break
  • Scaling Multiview Reconstruction over Space and Time (Deva Ramanan)
  • Lunch Break
  • Oral Sessions (5 orals)
  • Posters Session
  • Coffee Break
  • Learning Objects and Pose from Multi-view Data (David Novotny)
  • 3D Gaussian Splatting as Markov Chain Monte Carlo (Andrea Tagliasacchi)
  • Announcement & Panel Discussion (all speakers and moderators)

    Oral Session Schedule (June 17th, 2024)

  • DUSt3R: Geometric 3D Vision Made Easy (Vincent Leroy, Naver Labs Europe)
  • Genie: Generative Interactive Environments (Yuge Shi, DeepMind London)
  • SACReg: Scene-Agnostic Coordinate Regression for Visual Localization (Jerome Revaud, Naver Labs Europe)
  • PointOfView: A Multi-modal Network for Few-shot 3D Point Cloud Classification Fusing Point and Multi-view Image Features (Senem Velipasalar, Syracuse University)
  • From 2D Portraits to 3D Realities: Advancing GAN Inversion for Enhanced Image Synthesis (Wonseok Oh, University of Michigan)

    Highlighted Paper Awards

    • Best Paper Award:

      Global Latent Neural Rendering [PDF]
Thomas Tanay (Huawei Noah's Ark Lab)*; Matteo Maggioni (Huawei Noah's Ark Lab)
    • Best Paper Runner-up Awards:

Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images [Proceedings]
Jaeyoung Chung (Seoul National University)*; Jeongtaek Oh (Seoul National University); Kyoung Mu Lee (Seoul National University)
Lifting Multi-View Detection and Tracking to the Bird's Eye View [Proceedings]
Torben Teepe (Technical University of Munich)*; Johannes Gilg (Technical University of Munich); Philipp Wolters (Technical University of Munich); Fabian Herzog (Technical University of Munich); Gerhard Rigoll (TUM)

    Accepted Papers (proceedings link)

Semi-Stereo: A Universal Stereo Matching Framework for Imperfect Data via Semi-supervised Learning [Proceedings]
Xin Yue (Tsinghua University); Zongqing Lu (Tsinghua University International Graduate School at Shenzhen); Xiangru Lin (HKU); Wenjia Ren (Tsinghua University); Zhijing SHAO (Prometheus Vision Technology Co., Ltd.)*; Haonan Hu (Tsinghua University); Yu Zhang (Prometheus Vision Technology); Qingmin Liao (Tsinghua University)
MonoSelfRecon: Purely Self-Supervised Explicit Generalizable 3D Reconstruction of Indoor Scenes from Monocular RGB Views [Proceedings]
Runfa Li (UC San Diego)*; Upal Mahbub (Qualcomm); Vasudev Bhaskaran (Qualcomm Inc.); Truong Nguyen (University of California, San Diego)
Lifting Multi-View Detection and Tracking to the Bird's Eye View [Proceedings]
Torben Teepe (Technical University of Munich)*; Johannes Gilg (Technical University of Munich); Philipp Wolters (Technical University of Munich); Fabian Herzog (Technical University of Munich); Gerhard Rigoll (TUM)
Global Latent Neural Rendering [PDF]
Thomas Tanay (Huawei Noah's Ark Lab)*; Matteo Maggioni (Huawei Noah's Ark Lab)
3D Clothed Human Reconstruction from Sparse Multi-view Images [Proceedings]
Jin Gyu Hong (Kwangwoon University); Seung Young Noh (Kwangwoon University); HeeKyung Lee (ETRI, Electronics and Telecommunications Research Institute); Won Sik Cheong (Electronics and Telecommunications Research Institute); Ju Yong Chang (Kwangwoon University)*
SACReg: Scene-Agnostic Coordinate Regression for Visual Localization [Proceedings]
Jerome Revaud (Naver Labs Europe); Yohann Cabon (Naver Labs Europe); Romain Brégier (Naver Labs Europe); JongMin Lee (SNU); Philippe Weinzaepfel (Naver Labs Europe)*
DepthVoting: A Few-Shot Point Cloud Classification Model Incorporating a Projection-Based Voting Mechanism [Proceedings]
Yunhui Zhu (Syracuse University)*; Jiajing Chen (Syracuse University); Senem Velipasalar (Syracuse University)
Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels [Proceedings]
Amaya Dharmasiri (Princeton University)*; Muzammal Naseer (MBZUAI); Salman Khan (MBZUAI); Fahad Shahbaz (MBZUAI)
MIMIC: Masked Image Modeling with Image Correspondences [Proceedings]
Kalyani S Marathe (University of Washington, Seattle)*; Mahtab Bigverdi (University of Washington); Nishat Anjum Khan (University of Washington); Tuhin Kundu (N/A); Patrick Howe (University of Washington); Sharan Ranjit S (University of Washington); Anand Bhattad (TTIC); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Linda Shapiro (University of Washington); Ranjay Krishna (University of Washington)
Selective Multi-View Deep Model for 3D Object Classification [Proceedings]
Mona Saleh Alzahrani (KFUPM); Muhammad Usman (King Fahd University of Petroleum and Minerals)*; Saeed Anwar (King Fahd University of Petroleum and Minerals); Tarek Helmy (KFUPM)
Posterior Distillation Sampling [PDF]
Juil Koo (KAIST)*; Chanho Park (KAIST); Minhyuk Sung (KAIST)
As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors [PDF]
Seungwoo Yoo (KAIST)*; Kunho Kim (KAIST); Vladimir Kim (Adobe Research); Minhyuk Sung (KAIST)
SOAC: Spatio-Temporal Overlap-Aware Multi-Sensor Calibration using Neural Radiance Fields [PDF]
Quentin HERAU (Huawei Technologies - Université de Bourgogne)*; Nathan Piasco (Huawei Technologies France); Moussab Bennehar (Huawei Paris Research Center); Luis G Roldao Jimenez (Huawei Noah's Ark Lab); Dzmitry Tsishkou (Huawei Technologies); Cyrille Migniot (Université de Bourgogne); Pascal Vasseur (Université de Picardie); Cedric Demonceaux (Université de Bourgogne)
From 2D Portraits to 3D Realities: Advancing GAN Inversion for Enhanced Image Synthesis [Proceedings]
Wonseok Oh (University of Michigan)*; Youngjoo Jo (ETRI)
DGBD: Depth Guided Branched Diffusion for Comprehensive Controllability in Multi-View Generation [Proceedings]
Hovhannes Margaryan (Picsart)*; Daniil Hayrapetyan (Picsart); Wenyan Cong (University of Texas at Austin); Zhangyang Wang (University of Texas at Austin); Humphrey Shi (Georgia Tech | Oregon | UIUC | PAIR)
2T-UNET: A Two Tower UNet with Depth Clues for Robust Stereo Depth Estimation [Proceedings]
Rohit Choudhary (IIT Madras); Mansi Sharma (Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, India)*; Rithvik Anil (Indian Institute of Technology, Madras)
Exploring 3D-aware Latent Spaces for Efficiently Learning Numerous Scenes [PDF]
Antoine Schnepf (Criteo, CNRS-I3S/Université Côte d'Azur); Karim Kassab (LASTIG/IGN-UGE)*; Jean-Yves Franceschi (Criteo AI Lab); Laurent Caraffa (Institut national de l'information géographique et forestière); Flavian Vasile (Criteo); Jeremie Mary (Criteo AI Lab); Andrew I Comport (CNRS-I3S/Université Côte d'Azur); Valérie Gouet-Brunet (LASTIG/IGN-UGE)
AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning [Proceedings]
Guoxian Song (ByteDance)*
ODIN: A Single Model for 2D and 3D Segmentation [Proceedings]
Ayush Jain (Carnegie Mellon University)*; Pushkal Katara (Carnegie Mellon University); Nikolaos Gkanatsios (Carnegie Mellon University); Adam Harley (Stanford University); Gabriel Sarch (Carnegie Mellon University); Kriti Aggarwal (Hippocratic AI); Vishrav Chaudhary (Microsoft Turing); Katerina Fragkiadaki (Carnegie Mellon University)
Color-cued Efficient Densification Method for 3D Gaussian Splatting [Proceedings]
Sieun Kim (Seoul National University)*; Kyungjin Lee (Seoul National University); Youngki Lee (Seoul National University)
PointOfView: A Multi-modal Network for Few-shot 3D Point Cloud Classification Fusing Point and Multi-view Image Features [Proceedings]
Huantao Ren (Syracuse University)*; Jiyang Wang (Syracuse University); Minmin Yang (Syracuse University); Senem Velipasalar (Syracuse University)
OGRMPI: An Efficient Multiview Integrated Multiplane Image based on Occlusion Guided Residuals [Proceedings]
Dae Yeol Lee (Dolby Laboratories)*; Guan-Ming Su (Dolby Labs); Peng Yin (Dolby Labs)
Sparse multi-view hand-object reconstruction for unseen environments [Proceedings]
Yik Lung Pang (Queen Mary University of London)*; Changjae Oh (Queen Mary University of London); Andrea Cavallaro (Idiap Research Institute)
Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images [Proceedings]
Jaeyoung Chung (Seoul National University)*; Jeongtaek Oh (Seoul National University); Kyoung Mu Lee (Seoul National University)
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models [Proceedings]
Divya Kothandaraman (University of Maryland, College Park)*; Tianyi Zhou (University of Maryland, College Park); Ming Lin (University of Maryland, College Park); Dinesh Manocha (University of Maryland, College Park)

Speakers

Matthias Niessner

Technical University of Munich

Ziwei Liu

Nanyang Technological University

Deva Ramanan

Carnegie Mellon University

David Novotny

Meta AI Research

Andrea Tagliasacchi

Google, Simon Fraser University

    Sponsors

    Organizers

    Abdullah Hamdi

    University of Oxford

    Chuanxia Zheng

    University of Oxford
    Contact: abdullah.hamdi@kaust.edu.sa
    CVPR 2024 Workshop ©2024