Multi-view Vision Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?

Title:
Multi-view Vision Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?
Journal Title:
IEEE Transactions on Circuits and Systems for Video Technology
Keywords:
Publication Date:
15 December 2023
Citation:
Peng, H., Li, B., Zhang, B., Chen, X., Chen, T., & Zhu, H. (2023). Multi-view Vision Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning? IEEE Transactions on Circuits and Systems for Video Technology, 1–1. https://doi.org/10.1109/tcsvt.2023.3343495
Abstract:
Point cloud based 3D deep models have wide applications in areas such as autonomous driving and household robots. Inspired by recent prompt learning in natural language processing, this work proposes a novel Multi-view Vision Fusion Network (MvNet) for few-shot 3D point cloud classification. MvNet investigates the possibility of leveraging off-the-shelf 2D pre-trained models for few-shot classification, which can alleviate the over-dependence of existing baseline models on large-scale annotated 3D point cloud data. Specifically, MvNet first encodes a 3D point cloud into multi-view image features for a number of different views. Then, a novel multi-view prompt fusion module is developed to fuse information from different views effectively, bridging the gap between 3D point cloud data and 2D pre-trained models. A set of 2D image prompts can then be derived to better describe the suitable prior knowledge for a large-scale pre-trained image model for few-shot 3D point cloud classification. Extensive experiments on the ModelNet, ScanObjectNN, and ShapeNet datasets demonstrate that MvNet achieves new state-of-the-art performance for few-shot 3D point cloud classification. The source code of this work is available at https://github.com/invictus717/MetaTransformer.
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the A*STAR - MTC Programmatic
Grant Reference no. : A18A2b0046

This research / project is supported by the A*STAR - RobotHTPO
Grant Reference no. : C211518008

This research / project is supported by the Singapore Economic Development Board (EDB) - Space Technology Development Grant (STDP)
Grant Reference no. : S22-19016- STDP
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
1558-2205
1051-8215
Files uploaded:

File Size Format
mvnet-tcsvt23.pdf 4.61 MB PDF