Extend ImageBind to 3D Point Cloud domain: Point-Bind

Open ZrrSkywalker opened this issue 2 years ago • 0 comments

Thanks very much for releasing such insightful work!

We develop a project based on ImageBind by aligning 3D point cloud modality with image, text, and audio as Point-Bind. Our project exhibits four main characters:

Align 3D with ImageBind . With a joint embedding space, 3D objects can be aligned with their corresponding 2D images, textual descriptions, and audio.
3D LLM via LLaMA-Adapter. In Multi-modal LLaMA-Adapter (ImageBind-LLM), we introduce an LLM following 3D instructions in Engish/中文.
3D Zero-shot Classify/Seg/Det . Point-Bind achieves state-of-the-art performance for 3D zero-shot tasks, including classification, segmentation, and detection.
Embedding Arithmetic with 3D. We observe that 3D features from Point-Bind can be added with other modalities to compose their semantics.

The Multi-modality LLaMA-Adapter (ImageBind-LLM) with Point-Bind's 3D embeddings is as follows: Thanks!

Jun 05 '23 23:06 ZrrSkywalker