GraphDreamer
[CVPR'24] GraphDreamer: a novel framework of generating compositional 3D scenes from scene graphs.
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
Gege Gao, Weiyang Liu, Anpei Chen, Andreas Geiger, Bernhard Schölkopf
CVPR 2024
Full Paper | arXiv | Project Page
GraphDreamer takes scene graphs as input and generates object compositional 3D scenes.
Abstract
This repository contains a PyTorch implementation of the paper GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs. Our work presents the first framework capable of generating compositional 3D scenes from scene graphs, where objects are represented as nodes and their interactions as edges. See the demo below to get a general idea.
Installation
Tested on CentOS 7.9 + Python 3.10.10 + PyTorch 2.0.1
git clone https://github.com/GGGHSL/GraphDreamer.git
cd GraphDreamer
Create environment:
python3.10 -m venv venv/GraphDreamer
source venv/GraphDreamer/bin/activate # Repeat this step for every new terminal
Install dependencies:
pip install -r requirements.txt
Install tiny-cuda-nn for running Hash Grid based representations:
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
Install NerfAcc for NeRF acceleration:
pip install git+https://github.com/KAIR-BAIR/nerfacc.git
The guidance model DeepFloyd IF currently requires you to accept its usage conditions. To do so, you need a Hugging Face account (log in from the terminal with huggingface-cli login) and must accept the license on the model card of DeepFloyd/IF-I-XL-v1.0.
Quick Start
Generate a compositional scene of "a blue jay standing on a large basket of rainbow macarons":
bash scripts/blue_jay.sh
Results of the first (coarse) and the second (fine) stage will be saved to examples/gd-if/blue_jay/ and examples/gd-sd-refine/blue_jay/, respectively.
Try different seeds by setting seed=YOUR_SEED in the script.
Use different tags to name different trials by setting export TG=YOUR_TAG to avoid overwriting. More examples can be found under scripts/.
Try with Your Own Prompts
Generating a compositional scene with GraphDreamer is as easy as with other dreamers. Here are the steps:
Step 1 - Describe your objects
Give each object you want to create in the scene a prompt by setting
export P1=YOUR_TEXT_FOR_OBJECT_1
export P2=YOUR_TEXT_FOR_OBJECT_2
export P3=YOUR_TEXT_FOR_OBJECT_3
and adding system.prompt_obj=[["$P1"],["$P2"],["$P3"]] to the bash script.
By default, object SDFs are initialized as spheres with randomly placed centers. The dispersion of the centers is scaled by the hyperparameter system.geometry.sdf_center_dispersion, which is set to 0.2.
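As a concrete illustration of Step 1, here is a minimal sketch for a two-object scene. The prompt texts are our own assumed split of the Quick Start sentence, and PROMPT_OBJ is a hypothetical helper variable used only to inspect the result. Only P1, P2, and system.prompt_obj come from the step above.

```bash
# Assumed object prompts for a two-object scene (illustrative wording).
export P1="a blue jay"
export P2="a large basket of rainbow macarons"

# Build the override in the same form as in Step 1.
# PROMPT_OBJ is a hypothetical helper name, not part of GraphDreamer.
# The shell consumes the inner double quotes, so they do not appear in the value.
PROMPT_OBJ=system.prompt_obj=[["$P1"],["$P2"]]

# Print the override to check it before adding it to the launch command.
echo "$PROMPT_OBJ"
```

With two objects, system.geometry.num_objects in the launch command would presumably be set to 2 to match.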
Step 2 - Describe object relationships
Compose your objects into a scene by giving each object a prompt on its relationship to another object
export P12=RELATIONSHIP_BETWEEN_OBJECT_1_AND_2
export P13=RELATIONSHIP_BETWEEN_OBJECT_1_AND_3
export P23=RELATIONSHIP_BETWEEN_OBJECT_2_AND_3
and add system.prompt_global=[["$P12"],["$P23"],["$P13"]] to your script. Based on these relationships, define the edges of the scene graph with export E=[[0,1],[1,2],[0,2]] and add system.edge_list=$E.
Prompt the global scene by combining P12, P13, and P23 into a sentence
export P=GLOBAL_TEXT_FOR_THE_SCENE
and add system.prompt_processor.prompt="$P" into the script.
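The pieces of Step 2 can be sketched together as follows. The scene (a cat, a sofa, and a floor lamp) and all prompt texts are hypothetical, as are the helper variables PROMPT_GLOBAL, EDGE_LIST, and PROMPT. The sketch assumes, based on the ordering shown above, that the k-th relationship prompt describes the k-th edge.

```bash
# Hypothetical relationship prompts for objects 1 (cat), 2 (sofa), 3 (floor lamp).
export P12="a cat sitting on a sofa"
export P23="a sofa next to a floor lamp"
export P13="a cat looking at a floor lamp"

# Edges use zero-based object indices and follow the order P12, P23, P13.
export E="[[0,1],[1,2],[0,2]]"

# Global prompt combining the three relationships (assumed wording).
export P="a cat sitting on a sofa next to a floor lamp, looking at the lamp"

# Hypothetical helper variables holding the three overrides from Step 2.
PROMPT_GLOBAL=system.prompt_global=[["$P12"],["$P23"],["$P13"]]
EDGE_LIST="system.edge_list=$E"
PROMPT="system.prompt_processor.prompt=$P"

# Print one override per line for inspection.
printf '%s\n' "$PROMPT_GLOBAL" "$EDGE_LIST" "$PROMPT"
```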
Step 3 - Negative prompts (optional)
In this compositional scenario, we found a simple way to create the "negative" prompt for individual objects. For each object, all other objects plus their relationships can be used as a negative prompt,
export N1=$P23
export N2=$P13
export N3=$P12
and setting system.prompt_obj_neg=[["$N1"],["$N2"],["$N3"]].
You can further refine each negative prompt based on this general rule.
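To make the rule concrete, the following sketch derives the negative prompts for a hypothetical three-object scene (a cat, a sofa, and a floor lamp). The prompt texts and the helper variable PROMPT_OBJ_NEG are assumptions for illustration. Each object's negative prompt is the one relationship that does not involve that object.

```bash
# Hypothetical relationship prompts for objects 1 (cat), 2 (sofa), 3 (floor lamp).
export P12="a cat sitting on a sofa"
export P13="a cat looking at a floor lamp"
export P23="a sofa next to a floor lamp"

# Negative prompt for each object: the relationship between the other two objects.
export N1="$P23"   # object 1 (cat) is pushed away from "sofa + lamp"
export N2="$P13"   # object 2 (sofa) is pushed away from "cat + lamp"
export N3="$P12"   # object 3 (lamp) is pushed away from "cat + sofa"

# Hypothetical helper variable holding the override from Step 3.
PROMPT_OBJ_NEG=system.prompt_obj_neg=[["$N1"],["$N2"],["$N3"]]
echo "$PROMPT_OBJ_NEG"
```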
Step 4 - Coarse-to-fine training
Start a new training run with
export TG=YOUR_OWN_TAG
# Use different tags to avoid overwriting
python launch.py --config CONFIG_FILE --train --gpu 0 exp_root_dir="examples" system.geometry.num_objects=3 use_timestamp=false tag=$TG OTHER_CONFIGS
Set your own tag for the output folder with export TG=YOUR_OWN_TAG and tag=$TG. To add a timestamp to the folder name, set use_timestamp=true.
The training configurations for the coarse stage are stored in configs/gd-if.yaml and the fine stage in configs/gd-sd-refine.yaml.
To resume from a previous checkpoint, e.g., to start the fine stage from a coarse-stage run, add
resume=examples/gd-if/$TG/ckpts/last.ckpt
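The two stages can be chained as in the dry-run sketch below, which only prints the commands instead of launching training. The tag my_trial and the helper variables COMMON, COARSE_CMD, and FINE_CMD are hypothetical. The config paths, flags, and checkpoint path are taken from the text above, and the prompt overrides from Steps 1 to 3 (OTHER_CONFIGS) are left out for brevity.

```bash
# Hypothetical tag; pick a new one per trial to avoid overwriting.
export TG="my_trial"

# Arguments shared by both stages (prompt overrides omitted here).
COMMON="--train --gpu 0 exp_root_dir=examples system.geometry.num_objects=3 use_timestamp=false tag=$TG"

# Stage 1 (coarse) uses configs/gd-if.yaml.
COARSE_CMD="python launch.py --config configs/gd-if.yaml $COMMON"

# Stage 2 (fine) uses configs/gd-sd-refine.yaml and resumes from the coarse checkpoint.
FINE_CMD="python launch.py --config configs/gd-sd-refine.yaml $COMMON resume=examples/gd-if/$TG/ckpts/last.ckpt"

# Dry run: print both commands for review.
echo "$COARSE_CMD"
echo "$FINE_CMD"
```

Once the printed commands look right, run them one after the other with your prompt overrides appended. The fine stage should start only after the coarse stage has written its checkpoint.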
More Applications
GraphDreamer can be used to invert the semantics of a given image into a 3D scene, by extracting a scene graph directly from the input image with ChatGPT-4.
To generate more objects and accelerate convergence, you may provide rough center coordinates for initializing each object by setting in the script:
export C=[[X1,Y1,Z1],[X2,Y2,Z2],...,[Xm,Ym,Zm]]
This will initialize the SDF-based objects as spheres centered at your given coordinates. The initial size of each object SDF sphere can also be customized by setting the radius:
export R=[R1,R2,...,Rm]
Check ./threestudio/models/geometry/gdreamer_implicit_sdf.py for more details on this implementation.
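Since C and R each need one entry per object, a quick consistency check before launching can catch mismatched lists. The sketch below is only an illustration: the coordinate and radius values are made up, NUM_CENTERS and NUM_RADII are hypothetical helper variables, and the config keys that consume C and R are not shown because they are not named above.

```bash
# Made-up centers and radii for three objects (values are illustrative only).
export C="[[-0.5,0.0,0.0],[0.5,0.0,0.0],[0.0,0.0,0.5]]"
export R="[0.3,0.3,0.2]"

# Count centers: every center adds one "[" on top of the single outer bracket.
NUM_CENTERS=$(( $(printf '%s' "$C" | tr -cd '[' | wc -c) - 1 ))

# Count radii: a flat list with n entries contains n - 1 commas.
NUM_RADII=$(( $(printf '%s' "$R" | tr -cd ',' | wc -c) + 1 ))

# Both lists should describe the same number of objects.
if [ "$NUM_CENTERS" -ne "$NUM_RADII" ]; then
  echo "Mismatch: $NUM_CENTERS centers but $NUM_RADII radii" >&2
  exit 1
fi
echo "Initializing $NUM_CENTERS objects"
```

The same count should presumably also match system.geometry.num_objects in the launch command.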
Acknowledgement
The authors extend their thanks to Zehao Yu and Stefano Esposito for their invaluable feedback on the initial draft. Our thanks also go to Yao Feng, Zhen Liu, Zeju Qiu, Yandong Wen, and Yuliang Xiu for their proofreading of the final draft and for their insightful suggestions which enhanced the quality of this paper. Additionally, we appreciate the assistance of those who participated in our user study.
Weiyang Liu and Bernhard Schölkopf were supported by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039B, and by the Machine Learning Cluster of Excellence, the German Research Foundation (DFG): SFB 1233, Robust Vision: Inference Principles and Neural Mechanisms, TP XX, project number: 276693517. Andreas Geiger and Anpei Chen were supported by the ERC Starting Grant LEGO-3D (850533) and the DFG EXC number 2064/1 - project number 390727645.
This codebase is developed upon threestudio. We appreciate its maintainers for their significant contributions to the community.
Citation
@Inproceedings{gao2024graphdreamer,
author = {Gege Gao and Weiyang Liu and Anpei Chen and Andreas Geiger and Bernhard Schölkopf},
title = {GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024},
}