This is the official implementation of My3DGen: A Scalable Personalized 3D Generative Model. We personalize a pretrained GAN-based model (EG3D) using only a few (~50) selfies of an individual, without full finetuning, enabling scalable personalization in real-world scenarios.
The framework of the proposed My3DGen model:
We build our code on top of the EG3D repository. Please refer to EG3D to set up the environment and checkpoints, then run the following commands:
conda activate eg3d
pip install lpips
We use the celebrity dataset from images-of-celebs.
We follow EG3D's processing pipeline to preprocess the images-of-celebs dataset.
We originally used oneThousand1000/EG3D-projector (an unofficial EG3D inversion implementation) to generate the latent codes for the images-of-celebs dataset. You may also use more recent state-of-the-art inversion methods, e.g. (see the loading sketch after this list):
[2303.13497] TriPlaneNet: An Encoder for EG3D Inversion
[2303.12326] Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding
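As a quick reference, here is a minimal sketch of loading a projected latent code saved next to its image, following the directory layout below. The file-naming convention comes from that layout; the latent shape (e.g. a 14x512 w+ code) depends on the inversion method you choose and is only an assumption here.

# Minimal sketch: load a projected latent code stored next to its image.
# The image_<i>_latent.npy naming follows the dataset layout below; the exact
# latent shape (e.g. 14x512 for a w+ code) depends on the inversion method.
import numpy as np
import torch

def load_latent(npy_path, device='cuda'):
    latent = np.load(npy_path)
    ws = torch.from_numpy(latent).float().to(device)
    if ws.dim() == 2:  # add a batch dimension if the projector saved a single code
        ws = ws.unsqueeze(0)
    return ws

ws = load_latent('./datasets/Barack/image_0_latent.npy')
print(ws.shape)

Organize each identity's processed images, latent codes, and camera labels as follows: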
└── datasets
├── Barack
│ ├── image_0.png
│ ├── image_0_latent.npy
│ ├── ...
│ ├── image_n.png
│ ├── image_n_latent.npy
│ └── dataset.json
├── Morgan
│ ├── image_0.png
│ ├── image_0_latent.npy
│ ├── ...
│ ├── image_n.png
│ ├── image_n_latent.npy
│ └── dataset.json
├── ...
└── Celebrity_N
├── image_0.png
├── image_0_latent.npy
├── ...
├── image_n.png
├── image_n_latent.npy
└── dataset.json
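Before training, a small sanity check like the one below can catch missing latent files or camera labels. It assumes dataset.json follows EG3D's convention of {"labels": [[filename, 25-dim camera label], ...]}; adjust it if your preprocessing writes a different format.

# Sanity-check sketch: every image_<i>.png should have a matching
# image_<i>_latent.npy and an entry in dataset.json (assumed to follow EG3D's
# {"labels": [[filename, camera_label], ...]} convention).
import json
from pathlib import Path

def check_identity_folder(folder):
    folder = Path(folder)
    labels = json.loads((folder / 'dataset.json').read_text())['labels']
    labeled = {name for name, _ in labels}
    for img in sorted(folder.glob('image_*.png')):
        latent = img.with_name(img.stem + '_latent.npy')
        if not latent.exists():
            print(f'missing latent for {img.name}')
        if img.name not in labeled:
            print(f'missing camera label for {img.name}')

check_identity_folder('./datasets/Barack')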
Go to the ./eg3d directory and modify the following command to train the model:
python train.py --outdir=<output_dir> \
--data=<dataset_dir e.g. ./datasets/Barack> \
--resume=<pretrained_model_path e.g. ./networks/ffhqrebalanced512-128.pkl> \
--cfg=<cfg_file e.g. ffhq> \
--batch=<batch_size e.g. 4> \
--gpus=<num_gpus e.g. 4> \
--snap=<num_snapshots e.g. 5> \
--kimg=<num_kimg e.g. 500> \
--lora=<lora_rank e.g. 1> \
--lora_alpha=<lora_alpha e.g. 1> \
--adalora=False \
--freeze_render=False
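The --lora and --lora_alpha flags control the low-rank adapters used for personalization. The snippet below is a generic LoRA wrapper for a linear layer, included only to illustrate what rank r and alpha mean; it is not the repo's exact parametrization of the EG3D layers.

# Generic LoRA illustration (not the repo's exact implementation): the frozen
# pretrained weight is augmented with a trainable low-rank update scaled by alpha / r.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 1, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # base(x) + (alpha / r) * x A^T B^T
        return self.base(x) + self.scale * (x @ self.lora_A.t() @ self.lora_B.t())

layer = LoRALinear(nn.Linear(512, 512), r=1, alpha=1.0)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 1024 trainable params

With r=1, a 512x512 linear layer gains only 512 + 512 = 1,024 trainable parameters, which is what keeps per-identity personalization lightweight.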
After training, you can run various downstream tasks.
For inversion, you can refer to danielroich/PTI, the official implementation of "Pivotal Tuning for Latent-based Editing of Real Images" (ACM TOG 2022, https://arxiv.org/abs/2106.05744).
For editing, you can refer to google/mystyle.
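For a quick qualitative check of the personalized generator, a script along the lines of EG3D's gen_samples.py can render a view from the finetuned pickle. The sketch below assumes the dnnlib, legacy, and camera_utils modules from the EG3D codebase are importable (i.e. it is run from ./eg3d); the snapshot path is a placeholder.

# Sketch adapted from EG3D's gen_samples.py: load the personalized generator
# and render one roughly frontal view. <output_dir>/network-snapshot.pkl is a placeholder.
import numpy as np
import torch
import dnnlib
import legacy
from camera_utils import LookAtPoseSampler, FOV_to_intrinsics

device = torch.device('cuda')
with dnnlib.util.open_url('<output_dir>/network-snapshot.pkl') as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device).eval()

cam_pivot = torch.tensor(G.rendering_kwargs.get('avg_camera_pivot', [0, 0, 0]), device=device)
cam_radius = G.rendering_kwargs.get('avg_camera_radius', 2.7)
cam2world = LookAtPoseSampler.sample(np.pi / 2, np.pi / 2, cam_pivot, radius=cam_radius, device=device)
intrinsics = FOV_to_intrinsics(18.837, device=device)
c = torch.cat([cam2world.reshape(-1, 16), intrinsics.reshape(-1, 9)], 1)

z = torch.randn(1, G.z_dim, device=device)
img = G(z, c)['image']  # (1, 3, H, W) tensor in [-1, 1]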
If you find this work useful for your research, please cite our paper:
@misc{qi2024my3dgenscalablepersonalized3d,
title={My3DGen: A Scalable Personalized 3D Generative Model},
author={Luchao Qi and Jiaye Wu and Annie N. Wang and Shengze Wang and Roni Sengupta},
year={2024},
eprint={2307.05468},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2307.05468},
}