🧪 Dataset Curation using Blender & BlenderProc

This directory contains scripts for generating synthetic reflection datasets using BlenderProc. Make sure your working directory is Reflection-Exploration/BlenderProc.

🔗 For an overview of BlenderProc, refer to the examples/ folder in the BlenderProc repository.

🔧 Environment Setup

conda create -n blender python=3.10
conda activate blender
pip install -e .
blenderproc pip install debugpy hydra-core==1.3.2 hydra-colorlog==1.2.0 loguru
pip install fiftyone loguru simple-aesthetics-predictor tqdm autoroot autorootcwd

✨ Dataset Sources and Aesthetic Filtering

We provide utilities to filter objects based on aesthetic scores.

Scripts:

scripts/download_renderings.py: to fetch renderings.
scripts/predict_aesthetics.py: to score renderings.

💡 You can optionally download the renderings manually from this link.

💡 You can download the Amazon Berkeley Objects (ABO) dataset from this link.
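
As an illustration of the filtering step, here is a minimal sketch that keeps only objects whose aesthetic score meets a threshold. The function name, the `uid -> score` dictionary layout, and the threshold value are assumptions made for this example; they are not taken from `scripts/predict_aesthetics.py`.

```python
from typing import Dict, List


def filter_by_aesthetic_score(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the UIDs whose score is at least `threshold`, best first."""
    kept = [uid for uid, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda uid: scores[uid], reverse=True)


# Hypothetical scores; real values would come from the scoring script.
example_scores = {"uid_a": 6.1, "uid_b": 3.4, "uid_c": 5.0}
selected = filter_by_aesthetic_score(example_scores, threshold=5.0)
print(selected)
```

A suitable threshold depends on the score distribution of your renderings, so it is worth inspecting a few objects on either side of the cut before committing to a value.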

🏗️ Data Generation

During execution, Blender will be installed if not already present (required by BlenderProc).

Basic Command

blenderproc run reflection/main.py

Recommended (for robust and reproducible runs)

python rerun.py --seed 1234 \
    run reflection/main.py \
    --camera reflection/resources/cam_novel_poses.txt \
    --input_dir ~/data/hf-objaverse-v1/glbs \
    --output_dir ~/data/blenderproc/hf-objaverse-v2/ \
    --hdri ~/data/blenderproc/resources/HDRI \
    --textures ~/data/blenderproc/resources/cc_textures \
    --split_file reflection/resources/splits/split_0.txt \
    --spurious_file reflection/resources/spurious.json
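
To show the general idea behind a rerun wrapper, here is a minimal sketch that launches the same command several times in fresh processes, so each run starts with a clean state and can continue with objects that are not yet in the output directory. This is not the repository's `rerun.py`; the function name `run_repeatedly` and the `num_runs` parameter are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def run_repeatedly(command: Sequence[str], num_runs: int) -> List[int]:
    """Run `command` `num_runs` times in separate processes and collect exit codes."""
    exit_codes = []
    for _ in range(num_runs):
        # check=False: a failed run should not abort the remaining runs.
        result = subprocess.run(list(command), check=False)
        exit_codes.append(result.returncode)
    return exit_codes


# Stand-in command; a real call would wrap the `blenderproc run ...` invocation.
demo_command = [sys.executable, "-c", "print('rendering one batch')"]
print(run_repeatedly(demo_command, num_runs=2))
```

With a loop like this, a flag that forces already-rendered objects to be processed again would make every run repeat the same work, which is consistent with the warning about `--reprocess` in the flag list.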

🔧 Command Line Arguments

This script accepts several command-line arguments for controlling rendering, scene setup, and dataset configuration:

Required / Optional Arguments

| Argument | Description | Default |
| --- | --- | --- |
| `--camera` | Path to the camera pose file used for rendering. | `reflection/resources/cam_poses.txt` |
| `--mirror` | Path to the mirror `.glb` file (scene geometry). | `reflection/resources/all_mirrors.glb` |
| `--hdri` | Directory containing HDRI environment maps. | `/data/manan/data/objaverse/blenderproc/resources/HDRI` |
| `--textures` | Directory containing floor textures. | `blenderproc/resources/textures` |
| `--object` | Path to a specific 3D object file (usually `.glb`). | `reflection/resources/objaverse_examples/063b1b7d877a402ead76cedb06341681/063b1b7d877a402ead76cedb06341681.glb` |
| `--input_dir` | Directory containing all input 3D objects. | `reflection/resources/objaverse_examples/` |
| `--split_file` | Optional path to a dataset split file. | `""` |
| `--num_render` | Number of renderings per object. | `3` |
| `--spurious_file` | JSON file listing spurious objects to exclude. | `/data/manan/data/objaverse/blenderproc/hf-objaverse-v1/spurious_0.json` |
| `--output_dir` | Output directory for saving generated data. | `reflection/output/blenderproc` |
| `--max_objects` | Max number of objects to process in this run. | `75` |
| `--max_time` | Maximum allowed processing time (in minutes). | `30` |
| `--max_render_time` | Timeout for rendering a single object (in seconds). | `30` |
| `--model_3d_type` | File format of the 3D model (`glb`, `obj`, `fbx`). | `glb` |
| `--seed` | Random seed for reproducibility. | `None` |

Boolean Flags

These flags are optional and toggle specific behaviors when provided.

| Flag | Description |
| --- | --- |
| `--small_mirrors` | Randomly select mirrors from a small-mirrors subset. |
| `--disable_rotate` | Prevent automatic rotation of objects. |
| `--fast_testing` | Enable quick rendering with reduced quality (for debugging). |
| `--single_run` | Perform only a single rendering operation for test/debug. |
| `--reprocess` | Reprocess an object even if already present in the output. (⚠ Avoid using with `rerun.py`) |
| `--check_spurious` | Dynamically check for spurious objects during import. |
| `--multiple_objects` | Render scenes with multiple objects instead of just one. |
| `--create_rotate_trans_test_set` | Create a test set specifically for evaluating rotation and translation understanding. |
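
To illustrate how valued arguments and boolean flags differ, here is a minimal `argparse` sketch covering a small subset of the options documented above. The argument names and defaults come from the tables; the `int` types, the `choices` restriction, and the `build_parser` helper are assumptions and may not match `reflection/main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a subset of the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of a few documented options.")
    # Valued arguments: defaults taken from the table, types assumed.
    parser.add_argument("--num_render", type=int, default=3)
    parser.add_argument("--max_objects", type=int, default=75)
    parser.add_argument("--max_time", type=int, default=30)  # minutes
    parser.add_argument("--model_3d_type", choices=["glb", "obj", "fbx"], default="glb")
    parser.add_argument("--seed", type=int, default=None)
    # Boolean flags: False unless the flag is present on the command line.
    parser.add_argument("--fast_testing", action="store_true")
    parser.add_argument("--reprocess", action="store_true")
    return parser


args = build_parser().parse_args(["--num_render", "5", "--fast_testing"])
print(args.num_render, args.max_objects, args.fast_testing, args.reprocess)
```

Arguments that are not passed keep their defaults, while a flag such as `--fast_testing` only becomes `True` when it appears on the command line.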

Visualize Outputs

blenderproc vis hdf5 reflection/output/blenderproc/0.hdf5

📊 Dataset Construction & Upload

Final splits are created via dataset.ipynb using cuDF-pandas.
👉 Install RAPIDS before usage.

Upload to HuggingFace

Create a .env file containing your HuggingFace token:

HF_TOKEN=<your_token>

Then run:

python reflection/upload.py

Faster Uploads

pip install 'huggingface_hub[hf_transfer]'

Update .env:

HF_HUB_ENABLE_HF_TRANSFER=1
HF_HUB_ETAG_TIMEOUT=500
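
For reference, a `.env` file is a list of `KEY=VALUE` lines. Here is a minimal standard-library sketch of how such a file can be parsed; the `parse_env_text` helper is hypothetical, and whether `reflection/upload.py` uses a library such as `python-dotenv` for this step is not stated here.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and `#` comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


example_env = """
# HuggingFace settings
HF_TOKEN=<your_token>
HF_HUB_ENABLE_HF_TRANSFER=1
HF_HUB_ETAG_TIMEOUT=500
"""
settings = parse_env_text(example_env)
print(sorted(settings))
```

All values are read as strings, so a numeric setting like the timeout has to be converted by whichever code consumes it.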

🖼️ Extract Images & Visualize

HDF5 Rendering Checker

check_rendering.py inspects HDF5 files generated by rendering processes to identify common rendering issues. It checks for:

Incomplete Renderings: Verifies if each unique ID (assumed to be the immediate parent directory name of an HDF5 file) has a specific number of associated renderings (defaulting to 3, as per the check_number_renderings function).
Black Images/Maps: Checks if colors, category_id_segmaps, depth, and normals datasets within each HDF5 file are entirely black (all pixel values are zero), which often indicates a rendering failure.
Monochromatic Normal Maps: Detects if the normals dataset contains entirely uniform values, which could indicate a rendering issue where normal data is not properly generated (e.g., a flat color instead of actual surface normals).

If any issues are detected, the script will output the UIDs (Unique Identifiers) corresponding to the problematic renderings.

python check_renderings.py \
    --input_dir <input_dir_path> \
    --input_file <input_file_path> \
    --output_file <output_file_path>
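
The three checks described above can be sketched on in-memory data as follows. This is a minimal illustration, not the script's implementation: the helper names are hypothetical, the arrays are assumed to have shape `(H, W, C)`, and reading the datasets from HDF5 files is left out.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, List

import numpy as np


def find_incomplete_uids(hdf5_paths: Iterable[str], expected: int = 3) -> List[str]:
    """Return UIDs (parent directory names) without exactly `expected` files."""
    counts = Counter(PurePosixPath(path).parent.name for path in hdf5_paths)
    return sorted(uid for uid, count in counts.items() if count != expected)


def is_black(array: np.ndarray) -> bool:
    """True if every value in the array is zero."""
    return not np.any(array)


def is_monochromatic(array: np.ndarray) -> bool:
    """True if every pixel of an (H, W, C) array holds the same value."""
    pixels = array.reshape(-1, array.shape[-1])
    return bool(np.all(pixels == pixels[0]))


# Hypothetical layout: <output_dir>/<uid>/<index>.hdf5
paths = ["out/uid_a/0.hdf5", "out/uid_a/1.hdf5", "out/uid_a/2.hdf5", "out/uid_b/0.hdf5"]
print(find_incomplete_uids(paths))

flat_normals = np.full((4, 4, 3), 0.5)
print(is_black(np.zeros((4, 4, 3))), is_monochromatic(flat_normals))
```

Note that an all-black array is also monochromatic under this definition, so a file that fails the first array check will fail the second one as well.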

Convert HDF5 to PNG

The extract_images.py script can be used to convert the generated HDF5 files into PNG images. This script offers several command-line arguments to control the input, output, and types of images extracted.

Command Line Arguments for extract_images.py

| Argument | Type | Description |
| --- | --- | --- |
| `--input_dir` | `str` | Input directory containing HDF5 files to extract images from. |
| `--input_file` | `str` | Optional path to a file containing UIDs (e.g., a list of object IDs) to selectively extract images for. If `None`, all found HDF5 files in `input_dir` will be processed. |
| `--count` | `int` | Number of images to extract. If `None`, all available images will be extracted. |
| `--output_dir` | `str` | Output directory where the extracted PNG images will be saved. |
| `--extract_mask` | `store_true` | Flag to extract mask images (binary segmentation masks). |
| `--extract_masked_image` | `store_true` | Flag to extract masked images (e.g., foreground objects with a black background). |
| `--extract_depth` | `store_true` | Flag to extract depth images. |
| `--extract_normal` | `store_true` | Flag to extract normal map images. |

Usage Examples:

1. Extract all image types from a specific input directory to a specified output directory:

python reflection/extract_images.py \
    --input_dir <input_dir_path> \
    --output_dir <output_dir_path> \
    --extract_mask --extract_masked_image --extract_depth --extract_normal
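
To make the "masked image" output concrete, here is a minimal NumPy sketch that keeps foreground pixels and sets everything else to black. The `apply_mask` helper and the array shapes are assumptions for illustration; the way `extract_images.py` derives the mask from the segmentation map is not shown.

```python
import numpy as np


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep pixels where `mask` is nonzero; set all other pixels to black.

    `image` is assumed to be (H, W, 3) and `mask` to be (H, W).
    """
    foreground = mask.astype(bool)[..., np.newaxis]  # broadcast over channels
    return np.where(foreground, image, 0).astype(image.dtype)


# Tiny synthetic example: a uniform grey image with a diagonal foreground mask.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
masked = apply_mask(image, mask)
print(masked[0, 0], masked[0, 1])
```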

Visualize Using FiftyOne

On Remote

pip install fiftyone
python reflection/visualise.py

On Local

pip install "fiftyone[desktop]"
fiftyone app connect --destination test@<remote-ip>

You can tag images (e.g., with a flag tag) and revisit them later via the filter UI.

📂 Public Dataset

Dataset hosted on HuggingFace:
👉 Coming Soon!
