This repository provides scripts to fine-tune a multimodal model based on Janus-Pro and to run inference with it. The project uses LoRA for parameter-efficient fine-tuning and supports image-text interactions.
Before running the scripts, prepare your environment:
pip install -r requirements.txt
git lfs install
git clone https://huggingface.co/deepseek-ai/Janus-Pro-1B
git clone https://github.com/deepseek-ai/Janus.git
mv Janus/janus janus
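To verify the setup, the cloned checkpoint and the janus package can be loaded directly in Python. The snippet below is a minimal sketch based on the upstream Janus examples; the training and inference scripts in this repository may load the model differently.

from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor

model_path = "Janus-Pro-1B"  # the local clone from the step above
processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
model: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True
)
print(type(model).__name__)  # should report the multimodal causal LM class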
The train_janus_pro_lora.py script fine-tunes the Janus-Pro model with LoRA on a dataset of image-text pairs.
python train_janus_pro_lora.py --data_dir dataset/images \
--pretrained_model Janus-Pro-1B \
--output_dir ./janus_lora_output \
--batch_size 2 \
--max_epochs 15 \
--lr 5e-5 \
--seed 42
--data_dir: Path to the directory containing images and text files.
--pretrained_model: Name or path of the pre-trained Janus-Pro model.
--output_dir: Directory to save the fine-tuned model.
--batch_size: Number of samples per batch.
--max_epochs: Number of training epochs.
--lr: Learning rate.
--seed: Random seed for reproducibility.
The fine-tuned model will be saved in output_dir, including the LoRA adapter and the updated processor.
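If the adapter is saved in the standard PEFT format (which the script's use of LoRA suggests), it can be reattached to the base model outside of these scripts as well. The sketch below is illustrative only: it assumes the adapter files sit directly in the output directory and is not the exact loading code used by image2text.py.

from peft import PeftModel
from transformers import AutoModelForCausalLM
from janus.models import VLChatProcessor

base = AutoModelForCausalLM.from_pretrained("Janus-Pro-1B", trust_remote_code=True)
# Attach the fine-tuned LoRA weights saved by the training script
# (assumes the adapter was written to the top level of the output directory).
model = PeftModel.from_pretrained(base, "./janus_lora_output")
# The updated processor (tokenizer + image transforms) is saved alongside the adapter.
processor = VLChatProcessor.from_pretrained("./janus_lora_output")
# Optionally fold the adapter into the base weights for standalone use.
model = model.merge_and_unload()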
The image2text.py script performs inference with a fine-tuned Janus-Pro model and an optional LoRA adapter.
python image2text.py --model_path ./janus_lora_output \
--image_path sample.jpg \
--question "What is in the image?" \
--max_new_tokens 512
--model_path: Path to the base Janus-Pro model.
--lora_path: (Optional) Path to the fine-tuned LoRA adapter.
--image_path: Path to the input image.
--question: Text query to ask about the image.
--max_new_tokens: Maximum number of generated tokens.
The script prints the generated text response to the console.
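For context, inference with Janus-Pro roughly follows the upstream chat pipeline: the question and image are packed into a conversation, encoded by VLChatProcessor, and decoded through the language model's generate method. The sketch below is based on the upstream Janus examples rather than on image2text.py itself; the conversation roles and helper names come from that upstream code.

from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor
from janus.utils.io import load_pil_images

model_path = "Janus-Pro-1B"  # base model; a LoRA adapter can be attached as sketched earlier
processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
tokenizer = processor.tokenizer
model: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True
).eval()

conversation = [
    {
        "role": "<|User|>",
        "content": "<image_placeholder>\nWhat is in the image?",
        "images": ["sample.jpg"],
    },
    {"role": "<|Assistant|>", "content": ""},
]

# Encode the image and prompt, then generate a text answer.
pil_images = load_pil_images(conversation)
inputs = processor(conversations=conversation, images=pil_images, force_batchify=True)
inputs_embeds = model.prepare_inputs_embeds(**inputs)
outputs = model.language_model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True,
)
print(tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True))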
For further details, refer to the source code in train_janus_pro_lora.py and image2text.py.