Create a good4cir script #1

@PranaviKolouju

Script for Composed Image Retrieval (CIR) Dataset Generation using GPT-4o

Description

Create a user-friendly script that generates a Composed Image Retrieval (CIR) dataset using the GPT-4o model. The script should encapsulate the full pipeline, as specified in the requirements below.

Requirements

The script should:

  1. Allow users to:
    • Specify their input dataset.
    • Provide their OpenAI API key.
    • Customize their prompts for GPT-4o.
  2. Implement all three stages of the CIR pipeline.
  3. Include post-processing logic to clean, validate, and store the final dataset.
  4. Be modular but runnable end-to-end from a single script.
  5. Include in-line comments directing users on:
    • Where to specify their dataset.
    • How to insert their API key.
    • What the output directory will be.
    • Where to specify custom prompts.
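
As a starting point for discussion, the requirements above could be laid out roughly as follows. This is only a sketch under stated assumptions, not the intended implementation: the issue does not name the three stages, so `stage_1` to `stage_3`, the JSON input and output format, the CLI flag names, and the `call_model` callable (which a real script would back with a GPT-4o request) are all hypothetical placeholders.

```python
import argparse
import json
import os
from dataclasses import dataclass, field
from pathlib import Path

# Where to specify custom prompts. The stage names are placeholders:
# the issue does not say what the three CIR stages are called.
DEFAULT_PROMPTS = {
    "stage_1": "EDIT ME: prompt for stage 1",
    "stage_2": "EDIT ME: prompt for stage 2",
    "stage_3": "EDIT ME: prompt for stage 3",
}


@dataclass
class Config:
    dataset_path: Path  # where to specify the dataset
    output_dir: Path    # what the output directory will be
    api_key: str        # how to insert the API key (flag or env variable)
    prompts: dict = field(default_factory=lambda: dict(DEFAULT_PROMPTS))


def parse_args(argv=None) -> Config:
    """Collect dataset path, API key, output directory and prompt overrides."""
    parser = argparse.ArgumentParser(description="Generate a CIR dataset (sketch).")
    parser.add_argument("--dataset", required=True, type=Path)
    parser.add_argument("--output-dir", type=Path, default=Path("output"))
    parser.add_argument("--api-key", default=os.environ.get("OPENAI_API_KEY", ""))
    parser.add_argument("--prompts", type=Path, default=None,
                        help="optional JSON file that overrides the default prompts")
    args = parser.parse_args(argv)
    prompts = dict(DEFAULT_PROMPTS)
    if args.prompts is not None:
        prompts.update(json.loads(args.prompts.read_text()))
    return Config(args.dataset, args.output_dir, args.api_key, prompts)


def load_records(path: Path) -> list:
    """Assumed input format: a JSON file holding a list of record dicts."""
    return json.loads(path.read_text())


def run_stage(name, records, cfg, call_model):
    """Apply one stage's prompt to every record and store the reply under `name`."""
    prompt = cfg.prompts[name]
    return [{**rec, name: call_model(prompt, rec)} for rec in records]


def postprocess(records):
    """Clean (strip whitespace) and validate (drop records with an empty stage output)."""
    cleaned = []
    for rec in records:
        outputs = [str(rec.get(stage, "")).strip() for stage in DEFAULT_PROMPTS]
        if all(outputs):
            cleaned.append({**rec, **dict(zip(DEFAULT_PROMPTS, outputs))})
    return cleaned


def run_pipeline(cfg, call_model) -> Path:
    """Run the three stages in order, post-process, and store the final dataset."""
    records = load_records(cfg.dataset_path)
    for stage in DEFAULT_PROMPTS:
        records = run_stage(stage, records, cfg, call_model)
    records = postprocess(records)
    cfg.output_dir.mkdir(parents=True, exist_ok=True)
    out_path = cfg.output_dir / "dataset.json"
    out_path.write_text(json.dumps(records, indent=2))
    return out_path
```

Passing `call_model` in as a parameter keeps each stage testable without network access. In the final script it would be replaced by a function that sends the prompt and record to GPT-4o using the key in `cfg.api_key`.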

Deliverables

  • A single script: generate_dataset.py
  • A requirements.txt file listing all necessary dependencies.
