mbari-aidata

A command line tool for extract, transform, load (ETL) and download operations on AI data for MBARI projects that require detection, clustering, or classification workflows. It is designed to work with Tator, a web-based platform for video and image annotation and data management, and with Redis queues for ingesting data from real-time workflows.

More documentation and examples are available at https://docs.mbari.org/internal/ai/data.

🚀 Features

  • 🧠 Object Detection/Clustering Integration: Loads detection/classification/clustering output from SDCAT formatted results.
  • Flexible Data Export: Downloads from Tator into machine learning formats like COCO, CIFAR, or PASCAL VOC.
  • Crop localizations into optimized datasets for training classification models.
  • Real-Time Uploads: Pushes localizations to Tator via Redis queues for real-time workflows.
  • Metadata Extraction: Parses image metadata such as GPS, time, and date through a plugin-based system (extractors).
  • Duplicate Detection & flexible media references: Supports duplicate media load checks with the --check-duplicates flag.
  • Images or video can be loaded through a web server without needing to upload or move them from your internal NFS project mounts (e.g. Thalassa).
  • Video can be uploaded without needing to handle the transcoding required for web viewing.
  • Video tracks can be uploaded into Tator for training and evaluation.
  • Multiple data versions can be downloaded into a single dataset for training or evaluation using the --version flag with comma-separated values; data is combined through Non-Maximum Suppression (NMS) to remove duplicate boxes (see the example after this list).
  • Augmentation Support: Augment VOC datasets with Albumentations to boost your object detection model performance.
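
For example, the multi-version download described above looks like the following; the version names, labels, and config file are the same placeholders used later in this README, so substitute your own:

aidata download --version Baseline,ver0 --labels "Diatoms, Copepods" --config config_cfe.yml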

Requirements

  • Python 3.10 or higher
  • A Tator API token and (optional) Redis password for the .env file. Contact the MBARI AI team for access.
  • 🐳 Docker, for development and testing only; it can also be used instead of a local Python installation.
  • For video loads, you will need to install the Python packages listed in the requirements.txt file, ffmpeg, and the mp4dump tool from https://www.bento4.com/ (see the sketch below).
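
A minimal sketch of the video-load prerequisites on a Debian/Ubuntu or macOS host; the package-manager commands and install path shown here are assumptions, not part of aidata:

# Python dependencies (run from the repository root)
pip install -r requirements.txt

# ffmpeg from the system package manager
sudo apt-get install ffmpeg      # Debian/Ubuntu
brew install ffmpeg              # macOS

# mp4dump ships with the Bento4 SDK; download it from https://www.bento4.com/
# and put its bin/ directory on your PATH (the location below is hypothetical)
export PATH="$PATH:/opt/bento4/bin"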

📦 Installation

Install as a Python package:

pip install mbari-aidata
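
A quick way to confirm the installation is to print the top-level help:

aidata -h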

Create the .env file with the following contents in the root directory of the project:

TATOR_TOKEN=your_api_token
REDIS_PASSWORD=your_redis_password
ENVIRONMENT=testing or production

Create a configuration file in the root directory of the project:

touch config_cfe.yml

Or use a project-specific configuration from our docs server at https://docs.mbari.org/internal/ai/projects/

This file configures the project data, such as mounts, plugins, and database connections. The --config option also accepts a URL, so a hosted configuration can be used directly, for example:

aidata download --version Baseline --labels "Diatoms, Copepods" --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml

⚙️Example configuration file:

# config_cfe.yml
# Config file for CFE project production
mounts:
  - name: "image"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"

  - name: "video"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"


plugins:
  - name: "extractor"
    module: "mbari_aidata.plugins.extractors.tap_cfe_media"
    function: "extract_media"

redis:
  host: "doris.shore.mbari.org"
  port: 6382

vss:
  project: "902111-CFE"
  model: "google/vit-base-patch16-224"

tator:
  project: "902111-CFE"
  host: "https://mantis.shore.mbari.org"
  image:
    attributes:
      iso_datetime: #<-------Required for images
        type: datetime
      depth:
        type: float
  video:
    attributes:
      iso_start_datetime:  #<-------Required for videos
        type: datetime
  box:
    attributes:
      Label:
        type: string
      score:
        type: float
      cluster:
        type: string
      saliency:
        type: float
      area:
        type: int
      exemplar:
        type: bool

  track_state:
    attributes:
      Label:
        type: string
      max_score:
        type: float
      num_frames:
        type: int
      verified:
        type: bool
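
With this file saved locally as config_cfe.yml, any of the commands in this README can point at it via --config; for example, the download shown earlier becomes:

aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.yml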
    

Tracks Format

Track data is stored in a compressed .tar.gz archive with a -tracks.tar.gz suffix, e.g. video-tracks/tracks.tar.gz, and is loaded with the tracks subcommand:

aidata load tracks --input video-tracks/tracks.tar.gz --dry-run --config config_cfe.yml

The compressed file bundles the per-frame detections, the video metadata, and the per-track summaries described below.
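
To check the layout of an archive before loading it, list its contents; the flat layout shown here is an assumption based on the files described below:

tar -tzf video-tracks/tracks.tar.gz
# expected to list something like:
#   detections.csv
#   metadata.json
#   tracks.csv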

The detections.csv file contains the detections for each frame, e.g.

frame tracker_id label score x y xx xy
3 2 Copepod 0.6826763153076172 0.7003568708896637 0.4995344939055266 0.7221783697605133 0.5368460761176215
3 1 Copepod 0.7094097137451172 0.2693319320678711 0.6148265485410337 0.29686012864112854 0.6434915330674913
3 3 Detritus 0.2776843011379242 0.2693319320678711 0.6148265485410337 0.29686012864112854 0.6434915330674913
4 1 Copepod 0.49819645285606384 0.2683655321598053 0.6125818323206018 0.2965434789657593 0.6455737643771702

Metadata about the video is in the metadata.json file, e.g.

{ "video_name": "video.mp4", 
  "video_path": "/data/input/video.mp4", 
  "processed_at": "2025-11-15T13:37:35.997007Z", 
  "total_frames": 12000, 
  "video_width": 1920, 
  "video_height": 1080, 
  "video_fps": 10, 
  "total_detections": 3000, 
  "unique_tracks": 148, 
  "detection_threshold": 0.15, 
  "min_track_frames": 5, 
  "slice_size": 800, 
  "rfdetr_model": "/mnt/models/best/checkpoint_best_total.pth" }

The tracks.csv file contains the tracks for each frame, e.g.

tracker_id label first_frame last_frame frame_count avg_score
2 Detritus 3 37 35 0.3780171153800829
1 Copepod 3 36 31 0.5898609180604258
3 Copepod 3 37 34 0.5619616565458914
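
Once the --dry-run output looks correct, the same command without the flag performs the actual load (using the archive and config paths from the example above):

aidata load tracks --input video-tracks/tracks.tar.gz --config config_cfe.yml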

🐳 Docker usage

A Docker image is also available as mbari/aidata:latest, or with CUDA support as mbari/aidata:cuda-124. For example, to download data from the Baseline version using the Docker image:

docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.yml

To download multiple versions:

docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline,ver0 --labels "Diatoms, Copepods" --config config_cfe.yml
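
If the container cannot read your .env file, Docker's --env-file flag can pass the same variables into the container; whether aidata picks these up from the environment rather than a mounted .env is an assumption to verify for your setup:

docker run -it --rm --env-file .env -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.yml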

Commands

  • aidata download --help - Download data, such as images and boxes, into machine learning formats, e.g. COCO, CIFAR, or PASCAL VOC. Augmentation is supported for VOC-exported data using Albumentations.
  • aidata load --help - Load data, such as images, boxes, or clusters, into either a Postgres or Redis database.
  • aidata db --help - Commands related to database management
  • aidata transform --help - Commands related to transforming downloaded data
  • aidata -h - Print help message and exit.

Source code is available at github.com/mbari-org/aidata.

Development

See the Development Guide or the justfile for more information on how to set up the development environment.

🗓️ Last updated: 2025-11-17
