A command line tool to do extract, transform, load (ETL) and download operations on AI data for a number of projects at MBARI that require detection, clustering or classification workflows. This tool is designed to work with Tator, a web based platform for video and image annotation and data management and Redis queues for ingesting data from real-time workflows.
More documentation and examples are available at https://docs.mbari.org/internal/ai/data.
- 🧠 Object Detection/Clustering Integration: Loads detection/classification/clustering output from SDCAT formatted results.
- Flexible Data Export: Downloads from Tator into machine learning formats like COCO, CIFAR, or PASCAL VOC.
- Crop localizaions into optimized datasets for training classification models.
- Real-Time Uploads: Pushes localizations to Tator via Redis queues for real-time workflows.
- Metadata Extraction: Parses images metadata such as GPS/time/date through a plugin-based system (extractors).
- Duplicate Detection & flexible media references: Supports duplicate media load checks with the --check-duplicates flag.
- Images or video can be loaded through a web server without needing to upload or move them from your internal NFS project mounts (e.g. Thalassa)
- Video can be uploaded without needing to figure out how to do the video transcoding required for web viewing.
- Video tracks can be uploaded into Tator for training and evaluation.
- Multiple data versions can be downloaded into a single dataset for training or evaluation using the --version flag with comma separated values. Data is combined through Non-Maximum Suppression (NMS) to remove duplicate boxes.
- Augmentation Support: Augment VOC datasets with Albumentations to boost your object detection model performance.
- Python 3.10 or higher
- A Tator API token and (optional) Redis password for the .env file. Contact the MBARI AI team for access.
- 🐳Docker for development and testing only, but it can also be used instead of a local Python installation.
- For video loads, you will need to install the required Python packages listed in the
requirements.txtfile, ffmpeg, and the mp4dump tool from https://www.bento4.com/
Install as a Python package:
pip install mbari-aidataCreate the .env file with the following contents in the root directory of the project:
TATOR_TOKEN=your_api_token
REDIS_PASSWORD=your_redis_password
ENVIRONMENT=testing or production
Create a configuration file in the root directory of the project:
touch config_cfe.yamlOr, use the project specific configuration from our docs server at https://docs.mbari.org/internal/ai/projects/
This file will be used to configure the project data, such as mounts, plugins, and database connections.
aidata download --version Baseline --labels "Diatoms, Copepods" --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml⚙️Example configuration file:
# config_cfe.yml
# Config file for CFE project production
mounts:
- name: "image"
path: "/mnt/CFElab"
host: "https://mantis.shore.mbari.org"
nginx_root: "/CFElab"
- name: "video"
path: "/mnt/CFElab"
host: "https://mantis.shore.mbari.org"
nginx_root: "/CFElab"
plugins:
- name: "extractor"
module: "mbari_aidata.plugins.extractors.tap_cfe_media"
function: "extract_media"
redis:
host: "doris.shore.mbari.org"
port: 6382
vss:
project: "902111-CFE"
model: "google/vit-base-patch16-224"
tator:
project: "902111-CFE"
host: "https://mantis.shore.mbari.org"
image:
attributes:
iso_datetime: #<-------Required for images
type: datetime
depth:
type: float
video:
attributes:
iso_start_datetime: #<-------Required for videos
type: datetime
box:
attributes:
Label:
type: string
score:
type: float
cluster:
type: string
saliency:
type: float
area:
type: int
exemplar:
type: bool
track_state:
attributes:
Label:
type: string
max_score:
type: float
num_frames:
type: int
verified:
type: bool
Track data is stored in a compressed .tar.gz file with the -tracks.tar.gz, e.g.
aidata load tracks --input video-tracks/tracks.tar.gz --dry-run --config config_cfe.ymlvideo-tracks/tracks.tar.gz. This compressed file contains a structure like:
The detections.csv file contains the detections for each frame, e.g.
| frame | tracker_id | label | score | x | y | xx | xy |
|---|---|---|---|---|---|---|---|
| 3 | 2 | Copepod | 0.6826763153076172 | 0.7003568708896637 | 0.4995344939055266 | 0.7221783697605133 | 0.5368460761176215 |
| 3 | 1 | Copepod | 0.7094097137451172 | 0.2693319320678711 | 0.6148265485410337 | 0.29686012864112854 | 0.6434915330674913 |
| 3 | 3 | Detritus | 0.2776843011379242 | 0.2693319320678711 | 0.6148265485410337 | 0.29686012864112854 | 0.6434915330674913 |
| 4 | 1 | Copepod | 0.49819645285606384 | 0.2683655321598053 | 0.6125818323206018 | 0.2965434789657593 | 0.6455737643771702 |
Metadata about video is in the metadata.json file, e.g.
{ "video_name": "video.mp4",
"video_path": "/data/input/video.mp4",
"processed_at": "2025-11-15T13:37:35.997007Z",
"total_frames": 12000,
"video_width": 1920,
"video_height": 1080,
"video_fps": 10,
"total_detections": 3000,
"unique_tracks": 148,
"detection_threshold": 0.15,
"min_track_frames": 5,
"slice_size": 800,
"rfdetr_model": "/mnt/models/best/checkpoint_best_total.pth" }The tracks.csv file contains the tracks for each frame, e.g.
| tracker_id | label | first_frame | last_frame | frame_count | avg_score |
|---|---|---|---|---|---|
| 2 | Detritus | 3 | 37 | 35 | 0.3780171153800829 |
| 1 | Copepod | 3 | 36 | 31 | 0.5898609180604258 |
| 3 | Copepod | 3 | 37 | 34 | 0.5619616565458914 |
A docker version is also available at mbari/aidata:latest or mbari/aidata:latest:cuda-124.
For example, to download data from version Baseline using the docker image:
docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.ymlto download multiple versions
docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline,ver0 --labels "Diatoms, Copepods" --config config_cfe.yml`aidata download --help- Download data, such as images, boxes, into various formats for machine learning e.g. COCO, CIFAR, or PASCAL VOC format. Augmentation supported for VOC exported data using Albumentations.aidata load --help- Load data, such as images, boxes, or clusters into either a Postgres or REDIS databaseaidata db --help- Commands related to database managementaidata transform --help- Commands related to transforming downloaded dataaidata -h- Print help message and exit.
Source code is available at github.com/mbari-org/aidata.
See the Development Guide for more information on how to set up the development environment or the justfile
🗓️ Last updated: 2025-11-17
