DANDI Schema

dandi-schema is a Python library for maintaining and managing DANDI metadata models and schemas.

Installation

pip install dandischema

Description

DANDI (Distributed Archives for Neurophysiology Data Integration) is a BRAIN Initiative-supported platform for publishing, sharing, and processing cellular neurophysiology data. Every Dandiset or associated asset is described by a structured metadata object that can be retrieved through the DANDI API. The dandi-schema library provides Python data models and helper utilities to create, validate, migrate, and manage these metadata objects.

The metadata models are defined as Pydantic models. The corresponding JSON Schema schemas are generated, versioned, and stored in the dandi/schema repository, together with a context.json file for JSON-LD compliance. Both representations of the metadata models, the Pydantic models and their JSON Schema schemas, are used across the DANDI ecosystem (see the dedicated section in DANDI Docs for integration details). Additionally, this library provides tools for converting Dandiset metadata to DataCite metadata for DOI generation.

Important files in this repository include:

  • models.py - contains the Pydantic models defining the metadata models
  • metadata.py - contains functions for validating, migrating, and aggregating metadata
  • datacite package - contains functions for converting Dandiset metadata to DataCite metadata

Customization with Vendor Information

The DANDI metadata models defined in this library can be customized with vendor-specific information. The parameters of the customization are defined by the fields of the Config class in dandischema/conf.py. The Config class is a subclass of pydantic_settings.BaseSettings, and the values of the fields in an instance of the Config class can be set through environment variables and .env files, as documented in the Pydantic Settings documentation. Specifically,

  • The value of a field is read from an environment variable whose name matches, case-insensitively, one of the field's aliases. For example, the instance_name field can be set from the DANDI_INSTANCE_NAME or DJANGO_DANDI_INSTANCE_NAME environment variable.
  • A value of a complex type (e.g., list, set, dict) should be expressed as a JSON-encoded string in an environment variable. For example, the value for the licenses field, which is of type set, can be set from the DANDI_LICENSES environment variable defined as the following:
    export DANDI_LICENSES='["spdx:CC0-1.0", "spdx:CC-BY-4.0"]'
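
The two rules above can be illustrated with a small standard-library sketch. It is not the code that dandischema or Pydantic Settings actually runs. It only mimics the described behavior: case-insensitive matching of an environment variable name against a field's aliases, and JSON decoding of complex values. The FIELD_ALIASES table and the read_field helper are hypothetical names introduced for this illustration; the alias names are the ones mentioned in the examples above, and the real definitions live in dandischema/conf.py.

```python
import json

# Hypothetical alias table, built only from the examples in this README.
# The authoritative list of fields and aliases is in dandischema/conf.py.
FIELD_ALIASES = {
    "instance_name": ("DANDI_INSTANCE_NAME", "DJANGO_DANDI_INSTANCE_NAME"),
    "licenses": ("DANDI_LICENSES",),
}

# Fields whose values are complex types and therefore JSON-encoded.
COMPLEX_FIELDS = {"licenses": set}


def read_field(field, environ):
    """Return the value of `field` found in `environ`, or None if unset."""
    # Normalize variable names so the alias lookup is case-insensitive.
    normalized = {name.upper(): value for name, value in environ.items()}
    for alias in FIELD_ALIASES[field]:
        raw = normalized.get(alias.upper())
        if raw is None:
            continue
        if field in COMPLEX_FIELDS:
            # Complex values arrive as JSON text, e.g. '["a", "b"]'.
            return COMPLEX_FIELDS[field](json.loads(raw))
        return raw
    return None


# A stand-in for os.environ, so the sketch has no side effects.
env = {
    "dandi_instance_name": "EXAMPLE-ARCHIVE",  # lower case still matches
    "DANDI_LICENSES": '["spdx:CC0-1.0", "spdx:CC-BY-4.0"]',
}

instance_name = read_field("instance_name", env)
licenses = read_field("licenses", env)
print(instance_name)
print(sorted(licenses))
```

In real use there is no need for such a helper: exporting the variables (or placing them in a .env file) before the Config class is instantiated should be enough, since Pydantic Settings performs the lookup and decoding itself.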
