Data transformation pipeline for stable isotope measurements from TANGO (Trophic structure in Antarctic benthic ecology Gathering Observations) expeditions 1 and 2.
This repository processes stable isotope data (δ13C, δ15N, δ34S) from Sterechinus neumayeri (sea urchin) specimens collected during TANGO 1 and TANGO 2 expeditions in Antarctic waters. The pipeline transforms raw data into Darwin Core format for standardized biodiversity data sharing.
- Data Integration: Combines stable isotope measurements with TANGO expedition occurrence and event data
- Darwin Core Compliance: Outputs standardized Darwin Core occurrence and measurement-or-fact (MoF) tables
- Quality Control: Handles coordinate precision, depth measurements, and taxonomic information
- Range-based Matching: Matches isotope sample IDs against the sample ID ranges recorded in the TANGO 2 data
- Reproducible Environment: Uses renv for package dependency management
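As an illustration of the quality-control feature listed above, here is a minimal base R sketch of coordinate and depth checks. The column names (lat, lon, depth_m), the values, and the 5-decimal rounding are assumptions made for this example; the checks actually performed in src/transform-data.R may differ.

```r
# Hypothetical station records; the third row is deliberately invalid.
stations <- data.frame(
  lat     = c(-64.812345678, -65.1, -95),
  lon     = c(-62.9, -64.000049, 10),
  depth_m = c("12", "20.5", "n/a"),
  stringsAsFactors = FALSE
)

# Flag coordinates that fall outside valid latitude/longitude bounds.
stations$coord_ok <- abs(stations$lat) <= 90 & abs(stations$lon) <= 180

# Round to 5 decimal places (roughly metre-level precision); assumed choice.
stations$lat <- round(stations$lat, 5)
stations$lon <- round(stations$lon, 5)

# Convert depth to numeric; entries that cannot be parsed become NA.
stations$depth_m <- suppressWarnings(as.numeric(stations$depth_m))

print(stations)
```

Flagging rather than dropping invalid rows keeps the decision about how to handle them visible to whoever reviews the data.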
- R (>= 4.0)
- RStudio (recommended)
- renv package (automatically installed when opening the project)
git clone https://github.com/biodiversity-aq/tango_stable_isotope.git
cd tango_stable_isotope
This project uses renv for reproducible package management. When you first open the project in RStudio or start R in this directory, renv will automatically activate via the .Rprofile file.
- Open the tango_stable_isotope.Rproj file in RStudio
- renv will automatically activate and prompt you to restore the project library
- Run the following command to install all required packages:
renv::restore()
This installs all packages recorded in the renv.lock file. If no lockfile is present, renv::restore() has nothing to restore from; follow the fresh-setup steps below instead.
- Start R in the project directory
- The .Rprofile will automatically load renv
- Restore the project library:
renv::restore()
If this is a fresh setup without an renv.lock file, you'll need to install the required packages:
# Required packages
install.packages(c("tidyverse", "here", "readxl", "janitor", "hms", "data.table"))
# Capture the project dependencies
renv::snapshot()
Once the environment is set up with renv, you can run the data transformation:
- Open src/transform-data.R
- Click "Source" or press Ctrl+Shift+S (Windows/Linux) or Cmd+Shift+S (Mac)
Alternatively, run it from the R console:
source("src/transform-data.R")
The script will:
- Download TANGO 1 and TANGO 2 expedition data from GitHub
- Read local stable isotope metadata from data/01_raw/Stable_Isotopes_Metadata.xlsx
- Join and transform the datasets
- Generate Darwin Core outputs:
  - data/02_output/occurrence.txt: Occurrence records with taxonomic and event data
  - data/02_output/mof.txt: Extended measurements-or-facts table with isotope values
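The final step, generating the output files, can be sketched in base R as follows. This assumes the .txt outputs are tab-delimited with empty strings for missing values, which is a common convention for Darwin Core text files but is not confirmed by this README. The data frame contents are made up, and the sketch writes to a temporary directory rather than data/02_output/.

```r
# Hypothetical two-row occurrence table (values are illustrative only).
occurrence <- data.frame(
  occurrenceID    = c("S1", "S2"),
  scientificName  = "Sterechinus neumayeri",
  decimalLatitude = c(-64.8, NA),
  stringsAsFactors = FALSE
)

# Write as tab-delimited text: no quoting, no row names, NA as empty field.
out_file <- file.path(tempdir(), "occurrence.txt")
write.table(
  occurrence, out_file,
  sep = "\t", quote = FALSE, na = "", row.names = FALSE,
  fileEncoding = "UTF-8"
)

# Read the file back to confirm that the structure survived the round trip.
check <- read.delim(out_file, stringsAsFactors = FALSE, na.strings = "")
print(check)
```

Reading the file back immediately is a cheap way to catch delimiter or quoting problems before the data is shared.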
data/
├── 01_raw/                              # Raw input data
│   ├── Stable_Isotopes_Metadata.xlsx    # Primary isotope measurements
│   ├── TANGO_1_DATA.xlsx                # TANGO 1 expedition data (reference)
│   └── TANGO_2_DATA.xlsx                # TANGO 2 expedition data (reference)
└── 02_output/                           # Processed Darwin Core outputs
    ├── occurrence.txt                   # Occurrence table
    └── mof.txt                          # Measurement-or-fact table
# Check project status
renv::status()
# Install a new package and add to dependencies
install.packages("package_name")
renv::snapshot()
# Update all packages to latest versions
renv::update()
# Remove unused packages
renv::clean()
# Reset library to match renv.lock
renv::restore()
# Deactivate renv (temporary)
renv::deactivate()
# Reactivate renv
renv::activate()
- .Rprofile: Automatically activates renv when R starts in this directory
- renv/activate.R: The renv activation script
- renv.lock (if present): JSON file recording exact package versions and sources
- renv/library/ (gitignored): Project-specific package library
If you encounter issues:
# Repair the local library
renv::repair()
# Completely rebuild the library
renv::restore(rebuild = TRUE)
# Remove a specific package from the cache and reinstall
renv::purge("package_name")
renv::restore()
The pipeline integrates three data sources:
- Stable Isotope Measurements (data/01_raw/Stable_Isotopes_Metadata.xlsx)
  - δ13C, δ15N, δ34S measurements
  - Specimen morphometrics (size, height)
  - Collection metadata (date, location, depth)
- TANGO 1 Expedition Data (from GitHub repository)
  - Event and sample records
  - Collection methodology
  - Collector information
- TANGO 2 Expedition Data (from GitHub repository)
  - Event and sample records with sample ID ranges
  - Collection methodology
  - Collector information
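Because the TANGO 2 records carry sample ID ranges rather than single IDs, an isotope sample has to be matched to whichever range contains its ID. Below is a minimal base R sketch of that idea. It assumes ranges are written as "start-end" strings over numeric IDs; the column names (event_id, sample_range, sample_id) and all values are hypothetical, and the matching logic in src/transform-data.R may differ.

```r
# Hypothetical TANGO 2-style records: each event covers a range of sample IDs.
events <- data.frame(
  event_id     = c("EV_A", "EV_B", "EV_C"),
  sample_range = c("100-104", "105-109", "120"),  # a lone ID is also allowed
  stringsAsFactors = FALSE
)

# Hypothetical isotope samples, each identified by a single numeric ID.
samples <- data.frame(sample_id = c(101, 109, 120, 150))

# Split each range string into numeric bounds; a lone ID is its own range.
parts <- strsplit(events$sample_range, "-", fixed = TRUE)
events$range_start <- as.numeric(vapply(parts, function(p) p[1], character(1)))
events$range_end   <- as.numeric(vapply(parts, function(p) p[length(p)], character(1)))

# Return the first event whose range contains the ID, or NA if none does.
match_event <- function(id, events) {
  hit <- which(id >= events$range_start & id <= events$range_end)
  if (length(hit) == 0) NA_character_ else events$event_id[hit[1]]
}

samples$event_id <- vapply(samples$sample_id, match_event, character(1),
                           events = events)
print(samples)
```

Samples that fall outside every range get NA instead of being dropped, so unmatched IDs stay visible for review.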
The occurrence table (data/02_output/occurrence.txt) is a Darwin Core formatted table containing:
- Taxonomic identification (Sterechinus neumayeri)
- Event information (date, time, location, depth)
- Material sample identifiers
- Collector information
- Geographic coordinates (WGS84)
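As a rough illustration of how such a table can be assembled, the base R sketch below joins specimen records to event records and maps the result onto Darwin Core term names. The term names (occurrenceID, eventID, scientificName, and so on) are standard Darwin Core terms; the input column names, the values, and the choice of basisOfRecord are assumptions for this example, not taken from the pipeline.

```r
# Hypothetical event records standing in for the TANGO expedition data.
events <- data.frame(
  event_id = c("EV_A", "EV_B"),
  date     = c("2023-02-10", "2023-02-12"),  # made-up dates
  lat      = c(-64.8, -65.1),                # made-up coordinates
  lon      = c(-62.9, -64.0),
  depth_m  = c(12, 20),
  stringsAsFactors = FALSE
)

# Hypothetical specimen records, each linked to one event.
specimens <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  event_id  = c("EV_A", "EV_A", "EV_B"),
  stringsAsFactors = FALSE
)

# Left join: keep every specimen, attach its event information.
joined <- merge(specimens, events, by = "event_id", all.x = TRUE)

# Map the joined columns onto Darwin Core terms.
occurrence <- data.frame(
  occurrenceID         = joined$sample_id,
  eventID              = joined$event_id,
  basisOfRecord        = "MaterialSample",  # assumed value
  scientificName       = "Sterechinus neumayeri",
  eventDate            = joined$date,
  decimalLatitude      = joined$lat,
  decimalLongitude     = joined$lon,
  geodeticDatum        = "WGS84",
  minimumDepthInMeters = joined$depth_m,
  maximumDepthInMeters = joined$depth_m,
  stringsAsFactors = FALSE
)
print(occurrence)
```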
The measurement-or-fact table (data/02_output/mof.txt) holds extended measurements including:
- Stable isotope ratios (δ13C, δ15N, δ34S) in per mille
- Morphometric measurements (height, size ambitus) in mm
- Standardized measurement types and units
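A measurement-or-fact table is a long-format table: one row per specimen per measurement, rather than one column per measurement. The base R sketch below shows one way to reshape a wide table into that form. The input column names, the measurementType labels, and the values are assumptions for illustration; measurementType, measurementValue, and measurementUnit are standard Darwin Core terms.

```r
# Hypothetical wide table: one row per specimen, one column per measurement.
wide <- data.frame(
  occurrenceID = c("urchin_1", "urchin_2"),
  d13C         = c(-14.2, -13.8),  # made-up values for illustration only
  d15N         = c(6.1, NA),       # NA marks a measurement that was not taken
  height_mm    = c(20.5, 22.0),
  stringsAsFactors = FALSE
)

# Lookup from measurement column to type label and unit (labels are assumed).
lookup <- data.frame(
  column          = c("d13C", "d15N", "height_mm"),
  measurementType = c("delta 13C", "delta 15N", "height"),
  measurementUnit = c("per mille", "per mille", "mm"),
  stringsAsFactors = FALSE
)

# Build one block of rows per measurement column, then stack the blocks.
blocks <- lapply(seq_len(nrow(lookup)), function(i) {
  data.frame(
    occurrenceID     = wide$occurrenceID,
    measurementType  = lookup$measurementType[i],
    measurementValue = wide[[lookup$column[i]]],
    measurementUnit  = lookup$measurementUnit[i],
    stringsAsFactors = FALSE
  )
})
mof <- do.call(rbind, blocks)

# Drop rows for measurements that were not taken.
mof <- mof[!is.na(mof$measurementValue), ]
rownames(mof) <- NULL
print(mof)
```

Keeping types and units in a lookup table means adding another measurement, such as δ34S, only requires one more lookup row.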
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
When adding or modifying code:
- Ensure renv is activated
- Install any new packages with install.packages()
- Update the lockfile with renv::snapshot()
- Test the transformation pipeline
- Commit both code changes and renv.lock updates
Please refer to the repository license file for terms of use.
- biodiversity-aq organization
- TANGO 1 expedition: https://github.com/biodiversity-aq/TANGO_1
- TANGO 2 expedition: https://github.com/biodiversity-aq/TANGO_2
- Darwin Core Standard: https://dwc.tdwg.org/
- OBIS Extended Measurement or Fact: https://obis.org/manual/dataformat/#measurementorfact
This project uses the following R packages:
- tidyverse: Data manipulation and visualization
- here: Project-relative path handling
- readxl: Excel file reading
- janitor: Data cleaning
- hms: Time data handling
- data.table: Efficient data operations
For questions or issues, please open an issue in this repository or contact the biodiversity-aq team.
This README follows the cookiecutter template style for consistent documentation.