Syndiff Pipeline

End-to-end pipeline for creating TESS Full Frame Image (FFI) templates from Pan-STARRS1 (PS1) data. The pipeline extracts the sector, camera, and CCD from the TESS FITS file and then runs the full processing workflow.

Requirements

System Requirements

Python: 3.8 or higher
Memory: 128GB+ recommended
Storage: 1TB+ free space for PS1 data downloads
CPU: Multi-core processor

Python Dependencies

Install the required packages using pip:

pip install numpy astropy pandas zarr sep scipy dask dask-image numba shapely tqdm filelock

Core Dependencies

numpy: Array operations and mathematical functions
astropy: FITS file handling, WCS operations, and astronomical coordinate transformations
pandas: Data manipulation and CSV file handling
zarr: Efficient array storage and retrieval for large datasets
sep: Source Extractor Python for astronomical object detection and background removal
scipy: Signal processing and convolution operations
dask: Parallel computing and distributed processing
dask-image: Image processing with Dask arrays
numba: JIT compilation for performance-critical functions
shapely: Geometric operations for sky cell processing
tqdm: Progress bars for long-running operations
filelock: File-based locking for safe concurrent file access
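
To confirm that the packages above are importable before starting a long run, a small check like the following can help. This is a sketch, not part of the repository; the helper name `missing_modules` is hypothetical, and the only non-obvious detail is that the `dask-image` package is imported as `dask_image`.

```python
from importlib.util import find_spec

# Import names for the packages listed above. The pip package
# "dask-image" is imported as "dask_image".
REQUIRED_MODULES = [
    "numpy", "astropy", "pandas", "zarr", "sep", "scipy",
    "dask", "dask_image", "numba", "shapely", "tqdm", "filelock",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core dependencies are importable.")
```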

Special Dependencies

Custom MOCPy Installation

The pipeline uses a custom modified version of MOCPy with enhanced performance for astronomical region processing:

./install_mocpy.sh

Alternative Installation Method

Install Rust (required for building):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

Install maturin (Python-Rust build tool):
```
pip install maturin
```

Build and install custom MOCPy:

cd mocpy_syndiff
maturin develop --release
cd ..

Quick Start

Provide the TESS FITS file. The pipeline determines the sector, camera, and CCD automatically and uses the default skycell catalog unless you pass a custom one:

# Basic usage with default skycell catalog (data/SkyCells/skycell_wcs.csv)
python pipeline.py /path/to/tess-ffi.fits

# With custom skycell catalog
python pipeline.py /path/to/tess-ffi.fits /path/to/custom-catalog.csv

# With verbose output
python pipeline.py /path/to/tess-ffi.fits -v

# Override data directory and processing parameters
python pipeline.py /path/to/tess-ffi.fits \
    --data-root /custom/data/path \
    --cores 16 \
    --jobs 100 \
    --overwrite

Pipeline Steps

The pipeline automatically runs four main steps:

1. Pancakes v2 - Generate TESS↔PS1 mapping files and skycell list
2. Download - Download PS1 skycells and store in efficient Zarr format
3. Process PS1 - Combine PS1 bands using modern sliding window pipeline
4. Downsample - Multi-offset downsample to TESS grid

Command Line Options

Required Arguments

tess_fits - Path to TESS FFI FITS file (sector/camera/CCD auto-extracted)
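
As an illustration of how the sector, camera, and CCD could be read from the filename alone, here is a minimal sketch. The pattern is inferred from the example filename used in this README (`tess2020123456-s0020-1-3-0000-s_ffic.fits`); the function name `parse_ffi_name` is hypothetical, and the actual pipeline may rely on the FITS header instead.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the README example:
#   tess<timestamp>-s<sector>-<camera>-<ccd>-<4 digits>-s_ffic.fits
FFI_NAME = re.compile(
    r"tess\d+-s(?P<sector>\d{4})-(?P<camera>[1-4])-(?P<ccd>[1-4])-\d{4}-s_ffic\.fits"
)


def parse_ffi_name(path):
    """Return (sector, camera, ccd) as integers parsed from an FFI filename."""
    match = FFI_NAME.search(Path(path).name)
    if match is None:
        raise ValueError(f"not a recognised TESS FFI filename: {path}")
    return int(match["sector"]), int(match["camera"]), int(match["ccd"])
```

For the example filename above, this yields sector 20, camera 1, CCD 3.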

Optional Arguments

skycell_wcs_csv - Path to skycell WCS catalog CSV for Pancakes (default: data/SkyCells/skycell_wcs.csv)
--data-root - Root directory for data storage (default: data)
--cores - Number of CPU cores to use (default: 8)
--jobs - Number of parallel download jobs (default: 60)
--overwrite - Overwrite existing files
--verbose - Increase verbosity (-v for INFO, -vv for DEBUG)
--multi-offset-array - Comma-separated dx,dy pairs for downsampling (default: 0.0,0.0)
--ignore-mask-bits - Comma-separated mask bits to ignore (default: 12)
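
The options above map naturally onto an `argparse` parser. The sketch below mirrors the documented names and defaults; it is not the code in `pipeline.py`. The logging level used when no `-v` is given (WARNING) and the helper names are assumptions.

```python
import argparse
import logging


def build_parser():
    """Build a parser that mirrors the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="pipeline.py")
    parser.add_argument("tess_fits", help="Path to TESS FFI FITS file")
    parser.add_argument("skycell_wcs_csv", nargs="?",
                        default="data/SkyCells/skycell_wcs.csv")
    parser.add_argument("--data-root", default="data")
    parser.add_argument("--cores", type=int, default=8)
    parser.add_argument("--jobs", type=int, default=60)
    parser.add_argument("--overwrite", action="store_true")
    # Each repetition of -v raises the count: -v gives 1, -vv gives 2.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("--multi-offset-array", default="0.0,0.0")
    parser.add_argument("--ignore-mask-bits", default="12")
    return parser


def log_level(verbosity):
    """Map the -v count to a logging level (WARNING with no flag is assumed)."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING


def parse_mask_bits(text):
    """Split a comma-separated string such as "12" or "5,12" into integers."""
    return [int(bit) for bit in text.split(",")]
```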

Output Structure

The pipeline creates the following directory structure under data_root:

data/
├── mapping_output/
│   └── sector_XXXX/
│       └── camera_X/
│           └── ccd_X/
│               ├── tess_sXXXX_X_X_master_skycells_list.csv
│               └── TESS_sXXXX_X_X_skycell.*.fits.gz
├── ps1_skycells_zarr/
│   └── sector_XXXX_camera_X_ccd_X.zarr/
└── convolved_results/
    └── sector_XXXX/
        └── camera_X/
            └── ccd_X/
                ├── convolved_images.zarr
                └── cell_metadata.json
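
When scripting around the pipeline, it can be handy to compute these locations up front. The sketch below follows the tree above and assumes the `XXXX` placeholder means a four-digit, zero-padded sector number (as in `s0020`). The function name `output_paths` is hypothetical.

```python
from pathlib import Path


def output_paths(data_root, sector, camera, ccd):
    """Return the expected output locations for one sector/camera/CCD."""
    root = Path(data_root)
    sector_tag = f"sector_{sector:04d}"  # assumed zero-padding to 4 digits
    camera_tag = f"camera_{camera}"
    ccd_tag = f"ccd_{ccd}"
    return {
        "mapping_dir": root / "mapping_output" / sector_tag / camera_tag / ccd_tag,
        "zarr_store": root / "ps1_skycells_zarr"
        / f"{sector_tag}_{camera_tag}_{ccd_tag}.zarr",
        "convolved_dir": root / "convolved_results" / sector_tag / camera_tag / ccd_tag,
    }
```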

Examples

Process a single TESS image with default settings

python pipeline.py tess2020123456-s0020-1-3-0000-s_ffic.fits

High-performance processing

python pipeline.py tess2020123456-s0020-1-3-0000-s_ffic.fits \
    --cores 32 \
    --jobs 200 \
    --verbose

Multi-offset downsampling

python pipeline.py tess2020123456-s0020-1-3-0000-s_ffic.fits \
    --multi-offset-array "0.0,0.0,0.5,0.0,0.0,0.5,0.5,0.5"
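
The value passed to `--multi-offset-array` is a flat list that is read as consecutive dx,dy pairs. A minimal sketch of that interpretation follows; the function name `parse_offsets` is hypothetical and the pipeline's own parsing may differ in detail.

```python
def parse_offsets(text):
    """Turn "dx1,dy1,dx2,dy2,..." into a list of (dx, dy) float tuples."""
    values = [float(value) for value in text.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values (dx,dy pairs)")
    # Even positions are dx values, odd positions are the matching dy values.
    return list(zip(values[0::2], values[1::2]))
```

Read this way, the example above requests four offsets: the original grid plus half-pixel shifts in x, in y, and in both.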

Using custom skycell catalog

python pipeline.py tess2020123456-s0020-1-3-0000-s_ffic.fits /path/to/custom-catalog.csv

Running Individual Pipeline Components

While the main pipeline runs all steps automatically, you can also run individual components for debugging, development, or partial processing:

1. Pancakes v2 - TESS↔PS1 Mapping

Generate TESS to PS1 skycell mappings and create the master skycells list.

python pancakes_v2.py /path/to/tess-ffi.fits

Key Options:

--skycell_wcs_csv - Path to skycell WCS catalog (default: ./data/SkyCells/skycell_wcs.csv)
--output_path - Output directory for mapping files (default: ./data/skycell_pixel_mapping)
--max_workers - Number of parallel workers for processing
--overwrite - Overwrite existing output files

📖 More Info: See README_pancakes.md for detailed documentation.

2. Download PS1 Data

Download PS1 skycell data and store in efficient Zarr format.

python download_and_store_zarr.py 20 3 3

Required Arguments:

sector - TESS sector number
camera - TESS camera number (1-4)
ccd - TESS CCD number (1-4)

Key Options:

--num-workers - Number of parallel download workers (default: 32)
--zarr-output-dir - Directory for Zarr output (default: data/ps1_skycells_zarr)
--use-local-files - Use locally saved FITS files instead of downloading
--overwrite - Overwrite existing Zarr arrays

3. Process PS1 - Modern Sliding Window Pipeline

Combine PS1 bands and convolve using the modern sliding window approach.

python process_ps1.py 20 3 3

Required Arguments:

sector - TESS sector number
camera - TESS camera number (1-4)
ccd - TESS CCD number (1-4)

Key Options:

--data-root - Root data directory (default: data)
--limit - Limit number of projections for testing
--psf-sigma - PSF sigma for convolution (default: 40.0)
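
To give a feel for what a sigma of 40.0 implies, the sketch below builds a normalized one-dimensional Gaussian kernel. It assumes the PSF is modeled as a Gaussian, that sigma is expressed in pixels, and that the kernel is truncated at four sigma; none of these is stated in this README, and `gaussian_kernel_1d` is a hypothetical helper, not a function from `process_ps1.py`.

```python
import math


def gaussian_kernel_1d(sigma, truncate=4.0):
    """Return a normalized 1D Gaussian kernel truncated at `truncate` sigma."""
    radius = int(truncate * sigma + 0.5)
    weights = [
        math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)
    ]
    total = sum(weights)
    # Normalize so that convolving with the kernel preserves total flux.
    return [weight / total for weight in weights]
```

With these assumptions the default sigma gives a kernel several hundred pixels wide, which helps explain why the convolution step is memory- and CPU-intensive.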

📖 More Info: See README_process_ps1.md for comprehensive documentation of the sliding window architecture.

4. Multi-Offset Downsampling

Generate multiple downsampled images with different pixel offsets.

python multi_offset_downsampling.py 20 3 3

Optional Arguments:

sector - TESS sector number (default: 20)
camera - Camera number (default: 3)
ccd - CCD number (default: 3)

Key Options:

--data-root - Root data directory
--convolved-dir - Convolved results directory override
--output-base - Base output directory override

Component Dependencies

The pipeline components have the following dependencies:

Pancakes v2 → Download PS1 → Process PS1 → Multi-Offset Downsampling

Each step uses outputs from the previous step, so they must be run in order when using individual components.
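
The ordering above can be scripted when running the components by hand. The sketch below builds the four commands exactly as they are shown in the sections above and runs them one after another, stopping at the first failure. The function names are hypothetical, and it assumes the scripts are in the current working directory.

```python
import subprocess
import sys


def component_commands(tess_fits, sector, camera, ccd):
    """Return the four component commands in dependency order."""
    ids = [str(sector), str(camera), str(ccd)]
    return [
        [sys.executable, "pancakes_v2.py", tess_fits],
        [sys.executable, "download_and_store_zarr.py", *ids],
        [sys.executable, "process_ps1.py", *ids],
        [sys.executable, "multi_offset_downsampling.py", *ids],
    ]


def run_in_order(commands):
    """Run each command in turn; check=True raises on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_in_order(component_commands("/path/to/tess-ffi.fits", 20, 3, 3))` would then execute the steps sequentially.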

Notes

Automatic Metadata Extraction: Sector, camera, and CCD are automatically extracted from the TESS FITS filename and header
Default Skycell Catalog: The pipeline uses data/SkyCells/skycell_wcs.csv by default - ensure this file exists or provide a custom catalog
Resumable Processing: Each step checks for existing outputs and can resume if interrupted
Memory Efficient: Uses Zarr format for efficient storage and streaming processing
Parallel Processing: Optimized for multi-core systems with configurable parallelism
Error Handling: Comprehensive error checking and informative logging
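
The resumable behavior can be pictured as a simple existence check before each step. This README does not describe how the pipeline actually decides, so the following is only a sketch of the idea, with a hypothetical helper name.

```python
from pathlib import Path


def needs_run(output, overwrite=False):
    """Return True if a step should run: its output is missing or overwrite is set."""
    return overwrite or not Path(output).exists()
```

A wrapper script could call `needs_run` on each step's expected output and skip steps that are already complete unless `--overwrite` is given.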
