56 changes: 56 additions & 0 deletions AGENTS.md
@@ -0,0 +1,56 @@
# Project Overview

DeepForest is a Python library for object detection in aerial images. Its primary target is trees, but it also supports models for livestock and other objects of interest. It uses PyTorch throughout and the Lightning framework for model training, prediction and evaluation.

Imagery is of a relatively high resolution and is generally not satellite data.

## Setup and usage

- We use `uv` for package management
- NEVER edit dependencies in pyproject.toml directly, ALWAYS use `uv add`
- To set up, use `uv sync --all-extras --dev`. For reference, see the CI workflow in `.github`
- Use the `deepforest` command line interface with Hydra overrides for simple tests
- When you've finished, run `uv run pre-commit` with appropriate arguments to run formatting and linters on your work.

## Folder Structure

- `/src`: Contains the source code for the DeepForest library
- `/src/conf`: Contains configuration files in OmegaConf/Hydra format, and the schema.py
- `/src/scripts`: Contains the command-line interface(s) to DeepForest
- `/tests`: Contains the unit tests; we use pytest
- `/docs`: Contains documentation for the project, built using Sphinx and hosted on Read the Docs.

## Libraries

- torch, torchvision, torchmetrics, transformers, pandas, numpy, scipy
- PyTorch Lightning
- visualization using matplotlib and supervision

## Data formats

- Annotations are usually provided in CSV with Pascal VOC convention for bounding box coordinates. For example:

```
image_path,xmin,ymin,xmax,ymax,label
OSBS_029.tif,203,67,227,90,Tree
OSBS_029.tif,256,99,288,140,Tree
OSBS_029.tif,166,253,225,304,Tree
```

- Other input types are possible using `utilities.read_file`
- Example CSV and shapefile inputs are in `/src/deepforest/data`
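
As a rough illustration of the CSV convention above, the sketch below reads the sample rows with pandas and checks that each box is given as corner coordinates (`xmin < xmax`, `ymin < ymax`). `load_annotations` and `REQUIRED_COLUMNS` are hypothetical names used only for this example; this is not `utilities.read_file`, which may behave differently.

```python
import io

import pandas as pd

# Sample rows copied from the annotation example above.
CSV_TEXT = """image_path,xmin,ymin,xmax,ymax,label
OSBS_029.tif,203,67,227,90,Tree
OSBS_029.tif,256,99,288,140,Tree
OSBS_029.tif,166,253,225,304,Tree
"""

# Hypothetical constant for this sketch, not part of DeepForest.
REQUIRED_COLUMNS = ["image_path", "xmin", "ymin", "xmax", "ymax", "label"]


def load_annotations(source) -> pd.DataFrame:
    """Read annotations and check the corner-style box convention."""
    df = pd.read_csv(source)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # (xmin, ymin) is the top-left corner and (xmax, ymax) the bottom-right,
    # so each max must be strictly greater than its min.
    valid = (df["xmax"] > df["xmin"]) & (df["ymax"] > df["ymin"])
    if not valid.all():
        raise ValueError(f"{(~valid).sum()} boxes have non-positive size")
    return df


annotations = load_annotations(io.StringIO(CSV_TEXT))
widths = annotations["xmax"] - annotations["xmin"]
heights = annotations["ymax"] - annotations["ymin"]
```

The validity check and the width/height computation operate on whole columns rather than iterating over rows.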

## Configuration

- Use the configuration system to specify user-facing parameters instead of hardcoding/magic numbers in code
- Make sure the schema is up to date with the config files
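
One way to picture the "schema in sync with config" rule is a small drift check. This is a minimal sketch using only standard-library dataclasses; `PredictConfig`, its field names, and `schema_mismatch` are all hypothetical, and the project's actual `schema.py` and config files may be organised quite differently.

```python
from dataclasses import dataclass, fields


@dataclass
class PredictConfig:
    """Hypothetical schema; field names are illustrative only."""

    score_thresh: float = 0.1
    batch_size: int = 1


def schema_mismatch(schema: type, config: dict) -> tuple[set, set]:
    """Return (keys missing from config, keys unknown to schema)."""
    expected = {f.name for f in fields(schema)}
    actual = set(config)
    return expected - actual, actual - expected


# A config that has drifted: one key was renamed without updating the schema.
config = {"score_thresh": 0.3, "batch": 4}
missing, unknown = schema_mismatch(PredictConfig, config)
```

Here `missing` holds the schema field absent from the config and `unknown` holds the config key the schema does not declare; both sets being empty would indicate the two agree.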

## Coding standards

- Simple and maintainable
- Write docstrings and use type annotation
- Limited comments, highlighting important details
- Keep documentation in sync with code changes
- We use ruff for linting with options set in pyproject.toml
- Enforced by `.pre-commit-config.yaml`
- Take advantage of implicit vectorization using pandas, numpy, etc. Avoid explicit loops if you can.
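
To illustrate the vectorization guideline, here is a small sketch that computes bounding-box areas twice: once with an explicit loop and once with NumPy column arithmetic. The function names are made up for this example and the box values are the rows from the annotation CSV sample.

```python
import numpy as np

# Boxes as rows of (xmin, ymin, xmax, ymax), taken from the CSV sample.
boxes = np.array(
    [
        [203, 67, 227, 90],
        [256, 99, 288, 140],
        [166, 253, 225, 304],
    ]
)


def box_areas_loop(boxes: np.ndarray) -> list[int]:
    """Explicit loop: correct, but slow on large arrays."""
    areas = []
    for xmin, ymin, xmax, ymax in boxes:
        areas.append(int((xmax - xmin) * (ymax - ymin)))
    return areas


def box_areas_vectorized(boxes: np.ndarray) -> np.ndarray:
    """Same result using whole-column arithmetic."""
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


areas = box_areas_vectorized(boxes)
```

Both functions give the same areas; the vectorized form is shorter and avoids Python-level iteration.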
15 changes: 15 additions & 0 deletions tests/AGENTS.md
@@ -0,0 +1,15 @@
## Testing

- Tests will be run via pytest
- Keep tests short and focused, with a clear contract
- You can find data for testing in `src/deepforest/data`. Prefer existing files loaded with `utilities.get_data` over generating data at test time.
- Do not use print statements in tests; document failures with assertions
- Use fixtures for repeated code sections, with appropriate scoping
- Run models instead of mocking them. Use existing fixtures if suitable and set config options to be more efficient if necessary (such as fast_dev_run, low numbers of epochs)
- Tests should check for behaviour, not duplicate logic from the library
- Avoid excessive testing for initialization
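
The guidelines above might look like this in practice. This is only a sketch: `load_boxes` and `clip_boxes` are hypothetical stand-ins, and the box values are hard-coded purely to keep the example self-contained (a real test would load a packaged file via `utilities.get_data` instead).

```python
import pytest


def load_boxes() -> list[tuple[int, int, int, int]]:
    """Stand-in for loading packaged test data (hypothetical helper)."""
    return [(203, 67, 227, 90), (256, 99, 288, 140)]


def clip_boxes(boxes, width, height):
    """Toy function under test: clip boxes to the image bounds."""
    return [
        (max(0, x0), max(0, y0), min(width, x1), min(height, y1))
        for x0, y0, x1, y1 in boxes
    ]


@pytest.fixture(scope="module")
def boxes():
    # Module scope: loaded once and shared by every test in this file.
    return load_boxes()


def test_clipped_boxes_stay_inside_image(boxes):
    clipped = clip_boxes(boxes, width=260, height=100)
    # Assert on behaviour (boxes lie within bounds), not on internal steps.
    assert all(
        0 <= x0 <= x1 <= 260 and 0 <= y0 <= y1 <= 100
        for x0, y0, x1, y1 in clipped
    )
```

The test states its contract in its name, relies on a scoped fixture for shared data, and reports failure through an assertion rather than printed output.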

## Running tests

- From the root folder, `uv run pytest tests`
- Useful flags include `-s` to show output, `-x` to stop at the first failure and `--ff` to run previously failed tests first.