Merged
Commits
37 commits
ae98bb6
Update of metadata (cf convention) and removal of time
Dana-Maggie Jun 11, 2025
35847d7
Moving add_cf_attributes to make_xstats and rename time to timestamp
Dana-Maggie Jun 11, 2025
d521029
Update of version
Dana-Maggie Jun 12, 2025
0384a2d
Check for timestamp as a coordinate not a data variable
Dana-Maggie Jun 12, 2025
9d07870
334 Add init-project function and being auxillary data handler
nepstad Jun 12, 2025
fd8e843
Merge remote-tracking branch 'origin/main' into 334-metadata-helper
nepstad Jun 16, 2025
7155fad
Read in the auxillary metadatafile
Dana-Maggie Jun 16, 2025
a9b22fc
Drop the first row of the auxillary data
Dana-Maggie Jun 16, 2025
acf6417
334 Add cli option for making a filtered montage from processed stats
nepstad Jun 16, 2025
aa1a8d0
Merge remote-tracking branch 'origin/334-metadata-helper' into 334-me…
nepstad Jun 16, 2025
ea998ad
Errorcorrection
Dana-Maggie Jun 16, 2025
c2b0790
334 Add example data to init-project, add convert to png cli cmd
nepstad Jun 16, 2025
8b0e11c
Merge branch '334-metadata-helper' of github.com:SINTEF/pyopia into 3…
Dana-Maggie Jun 16, 2025
580cc85
Interpolate auxillary data variables on data based on their timestep
Dana-Maggie Jun 16, 2025
0d06307
Test for auxillary data interpolation
Dana-Maggie Jun 16, 2025
4a7bf7e
334 Remove duplicated globale attribute setting for netcdf output
nepstad Jun 16, 2025
8f5ad31
334 Fix issues with metadata.txt generator
nepstad Jun 16, 2025
99eaedb
Update of the input of aux_data
Dana-Maggie Jun 17, 2025
03414db
Update run format, so it does not run all tests but only this
Dana-Maggie Jun 17, 2025
364423b
334 Fix init-project template writing issue, add raw image shape to m…
nepstad Jun 17, 2025
cfc207a
334 Add classifier weights hash metadata, fix tests
nepstad Jun 17, 2025
24353cc
334 Fix wrong read_csv argument
nepstad Jun 17, 2025
425a9af
334 Bump minor version
nepstad Jun 17, 2025
6366a12
334 Move auxillary data handling to StatasToDisc step, add units and …
nepstad Jul 7, 2025
224b398
334 Revert silcam.py
nepstad Jul 7, 2025
4d1193b
334 Update silcam config generator, fix auxillary data not specified …
nepstad Jul 7, 2025
639f965
334 Fix linting issues
nepstad Jul 7, 2025
df87b13
334 Add auxillary data to image_stats
nepstad Jul 8, 2025
2af4aa3
334 Change metadata handling to pydantic model and json input file
nepstad Jul 9, 2025
f0d34c7
Merge remote-tracking branch 'origin/334-metadata-helper' into 334-me…
nepstad Jul 9, 2025
c8d5e7e
334 Add new metadata module
nepstad Jul 9, 2025
69a7167
Change units from micrometeres to number of pixels
Dana-Maggie Jul 9, 2025
70af160
334 Fix auxillary data calls when no file is specified
nepstad Jul 10, 2025
ccb45f4
334 Add error handling and print info for AuxillaryData load failure
nepstad Sep 4, 2025
40aa9f9
334 Update processing docs to use init-project command
nepstad Sep 4, 2025
6d92cf5
334 Fix docs typeos
nepstad Sep 4, 2025
143de93
334 Add SilCam seavox instrument identifier to metadata
nepstad Sep 5, 2025
9 changes: 7 additions & 2 deletions README.md
@@ -6,10 +6,15 @@ A Python Ocean Particle Image Analysis toolbox
# Quick tryout of PyOPIA

1) Install [uv](https://docs.astral.sh/uv/getting-started/installation)
2) Run PyOPIA classification tests on database particles
2) Initialize PyOPIA project with a small example image dataset and run processing
```bash
uv run --python 3.12 --with git+https://github.com/SINTEF/pyopia --with tensorflow==2.16.2 --with keras==3.5.0 python -m pyopia.tests.test_classify
uvx --python 3.12 --from pyopia[classification] pyopia --init-project pyopiatest --example-data
cd pyopiatest
uvx --python 3.12 --from pyopia[classification] pyopia process config.toml
```
3) Inspect the processed particle statistics in the processed/ folder

See the documentation for more information on how to install and use PyOPIA.

# Documentation:

13 changes: 12 additions & 1 deletion docs/intro.md
@@ -44,7 +44,18 @@ cd mypyopiaproject
uv add pyopia[classification]
```

To run PyOPIA, either use uv (uv run pyopia --help), or activate the venv first (source .venv/bin/activate), before running pyopia (pyopia --help).
To run PyOPIA, either use uv
```
uv run pyopia --help
```

or activate the venv before running PyOPIA (without uv)

```
source .venv/bin/activate
pyopia --help
```
Note that the activation command differs between operating systems.

The optional [classification] extra installs tensorflow, which is required by PyOPIA's Classification module.

4 changes: 2 additions & 2 deletions docs/notebooks/cli.ipynb
@@ -160,7 +160,7 @@
"```\n",
"\n",
"Will put an example config.toml file in the current directory.\n",
"Some elements of the pipelines are instrument specific, so either `silcam` or `holo` must be specified. In future, we will add a generic pipline that uses a an imread function that can load most image types - for the moement, though you will need to setup your own pipeline config if you are not using a silcam or holo pipeline.\n",
"Some elements of the pipelines are instrument specific, so either `silcam` or `holo` must be specified. In future, we will add a generic pipeline that uses an imread function that can load most image types - for the moment, though, you will need to set up your own pipeline config if you are not using a silcam or holo pipeline.\n",
"\n",
"Generate a config for a silcam pipeline:\n",
"```\n",
@@ -423,7 +423,7 @@
"\n",
"See {func}`pyopia.cli.process` for more details\n",
"\n",
"Please have a look at the page about analysing {ref}(big-data) if you have a lot of images and/or a lot of particles."
"Please have a look at the page about analysing {ref}`big-data` if you have a lot of images and/or a lot of particles."
]
},
{
51 changes: 43 additions & 8 deletions docs/notebooks/processing_raw_data.ipynb
@@ -29,16 +29,28 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1) Get yourself a config file.\n",
"You can do this either by copy-paste from the page on {ref}`toml-config` into a new toml file (you might call it 'silcam-config.toml', for example), or from generating a very basic config from the command line tool: `pyopia generate-config`, e.g. for silcam:\n",
"## 1) Create a new project folder with a config file and metadata template\n",
"To start a new image processing project with PyOPIA, you can use the 'init-project' command (here called 'myproject'):\n",
"\n",
"```\n",
"pyopia generate-config silcam 'rawdatapath/*.silc' 'modelpath/keras_model.keras' 'proc_folder_path' 'testdata'\n",
"pyopia init-project myproject\n",
"```\n",
"\n",
"If you want help on what these options are, do: `pyopia generate-config --help`\n",
"If you want help and additional options for this command, do: `pyopia init-project --help`\n",
"\n",
"You should now have a toml file (e.g. called 'silcam-config.toml')"
"You should now have a new project folder ('myproject') containing a config file ('config.toml') and a README file with suggestions for steps to perform before starting processing. Several other input files and subfolders are also generated:\n",
"\n",
"```\n",
"myproject/\n",
"├── auxillarydata\n",
"│   └── auxillary_data.csv\n",
"├── config.toml\n",
"├── images\n",
"├── metadata.json\n",
"├── processed\n",
"├── pyopia-default-classifier-20250409.keras\n",
"└── README\n",
"```"
]
},
{
@@ -51,19 +63,42 @@
"\n",
"If you need detailed help on arguments specific to a pipeline class, then you may wish to refer to the API documentation for that specific class.\n",
"\n",
"If you want to do classification, you need to give the `model_path` argument within `[steps.classifier]` a path to a trained keras model. You can download a silcam example [here](https://pysilcam.blob.core.windows.net/test-data/silcam-classification_database_20240822-200-20240829T091048.zip)"
"Particle classification is provided by [steps.classifier], which points to a pre-trained Keras CNN model. A default classifier for PyOPIA is provided by the init-project command."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3) Add project-relevant metadata\n",
"\n",
"PyOPIA generates a self-describing netCDF file during processing, which in addition to particle statistics contains some basic metadata. These are in part taken from the 'metadata.json' file generated in the previous step.\n",
"\n",
"The generated template file 'metadata.json' contains several items that should be filled out, such as 'title' and 'creator_name'. Also check that you are happy with the default license proposed (CC BY-SA). \n",
"\n",
"\n",
"You can add your own metadata items in this file as well."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4) Add auxillary data\n",
"\n",
"A typical image dataset will be associated with some auxillary data variables, e.g. temperature, salinity and depth for a profiling setup deployed at sea. This information can optionally be incorporated into the particle statistics netCDF that PyOPIA generates, to ease post-processing of the data. Such information should be added as time series in the auxillary data file ('auxillary_data.csv'). Each row in this file should consist of a time stamp and one or more auxillary data elements. The time stamps are interpolated to match each image being processed, so they need not match exactly, but should cover the same time period. See the generated template file for more information ('auxillarydata/auxillary_data.csv').\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3) Process!\n",
"## 5) Process!\n",
"\n",
"Run the command line processing which simply needs to know which config file you want it to work on, e.g.:\n",
"\n",
"```\n",
"pyopia process silcam-config.toml\n",
"pyopia process config.toml\n",
"```"
]
},
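The auxillary data CSV layout described in the notebook above (two comment lines, a units row, a long-name row, a variable-name row, then one timestamped measurement per line) can be parsed with pandas roughly as follows. This is a minimal sketch using an inline string instead of a file; it mirrors the template shipped by this PR but is not the exact PyOPIA implementation:

```python
import io

import pandas as pd

# Template layout: two comment lines, a units row, a long-name row,
# a variable-name row, then one timestamped measurement per line.
CSV = """% COMMENT LINE: example
% COMMENT LINE: example
,metres,degC
Time of measurement,Depth at measurement location,Temperature at measurement location
time,depth,temperature
2022-06-08T18:40:00.00000,0.0,5.0
2022-06-08T18:41:00.00000,5.0,6.0
"""

# Data rows start after the first four comment/header lines
data = pd.read_csv(io.StringIO(CSV), skiprows=4)
data["time"] = data["time"].astype("datetime64[ns]")
data = data.set_index("time")

# Units and long names come from the third and fourth lines;
# the first column (time) has no unit, so real variables start at index 1
units = pd.read_csv(io.StringIO(CSV), skiprows=2, nrows=0).columns
long_names = pd.read_csv(io.StringIO(CSV), skiprows=3, nrows=0).columns

print(list(data.columns))  # ['depth', 'temperature']
```

Because the header rows are read with `nrows=0`, pandas returns only their column labels, which is a convenient way to recover per-variable units and descriptions without a custom parser.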
2 changes: 1 addition & 1 deletion pyopia/__init__.py
@@ -1 +1 @@
__version__ = "2.10.0"
__version__ = "2.11.0"
106 changes: 106 additions & 0 deletions pyopia/auxillarydata.py
@@ -0,0 +1,106 @@
import pandas as pd
import logging
import xarray as xr

logger = logging.getLogger()

AUXILLARY_DATA_FILE_TEMPLATE = """% COMMENT LINE: PLEASE UPDATE THIS FILE WITH PROJECT RELEVANT DATA. EACH COLUMN WILL BECOME A NETCDF VARIABLE.
% COMMENT LINE: ONE LINE PER MEASUREMENT, TIME IS INTERPOLATED TO IMAGE DATA TIMES IN PYOPIA. FOLLOWING LINES ARE UNITS, DESCRIPTION AND VARIABLE NAME.
,metres,degC
Time of measurement,Depth at measurement location,Temperature at measurement location
time,depth,temperature
2022-06-08T18:40:00.00000,0.0,5.0
2022-06-08T18:41:00.00000,5.0,6.0
2022-06-08T18:42:00.00000,10.0,7.0
2022-06-08T18:43:00.00000,20.0,8.0
""" # noqa: E501


class AuxillaryData:
    """
    Handle auxillary data for PyOPIA particle statistics file.

    Auxillary data variables may include (image) depth, longitude, latitude, etc.
    This class parses a well-defined .csv input format; see the example below.

    Parameters
    ----------
    auxillary_data_path : str
        Path to the auxillary data .csv file created by the end user


    Example of auxillary data file
    ------------------------------
    % COMMENT LINE:
    % COMMENT LINE:
    ,m,degC,psu
    Time of measurement,Depth at measurement location,Temperature at measurement location,Salinity at measurement location
    time,depth,temperature,salinity
    2025-03-19T16:59:29.950729,0.0,5.0,34
    2025-03-19T17:59:29.950729,0.0,5.0,34


    Note
    ----
    Each column will become a netCDF variable.
    The first two lines are comments and are ignored.
    The third and fourth rows contain the units and description for each variable.
    """

    def __init__(self, auxillary_data_path=None):
        self.auxillary_data_path = auxillary_data_path

        # Create empty dataframe for cases where no file was specified, or an error occurred reading it
        self.auxillary_data = pd.DataFrame(index=pd.Index([], name="time")).to_xarray()
        if auxillary_data_path is not None:
            try:
                self.auxillary_data = self.load_auxillary_data(auxillary_data_path)
            except RuntimeError as e:
                print(f"Failed to load auxillary data from file: {auxillary_data_path}")
                logging.error(
                    f"Failed to load auxillary data from file: {auxillary_data_path}"
                )
                logging.error(e)

    def load_auxillary_data(self, auxillary_data_path):
        """Load and format auxillary data from .csv file"""

        # Load in the auxillary data file
        auxillary_data = pd.read_csv(auxillary_data_path, skiprows=4)

        # Load units and description rows
        units = pd.read_csv(auxillary_data_path, skiprows=2, nrows=0).columns
        long_names = pd.read_csv(auxillary_data_path, skiprows=3, nrows=0).columns

        # Set time as the index and make sure its type is datetime64[ns]
        auxillary_data["time"] = auxillary_data["time"].astype("datetime64[ns]")
        auxillary_data = auxillary_data.set_index("time")

        # Transform into xarray, add units
        auxillary_data = auxillary_data.to_xarray()
        for i, col in enumerate(auxillary_data.data_vars):  # Iterate over each column
            auxillary_data[col].attrs["units"] = units[i + 1]
            auxillary_data[col].attrs["long_name"] = long_names[i + 1]

        logging.info(auxillary_data)

        return auxillary_data

    def add_auxillary_data_to_xstats(self, xstats):
        """Add auxillary data to a PyOPIA xstats object"""
        logging.info("Adding auxillary data to xstats and storing to new file")

        # Add each auxillary data variable to xstats, interpolated to xstats times
        for data_var in self.auxillary_data.data_vars:
            xstats[data_var] = xr.DataArray(
                data=self.auxillary_data[data_var]
                .astype(float)
                .interp(time=xstats["timestamp"]),
                dims=xstats.dims,
                coords=xstats.coords,
                attrs=self.auxillary_data[data_var].attrs,
            )

        return xstats
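The core of `add_auxillary_data_to_xstats` is xarray's time interpolation: auxillary measurements on their own (typically coarser) time axis are resampled onto the per-image timestamps. A minimal self-contained sketch of that mechanism (assuming scipy is available, which xarray's `interp` requires; the dataset names here are illustrative, not PyOPIA's):

```python
import pandas as pd
import xarray as xr

# Auxillary measurements on their own, coarser time axis
aux = pd.DataFrame(
    {"depth": [0.0, 10.0], "temperature": [5.0, 7.0]},
    index=pd.DatetimeIndex(["2022-06-08T18:40", "2022-06-08T18:42"], name="time"),
).to_xarray()

# Image statistics indexed by per-image timestamps (one image here, taken
# halfway between the two auxillary measurements)
xstats = xr.Dataset(coords={"timestamp": pd.to_datetime(["2022-06-08T18:41"])})

# Interpolate each auxillary variable onto the image timestamps,
# as add_auxillary_data_to_xstats does
for var in aux.data_vars:
    xstats[var] = aux[var].astype(float).interp(time=xstats["timestamp"])

print(float(xstats["depth"]))  # 5.0 (linear interpolation between 0.0 and 10.0)
```

Because the indexer is a DataArray on the `timestamp` dimension, the result is automatically aligned with the image records, which is why the auxillary timestamps "need not match exactly, but should cover the same time period" as the docs put it.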
21 changes: 8 additions & 13 deletions pyopia/cf_metadata.json
@@ -1,49 +1,49 @@
{"major_axis_length": {
"standard_name": "major_axis_length",
"long_name": "The length of the major axis of the ellipse that has the same normalized second central moments as the region",
"units": "micrometer",
"units": "Pixels",
"calculation_method": "Computed using skimage.measure.regionprops (axis_major_length)",
"pyopia_process_level": 1},
"minor_axis_length": {
"standard_name": "minor_axis_length",
"long_name": "The length of the minor axis of the ellipse that has the same normalized second central moments as the region",
"units": "micrometer",
"units": "Pixels",
"calculation_method": "Computed using skimage.measure.regionprops (axis_minor_length)",
"pyopia_process_level": 1},
"equivalent_diameter": {
"standard_name": "equivalent_circular_diameter",
"long_name": "Diameter of a circle with the same area as the particle",
"units": "micrometer",
"units": "Pixels",
"calculation_method": "Computed using skimage.measure.regionprops (equivalent_diameter)",
"pyopia_process_level": 1},
"minr": {
"standard_name": "minimum_row_index",
"long_name": "Minimum row index of the particle bounding box",
"units": "pixels",
"units": "Pixels",
"calculation_method": "Extracted from skimage.measure.regionprops (bbox[0])",
"pyopia_process_level": 1},
"maxr": {
"standard_name": "maximum_row_index",
"long_name": "Maximum row index of the particle bounding box",
"units": "pixels",
"units": "Pixels",
"calculation_method": "Extracted from skimage.measure.regionprops (bbox[2])",
"pyopia_process_level": 1},
"minc": {
"standard_name": "minimum_column_index",
"long_name": "Minimum column index of the particle bounding box",
"units": "pixels",
"units": "Pixels",
"calculation_method": "Extracted from skimage.measure.regionprops (bbox[1])",
"pyopia_process_level": 1},
"maxc": {
"standard_name": "maximum_column_index",
"long_name": "Maximum column index of the particle bounding box",
"units": "pixels",
"units": "Pixels",
"calculation_method": "Extracted from skimage.measure.regionprops (bbox[3])",
"pyopia_process_level": 1},
"saturation": {
"standard_name": "image_saturation",
"long_name": "Percentage saturation of the image",
"units": "percent",
"units": "Percent",
"calculation_method": "Computed as the percentage of the image covered by particles relative to the maximum acceptable coverage",
"pyopia_process_level": 1},
"index": {
@@ -58,11 +58,6 @@
"units": "",
"calculation_method": "Generated during particle export",
"pyopia_process_level": 1},
"time": {
"standard_name": "time",
"long_name": "Time of particle observation",
"calculation_method": "Extracted from the timestamp of the observation",
"pyopia_process_level": 0},
"timestamp": {
"standard_name": "timestamp",
"long_name": "Timestamp of particle observation",
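The cf_metadata.json entries above map variable names to CF-style attributes (standard_name, long_name, units, and so on). A hedged sketch of how such a mapping can be attached to an xarray dataset — the dictionary subset and dataset here are illustrative, and this is not necessarily how PyOPIA applies the file internally:

```python
import xarray as xr

# A small subset of the metadata shown above (units now given in pixels)
CF_METADATA = {
    "equivalent_diameter": {
        "standard_name": "equivalent_circular_diameter",
        "long_name": "Diameter of a circle with the same area as the particle",
        "units": "Pixels",
    }
}

# A toy particle-statistics dataset with one matching variable
ds = xr.Dataset({"equivalent_diameter": ("index", [12.0, 30.5])})

# Attach the attributes to each variable present in the dataset
for name, attrs in CF_METADATA.items():
    if name in ds:
        ds[name].attrs.update(attrs)

print(ds["equivalent_diameter"].attrs["units"])  # Pixels
```

Attributes set this way are written into the netCDF output by xarray, which is what makes the resulting particle statistics file self-describing.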
6 changes: 6 additions & 0 deletions pyopia/classify.py
@@ -3,6 +3,7 @@
"""

import os
import hashlib
import numpy as np
import pandas as pd
import logging
@@ -125,6 +126,11 @@ def load_model(self):
        path, filename = os.path.split(model_path)
        self.model = keras.models.load_model(model_path)

        # Create a hash of the model weights file
        with open(model_path, "rb") as f:
            digest = hashlib.file_digest(f, "sha256")
        self.model_hash = digest.hexdigest()

        # Try to create model output class name list from last model layer name
        class_labels = None
        try: