Data management and analysis of metal–organic framework synthesis using data models

Overview

Preprint available at: https://chemrxiv.org/doi/full/10.26434/chemrxiv.10001842/v1

This project formats, validates, serializes, and analyzes MOF synthesis data using data models. It was developed to demonstrate how data models support FAIR data and software management in chemical synthesis projects. The data models and code can be reused by anyone interested in developing or using such a workflow.

Features

Two example datasets of MOF synthesis
Formatting synthesis and characterization data into a well-defined and interoperable structure with the data models
Rigorous data validation using JSON schema
Data serialization into established formats (XDL and MPIF)
Phase mole fraction analysis of PXRD data using reference patterns
Phase yield calculation from mole fractions and yields, using molar masses
Decision tree modeling of the main products and convex hull analysis of phase yields with synthesis parameters
Modular scripts based on the data model APIs
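
As a rough illustration of the phase yield step listed above, the sketch below splits an isolated product mass into per-phase yields. It is not the project's implementation: the function name, the argument layout, and the definition of yield (moles of each phase's formula unit divided by moles of the limiting reagent, assuming one mole of reagent per formula unit) are all assumptions made for this example.

```python
def phase_yields(total_mass_g, mole_fractions, molar_masses, limiting_mol):
    """Return the fractional yield of each phase in a mixed product.

    Hypothetical helper: names and the yield definition are assumptions,
    not taken from the project's data model API.
    """
    # The mole fractions must describe the whole isolated product.
    if abs(sum(mole_fractions.values()) - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    if limiting_mol <= 0:
        raise ValueError("limiting_mol must be positive")

    # Mole-fraction-weighted average molar mass of the mixture (g/mol).
    mean_molar_mass = sum(
        x * molar_masses[phase] for phase, x in mole_fractions.items()
    )
    # Total moles of formula units in the isolated product.
    total_mol = total_mass_g / mean_molar_mass

    # Moles of each phase relative to the limiting reagent.
    return {
        phase: x * total_mol / limiting_mol
        for phase, x in mole_fractions.items()
    }


# Placeholder numbers chosen for easy arithmetic, not taken from the datasets.
example = phase_yields(
    total_mass_g=1.0,
    mole_fractions={"phase_A": 0.5, "phase_B": 0.5},
    molar_masses={"phase_A": 100.0, "phase_B": 300.0},
    limiting_mol=0.01,
)
print(example)
```

Converting through the average molar mass matters because a mole fraction is not a mass fraction: a heavy minor phase can account for most of the isolated mass.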

Installation

# Install uv
pip install uv
# Clone the repository
git clone https://github.com/FAIRChemistry/mof-synthesis-data-modeling.git
cd mof-synthesis-data-modeling
# Install dependencies in a virtual environment
uv pip install -e .

Install Graphviz for decision tree visualization

macOS: brew install graphviz
Debian/Ubuntu: sudo apt install graphviz
Windows: https://graphviz.org/download/

Usage

Formatting, validation, and serialization into XDL

uv run scripts/format_and_serialize_all.py
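
XDL is an XML-based format, so the serialization step can be pictured as mapping structured synthesis records onto XML elements. The sketch below uses only the Python standard library; the function name, the input layout, and the choice of steps are assumptions for illustration and may not match what scripts/format_and_serialize_all.py actually emits.

```python
import xml.etree.ElementTree as ET


def to_xdl(reagents, steps, vessel="reactor"):
    """Build a minimal XDL-style document as a string.

    Hypothetical helper: `reagents` is a list of names and `steps` is a
    list of (step_name, attributes) pairs. Both layouts are assumptions.
    """
    root = ET.Element("Synthesis")

    # Hardware section: a single vessel that the steps refer to.
    hardware = ET.SubElement(root, "Hardware")
    ET.SubElement(hardware, "Component", id=vessel, type="reactor")

    # Reagents section: one element per reagent name.
    reagents_el = ET.SubElement(root, "Reagents")
    for name in reagents:
        ET.SubElement(reagents_el, "Reagent", name=name)

    # Procedure section: one element per step, attributes copied as given.
    procedure = ET.SubElement(root, "Procedure")
    for step_name, attrs in steps:
        ET.SubElement(procedure, step_name, attrs)

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Placeholder reagents and conditions, not taken from the datasets.
print(
    to_xdl(
        reagents=["metal_salt_solution", "linker_solution"],
        steps=[
            ("Add", {"vessel": "reactor", "reagent": "metal_salt_solution", "volume": "5 mL"}),
            ("Add", {"vessel": "reactor", "reagent": "linker_solution", "volume": "5 mL"}),
            ("HeatChill", {"vessel": "reactor", "temp": "120 °C", "time": "24 h"}),
        ],
    )
)
```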

Serialization into MPIF

cd scripts/mofsy2mpif
uv run npm install
uv run npm start
cd ../..

Phase mole fraction analysis of PXRD patterns

uv run marimo edit

Workspace > scripts > pxrd_analysis.mo.py
Run all stale cells (button at the bottom right)
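
The notebook's own fitting procedure is not described here. As a rough picture of how phase fractions can be estimated from reference patterns, the sketch below fits an observed pattern as a non-negative combination of references using cyclic coordinate descent, then normalizes the weights. The function name, the input layout (intensities sampled on a shared 2θ grid), and the assumption that each reference is scaled to the intensity of one mole of its phase are all hypothetical; without that scaling the normalized weights are not mole fractions.

```python
def fit_phase_fractions(observed, references, sweeps=200):
    """Estimate phase fractions by non-negative least squares.

    Hypothetical helper: `observed` is a list of intensities and
    `references` maps phase names to lists of the same length.
    """
    names = list(references)
    weights = {name: 0.0 for name in names}

    for _ in range(sweeps):
        for name in names:
            ref = references[name]
            # Observed pattern minus the contribution of every other phase.
            partial = [
                y - sum(weights[m] * references[m][i] for m in names if m != name)
                for i, y in enumerate(observed)
            ]
            norm = sum(r * r for r in ref)
            # Best single-phase scale factor, clipped at zero.
            best = sum(r * p for r, p in zip(ref, partial)) / norm if norm else 0.0
            weights[name] = max(0.0, best)

    total = sum(weights.values())
    if total == 0:
        raise ValueError("no reference pattern matches the observed data")
    # Normalize the scale factors so they sum to 1.
    return {name: w / total for name, w in weights.items()}


# Synthetic three-point patterns with one overlapping peak.
refs = {"phase_A": [1.0, 1.0, 0.0], "phase_B": [0.0, 1.0, 1.0]}
print(fit_phase_fractions([2.0, 4.0, 2.0], refs))
```

Clipping each weight at zero keeps a phase that is absent from the sample from receiving a negative contribution, which an unconstrained least-squares fit could produce.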

Decision tree modeling

uv run scripts/generate_decision_trees.py

Convex hull plotting

uv run marimo edit

Workspace > scripts > convex-hull.mo.py
Run all stale cells (button at the bottom right)
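
For readers unfamiliar with the technique, a convex hull of (synthesis parameter, phase yield) points outlines the region of yields reached in the experiments. The sketch below is a standard-library implementation of Andrew's monotone chain algorithm, given here only as an illustration; it is not the notebook's code, and the function name and the point layout are assumptions.

```python
def convex_hull(points):
    """Return the 2D convex hull in counter-clockwise order.

    Hypothetical helper: `points` is an iterable of (x, y) tuples, for
    example (parameter value, phase yield). Collinear points are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Lower boundary, scanning left to right.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Upper boundary, scanning right to left.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Synthetic points: a unit square with one interior point.
print(convex_hull([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]))
```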

Contributing

Use uv run ... in place of python ....
Add new packages with uv add ... instead of pip install ....
For the best results with the VS Code debugger, run uv sync --all-packages, then run Python Debugger: Clear Cache and Reload Window from the command palette.
