Commit

Merge pull request #105 from Materials-Data-Science-and-Informatics/feature/codemeta_overwrite

Feature/codemeta overwrite
mustafasoylu authored Feb 14, 2025
2 parents 5be736d + c67c814 commit 3a11844
Showing 22 changed files with 824 additions and 94 deletions.
18 changes: 2 additions & 16 deletions .somesy.toml
@@ -1,6 +1,6 @@
[project]
name = "somesy"
version = "0.5.0"
version = "0.6.0"
description = "A CLI tool for synchronizing software project metadata."
keywords = ["metadata", "FAIR"]
license = "MIT"
@@ -54,7 +54,7 @@ orcid = "https://orcid.org/0000-0002-5149-603X"
contribution = "Discussions and suggestions concerning tool scope and usability."
contribution_begin = "2023-06-01"
contribution_end = "2023-06-30"
contribution_types = ["ideas"]
contribution_types = ["ideas", "doc"]

publication_author = true

@@ -67,17 +67,3 @@ orcid = "https://orcid.org/0000-0001-9560-4728"
contribution_types = ["fundingFinding"]

publication_author = true

[config]
no_sync_cff = false
cff_file = "CITATION.cff"
no_sync_pyproject = false
pyproject_file = "pyproject.toml"
no_sync_codemeta = false
codemeta_file = "codemeta.json"
no_sync_package_json = true
no_sync_julia = true
no_sync_fortran = true
show_info = false
verbose = false
debug = true
5 changes: 3 additions & 2 deletions CHANGELOG.md
@@ -6,12 +6,13 @@ Please consult the changelog to inform yourself about breaking changes and secur

## [v0.6.0](https://github.com/Materials-Data-Science-and-Informatics/somesy/tree/v0.6.0) <small>(2025-xx-xx)</small> { id="0.6.0" }

- implement CFF Entity model for author/maintainer/contributor
- implement CFF Entity (Organization) model for author/maintainer/contributor
- add a new config option to use existing codemeta.json when syncing
- fix SomesyBaseModel kwargs being overwritten

## [v0.5.0](https://github.com/Materials-Data-Science-and-Informatics/somesy/tree/v0.5.0) <small>(2025-01-15)</small> { id="0.5.0" }

- make person argument email optional
- make person (and entity) argument email optional

## [v0.4.3](https://github.com/Materials-Data-Science-and-Informatics/somesy/tree/v0.4.3) <small>(2024-07-29)</small> { id="0.4.3" }

2 changes: 1 addition & 1 deletion CITATION.cff
@@ -3,7 +3,7 @@ type: software
message: If you use this software, please cite it using this metadata.

title: somesy
version: 0.5.0
version: 0.6.0
abstract: A CLI tool for synchronizing software project metadata.
url: https://materials-data-science-and-informatics.github.io/somesy
repository-code: https://github.com/Materials-Data-Science-and-Informatics/somesy
6 changes: 3 additions & 3 deletions README.md
@@ -118,10 +118,10 @@ rorid = "https://ror.org/02nv7yv05" # highly recommended set a ror id for your o
verbose = true # show detailed information about what somesy is doing
```

As Helmholtz Metadata Collaboration (HMC), our goal is to increase usage of metadata and improve metadata quality. Therefore, some fields in `somesy.toml` are set as required fields. This is to increase rigour and completeness of metadata recorded with `somesy` .

<!-- --8<-- [end:somesytoml] -->

As Helmholtz Metadata Collaboration (HMC), our goal is to increase usage of metadata and improve metadata quality. Therefore, some fields in `somesy.toml` are set as required fields. This is to increase rigour and completeness of metadata recorded with `somesy` .

Alternatively, you can also add the somesy configuration to an existing
`pyproject.toml`, `package.json`, `Project.toml`, or `fpm.toml` file. The somesy [manual](https://materials-data-science-and-informatics.github.io/somesy/main/manual/#somesy-input-file) contains examples showing how to do that.

@@ -213,7 +213,7 @@ Here is an overview of all the currently supported files and formats.
3. `fpm.toml` only supports one author and maintainer, so `somesy` will pick the _first_ listed author and maintainer
4. `pom.xml` has no concept of `maintainers`, but it can have multiple licenses (somesy only supports one main project license)
5. `mkdocs.yml` is a bit special, as it is not a project file, but a documentation file. `somesy` will only update it if it exists and is enabled in the configuration
6. unlike other targets, `somesy` will _re-create_ the `codemeta.json` (i.e. do not edit it by hand!)
6. Two options exist for handling `codemeta.json`: either (A) `somesy` removes any existing `codemeta.json` file and re-creates it from scratch, or (B) `somesy` merges an existing `codemeta.json` with the information handled by `somesy`. See the [user manual](https://materials-data-science-and-informatics.github.io/somesy/main/manual/#codemeta) for additional details about CodeMeta handling.

<!-- --8<-- [end:quickstart] -->

2 changes: 1 addition & 1 deletion codemeta.json
@@ -27,7 +27,7 @@
],
"name": "somesy",
"description": "A CLI tool for synchronizing software project metadata.",
"version": "0.5.0",
"version": "0.6.0",
"keywords": [
"metadata",
"FAIR"
110 changes: 66 additions & 44 deletions docs/manual.md
@@ -253,10 +253,10 @@ one of the supported input formats:

=== "Project.toml"

````toml
name = "my-amazing-project"
version = "0.1.0"
uuid = "c7e460c6-3f3e-11ec-8d3d-0242ac130003"
```toml
name = "my-amazing-project"
version = "0.1.0"
uuid = "c7e460c6-3f3e-11ec-8d3d-0242ac130003"

[deps]
...
@@ -298,9 +298,10 @@ uuid = "c7e460c6-3f3e-11ec-8d3d-0242ac130003"
```

=== "fpm.toml"
```toml
name = "my-amazing-project"
version = "0.1.0"

```toml
name = "my-amazing-project"
version = "0.1.0"

[tool.somesy.project]
name = "my-amazing-project"
@@ -342,34 +343,34 @@ version = "0.1.0"

```json
{
"name": "my-amazing-project",
"version": "0.1.0",
...
"name": "my-amazing-project",
"version": "0.1.0",
...

"somesy": {
"somesy": {
"project": {
"name": "my-amazing-project",
"version": "0.1.0",
"description": "Brief description of my amazing software.",
"keywords": ["some", "descriptive", "keywords"],
"license": "MIT",
"repository": "https://github.com/username/my-amazing-project",
"people": [
"name": "my-amazing-project",
"version": "0.1.0",
"description": "Brief description of my amazing software.",
"keywords": ["some", "descriptive", "keywords"],
"license": "MIT",
"repository": "https://github.com/username/my-amazing-project",
"people": [
{
"given-names": "Jane",
"family-names": "Doe",
"email": "[email protected]",
"orcid": "https://orcid.org/0000-0000-0000-0001",
"author": true,
"maintainer": true
"given-names": "Jane",
"family-names": "Doe",
"email": "[email protected]",
"orcid": "https://orcid.org/0000-0000-0000-0001",
"author": true,
"maintainer": true
},
{
"given-names": "Another",
"family-names": "Contributor",
"email": "[email protected]",
"orcid": "https://orcid.org/0000-0000-0000-0002"
"given-names": "Another",
"family-names": "Contributor",
"email": "[email protected]",
"orcid": "https://orcid.org/0000-0000-0000-0002"
}
]
]
},
"entities":[
{
@@ -380,9 +381,9 @@ version = "0.1.0"
}
],
"config": {
"verbose": true
"verbose": true
}
}
}
}
```

@@ -545,26 +546,47 @@ after running somesy (to remove the duplicate entries with the incorrect ROR ID)
### Codemeta

While `somesy` is modifying existing files for most supported formats and implements
features such as person identification and merging,
[CodeMeta](https://codemeta.github.io/) is implemented differently.
features such as person identification and merging, [CodeMeta](https://codemeta.github.io/)
requires special handling.

As `codemeta.json` is a [**JSON-LD**](https://json-ld.org/) file, it represents a graph and
can have various equally valid representations in JSON format. The behavior of `somesy`
when handling CodeMeta files is controlled by the `codemeta_merge` configuration option:

As that `codemeta.json` is a [**JSON-LD**](https://json-ld.org/) file, it actually represents a graph,
has various equally valid representations in a JSON file.
Thus, supporting the same features as for other formats is technically much more
challenging, if at all feasible. Therefore, for the time being, we regenerate the
`codemeta.json` file directly from the source file, in order to avoid data inconsistency
due to many pitfalls hiding in the details of the format.
When `codemeta_merge = true`, `somesy` will:

1. Read and parse any existing `codemeta.json` file
2. Update only the fields that `somesy` manages. Values that are already present will be overwritten.
3. Preserve any additional fields or metadata present in the file and append it to the record.


When `codemeta_merge = false` (default), `somesy` will:

1. Delete any existing `codemeta.json` file
2. Create a new file containing only the metadata from your somesy project configuration
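
The two modes listed above can be pictured with a small sketch. This is not the actual `somesy` implementation: `sync_codemeta` is a hypothetical helper, the records are plain dictionaries rather than parsed JSON-LD, and the field names (`funding`, `author`, and so on) are only illustrative.

```python
from typing import Any, Optional


def sync_codemeta(
    existing: Optional[dict[str, Any]],
    managed: dict[str, Any],
    codemeta_merge: bool = False,
) -> dict[str, Any]:
    """Return the record that would be written to codemeta.json.

    Hypothetical helper for illustration only; `managed` stands for the
    metadata derived from the somesy project configuration.
    """
    if not codemeta_merge or existing is None:
        # Default mode: the old record is discarded and only the
        # somesy-managed metadata ends up in the new file.
        return dict(managed)

    # Merge mode: managed fields win over values already present ...
    merged = dict(managed)
    # ... and any additional fields from the existing file are appended.
    for key, value in existing.items():
        if key not in merged:
            merged[key] = value
    return merged


# Illustrative records (assumed values, not taken from a real project).
existing = {
    "name": "my-amazing-project",
    "version": "0.1.0",
    "author": [{"givenName": "Old", "familyName": "Entry"}],
    "funding": "Example grant 123",
}
managed = {
    "name": "my-amazing-project",
    "version": "0.2.0",
    "author": [{"givenName": "Jane", "familyName": "Doe"}],
}

merged = sync_codemeta(existing, managed, codemeta_merge=True)
fresh = sync_codemeta(existing, managed)
```

In this sketch `merged` keeps the extra `funding` field while `version` and `author` are replaced wholesale, and `fresh` contains only the managed fields. The wholesale replacement of `author` mirrors the absence of person merging for CodeMeta described in this section.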

!!! note

If you have additional CodeMeta fields you want to preserve, make sure to set
`codemeta_merge = true` in your somesy configuration.

!!! warning

The `codemeta.json` is overwritten and regenerated from scratch every time you `sync`,
so **do not edit it** if you have the codemeta target enabled in `somesy`!
Unlike other formats, person and entity merging heuristics are not
implemented for CodeMeta. The author, maintainer, and contributor
fields are directly created from your somesy project metadata,
overwriting any existing entries in these fields.

Please note that, due to the above behavior and the linked-data nature of values in
`codemeta.json` records, using the option `codemeta_merge = true` can create
conflicts within the CodeMeta record, for example when values from `somesy.toml`
are inconsistent with the preserved values that get appended to the CodeMeta record.

As `codemeta.json` is considered a technical "backend-format" derived from other
inputs, in most cases you probably do not need or should edit it by hand anyway.
inputs, in most cases you probably do not need to edit it by hand anyway.

Of course, you are welcome to contribute an improved CodeMeta writer for somesy that can correctly
understand and update the linked data graph which the `codemeta.json` file represents!
Of course, you are welcome to contribute improvements to the CodeMeta handling in somesy
to make it even more robust and feature-complete!

## Using somesy to insert metadata into project documentation
