This plugin automatically harvests metadata from GitHub and GitLab repositories. The extraction follows the CodeMeta standard (version 3.0) to ensure interoperability with other metadata tools and formats. It also aligns with the CodeMeta Crosswalks, which provide a detailed mapping between the metadata fields used by a variety of software repositories, registries, and archives.
The following metadata fields are extracted automatically (listed in alphabetical order):
- codeRepository: Link to the repository where the un-compiled, human readable code and related code is located.
- contributor: A secondary contributor to the CreativeWork or Event.
- copyrightHolder: The party holding the legal copyright to the CreativeWork.
- dateCreated: The date on which the CreativeWork was created or the item was added to a DataFeed.
- dateModified: The date on which the CreativeWork was most recently modified or when the item's entry was modified within a DataFeed.
- datePublished: Date of first broadcast/publication.
- description: A description of the item.
- downloadUrl: If the file can be downloaded, URL to download the binary.
- identifier: The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links.
- issueTracker: Link to software bug reporting or issue tracking system.
- keywords: Keywords or tags used to describe this content. Multiple entries in a keywords list are typically delimited by commas.
- license: A license document that applies to this content, typically indicated by URL.
- name: The name of the item (software, Organization).
- programmingLanguage: The computer programming language.
- readme: Link to software Readme file.
- url: URL of the item.
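To make the field list more concrete, here is a minimal sketch of what a CodeMeta record using some of these properties could look like when serialized as JSON-LD. Only the property names come from the list above. The `@context` URL, the `@type`, and every value are illustrative assumptions, and the plugin's actual output may be structured differently.

```python
import json

# Hypothetical values throughout; only the property names are taken
# from the field list above.
record = {
    "@context": "https://w3id.org/codemeta/3.0",  # assumed context URL
    "@type": "SoftwareSourceCode",                # assumed type
    "name": "example-tool",
    "description": "A placeholder description.",
    "codeRepository": "https://github.com/example-org/example-tool",
    "issueTracker": "https://github.com/example-org/example-tool/issues",
    "readme": "https://github.com/example-org/example-tool/blob/main/README.md",
    "license": "https://spdx.org/licenses/MIT",
    "programmingLanguage": "Python",
    "keywords": ["metadata", "research software"],
    "dateCreated": "2024-01-15",
    "dateModified": "2024-06-01",
}


def harvested_fields(rec):
    """Return the property names of a record, sorted, skipping JSON-LD keywords."""
    return sorted(key for key in rec if not key.startswith("@"))


if __name__ == "__main__":
    print(json.dumps(record, indent=2))
    print(harvested_fields(record))
```

The helper `harvested_fields` is hypothetical and exists only to show how the harvested properties can be told apart from the JSON-LD keywords (`@context`, `@type`).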
Clone the HERMES project (feature branch)
git clone --branch feature/276-harvesting-metadata-from-a-provided-repository-URL https://github.com/Aidajafarbigloo/hermes.git
Go to the project directory
cd hermes
Install HERMES and its dependencies
pip install .
Clone the plugin repository
git clone https://github.com/softwarepub/hermes-plugin-github-gitlab.git
Go to the plugin directory
cd hermes-plugin-github-gitlab
Install the plugin and its dependencies
pip install .
Configure HERMES
Ensure you have a hermes.toml file in your working directory.
Edit the file to include or remove sources as needed:
sources = ["cff", "codemeta", "githublab"]
Verify Installation
hermes --help
If you see the help message, HERMES is installed correctly.
Harvest Metadata
From a local repository:
hermes harvest
From a remote repository (this extracts metadata from the configured sources for the specified repository):
hermes harvest --path <URL>
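When harvesting is triggered from a script, the two invocations above can be assembled programmatically. The sketch below only builds the argument list and does not run HERMES. The function name `harvest_command` and the URL check are illustrative assumptions rather than part of HERMES or the plugin. The only command-line pieces used are `hermes harvest` and `--path`, as shown above.

```python
import shlex
from urllib.parse import urlparse


def harvest_command(repo_url=None):
    """Build the argument list for `hermes harvest`, optionally with --path."""
    command = ["hermes", "harvest"]
    if repo_url is not None:
        parsed = urlparse(repo_url)
        # Precaution added for this sketch: accept only http(s) URLs with a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an http(s) URL: {repo_url!r}")
        command += ["--path", repo_url]
    return command


if __name__ == "__main__":
    # Local repository: no extra arguments.
    print(shlex.join(harvest_command()))
    # Remote repository: the URL is passed via --path.
    print(shlex.join(harvest_command(
        "https://github.com/softwarepub/hermes-plugin-github-gitlab")))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from a directory that contains hermes.toml.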