Document types #1

@statzhero

I have two suggestions:

  1. Consider stating the total size of the corpus in GB, for anyone planning a bulk download (if that is something you want to support).
  2. I expected all reports to be documents (e.g. PDFs), but some entries are webpages. These could be a problem, since neither the URL nor the content is guaranteed to be stable. Example below.
    {
        "id": "2c7fa66b-4fd4-4fcd-9ab4-94b021575110",
        "name": "SAP Interactive Chart Generator",
        "href": "https://www.sap.com/integrated-reports/2021/en/interactive-chart-generator.environment.greenhouse-gas-emissions.html",
        "type": "Other",
        "year": "2021",
        "company_id": "ba08c2fa-d4c2-49d1-b2cf-15202362eac2"
    }
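
One way such entries could be flagged is sketched below. It is only an illustration: it assumes each report is a dict shaped like the example above, and the function name `looks_like_webpage` and the extension list `DOCUMENT_EXTENSIONS` are hypothetical, not part of the project. The check relies on the URL path alone, so a PDF served from an extensionless URL would be misclassified.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: extensions treated as static, downloadable documents.
# This list is illustrative and should be adapted to the corpus.
DOCUMENT_EXTENSIONS = {".pdf", ".xls", ".xlsx", ".doc", ".docx", ".zip"}


def looks_like_webpage(report: dict) -> bool:
    """Return True if the report's href does not end in a known document extension."""
    path = urlparse(report["href"]).path
    suffix = PurePosixPath(path).suffix.lower()
    return suffix not in DOCUMENT_EXTENSIONS


# The entry quoted in this issue.
sample = {
    "id": "2c7fa66b-4fd4-4fcd-9ab4-94b021575110",
    "name": "SAP Interactive Chart Generator",
    "href": "https://www.sap.com/integrated-reports/2021/en/interactive-chart-generator.environment.greenhouse-gas-emissions.html",
    "type": "Other",
    "year": "2021",
    "company_id": "ba08c2fa-d4c2-49d1-b2cf-15202362eac2",
}

if looks_like_webpage(sample):
    print(f"Possible webpage, not a static document: {sample['name']}")
```

A stricter variant could inspect the `Content-Type` header of each URL instead of the file extension, at the cost of one request per entry.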
