6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -10,8 +10,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

### Added

- **`articleType` page gate** — `site.json#noPageArticleTypes` lists `articleType` values that don't get rendered site pages. Initial set: `obituary`, `other`, `news`, `calendar`, `announcement`, `correction`, `addendum`, `reprint`. Gated docs are skipped at the per-doc page, reference-tree, and sitemap emit loops, and dropped from `search-index.json`/`facets.json` so they don't appear in the browse list either. Stale pages from earlier builds are removed on rebuild. Gated docs remain fully present in the API and registry — the gate only suppresses generated pages, not data. New `src/main/lib/pageGate.js` drives the decision; matching is case-insensitive but exact-value.
- **`articleType` shown on doc pages** — surfaced in `docId.hbs` directly below Doc Type when present.
- **Journal Article breakdown in `/api/stats.json`** — new `documents.journalArticles` block: `{ total, articleTypes, byArticleType }` (sorted descending by count). Stats `apiVersion` bumps `1.0.0 → 1.1.0`.
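
The page-gate rule described above (case-insensitive, exact-value matching against `noPageArticleTypes`) could look roughly like the following. This is a minimal sketch, not the contents of `src/main/lib/pageGate.js`: the names `buildPageGate` and `isPageGated` are hypothetical, and skipping empty-string entries in the config list is an assumption.

```javascript
// Hypothetical sketch of a page gate; NOT the actual src/main/lib/pageGate.js.
// Assumes `noPageArticleTypes` is the array of strings read from site.json.
function buildPageGate(noPageArticleTypes) {
  // Lower-case the configured values once so each lookup is a Set membership test.
  // Assumption: empty-string entries are ignored rather than treated as a value.
  const gated = new Set(
    (noPageArticleTypes || [])
      .filter((t) => typeof t === 'string' && t !== '')
      .map((t) => t.toLowerCase())
  );

  // Returns true when the doc should NOT get a rendered site page.
  return function isPageGated(doc) {
    const type = doc && doc.articleType;
    if (typeof type !== 'string') return false; // no articleType: never gated
    return gated.has(type.toLowerCase()); // whole value must match, any case
  };
}

const isPageGated = buildPageGate([
  'obituary', 'other', 'news', 'calendar',
  'announcement', 'correction', 'addendum', 'reprint',
]);

// Illustrative docs only; the docIds are placeholders.
const docs = [
  { docId: 'a', articleType: 'research-article' },
  { docId: 'b', articleType: 'Obituary' },
  { docId: 'c' },
];

// A page, reference-tree, or sitemap emit loop would iterate over this filtered list.
const pageDocs = docs.filter((d) => !isPageGated(d));
console.log(pageDocs.map((d) => d.docId));
```

Because the gate only filters what the emit loops see, the underlying records stay untouched, which matches the note that gated docs remain present in the API and registry.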

### Changed

- **`/api/documents.json` is now a lightweight index** (`apiVersion` bumps `1.0.0 → 2.0.0`) — the full-bundle shape grew past GitHub's 100 MB per-file limit on the `gh-pages` branch as the SMPTE journal-article backfill landed. The endpoint now emits one row per doc — `{ docId, publisher, docType, docLabel, docTitle, articleType?, path }` — each linking to `/api/doc/{docId}.json` for the full record with `$meta` provenance. Drops the file from ~120 MB to ~7 MB and stays small as the corpus grows. Per-doc shards are unchanged and remain the canonical full-data endpoint. Closes [#1173](https://github.com/PrZ3r/MSRBot.io/issues/1173).
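
As an illustration of the row shape listed above, a full registry record could be reduced to an index row along these lines. This is a sketch under assumptions: `toIndexRow` is a hypothetical name, and `path` is assumed to hold the `/api/doc/{docId}.json` location as a string.

```javascript
// Hypothetical sketch; field names follow the row shape in the changelog entry.
function toIndexRow(doc) {
  const row = {
    docId: doc.docId,
    publisher: doc.publisher,
    docType: doc.docType,
    docLabel: doc.docLabel,
    docTitle: doc.docTitle,
  };
  // articleType is optional in the row shape, so only emit it when present.
  if (doc.articleType) row.articleType = doc.articleType;
  // Assumption: `path` points at the per-doc shard that carries the full record.
  row.path = `/api/doc/${doc.docId}.json`;
  return row;
}

// Values taken from one of the conference-paper records added in this change.
const fullDoc = {
  docId: '10.5594-M00919',
  publisher: 'SMPTE',
  docType: 'Conference Paper',
  docLabel: 'SMPTE Meetings and Conferences ( October 1969)',
  docTitle: 'Super 8: Whither Bound?',
  articleType: 'abstract',
  abstract: 'A significant benefit of the super 8 system is ...',
  pages: '2–9',
};

const row = toIndexRow(fullDoc);
console.log(row);
```

Dropping large fields such as `abstract` and all `$meta` siblings is what keeps each row small; consumers that need them follow `path` to the per-doc shard.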

### Fixed

## [v2.0.0] - 2026-05-19
9 changes: 8 additions & 1 deletion docs/smpte-source-backfill.md
@@ -27,6 +27,8 @@ the lot once this branch is merged and the follow-ups below have landed.
Scripts (one-time runners + the audit tool, all built for this project):

- [ ] `src/main/scripts/extras/extractSmpteSourceRefs.js` — `-ref.xml` extraction runner
- [ ] `src/main/scripts/extras/extractSmpteJournalArticles.js` — NLM journal-article + conference-paper backfill runner
- [ ] `src/main/scripts/extras/extractSmpteJournalIssues.js` — APTARA/Allen Press journal_metadata coverage + cross-fill runner
- [ ] `src/main/scripts/extras/resolveSmpteSourceRefs.js` — unresolved-refs resolver pass
- [ ] `src/main/scripts/extras/fixUndatedSourceRefs.js` — fixup, dates undated org-lineage refIds
- [ ] `src/main/scripts/extras/inventorySource.smpte.js` — SMPTE source-vs-registry audit tool
@@ -38,6 +40,10 @@ Reports / ad-hoc outputs:
- [ ] `src/main/reports/sourceInventory.smpte.json` — audit-tool output (stale post-backfill)
- [ ] `src/main/reports/sourceInventory.smpte.md` — audit-tool output (stale post-backfill)
- [ ] `src/main/reports/sourceInventory.smpte.schemaMap.md` — audit-tool output (stale post-backfill)
- [ ] `src/main/reports/smpteJournalImport.json` — journal-backfill runner output
- [ ] `src/main/reports/smpteJournalImport.md` — journal-backfill runner output
- [ ] `src/main/reports/smpteJournalIssueImport.json` — journal-issue (APTARA) runner output
- [ ] `src/main/reports/smpteJournalIssueImport.md` — journal-issue (APTARA) runner output

This tracking doc itself:

@@ -53,7 +59,8 @@ This tracking doc itself:
These are the actual product of the backfill, not scaffolding:

- `src/main/scripts/utils/extractSourceMetadata.js` — `readRefXml` restructured
-  to return per-`<ref>` records.
+  to return per-`<ref>` records; `readNlmArticleXml` added for the HIGHWIRE
+  NLM journal-article corpus.
- `src/main/lib/referencing.js` — new `parseRefId` parser families (ITU, ANSI,
AES, EBU, CIE, IEEE, ETSI, ARIB, ATSC, TIA, EIA, DVB, CEA, FCC, legacy-SMPTE,
patents, ISO drafts, FIPS).
5 changes: 5 additions & 0 deletions src/main/config/site.json
@@ -300,8 +300,12 @@
"license": "BSD-3-Clause",
"licenseUrl": "https://opensource.org/licenses/BSD-3-Clause",
"locale": "en-US",
"noPageArticleTypes": [
""
],
"nonLineageDocTypes": [
"Book",
"Conference Paper",
"Dissertation",
"FAQ",
"Guideline",
@@ -610,6 +614,7 @@
"siteName": "MSRBot.io",
"titleLabelDocTypes": [
"Book",
"Conference Paper",
"Dissertation",
"Guideline",
"Journal Article",
160 changes: 160 additions & 0 deletions src/main/data/docs/smpte/conference-paper/1969/10.5594-M00918.json
@@ -0,0 +1,160 @@
{
"articleType": "abstract",
"articleType$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"authors": [
"Richard J. Goldberg"
],
"authors$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"copyright": {
"holder": "Society of Motion Picture and Television Engineers, Inc.",
"holder$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"year": "1969",
"year$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"copyright$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docId": "10.5594-M00918",
"docId$meta": {
"confidence": "high",
"note": "Derived from DOI 10.5594/M00918",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docLabel": "SMPTE Meetings and Conferences ( October 1969)",
"docLabel$meta": {
"confidence": "medium",
"note": "Composed from venue title, volume/issue and date",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docTitle": "Foreword",
"docTitle$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docType": "Conference Paper",
"docType$meta": {
"confidence": "high",
"note": "NLM <article> — SMPTE conference paper",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"doi": "10.5594/M00918",
"doi$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"href": "https://doi.org/10.5594/M00918",
"href$meta": {
"confidence": "high",
"note": "Constructed from DOI 10.5594/M00918",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"number": "33",
"number$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"pages": "1–1",
"pages$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publicationDate": "1969-10-01",
"publicationDate$meta": {
"confidence": "high",
"note": "Day absent in NLM source — padded to 01",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publisher": "SMPTE",
"publisher$meta": {
"confidence": "high",
"note": "Normalised to registry \"SMPTE\" convention from NLM publisher-name (The Society of Motion Picture and Television Engineers)",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publisherLocation": {
"city": "White Plains, NY",
"city$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"publisherLocation$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"status": {
"active": true,
"active$meta": {
"confidence": "medium",
"note": "No explicit status in NLM source — conference papers default active",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"volume": "1969",
"volume$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00918.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
}
168 changes: 168 additions & 0 deletions src/main/data/docs/smpte/conference-paper/1969/10.5594-M00919.json
@@ -0,0 +1,168 @@
{
"abstract": "A significant benefit of the super 8 system is that it now makes the motion picture medium accessible for small-group and individual viewing. Implications of this breakthrough are discussed in terms of 8mm statistics and potential markets. Six key attributes of super 8 are flexibility, accessibility, repeatability, controllability, compatibility and profitability. Compatibility and performance factors influencing the choice of a super 8 sound system and a cartridge design are discussed in terms of the eventual potentialities of this new format in screen communication.",
"abstract$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"articleType": "abstract",
"articleType$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"authors": [
"Norwood L. Simmons"
],
"authors$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"copyright": {
"holder": "Society of Motion Picture and Television Engineers, Inc.",
"holder$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"year": "1969",
"year$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"copyright$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docId": "10.5594-M00919",
"docId$meta": {
"confidence": "high",
"note": "Derived from DOI 10.5594/M00919",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docLabel": "SMPTE Meetings and Conferences ( October 1969)",
"docLabel$meta": {
"confidence": "medium",
"note": "Composed from venue title, volume/issue and date",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docTitle": "Super 8: Whither Bound?",
"docTitle$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"docType": "Conference Paper",
"docType$meta": {
"confidence": "high",
"note": "NLM <article> — SMPTE conference paper",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"doi": "10.5594/M00919",
"doi$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"href": "https://doi.org/10.5594/M00919",
"href$meta": {
"confidence": "high",
"note": "Constructed from DOI 10.5594/M00919",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"number": "33",
"number$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"pages": "2–9",
"pages$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publicationDate": "1969-10-01",
"publicationDate$meta": {
"confidence": "high",
"note": "Day absent in NLM source — padded to 01",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publisher": "SMPTE",
"publisher$meta": {
"confidence": "high",
"note": "Normalised to registry \"SMPTE\" convention from NLM publisher-name (The Society of Motion Picture and Television Engineers)",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"publisherLocation": {
"city": "White Plains, NY",
"city$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"publisherLocation$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
},
"status": {
"active": true,
"active$meta": {
"confidence": "medium",
"note": "No explicit status in NLM source — conference papers default active",
"source": "inferred",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
},
"volume": "1969",
"volume$meta": {
"confidence": "high",
"note": "Parsed from NLM article XML (_source/SMPTE/HIGHWIRE/ORIGINAL SAMPLES/Conferences/smptem_1969_33/smptem_1969_33.xml/10.5594_M00919.xml)",
"source": "parsed",
"updated": "2026-05-22T21:44:19.125Z",
"version": "smpte-conference-paper-nlm@v1"
}
}
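
The records above pair every field with a sibling `<field>$meta` provenance object, including inside nested objects such as `copyright` and `status`. A consumer that wants plain values separately from provenance could split them roughly as follows. This is a sketch, not code from this change: `splitMeta` and the dotted-path keys for nested provenance are hypothetical choices.

```javascript
// Hypothetical helper: separates plain values from `$meta` provenance siblings.
// Nested provenance is keyed by a dotted path (an illustrative convention only).
function splitMeta(node) {
  // Primitives, null, and arrays carry no `$meta` siblings; return them as-is.
  if (Array.isArray(node) || node === null || typeof node !== 'object') {
    return { values: node, meta: {} };
  }
  const values = {};
  const meta = {};
  for (const [key, val] of Object.entries(node)) {
    if (key.endsWith('$meta')) {
      // "docId$meta" is the provenance record for "docId".
      meta[key.slice(0, -'$meta'.length)] = val;
    } else {
      const child = splitMeta(val);
      values[key] = child.values;
      for (const [k, m] of Object.entries(child.meta)) {
        meta[`${key}.${k}`] = m;
      }
    }
  }
  return { values, meta };
}

// Trimmed excerpt of a record in the shape shown above.
const record = {
  docId: '10.5594-M00918',
  docId$meta: { confidence: 'high', source: 'parsed' },
  authors: ['Richard J. Goldberg'],
  authors$meta: { confidence: 'high', source: 'parsed' },
  status: {
    active: true,
    active$meta: { confidence: 'medium', source: 'inferred' },
  },
};

const { values, meta } = splitMeta(record);
console.log(values);
console.log(Object.keys(meta));
```

Keeping provenance in a separate map makes it straightforward to, for example, list every field whose `confidence` is not `high` without walking the record again.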