Merged
31 commits
533c70f
patch(ui): text-body for all headers inside CardContent
wackerow Jun 5, 2026
f90791b
i18n(ar): LLM translation
wackerow Jun 8, 2026
33bbce8
i18n(bn): LLM translation
wackerow Jun 8, 2026
d7f1d78
i18n(cs): LLM translation
wackerow Jun 8, 2026
7a2db42
i18n(de): LLM translation
wackerow Jun 8, 2026
4d8dcee
i18n(es): LLM translation
wackerow Jun 8, 2026
a4aaf3a
i18n(fr): LLM translation
wackerow Jun 8, 2026
f0e427f
i18n(hi): LLM translation
wackerow Jun 8, 2026
32af880
i18n(id): LLM translation
wackerow Jun 8, 2026
5f16f09
i18n(it): LLM translation
wackerow Jun 8, 2026
d31cd56
i18n(ja): LLM translation
wackerow Jun 8, 2026
eec97e1
i18n(ko): LLM translation
wackerow Jun 8, 2026
6903d22
i18n(mr): LLM translation
wackerow Jun 8, 2026
5a44e57
i18n(pl): LLM translation
wackerow Jun 8, 2026
0718608
i18n(pt-br): LLM translation
wackerow Jun 8, 2026
9cd2270
i18n(ru): LLM translation
wackerow Jun 8, 2026
5eb86a5
i18n(sw): LLM translation
wackerow Jun 8, 2026
677f119
i18n(ta): LLM translation
wackerow Jun 8, 2026
9d95056
i18n(te): LLM translation
wackerow Jun 8, 2026
b7eef87
i18n(tr): LLM translation
wackerow Jun 8, 2026
764f61f
i18n(uk): LLM translation
wackerow Jun 8, 2026
f8ef44e
i18n(ur): LLM translation
wackerow Jun 8, 2026
c2f97f9
i18n(vi): LLM translation
wackerow Jun 8, 2026
9b36b64
i18n(zh): LLM translation
wackerow Jun 8, 2026
dd68af4
i18n(zh-tw): LLM translation
wackerow Jun 8, 2026
2787b3e
i18n: merge tmp-intl/run-0608-1139 into intl/pending-dev
wackerow Jun 8, 2026
79b863c
feat(intl): sanitizer guard for ghost headings
myelinated-wackerow Jun 9, 2026
699f00a
docs(intl): ETHGlossary authority + new patterns
myelinated-wackerow Jun 9, 2026
5f3d5a7
Merge pull request #18363 from ethereum/fix-bug-bounty-header-colors
wackerow Jun 10, 2026
cc80357
Merge pull request #18376 from ethereum/intl/pending-dev
wackerow Jun 10, 2026
359dc5d
Merge pull request #18383 from ethereum/intl-ethglossary-authority
wackerow Jun 10, 2026
10 changes: 9 additions & 1 deletion .claude/skills/intl-review/references/ethglossary-usage.md
@@ -19,14 +19,22 @@ This doc focuses on the review-specific patterns (severity mapping, what to flag

This is the determinism backbone. Reviewers don't argue terminology with the pipeline — both pipeline and reviewer defer to ETHGlossary. If a glossary entry looks wrong during review, flag it in the report; don't patch the locale to disagree with it.

## Authority hierarchy — and what to do for items the glossary doesn't cover

ETHGlossary is the source of truth for **term translations AND for transliteration/calque/keep-Latin guidance**. Apply it in this strict order; do not substitute your own instinct:

1. **Term IS in ETHGlossary** → its per-term `script_rule` is the *only* authority for transliterate / calque / keep-latin / always-latin. Query it (`/filter` per file, or `/translations/{lang}/{termId}`). A deviation is CRITICAL.
2. **Term is NOT in ETHGlossary** (author names, brand-new products, etc.) → apply the script-aware fallback in `known-patterns.md` §1: **transliterate** into non-Latin target scripts, **keep as-is** for Latin scripts.
3. **Never infer a "default" `script_rule` for an unlisted term.** An absent entry means "use the fallback," **not** "keep Latin." Flagging a correctly-transliterated non-Latin author name (e.g. `te` "మారియో హావెల్" for "Mario Havel") as "should be Latin" is a **false positive** — the kind of fabricated critical that wastes reviewer time. When in doubt, query the API; if the term isn't there, the fallback decides, not you.
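
The three rules above can be sketched as a single decision function. This is only an illustration: the `GlossaryEntry` shape, the `scriptRule` field name, the spelling of the rule values, and the `NON_LATIN_LOCALES` subset are assumptions made for the example, not the actual ETHGlossary response format or the project's locale list.

```typescript
// Assumed spellings of the four per-term rules named in the hierarchy above.
type ScriptRule = "transliterate" | "calque" | "keep_latin" | "always_latin";

// Hypothetical shape of a glossary hit; the real API response may differ.
interface GlossaryEntry {
  termId: string;
  scriptRule: ScriptRule;
}

type Decision =
  | { source: "glossary"; rule: ScriptRule }
  | { source: "fallback"; rule: "transliterate" | "keep-as-is" };

// Illustrative subset of non-Latin-script locales, not the project's full list.
const NON_LATIN_LOCALES = new Set(["ar", "bn", "hi", "ja", "ko", "te", "zh"]);

function resolveScriptRule(
  entry: GlossaryEntry | undefined,
  locale: string
): Decision {
  // Rule 1: a listed term is governed only by its own script_rule.
  if (entry !== undefined) {
    return { source: "glossary", rule: entry.scriptRule };
  }
  // Rules 2 and 3: an absent entry never implies "keep Latin";
  // the script-aware fallback decides based on the target script.
  return {
    source: "fallback",
    rule: NON_LATIN_LOCALES.has(locale) ? "transliterate" : "keep-as-is",
  };
}
```

Under these assumptions, `resolveScriptRule(undefined, "te")` resolves to the fallback with `transliterate`, which is why the transliterated "Mario Havel" example is correct rather than a finding.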

## How to query

**Preferred — per-file `/filter`:**

```bash
curl -sf -X POST "$GLOSSARY_API_URL/filter" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg text "$ENGLISH_SOURCE" --arg lang "$LANG" '{text: $text, language: $lang}')"
-d "$(jq -n --arg content "$ENGLISH_SOURCE" --arg lang "$LANG" '{content: $content, language: $lang}')"
```

Returns only the glossary terms that appear in the English source for the file being reviewed, with translations sorted by occurrence. Avoids pulling hundreds of irrelevant terms into review context.
24 changes: 21 additions & 3 deletions .claude/translation-review/known-patterns.md
@@ -1,7 +1,7 @@
# Known Translation Patterns & Issues

> This is a living document. Updated after each language review.
> Last updated: 2026-03-16 (updated with Gemini-confirmed transliteration policy from Hindi PR #17101)
> Last updated: 2026-06-09 (PR #18375: added MDX duplicated-closer / dropped-`>` breakers, the duplicate ghost-heading migration artifact, and the ETHGlossary authority hierarchy for unlisted terms)

## Issue Categories

@@ -70,6 +70,12 @@ flag phonetic transliterations.

**Transliteration authority:** ETHGlossary (https://ethglossary.visual-20-hoists.workers.dev) is the canonical source for term translations, including per-language transliterated forms for non-Latin scripts. The pipeline queries ETHGlossary directly; reviewers verify against the per-term `script_rule` returned by the API. The previous local bank at `.claude/translation-review/transliterations/` has been removed as of ETHGlossary v0.3.0.

**Authority hierarchy — terms ETHGlossary covers vs. items it doesn't (READ THIS before flagging a transliteration/calque/keep-Latin "error"):**

1. **For any term ETHGlossary covers, its per-term `script_rule` is the ONLY authority** for the transliterate / calque / keep-latin / always-latin decision. Query the API (`/filter` per file, or `/translations/{lang}/{termId}`); never assume.
2. **For items ETHGlossary does NOT cover** (author names, brand-new product names not yet in the glossary, etc.), apply the script-aware fallback above: **transliterate** into non-Latin target scripts, **keep as-is** for Latin scripts.
3. **Never infer a "default" `script_rule` for an unlisted term.** An absent glossary entry means "fall back to the script-aware policy," **NOT** "keep Latin." Example caught in PR #18375 review: a `te` author name "Mario Havel" rendered as "మారియో హావెల్" is **CORRECT** per the fallback — a reviewer flagging it as "should stay Latin" by assuming an `always_latin` default was a **false positive**. When ETHGlossary and a reviewer's instinct disagree, ETHGlossary (or, for unlisted items, this documented fallback) wins — there is one source of truth.

### 2. Cross-Script Contamination (CRITICAL)

Crowdin translation memory leaks content from other language translations.
@@ -82,16 +88,28 @@ Crowdin translation memory leaks content from other language translations.

### 3. MDX Syntax Errors (CRITICAL — breaks builds)

Four predictable categories that appear in every import:
Predictable categories that appear in nearly every import:

| Pattern | Example | Fix |
|---------|---------|-----|
| Raw `<` before numbers | `<5GB` in MDX context | Escape to `&lt;5GB` |
| Missing closing backtick | `` `<contract>.<function>() `` | Add closing backtick |
| Misplaced backtick exposing JSX | ``(`<> ...` </>`)`` | Fix backtick placement |
| Orphaned HTML closing tags | `</a>` from sentence restructuring | Remove orphaned tag |
| Duplicated inner closer over a wrapper | `<ExpandableCard>…<ButtonLink>x</ButtonLink></ButtonLink>` (2nd should close the wrapper) | Restore the wrapper's real closing tag from the English source (`</ExpandableCard>`, `</Callout>`, …) |
| Dropped `>` in angle-bracket link whose URL has parens | `[t](<https://en.wikipedia.org/wiki/Electra_(star))` | Restore the `>` before the final `)`: `…_(star)>)` |

**Pattern:** The first two are most common. The last two were the entire cause of the failing build in **PR #18375** (77 files, all 24 langs). **Detect deterministically** by compiling each changed file through `@mdx-js/mdx` (the parser `next-mdx-remote` uses) — strip frontmatter and `{#id}` heading anchors first (else every file false-positives on the custom heading-id syntax), and confirm the English sources compile clean as a control before trusting the run.
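
The pre-processing step described above might look like the following minimal sketch. The function names are hypothetical, and the compile step itself (for example, passing the result to the `compile` export of `@mdx-js/mdx`) is left out so the block stays self-contained. It assumes frontmatter is a leading block delimited by `---` lines and that anchors take the `{#id}` form at the end of a heading line. It does not skip heading-like lines inside code fences.

```typescript
// Remove a leading YAML frontmatter block delimited by '---' lines.
function stripFrontmatter(source: string): string {
  return source.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "");
}

// Remove trailing {#id} anchors from ATX heading lines only, so that
// braces elsewhere in the document are left untouched.
function stripHeadingAnchors(source: string): string {
  return source
    .split("\n")
    .map((line) =>
      /^#{1,6} /.test(line) ? line.replace(/\s*\{#[^}]+\}\s*$/, "") : line
    )
    .join("\n");
}

// Hypothetical helper: prepare a translated file for a strict MDX compile check.
function prepareForMdxCompile(source: string): string {
  return stripHeadingAnchors(stripFrontmatter(source));
}
```

Running the same preparation over the English sources first gives the control the text calls for: if they do not compile clean, the failures come from the check rather than from the translations.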

### 3b. Duplicate "Ghost" Headings (CRITICAL — structural-migration artifact)

When a base-branch change shifts page block structure — e.g. the h1 → `frontmatter.title` migration that removed leading `#` headings — the pipeline's incremental block-matching can mis-align and emit a section **twice**: an anchor-less "ghost" copy (often an older or differently-worded translation, sometimes a different formality register) immediately followed by the correct `{#anchor}` copy. The reader sees the section rendered twice in a row.

**Signature:** a translated `h2`–`h4` heading **without** `{#id}` immediately followed (after blank lines and/or one duplicate paragraph) by a **same-level** heading **with** `{#id}`. English requires `{#id}` on every heading, so any anchor-less translated heading is the tell.

**Fix:** delete the ghost block (the anchor-less heading + its duplicate paragraph) up to the anchored twin; keep the anchored version that matches the English source. Observed in **PR #18375** at **254 occurrences across 69 files** (24 langs × `community/grants`, `contributing/adding-videos`, `roadmap/glamsterdam`).

**Pattern:** These same 4 patterns recur in every Crowdin import. The first two are most common.
**Detection:** scan changed translated files for `^#{2,4} ` lines lacking `{#` (outside code fences); classify each as ghost-twin (next same-level heading is anchored → safe to delete) vs. lone-missing-anchor (needs the anchor *added* from English) before fixing. **The durable fix belongs in the sanitizer** (collapse adjacent duplicate headings during sanitization) so future structural changes self-heal rather than shipping duplicates.
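
A minimal sketch of the scan-and-classify step follows, with assumed names throughout (`collectHeadings`, `findAnchorlessHeadings`, and the `Finding` shape are illustrative and are not taken from the sanitizer). As a simplification, it treats a heading as a ghost twin only when the very next heading in the file has the same level and carries an anchor. It tracks code fences with a simple toggle.

```typescript
type HeadingKind = "ghost-twin" | "lone-missing-anchor";

interface Heading {
  line: number; // 1-based line number
  level: number; // 2, 3, or 4
  anchored: boolean; // true when the line contains "{#"
}

interface Finding {
  line: number;
  level: number;
  kind: HeadingKind;
}

// Collect h2-h4 headings, ignoring anything inside fenced code blocks.
function collectHeadings(source: string): Heading[] {
  const headings: Heading[] = [];
  let inFence = false;
  source.split("\n").forEach((text, index) => {
    if (/^\s*(`{3,}|~{3,})/.test(text)) {
      inFence = !inFence; // simple toggle; nested fences are not handled
      return;
    }
    if (inFence) return;
    const hashes = /^(#{2,4}) /.exec(text)?.[1];
    if (hashes !== undefined) {
      headings.push({
        line: index + 1,
        level: hashes.length,
        anchored: text.includes("{#"),
      });
    }
  });
  return headings;
}

// Classify each anchor-less heading before any fix is applied.
function findAnchorlessHeadings(source: string): Finding[] {
  const headings = collectHeadings(source);
  const findings: Finding[] = [];
  headings.forEach((heading, i) => {
    if (heading.anchored) return;
    const next = headings[i + 1];
    const isTwin =
      next !== undefined && next.level === heading.level && next.anchored;
    findings.push({
      line: heading.line,
      level: heading.level,
      kind: isTwin ? "ghost-twin" : "lone-missing-anchor",
    });
  });
  return findings;
}
```

Keeping classification separate from the fix reflects the two remedies in the text: a `ghost-twin` block is deleted up to its anchored twin, while a `lone-missing-anchor` heading needs its anchor restored from the English source.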

### 4. Semantic Inversions (CRITICAL)

@@ -2,7 +2,7 @@

> **Source:** Gemini (2026-03-18), confirmed by project maintainer
> **Applies to:** All 13 non-Latin-script locales in the transliteration pipeline
> **Used by:** sanitizer (post_import_sanitize.ts), transliteration script (transliterate.ts), review agents
> **Used by:** sanitizer (`src/scripts/intl-pipeline/intl-sanitizer.ts`), ETHGlossary integration, review agents

## Quick Reference Table

@@ -141,7 +141,7 @@ acts as a strong RTL character that anchors the directionality:

## Impact on Sanitizer and Transliteration Scripts

### Sanitizer (post_import_sanitize.ts)
### Sanitizer (`src/scripts/intl-pipeline/intl-sanitizer.ts`)

Current behavior and needed changes:

55 changes: 50 additions & 5 deletions .manifests/src/intl/ar/common.json/source.json
@@ -1,11 +1,11 @@
{
"version": 1,
"sourceFile": "src/intl/en/common.json",
"generatedAt": "2026-06-02T09:58:27.177Z",
"rootHash": "f6744960f9ba",
"generatedAt": "2026-06-08T11:40:15.805Z",
"rootHash": "8859281c9894",
"tree": {
"contentHash": "d0599bca3752",
"anchorHash": "9073bb8b8520",
"contentHash": "37cbcc725ee7",
"anchorHash": "04ad04fb0db4",
"children": {
"about-ethereum-org": {
"contentHash": "9aafb3686977",
@@ -174,6 +174,42 @@
"contentHash": "e21f935f11d7",
"anchorHash": "e3b0c44298fc"
},
"copy-page": {
"contentHash": "62d9b9487f59",
"anchorHash": "e3b0c44298fc"
},
"copy-page-aria-label": {
"contentHash": "c1188f922b2d",
"anchorHash": "e3b0c44298fc"
},
"copy-page-options": {
"contentHash": "66c1ae393307",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-title": {
"contentHash": "17d36a505f97",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-description": {
"contentHash": "28cded2269c4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-title": {
"contentHash": "c27a47a5c24f",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-title": {
"contentHash": "9512e37d4a15",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"danksharding": {
"contentHash": "d5c7c6294c3d",
"anchorHash": "e3b0c44298fc"
@@ -2241,6 +2277,15 @@
"cookie-policy",
"copied",
"copy",
"copy-page",
"copy-page-aria-label",
"copy-page-options",
"copy-page-markdown-title",
"copy-page-markdown-description",
"copy-page-chatgpt-title",
"copy-page-chatgpt-description",
"copy-page-claude-title",
"copy-page-claude-description",
"danksharding",
"dao-page",
"dark-mode",
@@ -2739,5 +2784,5 @@
"region-sint-eustatius-netherlands"
]
},
"sourceCommitSha": "9c8ec781d111c9802b6d9677f581d21a46bedcb4"
"sourceCommitSha": "e7089af701ba3acee7872235b2cc7706c10efc91"
}
55 changes: 50 additions & 5 deletions .manifests/src/intl/bn/common.json/source.json
@@ -1,11 +1,11 @@
{
"version": 1,
"sourceFile": "src/intl/en/common.json",
"generatedAt": "2026-06-02T09:58:58.348Z",
"rootHash": "f6744960f9ba",
"generatedAt": "2026-06-08T11:40:42.465Z",
"rootHash": "8859281c9894",
"tree": {
"contentHash": "d0599bca3752",
"anchorHash": "9073bb8b8520",
"contentHash": "37cbcc725ee7",
"anchorHash": "04ad04fb0db4",
"children": {
"about-ethereum-org": {
"contentHash": "9aafb3686977",
@@ -174,6 +174,42 @@
"contentHash": "e21f935f11d7",
"anchorHash": "e3b0c44298fc"
},
"copy-page": {
"contentHash": "62d9b9487f59",
"anchorHash": "e3b0c44298fc"
},
"copy-page-aria-label": {
"contentHash": "c1188f922b2d",
"anchorHash": "e3b0c44298fc"
},
"copy-page-options": {
"contentHash": "66c1ae393307",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-title": {
"contentHash": "17d36a505f97",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-description": {
"contentHash": "28cded2269c4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-title": {
"contentHash": "c27a47a5c24f",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-title": {
"contentHash": "9512e37d4a15",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"danksharding": {
"contentHash": "d5c7c6294c3d",
"anchorHash": "e3b0c44298fc"
@@ -2241,6 +2277,15 @@
"cookie-policy",
"copied",
"copy",
"copy-page",
"copy-page-aria-label",
"copy-page-options",
"copy-page-markdown-title",
"copy-page-markdown-description",
"copy-page-chatgpt-title",
"copy-page-chatgpt-description",
"copy-page-claude-title",
"copy-page-claude-description",
"danksharding",
"dao-page",
"dark-mode",
@@ -2739,5 +2784,5 @@
"region-sint-eustatius-netherlands"
]
},
"sourceCommitSha": "9c8ec781d111c9802b6d9677f581d21a46bedcb4"
"sourceCommitSha": "e7089af701ba3acee7872235b2cc7706c10efc91"
}
55 changes: 50 additions & 5 deletions .manifests/src/intl/cs/common.json/source.json
@@ -1,11 +1,11 @@
{
"version": 1,
"sourceFile": "src/intl/en/common.json",
"generatedAt": "2026-06-02T09:58:20.388Z",
"rootHash": "f6744960f9ba",
"generatedAt": "2026-06-08T11:40:07.313Z",
"rootHash": "8859281c9894",
"tree": {
"contentHash": "d0599bca3752",
"anchorHash": "9073bb8b8520",
"contentHash": "37cbcc725ee7",
"anchorHash": "04ad04fb0db4",
"children": {
"about-ethereum-org": {
"contentHash": "9aafb3686977",
@@ -174,6 +174,42 @@
"contentHash": "e21f935f11d7",
"anchorHash": "e3b0c44298fc"
},
"copy-page": {
"contentHash": "62d9b9487f59",
"anchorHash": "e3b0c44298fc"
},
"copy-page-aria-label": {
"contentHash": "c1188f922b2d",
"anchorHash": "e3b0c44298fc"
},
"copy-page-options": {
"contentHash": "66c1ae393307",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-title": {
"contentHash": "17d36a505f97",
"anchorHash": "e3b0c44298fc"
},
"copy-page-markdown-description": {
"contentHash": "28cded2269c4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-title": {
"contentHash": "c27a47a5c24f",
"anchorHash": "e3b0c44298fc"
},
"copy-page-chatgpt-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-title": {
"contentHash": "9512e37d4a15",
"anchorHash": "e3b0c44298fc"
},
"copy-page-claude-description": {
"contentHash": "babab9a22ef4",
"anchorHash": "e3b0c44298fc"
},
"danksharding": {
"contentHash": "d5c7c6294c3d",
"anchorHash": "e3b0c44298fc"
@@ -2241,6 +2277,15 @@
"cookie-policy",
"copied",
"copy",
"copy-page",
"copy-page-aria-label",
"copy-page-options",
"copy-page-markdown-title",
"copy-page-markdown-description",
"copy-page-chatgpt-title",
"copy-page-chatgpt-description",
"copy-page-claude-title",
"copy-page-claude-description",
"danksharding",
"dao-page",
"dark-mode",
@@ -2739,5 +2784,5 @@
"region-sint-eustatius-netherlands"
]
},
"sourceCommitSha": "9c8ec781d111c9802b6d9677f581d21a46bedcb4"
"sourceCommitSha": "e7089af701ba3acee7872235b2cc7706c10efc91"
}