feat: RFC 6570 Compatible @fedify/uri-template Package
#475
Open
notJoon
wants to merge
6
commits into
fedify-dev:next
from
notJoon:feat/uri-template
5bcfcd7
feat: RFC 6570 compatable uti-template
notJoon 32bf073
Merge branch 'next' into feat/uri-template
notJoon cc605fd
fix: update configure files
notJoon 007ba8e
fix: docs
notJoon acc31cf
Update packages/uri-template/src/compile.ts
notJoon 7dbe61c
update change.md and pnpm-workspace
notJoon
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| <!-- deno-fmt-ignore-file --> | ||
|
|
||
| @fedify/uri-template: RFC 6570 URI Template implementation | ||
| =========================================================== | ||
|
|
||
| [![JSR][JSR badge]][JSR] | ||
| [![npm][npm badge]][npm] | ||
|
|
||
| This package provides a fully [RFC 6570]-compliant library for URI template | ||
| expansion and pattern matching. It supports symmetric matching, where | ||
| `expand(match(url))` and `match(expand(vars))` behave predictably. | ||
|
|
||
| [JSR]: https://jsr.io/@fedify/uri-template | ||
| [JSR badge]: https://jsr.io/badges/@fedify/uri-template | ||
| [npm]: https://www.npmjs.com/package/@fedify/uri-template | ||
| [npm badge]: https://img.shields.io/npm/v/@fedify/uri-template?logo=npm | ||
| [RFC 6570]: https://datatracker.ietf.org/doc/html/rfc6570 | ||
|
|
||
|
|
||
| Features | ||
| -------- | ||
|
|
||
| - **Full RFC 6570 Level 4 support**: Handles all operators and modifiers | ||
| (explode `*`, prefix `:n`) | ||
| - **Symmetric pattern matching**: | ||
| - `opaque`: byte-for-byte exact round-trips | ||
| - `cooked`: human-readable decoded values | ||
| - `lossless`: preserves both raw and decoded forms | ||
| - **Strict percent-encoding validation**: Rejects malformed sequences | ||
| (`%GZ`, etc.) | ||
| - **Deterministic expansion**: Correctly handles undefined/empty values per | ||
| RFC rules | ||
|
|
||
|
|
||
| Installation | ||
| ------------ | ||
|
|
||
| ~~~~ sh | ||
| deno add jsr:@fedify/uri-template # Deno | ||
| npm add @fedify/uri-template # npm | ||
| pnpm add @fedify/uri-template # pnpm | ||
| yarn add @fedify/uri-template # Yarn | ||
| bun add @fedify/uri-template # Bun | ||
| ~~~~ | ||
|
|
||
|
|
||
| Usage | ||
| ----- | ||
|
|
||
| ~~~~ typescript | ||
| import { compile } from "@fedify/uri-template"; | ||
|
|
||
| const tmpl = compile("/repos{/owner,repo}{?q,lang}"); | ||
|
|
||
| // Expansion | ||
| const url = tmpl.expand({ owner: "foo", repo: "hello/world", q: "a b" }); | ||
| // => "/repos/foo/hello%2Fworld?q=a%20b" | ||
|
|
||
| // Matching | ||
| const result = tmpl.match("/repos/foo/hello%2Fworld?q=a%20b", { | ||
| encoding: "cooked" | ||
| }); | ||
| // => { owner: "foo", repo: "hello/world", q: "a b" } | ||
| ~~~~ | ||
|
|
||
| **Matching options:** | ||
|
|
||
| - `encoding`: `"opaque"` (default, preserves raw) | `"cooked"` (decoded) | | ||
| `"lossless"` (both) | ||
| - `strict`: `true` (default, strict) | `false` (lenient parsing) | ||
|
|
||
|
|
||
| Documentation | ||
| ------------- | ||
|
|
||
| For implementation details, see [*specification.md*]. | ||
|
|
||
| [*specification.md*]: ./docs/specification.md |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| { | ||
| "name": "@fedify/uri-template", | ||
| "version": "2.0.0", | ||
| "license": "MIT", | ||
| "exports": { | ||
| ".": "./src/index.ts" | ||
| }, | ||
| "exclude": [ | ||
| "dist", | ||
| "node_modules" | ||
| ], | ||
| "tasks": { | ||
| "check": "deno fmt --check && deno lint && deno check *.ts" | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,248 @@ | ||
| URI Template specification | ||
| ========================== | ||
|
|
||
| *This specification describes the `@fedify/uri-template` package implementation | ||
| as of version 0.1.0.* | ||
notJoon marked this conversation as resolved.
|
||
|
|
||
|
|
||
| Introduction | ||
| ------------ | ||
|
|
||
| This document explains how the `@fedify/uri-template` package implements | ||
| [RFC 6570] URI Templates and extends them with symmetric pattern matching. | ||
|
|
||
| The package is built on three foundations. First, it parses a template string | ||
| into a small abstract syntax tree (AST) that represents RFC 6570 constructs. | ||
| Second, it *expands* variables into a URL using a single, deterministic encoder | ||
| that follows operator-specific rules. Third, it *matches* existing URLs back | ||
| to variables using the same AST and rule table, adding explicit encoding modes | ||
| so callers can choose byte-preserving or human-readable behavior. This | ||
| unification is what makes round-trips predictable with no ad-hoc heuristics. | ||
|
|
||
| [RFC 6570]: https://datatracker.ietf.org/doc/html/rfc6570 | ||
|
|
||
|
|
||
| RFC 6570 elements in practice | ||
| ------------------------------ | ||
|
|
||
| > RFC 6570 divides a template into **literals** and **expressions**. | ||
|
|
||
| A **literal** is any substring outside curly braces. Literals are copied | ||
| directly during expansion and must match exactly during pattern matching. | ||
| If a literal contains `%` sequences, those sequences are not decoded—literals | ||
| are treated as already-encoded text. | ||
|
|
||
| For example, in the template `/users/{id}/profile`, the strings `/users/`, `/`, | ||
| and `/profile` are literals. During expansion, these parts remain unchanged, | ||
| so if `id` expands to `123`, the result would be `/users/123/profile`. If the | ||
| literal contained encoded characters like `/users%2F{id}`, the `%2F` sequence | ||
| would remain as-is rather than being decoded to `/`. | ||
|
|
||
| An **expression** is enclosed in `{...}` and contains an optional *operator* | ||
| followed by a comma-separated list of variable specifications (varspecs)[^1]: | ||
|
|
||
| ~~~~ | ||
| { operator? var1, var2, var3 } | ||
| ~~~~ | ||
|
|
||
| Each varspec may carry at most one of two modifiers: | ||
|
|
||
| - `:n` (prefix): Only the first `n` characters of the variable are used, and | ||
| the truncation happens before any percent-encoding. | ||
| - `*` (explode): Lists and maps expand into multiple items rather than | ||
| a single comma-joined value. | ||
|
|
||
| RFC 6570 defines eight expression types: simple expansion, which has no | ||
| operator character, and seven operators, each with distinct behavior: | ||
|
|
||
| - *Simple* (`{var}`): Outputs values comma-separated. Reserved characters | ||
| are encoded. | ||
| - *Reserved* (`{+var}`): Like simple, but reserved characters are allowed to | ||
| pass through unencoded. | ||
| - *Fragment* (`{#var}`): Like reserved, but the full expression is prefixed | ||
| with `#`. The operator allows many reserved characters to pass, but never | ||
| a literal `#`, which would start a new fragment. | ||
| - *Label* (`{.var}`): Each value is prefixed with a dot. Even an *empty* | ||
| value still emits the dot (e.g., `"X{.y}"` with `y=""` yields `"X."`)[^2]. | ||
| - *Path segments* (`{/var}`): Each value is prefixed with `/`. | ||
| - *Matrix parameters* (`{;x,y}`): Each variable is prefixed with `;` and is | ||
| *named*. | ||
| - Empty becomes `;x` | ||
| - Undefined is omitted entirely | ||
| - *Query* (`{?x,y}`): First character `?`, then name/value pairs joined with | ||
| `&`. | ||
| - An empty value becomes `x=` | ||
| - Undefined is omitted | ||
| - *Query continuation* (`{&x,y}`): Like query but begins with `&`, intended | ||
| to append to existing query strings. | ||
|
|
||
| > [!NOTE] | ||
| > The distinction between "undefined" and "empty" is critical and depends on | ||
| > the specific operator being used. "Undefined" means the variable is | ||
| > *omitted entirely* from the output. In contrast, "empty" means the operator | ||
| > still *emits something* according to its rules: the name only for matrix | ||
| > parameters (`;x`), the name followed by `=` for queries (`x=`), and an empty | ||
| > string for the other operators. Per RFC 6570 the prefix or separator is still | ||
| > written in that case, so labels print the dot (`X.`) and `{/var,empty}` | ||
| > yields `/value/`. | ||
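The per-operator behavior listed above corresponds to the table in RFC 6570 Appendix A. The following is a minimal sketch of such a table and of the undefined-versus-empty rule for plain string values. The names `OperatorSpec`, `OPERATORS`, and `renderExpression` are hypothetical and chosen for illustration; they are not taken from this package, and values are assumed to be percent-encoded already.

```typescript
// Hypothetical operator table mirroring RFC 6570 Appendix A.
interface OperatorSpec {
  first: string; // prefix written once before the first item
  sep: string; // separator between items
  named: boolean; // whether items are written as name=value
  ifEmpty: string; // what follows the name when the value is ""
  allowReserved: boolean; // whether reserved characters pass through
}

const OPERATORS: Record<string, OperatorSpec> = {
  "": { first: "", sep: ",", named: false, ifEmpty: "", allowReserved: false },
  "+": { first: "", sep: ",", named: false, ifEmpty: "", allowReserved: true },
  "#": { first: "#", sep: ",", named: false, ifEmpty: "", allowReserved: true },
  ".": { first: ".", sep: ".", named: false, ifEmpty: "", allowReserved: false },
  "/": { first: "/", sep: "/", named: false, ifEmpty: "", allowReserved: false },
  ";": { first: ";", sep: ";", named: true, ifEmpty: "", allowReserved: false },
  "?": { first: "?", sep: "&", named: true, ifEmpty: "=", allowReserved: false },
  "&": { first: "&", sep: "&", named: true, ifEmpty: "=", allowReserved: false },
};

// Renders one expression from (name, value) pairs of already-encoded strings.
function renderExpression(
  op: string,
  items: Array<[string, string | undefined]>,
): string {
  const spec = OPERATORS[op];
  if (spec === undefined) throw new Error(`unknown operator: ${op}`);
  const parts: string[] = [];
  for (const [name, value] of items) {
    if (value === undefined) continue; // undefined: omitted entirely
    if (!spec.named) {
      parts.push(value); // empty string still occupies a slot
    } else {
      parts.push(value === "" ? name + spec.ifEmpty : `${name}=${value}`);
    }
  }
  // The prefix is written only when at least one variable is defined.
  return parts.length === 0 ? "" : spec.first + parts.join(spec.sep);
}
```

With this table, `renderExpression(";", [["x", "1024"], ["y", ""]])` gives `;x=1024;y`, while the same pairs under `?` give `?x=1024&y=`.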
|
|
||
|
|
||
| Parsing strategy | ||
| ----------------- | ||
|
|
||
| Parsing is a single forward scan that alternates between collecting literals and | ||
| parsing expressions. We deliberately avoid one broad regular expression: a | ||
| hand-written scanner is more resilient and, more importantly, less error-prone | ||
| when you need exact source positions and well-defined behavior around edge cases. | ||
|
|
||
| 1. *Scan for `{`*: Everything preceding it forms a Literal node. | ||
|
|
||
| 2. *Read an expression*: The next character may be one of `+#./;?&`. | ||
| If present, this character serves as the operator; otherwise, default to the | ||
| "simple" operator. | ||
|
|
||
| 3. *Parse a varspec list*: For each variable specification, read the | ||
| following components: | ||
|
|
||
| - The variable name (must be non-empty) | ||
| - An optional `:n` prefix modifier, where `n` is a positive integer (RFC 6570 | ||
| limits it to 1 through 9999) | ||
| - An optional `*` explode flag | ||
| - Either a comma (indicating additional varspecs follow) or a closing | ||
| brace | ||
|
|
||
| 4. *Require `}`*: If the input terminates before encountering the closing | ||
| brace, this constitutes a parse error—templates must be properly balanced. | ||
|
|
||
| The result is a small AST: a sequence of `Literal` and `Expression` nodes. | ||
| Every later phase—expansion and matching—walks this same AST and consults | ||
| a single "operator spec" table. This is the design fulcrum for symmetry: both | ||
| directions share exactly the same structure and tables. | ||
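The four steps above can be sketched as a small scanner. This is an illustrative approximation, not the package's parser: the node shapes, `parseTemplate`, and `parseVarSpec` are hypothetical names, the variable-name check is looser than the RFC grammar, and a short regular expression validates each individual varspec purely for brevity.

```typescript
type Operator = "" | "+" | "#" | "." | "/" | ";" | "?" | "&";

interface VarSpec {
  name: string;
  prefix?: number; // from ":n"
  explode: boolean; // from "*"
}

type TemplateNode =
  | { kind: "literal"; text: string }
  | { kind: "expression"; operator: Operator; vars: VarSpec[] };

function parseVarSpec(spec: string, offset: number): VarSpec {
  // A non-empty name, then at most one modifier: ":n" (1..9999) or "*".
  const m = /^([A-Za-z0-9_.%]+)(?::([1-9][0-9]{0,3})|(\*))?$/.exec(spec);
  if (m === null) {
    throw new SyntaxError(`invalid varspec "${spec}" at offset ${offset}`);
  }
  const result: VarSpec = { name: m[1] as string, explode: m[3] !== undefined };
  if (m[2] !== undefined) result.prefix = Number(m[2]);
  return result;
}

function parseTemplate(src: string): TemplateNode[] {
  const nodes: TemplateNode[] = [];
  let i = 0;
  while (i < src.length) {
    // Step 1: everything before the next "{" is a literal.
    const open = src.indexOf("{", i);
    const end = open === -1 ? src.length : open;
    if (end > i) nodes.push({ kind: "literal", text: src.slice(i, end) });
    if (open === -1) break;
    // Step 4: the expression must be closed.
    const close = src.indexOf("}", open);
    if (close === -1) {
      throw new SyntaxError(`unclosed expression at offset ${open}`);
    }
    let body = src.slice(open + 1, close);
    // Step 2: an optional operator character, else the simple operator.
    let operator: Operator = "";
    if (body.length > 0 && "+#./;?&".includes(body.charAt(0))) {
      operator = body.charAt(0) as Operator;
      body = body.slice(1);
    }
    // Step 3: a comma-separated list of varspecs.
    const vars = body.split(",").map((spec) => parseVarSpec(spec, open));
    nodes.push({ kind: "expression", operator, vars });
    i = close + 1;
  }
  return nodes;
}
```

For `"/repos{/owner,repo}{?q,lang}"` this produces one literal node followed by two expression nodes, which is the shape both expansion and matching would then walk.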
|
|
||
|
|
||
| Expansion—from variables to URL | ||
| -------------------------------- | ||
|
|
||
| Expansion takes the AST and a dictionary of variables. For literals, it copies | ||
| text unchanged. For expressions, it computes a sequence of *pieces* and then | ||
| emits them with the operator's rules: | ||
|
|
||
| *Encoding is idempotent* | ||
| : Existing `%HH` sequences remain intact, while characters requiring encoding | ||
| are converted to UTF-8 bytes and written as `%HH` triplets. | ||
|
|
||
| *Truncation (prefix `:n`) occurs before encoding* | ||
| : Truncating after encoding risks splitting a `%HH` triplet; RFC 6570 requires | ||
| truncation on the pre-encoded string. | ||
|
|
||
| *Explode* | ||
| : Transforms lists or maps into multiple items instead of a single joined | ||
| value. | ||
|
|
||
| *Join* | ||
| : Joins the pieces with the operator's separator and prepends the operator's | ||
| *first character* (such as `#`, `.`, `/`, `;`, `?`, `&`) once, if defined. | ||
|
|
||
| *Empty/undefined handling*: The RFC specifies precise rules for these edge | ||
| cases: | ||
|
|
||
| - *Matrix* (`;`): Empty values yield `;x`, undefined values are omitted | ||
| - *Query* (`?`/`&`): Empty values yield `x=`, undefined values are omitted | ||
| - *Label* (`.`): Empty values still emit the dot separator (a commonly | ||
| overlooked edge case) | ||
|
|
||
| These rules ensure that expansion from structured data produces deterministic | ||
| and stable results, eliminating ambiguity about when to include separators or | ||
| variable names. | ||
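To make the encoding and truncation rules concrete, here is a minimal sketch of an idempotent encoder and of prefix truncation applied before encoding. The function names `encodeValue` and `expandPrefix` are hypothetical, and the sketch assumes the idempotent rule exactly as stated above (valid `%HH` triplets in a value are copied through unchanged).

```typescript
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
const RESERVED = /^[:\/?#\[\]@!$&'()*+,;=]$/;
const utf8 = new TextEncoder();

// Percent-encodes a value, leaving existing %HH triplets untouched.
function encodeValue(value: string, allowReserved = false): string {
  let out = "";
  let i = 0;
  while (i < value.length) {
    const triplet = value.slice(i, i + 3);
    if (/^%[0-9A-Fa-f]{2}$/.test(triplet)) {
      out += triplet; // already encoded: copy unchanged
      i += 3;
      continue;
    }
    // Read one full code point so surrogate pairs are not split.
    const ch = String.fromCodePoint(value.codePointAt(i) as number);
    i += ch.length;
    if (UNRESERVED.test(ch) || (allowReserved && RESERVED.test(ch))) {
      out += ch;
    } else {
      utf8.encode(ch).forEach((byte) => {
        out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
      });
    }
  }
  return out;
}

// Applies the ":n" prefix modifier: truncate by code point first, then encode.
function expandPrefix(
  value: string,
  maxLength: number,
  allowReserved = false,
): string {
  const truncated = Array.from(value).slice(0, maxLength).join("");
  return encodeValue(truncated, allowReserved);
}
```

Because truncation runs on the unencoded string, `expandPrefix("\u00e9/abc", 2)` keeps the two characters `é/` and only then encodes them, so no `%HH` triplet produced by the encoder can be cut in half.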
|
|
||
|
|
||
| Pattern matching | ||
| ----------------- | ||
|
|
||
| Matching reads a URL string and attempts to recover the variables that would | ||
| produce that URL when expanded with the same template. | ||
|
|
||
| *Core Approach*: The fundamental concept is to reverse the expansion process | ||
| systematically rather than rely on heuristics. We traverse the same AST used | ||
| for expansion. For each literal node, we require it to appear exactly at the | ||
| current position. For expressions, we: | ||
|
|
||
| - *Consume operator prefix*: If the operator defines a first character | ||
| (`?`, `&`, `;`, `#`, `/`, `.`), we require its presence and consume it. | ||
|
|
||
| - *Greedy capture*: We capture text greedily until reaching the next concrete | ||
| boundary, which is one of: | ||
|
|
||
| 1. The subsequent literal in the AST | ||
| 2. The operator's item separator when matching multiple variables within | ||
| the same expression | ||
|
|
||
| - *Preserve encoding integrity*: When splitting captured text by separators, | ||
| we treat percent triplets as indivisible atoms, never splitting within `%HH` | ||
| sequences to avoid corrupting encoded bytes. | ||
|
|
||
| - *Parse named operators*: For operators like `;`, `?`, and `&`, we parse | ||
| `name=value` pairs but store **only the value** for each variable, mirroring | ||
| how expansion generates names from operators rather than variable content. | ||
|
|
||
| - *Infer exploded structure*: For exploded named lists (e.g., `;tags*`), we | ||
| determine structure based on patterns: | ||
|
|
||
| - If every segment follows `tags=...` format, we return an array of values | ||
| - Otherwise, we interpret the segments as a key-value mapping (`?a=1&b=2` → | ||
| `{ a: "1", b: "2" }`) | ||
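The array-versus-map inference for exploded named variables can be sketched as follows. `inferExploded` is a hypothetical helper written for illustration; it assumes the operator prefix has already been consumed and that the captured text is still in raw (percent-encoded) form.

```typescript
type Exploded = string[] | Record<string, string>;

// `captured` is the text after the operator prefix, e.g. "tags=a;tags=b".
function inferExploded(varName: string, captured: string, sep: string): Exploded {
  // Separators such as ";" and "&" can never occur inside a %HH triplet,
  // so a plain split keeps every triplet intact.
  const segments = captured === "" ? [] : captured.split(sep);
  const pairs = segments.map((segment): [string, string] => {
    const eq = segment.indexOf("=");
    return eq === -1
      ? [segment, ""]
      : [segment.slice(0, eq), segment.slice(eq + 1)];
  });
  // Every segment is named after the variable: treat it as a list.
  if (pairs.every(([name]) => name === varName)) {
    return pairs.map(([, value]) => value);
  }
  // Otherwise treat the segments as a key-value mapping.
  const map: Record<string, string> = {};
  for (const [name, value] of pairs) map[name] = value;
  return map;
}
```

Under this rule a map whose only key equals the variable name is indistinguishable from a one-element list, and the sketch resolves that case as a list.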
|
|
||
| ### Encoding modes | ||
|
|
||
| Encoding modes control the form of captured values: | ||
|
|
||
| *Opaque* | ||
| : Preserves raw bytes (percent sequences) exactly. If you match a URL with | ||
| `"a%2Fb"`, you get `"a%2Fb"`. This enables byte-for-byte round-trips. | ||
|
|
||
| *Cooked* | ||
| : Decodes each valid `%HH` sequence exactly once, returning human-readable values | ||
| such as `"a/b"`. This is convenient for application logic and enables | ||
| semantic round-trips. | ||
|
|
||
| *Lossless* | ||
| : Returns both views `{ raw, decoded }`, allowing callers to decide per | ||
| variable whether to preserve original bytes or use decoded text. | ||
|
|
||
| These options are explicit rather than implicit, providing flexibility while | ||
| maintaining correctness. | ||
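As a rough illustration of the three modes, the sketch below turns one captured raw string into the form each mode would return. The names `Encoding`, `Captured`, and `presentCapture` are hypothetical, and applying the percent-encoding check in every mode is an assumption of this sketch rather than documented package behavior.

```typescript
type Encoding = "opaque" | "cooked" | "lossless";
type Captured = string | { raw: string; decoded: string };

function presentCapture(raw: string, encoding: Encoding): Captured {
  // Strict validation: every "%" must be followed by two hex digits.
  if (/%(?![0-9A-Fa-f]{2})/.test(raw)) {
    throw new URIError(`malformed percent-encoding in "${raw}"`);
  }
  if (encoding === "opaque") return raw; // bytes preserved exactly
  // decodeURIComponent decodes each %HH triplet once; it throws URIError
  // when the decoded bytes are not valid UTF-8.
  const decoded = decodeURIComponent(raw);
  return encoding === "cooked" ? decoded : { raw, decoded };
}
```

Decoding exactly once matters for doubly encoded input: `presentCapture("a%252Fb", "cooked")` returns `"a%2Fb"`, not `"a/b"`.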
|
|
||
|
|
||
| Round-trip guarantees | ||
| --------------------- | ||
|
|
||
| While RFC 6570 briefly mentions that "some URI Templates can be used in reverse | ||
| for the purpose of variable matching"[^3], it provides no formal specification | ||
| or guarantees for this behavior. Symmetry is often promised by implementations | ||
| but rarely defined precisely. | ||
|
|
||
| This package provides explicit round-trip guarantees as a core feature: | ||
|
|
||
| ### Matching then expanding (byte symmetry) | ||
|
|
||
| Under `opaque` mode, for any URL that matches the template, re-expanding the | ||
| matched variables produces *the exact same bytes*. Formally: | ||
|
|
||
| ~~~~ typescript | ||
| expand(match(url, { encoding: "opaque" }).vars) === url | ||
| ~~~~ | ||
|
|
||
| This is essential for reverse routing, ensuring that URL patterns can be | ||
| reliably inverted. | ||
|
|
||
| ### Expanding then matching (semantic symmetry) | ||
|
|
||
| Under `cooked` mode, for any valid variable dictionary, expanding and then | ||
| matching recovers semantically equivalent values: | ||
|
|
||
| ~~~~ typescript | ||
| const matched = match(expand(vars), { encoding: "cooked" }); | ||
| // matched.vars is semantically equivalent to vars | ||
| ~~~~ | ||
|
|
||
| This guarantees that the meaning of variables is preserved through the | ||
| round-trip, even if the exact byte representation differs due to normalization. | ||
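Both guarantees can be phrased as executable checks. The sketch below uses a toy stand-in for the single template `/users/{id}` so that it is self-contained; `toyExpand`, `toyMatch`, and the two check functions are hypothetical and do not reflect the package's actual API or return shapes.

```typescript
type Vars = Record<string, string>;

const PREFIX = "/users/";

// Toy expansion: keep existing %HH triplets, encode the rest.
function toyExpand(vars: Vars): string {
  const id = vars["id"] ?? "";
  const encoded = id.replace(
    /%[0-9A-Fa-f]{2}|[^A-Za-z0-9._~%-]+|%/g,
    (m) => (m.length === 3 && m.charAt(0) === "%" ? m : encodeURIComponent(m)),
  );
  return PREFIX + encoded;
}

// Toy matching: opaque returns raw text, cooked decodes it once.
function toyMatch(url: string, encoding: "opaque" | "cooked"): Vars | null {
  if (!url.startsWith(PREFIX)) return null;
  const raw = url.slice(PREFIX.length);
  return { id: encoding === "opaque" ? raw : decodeURIComponent(raw) };
}

// Byte symmetry: match (opaque) then expand reproduces the URL exactly.
function holdsByteSymmetry(url: string): boolean {
  const vars = toyMatch(url, "opaque");
  return vars !== null && toyExpand(vars) === url;
}

// Semantic symmetry: expand then match (cooked) recovers the same value.
function holdsSemanticSymmetry(vars: Vars): boolean {
  const matched = toyMatch(toyExpand(vars), "cooked");
  return matched !== null && matched["id"] === vars["id"];
}
```

In this toy model, byte symmetry holds even for lowercase hex digits such as `/users/a%2fb`, because opaque matching never rewrites the captured triplets and expansion copies them through.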
|
|
||
| [^1]: <https://www.rfc-editor.org/rfc/rfc6570.html#section-2.3> | ||
| [^2]: <https://www.rfc-editor.org/rfc/rfc6570.html#section-3.2.5> | ||
| [^3]: <https://www.rfc-editor.org/rfc/rfc6570.html#section-1.4> (page 10) | ||