Feature/rumil tokens 0.6.0 #1
Merged
hakimjonas merged 4 commits into main on Apr 30, 2026
Zero-width parser that yields the current byte offset, enabling span capture via the existing Zip combinator:

    final spanned = position().zip(p).zip(position());

Discovered as an ergonomic gap while building Doxa: surface AST nodes need source spans for Phase 6 error messages, and Rumil previously offered offsets only via ParseError on parse failures.

One new sealed case (GetPosition), one interpreter branch, one public primitive. No effect on existing parsers.
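To make the span-capture idiom concrete, here is a minimal, self-contained sketch. It does not use Rumil. The `Parser` typedef, the `letters()` helper, and the `zip` extension are stand-ins written for illustration, and Rumil's real types and interpreter may differ. Offsets here index the Dart `String` directly, which is also an assumption.

```dart
// Stand-in mini parser library for illustration; NOT Rumil's implementation.
// A parser maps (input, offset) to (value, nextOffset), or null on failure.
typedef Parser<A> = (A, int)? Function(String input, int offset);

/// Zero-width parser: consumes nothing and yields the current offset.
Parser<int> position() => (input, offset) => (offset, offset);

bool _isLetter(int c) => (c >= 65 && c <= 90) || (c >= 97 && c <= 122);

/// Hypothetical demo parser: one or more ASCII letters.
Parser<String> letters() => (input, offset) {
      var end = offset;
      while (end < input.length && _isLetter(input.codeUnitAt(end))) {
        end++;
      }
      return end == offset ? null : (input.substring(offset, end), end);
    };

extension Zip<A> on Parser<A> {
  /// Runs this parser, then [other], and pairs both results.
  Parser<(A, B)> zip<B>(Parser<B> other) {
    final self = this;
    return (input, offset) {
      final first = self(input, offset);
      if (first == null) return null;
      final second = other(input, first.$2);
      if (second == null) return null;
      return ((first.$1, second.$1), second.$2);
    };
  }
}

void main() {
  // Same shape as the commit message: position, then p, then position.
  final spanned = position().zip(letters()).zip(position());
  final result = spanned('  let', 2);
  final ((start, text), end) = result!.$1;
  print('$text @ [$start, $end)'); // let @ [2, 5)
}
```

Because `position()` consumes no input, wrapping a parser this way does not change what the parser accepts. It only adds the start and end offsets to the result.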
Bumps every package to 0.6.0 in lockstep, with one first publish
(rumil_tokens) and version-only alignment for the rest.
Packages:
- rumil 0.5.0 -> 0.6.0 (additive: position() primitive
shipped in the preceding commit)
- rumil_codec 0.5.0 -> 0.6.0 (version-only)
- rumil_codec_builder 0.5.0 -> 0.6.0 (version-only; bumps deps on
rumil_codec and rumil_parsers)
- rumil_parsers 0.5.0 -> 0.6.0 (version-only; rumil dep ^0.6.0)
- rumil_expressions 0.5.0 -> 0.6.0 (version-only; rumil dep ^0.6.0)
- rumil_tokens 0.6.0 (first publish)
rumil_tokens 0.6.0 highlights:
Spans:
- tokenizeSpans(source, grammar) -> List<Spanned<Token>> with byte
offsets. Spans are half-open, contiguous, anchored to
[0, source.length).
- Spanned<T extends Token> extension type over (T, int, int).
- tokenize() is a wrapper over tokenizeSpans(); behaviour unchanged.
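The span invariants above (half-open, contiguous, anchored to [0, source.length)) can be sketched as follows. This assumes Dart 3.3 or later for extension types. The `Token` subclasses, the representation name `raw`, the getter names, and `tilesSource` are all hypothetical; only the `Spanned<T extends Token>` shape over `(T, int, int)` comes from the notes.

```dart
// Hypothetical stand-ins; the real Token hierarchy lives in rumil_tokens.
sealed class Token {
  const Token(this.text);
  final String text;
}

final class Identifier extends Token {
  const Identifier(super.text);
}

final class Whitespace extends Token {
  const Whitespace(super.text);
}

/// Wrapper over a (token, start, end) record. Getter names are assumptions.
extension type Spanned<T extends Token>((T, int, int) raw) {
  T get token => raw.$1;
  int get start => raw.$2; // inclusive
  int get end => raw.$3; // exclusive
}

/// True when the spans tile [0, sourceLength) with no gaps or overlaps.
bool tilesSource(List<Spanned<Token>> spans, int sourceLength) {
  var cursor = 0;
  for (final s in spans) {
    if (s.start != cursor || s.end < s.start) return false;
    cursor = s.end;
  }
  return cursor == sourceLength;
}

void main() {
  const source = 'foo bar';
  final spans = <Spanned<Token>>[
    Spanned((const Identifier('foo'), 0, 3)),
    Spanned((const Whitespace(' '), 3, 4)),
    Spanned((const Identifier('bar'), 4, 7)),
  ];
  print(tilesSource(spans, source.length)); // true
}
```

With contiguous half-open spans, each span's `end` equals the next span's `start`, so the original source can be rebuilt by concatenating `source.substring(start, end)` for every span.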
Token vocabulary:
- New Operator token class, separate from Punctuation. Value-computing
operators (+, *, ==, &&, =>) are Operator; structural delimiters
are Punctuation.
- New Variable token class for shell $NAME, ${...} references.
Grammar API additions (all opt-in, backwards-compatible):
- operatorChars / multiCharOperators (explicit multi-char vocabulary,
matched longest-first)
- identifiersAllowDollar
- rawStringPrefix
- identifierStringPrefix
- backtickIdentifiers
- shellVariables
- backtickCommandSubstitution
- heredocs
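As an illustration of the longest-first rule for multi-character operators, here is a minimal sketch. The function `matchOperator` and its parameter shapes are hypothetical. They mirror the roles of `multiCharOperators` and `operatorChars`, but they are not the package's API.

```dart
/// Returns the operator starting at [offset], or null if none matches.
/// Parameter shapes are assumptions made for this sketch.
String? matchOperator(
  String source,
  int offset,
  List<String> multiChar,
  String singleChars,
) {
  // Try longer candidates first so '=>' and '==' win over a bare '='.
  final byLength = [...multiChar]
    ..sort((a, b) => b.length.compareTo(a.length));
  for (final op in byLength) {
    if (source.startsWith(op, offset)) return op;
  }
  // Fall back to a single operator character.
  if (offset < source.length && singleChars.contains(source[offset])) {
    return source[offset];
  }
  return null;
}

void main() {
  const multi = ['==', '=>', '&&', '??', '?.'];
  const single = '+-*/=<>!&|?';
  print(matchOperator('a => b', 2, multi, single)); // =>
  print(matchOperator('a = b', 2, multi, single)); // =
}
```

Trying an explicit vocabulary longest-first keeps `=>` from being split into `=` followed by `>`, while a lone `=` still matches through the single-character fallback.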
Grammar fixes:
- Dart: raw strings r'...' as one StringLit; generics < > as
Punctuation; ?? and ?. recognized as operators.
- Scala: backtick identifiers as one Identifier; interpolator prefix
strings s"..." / f"..." as one StringLit.
- JSON: negative numbers as one NumberLit.
- YAML: flow collections as Punctuation; YAML 1.1 legacy keywords
dropped.
- Shell: $HOME / ${PATH} / $(cmd) as Variable; heredocs (<<EOF,
<<-EOF, <<'EOF') as one StringLit.
Documentation and release polish:
- rumil_tokens is the one package with rich documentation in this
release; siblings get short version-aligned entries.
- All 0.6.0 documentation authored without em-dashes, marketing
adjectives, or bold-emphasis styling.
- Comments in source files stripped to invariant-describing and
contract-describing lines only.
- location.dart dartdoc: unresolved [input] reference fixed.
Release gate across all six packages:
- Format clean.
- Analyze --fatal-infos clean.
- Dart doc --dry-run clean.
- Tests: rumil 95, rumil_codec 85, rumil_codec_builder 13,
rumil_parsers 1177, rumil_expressions 57, rumil_tokens 205.
Total 1632.
Extends all four CI jobs to cover the new rumil_tokens package:
- Write pubspec_overrides.yaml for rumil_tokens (path-override on rumil) in every job that sets up overrides.
- Add rumil_tokens to the install-dependencies loop.
- Add rumil_tokens to the analyze loop and the format arg list.
- Add a `Test rumil_tokens` step to the test job.
- Add rumil_tokens to the doc job's validate-links loop.
Without this, CI would silently not test rumil_tokens even though it is the package with the most new work in 0.6.0.
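For reference, a path override of the kind described above could look like the fragment below. The file location and the relative path `../rumil` are assumptions about the monorepo layout, not taken from the PR.

```yaml
# rumil_tokens/pubspec_overrides.yaml (location and relative path are assumptions)
dependency_overrides:
  rumil:
    path: ../rumil
```

With this file in place, pub resolves `rumil` from the local checkout instead of the hosted version, so CI exercises rumil_tokens against the in-tree rumil.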
Marks rumil_tokens as internal to this monorepo. It remains fully
tested and developed here, but is not released to pub.dev.
Changes:
- pubspec.yaml: publish_to: none; version 0.1.0; drop the pub.dev
description and topics fields.
- CHANGELOG.md: collapse to a single 0.1.0 "initial in-tree cut"
entry. The prior 0.5.0/0.6.0 entries assumed a pub.dev debut that
is not happening.
- README.md: lead with a "not published" status note so readers
don't expect it on pub.dev.
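The pubspec.yaml change listed above amounts to a fragment like the following sketch. Only the three fields named in the commit message are shown, and the rest of the file is omitted.

```yaml
# rumil_tokens/pubspec.yaml (fragment; remaining fields omitted)
name: rumil_tokens
version: 0.1.0
publish_to: none
```

Setting `publish_to: none` makes `dart pub publish` refuse to upload the package, which guards against an accidental release.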
CI coverage retained: rumil_tokens still runs through the monorepo
CI's analyze / format / test / doc jobs.
Release cascade to pub.dev now excludes rumil_tokens:
rumil 0.6.0
rumil_codec 0.6.0
rumil_parsers 0.6.0 (depends on rumil ^0.6.0)
rumil_expressions 0.6.0 (depends on rumil ^0.6.0)
rumil_codec_builder 0.6.0 (depends on rumil_codec ^0.6.0,
rumil_parsers ^0.6.0)
6b19867 to ad06ad4