Feature/rumil tokens 0.6.0 #1

Merged
hakimjonas merged 4 commits into main from feature/rumil-tokens-0.6.0
Apr 30, 2026

@hakimjonas
Owner

No description provided.

Zero-width parser that yields the current byte offset, enabling
span capture via the existing Zip combinator:

  final spanned = position().zip(p).zip(position());

Discovered as an ergonomic gap while building Doxa — surface AST
nodes need source spans for Phase 6 error messages, and Rumil
previously offered offsets only via ParseError on parse failures.

One new sealed case (GetPosition), one interpreter branch, one
public primitive. No effect on existing parsers.
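
A minimal sketch of the span-capture idiom, written against a hypothetical mini-combinator rather than Rumil's API. The names Ok, Parser, literal and the Zip extension are assumptions for illustration; only the shape position().zip(p).zip(position()) comes from the commit message. Offsets here are Dart String indices.

```dart
// Hypothetical mini-combinator, NOT Rumil's API. It only illustrates how a
// zero-width position parser composes with zip to capture a span.

/// A successful parse: the value produced and the offset after it.
typedef Ok<A> = ({A value, int next});

/// A parser maps (input, offset) to a result, or null on failure.
typedef Parser<A> = Ok<A>? Function(String input, int offset);

/// Zero-width: consumes nothing and yields the current offset.
Parser<int> position() => (input, offset) => (value: offset, next: offset);

/// Matches [text] literally at the current offset.
Parser<String> literal(String text) => (input, offset) =>
    input.startsWith(text, offset)
        ? (value: text, next: offset + text.length)
        : null;

extension Zip<A> on Parser<A> {
  /// Runs this parser, then [other], and pairs the two values.
  Parser<(A, B)> zip<B>(Parser<B> other) {
    final self = this;
    return (input, offset) {
      final a = self(input, offset);
      if (a == null) return null;
      final b = other(input, a.next);
      if (b == null) return null;
      return (value: (a.value, b.value), next: b.next);
    };
  }
}

void main() {
  final spanned = position().zip(literal('let')).zip(position());
  final result = spanned('  let x', 2);
  final ((start, word), end) = result!.value;
  print('$word spans [$start, $end)'); // let spans [2, 5)
}
```

Because position() consumes no input, wrapping a parser this way leaves what it accepts unchanged and only adds the start and end offsets to its result.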

Bumps every package to 0.6.0 in lockstep, with one first publish
(rumil_tokens) and version-only alignment for the rest.

Packages:
- rumil           0.5.0 -> 0.6.0   (additive: position() primitive
                                     shipped in the preceding commit)
- rumil_codec     0.5.0 -> 0.6.0   (version-only)
- rumil_codec_builder 0.5.0 -> 0.6.0  (version-only; bumps deps on
                                     rumil_codec and rumil_parsers)
- rumil_parsers   0.5.0 -> 0.6.0   (version-only; rumil dep ^0.6.0)
- rumil_expressions 0.5.0 -> 0.6.0 (version-only; rumil dep ^0.6.0)
- rumil_tokens    0.6.0              (first publish)

rumil_tokens 0.6.0 highlights:

Spans:
- tokenizeSpans(source, grammar) -> List<Spanned<Token>> with byte
  offsets. Spans are half-open and contiguous, and together they
  cover [0, source.length).
- Spanned<T extends Token> extension type over (T, int, int).
- tokenize() is a wrapper over tokenizeSpans(); behaviour unchanged.
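
The span contract above can be sketched as follows. This is an illustration under stated assumptions, not the rumil_tokens source: Token, Word, Space, the toy whitespace tokenizer and coversSource are hypothetical stand-ins, the grammar argument is omitted, and offsets are Dart String indices.

```dart
// Hypothetical stand-ins; the real rumil_tokens declarations may differ.
sealed class Token {
  const Token(this.text);
  final String text;
}

final class Word extends Token {
  const Word(super.text);
}

final class Space extends Token {
  const Space(super.text);
}

/// A token plus its half-open span [start, end), stored as a record.
extension type Spanned<T extends Token>((T, int, int) _r) {
  T get token => _r.$1;
  int get start => _r.$2;
  int get end => _r.$3;
}

/// Toy tokenizer: splits into runs of spaces and runs of non-spaces.
List<Spanned<Token>> tokenizeSpans(String source) {
  final out = <Spanned<Token>>[];
  var i = 0;
  while (i < source.length) {
    final isSpace = source[i] == ' ';
    var j = i + 1;
    while (j < source.length && (source[j] == ' ') == isSpace) {
      j++;
    }
    final text = source.substring(i, j);
    final Token token = isSpace ? Space(text) : Word(text);
    out.add(Spanned<Token>((token, i, j)));
    i = j;
  }
  return out;
}

/// Dropping the spans recovers the plain token list.
List<Token> tokenize(String source) =>
    [for (final s in tokenizeSpans(source)) s.token];

/// True when spans are contiguous and together cover [0, length).
bool coversSource(List<Spanned<Token>> spans, int length) {
  var expected = 0;
  for (final s in spans) {
    if (s.start != expected || s.end < s.start) return false;
    expected = s.end;
  }
  return expected == length;
}

void main() {
  const source = 'ab  c';
  final spans = tokenizeSpans(source);
  for (final s in spans) {
    print('"${s.token.text}" [${s.start}, ${s.end})');
  }
  print(coversSource(spans, source.length)); // true
}
```

Contiguity means every character of the source belongs to exactly one token, so a consumer such as a syntax highlighter can rebuild the input by slicing the source at each span.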

Token vocabulary:
- New Operator token class, separate from Punctuation. Value-computing
  operators (+, *, ==, &&, =>) are Operator; structural delimiters
  are Punctuation.
- New Variable token class for shell $NAME and ${...} references.

Grammar API additions (all opt-in, backwards-compatible):
- operatorChars / multiCharOperators (explicit multi-char vocabulary,
  matched longest-first)
- identifiersAllowDollar
- rawStringPrefix
- identifierStringPrefix
- backtickIdentifiers
- shellVariables
- backtickCommandSubstitution
- heredocs
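
A rough sketch of what "matched longest-first" means for the multi-character operator vocabulary. matchOperator is a hypothetical helper, and the parameter types (a List<String> and a String of single characters) are assumptions; only the option names and the longest-first rule come from the list above.

```dart
/// Returns the operator starting at [offset], or null if none does.
/// Hypothetical helper: the parameter names mirror the grammar options,
/// but the real matching code in rumil_tokens may differ.
String? matchOperator(
  String source,
  int offset, {
  required List<String> multiCharOperators,
  required String operatorChars,
}) {
  if (offset >= source.length) return null;
  // Sort a copy by descending length so '??=' is tried before '??'.
  final candidates = [...multiCharOperators]
    ..sort((a, b) => b.length.compareTo(a.length));
  for (final op in candidates) {
    if (source.startsWith(op, offset)) return op;
  }
  // Fall back to a single operator character.
  final ch = source[offset];
  return operatorChars.contains(ch) ? ch : null;
}

void main() {
  const multi = ['=>', '==', '&&', '??', '?.', '??='];
  const chars = '+-*/=<>!&|?';
  print(matchOperator('a ??= b', 2,
      multiCharOperators: multi, operatorChars: chars)); // ??=
  print(matchOperator('a = b', 2,
      multiCharOperators: multi, operatorChars: chars)); // =
  print(matchOperator('a.b', 1,
      multiCharOperators: multi, operatorChars: chars)); // null
}
```

Trying longer candidates first keeps an operator such as ??= from being split into ?? followed by =.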

Grammar fixes:
- Dart: raw strings r'...' as one StringLit; generics < > as
  Punctuation; ?? and ?. recognized as operators.
- Scala: backtick identifiers as one Identifier; interpolator prefix
  strings s"..." / f"..." as one StringLit.
- JSON: negative numbers as one NumberLit.
- YAML: flow collections as Punctuation; YAML 1.1 legacy keywords
  dropped.
- Shell: $HOME / ${PATH} / $(cmd) as Variable; heredocs (<<EOF,
  <<-EOF, <<'EOF') as one StringLit.
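
To illustrate why a heredoc has to be scanned as a unit, here is a minimal sketch that finds the end of one, covering the three introducer forms named above. heredocEnd is a hypothetical helper, not the rumil_tokens implementation, and it assumes \w+ delimiters and \n line endings.

```dart
/// Returns the offset just past the heredoc whose '<<' introducer sits at
/// [start], or null if no complete heredoc begins there.
/// Hypothetical helper; assumes \w+ delimiters and '\n' line endings.
int? heredocEnd(String source, int start) {
  // Group 1: optional '-'. Group 2: quoted tag. Group 3: bare tag.
  final intro = RegExp(r"<<(-?)(?:'(\w+)'|(\w+))[^\n]*\n");
  final m = intro.matchAsPrefix(source, start);
  if (m == null) return null;
  final stripTabs = m.group(1) == '-';
  final tag = m.group(2) ?? m.group(3)!;
  var lineStart = m.end;
  while (lineStart <= source.length) {
    final nl = source.indexOf('\n', lineStart);
    final lineEnd = nl == -1 ? source.length : nl;
    var line = source.substring(lineStart, lineEnd);
    // With '<<-', leading tabs before the terminator are ignored.
    if (stripTabs) line = line.replaceFirst(RegExp(r'^\t+'), '');
    if (line == tag) return lineEnd;
    if (nl == -1) return null; // ran out of input before the terminator
    lineStart = nl + 1;
  }
  return null;
}

void main() {
  const script = 'cat <<EOF\nhello\nEOF\necho done';
  final end = heredocEnd(script, 4);
  // The whole range [4, end) would become a single StringLit token.
  print(script.substring(4, end!) == '<<EOF\nhello\nEOF'); // true
}
```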

Documentation and release polish:
- rumil_tokens is the one package with rich documentation in this
  release; siblings get short version-aligned entries.
- All 0.6.0 documentation authored without em-dashes, marketing
  adjectives, or bold-emphasis styling.
- Comments in source files stripped to invariant-describing and
  contract-describing lines only.
- location.dart dartdoc: unresolved [input] reference fixed.

Release gate across all six packages:
- dart format: clean.
- dart analyze --fatal-infos: clean.
- dart doc --dry-run: clean.
- Tests: rumil 95, rumil_codec 85, rumil_codec_builder 13,
  rumil_parsers 1177, rumil_expressions 57, rumil_tokens 205.
  Total 1632.

Extends all four CI jobs to cover the new rumil_tokens package:
- Write pubspec_overrides.yaml for rumil_tokens (path-override on
  rumil) in every job that sets up overrides.
- Add rumil_tokens to the install-dependencies loop.
- Add rumil_tokens to the analyze loop and the format arg list.
- Add a `Test rumil_tokens` step to the test job.
- Add rumil_tokens to the doc job's validate-links loop.

Without this, CI would silently skip rumil_tokens, even though it
is the package with the most new work in 0.6.0.

Marks rumil_tokens as internal to this monorepo. It remains fully
tested and developed here, but is not released to pub.dev.

Changes:
- pubspec.yaml: publish_to: none; version 0.1.0; drop the pub.dev
  description and topics fields.
- CHANGELOG.md: collapse to a single 0.1.0 "initial in-tree cut"
  entry. The prior 0.5.0/0.6.0 entries assumed a pub.dev debut that
  is not happening.
- README.md: lead with a "not published" status note so readers
  don't expect it on pub.dev.

CI coverage retained: rumil_tokens still runs through the monorepo
CI's analyze / format / test / doc jobs.

Release cascade to pub.dev now excludes rumil_tokens:
  rumil 0.6.0
  rumil_codec 0.6.0
  rumil_parsers 0.6.0  (depends on rumil ^0.6.0)
  rumil_expressions 0.6.0  (depends on rumil ^0.6.0)
  rumil_codec_builder 0.6.0  (depends on rumil_codec ^0.6.0,
                              rumil_parsers ^0.6.0)

hakimjonas force-pushed the feature/rumil-tokens-0.6.0 branch from 6b19867 to ad06ad4 on April 30, 2026 16:58
hakimjonas merged commit a53f418 into main on Apr 30, 2026
4 checks passed