Macro fragment fields #3714

joshtriplett · 2024-10-20T20:08:21Z

Add a syntax and mechanism for macros to access "fields" of high-level fragment
specifiers that they've matched, to let macros use the Rust parser for
robustness and future compatibility, while still extracting pieces of the
matched syntax.

This RFC introduces the syntax ${fragname.field}, and a couple of fragment specifiers and their fields. The goal is to add more such fragment specifiers and fields, to allow more macros to leverage the Rust parser, but the purpose of this RFC is to introduce the concept and syntax.
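To make the motivation concrete, here is a minimal sketch of how a macro has to pull out a function's name on stable Rust today, by spelling out part of the function grammar by hand. The macro name `fn_name` and the helper `demo_fn_name` are hypothetical, and the `${f.name}` form in the comment is only the syntax this RFC proposes, which no current compiler accepts.

```rust
// Hand-written matcher: it only understands an optional attribute list, an
// optional visibility, and then `fn NAME`. Qualifiers such as `const`,
// `async`, or `unsafe` before `fn` are NOT handled, which is the kind of
// fragility the RFC wants to avoid.
macro_rules! fn_name {
    ($(#[$meta:meta])* $vis:vis fn $name:ident $($rest:tt)*) => {
        stringify!($name)
    };
}

// Under this RFC the same macro might instead be written roughly as
//     ($f:fn) => { stringify!(${f.name}) };
// (proposed syntax only; shown as a comment because it does not compile today).

// Hypothetical helper showing a call site.
fn demo_fn_name() -> &'static str {
    fn_name!(pub fn add(a: u32, b: u32) -> u32 { a + b })
}
```

The hand-written arm has to grow every time the function grammar gains a new prefix, whereas a `:fn` matcher would let the compiler's parser do that work.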

Rendered

text/3714-macro-fragment-fields.md

nikomatsakis · 2024-10-21T23:23:11Z

Oh, I like this. Cute idea.

joshtriplett · 2024-10-22T01:38:09Z

Nominating this (and related RFCs) for discussion, to decide whether we can process it asynchronously or whether we need a design meeting.

matthieu-m · 2024-10-24T17:38:59Z

I think an important discussion to be had will be whether it's okay for fields to "generate" tokens.

The RFC itself already proposes that fn.return_type materializes a () AST type node ex nihilo, and the discussion on param notes that &self could simply be materialized as self: &Self, thus fitting the larger pat: ty pattern.

Unless the plan is to drop fn.return_type from this RFC, hamstringing fn, I believe the lang/compiler teams should come to a consensus on the policy here:

Is sticking to the source more important? In which case fn.return_type should be a ty?.
Is simplification, if semantically equivalent, preferable?
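As a rough illustration of the two options, this is how a macro on stable Rust has to deal with the missing return type today: one arm for an explicit `-> T` and one arm that synthesizes `()` itself. The macro name `return_type_of` is hypothetical, and the sketch ignores generics, attributes, and qualifiers.

```rust
// Sketch only: reports the return type of a simple function definition as text.
macro_rules! return_type_of {
    // The source spells out a return type: report it as written.
    (fn $name:ident ( $($params:tt)* ) -> $ret:ty $body:block) => {
        stringify!($ret)
    };
    // No return type in the source: the macro itself synthesizes `()`,
    // tokens that never appeared in the input.
    (fn $name:ident ( $($params:tt)* ) $body:block) => {
        stringify!(())
    };
}
```

If fn.return_type were an optional ty, every consumer would still need this second case; if the field always yields a type, the synthesis happens once, in the compiler.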

Another discussion which may be necessary is pinning down exactly which types the fields should have.

Unless macro-rules are significantly complicated by allowing subtyping in the future, for now, types are final.

For example, using the pat: ty syntax for a function parameter may seem more favorable than adding an ad-hoc fragment type. Okay, &self is a bit weird, but it can be matched as self: &Self so all good?

The problem, though, is that suddenly:

How do you attach attributes?
How do you extend the parameter syntax to allow splat for variadic generics (e.g. name...: T...)? Is splat shoehorned into the pat/ty?

Keen readers may notice that C-variadics ... would already be a problem, but please bear with me here. Finding examples is hard.

The conservative choice, it seems to me, would be to err on the side of introducing fragment types more often than not, even if in the meantime they end up being functionally equivalent to another fragment type (or a set thereof).

Note: editions may help here, but any change risks introducing breakage so... it may be best to think of editions as a last resort rather than as the default way.
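For context on the receiver case discussed above, here is a small sketch of what a macro on stable Rust has to do today: `&self` does not fit a `name: Type` matcher, so it needs its own arm, and the macro decides on its own to report it as `&Self`. The macro name `param_types` is hypothetical, and the sketch only accepts plain identifier patterns (no attributes, no `...`, no destructuring).

```rust
// Sketch only: collects the parameter types of a parameter list as strings.
macro_rules! param_types {
    // Receiver case: `&self` carries no explicit type, so this arm
    // synthesizes "&Self" for it before listing the remaining types.
    (&self $(, $name:ident : $ty:ty)* $(,)?) => {
        vec!["&Self" $(, stringify!($ty))*]
    };
    // Ordinary case: every parameter is written as `name: Type`.
    ($($name:ident : $ty:ty),* $(,)?) => {
        vec![$(stringify!($ty)),*]
    };
}
```

Whether such a special case lives in each macro or behind a dedicated fragment type is exactly the design question raised in this comment.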

joshtriplett · 2024-10-24T22:52:25Z

@matthieu-m wrote:

> Unless macro-rules are significantly complicated by allowing subtyping in the future, for now, types are final.

I don't think this is the case.

Today, you can write a macro that matches the same tokens several different ways. And I think we could, for instance, present param as pat: ty today, and later present it as a param type containing the same tokens. I don't think that would add any complexity or compatibility issues.

We can also add fields to existing fragment specifiers, without breaking compatibility.

There are compatibility considerations we have to take care with, and we may need to introduce new fragment specifiers in the future to handle those; for instance, if we make a field required and it later becomes optional, we might have to introduce a new fragment specifier with it optional.

But I don't think switching the type of a field (e.g. to a newly created fragment specifier) would break compatibility as long as it contains the same tokens.
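The claim that the same tokens can already be matched in several different ways can be checked on stable Rust. In this minimal sketch (both macro names are hypothetical), the tokens `1 + 2` are accepted once as a single `expr` fragment and once as a plain sequence of token trees.

```rust
// View 1: the input is parsed by the compiler as one expression fragment.
macro_rules! as_expr {
    ($e:expr) => {
        $e
    };
}

// View 2: the very same input is matched as individual token trees,
// and the macro reports how many there are.
macro_rules! as_tokens {
    ($($t:tt)*) => {{
        let toks: &[&str] = &[$(stringify!($t)),*];
        toks.len()
    }};
}
```

Neither macro constrains how the other sees the input, which is the property the argument above relies on when it says a field's fragment type could change later as long as the tokens stay the same.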

vincenzopalazzo

LGTM otherwise

jhpratt

Glad to see someone materializing the idea I've had floating around for a while.

For the purpose of avoiding RFCs for future fields, I believe it would be best to explicitly grant T-lang the ability to decide this of their own volition.

jhpratt · 2024-12-02T20:24:21Z

text/3714-macro-fragment-fields.md

+
+- `:fn`: A function definition (including body).
+  - `name`: The name of the function, as an `ident`.
+  - `param`: The parameters of the function, presented as though captured by a


What is the "type" of this field?

joshtriplett · 2024-12-02

If we add the macro fragment param, then each repetition will have type param; until then, each repetition looks like pat_param: ty. (Handwaving the ... case here.)

joshtriplett · 2024-12-02T21:16:15Z

> For the purpose of avoiding RFCs for future fields, I believe it would be best to explicitly grant T-lang the ability to decide this of their own volition.

I added an unresolved question about whether we should develop a lighter-weight process/policy for approving these, and whether we should delegate them to another team (e.g. wg-macros).

veluca93 · 2025-01-18T20:33:11Z

IMO this is a great feature - giving macros access to parts of high-level fragments massively simplifies the job of people writing macros, and makes robust, future-proof declarative macros significantly easier to write.

safinaskar · 2025-02-01T18:21:58Z

I don't like this RFC. Yes, I totally agree that the compiler's parser should be exposed to decl. macros, i.e. macros should not reinvent their own parsing. But I believe that this parser should be exposed in a more generic way (as opposed to your opinionated ad-hoc way). We should give decl. macros the full grammar of Rust, or at least some big part of it. That is, we should assign fixed names to all (or to many) Rust AST nodes and productions and give decl. macros the ability to extract nodes.

This is how get_name macro would be written:

macro_rules! get_name {
  // struct struct
  (struct $i:ident $($g:GenericParams)? $($w:WhereClause)? { $($s:StructFields)? }) => { stringify!($i) };
  (struct $i:ident $($g:GenericParams)? $($w:WhereClause)? ;) => { stringify!($i) };

  // tuple struct
  (struct $i:ident $($g:GenericParams)? ( $($t:TupleFields)? ) $($w:WhereClause)? ;) => { stringify!($i) };

  // enums and unions are left as exercise to a reader :)
}

Here I took names for AST nodes from Rust reference.

Yes, you may say this is verbose. But this allows us to get access to the AST in a generic way. Of course, some simplifications can be developed in 3rd party crates based on this feature. Say, some crate can expose a macro for getting the name of any ADT in a single macro call.

Also, of course, it will be beneficial to give the same AST access to proc macros, i.e. proc macros will not need to rely on 3rd party crates, such as syn, for the actual parsing of Rust.

But to do this we should first ensure that proc macros and decl. macros see the same lexical syntax. As far as I understand, this is currently not the case.

joshtriplett · 2025-02-01T20:34:29Z

@safinaskar The problem with exposing the full Rust AST to macros is that the Rust AST evolves over time. One of the major goals of this RFC is to allow macros to take advantage of the compiler's knowledge of the Rust language while still allowing the macros to ignore things they don't understand.

The way you wrote the get_name macro requires the macro author to handle all the productions of struct/tuple/union/enum/etc. And if Rust adds more, the macro will break. That's true whether that parsing lives in a separate crate or is written directly. For that matter, we don't have any good structured way for macros in a separate crate to provide things like the name of an ADT without parsing the whole ADT, and then a macro to return some other part of the ADT will also have to parse the whole ADT; we don't have a way to parse it once and return all the components in a convenient way.

Macro fragment fields solve that problem.

Also, note from the RFC that the intention is to expand the set of fields; the RFC just specifies a very minimal set of fields to prove the concept.
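The disagreement about what happens when the input uses a production the macro does not know can be seen with today's fragment specifiers. This is a minimal sketch on stable Rust, not the `get_name` from the comment above (that one uses matchers such as `GenericParams` that do not exist): it handles only two struct forms and adds a catch-all arm returning `None`, so the unhandled cases show up as values instead of compile errors.

```rust
// Sketch only: extracts the name of a struct by matching its syntax by hand.
macro_rules! get_name {
    // Braced struct without generics or a where clause.
    (struct $i:ident { $($body:tt)* }) => {
        Some(stringify!($i))
    };
    // Unit struct.
    (struct $i:ident ;) => {
        Some(stringify!($i))
    };
    // Anything this macro was not written for: generics, tuple structs,
    // enums, or any syntax added to the language after the macro was written.
    ($($other:tt)*) => {
        None::<&str>
    };
}
```

The name is present in every one of these inputs, yet the macro can only reach it through productions it spells out itself; that gap is what both positions in this exchange are trying to close.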

safinaskar · 2025-02-01T22:55:57Z

@joshtriplett

> And if Rust adds more, the macro will break

No. The macro will still handle old syntax. It just will not understand new productions. And this is okay. Author of that macro will need to release new version, which will handle missing productions. This is similar to how syn-based proc macros work.

Also: Rust sometimes adds new productions, but it usually does not remove them (except across an edition boundary). So we totally can expose the full AST.

Also, the syn crate needed only one breaking release in 3.5 years! (See https://github.com/dtolnay/syn/releases/tag/2.0.0 ). This is rarer than new Rust editions. This proves that we totally can maintain an AST exposed to decl. macros.

safinaskar · 2025-02-01T23:08:41Z

Note: the task of the syn author was even harder than ours, because he supports not only stable syntax but also unstable syntax. For example, he supported the box x syntax in syn 1.x (and removed it in syn 2.x). But we can support stable syntax only. And yet syn needed only one breaking release in 3.5 years.

programmerjake · 2025-02-01T23:51:00Z

one other problem with just having macro matchers for each ast thing and then writing out rust's syntax explicitly in the macro pattern is that iirc macro_rules isn't powerful enough to fully parse Rust's syntax, since Rust needs some lookahead (iirc 3 tokens), but afaik macro_rules simply don't support that much lookahead.

Plus, just trying to match rust's syntax isn't enough for ergonomic macros, since if you need to access some interior part of the input (e.g. field names from an enum), you have to write out the full syntax until it gets down to the level of the field names whereas with macro fragment fields it's quite trivial to write (maybe like ($a:enum) => ($($(${$a.variants.fields})*)*))
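For comparison, this is roughly what reaching an interior part of the input looks like on stable Rust today. The sketch below (the macro name `variant_names` is hypothetical) lists the variant names of an enum, but only by writing out the surrounding syntax, and it only accepts unit variants without attributes, generics, or discriminants.

```rust
// Sketch only: the matcher has to spell out the enum syntax down to the
// level of the variant names it wants to extract.
macro_rules! variant_names {
    (enum $name:ident { $($variant:ident),* $(,)? }) => {
        [$(stringify!($variant)),*]
    };
}
```

Supporting tuple variants, struct variants, or attributes would each need more matcher syntax, which is the cost the fragment-field form sketched in the comment above is meant to remove.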

safinaskar · 2025-02-02T00:28:25Z

@programmerjake

> one other problem with just having macro matchers for each ast thing and then writing out rust's syntax explicitly in the macro pattern is that iirc macro_rules isn't powerful enough to fully parse Rust's syntax, since Rust needs some lookahead (iirc 3 tokens), but afaik macro_rules simply don't support that much lookahead.

When you say "macro_rules isn't powerful enough to fully parse Rust's syntax", do you mean parsing Rust with hand-written macro_rules code? That is not what we are talking about. In my approach, the code will be parsed by the compiler, of course.

> Plus, just trying to match rust's syntax isn't enough for ergonomic macros, since if you need to access some interior part of the input (e.g. field names from an enum), you have to write out the full syntax until it gets down to the level of the field names whereas with macro fragment fields it's quite trivial to write (maybe like ($a:enum) => ($($(${$a.variants.fields})*)*))

All ergonomics improvements can be put to 3rd party crates. That is, the compiler should provide the core AST, and improvements like "extract enum fields in a simple way" should go to crates.io. Yes, this will probably require https://docs.rs/tt-call/latest/tt_call/ weirdness or something similar for passing resulting list of fields to user code. But I still believe that supporting the full (or almost full) AST is an aesthetically good approach.

Full AST approach is in line with long-standing goal of establishing the Rust grammar ( https://github.com/rust-lang/wg-grammar ). This RFC simply adds an alternative feature instead, which will never replace a proper AST.

joshtriplett · 2025-02-03T20:52:36Z

> No. The macro will still handle old syntax. It just will not understand new productions. And this is okay. Author of that macro will need to release new version, which will handle missing productions. This is similar to how syn-based proc macros work.

That is one of several problems this RFC sets out to solve: macros should not have to constantly update so that they can parse new syntax, if what they want to extract is something that existed in the old syntax.

> All ergonomics improvements can be put to 3rd party crates.

They might be able to be, but that doesn't mean they should be.

> Yes, this will probably require https://docs.rs/tt-call/latest/tt_call/ weirdness or something similar for passing resulting list of fields to user code.

Being able to avoid those kinds of hacks is another goal of this RFC.

> Full AST approach is in line with long-standing goal of establishing the Rust grammar

That project is dead and archived, and nobody has stepped up to change that. The Rust spec will likely end up specifying a grammar of Rust, but that doesn't mean it'll be in a programmatic form usable for macros.

In any case, right now it's not clear what you are proposing doing in the compiler. You say this should happen in a "more generic way" and "assign fixed names to all (or to many) Rust AST nodes and productions". That's exactly what this RFC is doing. I intend to use this mechanism to expose more-or-less the entire Rust AST. If you would like to see it exposed in a different way, please make a proposal sketch, beyond "leave it entirely to third-party crates". If your proposal is "leave it entirely to third-party crates", I've added that to the "rationale and alternatives" section as a possibility to be considered when this RFC is evaluated.

Macro fragment fields

a2fc4ac

joshtriplett added the T-lang (Relevant to the language team, which will review and decide on the RFC.) and A-macros (Macro related proposals and issues) labels Oct 20, 2024

RFC 3714

1edb079

programmerjake reviewed Oct 21, 2024

Improve spans for fields without corresponding tokens

7fc82cd

joshtriplett added the I-lang-nominated label (Indicates that an issue has been nominated for prioritizing at the next lang team meeting.) Oct 22, 2024

kennytm reviewed Oct 22, 2024

joshtriplett added 6 commits October 22, 2024 18:58

Rephrase some future work

bbb2dd0

Rephrase explanation of using fragment fields

195f8a9

Define param using repetition, to allow users more flexibility without parsing

5d002f4

Clarify that :fn is a definition, including a body

3e14949

Future work: function declarations

b032d5e

Add more future possibilities

df73c45

eholk reviewed Oct 23, 2024

kennytm mentioned this pull request Oct 23, 2024

Declarative macro_rules! derive macros #3698

Merged

Kijewski reviewed Oct 24, 2024

RalfJung reviewed Oct 24, 2024

RalfJung reviewed Oct 24, 2024

coolreader18 reviewed Oct 24, 2024

joshtriplett and others added 7 commits October 24, 2024 14:57

Fix example

d6a5314

Co-authored-by: René Kijewski <[email protected]>

Future possibilities: function qualifiers like const and async

d0ba412

Hedge a future possibility further

271c9c4

Expand on possible future handling of param

62bf518

Note that adding new fields to an existing matcher is forward-compatible

aacf8ba

Add vis for :adt

2da9937

Discuss synthesis of tokens for fields

69a2c9a

joshtriplett mentioned this pull request Nov 1, 2024

Design meeting on declarative macro improvements rust-lang/lang-team#296

Closed

vincenzopalazzo reviewed Nov 12, 2024

joshtriplett and others added 8 commits November 12, 2024 09:24

More speculative future possibilities

39f750c

Link RFC

2c885c1

Co-authored-by: Vincenzo Palazzo <[email protected]>

Word-wrap after merging suggestion

3980897

Link RFC in more places

935694c

Fix typo

cb7570a

More future possibilities

185b841

Add unresolved question about return_type

225773b

Future possibility: handle structs and tuples uniformly

0ea7ea9

This gives an example of needing to synthesize tokens.

jhpratt reviewed Dec 2, 2024

joshtriplett added 2 commits December 2, 2024 13:14

Add unresolved question about process and delegation

1041307

Wording tweak

bf3dca0

Add backquotes to clarify the type of body

a2f14ab

traviscross added the I-lang-radar label (Items that are on lang's radar and will need eventual work or consideration.) and removed the I-lang-nominated label (Indicates that an issue has been nominated for prioritizing at the next lang team meeting.) Jan 26, 2025

Add null alternative

2afc67e

joshtriplett mentioned this pull request Mar 12, 2025

[RFC] Named macro capture groups #3649

Open

l-Luna mentioned this pull request Apr 18, 2025

Tracking Issue for macro_metavar_expr_concat rust-lang/rust#124225

Open

jhpratt mentioned this pull request Jul 19, 2025

Remove dependency on syn, quote, proc-macro2 to improve compile times jhpratt/powerfmt#1

Open
