@lilnasy
Contributor

@lilnasy lilnasy commented Nov 19, 2025

  • "Slow but working" implementation for Linter plugins: Token-related SourceCode APIs #14829 (comment)
  • Uses the parser from @typescript-eslint/typescript-estree for tokens. The source code is fully parsed and everything but the tokens is discarded.
  • Snapshots are generated in a typescript-eslint project and copied over with only formatting edits.
  • Direct commits from maintainers welcome.
  • Work in progress...

Tasks

  • getTokens (support for options pending)
  • getFirstToken
  • getFirstTokens
  • getLastToken
  • getLastTokens
  • getTokenBefore
  • getTokenOrCommentBefore
  • getTokensBefore
  • getTokenAfter
  • getTokenOrCommentAfter
  • getTokensAfter
  • getTokensBetween
  • getFirstTokenBetween
  • getFirstTokensBetween
  • getLastTokenBetween
  • getLastTokensBetween
  • getTokenByRangeStart
  • isSpaceBetween
  • isSpaceBetweenTokens
  • Move token types to src-js/plugins/tokens.ts
  • Bundling typescript results in unresolved __filename usage. Uncaught ReferenceError: __filename is not defined.
  • Prevent class name mangling by tsdown. Could be deferred.
  • Test regex-divide ambiguous syntax.
  • Parse as JSX only if source text is JSX.
  • Lazy load the parser on demand. Could be deferred.
  • Test for conformance directly against eslint and typescript-eslint. Could be deferred.

Decisions

  • @typescript-eslint/typescript-estree peer-depends on typescript. How should we package it? Currently, typescript is being made an optional dependency.
    • overlookmotel (on discord): ideally bundled, direct runtime dependency otherwise.
  • Deprecated methods are being removed altogether in ESLint 10: getTokenOrCommentBefore, getTokenOrCommentAfter, and isSpaceBetweenTokens. These are surface level deprecations: the functionality was merged with other methods (the includeComments: true option) and plugins can migrate with a one line change. I'm guessing we are targeting fairly modern, actively developed projects. Should we expose them?

@github-actions github-actions bot added A-linter Area - Linter A-cli Area - CLI A-linter-plugins Area - Linter JS plugins labels Nov 19, 2025
@lilnasy lilnasy changed the title Feat/linter/plugins/token methods feat(linter/plugins): Token-related SourceCode APIs (TS ESLint implementation) Nov 19, 2025
@github-actions github-actions bot added the C-enhancement Category - New feature or request label Nov 19, 2025
@lilnasy lilnasy force-pushed the feat/linter/plugins/token-methods branch 2 times, most recently from 3e3a5bb to dd03ef0 Compare November 19, 2025 12:35
@overlookmotel
Member

overlookmotel commented Nov 19, 2025

Deprecated methods are being removed altogether in ESLint 10: getTokenOrCommentBefore, getTokenOrCommentAfter, and isSpaceBetweenTokens. These are surface level deprecations: the functionality was merged with other methods (the includeComments: true option) and plugins can migrate with a one line change.

Cameron and I had a bit of an argument about exactly this question!

We concluded in the end to keep all the deprecated methods, to maximize compatibility with older plugins, which may take some time to get updated (or in some cases, will never get updated). Our rationale is that most of these methods are just aliases, so no maintenance burden to keep them. The only one which is slightly different from its non-deprecated "brother" is isSpaceBetweenTokens, and that's pretty simple to implement - it just treats JSXText differently (if I remember right).

I'm guessing we are targeting fairly modern, actively developed projects.

That's true, but even actively developed projects may use old unmaintained plugins.

Note: I added the stubs in #15645. That PR also added tests which illustrate the difference in behavior between isSpaceBetween and isSpaceBetweenTokens.
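
As a rough illustration of the "just aliases" point above, here is a minimal sketch of how a deprecated method can delegate to its non-deprecated counterpart with `includeComments: true`. The flat `tokens`/`comments` arrays and the simplified signatures are assumptions made for this example only; they are not the code in this PR or in ESLint.

```typescript
// Hypothetical minimal shapes, for illustration only.
interface Token {
  type: string;
  value: string;
  range: [number, number];
}
interface SkipOptions {
  includeComments?: boolean;
}

// Stand-in for the non-deprecated method: finds the last token that ends
// at or before the start of `target`, optionally considering comments too.
function getTokenBefore(
  tokens: Token[],
  comments: Token[],
  target: { range: [number, number] },
  options: SkipOptions = {},
): Token | null {
  const pool = options.includeComments
    ? [...tokens, ...comments].sort((a, b) => a.range[0] - b.range[0])
    : tokens;
  let found: Token | null = null;
  for (const candidate of pool) {
    if (candidate.range[1] <= target.range[0]) found = candidate;
    else break;
  }
  return found;
}

// Deprecated alias: same lookup, with comments always included.
function getTokenOrCommentBefore(
  tokens: Token[],
  comments: Token[],
  target: { range: [number, number] },
): Token | null {
  return getTokenBefore(tokens, comments, target, { includeComments: true });
}

// Source text for the sample data: `a /* c */ b`
const sampleTokens: Token[] = [
  { type: 'Identifier', value: 'a', range: [0, 1] },
  { type: 'Identifier', value: 'b', range: [10, 11] },
];
const sampleComments: Token[] = [{ type: 'Block', value: ' c ', range: [2, 9] }];
const before = getTokenBefore(sampleTokens, sampleComments, sampleTokens[1]);
const beforeWithComments = getTokenOrCommentBefore(sampleTokens, sampleComments, sampleTokens[1]);
```

On the plugin side, the corresponding migration is to replace `sourceCode.getTokenOrCommentBefore(node)` with `sourceCode.getTokenBefore(node, { includeComments: true })`.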

Member

@overlookmotel overlookmotel left a comment

Possibly premature review, as it's still marked as draft, but I'm keen to get it merged so I thought I'd go through it now.

Apart from the comments below, this looks good to me. Sorry for the volume of comments - most are pretty small details.

Once we're happy with getTokens impl, I think we should merge this, and we can add more methods in separate PRs. No need to do the whole API in a single PR.

A couple of points which we can also leave to follow-ups:

  1. We should ideally lazy-load @typescript-eslint/typescript-estree package only when getTokens is first called.

  2. I assume TS-ESLint's parser also generates a ScopeManager. If it does, we may as well cache it, to avoid running scope analysis again if plugin does sourceCode.getTokens() followed by getting sourceCode.scopeManager.
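
The two follow-ups above could be sketched roughly as follows. This is only an illustration under stated assumptions: `lazy`, `getParser`, and `getParseResult` are hypothetical names, and the parser here is a stand-in object rather than the real `@typescript-eslint/typescript-estree` module.

```typescript
// Generic "compute once on first use" helper.
function lazy<T>(init: () => T): () => T {
  let cached: { value: T } | null = null;
  return () => {
    if (cached === null) cached = { value: init() };
    return cached.value;
  };
}

interface ParseResult {
  tokens: string[];
}

let loadCount = 0;

// Stand-in for loading the real parser package. In a real implementation
// this is where the expensive module load would happen (assumption).
const getParser = lazy(() => {
  loadCount++;
  return {
    parse(text: string): ParseResult {
      return { tokens: text.split(/\s+/).filter(Boolean) };
    },
  };
});

// Per-file cache: parse once per source text, then reuse the result.
// A scope manager produced by the same parse could be stored alongside.
let cachedFor: string | null = null;
let cachedResult: ParseResult | null = null;

function getParseResult(sourceText: string): ParseResult {
  if (cachedResult === null || cachedFor !== sourceText) {
    cachedResult = getParser().parse(sourceText);
    cachedFor = sourceText;
  }
  return cachedResult;
}

const loadsBeforeFirstUse = loadCount;
const firstResult = getParseResult('let x = 1');
const secondResult = getParseResult('let x = 1');
```

With this shape, a rule that never asks for tokens never pays for loading the parser, and repeated requests for the same file reuse one parse.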

const ast = parse(sourceText, {
  sourceType: 'module',
  tokens: true,
  jsx: true,
Member

jsx property should be set depending on whether file is JSX/TSX or plain JS/TS.

There are some edge cases where you get a different AST depending on whether file is TS or TSX (it alters the interpretation of < in some places). I don't actually know if it'd alter the tokens either way, but probably better to be safe.

Right now we don't have a way to get whether the source is JSX or not on JS side (the flag is there in the buffer, but it doesn't appear in ESTree spec, so it's not deserialized). But we can add that ability later. Please just add a TODO comment for now.
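
For illustration only, one way the `jsx` option might eventually be derived is from the file extension. This is an assumption for the sake of the example: the comment above notes that the real flag lives in the buffer and is not yet exposed on the JS side, and `shouldParseAsJsx` and `buildParseOptions` are hypothetical helpers, not part of this PR.

```typescript
// Hypothetical heuristic: treat only .jsx and .tsx files as JSX.
// TODO: replace with the real "is JSX" flag once it is available on the JS side.
function shouldParseAsJsx(filename: string): boolean {
  const lower = filename.toLowerCase();
  return lower.endsWith('.jsx') || lower.endsWith('.tsx');
}

// Builds the options object passed to the parser, mirroring the snippet above.
function buildParseOptions(filename: string) {
  return {
    sourceType: 'module' as const,
    tokens: true,
    jsx: shouldParseAsJsx(filename),
  };
}

const tsxOptions = buildParseOptions('component.tsx');
const tsOptions = buildParseOptions('util.ts');
```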

Comment on lines 34 to 114
export interface Token extends Span {
type: string;
export type Token =
| BooleanToken
| CommentToken
| IdentifierToken
| JSXIdentifierToken
| JSXTextToken
| KeywordToken
| NullToken
| NumericToken
| PrivateIdentifierToken
| PunctuatorToken
| RegularExpressionToken
| StringToken
| TemplateToken;

export interface BaseToken extends Omit<Span, 'start' | 'end'> {
type: Token['type'];
value: string;
}

export interface BooleanToken extends BaseToken {
type: 'Boolean';
}

export type CommentToken = BlockCommentToken | LineCommentToken;

export interface BlockCommentToken extends BaseToken {
type: 'Block';
}

export interface LineCommentToken extends BaseToken {
type: 'Line';
}

export interface IdentifierToken extends BaseToken {
type: 'Identifier';
}

export interface JSXIdentifierToken extends BaseToken {
type: 'JSXIdentifier';
}

export interface JSXTextToken extends BaseToken {
type: 'JSXText';
}

export interface KeywordToken extends BaseToken {
type: 'Keyword';
}

export interface NullToken extends BaseToken {
type: 'Null';
}

export interface NumericToken extends BaseToken {
type: 'Numeric';
}

export interface PrivateIdentifierToken extends BaseToken {
type: 'PrivateIdentifier';
}

export interface PunctuatorToken extends BaseToken {
type: 'Punctuator';
}

export interface RegularExpressionToken extends BaseToken {
type: 'RegularExpression';
}

export interface StringToken extends BaseToken {
type: 'String';
}

export interface TemplateToken extends BaseToken {
type: 'Template';
}
Member

I suggest moving all these types to src-js/plugins/tokens.ts. It was my mistake originally to put loads of type defs in one "everything and the kitchen sink" types.ts file, and I've been slowly moving them all out to other files. Please forgive me - I'd never used TypeScript before last month, so I'm still learning...

If we want to expose these types publicly (I assume we do?), need to also re-export them from src-js/index.ts.
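
As a small illustration of why exposing these types publicly could be useful to plugin authors, here is a sketch of narrowing on the `type` discriminant. It redeclares a trimmed-down subset of the union from the diff above so that it is self-contained; `isCommentToken` is a hypothetical helper, not something this PR adds.

```typescript
// Trimmed-down copies of the token types, for illustration only.
interface BaseToken {
  type: string;
  value: string;
  range: [number, number];
}
interface BlockCommentToken extends BaseToken {
  type: 'Block';
}
interface LineCommentToken extends BaseToken {
  type: 'Line';
}
interface IdentifierToken extends BaseToken {
  type: 'Identifier';
}
interface PunctuatorToken extends BaseToken {
  type: 'Punctuator';
}
type CommentToken = BlockCommentToken | LineCommentToken;
type Token = CommentToken | IdentifierToken | PunctuatorToken;

// Hypothetical type guard: narrows a Token to a CommentToken.
function isCommentToken(token: Token): token is CommentToken {
  return token.type === 'Block' || token.type === 'Line';
}

// Source text for the sample data: `a; // done`
const mixed: Token[] = [
  { type: 'Identifier', value: 'a', range: [0, 1] },
  { type: 'Punctuator', value: ';', range: [1, 2] },
  { type: 'Line', value: ' done', range: [3, 10] },
];
const commentsOnly: CommentToken[] = mixed.filter(isCommentToken);
```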

tagged`template ${'literal'}`;

// RegularExpressionToken
/pattern/g;
Member

@overlookmotel overlookmotel Nov 19, 2025

I suggest also adding this weirdo:

1 /not_a_regex/gu;

That should be tokenized as 1, /, not_a_regex, /, gu, ; (not 1, /not_a_regex/gu, ;).

AST explorer
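
The expected tokenization described above can be written out as data, which may be handy as a test fixture. The values and their order come from the comment; the `type` names and `range` offsets are my assumptions, based on the ESLint-style token types listed in the diff.

```typescript
interface ExpectedToken {
  type: string;
  value: string;
  range: [number, number];
}

const ambiguousSource = '1 /not_a_regex/gu;';

// Both slashes are division punctuators here, not regex delimiters.
const expectedTokens: ExpectedToken[] = [
  { type: 'Numeric', value: '1', range: [0, 1] },
  { type: 'Punctuator', value: '/', range: [2, 3] },
  { type: 'Identifier', value: 'not_a_regex', range: [3, 14] },
  { type: 'Punctuator', value: '/', range: [14, 15] },
  { type: 'Identifier', value: 'gu', range: [15, 17] },
  { type: 'Punctuator', value: ';', range: [17, 18] },
];

// Sanity check: every range must slice the source back to the token value.
const rangesMatchSource = expectedTokens.every(
  (token) => ambiguousSource.slice(token.range[0], token.range[1]) === token.value,
);
const hasRegexToken = expectedTokens.some((token) => token.type === 'RegularExpression');
```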

Contributor Author

176a184
The ambiguity is being undone by autofix CI. I'll revisit it at a later point.

Comment on lines 9 to 15
const tokens = sourceCode.getTokens(node);
context.report({
  message: JSON.stringify(tokens, null, 2),
  node,
});
Member

@overlookmotel overlookmotel Nov 19, 2025

To make the snapshot easier to verify by eye, I suggest looping through all tokens, and making a separate context.report call for each. Then each token will be underlined in the code, alongside the token's details.
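
The suggestion above might look roughly like this. It is a sketch under assumptions: the `Context`, `Token`, and `Loc` interfaces are minimal hypothetical shapes, and reporting via `loc` is one way to get each token underlined; the test fixture in this PR may be structured differently.

```typescript
// Hypothetical minimal shapes, for illustration only.
interface Loc {
  start: { line: number; column: number };
  end: { line: number; column: number };
}
interface Token {
  type: string;
  value: string;
  loc: Loc;
}
interface ReportDescriptor {
  message: string;
  loc: Loc;
}
interface Context {
  sourceCode: { getTokens(node: unknown): Token[] };
  report(descriptor: ReportDescriptor): void;
}

const tokensRule = {
  create(context: Context) {
    return {
      Program(node: unknown) {
        // One report per token, so each token is underlined separately.
        for (const token of context.sourceCode.getTokens(node)) {
          context.report({
            message: `${token.type} (${JSON.stringify(token.value)})`,
            loc: token.loc,
          });
        }
      },
    };
  },
};

// Tiny fake context to show the rule in use.
const collectedReports: ReportDescriptor[] = [];
const fakeContext: Context = {
  sourceCode: {
    getTokens: () => [
      { type: 'Keyword', value: 'let', loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 3 } } },
      { type: 'Identifier', value: 'x', loc: { start: { line: 1, column: 4 }, end: { line: 1, column: 5 } } },
    ],
  },
  report: (descriptor) => {
    collectedReports.push(descriptor);
  },
};
tokensRule.create(fakeContext).Program({});
```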

Contributor Author

The reason it is the way it is now is that I am manually copying the same JSON output from a typescript-eslint plugin with the same source text. It's like a makeshift conformance test until I can figure out how to create a solid testing foundation.

@lilnasy lilnasy force-pushed the feat/linter/plugins/token-methods branch from db8abc2 to 0d4164e Compare November 20, 2025 09:14