[Python] Refactor identifiers #4443
Merged
deathaxe merged 9 commits into sublimehq:master on Mar 2, 2026
Use plural to denote non-popping behavior.
This commit...

1. wraps all identifiers into dedicated contexts, which pop immediately once a special variable is consumed. This helps reduce syntax cache size and avoids duplicated popping and non-popping contexts.
2. removes redundant includes, which are already handled by `qualified-names`.

Note: This commit reduces parsing time of syntax test files by 13%.
This commit scopes built-in functions only in function call expressions. Reason: Legacy Python 2 built-in functions in particular are likely to be overridden and used as normal variables in modern Python code. Highlighting them as built-ins even though they no longer exist is somewhat off. If a built-in name is used as a local variable, it is likely intentional and thus should be scoped as such.
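A minimal sketch of the kind of code this targets. The variable names are made up for illustration, and the comments describe the intent stated in the commit message rather than verified scope output.

```python
# Built-in names used as plain variables, as often seen with legacy
# Python 2 names. Per the commit message, these are meant to be treated
# as ordinary variables.
file = "notes.txt"        # `file` was a Python 2 built-in, not a Python 3 one
input = ["a", "b", "c"]   # shadows the built-in `input` on purpose

# Built-in names in function call expressions. Per the commit message,
# only this kind of usage is meant to be scoped as built-in.
count = len(input)
label = str(count) + ":" + file
```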
This commit...

1. replaces the `items` context and its lookahead by appending an `after-expression` context to each identifier.
2. introduces dedicated `type-hint-names` to replace `qualified-names` usage, as type hints require dedicated item access syntax, aka `type-hint-lists`.
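For illustration, a small Python snippet with the two kinds of item access that appear to be involved: brackets inside type hints and ordinary subscripts in expressions. The function and variable names are hypothetical, and the mapping to `type-hint-lists` is an assumption based on the commit message.

```python
import typing


# Brackets in annotations follow a (possibly qualified) type name.
def lookup(table: typing.Dict[str, int], keys: typing.List[str]) -> typing.List[int]:
    # Brackets here are ordinary item access on a variable.
    return [table[key] for key in keys]


result = lookup({"a": 1, "b": 2}, ["b", "a"])
```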
This commit refactors all identifier-related contexts to replace the redundant lookaheads that distinguish qualified and unqualified identifiers. With this commit, identifiers themselves are parsed 20-25% faster. The overall gain, benchmarked against the main syntax test file, is about 10-15%.
This commit attempts to implement `as foo` expressions with a more common pattern in all statements/contexts.
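As a reminder of where `as foo` shows up in Python, here is a short runnable sample covering `import`, `with` and `except`. It is only an illustrative selection, not a list of the contexts the commit touches.

```python
import io
import os.path as osp                    # import ... as alias

with io.StringIO("hello") as stream:     # with ... as target
    text = stream.read()

try:
    int("x")
except ValueError as error:              # except ... as name
    message = type(error).__name__

base = osp.basename("docs/readme.md")
```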
Python no longer scopes unqualified variables as `meta.path`.
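A short sketch of the distinction, assuming "unqualified" means a bare name and "qualified" means a dotted path. The names are illustrative only.

```python
import collections.abc
import posixpath

name = "readme.md"                       # unqualified: a single bare identifier
joined = posixpath.join("docs", name)    # `posixpath.join` is a qualified (dotted) name
is_mapping = isinstance({}, collections.abc.Mapping)  # qualified name with two dots
```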
keith-hall approved these changes on Feb 25, 2026
michaelblyons approved these changes on Mar 2, 2026
Fixes #3993
This PR optimizes how Python parses identifiers, without yet relying on branching, but with the goal of taking a step towards more accuracy.

It doesn't make obvious changes to scoping, besides `meta.path` no longer being applied to unqualified variables (where possible without adding branching).

It does, however, improve overall parsing performance by about 15-20% compared to current master. The benchmark used current master's syntax test file and some real-world code files.
Note:
The primary challenge is that Python actually requires two sets of syntax rules: one for global expressions, which are always terminated by a newline (or semicolon), and another for expressions within brackets, which may span multiple lines without requiring a line continuation. This distinction is not yet fully made for identifiers, but a first step towards this goal is taken. Any further steps may require branching, which is a whole other story with regard to complexity, and thus probably something for separate steps.
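The two cases can be seen in plain Python. This is a minimal illustration with made-up variable names.

```python
# Outside brackets, a statement ends at the newline or at a semicolon.
a = 1; b = 2

# Outside brackets, spanning lines needs an explicit backslash continuation.
total = a + \
    b

# Inside brackets, an expression may span lines without any continuation.
grouped = (
    a
    + b
)
```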