Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle escaped braces in f-strings #1949

Merged
merged 1 commit into from
Aug 4, 2024
Merged

Conversation

stephenfin
Copy link
Contributor

@stephenfin stephenfin commented Aug 1, 2024

To use a curly brace in an f-string, you must escape it. For example:

>>> k = 1
>>> f'{{{k}'
'{1'

Saving this as a script and running the 'tokenize' module highlights something odd around the counting of tokens:

$ python -m tokenize test.py
0,0-0,0:            ENCODING       'utf-8'
1,0-1,1:            NAME           'k'
1,2-1,3:            OP             '='
1,4-1,5:            NUMBER         '1'
1,5-1,6:            NEWLINE        '\n'
2,0-2,2:            FSTRING_START  "f'"
2,2-2,3:            FSTRING_MIDDLE '{'     # <-- here...
2,4-2,5:            OP             '{'     # <-- and here
2,5-2,6:            NAME           'k'
2,6-2,7:            OP             '}'
2,7-2,8:            FSTRING_END    "'"
2,8-2,9:            NEWLINE        '\n'
3,0-3,0:            ENDMARKER      ''

The FSTRING_MIDDLE character we have is the escaped/post-parse single curly brace rather than the raw double curly brace, however, while our end index of this token accounts for the parsed form, the start index of the next token does not (put another way, it jumps from 3 -> 4). This triggers some existing, unrelated code that we need to bypass. Do just that.

Closes: #1948

@stephenfin stephenfin changed the title Resolve #1948 Handle escaped braces in f-strings Aug 1, 2024
@stephenfin stephenfin force-pushed the issue-1948 branch 2 times, most recently from 949cbbf to d19241b Compare August 1, 2024 11:19
@stephenfin
Copy link
Contributor Author

As noted in #1948, this is mainly preventing this code for executing:

if previous_row:
(start_row, start_column) = start
if previous_row != start_row:
row_index = previous_row - 1
column_index = previous_column - 1
previous_text = self.lines[row_index][column_index]
if previous_text == "," or (
previous_text not in "{[(" and text not in "}])"
):
text = f" {text}"
elif previous_column != start_column:
text = line[previous_column:start_column] + text

However, I have no idea what that code is intended for nor if I can remove it outright. Hopefully someone here can enlighten me so I can stick a comment in for future contributors/future me.

Copy link
Member

@asottile asottile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please rebase out unrelated formatting changes

src/flake8/processor.py Outdated Show resolved Hide resolved
Comment on lines 251 to 281
f'hello world'
f'{{"{hello}": "{world}"}}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write a separate test demonstrating your problem instead of changing the meaning of this test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant an entirely separate test -- with a name describing what's being tested (specifically the handling of curly braces in fstrings) rather than overloading the existing test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit overkill, no? This accomplishes the same thing without significant additional runtime cost, and it's arguably a better test owing or the fact it mixes strings and "code". From your comment on the issue:

nope! the method is meant to redact string contents and not code

I can straight up copy-paste 99% of the test, but it just seems a bit daft.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've asked for it twice now

To use a curly brace in an f-string, you must escape it. For example:

  >>> k = 1
  >>> f'{{{k}'
  '{1'

Saving this as a script and running the 'tokenize' module highlights
something odd around the counting of tokens:

  ❯ python -m tokenize wow.py
  0,0-0,0:            ENCODING       'utf-8'
  1,0-1,1:            NAME           'k'
  1,2-1,3:            OP             '='
  1,4-1,5:            NUMBER         '1'
  1,5-1,6:            NEWLINE        '\n'
  2,0-2,2:            FSTRING_START  "f'"
  2,2-2,3:            FSTRING_MIDDLE '{'     # <-- here...
  2,4-2,5:            OP             '{'     # <-- and here
  2,5-2,6:            NAME           'k'
  2,6-2,7:            OP             '}'
  2,7-2,8:            FSTRING_END    "'"
  2,8-2,9:            NEWLINE        '\n'
  3,0-3,0:            ENDMARKER      ''

The FSTRING_MIDDLE character we have is the escaped/post-parse single
curly brace rather than the raw double curly brace, however, while our
end index of this token accounts for the parsed form, the start index of
the next token does not (put another way, it jumps from 3 -> 4). This
triggers some existing, unrelated code that we need to bypass. Do just
that.

Signed-off-by: Stephen Finucane <[email protected]>
Closes: PyCQA#1948
@asottile asottile merged commit 939ea3d into PyCQA:main Aug 4, 2024
11 checks passed
@asottile asottile added this to the 7.1.1 milestone Aug 4, 2024
@stephenfin stephenfin deleted the issue-1948 branch August 4, 2024 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

logical_line for call with f-string can result in invalid ast under Python 3.12
2 participants