-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Formatter: Keep right-hanging comments aligned #7684
Comments
Now that the quote handling (#7310) has been fixed in 0.0.292, this is the last formatting incompatibility that I've been able to find between |
I have also been annoyed with this when using Black. I'm not sure if there's a downside to supporting this. Ideally it would only apply if they start out aligned. Additionally, it'd be cool if it shrunk to the minimum compatible distance for alignment — but that's a little magical. |
Enforcing this conditionally seems really hard... and enforcing this globally by default would be a big deviation from Black. I suspect it would need to be a global setting, if we choose to support it. |
For what it's worth, I have consistently rejected this sort of feature in Black, as I don't think it leads to a consistent, stable formatting style. Obviously, what style Ruff chooses is up to you. Barry's proposed rule is that the number of spaces before the hash gets preserved. But what if someone comes along and decides that zebras are worth more: def foo():
x = 1 # this comment
why = 2 # aligns with this comment
zebra = 31 # and this one Now either the code gets misaligned (ugly), or they have to manually fiddle with the spaces, which defeats the point of an autoformatter. Now, possibly the formatter could be made smart enough to detect that the comments are misaligned, and fix the spacing on the zebra line to get them aligned again. But now what if we decide that zebras are worth a lot more: def foo():
x = 1 # this comment
why = 2 # aligns with this comment
zebra = 300000000000000000000000000000000000000 # and this one Then, to get the lines aligned again, the formatter would have to change the |
There's a little bit of missing context that might be helpful. In python-mode you can set a comment column, and then hit So @JelleZijlstra 's first example, it's relatively easy, albeit manual, to re-indent everything to an aligning column. Emacs only looks at the current, line so it doesn't actively try to keep things lined up, but the effect is usually that they are. The preservation of whitespace rule that It's never going to be perfect, but I'd argue that |
Hm... a |
I'd prefer not to support this, i think comments directly after the statements or alternatively leading own line comments (where they are column aligned) are fine enough to read. For the the case of not modifying whitespace i agree with @JelleZijlstra that would defeat the point of an autoformatter while for the case of column aligning i don't think the possible improvement warrants introducing a column layout when we don't use one elsewhere¹. This is also bad from a point of locality, currently the formatting of one statement does not influence the formatting of other statements (we see those comments as trailing on the statement). ¹With column layout, i mean e.g. class Foo:
x = 1 # this comment
why = 22 # aligns with this comment
zebra = 333 # and this one |
Thinking about this overnight, this could be the best way forward. It would be a local-to-the-line only setting -- no need for global parsing or alignment. It also, um, aligns more with the Emacs behavior. Straw man proposal:
|
The top two reasons I use blue and not black are single quotes and not messing with comment spacing. I don't mind bumping the spaces to 2, but removing additional spaces is a real pain. If I'm using more, there's pretty much always a reason. Here's an example that even blue currently damages (blue fixed the single-line expression case). These are pairs of field names and numpy dtype codes, followed by comments: header_dtd = [
- ('sizeof_hdr', 'i4'), # 0; must be 348
- ('data_type', 'S10'), # 4; unused
- ('db_name', 'S18'), # 14; unused
- ('extents', 'i4'), # 32; unused
- ('session_error', 'i2'), # 36; unused
- ('regular', 'S1'), # 38; unused
- ('dim_info', 'u1'), # 39; MRI slice ordering code
- ('dim', 'i2', (8,)), # 40; data array dimensions
+ ('sizeof_hdr', 'i4'), # 0; must be 348
+ ('data_type', 'S10'), # 4; unused
+ ('db_name', 'S18'), # 14; unused
+ ('extents', 'i4'), # 32; unused
+ ('session_error', 'i2'), # 36; unused
+ ('regular', 'S1'), # 38; unused
+ ('dim_info', 'u1'), # 39; MRI slice ordering code
+ ('dim', 'i2', (8,)), # 40; data array dimensions
] I think it's hard to argue that this change makes things more readable. I understand that preserving an aligned column is difficult, and I really don't care if an autoformatter chooses not to try. But if I go through the trouble of making an aligned column of comments, it's frustrating for an autoformatter to destroy it and give no option to just leave it alone. The PR I opened for blue about this (grantjenks/blue#83) still enforces at least two spaces, but otherwise leaves inline comment spacing alone. |
Destroying carefully crafted comment spacing can make code harder to parse/understand which in turn can make it easier for bugs to creep in. Some people don't use comment spacing to make their code easier to visually parse and for them keeping the option of enforcing 2-spaces is fine. For those who do use spacing it would be nice to have the option to at least turn off any messing with comment spacing at all. I'd much rather manually align my comments (or not) than have an auto-formatter make my code harder to understand and maintain. The ability to easily parse the code and therefore more easily spot bugs is also at the core of Jarrod's request to not mess with I'd be fine with switches such as |
How frequent do you use these carefully crafted comments? IMO, disabling formatting might be the best option if they are rare. Considering: header_dtd = [
('sizeof_hdr', 'i4'), # 0; must be 348
('data_type', 'S10'), # 4; unused
('db_name', 'S18'), # 14; unused
('extents', 'i4'), # 32; unused
('session_error', 'i2'), # 36; unused
('regular', 'S1'), # 38; unused
('dim_info', 'u1'), # 39; MRI slice ordering code
('dim', 'i2', (8,)), # 40; data array dimensions
] It's unclear how the formatter should format the comments once a line exceeds the configured line width header_dtd = [
('sizeof_hdr', 'i4'), # 0; must be 348
('data_type', 'S10'), # 4; unused
('db_name', 'S18'), # 14; unused
('extents', 'i4'), # 32; unused
('session_error', 'i2'), # 36; unused
('regular', 'S1'), # 38; unused
('dim_info', 'u1'), # 39; MRI slice ordering code
(
'a very long header that exceeds the line width so that it breaks', 'i2', (8,)
), # 40; data array dimensions
] |
For me, they're rare enough that disabling formatting on a case-by-case basis is ok. I guess there's a threshold where too many |
Yes, I can disable formatting for these structures, but there are nice features of formatting, such as uniformizing indentations and quote style, and it would be nice not to have to throw out the baby with the bathwater. To be clear, I do not want the formatter to calculate the correct columns. Bumping 0/1 spaces to 2 is a completely fine rule to apply, but otherwise I just want them left alone. If I had indicated to the formatter to leave comments alone, I would expect the tuple to get reformatted and the comment spacing kept the same. If I then decided to adjust the spacing before the comment, I would expect the formatter to leave the new spacing alone. FWIW, if your situation occurred, I would probably reformat it to: header_dtd = [
('sizeof_hdr', 'i4'), # 0; must be 348
...,
(
'a very long header that exceeds the line width so that it breaks',
'i2',
(8,),
), # 40; data array dimensions
] Just as a meta-comment, it really is odd to me that black et al have decided that using spacing in comments to improve readability needs to be done away with, like arguments over tabs vs spaces or where to put newlines in parameter lists. Comments are the one bit of code whose only purpose is communication with other humans (okay, modulo |
💯 What do you think about my previous suggestion? I think it covers all the use cases. |
Yes, that would work fine for me. |
We want to support the best possible comment formatting that matches users' intuitions. It's just that comment formatting is surprisingly hard.I would estimate that about 30% of the development effort of our formatter is related to handling comments as best as we can. Comments are hard because they can appear in any position, there's no formal definition of what a comment comments, the output must be stable (on the first try), the formatter must support all possible comment placements, and misplacing comments can easily lead to syntax errors.
Assuming we align all comments at, e.g. column 60. What happens if the comment has a width of 29, exceeding the line width by 1? For example, let's pretend the comment below is aligned at column 60 and does exceed the line width. a = [1, 2] # comment that exceeds the line width when aligned to column 60 The comment would fit fine if it isn't aligned at column 60. What's your expected behavior? |
Speaking just for myself, I would expect that Aside: I wish there was a better way to signal intent to static checkers, rather than pragma/type comments. |
Yes. There are plenty of ruff checks that are not auto-fixable, and that hasn't been an existential problem. If the problem is this difficult, the obvious solution is to work on something that is tractable. Clearly some people like black's behavior, so I am not saying do not include it. Just let it be optional, please. Or you can go further and allow some other specific behaviors, as Barry suggests. Anyway, I feel like I've taken up enough space in this thread. I like ruff, I like autoformatting. Thanks for all the effort, whatever the end result of this conversation is. |
if we're still in the spit-balling ideas phase, I'd like to use a double hash fwiw, I came up with this as the most pleasant to my eyes workaround class Foo(Enum):
AAA = -2 # > for the a's
BBBBBBBB = 0 # > NOT USED
CCCC = 1 # > only when you need sea |
Not sure I would use "most pleasant" over "least horrible" but I get your point! 😆 I suppose you could do this too: class Foo(Enum):
AAA = -2 # # for the a's
BBBBBBBB = 0 # # NOT USED
CCCC = 1 # # only when you need sea which almost looks right! |
that was too uncanny valley for me 🤪 |
I would generalize this issue as vertical alignment (colons, assignments, and comments), including cases such as: @dataclass
class Example:
an_attr: int
another_attr: str | None
whatever: dict[str, str]
class Colors(IntEnum):
BLACK = 0 # a comment
RED = 1 # another comment
YELLOW = 2 I use a VSCode extension for vertical alignment, Better Align, which has a very convenient configuration method. So, the above formatting would be: colon = [ -1, 1 ]
assignment = [ 1, 1 ]
comment = 2 |
The real solution to this problem are elastic tabstops, which would make this problem disappear overnight by using a single variable length tab for separation. But as long as editors are stuck in the 60s by interpreting tabs always as |
Pretty interesting read, but I'm not sure how this helps this particular issue in practice, given there are probably more editors in use out there than lawyers (or more relevant perhaps - coders?) with opinions 🤣 . |
New to the party this one really bugs me. I have so many cases where I have hanging aligned comments: coefficients = [
a + b + c, # c0 - used for X
a * c, # c1 - used for Y
1 /2 + d, # c2 - used for Z
] Am I correct reading through the issue that there is no way to add some type of exclusion to prevent inline comments from being excluded from formatting? |
I'm curious if there are any new thoughts or progress on this? Is it likely to have a solution in the near future or far future? Debating ruff for a new project instead of blue. I'm undecided on whether to make the jump. |
I'm of the opinion that comments, including the whitespace leading up to them, should not be touched by the formatter. All of the workarounds supplied in this thread lead to really ugly comment code. This one issue is my current biggest gripe with auto formatters and I find myself smattering |
As one of the original maintainers of blue, I'll say that despite this issue still being unresolved, I've already made the jump despite the occasional |
Is there a place to put the I'm having to resort to a (It's also somewhat annoying that |
Would you mind sharing a concrete example. It would help me understand what you're running into |
I actually do this as shown below. It is far from perfect, and if the length of any of the characters before the # changes, the comment has to be updated manually, but as long as I am working somewhere else in the file (and auto-formatting along the way), nothing is changed. def foo(): |
@MichaReiser If you were asking about my dataclass/TypeDict comment, https://github.com/lmstudio-ai/venvstacks/blob/7877f743fd3df4d0d97c81d2f637f64ab73f1c34/src/venvstacks/stacks.py#L177 is one of the simpler cases: class EnvironmentLockMetadata(TypedDict):
"""Details of the last time this environment was locked."""
# fmt: off
locked_at: str # ISO formatted date/time value
requirements_hash: str # Uses "algorithm:hexdigest" format
lock_version: int # Auto-incremented from previous lock metadata
# fmt: on Putting For the original feature request here, I wonder if a special trailing comment alignment directive could solve the ambiguity problem: class EnvironmentLockMetadata(TypedDict):
"""Details of the last time this environment was locked."""
# fmt: # align comments
locked_at: str # ISO formatted date/time value
requirements_hash: str # Uses "algorithm:hexdigest" format
lock_version: int # Auto-incremented from previous lock metadata Until the next blank line, trailing comments would be aligned with the trailing comment on the The trailing comment alignment behaviour could then be specified as:
|
The topic of aligned assignments came up again on discuss.python.org. Considering the "alignment header" idea in that light, I would revise my suggestion to be: # fmt: [= align][# align [+N]]
The column formatting would apply until the next blank line. Edit: Alternatively, if there needs to be a directive name up front for the formatter to make sense of the header comment: # fmt: align [= (...|+N)][# (...|+N)]
|
ruff format
followsblack
's style for right-hanging comments (i.e. inline comments), but that leads to poorly readable results, especially when a series of consecutive lines uses comment alignment by column. Blue fixes this readability problem.Let's start with this file:
As you can see, every
#
on each of these lines appears at the same column. How does ruff reformat the line?This is the way black does it, i.e. by jamming all comment starts to two space characters between the last code character and the
#
. This destroys the alignment.How does blue format the file? Nothing changes! Blue preserves the number of spaces between the last code character and the
#
with the assumption that the spacing is deliberate and done for readability.ruff format
should followblue
's rule here.The text was updated successfully, but these errors were encountered: