Skip to content

Conversation

@leonard-IMBERT
Copy link
Contributor

This is a fix for #107.

All the test are passing but mypy is angry about file I did not modify, I don't know if this is an issue.

Also pylint wanted to change some yaml (maybe the actual configuration is not enough ?) and black is complaining about files in the docs folder (that do not seems to be python)

I've just one small interrogation, In case of multiline string, it's complicated to enact the hyperlink specification of TIM in the case you're omitting the last [/~], should \n be considered as such ? Or do you allow multiline hyperlink ? In this MR, the behaviour is:

  • If there is no ending [/~] one is appended to the string.

@bczsalba
Copy link
Owner

bczsalba commented Jan 3, 2023

Thank you for this!

I'm currently not quite able to work (still abroad for the holidays), but generally this looks good. Hyperlinks are my least favorite thing to deal with within the terminal (mostly due to how different they are from other styles), and I'm afraid it was a bit neglected in the move from TIM-v2 to TIM-v3.

In regards to the linting errors, I'm pretty sure it's not your fault, so don't mind it too much, lol. They are a whiny bunch, and pleasing all of them is a whole art in and of itself!

In the case of tokens_to_markup, I prefer keeping the output markup as close as possible to the original stream of tokens, so I'd say that part of the PR can be omitted. Other than that, I'll look further into it in a couple of days!

@leonard-IMBERT
Copy link
Contributor Author

Hi, thanks for the response :)

I understand for tokens_to_markup, is the exact behaviour described somewhere in the documentation ? I'll try to stick to it. Also after looking back at the PR, I think there should be a more elegant way of doing this. I'll leave comments in the File changes section

@bczsalba
Copy link
Owner

is the exact behaviour described somewhere in the documentation ?

The best description of it is here, but admittedly it's not very thorough. The general idea for the function is to take a stream of tokens and spit out markup that matches the tokens 1 to 1.

I still haven't had the time to come back to this project, since I have a full game project to hand in on the 25th, which I'm doing 95% of the programming for. I expect to be done earlier than that, but by the 26th I should have time for sure. Sorry about the delays!

@leonard-IMBERT
Copy link
Contributor Author

I still haven't had the time to come back to this project, since I have a full game project to hand in on the 25th, which I'm doing 95% of the programming for. I expect to be done earlier than that, but by the 26th I should have time for sure. Sorry about the delays!

Don't worry, no problems :) I finally found some time to go back and implement the hyperlink clearer token. The PR is now much cleaner (less modification). My head hurted a bit when the test where randomly passing due to randomness in some of them (You don't have so much token, I'm sure the combinatory should not take to much time to test).

Aaaand It seems that I broke something... Don't merge now please

@leonard-IMBERT
Copy link
Contributor Author

The tests are passing but In my project some artefact appears:

image

@bczsalba
Copy link
Owner

Hey again, what's the current status here? I don't completely understand the artifacts or what causes them.

@leonard-IMBERT
Copy link
Contributor Author

Hi ! Sorry, I haven't been able to work back on this issue yet. I expect, I'll be able to work on it by the end of this week :/

I'm not sure either exactly where they could come from, but I expect the it has to come from the reverse clearer add for the hyperlink. I need to do some check but the clearer itself containing an ESC (\x1b) could cause the issue

@bczsalba
Copy link
Owner

That's alright, it's taken me like a month to properly look at the PR, lol. Let me know if you need anything!

@leonard-IMBERT
Copy link
Contributor Author

Finally a fix I'm satisfied with ^^' !

I wanted to add the hyperlink clearer to the list of clearer (it would have been in my opinion the best fix) but because you don't differentiate between CSI and OSC there is a conflict with the invisible token (/~ is \x1b]8 and invisible is \x1b[8. This kind things would need each "action token" (change style, etc...) to hold their kind (if they are OSC or CSI). Would probably also streamline the ansi parsing probably...

Overall I couldn't do another way than making hyperlink a special case (without implementing the change I described up there).

@bczsalba
Copy link
Owner

bczsalba commented Feb 1, 2023

I'll look it over tomorrow, thank you!

About OSC vs CSI differentiation; I'm pretty sure the only OSC codes we ever read are hyperlink ones, so I didn't feel like the internal refactor was worth it at the time (We also write a couple of OSCs, but those are usually read by simple regex). It is a thing I've never been satisfied with, and I'm pretty sure it's about the only thing I am currently not satisfied with in the markup section.

Then again, it might be a tiny change that makes everything a lot better, lol.

@leonard-IMBERT
Copy link
Contributor Author

Hi @bczsalba , sorry to bother but did you find time to look at this PR ?

@bczsalba
Copy link
Owner

Sorry, life got ahead of me!

It looks to be alright and I like the vibe of the changes a lot. I did since realize that I cannot reproduce the original issue, but if in your case it's fixed I'll take that as a win :)

There seems to be an unrelated bug in break_line that messes up links when they are broken up, but that's been there before your changes.

Thanks for the great communication!

@bczsalba
Copy link
Owner

@leonard-IMBERT I think this PR broke something that leads to PlainTokens containing hyperlink-closing sequences to be generated, which breaks the ability to export SVG files. Could you look into it?

I'll mark the bits I think might be causing this in the code.

Comment on lines +204 to +206
if cursor < start:
yield PlainToken(text[cursor:start])

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be that this being further up causes us to yield a non-plain link closer as if it was plain (though moving it back down didn't seem to fix things)

Comment on lines +384 to +386
if token.value == "/~":
return "\x1b]8;;\x1b\\"

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the sequence that shows up and AFAIK this is about the only place it's hard-coded, so it's a bit suspicious as well.

(though sometimes it comes up as the full form, other times only 8;;\x1b\\?)

@leonard-IMBERT
Copy link
Contributor Author

Do you have a reproducible example that I can toy with? It would help :)

@bczsalba
Copy link
Owner

I've been trying to, but it seems elusive! The best I have so far is running mkdocs build on the commit 91c173c, which should complain something about an incorrect SVG. If you drill down to it you'll find that the "incorrect" part is the existence of escaped symbols within the SVG, which happens within the Located in ... part of the inspect(Container) call.

I'll try to find something more in a second.

@bczsalba
Copy link
Owner

bczsalba commented Jul 26, 2023

Alright, I got it down to:

from pytermgui.markup import tim

for plain in tim.group_styles(
    "\x1b]8;;path.py\x1b\\inner\x1b]8;;\x1b\\\x1b]8;;\x1b\\outer"
):
    assert "\x1b]8;;\x1b\\" not in plain.plain, repr(plain)

Running this (on the aforementioned commit) shouldn't raise an assertion error, since hyperlink closers (but generally escape sequences) shouldn't show up in a "plain" tag. Not completely sure whether this is a thing this patch introduced, but the issue showed up afterwards and I don't remember any major changes in any of the systems other than this.

Let me know if you have any questions!

@leonard-IMBERT
Copy link
Contributor Author

leonard-IMBERT commented Aug 22, 2023

Sorry, took a bit of time to found some time x)

About the example

In the example you give me, you close two times the hyperlink. Smt like that [~path.py]inner[/~][/~]outer.

I don't know how the parser should behave... Throw an error ? Ignore the second closing ?

The problems comes from ptyermgui/regex.py

RE_LINK = re.compile(r"(?:\x1b\]8;;([^\\]*)\x1b\\([^\\]*?)\x1b\]8;;\x1b\\)")
RE_ANSI_NEW = re.compile(rf"(\x1b\[(.*?)[mH])|{RE_LINK.pattern}|(\x1b_G(.*?)\x1b\\)")

So the RE_LINK checks for "valid" links (opening and closing of the link). In that case, the first link match but not the second closing is left alone : In this implementation \x1b\]8;;\x1b\\ is not a valid token by itself -> it is not uderstood as style token either because closing link is a OSC not a CSI. The parser does not know what to do with it and so just yield it as a plain token (as I understand).

From my point of view it is a "reasonable" behavior in the sense that it clearly indicate an error in the format and that you should not silently erase tokens (but I admit that to an un-experimented user the cause is hard to pinpoint)

About SVG export randomly breaking

I didn't looked into it yet but, considering your example, I think the problem is that hyperlink closing token are yield twice in certain cases. If you have concrete example again, it will help.

@leonard-IMBERT
Copy link
Contributor Author

After further thinking, I decided to create pull request/discussion to move from regex to state machine for parsing ANSI #130 . It will help correct this bug. Lets continue the discussion over there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants