Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Fix some sql highlighting #450

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

keevan
Copy link

@keevan keevan commented Feb 5, 2022

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • All new code requires tests to ensure against regressions

Description of the Change

Combined screenshot of result

Before:

image

After (note $editors not being highlighted when SQL syntax is applied is a separate issue):

image

3 main changes to allow for SQL syntax highlighting in strings, in new scenarios.

  1. Allows lowercased keywords (e.g. "select * from users")

  2. Query starting on next line of string opener (requires --sql on opening line).

  3. Query starting with a parenthesis will be allowed in the SQL begin check. (e.g. "(select 1) union all (select 2)"

Alternate Designs

  1. An alternative to this might be to repeat the same list of keywords in lowercase, so it prevents false positives such as the string "Select from the following:" from being incorrectly highlighted. This doesn't prevent the problem entirely though, and I'm probably still okay making a non-important string have weird colors than to not have this option for people who prefer code to - and I quote - "be screaming at them".
  2. Alternative options were discussed in this thread: Embedded SQL-Syntax only works for inline SQL #87
    Until tree-sitters become a reality, I think having a 'experimental / opt-in' option might make more people happy in the mean time. At least, I would be. An option is better than none / waiting imo. I wouldn't necessarily enforce it throughout the codebase though.
  3. Heredocs were proposed but PHPCS might disallow such usage in the first place, and therefore could not be an option.

Benefits

More syntax highlighting options for SQL in PHP strings, particularly ones that probably should have worked out of the box (1,2)

Possible Drawbacks

False positives (1 proposed) - for the string examples mentioned above.

Otherwise, if it's as the (alternative design 1), this would mean that sPonGebob notation will not be supported. (See https://sqlfum.pt/)

image

Applicable Issues

#87
#449 (mentions but not its own issue)

@keevan
Copy link
Author

keevan commented Feb 6, 2022

  • Should fix up --sql matching to be at least -- sql instead, because --sql is not a valid comment and syntax highlighting should not incorrectly kick off earlier than intended.

Since --sql is not a valid comment and will create errors if applying blindly.
Copy link
Contributor

@KapitanOczywisty KapitanOczywisty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Allows lowercased keywords (e.g. "select * from users")
3. Query starting with a parenthesis will be allowed in the SQL begin check. (e.g. "(select 1) union all (select 2)"

These are ok for me

2. Query starting on next line of string opener (requires --sql on opening line).

However I don't see any point for that one. There Is Heredocs for such cases, and -- sql seems to not be used at all in that context in the wild: https://grep.app/search?q=%5B%27%22%5D--%5Cs%2Bsql&regexp=true&filter[lang][0]=PHP

Also please add some tests for new features.

grammars/php.cson Outdated Show resolved Hide resolved
grammars/php.cson Outdated Show resolved Hide resolved
@KapitanOczywisty
Copy link
Contributor

KapitanOczywisty commented Feb 6, 2022

  1. An alternative to this might be to repeat the same list of keywords in lowercase, so it prevents false positives such as the string "Select from the following:" from being incorrectly highlighted.

This is valid concern, but Select is also used for sql (examples). I'm not sure if it's worth adding other variants than uppercased, since it can break a lot of code. But this is something to be discussed.

Heredocs were proposed but PHPCS might disallow such usage in the first place, and therefore could not be an option.

Rule was created when heredocs weren't able to handle indentations, this is not the case now. By grep.app Squiz.PHP.Heredoc.NotAllowed was disabled more often than -- sql was ever used.

@keevan
Copy link
Author

keevan commented Feb 6, 2022

Rule was created when heredocs weren't able to handle indentations, this is not the case now. By grep.app Squiz.PHP.Heredoc.NotAllowed was disabled more often than -- sql was ever used.

Fair point. It seems the indent-able heredocs was introduced in php 7.3. Also yeah I wouldn't be suprised if -- sql was ever used anywhere, since it started from a need here: #87 (comment)

I still personally prefer the option to trigger highlighting with -- sql because the effort to add it to existing query blocks is simpler than converting all the bits to heredocs. But that's what forks are for 😄 (Though I wish there were an option to toggle this on / off in settings)

Also thanks for the feedback. I'll try to add some tests for the new stuff when I get the chance.

keevan and others added 3 commits February 10, 2022 18:11
Co-authored-by: KapitanOczywisty <[email protected]>
Co-authored-by: KapitanOczywisty <[email protected]>
- Check for case insensitivity which will allow for `Select`, `select` and sPoNgebOB notation keyword starters
- Check for starting round brackets as it may also be used at times
@keevan keevan force-pushed the fix-some-sql-highlighting branch from 792fa61 to 169b9a4 Compare February 10, 2022 07:40
@keevan
Copy link
Author

keevan commented Feb 10, 2022

@KapitanOczywisty I've applied your suggestions and added some tests based on tests that were already there. Let me know if you prefer I split it out to a separate it block, though I feel like it made more sense where I've added it.

Ran specs locally - seems okay. Back over to you.

image

@adjenks
Copy link

adjenks commented Feb 11, 2022

Something that really bugs me is that the CTE keyword WITH doesn't trigger syntax highlighting. Would that be fixed if I added the --sql?

@keevan
Copy link
Author

keevan commented Feb 11, 2022

Something that really bugs me is that the CTE keyword WITH doesn't trigger syntax highlighting. Would that be fixed if I added the --sql?

Ah, the --sql keyword was something I thought about adding, but has been removed as @KapitanOczywisty did not want to introduce new custom syntax to trigger the syntax highlighting.

I tried locally, and it seems like it is highlighting as expected:

image

The only time it doesn't is if the string starts with a ;. @adjenks, Do you have an example of when you'd expect it to start highlighting? Perhaps it might be simple enough to include as part of this PR as well to hopefully get it over the line.

@adjenks
Copy link

adjenks commented Feb 14, 2022

@keevan It doesn't exist in phpDocumentor, but I always thought that putting the language in a doc-block made the most sense. It's common format for documenting code.

Something like...
/** @lang sql */
$query = 'SELECT 1';

There's no lang tag, but, here is the var tag as an example:
https://docs.phpdoc.org/3.0/guide/references/phpdoc/tags/var.html#examples

I don't think it should be inside the string itself because then you add some overhead to the query being sent to the server, albeit not much.

Good to see that WITH is mostly working though, last time I checked it wasn't. Guess I haven't tested it for a while.

@KapitanOczywisty
Copy link
Contributor

The only time it doesn't is if the string starts with a ;.

What is the purpose of ; before the query? I don't think it is functional unless you are creating multi-query - which is not the best idea in php.

It doesn't exist in phpDocumentor, but I always thought that putting the language in a doc-block made the most sense. It's common format for documenting code.

It won't work for TextMate (or even Tree-sitter) syntax highlighting, it would be way too much added complexity, if even possible.

I don't think it should be inside the string itself because then you add some overhead to the query being sent to the server, albeit not much.

This is over-optimization, striping comments takes nanoseconds. Updating table indexes can have real impact on performance, not removing comments. Measure, optimize and measure again.

@keevan
Copy link
Author

keevan commented Feb 15, 2022

What is the purpose of ; before the query?

I don't use it myself, I only observed examples online when searching for examples for @adjenks where highlighting did not apply.

With that said, if there's nothing left that needs to be changed, I'd like these changes to be reviewed and merged in if possible 😃

@KapitanOczywisty
Copy link
Contributor

With that said, if there's nothing left that needs to be changed, I'd like these changes to be reviewed and merged in if possible 😃

This will take a while, also with possible false positives from case insensitive keywords, I'd like to see the input from the community.

@adjenks
Copy link

adjenks commented Nov 9, 2022

I'd like to see the input from the community.

I still think that all syntax highlighters should offer a way of indicating the language of a string by writing it in a preceding comment, much like how you often can in markdown by writing the language after three back-ticks.

Markdown:

```js
console.log('hello');
```
Result:

console.log('hello');

Perhaps we could have a vote on anything that you want input on?

We could vote on manual language selections here using reactions:

(React 😄) Option 1 - Inside the text itself:

$query = '--sql
SELECT 1';

(React 🎉) Option 2 - Inside a preceding docblock:

/** @lang sql */
$query = 'SELECT 1';

(React 🚀) Option 3 - Inside a preceding non-docblock comment:

//L:sql
$query = 'SELECT 1';

(React 😕) Option 4 - None, only automatic language detection:

$query = 'SELECT 1';

I think auto-detection is fine, but if you can just manually select I think it would solve some problems and improve documentation and processing in some cases.

@KapitanOczywisty
Copy link
Contributor

KapitanOczywisty commented Nov 10, 2022

@adjenks First of all atom is dead, so there is no point discussing this PR any further. If you're using vscode there will be fork by me or somebody else and there you can continue - with vscode in mind.

Second thing options 2 and 3 are not viable to implement, textmate grammar doesn't have "memory" to keep track of language labels and use them later. Option 1 is non-standard and not popular (invented in this very PR), this is not something which should be added for hundreds of thousands of users.

And finally there is heredoc <<<SQL which is used, popular and since php 7.3 properly formattable.

There was somwhere issue to improve semantic tokens in vscode which would allow to support stuff like /** @lang sql */ without major overhead, however I don't think it is on the radar right now.

@adjenks
Copy link

adjenks commented Nov 13, 2022

@KapitanOczywisty Ah okay. Thank you for all the info.

I was interested in highlighting for both applications since I was using both when I joined the discussion. Someone had given me the impression that VSCode shared this library. Feel free to point me to the correct library for VSCode if one has been made.

That's unfortunate about the unviability. The semantic token issue you mentioned would be nice. I'll see if I can find it.

@KapitanOczywisty
Copy link
Contributor

Someone had given me the impression that VSCode shared this library.

It is still using it, but nobody is merging anything and repository will be archived in mid december.

Feel free to point me to the correct library for VSCode if one has been made.

Not yet, but we're thinking about how to proceed. I'll leave note about the new repository in the issues when time comes.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants