py-tree-sitter-languages is unmaintained #7
Comments
I don't have a big investment in this decision, so my opinion might not be worth much but I'm sharing it anyway: I agree that there is likely space for a best of both worlds option. Maintained, smaller binary, adaptable, the dream! Also ditto what @jvmncs said on |
Hi, author of |
This does make packaging aider, which I am working on, something of a sticky bit. It is possible currently to get around the build issues for py-tree-sitter-languages by pinning |
Ultimately this is resulting in aider not really being able to package for Python 3.12 easily, as tree-sitter 0.21 doesn't like newer versions of Python. |
@greg-hellings, I don't quite understand the Python 3.12 concern. According to tree-sitter/py-tree-sitter@ce1af66, |
@jmehnle Indeed tree-sitter did come out with a 0.21.2 that supported Python 3.12. But most people who are consuming this outside of a |
Ok, so there's not a specific major problem with Python 3.12. I understand that recent versions of the |
Correct, the issue is that the dep makes moving forward with Python versions and grep-ast more tedious. Not that there is directly a problem with 3.12 but that the issue is with a stale dep. |
…ress maintenance

Resolves Aider-AI#7. This commit replaces the tree-sitter language pack from grantjenks/py-tree-sitter-languages with Goldziher/tree-sitter-language-pack, significantly expanding language support and addressing maintenance issues.

Key changes include:
1. Greatly increases the number of supported languages, including Swift and Svelte.
2. Resolves dependency on an unmaintained package that was forcing grep-ast to use an old tree-sitter version (0.21).
3. Unlocks the ability to use more recent tree-sitter versions.
4. Updates requirements.txt to use tree-sitter-language-pack>=0.2.0.
5. Increments the version number to 0.3.4-dev in setup.py.
6. Adds extensive test cases for parsing various languages in test_parsers.py.

Notable changes:
- Removed support for DOT, OCaml, ql (GitHub CodeQL), and tsq (Tree Sitter Query) due to their absence in the new pack.
- Removed potentially incorrect mappings for .gomod, .sqlite, and .regex extensions.
- Replaced the uncommon ".et" mapping for "embeddedtemplate" with mappings for ERB and EJS, which are common uses of embedded templates.
- Re-enabled markdown, as the new pack uses a different markdown grammar that likely doesn't suffer from previous bugs.
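To illustrate the extension-mapping change described in the commit message, here is a minimal sketch of an extension-to-grammar lookup. It is not the actual grep-ast code: the table name `EXTENSION_TO_LANGUAGE` and the function `language_for` are hypothetical, and only the ERB/EJS entries are taken from the commit message. The other entries, including the grammar names "swift" and "svelte", are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional

# Hypothetical excerpt of an extension-to-grammar table. Only the ERB/EJS
# entries come from the commit message; the rest are illustrative guesses.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".swift": "swift",
    ".svelte": "svelte",
    ".erb": "embeddedtemplate",
    ".ejs": "embeddedtemplate",
}


def language_for(filename: str) -> Optional[str]:
    """Return the grammar name for a filename, or None if it is unmapped."""
    # Only the final suffix is considered, so "index.html.erb" maps via ".erb".
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix)
```

Returning None for unmapped extensions lets a caller skip such files rather than guess at a grammar.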
I've just opened PR #8 (Draft) to migrate grep-ast to Please take a look and let me know your thoughts! |
The PR looks great, thanks for preparing it. Any thoughts on how the pip install of -language-pack compares to -languages? On my Mac, -pack took 4 minutes and ~130MB whereas -languages takes <2 seconds and 80MB. It seems like -languages had pre-built wheels and -pack is being built locally? But more than the time difference, will -pack install cleanly in roughly the same set of environments that -languages did? The main reason I adopted -languages as a dependency was because it reliably installed in a wide range of environments. |
It sounded like @Goldziher was open to improving the experience with -pack, up above. It's generally considered bad form to have a derived file, like a wheel, included in your source tree but it sounds like for -languages it was a huge performance boost to ship them in that manner. |
I do think size and build time are issues needing careful consideration. I too built tree-sitter-language-pack locally. For me, the size and build time, despite being substantial, aren't all that significant compared to the benefits. But, I'm sure we can (and should) do better: @Goldziher I see that the published files on pypi.org/tree-sitter-language-pack don't include any wheels, but that you've worked on some infrastructure to build and publish wheels. This seems like it'd be a non-trivial undertaking. Can you comment on the status/challenges of that work? @paul-gauthier If tree-sitter-language-pack builds and publishes wheels with broad enough compatibility, how would that impact your evaluation of (the draft) PR #8? Besides adding wheels, we could implement a modular system for language support. However, that'd be a much larger undertaking and it's probably better to focus first on the immediate benefits of migrating to a maintained package with expanded language coverage, despite the increased size and build time. (For anyone curious, here are the files for grantjenks's pack on PyPI, including wheels, and here're its GitHub Actions workflow and build script.) |
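For anyone who wants to check the wheel situation themselves, here is a rough sketch that counts wheels and source distributions in the latest release of a package, using the PyPI JSON API (`https://pypi.org/pypi/<name>/json`, whose `urls` list carries a `packagetype` field per file). The helper names `summarize_release_files` and `fetch_release_files` are made up for this example, and the result depends on whatever is published at the time of the request.

```python
import json
from urllib.request import urlopen


def summarize_release_files(files):
    """Count wheels and sdists in a list of PyPI file records."""
    summary = {"wheels": 0, "sdists": 0}
    for record in files:
        kind = record.get("packagetype")
        if kind == "bdist_wheel":
            summary["wheels"] += 1
        elif kind == "sdist":
            summary["sdists"] += 1
    return summary


def fetch_release_files(package):
    """Fetch the file records of the latest release from the PyPI JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urlopen(url, timeout=10) as response:
        return json.load(response)["urls"]
```

Calling `summarize_release_files(fetch_release_files("tree-sitter-language-pack"))` and comparing it with the same call for `tree-sitter-languages` should show whether pip has any pre-built wheels to choose from. A count of zero wheels means every install falls back to a source build.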
The PR looks great. I would love to support all those languages. My only hesitation is the end user pip install experience:
|
@paul-gauthier Does adding pre-built wheels to tree-sitter-language-pack address your install time concerns?
The user build failure rate would likely increase somewhat due to the larger number of grammar projects, but the extent is hard to predict. I suspect the increase would be small, as build environments are often generally broken rather than failing on specific projects. (Importantly, if the new pack's pre-built wheels cover the same targets as the old pack's, the fallback rate to source builds should be identical.) With the unmaintained language pack's lack of ongoing support for newer systems, we should expect increases in both fallbacks to user builds and user build failures over time. A system for modular language packs could be ideal, e.g.:
This would install expected "core" languages plus Gleam and Zig. Other language grammars could be added without concern for bloat or risking breaking user builds. However, I'm less sure that the time and effort required for this modular approach is the best way forward now. |
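As a rough illustration of the "core plus extras" idea in the comment above, the selection logic could be sketched as below. Everything here is hypothetical: neither language pack exposes a `resolve_languages` function, and the contents of `CORE_LANGUAGES` are an assumed placeholder, not a proposal for the actual core set.

```python
# Hypothetical names throughout: neither language pack exposes this API.
# The core set below is a placeholder chosen only for illustration.
CORE_LANGUAGES = frozenset({"python", "javascript", "typescript", "go", "rust"})


def resolve_languages(extras=()):
    """Return the sorted grammar names for the core set plus requested extras."""
    # Normalize user input so "Gleam" and " zig " match lowercase grammar names.
    requested = {name.strip().lower() for name in extras if name.strip()}
    return sorted(CORE_LANGUAGES | requested)
```

With this shape, asking for Gleam and Zig adds only those two grammars on top of the core set, so extra grammars never affect users who did not request them.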
Yes, almost certainly. |
@Goldziher how are things going with ts-lang-pack? I've been experimenting with it, and it looks like I could swap it in for py-ts-langs. My install of ts-lang-pack today was quick, without a long build process. So that was nice to see. The README mentions that you are building wheels now, which is great. I see you have some open issues about build problems on different environments. Any sense of how reliably users are able to install ts-lang-pack? Aider has users on a wide range of platforms, so reliable and hassle free install is a key priority for me. |
@paul-gauthier Unfortunately, Goldziher/tree-sitter-language-pack does not have published wheels. Perhaps your fast build used a locally cached previous build? Goldziher clearly did work to include pre-built wheels, as noted in the README and evidenced in a GitHub Actions workflow (5 months ago). I'm sure someone could dig in and finish what Goldziher started. Even if that work is trivial, there'd still be the matter of actually getting them published to PyPI, and maybe needing a separate fork. For reference, To avoid an uncontrolled dependency, I’d lean towards Aider owning a modular tree-sitter language pack with prebuilt wheels under the Aider-AI org. Or, maybe explore bridging to another ecosystem in order to depend on something widely used and maintained, if such a thing exists at all. For example:
|
Feel free to open a pr with updates as you see fit |
I am also watching this fork: |
Hi @paul-gauthier, thanks for your work on aider. I've been having a blast using it.

This project uses https://github.com/grantjenks/py-tree-sitter-languages, but that project is unmaintained and has been for several months. This forces grep-ast to be stuck on an old tree-sitter version (0.21) and also limits the number of parsers that can be used by upstream projects (including aider). There is a hacky way to install new language parsers, but that dependency will seemingly be stuck on tree-sitter 0.21 indefinitely, which seems bad.

Another project has sprung up called tree-sitter-language-pack, however it's got a slightly different intention (large collection of grammar binaries, as opposed to small/focused one for the most popular languages only). That project is mainly an integration of this unmerged tree-sitter-languages PR with a bunch of new grammar binaries added. There's probably space for a minimal version that bundles just the top N languages and natively allows users to install their own binaries at will (so, essentially, just a version of tree-sitter-languages with that PR merged, and some different grammar binaries).

If you want to replace tree-sitter-languages with tree-sitter-language-pack, I'd be happy to open a PR. Note that the source binary size is quite a bit larger:
- tree-sitter-language-pack: 35.7 MB, no platform-specific builds
- tree-sitter-languages: ~9.0MB, depending on the platform