-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loader doesn't work correctly #1514
Comments
#1134 was also related. (As was mentioned in #1513). #1018 is possibly related. #712 also looks related. I've noticed that if a module fails to import at the repl, I can't re-import it after fixing it without restarting the repl, even though the equivalent procedure works in Python. I think the corrupted module object remains in the module cache. We're supposed to remove it if it doesn't load.
You're not hurting my feelings. I didn't write that part. And you're not saying everything in Hy is wrong, just that the loader doesn't work correctly. But we kind of already knew that, hence the many existing issue reports. |
Another issue worth reporting: Hy injects This is because the way |
Still a little more work to do. Found one more issue, we don't properly interact with I've made one more big change that I think is worth bringing up: I've switched the bytecode file from
Now I realize this means you can't distribute a Hy project as pure And, this change also lets us add Hy specific features to the bytecode file structure independent from Python, which I think can see some innovations. Imagine if we stored information about the macros that where used in compilation. We might be able to make Hy automatically invalidate bytecode whenever those macros change - even when the macros as defined in a different file. |
The proposed change, as it currently stands, adds a good 400 lines of code, so the proof is in the pudding: does it fix all the bugs it's intended to fix? Until then, it's hard for me to judge your code. The benefits of switching to our own bytecode format seem meager:
But if switching to our own bytecode format substantially simplifies the new code, it could be worth it. |
I'm planning on building a large test coverage for it - at the very minimum copy Python's own test set around its importlib - so lets wait for that before I make any concrete promises of correctness. But, at the very least, as far as I can tell:
I agree, but its technically possible. As for order, Hy would always load first because we put our loader first, but that's besides the point... That second point is a really good point I didn't consider. Should have realized it too because I've done it to have literal includes from files and that, while at least deterministic, suffers from a similar problem.
At the very least, and the strongest argument for it, is it means one loader for everyone and one less place where we need to chase Python.
Fortunately, end delta is going to be smaller. I uploaded some temporary files by accident. Currently looking at a 200 line increase. I uploaded two versions of the Python 2 support... Part of this is stuff I have to backport from Python 3, like |
Come to think of it, it might be worth having Hy's importer check for a corresponding |
Yeah, maybe, but it might get complicated to do right when we're looking at the interaction between modules and packages. A |
It also means losing compatibility with any Python tools that directly act on .pyc files. I don't know how many of those are important or if they will become so, but astor is important for us now. I wouldn't approve of a change that breaks hy2py. I think mypy is also important, but it's designed for Python, not Hy. I expect we'd get it working at the level of mypy's typed AST, or a perfected hy2py via astor, not directly from bytecode, but I'm not sure. A custom loader could also give us more natural dynamic variables hylang/hyrule#51 and possibly better namespacing and autoimports #1407 by customizing the module object or module dict. And possibly serialization of arbitrary objects at compile time #919, which is nice for a Lisp to have. |
As far as I can tell, hy2py doesn't work at the bytecode level though.
Yes, this is probably how it would work in practise, no need for bytecode. There has also been talk of adding a plugin system to mypy: python/mypy#1240 We might be able to just teach mypy how to read Hy definitions. At the very least, I was considering trying to add stub file support to hy2py with #1482
It would also have consequences for packaging, actually. While I think Hy could really benifit if we add some setuptools integration, until that's done, maybe its best to revert it for now and reconsider it in the future. |
I will say that seeing another loader change that touches bytecode is a bit concerning; the last one had to be reverted because it seemed to work but broke under some hard-to-trace cases. |
I've dug into #1085 and I think the bytecode issue is actually a red herring. The problem is in how that patch replaces the default FileLoader. If you look at the contents of
to
I tried to load some code with -m with that patch and I got this:
This is a very similar problem to what was reported in the first place, and it happens even if there's no bytecode on the system. And, for completness this is the contents of the file that's run: "Module to dump __spec__, __loader__, and other metadata"
(print "Welcomd to Hy")
(print "name:" __name__)
(print "doc:" __doc__)
(print "package:" __package__)
(print "loader:" __loader__)
(if-python2 () (print "spec:" __spec__))
(print "file:" __file__)
(if-python2 () (print "cached:" __cached__)) I think those changes just broke module resolution, and a side effect of that might have been loading different code (bytecode or source) than expected. |
But yes, we should be careful. I'm not going to be comfortable letting it get merged until I get around to writing a full test suite just for it. I just won't have time till the next weekend. |
@kirbyfan64 dug into it a little bit, I'm not exactly right - for some reason that polyloader tries to load everything run we try to load a package with -m: Added a print statement above that
But I think it makes my point that the problem's with that loader is it too was incorrectly implemented and loads weird things, rather than a problem with bytecode persay. I mean, I'd be really surprised if we where emitting incorrect bytecode in the polyloader since it defers to Python internals... |
Fiddling with the loader further, I think I'm getting really close to something really nice and simple, and as close to correct as I understand Python's loading mechanism to be, but I think there are some interesting corner cases that need to be documented. I initially tried to subclass as much as possible from importlib.machinery, but I think I'll have to provide my own implementations for a few things interntionally. For example, say I had the following package:
And I open python and try to import it without first importing hy (Python 3.3 and newer):
We've loaded it successfully, but not as Hy code. This is because Hy packages slip under the radar and trick python into thinking we're defining a namespace. See PEP 420. Now this PEP is is interesting, and suff introduced in there would probably be the backbone for emulating clojures namespaces, should we go down that road (as I understand them, at least - I'm also a clojure newbie). But it does lead to some interesting pitfalls that a beginner to Hy might fall into (heck, I didn't know about PEP 420 until yesterday, and I consider myself experienced) when trying to glue into Python. Its also something that complicated the new loader because it means I can't just subclass the default importlib machinery and change the appropriate bits. When I promote the Hy sys.meta_path entry to first, suddently Python code starts showing up as namespace modules. The Hy loader must specificly be first and not directly support the scheme established in PEP 420. Another potential issue I've not fully dug into is how safe is it to even mix Python and Hy code inside a single module at all. Python, as I understand it, recusrively imports packages, so We've gotten around this in the past because the current Hy loader doesn't handle relative imports the same way Python does and treats everything as a fully qualified module. It looks like a valid workaround, but I can't speak for its overall correctness. That said, if we make the Hy loader preferential, and intentionally break PEP 420 support, I'm 99% sure we'll mitigate most things, but this seems "antagonistic" enough to maybe warrent documentation. I'm still investigating, learning, so expect things to change. But if I'm saying something obviously wrong, please correct me. |
Okay, got it working for Python 3 in a way that's, as far as I know, 100% compliant to how the Python module system works.
On top of this, we support the file system caching system, so imports should be faster, and we can independently extend the Python 3 module loader without worrying about clobbering Python at all. Tests are almost done (most of these problems cropped up when writing tests), so hopefully tomorrow, barring any other major underestimations of how the system works. |
Another tool that appears to use bytecode is https://github.com/pybee/voc which would potentially let us use Hy on Android, but probably not if we break compatibility by using our own .hyc format. |
This should probably be closed by #1672, no? |
Probably, yes. |
Okay, as a preface, I don't like coming into a project I've just started contributing to and saying everything is wrong. So before unloading a giant structural change to the project, I'd like to explain what I found, explain why its wrong, and how to move forward.
As background, I've been actually hacking really hard at Hy, trying my hand at various open issues, like #1416 and #1482, but keep bumping into issues around Hy's importer. Frustrated, I started really digging into how the system works, as you guys may have noticed opening issues like #1512
So, having dug into depth of how the Hy importer works and I'm very confident to say we've gotten it wrong. This isn't just an issue of using deprecated APIs, but actually implementing the API incorrectly. I'm at the point that I'm not sure now how Hy ever worked to begin with.
The biggest sign that things are wrong is that
runpy
doesn't work with Hy modules.runpy
uses the standard import mechanism (PEP 302), and since we hook into that, hy modules should Just Work.In fact, the
hy
command line utility probably should be usingrunpy
forhy <file>
andhy -m <module>
instead of its home rolled implementations. Once it does (and the various necessary fixes are in place), virtually all the outstanding issues like #1513 and #1466 just disappear. There is actually no need to roll our own.That said, as a side effect, it also means that
hy -m <module>
can launch python code. But I argue this is really desirable. It allows Python and Hy to glue together even better. For example:This should be completely legal, and lets me easily launch an aiohttp server (python, naturally), but with a Hy based app factory. Currently, I'd have to provider a custom shim to make this work.
So, core issues of what's wrong and stuff that should be done:
get_code
andis_package
(even on Python 2). The causesrunpy
to chock__path__
. This causespkgutil.find_loader
to misbehave, which then preventsrunpy
from working as it can't find the appropriate loader inside Hy packages.sys.modules
is imperfect. See Fix self imports #1513 for detailsimportlib
is actually cleaner. Most of the implementation can be shared.I'm 98% there of dealing with all these issues and closing a whole pile of open bugs. Its also a much more straightforward implementation than #1085 so should be more maintainable.
The text was updated successfully, but these errors were encountered: