Loader doesn't work correctly #1514

Closed
4 tasks
vodik opened this issue Feb 18, 2018 · 18 comments

@vodik
Contributor

vodik commented Feb 18, 2018

Okay, as a preface, I don't like coming into a project I've just started contributing to and saying everything is wrong. So before unloading a giant structural change to the project, I'd like to explain what I found, explain why it's wrong, and propose how to move forward.

As background, I've been hacking really hard at Hy, trying my hand at various open issues like #1416 and #1482, but I keep bumping into issues around Hy's importer. Frustrated, I started really digging into how the system works, as you may have noticed from my opening issues like #1512.

So, having dug deep into how the Hy importer works, I'm very confident in saying we've gotten it wrong. This isn't just an issue of using deprecated APIs, but of actually implementing the API incorrectly. I'm at the point where I'm not sure how Hy ever worked to begin with.

The biggest sign that things are wrong is that runpy doesn't work with Hy modules. runpy uses the standard import mechanism (PEP 302), and since we hook into that, hy modules should Just Work.

In fact, the hy command line utility should probably be using runpy for hy <file> and hy -m <module> instead of its home-rolled implementations. Once it does (and the various necessary fixes are in place), virtually all the outstanding issues like #1513 and #1466 just disappear. There is actually no need to roll our own.

That said, as a side effect, it also means that hy -m <module> can launch python code. But I argue this is really desirable. It allows Python and Hy to glue together even better. For example:

hy -m aiohttp.web myproject:app_factory

This should be completely legal, and lets me easily launch an aiohttp server (Python, naturally), but with a Hy-based app factory. Currently, I'd have to provide a custom shim to make this work.
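For illustration, here is a minimal Python sketch of what delegating `hy -m <module>` to runpy could look like. It only uses the standard `runpy.run_module` call. The module name `runpy_demo_app` and the temporary directory are made up for the demo, and the module here is plain Python (a Hy module would additionally need a working Hy import hook, which is what this issue is about).

```python
import os
import runpy
import sys
import tempfile

# Hypothetical module written to a temporary directory for the demo.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "runpy_demo_app.py"), "w") as fh:
    fh.write("import sys\nran_as = __name__\nargv0 = sys.argv[0]\n")

sys.path.insert(0, tmp_dir)
try:
    # Resolve the module through the normal import machinery, then execute
    # it as __main__, the same way `python -m runpy_demo_app` would.
    result = runpy.run_module(
        "runpy_demo_app", run_name="__main__", alter_sys=True
    )
finally:
    sys.path.remove(tmp_dir)

print(result["ran_as"])
```

Because runpy goes through whatever finders are registered, any module type with a conforming loader can be launched this way without special handling in the command line tool.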

So, core issues of what's wrong and stuff that should be done:

  • MetaLoader is missing expected public APIs: get_code and is_package (even on Python 2). This causes runpy to choke.
  • importer puts incorrect values for __path__. This causes pkgutil.find_loader to misbehave, which then prevents runpy from working as it can't find the appropriate loader inside Hy packages.
  • Logic of actually setting up sys.modules is imperfect. See Fix self imports #1513 for details
  • Create a separate loader for Python 3. The current loader is sufficient for Python 2 once fixed up, but basically built with deprecated APIs. Building a new loader around importlib is actually cleaner. Most of the implementation can be shared.
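To make the first bullet concrete, here is a hedged Python 3 sketch of a loader that exposes `get_code` and `is_package`, which is what runpy asks a loader for. This is not Hy's MetaLoader. `DemoLoader`, `DemoFinder`, and the in-memory `SOURCES` table are hypothetical names used only for this example, and the "source" is plain Python rather than Hy.

```python
import importlib.abc
import importlib.util
import runpy
import sys

# Hypothetical in-memory "source tree" standing in for files on disk.
SOURCES = {"loader_demo_mod": "answer = 6 * 7\n"}


class DemoLoader(importlib.abc.InspectLoader):
    def get_source(self, fullname):
        try:
            return SOURCES[fullname]
        except KeyError:
            raise ImportError(fullname)

    def get_code(self, fullname):
        # runpy calls this to obtain the code object it will execute.
        source = self.get_source(fullname)
        return compile(source, "<demo:%s>" % fullname, "exec")

    def is_package(self, fullname):
        # Nothing in this demo is a package.
        return False


class DemoFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # let the next finder try
        return importlib.util.spec_from_loader(fullname, DemoLoader())


finder = DemoFinder()
sys.meta_path.insert(0, finder)
try:
    result = runpy.run_module("loader_demo_mod")
finally:
    sys.meta_path.remove(finder)

print(result["answer"])
```

Without `get_code`, `runpy.run_module` has no way to obtain a code object from the loader, which is consistent with the failure described above.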

I'm 98% of the way to dealing with all these issues and closing a whole pile of open bugs. It's also a much more straightforward implementation than #1085, so it should be more maintainable.

@gilch
Member

gilch commented Feb 19, 2018

#1134 was also related. (As was mentioned in #1513).

#1018 is possibly related.

#712 also looks related. I've noticed that if a module fails to import at the repl, I can't re-import it after fixing it without restarting the repl, even though the equivalent procedure works in Python. I think the corrupted module object remains in the module cache. We're supposed to remove it if it doesn't load.

I don't like coming into a project I've just started contributing to and saying everything is wrong.

You're not hurting my feelings. I didn't write that part. And you're not saying everything in Hy is wrong, just that the loader doesn't work correctly. But we kind of already knew that, hence the many existing issue reports.
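The point about removing a module that fails to load can be sketched as follows. This is a simplified illustration of the cleanup rule in Python, not Hy's importer. The helper `load_from_source` and the module name `cache_demo` are hypothetical.

```python
import sys
import types


def load_from_source(name, source):
    """Execute source as module `name`; drop it from sys.modules on failure."""
    module = types.ModuleType(name)
    # Register first so that circular imports can see the module.
    sys.modules[name] = module
    try:
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
    except BaseException:
        # Remove the half-initialised module so a later import starts clean.
        sys.modules.pop(name, None)
        raise
    return module


# First attempt fails while executing the module body.
try:
    load_from_source("cache_demo", "value = 1 / 0\n")
except ZeroDivisionError:
    pass
left_behind = "cache_demo" in sys.modules

# After "fixing" the source, the import works without restarting anything.
fixed = load_from_source("cache_demo", "value = 1\n")
print(left_behind, fixed.value)
```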

@vodik
Contributor Author

vodik commented Feb 19, 2018

Another issue worth reporting: Hy injects sys.path.insert(0, "") but can't load from that path.

This is because of the way find_on_path was implemented: when processing import foo with that empty path entry, it constructs /foo/__init__.hy and /foo.hy.
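The thread does not show the body of find_on_path, so the following Python sketch only assumes one way such rooted paths could arise (plain string concatenation) and contrasts it with treating the empty entry as the current directory. Both helper names are hypothetical.

```python
import os


def naive_candidates(path_entry, name):
    # Assumed shape of the bug: with path_entry == "" the results are
    # rooted at "/" instead of the current directory.
    return [
        path_entry + "/" + name + "/__init__.hy",
        path_entry + "/" + name + ".hy",
    ]


def candidates(path_entry, name):
    # An empty sys.path entry means "the current working directory".
    base = path_entry or os.getcwd()
    return [
        os.path.join(base, name, "__init__.hy"),
        os.path.join(base, name + ".hy"),
    ]


print(naive_candidates("", "foo"))
print(candidates("", "foo"))
```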

@vodik
Contributor Author

vodik commented Feb 20, 2018

Still a little more work to do. Found one more issue: we don't properly interact with sys.path_importer_cache. But I have something that, with one kludge related to that cache, integrates properly with Python now.

I've made one more big change that I think is worth bringing up: I've switched the bytecode file from .pyc to .hyc and added a HY_MAGIC_NUMBER header to the start of that file. There are two immediate benefits from this:

  • Module names can already collide (e.g. foo.py and foo.hy can exist side by side). There's no reason that the bytecode has to share a filename.
  • Simplifies supporting multiple versions of Python - we no longer have to worry about differences between python versions, all versions of Hy can read and write the same format. What's nice is that if we add support for PEP 552, we can support it across the entire Hy ecosystem, it doesn't have to be gated to Python 3.7.

Now I realize this means you can't distribute a Hy project as pure .pyc files anymore, but you couldn't really do that anyways as Hy is pretty tied to its standard library. And if the standard library is present, then an .hyc only distribution should work just fine as well.

And this change also lets us add Hy-specific features to the bytecode file structure independent of Python, which I think could lead to some innovations. Imagine if we stored information about the macros that were used in compilation. We might be able to make Hy automatically invalidate bytecode whenever those macros change - even when the macros are defined in a different file.
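As a rough illustration of the `.hyc` idea, the Python sketch below writes a custom magic header followed by a marshalled code object and checks the header on load. The value of `HY_MAGIC_NUMBER`, the file layout, and the helper names are assumptions made for this example; the thread does not specify them. Note that marshalled code objects are still tied to the interpreter version that produced them, so a real header would presumably also need to record that.

```python
import marshal
import os
import tempfile

HY_MAGIC_NUMBER = b"HYC\x00"  # hypothetical value, for illustration only


def write_hyc(code, path):
    with open(path, "wb") as fh:
        fh.write(HY_MAGIC_NUMBER)
        fh.write(marshal.dumps(code))


def read_hyc(path):
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:len(HY_MAGIC_NUMBER)] != HY_MAGIC_NUMBER:
        raise ImportError("bad magic number in %r" % path)
    return marshal.loads(data[len(HY_MAGIC_NUMBER):])


hyc_path = os.path.join(tempfile.mkdtemp(), "demo.hyc")
write_hyc(compile("x = 40 + 2", "<demo>", "exec"), hyc_path)

namespace = {}
exec(read_hyc(hyc_path), namespace)
print(namespace["x"])
```

Extra fields such as a list of macro source files could be placed between the magic number and the marshalled code in the same way.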

@Kodiologist
Member

The proposed change, as it currently stands, adds a good 400 lines of code, so the proof is in the pudding: does it fix all the bugs it's intended to fix? Until then, it's hard for me to judge your code.

The benefits of switching to our own bytecode format seem meager:

  • It's a Bad Idea to have both foo.py and foo.hy in the first place, because how would (import foo) know which to use?
  • Compilation is inherently nondeterministic in Hy because of code like (defmacro m [] (import random) (random.random)) (print (m)). This is why checking whether required files have changed isn't a general solution to the problem of stale bytecode (.pyc files don't get recreated when they use a macro from a file that was changed #1324).

But if switching to our own bytecode format substantially simplifies the new code, it could be worth it.

@vodik
Contributor Author

vodik commented Feb 20, 2018

The proposed change, as it currently stands, adds a good 400 lines of code, so the proof is in the pudding: does it fix all the bugs it's intended to fix? Until then, it's hard for me to judge your code.

I'm planning on building a large test suite for it - at the very minimum, copying Python's own test set for importlib - so let's wait for that before I make any concrete promises of correctness. But, at the very least, as far as I can tell:

  • __name__, __file__, __path__, __package__, __loader__, etc., are set correctly and as expected.
  • runpy.run_module and runpy.run_path work as expected when pointed at Hy code
  • Properly recover when a module fails to load. Reloading modules works as expected now. Haven't tested IPython autoreload yet though.

It's a Bad Idea to have both foo.py and foo.hy in the first place...

I agree, but it's technically possible. As for order, Hy would always load first because we put our loader first, but that's beside the point...

That second point is a really good point I didn't consider. Should have realized it too because I've done it to have literal includes from files and that, while at least deterministic, suffers from a similar problem.

But if switching to our own bytecode format substantially simplifies the new code, it could be worth it.

At the very least, and this is the strongest argument for it, it means one loader for everyone and one less place where we need to chase Python.

adds a good 400 lines of code

Fortunately, end delta is going to be smaller. I uploaded some temporary files by accident. Currently looking at a 200 line increase. I uploaded two versions of the Python 2 support...

Part of this is stuff I have to backport from Python 3, like atomic_write, to try and make file writing more robust. This is new behaviour that goes beyond what the old code did, but matches Python's own behaviour. It's probably desirable to have, as having multiple processes writing out the same Hy bytecode is potentially racy.
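A common way to get this kind of robustness is to write to a temporary file in the same directory and then rename it over the target. The Python 3 sketch below shows that pattern. It is not the backported code from the patch, and the function body is an assumption about what an `atomic_write` helper might do. `os.replace` requires Python 3.3 or newer, so a Python 2 version would need a different rename call.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the target so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


target_dir = tempfile.mkdtemp()
target = os.path.join(target_dir, "module.bytecode")
atomic_write(target, b"first version")
atomic_write(target, b"second version")
with open(target, "rb") as fh:
    contents = fh.read()
print(contents)
```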

@Kodiologist
Member

It's a Bad Idea to have both foo.py and foo.hy in the first place...

I agree, but it's technically possible.

Come to think of it, it might be worth having Hy's importer check for a corresponding *.py file and raise an error if it's there.

@vodik
Contributor Author

vodik commented Feb 20, 2018

Yeah, maybe, but it might get complicated to do right when we're looking at the interaction between modules and packages. A foo.hy would overshadow foo.py, foo/__init__.py, and, depending on the situation, foo/__main__.py.
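A hedged Python sketch of the simplest form of such a check is shown below: given a directory and a module name, list the Python files that a `foo.hy` would shadow. The helper name is hypothetical, and it deliberately ignores the `__main__.py` case and anything outside a single directory, which is where the complications mentioned above would start.

```python
import os
import tempfile


def shadowed_python_files(directory, name):
    """Return Python sources in `directory` that share the module name."""
    candidates = [
        os.path.join(directory, name + ".py"),
        os.path.join(directory, name, "__init__.py"),
    ]
    return [path for path in candidates if os.path.isfile(path)]


# Demo layout: foo.hy next to foo.py and foo/__init__.py.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "foo"))
for relative in ("foo.hy", "foo.py", os.path.join("foo", "__init__.py")):
    open(os.path.join(demo_dir, relative), "w").close()

conflicts = shadowed_python_files(demo_dir, "foo")
print(conflicts)
```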

@gilch
Member

gilch commented Feb 20, 2018

one loader for everyone and one less place where we need to chase Python.

It also means losing compatibility with any Python tools that directly act on .pyc files. I don't know how many of those are important or if they will become so, but astor is important for us now. I wouldn't approve of a change that breaks hy2py.

I think mypy is also important, but it's designed for Python, not Hy. I expect we'd get it working at the level of mypy's typed AST, or a perfected hy2py via astor, not directly from bytecode, but I'm not sure.

A custom loader could also give us more natural dynamic variables hylang/hyrule#51 and possibly better namespacing and autoimports #1407 by customizing the module object or module dict. And possibly serialization of arbitrary objects at compile time #919, which is nice for a Lisp to have.

@vodik
Contributor Author

vodik commented Feb 20, 2018

I wouldn't approve of a change that breaks hy2py.

As far as I can tell, hy2py doesn't work at the bytecode level though.

I expect we'd get it working at the level of mypy's typed AST, or a perfected hy2py via astor, not directly from bytecode, but I'm not sure.

Yes, this is probably how it would work in practise, no need for bytecode. There has also been talk of adding a plugin system to mypy: python/mypy#1240. We might be able to just teach mypy how to read Hy definitions. At the very least, I was considering trying to add stub file support to hy2py with #1482.

It also means losing compatibility with any Python tools that directly act on .pyc files

It would also have consequences for packaging, actually. While I think Hy could really benefit if we add some setuptools integration, until that's done, maybe it's best to revert it for now and reconsider it in the future.

@refi64
Contributor

refi64 commented Feb 21, 2018

I will say that seeing another loader change that touches bytecode is a bit concerning; the last one had to be reverted because it seemed to work but broke under some hard-to-trace cases.

@vodik
Contributor Author

vodik commented Feb 21, 2018

I've dug into #1085 and I think the bytecode issue is actually a red herring. The problem is in how that patch replaces the default FileLoader. If you look at the contents of sys.path_importer_cache, you'll notice that the patch effectively trashes a virtualenv. It goes from this:

=> (pprint sys.path_importer_cache)
{'/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin/hy': None,
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/collections': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/collections'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/encodings': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/encodings'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/importlib': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/importlib'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/lib-dynload': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/lib-dynload'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/astor': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/astor'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/packages': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/packages'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/packages/colorama': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/packages/colorama'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/textui': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/clint/textui'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/_vendor': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/_vendor'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/_vendor/packaging': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/_vendor/packaging'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/extern': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/pkg_resources/extern'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/rply': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/rply'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python36.zip': None,
 '/home/simon/src/MonkeyType': FileFinder('/home/simon/src/MonkeyType'),
 '/home/simon/src/hy': FileFinder('/home/simon/src/hy'),
 '/home/simon/src/hy/hy': FileFinder('/home/simon/src/hy/hy'),
 '/home/simon/src/hy/hy/lex': FileFinder('/home/simon/src/hy/hy/lex'),
 '/home/simon/src/hy/hy/models': FileFinder('/home/simon/src/hy/hy/models'),
 '/usr/lib/python3.6': FileFinder('/usr/lib/python3.6'),
 '/usr/lib64/python3.6': FileFinder('/usr/lib64/python3.6'),
 '/usr/lib64/python3.6/ctypes': FileFinder('/usr/lib64/python3.6/ctypes'),
 '/usr/lib64/python3.6/email': FileFinder('/usr/lib64/python3.6/email'),
 '/usr/lib64/python3.6/json': FileFinder('/usr/lib64/python3.6/json'),
 '/usr/lib64/python3.6/urllib': FileFinder('/usr/lib64/python3.6/urllib'),
 '/usr/lib64/python3.6/xml': FileFinder('/usr/lib64/python3.6/xml'),
 '/usr/lib64/python3.6/xml/parsers': FileFinder('/usr/lib64/python3.6/xml/parsers')}

to

=> (pprint sys.path_importer_cache)
{'/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/lib-dynload': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/lib-dynload'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/astor': FileFinder('/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/astor'),
 '/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python36.zip': None,
 '/home/simon/src/hy': FileFinder('/home/simon/src/hy'),
 '/home/simon/src/hy/hy': FileFinder('/home/simon/src/hy/hy'),
 '/home/simon/src/hy/hy/core': FileFinder('/home/simon/src/hy/hy/core'),
 '/usr/lib/python3.6': FileFinder('/usr/lib/python3.6'),
 '/usr/lib64/python3.6': FileFinder('/usr/lib64/python3.6')}

I tried to load some code with -m with that patch and I got this:

hy -m dumpers.lvl2
/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/IPython/kernel/__init__.py:13: ShimWarning: The `IPython.kernel` package has been deprecated since IPython 4.0.You should import from ipykernel or jupyter_client instead.
  "You should import from ipykernel or jupyter_client instead.", ShimWarning)
Traceback (most recent call last):
  File "/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin/hy", line 11, in <module>
    load_entry_point('hy', 'console_scripts', 'hy')()
  File "/home/simon/src/hy/hy/cmdline.py", line 344, in hy_main
    sys.exit(cmdline_handler("hy", sys.argv))
  File "/home/simon/src/hy/hy/cmdline.py", line 318, in cmdline_handler
    return run_module(options.mod)
  File "/home/simon/src/hy/hy/cmdline.py", line 201, in run_module
    if mod[1] == mod_name), None)
  File "/home/simon/src/hy/hy/cmdline.py", line 200, in <genexpr>
    mod = next((mod for mod in pkgutil.walk_packages()
  File "/usr/lib64/python3.6/pkgutil.py", line 107, in walk_packages
    yield from walk_packages(path, info.name+'.', onerror)
  File "/usr/lib64/python3.6/pkgutil.py", line 92, in walk_packages
    __import__(info.name)
  File "/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/matplotlib/tests/__init__.py", line 17, in <module>
    'The baseline image directory does not exist. '
OSError: The baseline image directory does not exist. This is most likely because the test data is not installed. You may need to install matplotlib from source to get the test data.

This is a very similar problem to what was reported in the first place, and it happens even if there's no bytecode on the system. And, for completeness, this is the contents of the file that's run:

"Module to dump __spec__, __loader__, and other metadata"
(print "Welcome to Hy")
(print "name:" __name__)
(print "doc:" __doc__)
(print "package:" __package__)
(print "loader:" __loader__)
(if-python2 () (print "spec:" __spec__))
(print "file:" __file__)
(if-python2 () (print "cached:" __cached__))

I think those changes just broke module resolution, and a side effect of that might have been loading different code (bytecode or source) than expected.

@vodik
Contributor Author

vodik commented Feb 21, 2018

But yes, we should be careful. I'm not going to be comfortable letting it get merged until I get around to writing a full test suite just for it. I just won't have time till the next weekend.

@vodik
Contributor Author

vodik commented Feb 21, 2018

@kirbyfan64 dug into it a little bit, and I'm not exactly right - for some reason that polyloader tries to load everything when we try to load a package with -m:

Added a print statement above that __import__ line and we see this:

$ hy -m dumpers.lvl2
!!! collections
!!! distutils
!!! distutils.command
!!! distutils.tests
!!! encodings
!!! importlib
!!! asyncio
!!! concurrent
!!! concurrent.futures
!!! ctypes
!!! ctypes.macholib
!!! ctypes.test
!!! curses
!!! dbm
!!! email
!!! email.mime
!!! ensurepip
!!! html
!!! http
!!! idlelib
!!! idlelib.idle_test
!!! json
!!! lib2to3
!!! lib2to3.fixes
!!! lib2to3.pgen2
!!! lib2to3.tests
!!! logging
!!! multiprocessing
!!! multiprocessing.dummy
!!! pydoc_data
!!! sqlite3
!!! sqlite3.test
!!! test
!!! tkinter
!!! tkinter.test
!!! tkinter.test.test_tkinter
!!! tkinter.test.test_ttk
!!! turtledemo
!!! unittest
!!! unittest.test
!!! unittest.test.testmock
!!! urllib
!!! venv
!!! wsgiref
!!! xml
!!! xml.dom
!!! xml.etree
!!! xml.parsers
!!! xml.sax
!!! xmlrpc
!!! IPython
!!! IPython.core
!!! IPython.core.magics
!!! IPython.core.tests
!!! IPython.extensions
!!! IPython.extensions.tests
!!! IPython.external
!!! IPython.external.decorators
!!! IPython.kernel
/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/IPython/kernel/__init__.py:13: ShimWarning: The `IPython.kernel` package has been deprecated since IPython 4.0.You should import from ipykernel or jupyter_client instead.
  "You should import from ipykernel or jupyter_client instead.", ShimWarning)
!!! IPython.lib
!!! IPython.lib.tests
!!! IPython.sphinxext
!!! IPython.terminal
!!! IPython.terminal.pt_inputhooks
!!! IPython.terminal.tests
!!! IPython.testing
!!! IPython.testing.plugin
!!! IPython.testing.tests
!!! IPython.utils
!!! IPython.utils.tests
!!! _pytest
!!! _pytest._code
!!! _pytest.assertion
!!! astor
!!! attr
!!! click
!!! clint
!!! clint.packages
!!! clint.packages.colorama
!!! clint.textui
!!! colorlog
!!! dateutil
!!! dateutil.tz
!!! dateutil.zoneinfo
!!! ipykernel
!!! ipykernel.comm
!!! ipykernel.gui
!!! ipykernel.inprocess
!!! ipykernel.inprocess.tests
!!! ipykernel.pylab
!!! ipykernel.tests
!!! ipython_genutils
!!! ipython_genutils.testing
!!! ipython_genutils.tests
!!! jedi
!!! jedi.api
!!! jedi.common
!!! jedi.evaluate
!!! jedi.evaluate.compiled
!!! jedi.evaluate.context
!!! jupyter_client
!!! jupyter_client.blocking
!!! jupyter_client.ioloop
!!! jupyter_client.tests
!!! jupyter_core
!!! jupyter_core.tests
!!! jupyter_core.utils
!!! lxml
!!! lxml.html
!!! lxml.includes
!!! lxml.isoschematron
!!! matplotlib
!!! matplotlib.axes
!!! matplotlib.backends
!!! matplotlib.backends.qt_editor
!!! matplotlib.cbook
!!! matplotlib.compat
!!! matplotlib.projections
!!! matplotlib.sphinxext
!!! matplotlib.sphinxext.tests
!!! matplotlib.style
!!! matplotlib.testing
!!! matplotlib.testing._nose
!!! matplotlib.testing._nose.plugins
!!! matplotlib.testing.jpl_units
!!! matplotlib.tests
Traceback (most recent call last):
  File "/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/bin/hy", line 11, in <module>
    load_entry_point('hy', 'console_scripts', 'hy')()
  File "/home/simon/src/hy/hy/cmdline.py", line 344, in hy_main
    sys.exit(cmdline_handler("hy", sys.argv))
  File "/home/simon/src/hy/hy/cmdline.py", line 318, in cmdline_handler
    return run_module(options.mod)
  File "/home/simon/src/hy/hy/cmdline.py", line 201, in run_module
    if mod[1] == mod_name), None)
  File "/home/simon/src/hy/hy/cmdline.py", line 200, in <genexpr>
    mod = next((mod for mod in pkgutil.walk_packages()
  File "/usr/lib64/python3.6/pkgutil.py", line 108, in walk_packages
    yield from walk_packages(path, info.name+'.', onerror)
  File "/usr/lib64/python3.6/pkgutil.py", line 93, in walk_packages
    __import__(info.name)
  File "/home/simon/.local/share/virtualenvs/hy-eeIqtnUk/lib/python3.6/site-packages/matplotlib/tests/__init__.py", line 17, in <module>
    'The baseline image directory does not exist. '
OSError: The baseline image directory does not exist. This is most likely because the test data is not installed. You may need to install matplotlib from source to get the test data.

But I think it makes my point: the problem with that loader is that it, too, was incorrectly implemented and loads weird things, rather than there being a problem with bytecode per se.

I mean, I'd be really surprised if we were emitting incorrect bytecode in the polyloader since it defers to Python internals...
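The tracebacks above show `run_module` in cmdline.py iterating over `pkgutil.walk_packages()`, which imports every package it visits. For comparison, here is a minimal Python sketch that locates a single module with the standard `importlib.util.find_spec`, which imports only the parent packages of the requested name. The wrapper name `locate_module` is hypothetical.

```python
import importlib.util


def locate_module(mod_name):
    """Return the module spec for mod_name, or None if it cannot be found."""
    try:
        return importlib.util.find_spec(mod_name)
    except (ImportError, ValueError):
        # ImportError: a parent package is missing or failed to import.
        # ValueError: the module is loaded but carries no usable __spec__.
        return None


spec = locate_module("json.decoder")
print(spec.name, spec.origin)
print(locate_module("no_such_package_xyz.module"))
```

Unrelated packages such as matplotlib.tests are never touched with this approach, so their import-time errors cannot leak into the lookup.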

@vodik
Contributor Author

vodik commented Feb 24, 2018

Fiddling with the loader further, I think I'm getting really close to something really nice and simple, and as close to correct as I understand Python's loading mechanism to be, but I think there are some interesting corner cases that need to be documented.

I initially tried to subclass as much as possible from importlib.machinery, but I think I'll have to intentionally provide my own implementations for a few things.

For example, say I had the following package:

pep420/
└── __init__.hy

And I open python and try to import it without first importing hy (Python 3.3 and newer):

>>> import pep420
>>> pep420
<module 'pep420' (namespace)>

We've loaded it successfully, but not as Hy code. This is because Hy packages slip under the radar and trick python into thinking we're defining a namespace. See PEP 420.
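The behaviour can be reproduced with a short Python 3 script. It assumes that no Hy import hook is active in the interpreter and uses a made-up package name, `pep420_demo`, in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# A directory that contains only __init__.hy, which plain Python ignores.
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "pep420_demo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.hy"), "w") as fh:
    fh.write('(print "never executed by plain Python")\n')

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
try:
    module = importlib.import_module("pep420_demo")
    search_path = list(module.__path__)
finally:
    sys.path.remove(tmp_dir)

# A namespace package has a __path__ but no source file behind it.
print(getattr(module, "__file__", None))
print(search_path)
```

The import succeeds, yet the Hy code in `__init__.hy` never runs, which is the trap described above.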

Now this PEP is interesting, and stuff introduced in there would probably be the backbone for emulating Clojure's namespaces, should we go down that road (as I understand them, at least - I'm also a Clojure newbie).

But it does lead to some interesting pitfalls that a beginner to Hy might fall into (heck, I didn't know about PEP 420 until yesterday, and I consider myself experienced) when trying to glue into Python.

It's also something that complicated the new loader, because it means I can't just subclass the default importlib machinery and change the appropriate bits. When I promote the Hy sys.meta_path entry to first, suddenly Python code starts showing up as namespace modules.

The Hy loader must specifically be first and must not directly support the scheme established in PEP 420.

Another potential issue I've not fully dug into is how safe it is to even mix Python and Hy code inside a single module at all.

Python, as I understand it, recursively imports packages, so import foo.bar causes both foo and foo.bar to be loaded. But as it does so, the "foo" loader is used to determine the loader to use for "foo.bar". Meaning that if "foo" is Python and "foo.bar" is Hy, because the Python loader loaded ".bar", we can end up in the PEP 420 trap again.
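One way to observe the mechanics is a meta path finder that records each lookup and then declines it. In the Python 3 sketch below, the package `outer_demo` and the class `RecordingFinder` are hypothetical. It shows the parent being looked up first, and the submodule lookup receiving the parent's `__path__`, which is what ties the resolution of `foo.bar` to how `foo` was loaded.

```python
import importlib
import importlib.abc
import os
import sys
import tempfile


class RecordingFinder(importlib.abc.MetaPathFinder):
    """Logs each lookup, then declines so the normal finders do the work."""

    def __init__(self):
        self.calls = []

    def find_spec(self, fullname, path=None, target=None):
        self.calls.append((fullname, None if path is None else list(path)))
        return None


# Demo package: outer_demo/__init__.py and outer_demo/inner.py (both empty).
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "outer_demo")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
open(os.path.join(pkg_dir, "inner.py"), "w").close()

recorder = RecordingFinder()
sys.meta_path.insert(0, recorder)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
try:
    importlib.import_module("outer_demo.inner")
finally:
    sys.path.remove(tmp_dir)
    sys.meta_path.remove(recorder)

demo_calls = [c for c in recorder.calls if c[0].startswith("outer_demo")]
for name, search_path in demo_calls:
    print(name, search_path)
```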

We've gotten around this in the past because the current Hy loader doesn't handle relative imports the same way Python does and treats everything as a fully qualified module. It looks like a valid workaround, but I can't speak for its overall correctness.

That said, if we make the Hy loader preferential, and intentionally break PEP 420 support, I'm 99% sure we'll mitigate most things, but this seems "antagonistic" enough to maybe warrant documentation.

I'm still investigating, learning, so expect things to change. But if I'm saying something obviously wrong, please correct me.

@vodik
Contributor Author

vodik commented Feb 25, 2018

Okay, got it working for Python 3 in a way that's, as far as I know, 100% compliant with how the Python module system works.

  1. We need our custom HyPathFinder to be the first entry in sys.meta_path - well, it just needs to come before _frozen_importlib_external.PathFinder. The custom HyPathFinder has similar logic to PathFinder, but needs to have its own path_hooks and path_importer_cache (independent of sys).
  2. We then load a custom HyFileFinder into our internal path_hooks that implements pretty much the same logic as FileFinder sans namespaces (otherwise we'll swallow python modules as namespace modules - we want to fail here and let the next sys.meta_path entry try instead).
  3. The HyFileFinder is then responsible for finding files and generating module specs with the HyLoader loader. This then allows python to generate the right modules.
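The three steps above can be approximated in a few lines of Python 3 using only documented importlib pieces. This is a sketch under stated assumptions, not the code from the patch. `DemoPathFinder` and `DemoLoader` stand in for HyPathFinder and HyLoader, the stock `FileFinder` stands in for HyFileFinder (with namespace results filtered out afterwards), the `.demo` suffix replaces `.hy`, and the loader simply compiles the file as Python.

```python
import importlib
import importlib.abc
import importlib.machinery
import os
import sys
import tempfile


class DemoLoader(importlib.machinery.SourceFileLoader):
    """Stand-in for HyLoader; it simply compiles the file as Python."""


class DemoPathFinder(importlib.abc.MetaPathFinder):
    """Stand-in for HyPathFinder with a private hook and importer cache."""

    def __init__(self, suffixes):
        self._hook = importlib.machinery.FileFinder.path_hook(
            (DemoLoader, suffixes)
        )
        self._importer_cache = {}  # independent of sys.path_importer_cache

    def _finder_for(self, entry):
        if entry not in self._importer_cache:
            try:
                self._importer_cache[entry] = self._hook(entry)
            except ImportError:  # the entry is not a directory
                self._importer_cache[entry] = None
        return self._importer_cache[entry]

    def find_spec(self, fullname, path=None, target=None):
        for entry in (sys.path if path is None else path):
            finder = self._finder_for(entry or os.getcwd())
            spec = finder.find_spec(fullname) if finder is not None else None
            # A spec without a loader is a namespace portion. Decline it so
            # the next sys.meta_path entry (the stock PathFinder) can try.
            if spec is not None and spec.loader is not None:
                return spec
        return None


tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "finder_demo_mod.demo"), "w") as fh:
    fh.write("message = 'loaded by DemoLoader'\n")

finder = DemoPathFinder([".demo"])
sys.meta_path.insert(0, finder)
sys.path.insert(0, tmp_dir)
try:
    module = importlib.import_module("finder_demo_mod")
finally:
    sys.path.remove(tmp_dir)
    sys.meta_path.remove(finder)

print(module.message, type(module.__loader__).__name__)
```

Keeping the hook and cache private means the regular Python finders and `sys.path_importer_cache` are left exactly as they were, which matches the goal of not clobbering Python's own machinery.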

On top of this, we support the file system caching system, so imports should be faster, and we can independently extend the Python 3 module loader without worrying about clobbering Python at all.

Tests are almost done (most of these problems cropped up when writing tests), so hopefully tomorrow, barring any other major underestimations of how the system works.

@gilch
Member

gilch commented Mar 13, 2018

Another tool that appears to use bytecode is https://github.com/pybee/voc which would potentially let us use Hy on Android, but probably not if we break compatibility by using our own .hyc format.

@brandonwillard
Member

This should probably be closed by #1672, no?

@Kodiologist
Member

Probably, yes.
