Description
In my recent work to bring more of the logic about what to parse to the frontend library in PR #25125, I ran into the test userInsteadOfStandard. In some discussion with @bradcray I learned about the purpose of this test, which brought up some design questions, so I'm creating this issue to discuss them. The questions have to do with the order in which something like use Foo searches for Foo in the standard/internal/package modules or in user paths.
Search Path Ordering
The dyno frontend is searching for modules in the following order:
- bundled module paths
  - includes internal module paths; standard module paths; modules/packages; distributions
- user paths
  - includes -M, current dir, and directories containing files named on command line
This is different from the production compiler's search order:
- user paths
  - includes -M, current dir, and directories containing files named on command line
- bundled module paths
  - includes standard, package, and internal library paths
This leads to the change in behavior for userInsteadOfStandard with PR #25125.
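To make the difference concrete, here is a minimal sketch of a first-match search over ordered path groups. It is only a model of the behavior described above, not the compiler's implementation: the directory layout, the `find_module` helper, and the variable names are all hypothetical.

```python
# Hypothetical model: each search path is a directory name mapped to the
# module files it contains. None of these names come from the compiler.
BUNDLED = {
    "modules/internal": {"ChapelBase.chpl"},
    "modules/standard": {"Math.chpl", "ChapelIO.chpl"},
}
USER = {
    ".": {"Math.chpl", "foo2.chpl"},
}


def find_module(name, *path_groups):
    """Return the first 'dir/file' providing `name`, searching groups in order."""
    filename = name + ".chpl"
    for group in path_groups:
        for directory, files in group.items():
            if filename in files:
                return directory + "/" + filename
    return None


# dyno frontend order: bundled paths first, then user paths
dyno_choice = find_module("Math", BUNDLED, USER)
# production compiler order: user paths first, then bundled paths
production_choice = find_module("Math", USER, BUNDLED)

print(dyno_choice)
print(production_choice)
```

With a user Math.chpl in the current directory, the same use Math resolves to a different file depending only on which group is searched first.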
Key Questions
- What behavior do we want when a user inadvertently uses a module name that conflicts with a bundled module?
- Should it be possible for the compiler to work with two different top-level modules with the same name?
- Is it a breaking change to add a new module to the bundled modules/ directory (including standard, internal, package modules)?
- Should modules in the bundled modules/ be searched for before or after user modules from -M / . / same directory as other code?
A Brief History of Time userInsteadOfStandard
test/modules/bradc/userInsteadOfStandard/ was added in 2009 in 0cbefbd. The purpose of this test is to check that we have behavior we like for the scenario in which a user unwittingly names a module the same as a standard/internal/package module. Originally, the test had a Math.chpl that the test imagines was created by a user who didn't know that there is a Math.chpl in the standard library. The test has a foo2.chpl that uses the module with the conflicting name.
Note that the purpose of this test is not to test that we can replace a standard library module in the field. We have --prepend-standard-module-dir / --prepend-internal-module-dir for that, and as far as I know, there is agreement that these flags (or similar flags) should be required to get the replace-standard-library behavior. (I and others have been confused about this in the past.)
It is not immediately obvious to me if this test is intended to focus only on automatically included modules. However, the questions it raises apply either way.
Commit 557eba1 (also in 2009) added a mechanism to the compiler to process standard/internal modules separately from user modules, so that the use statements in the standard/internal modules only search in the standard/internal module paths (and not the user directories). The idea was that in a case like this, the compiler could actually have two Math modules: one from the user's directory and one from the standard library. It would rename one of these internally.
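The mechanism just described can be sketched as a search whose path list depends on where the use statement lives. This is a rough model under stated assumptions, not the code from 557eba1: the `resolve_use` and `lookup` helpers and the directory contents are hypothetical, and user code is assumed to search user paths first, as in the production order above.

```python
# Hypothetical directory contents, for illustration only.
BUNDLED_DIRS = {"modules/standard": {"Math.chpl", "IO.chpl"}}
USER_DIRS = {".": {"Math.chpl", "foo2.chpl"}}


def lookup(name, groups):
    """Return the first 'dir/file' providing `name` in the given groups."""
    filename = name + ".chpl"
    for group in groups:
        for directory, files in group.items():
            if filename in files:
                return directory + "/" + filename
    return None


def resolve_use(name, using_file_is_bundled):
    """Resolve `use name;` based on the origin of the file containing it."""
    if using_file_is_bundled:
        # Bundled code never looks in user directories.
        groups = [BUNDLED_DIRS]
    else:
        # Assumption: user code searches user paths first, then bundled.
        groups = [USER_DIRS, BUNDLED_DIRS]
    return lookup(name, groups)


from_user = resolve_use("Math", using_file_is_bundled=False)
from_bundled = resolve_use("Math", using_file_is_bundled=True)

print(from_user)
print(from_bundled)
```

In this model a single compilation ends up with two different files both named Math, which is why one of them had to be renamed internally.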
I think this logic worked for this test until PR #19306, which accepted a temporary change in behavior for this test, in order to make progress on other issues. After that:
- PR #19849 (Split the Math module into two parts, one of which is included by default) changed it from Math.chpl to AutoMath.chpl on the justification that the test was added to exercise a name conflict with an included-by-default module, so this preserved the behavior while we were pulling out the automatically included part of Math to AutoMath.
- PR #20188 (Add workarounds to get userInsteadOfStandard working for now) added various workarounds to get this test working. The workarounds included a special case in the compiler for AutoMath, Errors, ChapelIO and Types, and also adding various procs to the user's AutoMath.chpl to get things to compile (because there was only one AutoMath module and the user's one was replacing the standard one; likely due to my own confusion about the purpose of this test).
- PR #22358 (Replace c_*alloc functions with unstable allocate) changed it to ChapelIO for the conflicting module name due to changes in behavior resulting from changes in module initialization order. It continued to add some procs to keep standard library code compiling, since the single ChapelIO module the compiler gets is the one from this test rather than the standard library.
- In working on PR #25125 (Have the frontend library drive the process of parsing), I saw failures with this test which led me to realize that the dyno frontend is behaving differently from the production compiler in terms of the order of searching standard/internal modules vs user modules.
Issue #23100 describes a related issue with a user module named search.chpl conflicting with modules/packages/Search.chpl on case-insensitive filesystems.
Summary of Discussion
Recall the key questions:
- What behavior do we want when a user inadvertently uses a module name that conflicts with a bundled module?
- Should it be possible for the compiler to work with two different top-level modules with the same name?
- Is it a breaking change to add a new module to the bundled modules/ directory (including standard, internal, package modules)?
- Should modules in the bundled modules/ be searched for before or after user modules from -M / . / same directory as other code?
I'm aware of 3 strategies here. I'll assume that the conflicting module will be named XYZ.chpl for the purpose of discussion below (although it has been Math, AutoMath, and ChapelIO at various times for userInsteadOfStandard).
A: Work with the bundled module
The idea here is that there can only be one module with a given name. Even if there is a user module with that same name, use XYZ; needs to refer to the bundled one because otherwise other bundled code or other library code (say in another mason package) depending on that bundled module will break.
Pros:
- consistent behavior
- simple to implement and understand
Cons:
- adding a new standard/internal/package module is arguably a breaking change
  - perhaps this can be addressed with an idea like Rust Editions
Compiling userInsteadOfStandard/foo2.chpl would result in this compiler output:
<command line>: warning: ambiguous module source file -- using /home/mppf/w/12/modules/standard/ChapelIO.chpl over ChapelIO.chpl
foo2.chpl:1: In module 'foo2':
foo2.chpl:31: error: unresolved call 'testchapelio()'
foo2.chpl:31: note: because no functions named testchapelio found in scope
foo2.chpl:31: note: unresolved call had id 308230
B: Work with the user's module
The idea here is that there can only be one module with a given name, but that the user's code needs to keep working if possible, and therefore the user's module should be used. Note that this means that other bundled code or other library code (say in another mason package) depending on that bundled module will break because they will try to use things from the bundled module that don't exist in the user's module. This is what we have now for 2.1 & in my opinion this option is untenable.
Pros:
- works OK if the bundled module is never used by anything (e.g. I'd expect that to be the case for a little-used package module such as modules/packages/Buffers.chpl)
Cons:
- fails for any internal/standard module used automatically or used in the application (potentially within a 3rd party library used by the application)
- adding a new standard/internal/package module is still arguably a breaking change, because if user's code has that same module name, it can cause compilation failures elsewhere
- using certain file / module names can lead to mysterious breakage apparently from the internal/standard modules
Compiling userInsteadOfStandard/foo2.chpl would result in this compiler output:
warning: Ambiguous module source file -- using ./ChapelIO.chpl over $CHPL_HOME/modules/standard/ChapelIO.chpl
$CHPL_HOME/modules/standard/IO.chpl:841: error: Bad identifier in 'only' clause, no known 'write' defined in 'ChapelIO'
$CHPL_HOME/modules/standard/IO.chpl:841: error: Bad identifier in 'only' clause, no known 'writeln' defined in 'ChapelIO'
$CHPL_HOME/modules/standard/IO.chpl:841: error: Bad identifier in 'only' clause, no known 'writef' defined in 'ChapelIO'
(The test as-written adds procs to ./ChapelIO.chpl to avoid the errors above, but I don't think that's reasonable to expect in the inadvertent-same-name case; more generally, with a module XYZ this will work as long as the bundled XYZ is not used in the compilation.)
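The failure modes of strategies A and B can be sketched by treating each module as a set of symbols and asking which requested names the single surviving ChapelIO fails to provide. This is an illustrative model only: the symbol sets are inferred from the error messages quoted above, and `missing_symbols` is a hypothetical helper, not anything in the compiler.

```python
# Hypothetical symbol tables; the names mirror the errors quoted above.
bundled_chapelio = {"write", "writeln", "writef"}
user_chapelio = {"testchapelio"}


def missing_symbols(needed, provided):
    """Names requested from a module that the chosen module does not define."""
    return sorted(needed - provided)


# What the bundled IO.chpl asks for in its 'only' clause, per the errors above.
io_needs = {"write", "writeln", "writef"}
# What the user's foo2.chpl calls, per the strategy A output.
user_needs = {"testchapelio"}

# Strategy B: the one ChapelIO in the compilation is the user's.
strategy_b_io_missing = missing_symbols(io_needs, user_chapelio)
strategy_b_user_missing = missing_symbols(user_needs, user_chapelio)

# Strategy A: the one ChapelIO in the compilation is the bundled one.
strategy_a_io_missing = missing_symbols(io_needs, bundled_chapelio)
strategy_a_user_missing = missing_symbols(user_needs, bundled_chapelio)

print(strategy_b_io_missing, strategy_a_user_missing)
```

Under B the breakage surfaces inside bundled code the user never wrote, while under A it surfaces at the user's own call site, which is arguably easier to diagnose.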
C: Work with both
The compiler should work with two modules with the same name, where one is a bundled module, and one is not. This strategy was used in 557eba1 but has since stopped working on main. The idea here would be to improve the compiler in some way to support this.
Pros:
- adding a new bundled module is not a breaking change
Cons:
- goes against the current design "there can only be one top-level module with a given name in a given compilation" (more on this below)
- complications for separate compilation
- complications for understandable error messages (if one of the modules is internally renamed inside the compiler, we have to un-rename it in the error messages; additionally, error messages mentioning the modules might be very confusing since the error message can't distinguish between the two)
- a library making use of this property will stop working if combined in certain ways; consider for example a Mason package named Math; that might work at first but will cause problems in an application that uses that library and also wants to use proc sin from the bundled Math library
- potential for "hijacking" if an application is written using the bundled XYZ and depends on another Chapel library that later adds a top-level module named XYZ
- significant implementation complexity, especially with the query-based dyno frontend
- only helps with bundled modules; does not help when multiple libraries / mason packages try to use the same top-level module name. Relatedly, it would still not be possible for a library to add a top-level module as a non-breaking change for users of that library.
Compiling userInsteadOfStandard/foo2.chpl would result in this compiler output:
warning: Ambiguous module source file -- using ./ChapelIO.chpl over $CHPL_HOME/modules/standard/ChapelIO.chpl
About "there can only be one top-level module with a given name in a given compilation"
We have made a number of decisions recently that seem to double-down on the idea that there can only be one top-level module with a given name. Here are a few examples:
From #7847 (comment) :
"I want to have two modules with the same name in a single Chapel program." But that isn't possible / legal in Chapel as it's defined today (and I don't have a vision as to what it would mean to attempt to support it.
From #8470 (comment)
[regarding a "multiple definitions" error in a similar situation] I think this is appropriate—I tried to compile a program with two modules with the same name at the same scope and that doesn't make sense. It's hard for me to imagine that the language should do any more than this.
Issue #12923 proposed having different module search paths per-module, but this proposal was dismissed.
Issue #19312 concluded with the decision that it's not possible to define a user module that shadows an automatically-included symbol. That issue is saying that it's not generally possible to make a symbol with the same name as something from the automatically included modules; it is similar to the question of whether it's possible to make a module with the same name as a bundled module.