-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathLIBRARY
244 lines (194 loc) · 8.33 KB
/
LIBRARY
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
PolyMLB
=======
## Simple usage
The PolyMLB library's main API is the `PolyMLB` structure, which offers the
following functions:
- `compile`: compile an ML Basis file and return the resulting namespace;
- `import`: compile an ML Basis file and import it into the global namespace;
- `use`: like import but with signature and behavior similar to the top level
function `use`.
The behavior of both `compile` and `import` can be customized through the passed
in options and return an `Ok | Error` value. On the other hand, `use` does not
take in options and has type `string -> unit`; in case of error, it prints the
error to stdout and raises an exception.
The library can be imported by calling `use` with the `build.sml` file in
`src/lib`, though SML_LIB is not set by default. Once it has been successfully
compiled, MLB based libraries may be imported by making use of the above
functions. Default path variables are stored in the `PolyMLB.pathMap` map, which
is publicly accessible and modifiable.
> PolyML.use "path/to/polymlb/sources/build.sml";
> HashArray.update (PolyMLB.pathMap, "SML_LIB", "path/to/libraries");
> PolyMLB.use "path/to/lib.mlb";
If PolyMLB was installed through `make install`, then it is available at
`$SML_LIB/polymlb`. As an example, the following can be used to start a
Poly/ML REPL with PolyMLB:
$ SML_LIB=$(polymlb -sml-lib)
$ poly -i --eval \
"PolyML.use \"$SML_LIB/polymlb/build.sml\";
HashArray.update (PolyMLB.pathMap, \"SML_LIB\", \"$SML_LIB\");"
The lower level components are included in the PolyMLB structure for convenience
and "just in case" but are not stable and must be used with care in order to
avoid catastrophic failure.
## Implementation overview
The exposed inner modules are as follows:
- Ann, annotations utilities;
- Basis, mlb validation and conversion from parse results;
- Compile, mlb compilation;
- Dag, mlb graph processing and validation;
- Lex, mlb lexing utilities;
- Log, logging utilities;
- NameSpace, result of mlb compilation;
- Parse, mlb parsing;
- Path, path validation.
The library is mostly designed as a two-stage process. The first one is the Dag
module, in charge of traversing and checking the entire MLB graph. The second
is Compile, which elaborates a graph; it completely trusts its input and expects
that there is no missing dependency, cycle, etc.
The general error policy is to raise an exception and let it propagate. The
PolyMLB structure is in fact only glue code for the different components with
some error handling.
Compilation works with namespaces (`NameSpace.t`), which are representations
of a 'basis' as described in [1]:
> [A] basis is a collection, but of more kinds of objects: types, values,
> structures, fixities, signatures, functors, and other bases.
The concrete implementation is simply a `PolyML.NameSpace.nameSpace` [2]
augmented with similar operations for bases (i.e operating on `NameSpace.t`).
Compiling an MLB file thus returns a namespace which contains its public exposed
symbols.
[1]: http://mlton.org/MLBasisSyntaxAndSemantics
[2]: https://polyml.org/documentation/Reference/PolyMLNameSpace.html
## Details
### Builtin libraries
Due to the Basis library and the Poly/ML extensions being baked in the compiler,
support for these is hacked in with prefilled namespaces derived from the global
namespace and a hardcoded list of structure and signature names from Poly/ML
extensions. In order to avoid special casing every module to check whether paths
refer to either of these libraries, valid mlb files are provided, although they
only contain an open directive for the corresponding namespace.
Since elaboration starts in an empty environment, these namespaces are at first
not available. A special `poly:importAll` annotation is used, which has the
effect of making the entirety of the global namespace available to its enclosed
declarations. Thus the SML Basis mlb is simply:
ann
"poly:importAll"
in
open BasisLib
end
### Elaboration order
Each mlb file is elaborated only once and its resulting namespace reused when
referenced again. On the other hand, SML source files are compilated everytime
they are referenced, even in the same mlb file. When exactly an mlb file is
compiled depends on the options passed to the Compile module.
The default elaboration scheme is purely sequential and compiles mlb files as
they are encountered, starting from the root mlb. E.g the following mlb files
$ cat a.mlb
local
$(SML_LIB)/basis/basis.mlb
in
a.sml
end
$ cat b.mlb
local
$(SML_LIB)/basis/basis.mlb
a.sml
in
b.sml
end
$ cat root.mlb
local
$(SML_LIB)/basis/basis.mlb
b.mlb
c.sml
a.mlb
in
main.sml
end
are compiled in this order:
1. root.mlb
1. basis.mlb
2. b.mlb
1. a.sml
2. b.sml
3. c.sml
4. a.mlb
1. a.sml
5. main.sml
It is possible to force all "dependencies" of an mlb file to be fully elaborated
before said mlb file is processed. The above mlb files could then be compiled as
follows:
1. basis.mlb
2. b.mlb
1. a.sml
2. b.sml
3. a.mlb
1. a.sml
4. root.mlb
1. c.sml
2. main.sml
Note that elaboration order is only guaranteed in the default case (encounter
order); if forcing dependencies first, the order in which independent mlb files
are processed is undefined. E.g here a.mlb could be processed before b.mlb.
### Parallel compilation
Parallel compilation works at the mlb level. In this example, a.mlb and b.mlb
may be processed at the same time. Elaboration of a single mlb file is always
sequential, regardless of the given options.
If the 'dependencies first' option is not set, then the elaboration of an mlb
file may be started before its dependencies have completed; then when reaching
a import directive that references an mlb which has yet to start or is
elaborating, the current elaboration is simply parked until the dependency is
ready and the processing thread switches to another job.
A possible parallel elaboration trace for root.mlb could look like this, if for
some reason compiling a.mlb took longer than b.mlb:
1. start root.mlb
2. start basis.mlb
3. start a.mlb
4. start b.mlb
5. complete basis.mlb
6. check root.mlb for basis.mlb; continue
7. check root.mlb for b.mlb; park
8. complete b.mlb
9. resume root.mlb
10. compile c.sml
11. check root.mlb for a.mlb; park
12. complete a.mlb
13. resume root.mlb
14. compile main.sml
15. complete root.mlb
The order in which mlb files are elaborated, started or resumed is undefined
and depends on a number of different factors, such as dependency order, thread
count, etc. The only guarantee is that of the dependencies first option.
Although it may seem that allowing an mlb file to start only after its mlb
dependencies are elaborated limits parallelism, most mlb files likely look like
the following:
local
$(SML_BASIS)/basis/basis.mlb
dep1.mlb
...
source1.sml
...
in
source2.sml
...
end
in which case there is at best very little to be gained from concurrent
elaboration since the actual time consuming part of the process is compiling
SML source files, which may only happen after all dependencies are completely
finished due to the sequential nature of within-mlb elaboration.
### Errors and cancellation
Elaboration is completely aborted when an error is encountered; which, assuming
the given data is correct, means the error happened either when compiling SML
source files or in a rebinding / open MLB directive such as `structure S1 = S2`.
If this is the former, then it may be due to an I/O error, an ill-formed program
that did not compile or top level code that failed in a way or another.
In the case of parallel compilation, all processing threads are interrupted
synchronously, i.e when they reach a suitable break point. Such a break point
is one of:
- before starting a new job;
- after completing a job (e.g after completing or parking an mlb file);
- before each declaration in an mlb file.
Finer grained constructs, such as SML compilation, are never interrupted (at
least not by the library itself), i.e if an error happens while an SML file is
being compiled, then it will be entirely compiled and its top level executed
before the thread actually exits.
Further errors that happen during the timeframe between the first error and all
threads exiting are discarded and only the original one is reported.