Doc update
mmastrac committed Oct 21, 2024
1 parent 0ce200f commit 97fa342
Showing 5 changed files with 77 additions and 16 deletions.
38 changes: 33 additions & 5 deletions README.md
@@ -6,6 +6,8 @@ Bootstrap is a small VM (< 20 ops) with an ASCII encoding. The goal of this
project is to create a readable and auditable bootstrapping process to generate
C binaries for this virtual platform or any other.

> If you wish to make an apple pie from scratch, you must first invent the universe
## Why?

1. Trusted compilation - every program involved in compiling a given C program
@@ -27,6 +29,10 @@ to get us there.
The stages should be easy to understand in isolation, and enough to hold
one-at-a-time in your head.

Stages contain _some_ but not _complete_ error checking for their inputs. Most
stages are designed to compile the next stage only and may miscompile. As the stages
advance, we add additional error checking.

In some cases we may define useful compilation utilities in earlier stages that
are re-used later in the bootstrap chain, for example linkers and shell-style
utilities.
Expand Down Expand Up @@ -112,7 +118,8 @@ fully-featured, though based assembler with support for variable-length,
two-level symbols (ie: `:global` + `.local`) and two-pass symbol resolution.
Also supports constant-style symbols that can be defined via `=symbol__ ABCD`.
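
The two-level symbols and two-pass resolution described above can be illustrated with a minimal Python sketch. It is not the stage 3 implementation: the `@label` reference syntax, the one-address-per-instruction layout, and the function names `collect_symbols` and `resolve_references` are all assumptions made for illustration. Only the `:global` and `.local` label forms come from the README.

```python
def collect_symbols(lines):
    """Pass 1: record the address of every label (toy model, 1 address per instruction)."""
    symbols = {}
    current_global = None
    address = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            # Global label: opens a new scope for local labels.
            current_global = line[1:]
            symbols[current_global] = address
        elif line.startswith("."):
            # Local label: stored under the enclosing global, e.g. "main.loop".
            if current_global is None:
                raise ValueError(f"local label {line} before any global label")
            symbols[current_global + line] = address
        else:
            address += 1
    return symbols


def resolve_references(lines, symbols):
    """Pass 2: replace hypothetical '@label' operands with resolved addresses."""
    output = []
    current_global = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("."):
            continue
        if line.startswith(":"):
            current_global = line[1:]
            continue
        words = []
        for word in line.split():
            if word.startswith("@."):
                word = str(symbols[current_global + word[1:]])
            elif word.startswith("@"):
                word = str(symbols[word[1:]])
            words.append(word)
        output.append(" ".join(words))
    return output
```

Because every label is recorded before any operand is rewritten, forward references such as a jump to a label defined later in the file resolve without backpatching.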

Instructions are defined in textual format.
Instructions are defined in textual format. This assembler has a more natural,
Intel-like syntax.

Enables:

@@ -134,14 +141,15 @@ Status: *complete* ✅

Stage goal: A fully-featured assembler, reusable by the next stage

[`bootstrap4.s`](bootstrap4/bootstrap4.s) ([README](bootstrap4/README.md)): A "complete" assembler that allows input
from multiple files, linked together to create an output executable. This
assembler has a more natural, intel-like syntax.
[`bootstrap4.s`](bootstrap4/bootstrap4.s) ([README](bootstrap4/README.md)): A
"complete" assembler that allows input from multiple files, linked together to
create an output executable. The assembler in this stage is an evolved and far
more featureful version of the one in stage 3.

The output for a given opcode from this assembler may or may not correspond to a
single VM opcode. The compiler takes over one of the VM registers as a "compiler
temporary", allowing us to create some CISC-style ops that drastically reduce
instruction counts for various types of operations.
line counts for various types of operations.
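
As a rough illustration of the "compiler temporary" idea, the Python sketch below expands one source-level instruction into one or two lower-level ops. The op names (`ld`, `add`), the bracketed memory-operand syntax, and the `expand` helper are hypothetical and are not taken from the bootstrap4 source; `rx` is used as the temporary only because the stage 4 register notes list it as a temp var.

```python
TEMP_REGISTER = "rx"  # assumed name for the register reserved by the assembler


def expand(op, dst, src):
    """Expand a source-level instruction into a list of simpler ops.

    A memory operand such as "[r1]" is first loaded into the temporary
    register, so a single source line may become two ops.
    """
    if src.startswith("[") and src.endswith("]"):
        address_register = src[1:-1]
        return [
            ("ld", TEMP_REGISTER, address_register),  # hypothetical load op
            (op, dst, TEMP_REGISTER),
        ]
    # Register-to-register forms map to a single op.
    return [(op, dst, src)]
```

For example, `expand("add", "r0", "[r1]")` yields two ops while `expand("add", "r0", "r1")` yields one, which is why one source-level opcode may or may not correspond to a single VM opcode.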

This assembler also allows for more complex macros that make procedure calls,
arguments and locals much simpler. As part of this functionality, the compiler
@@ -186,3 +194,23 @@ Stage goal: A fully-featured C85 (C99?) compiler.

[`bootstrap6`](bootstrap6/) ([README](bootstrap6/README.md)): A full C85 compiler written in a simpler subset of C that can compile a full CXX
compiler (as long as it conforms to C85). Currently a work-in-progress.

## Additional notes

### License

The project is under a *GPL license*, but as with the GNU C Compiler, outputs from
this program are considered copyright by the author of the input program. Mere
bundling of this bootstrap is not considered linking, nor is use of this
bootstrap to bootstrap any other system.

The GPL requires the source for this bootstrap to be transmitted with any binary
of the same program, ensuring that any use of this library is auditable and
traceable.

### On trustworthiness

To verify the trustworthiness of this source, it is highly recommended that you
use multiple VM implementations. In addition, this project ships with expected
SHA256 sums for each stage of the bootstrap that should be validated using your
own SHA256 implementation after compilation.
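
A check of this kind could look like the following Python sketch. It uses the standard library's `hashlib` as a stand-in for "your own SHA256 implementation"; the function names and the way the expected sum is passed in are assumptions, since the format in which the project ships its expected sums is not shown here.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_stage(path, expected_hex):
    """Compare a stage's output against its expected sum (hex string)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Running the same comparison on the output of more than one VM implementation gives a stronger signal than a single matching sum.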
11 changes: 10 additions & 1 deletion bootstrap3/bootstrap3.s
@@ -1,9 +1,18 @@
# Stage 3 bootstrap
# =================
#
# Macro assembler that improves on the previous stage, allowing
# for longer label lengths, and a simple assembler format to
# isolate the source from the underlying op encoding.
#
# This is the final stage that uses hand-assembled opcodes for the
# majority of its work.

# TODO:
# eq|add|sub rX, ra/rb -> eq|add|sub rX, 0/1
# - Remove the a-e register use?
# - eq|add|sub rX, ra/rb -> eq|add|sub rX, 0/1
# - Consider `ldd r0, [r1]` rather than `ldd [r1], r0`
# - Multi-file support

=ARGVSIZE 1000
=SYMTSIZE 5000
31 changes: 21 additions & 10 deletions bootstrap4/bootstrap4.s
@@ -1,18 +1,29 @@
# Fourth stage bootstrap

# Stage 4 bootstrap
# =================
#
# Historical note: this was previously bootstrap3 and used the raw VM
# opcodes. We introduced a new bootstrap3 in 2024 and mechanically
# translated each line 1:1 with the newer syntax to make this stage
# easier to maintain.
#
# Implements an assembler that supports a much richer, more human-readable format
#
# Includes real, two-pass label support (up to 32 chars long), simple call/return semantics

# TODO:
# - object file support
# Polish/performance
# - Symbol table with local symbs should be "rolled back" at next global symbol for perf
# - Can we do local fixups per global?
# - Short immediate constants should use '=!x.' format
# - readtok_ subroutines should be real functions
# - macro for locals/args copy (ie: r0->r4, r1->r5, pushed/restored automatically)

# - During mechanical translation, labels were not expanded to 8+ chars,
# and we did not trim the extraneous trailing underscores that are no
# longer necessary.
# - Polish/performance:
# - Symbol table with local symbs should be "rolled back" at next global symbol for perf
# - Can we do local fixups per global?
# - Short immediate constants should use '=!x.' format
# - readtok_ subroutines should be real functions
# - macro for locals/args copy (ie: r0->r4, r1->r5, pushed/restored automatically)

# Register notes:
#
# Ra-Re = 0, 1, 2, 4, 8 values
# Rx = Temp var
# Ry = Stack pointer
# Rz = PC
8 changes: 8 additions & 0 deletions tools/bootstrap2/lint.py
@@ -1,4 +1,12 @@
#!/usr/bin/env python3

# As bootstrap2 is a complex, difficult beast to maintain, this Python script
# allows one to "lint" the source and detect potential compilation issues that
# may cause a miscompilation because of incomplete error detection.

# TODO: This should be converted to C so we can compile it with bootstrap5+ or
# bootstrap6+.

import sys
import re

5 changes: 5 additions & 0 deletions tools/bootstrap2/migrate.py
@@ -1,4 +1,9 @@
#!/usr/bin/env python3

# This was a tool used to mechanically translate what lives in bootstrap4 (nee
# bootstrap3) to the newer syntax we introduced in the "new" bootstrap3. We keep
# it here for posterity, though it is unlikely we will use it again.

import sys
import re

