proposal: all: add bare metal support #73608
Looking at the past proposals, this is just a list of new changes (much like the previous), but doesn't actually address the primary concerns of the past proposals, namely the shift of support burden on to the core team. |
You are right: I just separated the actual proposal from the underlying background; hopefully that explains the reasoning behind the new request. I've laid out the external functions, which are proposed specifically to shift the burden of any hardware- or target-specific support away from the core team. |
I would like us to take another look at this. I think there are at least two major changes which affect the maintenance burden vs. value tradeoff evaluation:
This span of time also makes it possible to have a retrospective look at use cases, and at the actual burden/churn. I probably over-index on security-sensitive use cases for obvious reasons, but running Go applications sandboxed as Firecracker micro VMs and writing memory-safe UEFI bootloader firmware in Go is extremely exciting to me. |
What about the other parts of the porting policy: https://go.dev/wiki/PortingPolicy#requirements-for-a-new-port Besides @abarisani , who are the named maintainers going to be, and what level of user adoption have we seen with tamago over the years? |
cc @golang/runtime |
I am also excited by the idea of bare metal Go. To me, the major challenges are around compatibility. Presumably such a port should conform to the Go 1 Compatibility Promise. This has two major components:
Alternatively, we could punt on this and state that these APIs must not be implemented in Go at all, and instead must be implemented in assembly. That seems unfortunate, though looking at your example, the code is fairly simple so maybe this wouldn't be completely untenable. |
I see Tamago is currently single-threaded (according to the lock implementation in |
We are happy to provide two developers, accept responsibility for the port, and maintain a builder; we will provide any support needed. Our existing effort has been engineered specifically to avoid touching or changing first-class ports in any way, and the standard distribution tests have been adapted specifically to support such requirements and a builder. Concerning adoption, we have WithSecure sponsored projects as well as the Transparency.dev Armory Witness projects (bootloader, OS, applet) as primary adopters of tamago (see a list here). We have several thousand units running tamago on a variety of privately contracted projects that unfortunately we cannot speak publicly about. The amd64 port, which enables non-embedded use cases, is however very recent, as it was published only a few months ago. We do think that a side effect of upstream adoption would also be an increase in the popularity of Go in places where people generally don't see a use for it (go-boot being one example). |
This is quite right, I think the proposed changes would well serve architectures like wasm or "soft isolated" targets.
The initialization code is meant to be really simple, most of it could be actually moved to the application itself but it's part of It is true that The other restriction I can think of is that currently I think in both cases it could be feasible to require a Go assembly implementation for both though it would mean losing a lot of convenience in just being able to use Go for clean and flexible code. I do wonder if there is a compile time technique that would allow to scan if an initialization function is suitable or not, in a similar manner to |
I am working on SMP as we speak, it's our intention to add it to the It's not trivial, but we are actively working on it, I am also toying with the idea of one runtime per core to have AMP rather than SMP as that suits kinda well the bare metal architecture that Go allows, I guess it might also be possible to have newosproc as externally provided, I need to get my bearings on this for the least friction effort but definitely the core principle would be to avoid runtime API changes one way or the other. |
FYI I added our current interrupt handling helpers at the bottom of the Go runtime changes section. I think there is room for improvement on this front in the proposal and something much simpler could possibly be obtained. The rationale of the current waking function was to pierce existing Go runtime structures (namely the timer) without changing anything for other With orchestration of such changes I am sure a much simpler way could be derived, by adding to current timer support, to simply prioritize a waiting/parked goroutine. Nonetheless it's clear that bare metal support would require interrupt handling and that an asynchronous low-level wake-up function is required to be invoked in pure (see an example of a handler invoking |
Reading this as someone who previously attempted porting Go to a niche OS for fun 😬 it's nice to see what I'm reading as essentially a "minimum viable GOOS porting API" written out in detail, although I understand the Go team's reservations about supporting it as an external compatibility surface. Do you think it's plausible that what you've explored here could become a general-purpose abstraction layer to aid in new ports to actual operating systems, rather than only to bare-metal? Specifically I'm thinking of a model where certain GOOS values could be mapped to either in-tree or out-of-tree Go modules that would get automatically linked in by the toolchain whenever that GOOS is selected, and would provide the same supporting machinery that would've been provided by the end-developer in the bare metal situation you described, but packaged up in a reusable form separate from the application. I imagine that the main ports to mainstream OSes would still benefit from more direct support throughout the runtime and library, but I'm thinking here about more esoteric targets that are OS-like but not sufficiently popular to justify the maintenance costs of a fully-integrated port. (If what I've asked here seems too far away from what this proposal was intended to cover then I'm happy to cease talking about it to avoid taking the discussion off-topic. My intention in asking is that framing this more broadly as a way to port to niche OSes and to bare metal might make the proposal easier to accept, assuming that's a viable thing to do without adding a huge amount of extra complexity and compatibility-promises.) |
How do you see the overlap with the usage of TinyGo? |
I don't see overlap as TinyGo is a different implementation which also targets entirely different classes of targets, different instruction sets and doesn't provide 100% compatibility with the runtime and/or language, please see our FAQ entry. |
Right now porting to an actual OS lacks an API to access the file-system, but I'd say that apart from that it can work to interact with an arbitrary OS for sure, see how we re-use this API to interact with Linux in an isolated fashion. I think this is a good point to highlight the value of a |
I think the bulk of I have a local working hack which achieves this by simply returning 1ns increments before initialization. The only downside I see is that runtime.schedinit will run under "fake" time, but I don't think that matters. Therefore if ultimately the proposal hinges on this I think we can address it, otherwise I think in general it would be useful to have a |
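The 1ns-increment workaround mentioned above can be sketched as follows. This is a minimal illustration and not the actual TamaGo change: `hwReady`, `hwCounter` and `fakeTicks` are hypothetical names, and the hardware clock is simulated by a plain variable. The fake ticks are added as an offset after initialization so that the returned time never goes backwards.

```go
package main

import "fmt"

// Hypothetical stand-ins for board state. A real port would read a
// hardware timer instead of the hwCounter variable.
var (
	hwReady   bool              // set once the hardware clock is initialized
	hwCounter int64 = 1_000_000 // simulated hardware clock value, in ns
	fakeTicks int64             // number of "fake" 1ns ticks handed out so far
)

// nanotime returns 1ns increments until the hardware clock is ready,
// then switches to the hardware counter. The fake ticks are kept as an
// offset so the result stays monotonic across the switch.
func nanotime() int64 {
	if !hwReady {
		fakeTicks++
		return fakeTicks
	}
	return hwCounter + fakeTicks
}

func main() {
	a, b := nanotime(), nanotime() // early calls, before hardware init
	hwReady = true
	c := nanotime() // first call on the real clock
	fmt.Println(a, b, c)
}
```

The trade-off is the one noted in the comment: anything that runs before initialization observes time advancing by one nanosecond per call rather than real elapsed time.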
We discussed this proposal briefly in #43930 (comment) yesterday, where we had similar thoughts to @apparentlymart's #73608 (comment). That is, there are two ways of thinking about this proposal:
|
To be clear, your FAQ also mentions embeddedgo, which is a Go port that targets microcontrollers. I use TinyGo today, but would happily switch to "Big" Go and I think it would be a mistake to do this proposal without considering the embedded targets. It's true that many (all?) ARM microcontrollers require thumb(v2?) support in the assembler, but I see that as an orthogonal problem (and proposal). I'm not even sure thumb support would require a new GOARCH: as far as I understand, all relevant arm 32-bit processors support thumb. In particular, the Android NDK targets thumbv2 for 32-bit ARM. With embedded targets, we might see some movement on #6853 :) I would love to hear what @embeddedgo or @aykevl had to say.
I have also been thinking of re-proposing
Please consider the potential decrease in maintenance burden that |
I have been talking to folks at $dayJob and several have been using TamaGo for years and love it. From the user point of view (with no reasonable view into the sausage factory), it would be great to have these capabilities with standard Go. |
There may also be some overlap with the wasip2 discussion in #65333 as the new WASI Preview 2 "component model" also doesn't have a common ABI, it depends on the component's WIT definition. |
It feels to me that splitting Something like (yes probably these are terrible names) The full compatibility promise would be maintained with clear-cut hooks with specific roles, implementers can still toy with Go While I still need to do more complete testing it seems that on ARM I can get a single liner for |
I just wanted to mention that maintaining a side port of vanilla Go targeted at real ARM MCUs (RAM < 1 MB, Thumb2 ISA) isn't so easy. The allocator and the GC are surprisingly flexible and can definitely work with such small RAM, but require some tweaks. From my point of view the main problem is that the runtime is unaware that the call address may differ from the instruction address, which is a "feature" of the Thumb2 ISA (these addresses differ in the least significant bit). It gives me a headache every time I merge the next Go release into Embedded Go, and I don't have enough free time to do it on every PR to the golang/go master branch (BTW, I'm just working on merging go1.24). I think the first step to support embedded ARM targets is to add support for |
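The call-address versus instruction-address distinction described above can be illustrated with a few lines of arithmetic. This is only a sketch: the helper names `callAddr` and `instrAddr` and the sample address are made up for illustration, and nothing here reflects how Embedded Go actually patches the runtime.

```go
package main

import "fmt"

// thumbBit is the least significant bit of a branch target. In ARM/Thumb
// interworking a set bit selects Thumb state, so a pointer used to call a
// Thumb function is the instruction address with this bit set.
const thumbBit = 1

// callAddr converts the address of a function's first instruction into
// the value that must be used as a call target.
func callAddr(instr uintptr) uintptr { return instr | thumbBit }

// instrAddr recovers the real (halfword-aligned) instruction address from
// a call target, e.g. before looking it up in a symbol or PC table.
func instrAddr(call uintptr) uintptr { return call &^ thumbBit }

func main() {
	instr := uintptr(0x0800_01a4) // hypothetical function entry in flash
	call := callAddr(instr)
	fmt.Printf("call target %#x, instruction address %#x\n", call, instrAddr(call))
}
```

Any runtime code that compares a function value against a PC, or indexes a table by function entry, has to apply one of these conversions first, which is where the merge pain described in the comment comes from.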
Yes, I can see how MCUs can be a very different beast, which is why TamaGo targets SoCs or AMD64, which is far easier. I'd say that merging major Go releases has probably been relatively easier on the TamaGo side. Nonetheless it would be nice to see mainstream support for Thumb2 in the future, and the potential interaction with TinyGo on this front. |
Microcontrollers such as the Raspberry Pi rp2350 support risc-v, and Go supports EDIT: |
AFAIK there is currently no support for 32-bit RISCV, aka.
But the 32-bit RISCV ISA is very similar to the 64-bit one (which can't be said about ARM and Thumb2), so it should be quite easy to add it. BTW, Embedded Go supports both RP2350 ARM cores in the latest master-embedded branch. You can find the supported peripherals here. |
To be clear, the safe subset of Go isn't just about "before world started" and "after world started", though that is certainly part of it. For example,
Now, of course I think if we view this as supporting out-of-tree ports we don't try to provide language compatibility. It is just too big of a problem. We might want to expose some of the tools we use to make it easier to maintain the runtime. e.g., |
Understood, In all our platforms
This can safely and quickly be done in assembly without issues. I think |
As one of the TinyGo maintainers, I think many of us are interested in seeing something like this happen. I'm on the road right now, so do not have time to get into the details of this specific proposal until probably end of next week. A couple of the other maintainers like @aykevl and @soypat are on vacation until next week. They will surely be able to add something to the conversation about this specific proposal.
This is mostly correct. There is some overlap, but that is just mostly incidental. The vast majority of targets for each are very different.
TinyGo has supported the full Go language for several years. It is true that the runtime support can differ. I will wait until I can look this over in detail before making any specific comment. Very glad to see movement on this topic, thank you to @abarisani for getting it going (yet again)! |
Heya peeps, another TinyGo maintainer here. I work on supporting the Raspberry Pi Pico bare-metal target (RP2040/RP2350) and various host peripherals like PIO, DMA, SPI, I2C, PWM, and so on. I don’t have much to add to the discussion other than expressing my gratitude for the work @abarisani has put into TamaGo and this proposal. I love that it reads like a Rosetta Stone for how the runtime interfaces with hardware-- there’s clearly a lot of experience behind it. It looks solid and feels like an important stepping stone for Go’s future in the embedded space. Huge cheers to everyone who made this happen c: Really excited about what this could mean for Go going forward! |
I would really like to see this happen. Google already uses Tamago, in transparency.dev. But it would also be useful for u-root, which Google also uses. |
I've analyzed this proposal more thoroughly and generally agree with it. But in the current form it still lacks support for smaller targets, say Raspberry Pi Pico (RP2350) or STM32H743, which with their 520 KB or 1024 KB of RAM, can be successfully programmed in Go. Below I want to mention four things that come to my mind right now.
// RP2350 constants
noosScaleDown = 8 // must be power of 2
noosStackCacheSize = 8 * 1024
noosNumStackOrders = 2
noosHeapAddrBits = 19 // 512 KiB of SRAM
noosLogHeapArenaBytes = 15 // 32 KiB
noosArenaBaseOffset = 0x2000_0000
noosMinPhysPageSize = 256
noosSpanSetInitSpineCap = 8
noosStackMin = 1024
noosStackSystem = 27 * 4 // register stacking at exception entry
noosStackGuard = 464
noosFinBlockSize = 256
noosSweepMinHeapDistance = 1024
noosDefaultHeapMinimum = 8 * 1024
noosMemoryLimitHeapGoalHeadroom = 1 << 15
noosGCSweepBlockEntries = 64
noosGCSweepBufInitSpineCap = 32
noosGCBitsChunkBytes = 2 * 1024
noosSemTabSize = 31
noosGOGC = 30
noosTimeHistMaxBucketBits = 45
Microcontrollers quite often have more than one RAM region, with different (discontinuous) addresses and capabilities (for example DMA capable, non-DMA capable). Support for multiple memory regions isn't strictly required at first, but as available memory is always in short supply we cannot ignore this fact in the long term. For example, in Embedded Go the non-DMA capable memory, if it exists, is used by the runtime for persistent allocations, leaving more DMA capable RAM for Go programs.
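As a quick check of the RP2350 constants listed above, the heap and arena sizes follow directly from the two bit widths. The sketch below only redoes that arithmetic; the constant names are shortened copies of `noosHeapAddrBits` and `noosLogHeapArenaBytes`, and the helper functions are hypothetical.

```go
package main

import "fmt"

const (
	heapAddrBits      = 19 // noosHeapAddrBits in the listing: 512 KiB of SRAM
	logHeapArenaBytes = 15 // noosLogHeapArenaBytes in the listing: 32 KiB arenas
)

// heapBytes is the size of the address range the heap can cover.
func heapBytes() int { return 1 << heapAddrBits }

// arenaBytes is the size of a single heap arena.
func arenaBytes() int { return 1 << logHeapArenaBytes }

// arenaCount is how many arenas fit in the addressable heap, which is
// 2^(heapAddrBits - logHeapArenaBytes).
func arenaCount() int { return heapBytes() / arenaBytes() }

func main() {
	fmt.Printf("heap: %d KiB, arena: %d KiB, arenas: %d\n",
		heapBytes()/1024, arenaBytes()/1024, arenaCount())
	// prints "heap: 512 KiB, arena: 32 KiB, arenas: 16"
}
```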
|
My proposal re-uses the existing compiler and memory allocation, so yes, it is not currently suitable for such targets. I am happy to ponder the issue; in the first iteration, though, I found addressing such a wide variety of platforms too "large" a problem for me, also because of TinyGo's existence, to be honest. Fundamentally, TamaGo was born with the clear intention of supporting existing compiler targets without requiring compiler or linker changes, which is why I never considered such an expansion, which I thought too invasive for my knowledge. Concerning interrupt handling, this is currently supported through the GetG and WakeG runtime functions, which I believe I mentioned in the proposal. |
I created a stub package to I hope this helps regardless, if it creates confusion happy to remove it. I also split hardware initialization to pre World start and Post World start as it works fine and helps with some of the proposal requirements. Please see goos-none-proposal or pkg.go.dev. |
It is clear that TamaGo has its sights set on larger systems than most embedded microcontrollers, but it is also clear that microcontrollers have a broad range of capabilities which, at the high end, intrudes into the target range of TamaGo. My experience is with TinyGo and I see and feel a pretty big difference here. The real kicker is often the size of the runtime. Another is the tight integration with the machine environment, which typically shows up in the form of a large set of machine definitions in terms of special addresses and volatile declarations. At the small end, one user on the TinyGo slack described getting a trivial application to run in 500 bytes of RAM, and a number have reported executable sizes in the 10's of kB range. The existence of that sort of target environment makes it clear that having multiple solutions isn't necessarily a bad thing. As such, I think the fact that TamaGo doesn't cover all target environments isn't so much a defect as simply a concession to reality. It's still awesome work and a serious contribution. |
As this proposal is titled "add bare metal support" and not "add support for GOOS=tamago", I think it's probably a good idea not to limit this discussion to only one, very specific branch of this subject. |
The proposal as I interpreted it is not about discussing the general topic of bare metal and embedded targets, but about adding a specific OS abstraction interface which requires otherwise minimal changes to the runtime and toolchain. I think that’s the right first step, with its specific cost/benefit ratio. More extensive changes to support additional targets can be discussed separately and can build on top of this, if accepted. |
I think having an externally defined sbrk function might help, but I am not sure it will be enough given the complex calculations done in the runtime for heap memory allocation. I am not sure there is a clean way to abstract those, suggestions are welcome. I agree with Filippo that probably the proposal title doesn't quite catch the spirit of the proposal. This third attempt was specifically reignited because the former bare metal focus has now been shifted, working nonetheless to support my former proposals, to a working generic runtime API abstraction (at least for the same class of targets Go currently supports). |
When thinking about the topic of how this could potentially "scale down" to much smaller systems like traditional microcontrollers, maybe this framing is helpful:
This proposal is currently focused mostly on the second item above, which is inherently specific to the design details of the mainline Go. But the library ecosystem can potentially make use of |
Hey there, I'm maintaining the mips/nintendo64 port of @embeddedgo. Some remarks:
|
The question about printk() was already answered (quite a ways) upthread. The idea was that printk() is intended to be useful in panic and, as such, should not allocate because of the risk of a recursive panic. In fact, in the sample runtimes posted in this issue, a single byte was pre-allocated to make sure that no allocations are necessary. This isn't a general purpose printing substrate. It is a debugging tool of last resort. |
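The no-allocation property described above can be sketched on a host machine. This is an assumption-laden illustration: the `printk(c byte)` signature is assumed, `uartBuf` is a hypothetical fixed array standing in for a serial data register, and `printString` and `printAllocs` are made-up helpers. The allocation count is measured with `testing.AllocsPerRun` from the standard library.

```go
package main

import (
	"fmt"
	"testing"
)

// uartBuf stands in for a memory-mapped UART. It is a fixed-size array,
// so writing to it never grows a slice or touches the heap.
var (
	uartBuf [256]byte
	uartLen int
)

// printk emits a single byte. It takes a plain byte and writes into a
// preallocated array, so it is usable from a panic path where allocating
// could trigger a recursive panic.
func printk(c byte) {
	if uartLen < len(uartBuf) {
		uartBuf[uartLen] = c
		uartLen++
	}
}

// printString feeds a string to printk one byte at a time, indexing the
// string directly to avoid a []byte conversion.
func printString(s string) {
	for i := 0; i < len(s); i++ {
		printk(s[i])
	}
}

// printAllocs reports the average number of heap allocations per call
// when printing a short message.
func printAllocs() float64 {
	return testing.AllocsPerRun(100, func() {
		uartLen = 0
		printString("panic: example\n")
	})
}

func main() {
	fmt.Println("allocations per call:", printAllocs())
	fmt.Print(string(uartBuf[:uartLen]))
}
```

The point of the design is that the interface is one byte wide: the caller never hands over a buffer that the implementation might be tempted to copy or grow.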
Hello and thanks for your feedback.
|
I don't understand how passing a single byte avoids allocations in printk. At the end of the day, the printk implementation must make sure to avoid allocations? I might be missing something here. A goroutine waiting on an interrupt is similar to a goroutine waiting on IO. So in embeddedgo a minimal netpoller was implemented. The netpoller will be called by the goroutine scheduler to get a list of ready to run goroutines. In embeddedgo the netpoller might sleep on a futex, but in your case it's even simpler: Just move the g semaphore in pollDesc to pdReady from an interrupt and the goroutine scheduler will wake it for you. See https://github.com/embeddedgo/go/blob/master-embedded/src/runtime/netpoll_noos.go |
We halt the CPU while waiting for interrupts and do not rely on any polling code. In other words if we are idling waiting for an interrupt nothing is going on and the CPU is halted. Is this achievable with your code (which I will study as soon as I have the chance)? Thanks! |
Yes, but you should halt the CPU in your netpoll() implementation, and you also need to wake the CPU after the specified delay if no interrupts occurred. Feel free to message me privately to keep this on topic. |
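The "halt until an interrupt arrives or the delay expires" behaviour discussed in this exchange can be modelled on a host with a channel and a timer. This is only a sketch of the control flow: `irq`, `raise` and `waitForInterrupt` are hypothetical names, the interrupt is simulated by a function call, and neither the TamaGo nor the Embedded Go implementation is shown here.

```go
package main

import (
	"fmt"
	"time"
)

// irq is a hypothetical stand-in for a pending-interrupt flag. The buffer
// of one means repeated interrupts coalesce into a single pending event.
var irq = make(chan struct{}, 1)

// raise simulates the interrupt handler. It never blocks, which mirrors
// the constraint that a handler must not wait on the scheduler.
func raise() {
	select {
	case irq <- struct{}{}:
	default: // an interrupt is already pending
	}
}

// waitForInterrupt models the idle path: block until an interrupt is
// pending or the delay expires, and report whether an interrupt woke us.
func waitForInterrupt(delay time.Duration) bool {
	t := time.NewTimer(delay)
	defer t.Stop()
	select {
	case <-irq:
		return true
	case <-t.C:
		return false
	}
}

func main() {
	raise()
	fmt.Println("woken by interrupt:", waitForInterrupt(time.Second))
	fmt.Println("woken by interrupt:", waitForInterrupt(10*time.Millisecond))
}
```

On real hardware the blocking `select` would correspond to halting the CPU, with the timer case implemented by programming a hardware timer to fire after the requested delay.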
Proposal Details
I propose the addition of a new GOOS target, such as GOOS=none, to allow Go runtime execution under specific application defined exit functions, rather than arbitrary OS syscalls, enabling freestanding execution without direct OS support.
This is currently implemented in the GOOS=tamago project, but for reasons laid out in the Proposal Background section it is proposed for upstream inclusion.
Go applications built with GOOS=none would run on bare metal, without any underlying OS. All required support is provided by the Go runtime and external driver packages, also written in Go.
Go runtime changes
Note
The changes are also documented in their own repository and pkg.go.dev
A working example of all proposed changes can be found in the GOOS=tamago implementation.
Board support packages or applications would be required (only under GOOS=none) to define the following functions to support the runtime.
If the use of go:linkname is undesirable, different strategies are possible; right now linkname is used as a convenient way to have externally defined functions invoked directly by the runtime early on.
cpuinit (example): pre-runtime CPU initialization. The function is required to be defined in assembly, as native Go statements are not supported before the runtime starts.
runtime.hwinit (example): early runtime hardware initialization
runtime.printk (example): standard output (e.g. serial console)
runtime.initRNG and runtime.getRandomData (examples): random number generation initialization and retrieval
runtime.nanotime1 (example): system time in nanoseconds
runtime.ramStart (example), runtime.ramSize (example) and runtime.ramStackOffset (example): RAM layout
Board support packages or applications can optionally define the following:
runtime.Bloc (example): heap memory start address override
runtime.Exit (example): runtime termination
runtime.Idle (example): CPU idle time management
Network I/O through Go's own net package requires the application to set an external Socket function (example, driver example).
The Go runtime would implement the following, or similar, functions to aid interrupt handling:
runtime.GetG (example), runtime.WakeG (example), runtime.Wake (example): asynchronous goroutine wake-up
Compilation
The compilation of such targets would remain identical to standard Go binaries. The loading strategy might differ depending on the hardware, but would in any case be handled in a manner completely external to the Go distribution, using standard flags as required, examples:
Proposal Background
This proposal follows updates on the TamaGo project, which brings bare metal execution for Go on AMD64, ARM and RISCV64 targets.
While similar proposals (see #37503 and #46802) have been already attempted without success, this last effort is motivated by considerable advancements and changes in our effort.
Notable changes:
1. There is now fully tested Go standard library support, integrated with vanilla distribution tests. The testing environment for AMD64, ARM and RISCV64 architectures runs under Linux natively or using qemu-user-static via binfmt_misc.
2. The tamago networking code allows external definition of a single socket function to attach gVisor or any other fake networking stack if desired; this could benefit other Go architectures and allow replacement of existing fake networking for js/wasip1.
3. Because of what was implemented to support 1. and 2., GOOS=tamago allows the execution of unmodified Go applications as softly isolated userspace code with OS resources, such as networking and filesystems, isolated from the actual user OS.
4. TamaGo no longer focuses only on ARM embedded systems, but extends to AMD64 KVM execution such as microVMs.
5. The overall Go distribution changes are free from hardware dependent code (e.g. peripheral drivers) and changes are unified across different architectures, see for instance the identical implementation for entry points of amd64, arm, riscv64 architectures.
In summary, GOOS=tamago has been transformed into a generic implementation that allows execution without relying on operating system calls but rather with a unified "outside world" exit interface, whether implemented for testing, userspace execution, real hardware or paravirtualization.
On ARM we implemented bootloaders, Trusted Execution Environments and full secure OS and applets with this framework.
Thanks to a recent amd64 port, we enabled Pure Go KVMs under Cloud Hypervisor, Firecracker and QEMU.
This also allowed us to implement execution under UEFI, which enabled 100% Go EFI applications and bootloaders such as go-boot (which booted Linux on the Thinkpad I am writing from).
We think adopting these compact Go distribution changes would allow not only preservation of this ecosystem but expansion and innovation of the Go language as a whole, taking it to surprising, unexpected, yet ideal, environments.
The cost of maintaining the patch is, in our opinion, reasonable and if anything beneficial in improving the Go abstraction across architectures and OS specific components.
The only component that is somewhat low-level and sensitive in terms of maintenance between major Go releases is our asynchronous goroutine waking function, which is used to serve external interrupt requests.
However, such a function can be re-implemented in a manner consistent with existing OS signaling or, simply with the awareness and inclusion of GOOS=none, hooked to a much simpler standardized interface within the timer structure, which so far has not been implemented purely to avoid any pollution of non-tamago architectures.
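To make the required hook set from the Go runtime changes section easier to picture, here is a host-runnable model of it. This is not how the proposal wires things up: the proposal uses go:linkname to let the runtime call externally defined functions, whereas the `Board` struct, its field types, `Validate` and `demoBoard` below are all hypothetical and exist only to show which pieces a board support package would have to supply. `cpuinit` is omitted because it must be written in assembly.

```go
package main

import (
	"errors"
	"fmt"
)

// Board groups the hooks listed as required in the proposal. The field
// names mirror the runtime symbols; the signatures are assumptions.
type Board struct {
	Hwinit        func()         // runtime.hwinit
	Printk        func(c byte)   // runtime.printk
	InitRNG       func()         // runtime.initRNG
	GetRandomData func(b []byte) // runtime.getRandomData
	Nanotime      func() int64   // runtime.nanotime1
	RAMStart      uint64         // runtime.ramStart
	RAMSize       uint64         // runtime.ramSize
}

// Validate reports the first required hook that is missing.
func (b *Board) Validate() error {
	switch {
	case b.Hwinit == nil:
		return errors.New("missing hwinit")
	case b.Printk == nil:
		return errors.New("missing printk")
	case b.InitRNG == nil || b.GetRandomData == nil:
		return errors.New("missing RNG hooks")
	case b.Nanotime == nil:
		return errors.New("missing nanotime1")
	case b.RAMSize == 0:
		return errors.New("missing RAM layout")
	}
	return nil
}

// demoBoard returns a board whose hooks are trivial host-side fakes.
// Console output is appended to out instead of a serial port.
func demoBoard(out *[]byte) *Board {
	var ticks int64
	return &Board{
		Hwinit:  func() {},
		Printk:  func(c byte) { *out = append(*out, c) },
		InitRNG: func() {},
		GetRandomData: func(b []byte) {
			for i := range b {
				b[i] = 0 // placeholder only, not random
			}
		},
		Nanotime: func() int64 { ticks++; return ticks },
		RAMStart: 0x8000_0000,
		RAMSize:  64 << 20,
	}
}

func main() {
	var console []byte
	b := demoBoard(&console)
	if err := b.Validate(); err != nil {
		fmt.Println("board incomplete:", err)
		return
	}
	b.Hwinit()
	for _, c := range []byte("hello\n") {
		b.Printk(c)
	}
	fmt.Printf("console=%q t=%d\n", console, b.Nanotime())
}
```

Unlike this model, a real `printk` must not allocate, and the RAM layout values are read by the runtime before any Go code in the application runs.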