Draft

43 commits
ec523da
perf(block): replace sync.Map with atomic bitmap for block cache dirt…
levb Mar 26, 2026
b4b62cc
Review fixes: cap setIsCached to bitmap bounds, concurrent test, nits
levb Mar 26, 2026
7238599
chore: auto-commit generated changes
github-actions[bot] Mar 26, 2026
791950e
perf(block): precompute OTEL for chunker hot paths (#2236)
levb Mar 27, 2026
f8e5216
Merge branch 'main' of github.com:e2b-dev/infra into lev-block-cache-…
levb Mar 27, 2026
6b71c54
Merge branch 'lev-block-cache-bitmap' of github.com:e2b-dev/infra int…
levb Mar 27, 2026
7fc43fc
PR feedback: i->blockIdx, restore comments
levb Mar 27, 2026
be976c0
Merge branch 'main' of github.com:e2b-dev/infra into lev-block-cache-…
levb Mar 27, 2026
779e033
refactor(block): extract atomic bitset to shared utility package
levb Mar 27, 2026
19a857a
PR feedback: start/endBlock() helpers
levb Mar 27, 2026
3f0daf4
feat(block): add flat/roaring/sharded bitset implementations behind FF
levb Apr 2, 2026
c65d2b2
Merge branch 'main' of github.com:e2b-dev/infra into lev-block-cache-…
levb Apr 2, 2026
433c5d4
chore: auto-commit generated changes
github-actions[bot] Apr 2, 2026
cb37eb1
PR feedback: use header helpers, restore comments
levb Apr 2, 2026
51ba221
PR feedback: start/endBlock() helpers
levb Apr 2, 2026
86f31cb
fix(bitset): rename `make` param to avoid builtin shadowing
levb Apr 2, 2026
43b647d
refactor(bitset): benchmark at realistic size (131K bits)
levb Apr 2, 2026
adf386d
test: switch default bitset to atomic for CI validation
levb Apr 2, 2026
3d1e86c
refactor(block): move dirty bitmap locking to Cache, simplify Bitset …
levb Apr 2, 2026
1c53aec
chore: auto-commit generated changes
github-actions[bot] Apr 2, 2026
64b640a
test: disable race detector in CI, add bitset debug logging
levb Apr 2, 2026
4f9b0cb
Merge branch 'main' of github.com:e2b-dev/infra into lev-block-cache-…
levb Apr 2, 2026
3b4cd7d
fix: use Bitset.Iterator instead of sync.Map-style Range in cache export
levb Apr 2, 2026
75b52cc
chore: auto-commit generated changes
github-actions[bot] Apr 2, 2026
e67b4ed
fix: acquire dirtyMu in ExportToDiff to prevent data race with concur…
levb Apr 2, 2026
b77211a
fix: default bitset to bits-and-blooms, add BitsAndBlooms impl
levb Apr 3, 2026
e08ddaa
chore: auto-commit generated changes
github-actions[bot] Apr 3, 2026
6f0fe99
fix: lint issues in bitset_test.go
levb Apr 3, 2026
d4542cd
refactor: self-synchronizing bitset impls, remove dirtyMu from Cache
levb Apr 3, 2026
60c54ff
Merge branch 'lev-block-cache-bitmap' of github.com:e2b-dev/infra int…
levb Apr 3, 2026
0835f26
chore: auto-commit generated changes
github-actions[bot] Apr 3, 2026
2840ad4
perf: Has fast path in isCached, updated benchmarks
levb Apr 3, 2026
07387c8
Merge branch 'lev-block-cache-bitmap' of github.com:e2b-dev/infra int…
levb Apr 3, 2026
ae1ff1e
chore: auto-commit generated changes
github-actions[bot] Apr 3, 2026
a89f1ff
Change iteration
ValentaTomas Apr 3, 2026
4cdb5f2
Merge branch 'main' into lev-block-cache-bitmap
ValentaTomas Apr 3, 2026
59563a6
Test with sync map
ValentaTomas Apr 3, 2026
7f3307f
Fix lint
ValentaTomas Apr 3, 2026
dd98117
Try bits and blooms bitset
ValentaTomas Apr 3, 2026
098d060
Test roaring 64
ValentaTomas Apr 3, 2026
3764ee2
Try rewriting used methods
ValentaTomas Apr 4, 2026
3fdbd41
Change containsInt back
ValentaTomas Apr 4, 2026
dcfc4ea
Enable tests
ValentaTomas Apr 4, 2026
49 changes: 23 additions & 26 deletions packages/orchestrator/pkg/sandbox/block/cache.go
Expand Up @@ -8,7 +8,6 @@ import (
"math"
"math/rand"
"os"
"slices"
"sync"
"sync/atomic"
"syscall"
@@ -19,6 +18,7 @@ import (
"go.opentelemetry.io/otel"
"golang.org/x/sys/unix"

"github.com/e2b-dev/infra/packages/shared/pkg/atomicbitset"
"github.com/e2b-dev/infra/packages/shared/pkg/storage/header"
)

@@ -49,7 +49,7 @@ type Cache struct {
blockSize int64
mmap *mmap.MMap
mu sync.RWMutex
dirty sync.Map
dirty atomicbitset.Bitset
Contributor: If we have a 100GB disk and we set a dirty block at the end, won't we use like 400KB of memory just for storing that one bit?

ValentaTomas (Member), Mar 30, 2026: Won't this be 3.125MiB?

((100*1024^3)/4096/8)/1024/1024

Member: To be fair here though, this is also a problem with the bitset we use now. Only the roaring bitmaps solve this, but they don't support the iteration/lookup we would ideally need, so for pause processing I would still like to use the current bitset (or implement a similarly effective iterator over the bitmaps).

One other note: with the current NBD the cache map might not be under much contention (but this might be relevant if we use it for the other caches).

Contributor: > Won't this be 3.125MiB?

Divided by 64, as you have 64 bits (chunks) for one array slot.

ValentaTomas (Member), Mar 30, 2026: 👌 My bad (not bad in the end).

levb (Contributor, Author): So I added 2 implementations now: roaring (by far the most compact, and plenty fast on its own, single-threaded), and a sharded atomic uint64 array, which is still quite compact and much (100x) faster when concurrency is involved.

Contributor: The roaring implementation looks great. Unless we really need the speed optimization, I'm for opting for the roaring approach only for now, leaving the sharded atomic optimization outside of this PR.
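The memory math in the thread above can be checked directly. A minimal sketch (the `bitmapBytes` helper is hypothetical, for illustration only, not part of the PR):

```go
package main

import "fmt"

// bitmapBytes returns the memory needed by a flat bitmap that tracks
// one dirty bit per block for a disk of diskBytes bytes.
func bitmapBytes(diskBytes, blockSize int64) int64 {
	blocks := (diskBytes + blockSize - 1) / blockSize // one bit per block
	return (blocks + 7) / 8                           // bits packed into bytes
}

func main() {
	const gib = int64(1) << 30
	b := bitmapBytes(100*gib, 4096)
	// 100 GiB / 4 KiB blocks = 26,214,400 bits = 3,276,800 bytes = 3.125 MiB
	fmt.Printf("%d bytes (%.3f MiB)\n", b, float64(b)/(1<<20))
}
```

This confirms the 3.125 MiB figure: the division by 8 in the thread's formula already converts bits to bytes, so no further division by 64 is needed.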

dirtyFile bool
closed atomic.Bool
}
@@ -87,12 +87,15 @@ func NewCache(size, blockSize int64, filePath string, dirtyFile bool) (*Cache, e
return nil, fmt.Errorf("error mapping file: %w", err)
}

numBlocks := (size + blockSize - 1) / blockSize

return &Cache{
mmap: &mm,
filePath: filePath,
size: size,
blockSize: blockSize,
dirtyFile: dirtyFile,
dirty: atomicbitset.New(uint(numBlocks)),
}, nil
}

@@ -245,31 +248,30 @@ func (c *Cache) Slice(off, length int64) ([]byte, error) {
return nil, BytesNotAvailableError{}
}

func (c *Cache) startBlock(off int64) uint {
return uint(off / c.blockSize)
}

func (c *Cache) endBlock(off int64) uint {
return uint((off + c.blockSize - 1) / c.blockSize)
}

func (c *Cache) isCached(off, length int64) bool {
// Make sure the offset is within the cache size
if off >= c.size {
return false
}

// Cap if the length goes beyond the cache size, so we don't check for blocks that are out of bounds.
end := min(off+length, c.size)
// Recalculate the length based on the capped end, so we check for the correct blocks in case of capping.
length = end - off

for _, blockOff := range header.BlocksOffsets(length, c.blockSize) {
_, dirty := c.dirty.Load(off + blockOff)
if !dirty {
return false
}
}

return true
return c.dirty.HasRange(c.startBlock(off), c.endBlock(end))
}

func (c *Cache) setIsCached(off, length int64) {
for _, blockOff := range header.BlocksOffsets(length, c.blockSize) {
c.dirty.Store(off+blockOff, struct{}{})
if length <= 0 {
return
}

c.dirty.SetRange(c.startBlock(off), c.endBlock(off+length))
}

// When using WriteAtWithoutLock you must ensure thread safety, ideally by only writing to the same block once and then exposing the slice.
@@ -291,16 +293,13 @@ func (c *Cache) WriteAtWithoutLock(b []byte, off int64) (int, error) {
return n, nil
}

// dirtySortedKeys returns a sorted list of dirty keys.
// Key represents a block offset.
// dirtySortedKeys returns a sorted list of dirty block offsets.
func (c *Cache) dirtySortedKeys() []int64 {
var keys []int64
c.dirty.Range(func(key, _ any) bool {
keys = append(keys, key.(int64))

return true
})
slices.Sort(keys)
for i := range c.dirty.Iterator() {
keys = append(keys, int64(i)*c.blockSize)
}

return keys
}
@@ -491,9 +490,7 @@ func (c *Cache) copyProcessMemory(
return fmt.Errorf("failed to read memory: expected %d bytes, got %d", segmentSize, n)
}

for _, blockOff := range header.BlocksOffsets(segmentSize, c.blockSize) {
c.dirty.Store(offset+blockOff, struct{}{})
}
c.setIsCached(offset, segmentSize)

offset += segmentSize

115 changes: 115 additions & 0 deletions packages/shared/pkg/atomicbitset/bitset.go
@@ -0,0 +1,115 @@
// Package atomicbitset provides a fixed-size bitset with atomic set operations.
package atomicbitset

import (
"iter"
"math"
"math/bits"
"sync/atomic"
)

// Bitset is a fixed-size bitset backed by atomic uint64 words.
// SetRange uses atomic OR, so concurrent writers are safe without
// external locking.
//
// A Bitset must not be copied after first use (copies share the
// underlying array).
type Bitset struct {
words []atomic.Uint64
n uint
}

// New returns a Bitset with capacity for n bits, all initially zero.
func New(n uint) Bitset {
levb (Contributor, Author), Mar 27, 2026: I am at most 1/5 on uint here (and for indexing into the set in all methods). It's clean for the interface. The callers use int64, which would be awkward for the bitset index IMO. 2/5 we should do a separate cleanup PR where we assume int == int64 and use ints everywhere, except uints where absolutely needed.

Contributor: For this case, I think we should be explicit with uint64. Any value other than uint64 will be handled incorrectly in this code, as the 64-bit word length is used directly in the operations.

levb (Contributor, Author): We do not need 64-bit values for how many blocks there are, do we? The normal roaring bitset supports only 32-bit indices for addressing the bits. There is a 64-bit version, but it is less mature. And using a uint64 would still cause an ugly cast at the caller?

Contributor: I think it's okay to have the ~8TiB limit for now (for 4KiB blocks); I don't expect we would approach it any time soon.
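As a sanity check on the ~8TiB figure mentioned above, assuming signed 32-bit block indices and 4KiB blocks (the `maxAddressable` helper is hypothetical, for illustration only):

```go
package main

import "fmt"

// maxAddressable returns the largest disk size whose block indices still
// fit in a signed integer of the given bit width, for the given block size.
func maxAddressable(indexBits uint, blockSize int64) int64 {
	return (int64(1) << (indexBits - 1)) * blockSize
}

func main() {
	limit := maxAddressable(32, 4096) // 2^31 blocks of 4 KiB each
	fmt.Printf("%d TiB\n", limit>>40) // 2^43 bytes = 8 TiB
}
```

With unsigned 32-bit indices the limit doubles to 16 TiB; either way it is comfortably above current disk sizes.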

return Bitset{
words: make([]atomic.Uint64, (n+63)/64),
n: n,
}
}

// Len returns the capacity in bits.
func (b *Bitset) Len() uint { return b.n }

// Has reports whether bit i is set. Out-of-range returns false.
func (b *Bitset) Has(i uint) bool {
if i >= b.n {
return false
}

return b.words[i/64].Load()&(1<<(i%64)) != 0
}

// wordMask returns a bitmask covering bits [lo, hi) within a single uint64 word.
func wordMask(lo, hi uint) uint64 {
if hi-lo == 64 {
return math.MaxUint64
}

return ((1 << (hi - lo)) - 1) << lo
}

// HasRange reports whether every bit in [lo, hi) is set.
// An empty range returns true. hi is capped to Len().
// Returns false if lo is out of range and the range is non-empty.
func (b *Bitset) HasRange(lo, hi uint) bool {
if lo >= hi {
return true
}
if hi > b.n {
hi = b.n
}
if lo >= hi {
return false
}
for i := lo; i < hi; {
w := i / 64
bit := i % 64
top := min(hi-w*64, 64)
mask := wordMask(bit, top)

if b.words[w].Load()&mask != mask {
return false
}
i = (w + 1) * 64
}

return true
}

// SetRange sets every bit in [lo, hi) using atomic OR.
// hi is capped to Len().
func (b *Bitset) SetRange(lo, hi uint) {
if hi > b.n {
hi = b.n
}
if lo >= hi {
return
}
for i := lo; i < hi; {
w := i / 64
bit := i % 64
top := min(hi-w*64, 64)

b.words[w].Or(wordMask(bit, top))

i = (w + 1) * 64
}
}

// Iterator returns an iterator over the indices of set bits
// in ascending order.
func (b *Bitset) Iterator() iter.Seq[uint] {
return func(yield func(uint) bool) {
for wi := range b.words {
word := b.words[wi].Load()
base := uint(wi) * 64
for word != 0 {
tz := uint(bits.TrailingZeros64(word))
if !yield(base + tz) {
return
}
word &= word - 1
}
}
}
}
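The clear-lowest-set-bit loop that `Iterator` runs per word can be seen in isolation in this standalone sketch (the `setBits` helper is illustrative, not part of the package):

```go
package main

import (
	"fmt"
	"math/bits"
)

// setBits collects the indices of set bits in word, lowest first, using
// TrailingZeros64 to find the next set bit and `word &= word - 1` to clear
// it: the same loop Bitset.Iterator applies to each 64-bit word.
func setBits(word uint64) []uint {
	var out []uint
	for word != 0 {
		out = append(out, uint(bits.TrailingZeros64(word)))
		word &= word - 1 // clear the lowest set bit
	}
	return out
}

func main() {
	fmt.Println(setBits(0b1001010)) // bits 1, 3, and 6 are set: [1 3 6]
}
```

Because the loop visits only set bits, a sparse dirty bitmap iterates in time proportional to the number of dirty blocks, not the total block count.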