Skip to content

Conversation

@nabijaczleweli
Copy link

Get the file size from seeking to the end (canonical, equivalent to File::stream_len()) instead of fstat(); this is more real and unseekable files are unmappable (conversely, in the vast majority of cases, seekable files are mappable)

Always try to map (size permitting) instead of only mapping S_IFREGs and fall back to reading on failure (instead of hard-erroring). This means we can map devices and correctly fall back if the mapping failed for any number of inocuous reasons (mapping too big, too many mappings, whatever)

This also ends up faster (lseek()+mmap() instead of statx()+statx()+mmap())

Modelled after https://git.sr.ht/~nabijaczleweli/voreutils/commit/e4621ad93ae69d28c032932ad617c026790c0a40

@oconnor663
Copy link
Member

Could you give me an example of something that gets mmap'ed now that wasn't getting mmap'ed before? Is there something under /proc on Linux?

The original memmap (not memmap2) had what was arguably a bug, where it could map files that were larger than isize::MAX, which is in fact UB in Rust. I'm pretty sure memmap2 errors in that case, but if we're supplying the length explicitly I want to double check that we're still on a codepath that checks that for us.

@oconnor663
Copy link
Member

In fact I think memmap2 is not checking that isize::MAX limit for us if we set len explicitly: https://github.com/RazrFalcon/memmap2-rs/blob/91bc53aa13605c9538b94564d1c02cce424ad9c0/src/lib.rs#L241-L281. I'm a little bit surprised by that, and I'm considering filing it as a bug.

@nabijaczleweli
Copy link
Author

nabijaczleweli commented Jun 1, 2025

The main driver for this is devices; b3sum /dev/rbd3 works now.

Why would it check that? This is a valid map:

lseek(3, 0, SEEK_END)                   = 2247483658
mmap(NULL, 2247483658, PROT_READ, MAP_SHARED, 3, 0) = 0x71449000

(this is 2^31+100000010).

It's not possible to make a file bigger than 2^63-1 (but even if it weren't, the map would just fail as usual when you give it a length that's too big).

Get the file size from seeking to the end
(canonical, equivalent to File::stream_len()) instead of fstat();
this is more real and unseekable files are unmappable
(conversely, in the vast majority of cases, seekable files are mappable)

Always try to map (size permitting) instead of only mapping S_IFREGs
and fall back to reading on failure (instead of hard-erroring).
This means we can map devices and correctly fall back if the mapping
failed for any number of inocuous reasons (mapping too big, too many
mappings, whatever)

This also ends up faster
(lseek()+mmap() instead of statx()+statx()+mmap())

Modelled after https://git.sr.ht/~nabijaczleweli/voreutils/commit/e4621ad93ae69d28c032932ad617c026790c0a40
@nabijaczleweli
Copy link
Author

CI failure appears unrelated

@nabijaczleweli
Copy link
Author

RazrFalcon/memmap2-rs#145 merged

@nabijaczleweli
Copy link
Author

nabijaczleweli commented Jul 13, 2025

...and now rebuilding this picks up memmap2 0.9.7 which has the check implemented

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants