Get rid of new_uninitialized and new_uninitialized_generic. #556

sebcrozet · 2019-02-23T10:40:27Z

The two unsafe functions Matrix::new_uninitialized and Matrix::new_uninitialized_generic are used when we don't care about the initial value of a matrix/vector because they will be overwritten by the result of other operations. We use them mostly for performance reasons to avoid an unnecessary initialization step.

However, this is the most significant blocker for #282. So if we ever want to support non-Copy types, this is the first thing to address in a way that does not hurt performance for Copy types.

aweinstock314 · 2019-11-15T21:21:52Z

I'm interested in working on this, as I want non-Copy scalars (#282) for matrices of gmp::Mpz for some cryptanalysis tooling that I'm writing. I've gotten the benchmarks running locally (requires PR #678) so I can measure performance impact.

It looks like currently like new_uninitialized is implemented (by the macros impl_constructors and componentwise_constructors_impl) in terms of MatrixMN::new_uninitialized_generic, which in turn calls DefaultAllocator::allocate_uninitialized. DefaultAllocator::allocate_uninitialized calls mem::uninitialized if the dimensions are static, and Vec::{reserve_exact,set_len} if either dimension is dynamic.

In order to support non-Copy scalars while still using the above scheme for Copy scalars (for performance), it seems necessary to add another dimension to this type-case (e.g. go from {static, dynamic} x {static, dynamic} to {Copy, !Copy} x {static, dynamic} x {static, dynamic} for Allocator implementations.

I've tried the following sketches of some trait-level approaches to allowing this branching, but neither quite work:

mod bools {
    pub enum True {}
    pub enum False {}
    pub trait TypeLevelBool {}
    impl TypeLevelBool for True {}
    impl TypeLevelBool for False {}
    pub trait TraitLevelTrue: TypeLevelBool {}
    impl TraitLevelTrue for True {}
    pub trait TraitLevelFalse: TypeLevelBool {}
    impl TraitLevelFalse for False {}
}
use bools::{TypeLevelBool, True, False};

pub unsafe trait IsUninitializable { type Value: TypeLevelBool; }
unsafe impl IsUninitializable for u8 { type Value = True; }
unsafe impl IsUninitializable for Vec<u8> { type Value = False; }
pub trait AllocatorIsh<S: IsUninitializable> {
    unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<S>;
}
pub struct DefaultAllocatorIsh;

// the following attempt fails because equality bounds aren't yet supported, but if they were, this would guarantee non-overlap
#[cfg(attempt1)]
mod attempt1 {
    use super::*;
    impl<S: IsUninitializable> AllocatorIsh<S> for DefaultAllocatorIsh where <S as IsUninitializable>::Value = True {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<S> {
            let mut ret = Vec::new();
            ret.reserve_exact(n);
            ret.set_len(n);
            ret
        }
    }
    impl AllocatorIsh<Vec<u8>> for DefaultAllocatorIsh where <Vec<u8> as IsUninitializable>::Value = False {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<Vec<u8>> {
            vec![vec![]; n]
        }
    }
}

// the following attempt fails to compile because from rustc's point of view, bools could semver-compatibly implement TraitLevelFalse for True, so this doesn't guarantee non-overlap
#[cfg(attempt2)]
mod attempt2 {
    use super::*;
    use bools::{TraitLevelTrue, TraitLevelFalse};
    impl<S: IsUninitializable> AllocatorIsh<S> for DefaultAllocatorIsh where <S as IsUninitializable>::Value: TraitLevelTrue {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<S> {
            let mut ret = Vec::new();
            ret.reserve_exact(n);
            ret.set_len(n);
            ret
        }
    }
    impl AllocatorIsh<Vec<u8>> for DefaultAllocatorIsh where <Vec<u8> as IsUninitializable>::Value: TraitLevelFalse {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<Vec<u8>> {
            vec![vec![]; n]
        }
    }
}

fn main() {}

Does this seem like a good approach, in the sense that if the trait metaprogramming can be made to work, I'm not missing any performance/safety concerns?

aweinstock314 · 2019-11-15T22:27:47Z

I got a variant of the first approach working, and added more comments:

mod bools {
    pub enum True {}
    pub enum False {}
    pub trait TypeLevelBool {}
    impl TypeLevelBool for True {}
    impl TypeLevelBool for False {}
}
use bools::{TypeLevelBool, True, False};

pub unsafe trait IsUninitializable { type Value: TypeLevelBool; }
// examples of Copy types that should use uninitialized
unsafe impl IsUninitializable for u8 { type Value = True; }
unsafe impl IsUninitializable for u16 { type Value = True; }
// example of a !Copy type that shouldn't use uninitialized
unsafe impl IsUninitializable for Vec<u8> { type Value = False; }

// AllocatiorIsh is Allocator specialized to the dynamic case, as mem::uninitialized isn't used explicitly, but is implicit in the use of set_len
// this should extend to cross-product impls for static and dynamic dimensions straightforwardly
pub trait AllocatorIsh<S: IsUninitializable> {
    unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<S>;
}
pub struct DefaultAllocatorIsh;

mod attempt3 {
    use super::*;
    // any type can opt into the optimized case by `unsafe impl`-ing IsInitializable<Value=True>
    impl<S: IsUninitializable<Value=True>> AllocatorIsh<S> for DefaultAllocatorIsh {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<S> {
            let mut ret = Vec::new();
            ret.reserve_exact(n);
            ret.set_len(n);
            ret
        }
    }
    // for IsUninitializable<Value=False> types, instances can be given explicitly, with ad-hoc initialization, by downstream crates
    impl AllocatorIsh<Vec<u8>> for DefaultAllocatorIsh {
        unsafe fn allocate_maybe_uninitialized(n: usize) -> Vec<Vec<u8>> {
            vec![vec![]; n]
        }
    }
}

fn main() {}

aweinstock314 · 2019-11-19T19:33:36Z

On further reflection, it looks like the point of std::mem::MaybeUninit is exactly for this use case (surpressing UB on uninitialized values with references in them). I'll look into how to make Allocator::allocate_uninitialied return a Buffer with N=MaybeUniit<S>. If it works out, I think this approach will requires less boilerplate from downstream crates.

sebcrozet · 2019-11-21T21:55:01Z

@aweinstock314 The proposal based on Isuninitializable is interesting, though it feels wrong to require all the users of matrices with custom types to implement an unsafe trait. In addition the current way the Allocator trait is implemented is already quite difficult as it requires to deal with several combinations of parameters (see there). Adding a third dimension will likely make this even more complicated.

The actual solution would be to find a way to leverage the MaybeUninit type as you suggest.

I'll look into how to make Allocator::allocate_uninitialied return a Buffer with N = MaybeUninit<S>

Returning a buffer with a scalar type equal to MaybeUninit<N> (it's better avoid using S as a type parameter because it is already used for the generic Storage throughout nalgebra) may be a good idea. Though is there any way to get a Matrix<N, ...> from a Matrix<MaybeUninit<N>, ...> and viceversa? The documentation appears to say that transmuting a [MaybeUninit<Vec<...>>] to a [Vec<...>] is sound as long as all the elements have been initialized. Is it still the case for our matrix types?

aweinstock314 · 2019-12-16T23:17:09Z

#684 supports non-Copy scalars well enough to typecheck some (currently private) code that uses gmp::Mpz (with a newtype wrapper to implement Scalar). At runtime, there's a double-free in malloc in gmp::Mpz::drop, which is likely caused by mem::uninitialized, so fixing this is my next task.

aweinstock314 · 2020-01-22T00:32:52Z

Could I get feedback on the current design?

The core change is adding an UninitBuffer type to the Allocator trait: https://github.com/aweinstock314/nalgebra/blob/649507db1daf4c6aac55d9ebc33371cf86cd7d58/src/base/allocator.rs#L23-L26 . This type "lifts" mem::MaybeUninit to the Storage level. allocate _uninitialized doesn't yet use it; making that change typecheck will involve fixing all the uses of uninitialized memory in the codebase.

With respect to the soundness of converting between Matrix<N, ...> and Matrix<MaybeUninit<N>, ...> (which in the current approach, would be between Matrix<N, R, C, FooStorage<N, R, C, InitializedTag>> and Matrix<N, R, C, FooStorage<N, R, C, UninitializedTag>>), that should follow for the same reasons that it works for Vec. I'll add helpers to convert between the two as part of the refactoring to be triggered by changing the return type of Allocator::allocate_uninitialized.

The definition of UninitBuffer requires some extra machinery to be added to the Storage trait, that's done here: https://github.com/aweinstock314/nalgebra/blob/649507db1daf4c6aac55d9ebc33371cf86cd7d58/src/base/storage.rs#L37-L65 .

Most of the rest of the changes already done to this branch involve getting {ArrayStorage, VecStorage, SliceStorage} to work with the new definitions. This includes some mostly-duplicate definitions of {,Contiguous}Storage{,Mut} across {InitializedTag, UninitializedTag} which might be worth defining via macro_rules.

This is currently blocked waiting for a fix for a rustc ICE (probably a duplicate of rust-lang/rust#68090, which already has a fix in progress).

sebcrozet · 2020-01-23T17:59:05Z

This looks like a great approach!
I believe allocate_uninitialized(), just like MaybeUninit::uninit(), does not have to be unsafe.

Once the trait architecture woks, the main challenge will be to make the methods from the base::blas module work (starting with the simplest method: axpy). In particular the transition between initialized and uninitialized types may not be so easy to handle transparently.

RalfJung · 2020-08-31T14:48:42Z

Heads-up, in rustc we are trying to better detect UB due to misuse of uninitialized memory (rust-lang/rust#71274), and I think this issue came up as a regression for the next stage of that check. Specifically, this crater regression looks to me like it is caused by these uninitialized methods being insta-UB on a type that involves bool in its name -- and indeed, and uninitialized bool is UB.

RalfJung · 2020-08-31T14:52:26Z

Ah, but you are using the dreadful MaybeUninit::uninit().assume_init() and thus side-stepping our ability to detect this problem. So you are "good"... in a sense. ;)

…allocate_uninitialized` and `Matrix::new_uninitialized_generic`. Most call sites still invoke UB through `assume_init`. Said call sites instead invoke `unimplemented!()` if the `no_unsound_assume_init` feature is enabled, to make it easier to gradually fix them. Progress towards dimforge#556.

sebcrozet added enhancement P-medium Medium priority breaking change Fixing this issue, or merging this pull-request is likely to require breaking changes labels Feb 23, 2019

sebcrozet changed the title ~~Get rid of new_uninitialized and new_uninitialized_generic.~~ Get rid of new_uninitialized and new_uninitialized_generic. Feb 23, 2019

sebcrozet mentioned this issue Aug 5, 2019

Explain why each use of unsafe is sound #628

Open

aweinstock314 mentioned this issue Dec 5, 2019

Consider supporting non-Copy scalar types. #282

Open

josephlr mentioned this issue Mar 31, 2020

Fix warnings on latest rustc #705

Merged

RalfJung mentioned this issue Aug 31, 2020

might_permit_raw_init: also check aggregate fields rust-lang/rust#71274

Merged

aweinstock314 mentioned this issue Nov 27, 2020

Propagate mem::MaybeUninit through the return types of `Allocator::… #803

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Get rid of new_uninitialized and new_uninitialized_generic. #556

Get rid of new_uninitialized and new_uninitialized_generic. #556

sebcrozet commented Feb 23, 2019 •

edited

Loading

aweinstock314 commented Nov 15, 2019

Uh oh!

aweinstock314 commented Nov 15, 2019

Uh oh!

aweinstock314 commented Nov 19, 2019

Uh oh!

sebcrozet commented Nov 21, 2019 •

edited

Loading

Uh oh!

aweinstock314 commented Dec 16, 2019

Uh oh!

aweinstock314 commented Jan 22, 2020

Uh oh!

sebcrozet commented Jan 23, 2020

Uh oh!

RalfJung commented Aug 31, 2020

Uh oh!

RalfJung commented Aug 31, 2020

Uh oh!

Uh oh!

Get rid of new_uninitialized and new_uninitialized_generic. #556

Get rid of new_uninitialized and new_uninitialized_generic. #556

Comments

sebcrozet commented Feb 23, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

aweinstock314 commented Nov 15, 2019

Uh oh!

aweinstock314 commented Nov 15, 2019

Uh oh!

aweinstock314 commented Nov 19, 2019

Uh oh!

sebcrozet commented Nov 21, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aweinstock314 commented Dec 16, 2019

Uh oh!

aweinstock314 commented Jan 22, 2020

Uh oh!

sebcrozet commented Jan 23, 2020

Uh oh!

RalfJung commented Aug 31, 2020

Uh oh!

RalfJung commented Aug 31, 2020

Uh oh!

sebcrozet commented Feb 23, 2019 •

edited

Loading

sebcrozet commented Nov 21, 2019 •

edited

Loading