Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Facet implementation for type with invariants #80

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tyilo opened this issue Apr 10, 2025 · 31 comments
Closed

Facet implementation for type with invariants #80

tyilo opened this issue Apr 10, 2025 · 31 comments
Labels
✋ question Further information is requested 🎺 soundness Unsoundness in the code or design, we're doing UB / breaking invariants

Comments

@tyilo
Copy link
Contributor

tyilo commented Apr 10, 2025

Is it supposed to be safe to implement Facet with #[derive(Facet)] for a type that has internal invariants?

If so, then facet_json::from_str needs to be unsafe:

use facet::Facet;

type T = u64;

#[derive(Facet)]
pub struct MyVec {
    pub inner: Vec<T>,
}

#[derive(Facet)]
pub struct AtLeastTwoVec {
    inner: Vec<T>,
}

impl AtLeastTwoVec {
    pub fn new(v: Vec<T>) -> Option<Self> {
        if v.len() < 2 {
            None
        } else {
            Some(Self { inner: v })
        }
    }

    pub fn second(&self) -> &T {
        // SAFETY: `inner` has at least length 2
        unsafe { self.inner.get_unchecked(1) }
    }
}

fn main() {
    assert!(MyVec::type_eq::<AtLeastTwoVec>());

    let v = MyVec { inner: vec![1] };
    let peek = facet_peek::Peek::new(&v);
    let json = facet_json::to_json_string(peek, false);

    let v: AtLeastTwoVec = facet_json::from_str(&json).unwrap();
    dbg!(v.second()); // Undefined behavior
}
@fasterthanlime
Copy link
Contributor

Is it supposed to be safe to implement Facet

No, Facet is an unsafe trait.

@fasterthanlime
Copy link
Contributor

fasterthanlime commented Apr 10, 2025

But.. yeah. I see what you mean.

I don't think unsafe derives are a thing yet, really?

@Veykril
Copy link
Contributor

Veykril commented Apr 10, 2025

But.. yeah. I see what you mean.

I don't think unsafe derives are a thing yet, really?

Not yet rust-lang/rfcs#3715

@tyilo
Copy link
Contributor Author

tyilo commented Apr 10, 2025

I think this should be mentioned in the documentation for both the Facet trait and the Facet derive macro.

Maybe the derive trait can be renamed to UnsafeFacet to make this apparent?

@tyilo
Copy link
Contributor Author

tyilo commented Apr 10, 2025

What is the plan for implementing Facet for NonZero<...>?

@fasterthanlime
Copy link
Contributor

If so, then facet_json::from_str needs to be unsafe:

Maybe that's just the solution. It's true that building any type of the deployments facet is inherently unsafe unless you know what you're doing? I don't know.

Maybe the solution here is to expose some method to verify invariants? It wouldn't solve the naming issue or the documentation issue, but we could make from_str return an error if it actually deserializes into something that is not supposed to be representable.

I quite like this idea. In fact, it could be specified as an attribute in the derived macro, something like:

#[derive(Facet)]
#[facet(invariant = Self::at_least_two)]
pub struct AtLeastTwoVec {
    inner: Vec<T>,
}

Or something. It could even return an error value instead of just a bool so we have a nice user experience.

@JSorngard
Copy link

JSorngard commented Apr 10, 2025

serde allows one to deserialize into an intermediate type and then implement a check for the invariant in a TryFrom implementation: https://serde.rs/container-attrs.html#try_from. Might that structure be interesting here?

Example:

#[derive(Deserialize)] 
#[serde(try_from = MaybeFive)] 
struct NotFive {
    int: i32, 
} 

#[derive(Deserialize)] 
struct MaybeFive(i32); 

impl TryFrom<MaybeFive> for NotFive {
    type Err = FiveErr;
    fn try_from(val: MaybeFive) -> Result<Self, Self::Err> {
    if val.0 != 5 {
        Ok(Self { int: val.0 })
    } else {
        Err(FiveErr)
    } 
} 

struct FiveErr;

impl Display for FiveErr {...}

@tyilo
Copy link
Contributor Author

tyilo commented Apr 10, 2025

Should Facet be able to be implemented for structs with invariants that can't be checked at runtime?

Such as:

struct MyByteVec {
  // Safety invariant: `ptr` points to `len` bytes of memory
  ptr: *mut u8,
  len: usize,
}

I think MyByteVec should be able to implement Facet, however it can't implement any safe deserialization.

@fasterthanlime fasterthanlime added the ✋ question Further information is requested label Apr 10, 2025
@fasterthanlime
Copy link
Contributor

fasterthanlime commented Apr 10, 2025

Should Facet be able to be implemented for structs with invariants that can't be checked at runtime?

Yes, but not through the derive, if that makes sense?

I really don't like the UX implications of renaming the derive macro to anything other than Facet, but... Someone from the compiler team was just asking me what we would need to make more progress on facets, so maybe we can get those unsafe derives shipped sooner rather than later??

@fasterthanlime
Copy link
Contributor

I really don't like the UX implications of renaming the derive macro to anything other than Facet, but...

I'm starting to change my mind.

@madsmtm
Copy link

madsmtm commented Apr 10, 2025

I'm not sure I agree with the first example being the fault of the Facet derive; rather, it's the unsafe { self.inner.get_unchecked(1) } that's wrong, because you implemented Facet, and thus the type no longer upholds its invariants.
It'd be the same situation if you'd implemented Default, and we don't mark that trait as unsafe (remember: any unsafe extends to the entire module, just like Unpin is safe but can cause unsoundness elsewhere if implemented wrongly).

Are there other examples of Facet being unsound? If not, then I don't think it needs to be unsafe.

@fasterthanlime
Copy link
Contributor

It'd be the same situation if you'd implemented Default, and we don't mark that trait as unsafe (remember: any unsafe extends to the entire module, just like Unpin is safe but can cause unsoundness elsewhere if implemented wrongly).

That's actually a really good argument!

@JSorngard
Copy link

JSorngard commented Apr 10, 2025

It's also the logic serde uses. Deriving serde::Deserialize has these same pitfalls as deriving Facet does.

Though a method for enforcing invariants during deserialization would be very nice.

@fasterthanlime
Copy link
Contributor

Though a method for enforcing invariants during deserialization would be very nice.

Any ideas on how to allow returning errors there without falling into "no context" or "&'static str only" or "woops we require an allocator now"?

It would be nice to get the details of what invariant failed.... maybe &'static str is not that bad.

@fasterthanlime fasterthanlime added the 🎺 soundness Unsoundness in the code or design, we're doing UB / breaking invariants label Apr 10, 2025
@JSorngard
Copy link

JSorngard commented Apr 10, 2025

I don't think I personally can think of anything except &'static str unfortunately, though I think something like this is beyond my expertise.
If the user-provided function that checks the invariant is given the (maybe incorrectly) deserialized value as input the user could at least return one of a few different &'static strs depending on which invariant is violated.

@tyilo
Copy link
Contributor Author

tyilo commented Apr 10, 2025

Maybe it would make sense to split Facet into to different traits for serialization and deserialization.
Then AtLeastTwoVec could safely implement the serialization trait using derive.

@fasterthanlime
Copy link
Contributor

Maybe it would make sense to split Facet into to different traits for serialization and deserialization.

I would much rather not because, facet is not really about serialization or deserialization. It just happens to be one of the many things you can do with it. I'm perfectly satisfied by the logic explained earlier. You can derive default and defeat your own invariance, and so deriving facet is exactly the same thing.

We can either track validating invariants after parsing here, or we can open a new issue, but I think the rest has been covered pretty well.

@epage
Copy link
Contributor

epage commented Apr 11, 2025

Any ideas on how to allow returning errors there without falling into "no context" or "&'static str only" or "woops we require an allocator now"?

Would use of a dyn Trait be acceptable?

Instead of

fn validate(T) -> Result<(), &'static str>;

You could do

fn validate(T, error: &mut dyn ErrorSink) -> Option<T>;

trait ErrorSink {
    fn report_error(error: ValidateionError);
}

impl ErrorSink for Option<ValidationError>;

#[cfg(feature = "alloc")]
impl ErrorSink for alloc::vec::Vec<VaidationError>;

#[non_exhaustive]
enum ValidationError {
    Message(&'static str),
    #[cfg(feature = "alloc")]
    Error(alloc::boxed::Box<dyn core::error::Error + Send + Sync>),
}
  • Allows multiple errors
  • Allows error recovery (report errors and return T)
  • Core, alloc, and std friendly

Might be interesting to see what some common validation errors are in case more specialized ones could be added, either for better formatting or for richer errors in non-alloc environments (e.g. serde::de::Error).

I went with an enum instead of trait methods (though inherent trait methods could be added as helpers for constructing different variants) as it felt like it would better handle conditional / new cases being added.

@fasterthanlime
Copy link
Contributor

Here's the current draft in #188 (error reporting beyond bool is TBD):

#[derive(Facet)]
#[facet(invariants = "invariants")]
struct MyNonZeroU8 {
    value: u8,
}

impl MyNonZeroU8 {
    // TODO: does this need to be MaybUninit?
    fn invariants(&self) -> bool {
        self.value != 0
    }
}

#[test]
fn struct_with_invariant() {
    facet_testhelpers::setup();

    let (poke, guard) = PokeValueUninit::alloc::<MyNonZeroU8>();
    let poke = poke.into_struct().unwrap();
    let poke = poke
        .field_by_name("value")
        .unwrap()
        .set(0u8)
        .unwrap()
        .into_struct_uninit();

    // This should fail because the invariant is violated (value == 0)
    match poke.build::<MyNonZeroU8>(Some(guard)) {
        Err(ReflectError::InvariantViolation) => {
            // Expected failure due to invariant violation
        }
        Ok(_) => panic!("Expected invariant violation, but build succeeded"),
        Err(e) => panic!("Expected invariant violation, got {:?}", e),
    }
}

Mutation is also an issue even once you get a fully initialized struct. For now, my solution is to go back to a PokeStructUninit, do your manipulations there, and then try to build in place again or something.

@tyilo
Copy link
Contributor Author

tyilo commented Apr 13, 2025

Here's the current draft in #188 (error reporting beyond bool is TBD):

Nice!

There are at least 2 types of invariants:

  • Invariants that can be checked (MyNonZeroU8).
  • Invariants that can't be checked (MyByteVec).

I think we can handle MyByteVec like this:

  • Implement an invariant that always returns false (or always panics), so that it can't be safely constructed. (Maybe we want builtin support for this).
  • Provide unsafe versions of the build functions that doesn't check the invariants.

Something like:

#[derive(Facet)]
#[facet(invariants = "invariants")]
struct MyByteVec {
  // Safety invariant: `ptr` points to `len` bytes of memory
  ptr: *mut u8,
  len: usize,
}

impl MyByteVec {
    fn invariants(&self) -> bool {
        false
        // or maybe:
        // panic!("invariants for MyByteVec can't be checked at runtime")
    }
}

#[test]
fn my_byte_vec() {
    facet_testhelpers::setup();

    let ptr = Box::into_raw(Box::new(123u64));
    let len = std::mem::size_of::<u64>();

    let (poke, guard) = PokeValueUninit::alloc::<MyByteVec>();
    let poke = poke.into_struct().unwrap();
    let poke = poke
        .field_by_name("ptr")
        .unwrap()
        .set(ptr)
        .unwrap()
        .into_struct_uninit();
    let poke = poke
        .field_by_name("len")
        .unwrap()
        .set(len)
        .unwrap()
        .into_struct_uninit();

    let v = unsafe { poke.build_no_invariants_check::<MyByteVec>(Some(guard)) };
    let v = v.unwrap();
}

@fasterthanlime
Copy link
Contributor

I think we can handle MyByteVec like this:

I think so too. The other thing was mutation once something is already fully initialized. We don't want people to be able to just drop the poke and leave the original mutably borrowed value in some invalid state. I honestly don't know how to do this.

I think the poke should just panic if it's dropped, but forget is safe, so there's basically no option for this.

The version of poke that works on a mutable reference rather than taking ownership of the thing is going to be unsafe no matter what, even if it checks the invariants.

@fasterthanlime
Copy link
Contributor

Mutation is also an issue even once you get a fully initialized struct. For now, my solution is to go back to a PokeStructUninit, do your manipulations there, and then try to build in place again or something.

#188 landed, and it doesn't even allow direct mutation, only building from uninitiated to fully initialized.

Invariants can be checked by adding a facet(invariant = method_name) attribute on the container. More functionality can be tracked in other, smaller issues.

@tmandry
Copy link

tmandry commented Apr 16, 2025

Hi @fasterthanlime, I'm curious whether unsafe derives as in rust-lang/rfcs#3715 still be useful to you, or would they change the way you approached your implementation? Or did the way you ended up solving this feel unambiguously better to you?

@fasterthanlime
Copy link
Contributor

Hi @fasterthanlime, I'm curious whether unsafe derives as in rust-lang/rfcs#3715 still be useful to you, or would they change the way you approached your implementation? Or did the way you ended up solving this feel unambiguously better to you?

I think so? I'm not entirely sure. I'm still torn because.. deriving Default if you have a len > 0 invariant is a bug, but are we going to make deriving Default unsafe too? If so, then Facet should be unsafe. If not, then no.

@tmandry
Copy link

tmandry commented Apr 17, 2025

Yeah, violating internal invariants does not itself justify unsafe. What would justify it is if there are invariants depended on by unsafe code (and then we can argue that it's up to the unsafe to ensure there are no derives like Facet that can violate that). There's also work on unsafe fields in Rust that might be interesting here; you could imagine a safe derive not working with those, but declaring an unsafe attribute like #[unsafe(facet_deserialize_safe)] or something like that on one of those.

@tyilo
Copy link
Contributor Author

tyilo commented Apr 17, 2025

I think an important difference between Default (or serde::Deserialize or serde::Serialize) and Facet is that Default only has one purpose: Constructing a default value.

The Facet trait can used for many different things:

  1. Inspecting the shape of a type.
  2. Inspecting the fields of a value.
  3. Constructing a value by initializing its fields one at a time.
  4. Changing a field of an existing value.

Even if a type has invariants, I don't see how deriving Facet could cause problems with 1 and 2. However, deriving Default will either always or never cause problems for a type.

I see three options:

  1. Keep the Facet trait as-is and document the criteria for when it is safe to derive.
  2. Make Facet "unsafe" to derive somehow and document the criteria for when it is safe to derive.
  3. Split Facet into two traits, one of them always safe to derive. The safe trait could cover 1 and 2 above and the unsafe trait could cover 3 and 4.

I think 3 would be the best to prevent footguns.

@GoldsteinE
Copy link

I think they both could be safe: when you’re deriving FacetMut (or whatever it’s going to be called), you know that you effectively make all your fields publicly writable.

I’m not sure that’s the best solution though. The problem with the full reflection design like Facet does is that it’s a giant semver hazard: if most types in the ecosystem derive Facet, then most types in the ecosystem make their internal structure public API. I feel like it’s important for Facet to follow privacy rules in some way, so you can derive Facet and use full reflection on your types inside of your own crate, but the downstream crates only see information about the public fields (and something like non_exhaustive: true). The easiest way I see to implement this is something like

impl TypeImplementingFacet {
    pub(crate) fn pub_crate() -> impl Facet;
    fn priv() -> impl Facet;
}

so to access information about private fields you have to go through .priv().

@fasterthanlime
Copy link
Contributor

The Facet trait can used for many different things:

  1. Inspecting the shape of a type.
  2. Inspecting the fields of a value.
  3. Constructing a value by initializing its fields one at a time.
  4. Changing a field of an existing value.
  1. is the same as Default insofar as, in safe code, facet doesn't let you leave something "partially uninitialized". The Wip API has been designed so that you have to initialize everything, or you can't get a value out. If it's not fully initialized, the inner fields will be dropped properly and everything will be freed. All in safe code. This is a big value proposition of facet actually, IMHO.

As for 4., it cannot be done in safe code at all in facet. Right now you can do it in that you can get the offset of the field, but then to write to it, you have to resort to unsafe.

So I stand by my argument that, with the new Wip interface (it wasn't the case before), deriving Facet trait is not more unsafe than deriving Default, or the offset_of! macro.

@tyilo
Copy link
Contributor Author

tyilo commented Apr 17, 2025

So I stand by my argument that, with the new Wip interface (it wasn't the case before), deriving Facet trait is not more unsafe than deriving Default, or the offset_of! macro.

But what if I only want to derive Facet to inspect the type/value without allowing construction?

@tyilo
Copy link
Contributor Author

tyilo commented Apr 17, 2025

I think they both could be safe: when you’re deriving FacetMut (or whatever it’s going to be called), you know that you effectively make all your fields publicly writable.

True :)

@fasterthanlime
Copy link
Contributor

The problem with the full reflection design like Facet does is that it’s a giant semver hazard: if most types in the ecosystem derive Facet, then most types in the ecosystem make their internal structure public API.

@tyilo @GoldsteinE what use-cases do you see for in-place mutation besides like.. debuggers? Which are inherently unsafe?

It would be interesting if facet let crates annotate "this field is fine to change by yourself, here's the validation function" or something, but it's not... I feel It's not the primary objective.

And it's not relevant to the discussion of whether deriving Facet should be safe or unsafe. I'd be fine with either, but I still haven't heard a compelling argument why it's different from Default. I think @tmandry's phrase here:

Yeah, violating internal invariants does not itself justify unsafe

So that settles that as far as I'm concerned — the rest I'm interested in discussing of course (but maybe separately).


But what if I only want to derive Facet to inspect the type/value without allowing construction?

Then invariants returns false unconditionally? It'd be neat to have some marker trait or flag or something that says Unbuildable or whatever but... you can already today on facet 0.8 prevent building your types.

(You can also just declare it as scalar/opaque and then there's no way to build it through Wip either.)

@facet-rs facet-rs locked and limited conversation to collaborators Apr 17, 2025
@fasterthanlime fasterthanlime converted this issue into discussion #272 Apr 17, 2025

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
✋ question Further information is requested 🎺 soundness Unsoundness in the code or design, we're doing UB / breaking invariants
Projects
None yet
Development

No branches or pull requests

8 participants