Skip to content

SockRef::from, Socket::sendfile and other functions that operate on arbitrary file descriptors or SOCKETs potentially should be unsafe #218

@ghost

Description

See rust-lang/rust#72175 and nix-rust/nix#1421: it seems an external library is allowed to assume a file descriptor is private, for example:

#![allow(clippy::blacklisted_name)]
#![deny(unsafe_code)]

use socket2::SockRef;
use std::mem;

#[allow(unsafe_code)]
mod external_library {
    use std::{mem, os::unix::io::RawFd, ptr};

    static DATA: i32 = -1;

    pub struct Foo([RawFd; 2]);

    impl Foo {
        pub fn new() -> Self {
            let mut sockets = [-1; 2];
            assert_eq!(
                unsafe {
                    libc::socketpair(
                        libc::AF_UNIX,
                        libc::SOCK_SEQPACKET | libc::SOCK_CLOEXEC,
                        0,
                        sockets.as_mut_ptr(),
                    )
                },
                0
            );
            Self(sockets)
        }

        pub fn foo(&self) {
            let data: *const _ = &DATA;
            assert_eq!(
                unsafe {
                    libc::send(
                        self.0[0],
                        ptr::addr_of!(data).cast(),
                        mem::size_of_val(&data),
                        0,
                    )
                },
                mem::size_of_val(&data) as _
            );
        }

        pub fn bar(&self) -> i32 {
            let mut ptr: *const i32 = ptr::null();
            assert_eq!(
                unsafe {
                    libc::recv(
                        self.0[1],
                        ptr::addr_of_mut!(ptr).cast(),
                        mem::size_of_val(&ptr),
                        0,
                    )
                },
                mem::size_of_val(&ptr) as _
            );
            unsafe { *ptr }
        }
    }
}

fn main() {
    let foo = external_library::Foo::new();
    SockRef::from(&3)
        .send(&[0; mem::size_of::<*const i32>()])
        .unwrap();
    foo.foo();
    dbg!(foo.bar());
}

Here, the safe SockRef::from allows to crash a (fake) "external library" by letting it dereference a null pointer using only safe code. Making functions that accept arbitrary file descriptors or SOCKETs unsafe can probably solve this problem.

Activity

Thomasdezeeuw

Thomasdezeeuw commented on Apr 13, 2021

@Thomasdezeeuw
Collaborator

See rust-lang/rust#72175 and nix-rust/nix#1421: it seems an external library is allowed to assume a file descriptor is private, for example:

I have no idea what you mean with private, but if you mean shouldn't be modified by external (outside of the type's impls) that it should not implement AsRawFd/AsRawSocket. If it does implement AsRawFd/AsRawSocket that it must be able to deal with the situation it's actually used.

Here, the safe SockRef::from allows to crash a (fake) "external library" by letting it dereference a null pointer using only safe code. Making functions that accept arbitrary file descriptors or SOCKETs unsafe can probably solve this problem.

The SocketRef::from(&3) should be unsafe, but not because SocketRef::from should be unsafe. Because you're creating (implicitly) a RawFd which should be valid, but isn't in this example, that is the unsafe part.

Furthermore you're dereferencing a raw pointer using unsafe, so you're clearing not only using safe code. Putting code in an "external library" doesn't mean it shouldn't count towards unsafe code, it still is.

So, I do agree that creating RawFd from literals, especially when passing them directly to function as you did in the example should be unsafe or at least clearer. But I don't agree SocketRef::from is unsafe, or should be marked as such.

ghost

ghost commented on Apr 13, 2021

@ghost

The SocketRef::from(&3) should be unsafe, but not because SocketRef::from should be unsafe. Because you're creating (implicitly) a RawFd which should be valid, but isn't in this example, that is the unsafe part.

So, I do agree that creating RawFd from literals, especially when passing them directly to function as you did in the example should be unsafe or at least clearer. But I don't agree SocketRef::from is unsafe, or should be marked as such.

Fair. I think at least something in the 3_i32 -> ... -> SockRef path should be unsafe. Maybe a new wrapper type that represents a file descriptor which will be used by the SockRef that is unsafe to construct from an arbitrary number?

sunfishcode

sunfishcode commented on Apr 13, 2021

@sunfishcode
Member

It turns out that RawFd itself implements AsRawFd. So AsRawFd doesn't say anything about the validity or ownership of the returned RawFd value.

rust-lang/rust#72175 is still being discussed, and looks like it may lead to an RFC. If that happens, and if the RFC is accepted, functions that operate on RawFd values should be unsafe. That includes SockRef::from in its current form, since it accepts any type that implements AsRawFd which includes RawFd itself.

That said, SockRef::from is a valuable use case, and ideally it should be possible to do what it's doing without unsafe in its API. As one possible approach, the unsafe-io crate has an OwnsRaw trait which types that implement AsRawFd etc. can implement to additionally state that they really do own their handles, and traits like AsUnsafeSocket type which combine OwnsRaw and AsRawFd (AsRawSocket on Windows). With these, it should be possible to have fn SockRef::from(t: &T) where T: AsUnsafeSocket as a safe API. (Aside: unsafe-io is a new crate, and I'm interested if anyone has ideas for how to improve the API!)

But of course, it makes sense to wait and see how the discussion in rust-lang/rust#72175 and the possible RFC turn out before making major changes here.

sunfishcode

sunfishcode commented on Apr 15, 2021

@sunfishcode
Member

I'm working on a draft of a proposal to add new official wording about RawFd to Rust.

In particular, this proposal would mean functions like SockRef::from with AsRawFd arguments should be marked unsafe, or migrated to alternatives. But the upside would be clearer guarantees for many use cases. I'd be interested in any feedback from folks working on socket2!

Thomasdezeeuw

Thomasdezeeuw commented on Apr 16, 2021

@Thomasdezeeuw
Collaborator

I'm working on a draft of a proposal to add new official wording about RawFd to Rust.

In particular, this proposal would mean functions like SockRef::from with AsRawFd arguments should be marked unsafe, or migrated to alternatives. But the upside would be clearer guarantees for many use cases. I'd be interested in any feedback from folks working on socket2!

I don't have time to look through the entire proposal, but SockRef::from will remain a safe function. In my opinion the AsRawFd/AsRawSocket documentation should be clearer that the returned fd/socket should be valid. The current documentation for AsRawFd says the following

This method does not pass ownership of the raw file descriptor to the caller. The descriptor is only guaranteed to be valid while the original object has not yet been destroyed.

The second sentence says to me that the implementation needs to ensure the file descriptor is valid (while the object is alive). Unfortunately RawFd is just a c_int (because it needs to match C semantics) and can "just be created" without being valid, i.e. the literal 123 is a "valid" (as in type) RawFd but of course isn't a valid fd. Couple that with the AsRawFd impl for RawFd and we got a problem as shown in the initial issue report.

I think the problem is that the creation of RawFd should be an unsafe operation in which the programmer needs to ensure it's a valid descriptor. But we can't change that anymore. We can improve the documentation however.

sunfishcode

sunfishcode commented on Apr 16, 2021

@sunfishcode
Member

Instead of making it unsafe, what would you think about changing SockRef::from to use AsUnsafeSocket in place of AsRawFd?

AsUnsafeSocket is implemented for all the relevant std types, TcpStream, TcpListener, UdpSocket, UnixStream, UnixListener, and UnixDatagram. For other types, the only thing needed is to add an impl OwnsRaw, so types like socket2::Socket and others could easily implement it. That way, SockGet::from wouldn't be unsafe, and many use cases would continue to work as-is.

And to be sure, this isn't urgent; I'm still in the process of exploring the options.

Thomasdezeeuw

Thomasdezeeuw commented on Apr 17, 2021

@Thomasdezeeuw
Collaborator

Instead of making it unsafe, what would you think about changing SockRef::from to use AsUnsafeSocket in place of AsRawFd?

I don't think socket2 should dependencies outside of libc/winapi. Furthermore exposing those dependencies in a public API is a bad idea as we're then tied to a fixed version of the crate (i.e. we couldn't update to unsafe-io v2 without also updating to socket2 v2).

I'm also not convinced that AsUnsafeSocket is an improvement of AsRawFd. I think the best way forward is improving the documentation of AsRawFd/AsRawSocket.

AsUnsafeSocket is implemented for all the relevant std types, TcpStream, TcpListener, UdpSocket, UnixStream, UnixListener, and UnixDatagram. For other types, the only thing needed is to add an impl OwnsRaw, so types like socket2::Socket and others could easily implement it. That way, SockGet::from wouldn't be unsafe, and many use cases would continue to work as-is.

And to be sure, this isn't urgent; I'm still in the process of exploring the options.

sunfishcode

sunfishcode commented on Apr 19, 2021

@sunfishcode
Member

Right now, it's possible for code in one library to use functions like SockRef::from to break the encapsulation of code in another library without itself using any unsafe. It's a small loophole in Rust, and it is fixable, so I'm interested in whether any of the possible fixes are practical, rather than just leaving the loophole open and documenting it.

One option would be to propose adding the parts of unsafe-io needed to fix this to std. The OwnsRaw trait might be sufficient. That way you could implement OwnsRaw and use AsRawFd + OwnsRaw, without any dependencies. Does something like that sound practical?

Thomasdezeeuw

Thomasdezeeuw commented on Apr 20, 2021

@Thomasdezeeuw
Collaborator

Right now, it's possible for code in one library to use functions like SockRef::from to break the encapsulation of code in another library without itself using any unsafe.

This is not true. If the encapsulation implements AsRawFd, which allows external access to the file descriptor, it must be able to handle changes to the file descriptor, e.g. options are set on it. If it can't deal with this it shouldn't implement AsRawFd, it's as simple as that.

It's a small loophole in Rust, and it is fixable, so I'm interested in whether any of the possible fixes are practical, rather than just leaving the loophole open and documenting it.

It's not a loophole, it's by design. It's required to deal with C/Unix/Windows I/O model of using integers/void pointers as identifiers to a kernel file description.

One option would be to propose adding the parts of unsafe-io needed to fix this to std. The OwnsRaw trait might be sufficient. That way you could implement OwnsRaw and use AsRawFd + OwnsRaw, without any dependencies. Does something like that sound practical?

I still don't get the point of OwnsRaw and the benefit over AsRawFd, could you elaborate why you think it's an improvement?

sunfishcode

sunfishcode commented on Apr 20, 2021

@sunfishcode
Member

Suppose library A holds a RawFd it never exposes, which happens to have the value 4. Suppose library B does this:

let s = socket2::SockRef::from(&4);
let mut buf = vec![std::mem::MaybeUninit::new(0u8); 32];
s.peek(&mut buf[..])?;

This compiles on stable Rust today. B can read A's otherwise encapsulated data, with no unsafe of its own.

Another example would be B accidentally holding a RawFd value after closing it, which then aliases a newly created file descriptors in A. POSIX considers all this defined behavior. But in Rust, it's expected that one library shouldn't be able to read another's encapsulated data without using unsafe.

There's a way to fix this: from_raw_fd is unsafe. However there's a loophole: AsRawFd and IntoRawFd are not unsafe, so anything can implement them. It's tempting to just document this, and say that implementing them should imply certain guarantees, however std itself implements them for RawFd and makes no guarantees.

OwnsRaw solves this by being an unsafe trait. Types which implement it have to use the keyword unsafe, and in doing so they commit to the desired guarantees.

Thomasdezeeuw

Thomasdezeeuw commented on Apr 20, 2021

@Thomasdezeeuw
Collaborator

Suppose library A holds a RawFd it never exposes, which happens to have the value 4. Suppose library B does this:

let s = socket2::SockRef::from(&4);
let mut buf = vec![std::mem::MaybeUninit::new(0u8); 32];
s.peek(&mut buf[..])?;

This comment by withoutboats (not pinging) points to the core issue: rust-lang/rust#76969 (comment), that RawFd is just a type alias. In your example you're creating a RawFd from nothing: that should be unsafe, but isn't. That is not a problem with socket2, but one with std lib, which due to the v1 stability guarantees can't be changed.

We should clearly document this footgun of using integer literals as RawFd in both SocketRef::from and AsRawFd.

sunfishcode

sunfishcode commented on Apr 20, 2021

@sunfishcode
Member

Integer literals are one example. Use-after-close is another. Arithmetic and deserialization are others.

I agree, we could better document the current situation. At the same time, this problem has the same form as problems with raw pointers, such as dangling and aliasing, which Rust does more than just document. Fixing it seems desirable in principle. And it seems possible. Is it practical?

I agree with withoutboats; the underlying problem is in std. OwnsRaw is a new trait, designed to be used with the existing types and traits without changing or replacing them. If there's a consensus around an RFC, then we should be able to arrange for all the important types that implement AsRawFd to implement OwnsRaw too—it's just one line of code per type, and it doesn't break anything to add them. std itself wouldn't need any breaking changes. Functions like SockRef::from could wait until the OwnsRaw impls are in place before adding + OwnsRaw to their bound to minimize disruption.

RalfJung

RalfJung commented on Apr 23, 2021

@RalfJung
Member

I have no idea what you mean with private, but if you mean shouldn't be modified by external (outside of the type's impls) that it should not implement AsRawFd/AsRawSocket. If it does implement AsRawFd/AsRawSocket that it must be able to deal with the situation it's actually used.

I strongly disagree. AsRawFd for a file descriptor is the equivalent of casting a reference to an integer. The latter is a safe operation, and so is the former. The inverse, turning an integer to a reference, is unsafe -- not just because the integer could be dangling, but also because that memory could be owned by other parties that e.g. have &mut or Box pointers.

The situation here is entirely analogous: as_raw_fd (the "ptr-to-int cast") is safe; anything working with such potentially dangling, potentially aliased FDs ("dereferencing a raw pointer") should be unsafe.

This is not true. If the encapsulation implements AsRawFd, which allows external access to the file descriptor, it must be able to handle changes to the file descriptor, e.g. options are set on it. If it can't deal with this it shouldn't implement AsRawFd, it's as simple as that.

Again, consider the analogy with references and integers: just because &mut can be safely cast to a raw pointer, does not mean that references can handle arbitrary changes to the underlying data through this raw pointer -- it is still wrong to violate the uniqueness assumption of the &mut reference.

Taking a step back, there are two options for how the Rust ecosystem as a whole could treat raw FDs:

  1. "free-for-all", no ownership, doing arbitrary operations on arbitrary FDs is, in principle, allowed.
  2. "exclusive", with ownership, one may only operate on FDs that one owns.

Both options are self-consistent and internally coherent. The standard library quite clearly took option 2, as a consequence as_raw_fd is safe and from_raw_fd is unsafe and RawFd is just a type aliased. This is also the option Rust took for pointers/memory. (Here, option 1 is not really realistic as that would violate memory safety.) You seem to claim that what the standard library does is somehow incoherent; that claim is wrong. It is entirely coherent to say both "AsRawFd is safe" and "working with non-owned FDs is unsafe"; this is exactly what Rust does for pointers and it works equally well for FDs. The only incoherence we have here is parts of the ecosystem picking option 2 with others picking option 1.

In your example you're creating a RawFd from nothing: that should be unsafe, but isn't.

Here you are presuming option 1 -- but that is simply not the option the standard library took. Under option 2, there is nothing wrong with creating a RawFd being a safe operation. This is in analogy with how turning a usize to a *mut i32 is a safe operation.
So, it could be unsafe (if option 1 is what we'd use), but saying it "should" be unsafe is incorrect.

Thomasdezeeuw

Thomasdezeeuw commented on Apr 27, 2021

@Thomasdezeeuw
Collaborator

I strongly disagree. AsRawFd for a file descriptor is the equivalent of casting a reference to an integer. The latter is a safe operation, and so is the former. The inverse, turning an integer to a reference, is unsafe -- not just because the integer could be dangling, but also because that memory could be owned by other parties that e.g. have &mut or Box pointers.

The situation here is entirely analogous: as_raw_fd (the "ptr-to-int cast") is safe; anything working with such potentially dangling, potentially aliased FDs ("dereferencing a raw pointer") should be unsafe.

I see your point and partially agree. I don't working with fd should be as unsafe as it can/should never cause memory unsafety as the OS always checks the validity of the file descriptor and returns an error if it's invalid. But I do see the point your making.

Again, consider the analogy with references and integers: just because &mut can be safely cast to a raw pointer, does not mean that references can handle arbitrary changes to the underlying data through this raw pointer -- it is still wrong to violate the uniqueness assumption of the &mut reference.

I have to disagree here. If a type needs to hold some invariant it shouldn't allow arbitrary access to the underlying data. Take NonZeroUsize for example, it needs to ensure that the underlying data (usize) is never zero. So it can hand out mutable reference to the underlying data as they might break the invariant. Compared to say Wrapping (which doesn't have any such invariant) it's underlying data is public.

Taking a step back, there are two options for how the Rust ecosystem as a whole could treat raw FDs:

1. "free-for-all", no ownership, doing arbitrary operations on arbitrary FDs is, in principle, allowed.

2. "exclusive", with ownership, one may only operate on FDs that one owns.

Both options are self-consistent and internally coherent. The standard library quite clearly took option 2, as a consequence as_raw_fd is safe and from_raw_fd is unsafe and RawFd is just a type aliased. This is also the option Rust took for pointers/memory. (Here, option 1 is not really realistic as that would violate memory safety.) You seem to claim that what the standard library does is somehow incoherent; that claim is wrong. It is entirely coherent to say both "AsRawFd is safe" and "working with non-owned FDs is unsafe"; this is exactly what Rust does for pointers and it works equally well for FDs. The only incoherence we have here is parts of the ecosystem picking option 2 with others picking option 1.

In your example you're creating a RawFd from nothing: that should be unsafe, but isn't.

Here you are presuming option 1 -- but that is simply not the option the standard library took. Under option 2, there is nothing wrong with creating a RawFd being a safe operation. This is in analogy with how turning a usize to a *mut i32 is a safe operation.
So, it could be unsafe (if option 1 is what we'd use), but saying it "should" be unsafe is incorrect.

I see your point.

What do you (not just @RalfJung) suggest the new API would look like? I have some requirements:

  • No external dependencies.
  • Preferably the operation requires no unsafe for known valid fds, e.g. going from &TcpStream to SocketRef should be safe (as it is with the current API).
  • Preferably crates that expose socket types, e.g. mio::net::TcpStream, don't need to have a (public) dependency on socket2.

26 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Participants

      @RalfJung@Thomasdezeeuw@sunfishcode@notgull

      Issue actions

        `SockRef::from`, `Socket::sendfile` and other functions that operate on arbitrary file descriptors or `SOCKET`s potentially should be unsafe · Issue #218 · rust-lang/socket2