-
Notifications
You must be signed in to change notification settings - Fork 259

Description
See rust-lang/rust#72175 and nix-rust/nix#1421: it seems an external library is allowed to assume a file descriptor is private, for example:
#![allow(clippy::blacklisted_name)]
#![deny(unsafe_code)]
use socket2::SockRef;
use std::mem;
#[allow(unsafe_code)]
mod external_library {
use std::{mem, os::unix::io::RawFd, ptr};
static DATA: i32 = -1;
pub struct Foo([RawFd; 2]);
impl Foo {
pub fn new() -> Self {
let mut sockets = [-1; 2];
assert_eq!(
unsafe {
libc::socketpair(
libc::AF_UNIX,
libc::SOCK_SEQPACKET | libc::SOCK_CLOEXEC,
0,
sockets.as_mut_ptr(),
)
},
0
);
Self(sockets)
}
pub fn foo(&self) {
let data: *const _ = &DATA;
assert_eq!(
unsafe {
libc::send(
self.0[0],
ptr::addr_of!(data).cast(),
mem::size_of_val(&data),
0,
)
},
mem::size_of_val(&data) as _
);
}
pub fn bar(&self) -> i32 {
let mut ptr: *const i32 = ptr::null();
assert_eq!(
unsafe {
libc::recv(
self.0[1],
ptr::addr_of_mut!(ptr).cast(),
mem::size_of_val(&ptr),
0,
)
},
mem::size_of_val(&ptr) as _
);
unsafe { *ptr }
}
}
}
fn main() {
let foo = external_library::Foo::new();
SockRef::from(&3)
.send(&[0; mem::size_of::<*const i32>()])
.unwrap();
foo.foo();
dbg!(foo.bar());
}
Here, the safe SockRef::from
allows to crash a (fake) "external library" by letting it dereference a null pointer using only safe code. Making functions that accept arbitrary file descriptors or SOCKET
s unsafe can probably solve this problem.
Metadata
Metadata
Assignees
Labels
No labels
Activity
Thomasdezeeuw commentedon Apr 13, 2021
I have no idea what you mean with private, but if you mean shouldn't be modified by external (outside of the type's
impl
s) that it should not implementAsRawFd
/AsRawSocket
. If it does implementAsRawFd
/AsRawSocket
that it must be able to deal with the situation it's actually used.The
SocketRef::from(&3)
should be unsafe, but not becauseSocketRef::from
should be unsafe. Because you're creating (implicitly) aRawFd
which should be valid, but isn't in this example, that is the unsafe part.Furthermore you're dereferencing a raw pointer using
unsafe
, so you're clearing not only using safe code. Putting code in an "external library" doesn't mean it shouldn't count towards unsafe code, it still is.So, I do agree that creating
RawFd
from literals, especially when passing them directly to function as you did in the example should be unsafe or at least clearer. But I don't agreeSocketRef::from
is unsafe, or should be marked as such.ghost commentedon Apr 13, 2021
Fair. I think at least something in the
3_i32
-> ... ->SockRef
path should be unsafe. Maybe a new wrapper type that represents a file descriptor which will be used by theSockRef
that is unsafe to construct from an arbitrary number?sunfishcode commentedon Apr 13, 2021
It turns out that
RawFd
itself implementsAsRawFd
. SoAsRawFd
doesn't say anything about the validity or ownership of the returnedRawFd
value.rust-lang/rust#72175 is still being discussed, and looks like it may lead to an RFC. If that happens, and if the RFC is accepted, functions that operate on
RawFd
values should beunsafe
. That includesSockRef::from
in its current form, since it accepts any type that implementsAsRawFd
which includesRawFd
itself.That said,
SockRef::from
is a valuable use case, and ideally it should be possible to do what it's doing withoutunsafe
in its API. As one possible approach, theunsafe-io
crate has anOwnsRaw
trait which types that implementAsRawFd
etc. can implement to additionally state that they really do own their handles, and traits likeAsUnsafeSocket
type which combineOwnsRaw
andAsRawFd
(AsRawSocket
on Windows). With these, it should be possible to havefn SockRef::from(t: &T) where T: AsUnsafeSocket
as a safe API. (Aside: unsafe-io is a new crate, and I'm interested if anyone has ideas for how to improve the API!)But of course, it makes sense to wait and see how the discussion in rust-lang/rust#72175 and the possible RFC turn out before making major changes here.
sunfishcode commentedon Apr 15, 2021
I'm working on a draft of a proposal to add new official wording about
RawFd
to Rust.In particular, this proposal would mean functions like
SockRef::from
withAsRawFd
arguments should be marked unsafe, or migrated to alternatives. But the upside would be clearer guarantees for many use cases. I'd be interested in any feedback from folks working on socket2!Thomasdezeeuw commentedon Apr 16, 2021
I don't have time to look through the entire proposal, but
SockRef::from
will remain a safe function. In my opinion theAsRawFd
/AsRawSocket
documentation should be clearer that the returned fd/socket should be valid. The current documentation forAsRawFd
says the followingThe second sentence says to me that the implementation needs to ensure the file descriptor is valid (while the object is alive). Unfortunately
RawFd
is just ac_int
(because it needs to match C semantics) and can "just be created" without being valid, i.e. the literal123
is a "valid" (as in type)RawFd
but of course isn't a valid fd. Couple that with theAsRawFd
impl forRawFd
and we got a problem as shown in the initial issue report.I think the problem is that the creation of
RawFd
should be an unsafe operation in which the programmer needs to ensure it's a valid descriptor. But we can't change that anymore. We can improve the documentation however.sunfishcode commentedon Apr 16, 2021
Instead of making it
unsafe
, what would you think about changingSockRef::from
to useAsUnsafeSocket
in place ofAsRawFd
?AsUnsafeSocket
is implemented for all the relevant std types,TcpStream
,TcpListener
,UdpSocket
,UnixStream
,UnixListener
, andUnixDatagram
. For other types, the only thing needed is to add an implOwnsRaw
, so types likesocket2::Socket
and others could easily implement it. That way,SockGet::from
wouldn't beunsafe
, and many use cases would continue to work as-is.And to be sure, this isn't urgent; I'm still in the process of exploring the options.
Thomasdezeeuw commentedon Apr 17, 2021
I don't think socket2 should dependencies outside of libc/winapi. Furthermore exposing those dependencies in a public API is a bad idea as we're then tied to a fixed version of the crate (i.e. we couldn't update to unsafe-io v2 without also updating to socket2 v2).
I'm also not convinced that
AsUnsafeSocket
is an improvement ofAsRawFd
. I think the best way forward is improving the documentation ofAsRawFd
/AsRawSocket
.sunfishcode commentedon Apr 19, 2021
Right now, it's possible for code in one library to use functions like
SockRef::from
to break the encapsulation of code in another library without itself using anyunsafe
. It's a small loophole in Rust, and it is fixable, so I'm interested in whether any of the possible fixes are practical, rather than just leaving the loophole open and documenting it.One option would be to propose adding the parts of unsafe-io needed to fix this to
std
. TheOwnsRaw
trait might be sufficient. That way you could implementOwnsRaw
and useAsRawFd + OwnsRaw
, without any dependencies. Does something like that sound practical?Thomasdezeeuw commentedon Apr 20, 2021
This is not true. If the encapsulation implements
AsRawFd
, which allows external access to the file descriptor, it must be able to handle changes to the file descriptor, e.g. options are set on it. If it can't deal with this it shouldn't implementAsRawFd
, it's as simple as that.It's not a loophole, it's by design. It's required to deal with C/Unix/Windows I/O model of using integers/void pointers as identifiers to a kernel file description.
I still don't get the point of
OwnsRaw
and the benefit overAsRawFd
, could you elaborate why you think it's an improvement?sunfishcode commentedon Apr 20, 2021
Suppose library A holds a
RawFd
it never exposes, which happens to have the value 4. Suppose library B does this:This compiles on stable Rust today. B can read A's otherwise encapsulated data, with no
unsafe
of its own.Another example would be B accidentally holding a
RawFd
value after closing it, which then aliases a newly created file descriptors in A. POSIX considers all this defined behavior. But in Rust, it's expected that one library shouldn't be able to read another's encapsulated data without usingunsafe
.There's a way to fix this:
from_raw_fd
is unsafe. However there's a loophole:AsRawFd
andIntoRawFd
are not unsafe, so anything can implement them. It's tempting to just document this, and say that implementing them should imply certain guarantees, howeverstd
itself implements them forRawFd
and makes no guarantees.OwnsRaw
solves this by being an unsafe trait. Types which implement it have to use the keywordunsafe
, and in doing so they commit to the desired guarantees.Thomasdezeeuw commentedon Apr 20, 2021
This comment by withoutboats (not pinging) points to the core issue: rust-lang/rust#76969 (comment), that
RawFd
is just a type alias. In your example you're creating aRawFd
from nothing: that should be unsafe, but isn't. That is not a problem with socket2, but one with std lib, which due to the v1 stability guarantees can't be changed.We should clearly document this footgun of using integer literals as
RawFd
in bothSocketRef::from
andAsRawFd
.sunfishcode commentedon Apr 20, 2021
Integer literals are one example. Use-after-
close
is another. Arithmetic and deserialization are others.I agree, we could better document the current situation. At the same time, this problem has the same form as problems with raw pointers, such as dangling and aliasing, which Rust does more than just document. Fixing it seems desirable in principle. And it seems possible. Is it practical?
I agree with withoutboats; the underlying problem is in std.
OwnsRaw
is a new trait, designed to be used with the existing types and traits without changing or replacing them. If there's a consensus around an RFC, then we should be able to arrange for all the important types that implementAsRawFd
to implementOwnsRaw
too—it's just one line of code per type, and it doesn't break anything to add them. std itself wouldn't need any breaking changes. Functions likeSockRef::from
could wait until theOwnsRaw
impls are in place before adding+ OwnsRaw
to their bound to minimize disruption.RalfJung commentedon Apr 23, 2021
I strongly disagree.
AsRawFd
for a file descriptor is the equivalent of casting a reference to an integer. The latter is a safe operation, and so is the former. The inverse, turning an integer to a reference, is unsafe -- not just because the integer could be dangling, but also because that memory could be owned by other parties that e.g. have&mut
orBox
pointers.The situation here is entirely analogous:
as_raw_fd
(the "ptr-to-int cast") is safe; anything working with such potentially dangling, potentially aliased FDs ("dereferencing a raw pointer") should beunsafe
.Again, consider the analogy with references and integers: just because
&mut
can be safely cast to a raw pointer, does not mean that references can handle arbitrary changes to the underlying data through this raw pointer -- it is still wrong to violate the uniqueness assumption of the&mut
reference.Taking a step back, there are two options for how the Rust ecosystem as a whole could treat raw FDs:
Both options are self-consistent and internally coherent. The standard library quite clearly took option 2, as a consequence
as_raw_fd
is safe andfrom_raw_fd
is unsafe andRawFd
is just a type aliased. This is also the option Rust took for pointers/memory. (Here, option 1 is not really realistic as that would violate memory safety.) You seem to claim that what the standard library does is somehow incoherent; that claim is wrong. It is entirely coherent to say both "AsRawFd is safe" and "working with non-owned FDs is unsafe"; this is exactly what Rust does for pointers and it works equally well for FDs. The only incoherence we have here is parts of the ecosystem picking option 2 with others picking option 1.Here you are presuming option 1 -- but that is simply not the option the standard library took. Under option 2, there is nothing wrong with creating a
RawFd
being a safe operation. This is in analogy with how turning ausize
to a*mut i32
is a safe operation.So, it could be unsafe (if option 1 is what we'd use), but saying it "should" be unsafe is incorrect.
Thomasdezeeuw commentedon Apr 27, 2021
I see your point and partially agree. I don't working with fd should be as unsafe as it can/should never cause memory unsafety as the OS always checks the validity of the file descriptor and returns an error if it's invalid. But I do see the point your making.
I have to disagree here. If a type needs to hold some invariant it shouldn't allow arbitrary access to the underlying data. Take
NonZeroUsize
for example, it needs to ensure that the underlying data (usize
) is never zero. So it can hand out mutable reference to the underlying data as they might break the invariant. Compared to sayWrapping
(which doesn't have any such invariant) it's underlying data is public.I see your point.
What do you (not just @RalfJung) suggest the new API would look like? I have some requirements:
&TcpStream
toSocketRef
should be safe (as it is with the current API).mio::net::TcpStream
, don't need to have a (public) dependency on socket2.26 remaining items