Skip to content

Replace MemoryBlock with NonNull<[u8]> #61

Closed
rust-lang/rust
#75152
@SimonSapin

Description

@SimonSapin
Contributor

The return type some methods involves:

pub struct MemoryBlock {
    pub ptr: NonNull<u8>,
    pub size: usize,
}

A pointer and a size, that sounds familiar.

Perhaps a dedicated struct is not needed and this could be NonNull<[u8]> instead?

NonNull::cast can be used to cast to any thin pointer type, similar to accessing the MemoryBlock::ptr field. PR rust-lang/rust#71940 proposes NonNull<[T]>::len for accessing the size, and NonNull<[T]>::slice_from_raw_parts for constructing a new value. (I feel that these would be good to have regardless of what happens to MemoryBlock, and I expect their existence be uncontroversial given the precedent of <*const [T]>::len and ptr::slice_from_raw_parts.)

Activity

Lokathor

Lokathor commented on May 6, 2020

@Lokathor

This makes the struct less repr(C) friendly, so that's a major downside.

TimDiekmann

TimDiekmann commented on May 6, 2020

@TimDiekmann
Member

Why would MemoryBlock need repr(c)?

Lokathor

Lokathor commented on May 6, 2020

@Lokathor

Much easier to interact with foreign code if our allocation system can actually have its primitives sent directly to said foreign code.

for plain data structs, repr(C) is a very sane and good default to follow. We don't need any fancy field reordering here.

TimDiekmann

TimDiekmann commented on May 6, 2020

@TimDiekmann
Member

I see the argument, why #[repr(C)] is good, but I also like NonNull<[u8]>, it just feels intuitive. Is there an option to make NonNull<[u8]> repr(C) somehow in the future? Was there any progression recently, which defines the layout of fat pointers?

Lokathor

Lokathor commented on May 6, 2020

@Lokathor

I am not on T-Lang or the UCG-wg, but they're both having meetings on Thursday and they're open to the public. I'll try to ask about it if I get a chance, but I expect the answer is "no plans at this time".

glandium

glandium commented on May 6, 2020

@glandium

NonNull<[u8]> is actually painful to get the length of.

glandium

glandium commented on May 6, 2020

@glandium

Oh, I missed that there's <*const [T]>::len now.

TimDiekmann

TimDiekmann commented on May 6, 2020

@TimDiekmann
Member

What I really like on NonNull<[u8]> is, that we could far more methods to NonNull in generell inspired by *mut T. Sure, those could also be implemented on MemoryBlock, but this feels off. This also comes in handy when dealing with allocators in general as NonNull is widely used in the allocator API. Originally I introduced MemoryBlock to pass it back to the allocator (grow, shrink, dealloc), but now it's just a tiny wrapper around NonNull and usize, which is quite verbose to type every time. It also requires an additional struct to be in scope.

Much easier to interact with foreign code if our allocation system can actually have its primitives sent directly to said foreign code.

Couldn't you just implement an extension trait for NonNull<[u8]>, when you need to talk to ffi? I mean, we don't want to design the allocator API towards a specific C-API. Passing those pointers to/from ffi is pretty easy with rust-lang/rust#71940.

@SimonSapin What you would pass back to the allocator? NonNull<u8> + Layout? NonNull<[u8]> + usize? Would the latter solve #3?

Lokathor

Lokathor commented on May 6, 2020

@Lokathor

Couldn't you just implement an extension trait for NonNull<[u8]>, when you need to talk to ffi? I mean, we don't want to design the allocator API towards a specific C-API.

NonNull<[u8]> cannot currently be passed directly to C. And of course extension traits don't help you either, this is a layout issue.

I'd be totally fine with this proposal if there was a defined and stable layout for the type, but until then we should not change from a simple struct that's easy to have interact with C to a more "rusty" type that makes it harder to interact with C. Because it's not just C that we're talking about of course, this affects Rust being used in Python, and Ruby, and Erlang and all those other things.

TimDiekmann

TimDiekmann commented on May 6, 2020

@TimDiekmann
Member

NonNull<[u8]> cannot currently be passed directly to C

I didn't mean to pass it directly but use a wrapper struct (like MemoryBlock), which is #[repr(C)] and is used by the extension trait.

Lokathor

Lokathor commented on May 6, 2020

@Lokathor

I suppose, still feels kinda clunky, but sure.

cormacrelf

cormacrelf commented on May 7, 2020

@cormacrelf

A bit off topic, but is the size field designed such that the computation to compute it can be optimised out if it's never read? The only way to return the full usable size from AllocRef::alloc is to always compute and include it. That is, if I'm implementing it, should I decide to always return usable sizes and let the optimiser figure out if it was needed, or decide that it's never worth it? If relying on the optimiser, does that tend to work in practice?

This is more a question for wrapping jemalloc etc, which have an extra function call associated with that. I don't think those wrappers will be able to optimise out an extern "C" call (although static jemalloc with LTO?). For pure Rust allocators you can always just structure your code not to forget the usable size.

(Edit: it's also my understanding that eventually, the list of __rg_alloc > GlobalAlloc::alloc (etc) calls is not the spot you should assume AllocRef will be inlined... I don't know how opaque those __rg functions are. But from now, it's more each specific Box<T, MyAllocator> instance, right?)

SimonSapin

SimonSapin commented on May 7, 2020

@SimonSapin
ContributorAuthor

@cormacrelf This issue is about what type to use in order to represent a pointer and size together. It’s independent of whether and when to do that in the first place, as opposed to only returning a pointer.

Always returning the usable size was proposed in #17 (comment). I think how optimization interacts with the global allocator indirection is interesting, please consider filing a new issue.

I don't know how opaque those __rg functions are.

I don’t remember the exact scenario I tried, but quite some time ago I observed that __rg_alloc could be inlined with LTO (thin or not) but not without. This is unsurprising as it is just before linking (after compiling each crate) that the compiler decides what allocator to use (depending on which if any crate defines a #[global_allocator]) and synthesizes __rust_alloc and other trampoline functions.

40 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

    Development

    Participants

    @petertodd@Amanieu@SimonSapin@cormacrelf@glandium

    Issue actions

      Replace `MemoryBlock` with `NonNull<[u8]>` · Issue #61 · rust-lang/wg-allocators