ACP Implement `fmt::Write` wrapper for `&mut [u8]` (and maybe `&mut str`?) #278

jmillikin · 2023-10-13T04:26:06Z

Proposal

Problem statement

The ability to place formatted text into a buffer seems like a useful primitive to have, so I propose adding types to core::fmt that implement Write on top of a user-provided &mut [u8]. These types would provide a safe way to access the portion of the buffer that has been written to.

Additional variants that seem useful to add at the same time:

A wrapper of &mut str that uses str::as_bytes_mut(), with careful attention toward handling the EOF case. This would let the user avoid UTF8 revalidation.
Writers to &mut [MaybeUnint<u8>] would involve a bit more setup boilerplate but enable bypassing the cost of buffer initialization.

Motivating examples or use cases

I frequently find myself writing a BufWriter { &mut [u8] } wrapper struct in no_std code when I want to do the Rust equivalent of snprintf(), for example when constructing Linux mount() data options in rust-fuse: https://github.com/jmillikin/rust-fuse/blob/47ca3bc5cda263543cf6a71155115a85dd1d094e/fuse/os/linux.rs#L539-L560.

Solution sketch

Something roughly like this -- I'm not attached to the names or the exact API, and the real implementation would leverage unsafe-unchecked methods to avoid bounds checks.

pub struct BufWriter<'a> {
	buf: &'a mut [u8],
	count: usize,
}

impl<'a> BufWriter<'a> {
	pub fn new(buf: &'a mut [u8]) -> BufWriter<'a> {
		BufWriter { buf, count: 0 }
	}

	pub fn bytes(&self) -> &[u8] {
		&self.buf[..self.count]
	}

	pub fn into_bytes(self) -> &'a [u8] {
		&self.buf[..self.count]
	}

	// maybe? I find this method useful when glomming together chunks of &CStr, but
	// I'd be fine leaving it out if libs-team dislikes it.
	//
	// pub fn write_bytes(&mut self, b: &[u8]) -> core::fmt::Result { ... }
}

impl core::fmt::Write for BufWriter<'_> {
	fn write_str(&mut self, s: &str) -> core::fmt::Result {
		let b = s.as_bytes();
		let avail = &mut self.buf[self.count..];
		if b.len() > avail.len() {
			return Err(core::fmt::Error);
		}
		avail[..b.len()].copy_from_slice(b);
		self.count += b.len();
		Ok(())
	}
}

Alternatives

Add a function like write_to(buf: &mut [u8], w: &impl Write) -> Result<usize, fmt::Error> to core::fmt.
- This API feels unergonomic to me, because the user must keep track of write offsets themselves.
Add a type modeled on std::io::Cursor, which implements additional methods for reading and seeking.
- Unclear where this new type would go -- core::io::Cursor isn't possible because that would cause name conflicts with std::io::Cursor, and core::fmt would be a weird place to put a type that does more than formatting.
- More logic to implement than writer-only wrappers, but maybe code could be relocated from std::io::Cursor ?

Links and related work

This pattern is a popular StackOverflow question:

There's another open proposal for fmt::Write that seems related:

Add fmt::Write to io::Write adapter #133 proposes adding a wrapper for formatted writing to a std::io::Write.

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

We think this problem seems worth solving, and the standard library might be the right place to solve it.
We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

The text was updated successfully, but these errors were encountered:

m-ou-se · 2023-10-17T15:58:43Z

Does arrayvec::ArrayString work for your use cases?

I'd be interested in getting something like ArrayVec and ArrayString into core.

jmillikin · 2023-10-18T03:41:28Z

I think ArrayString might be solving a different problem, although I've not used it before so it could be that I've missed something in the docs.

First, an ArrayString seems to use a constant-sized underlying buffer (const CAP: usize). I don't see a way to construct one from a &mut [u8], so there's no way to (for example) format a value into a fixed-size slice of a larger buffer. The user would have to use an intermediate ArrayString, format into it, then do a slice-to-slice copy into the target buffer.

Second, this feature would be closer to ArrayVec<u8>, since otherwise there's no way to mix UTF8 and non-UTF8 data in the buffer. For example when working with older C APIs, sometimes the text encoding is raw byte values (ISO-8859-1) or a legacy encoding (e.g. Shift_JIS) so the flexibility of &mut [u8] is helpful.

jmillikin · 2023-11-15T10:10:56Z

I put together a draft implementation: rust-lang/rust#117931

It's not large (~120 lines including docs), and enables formatting data into a &mut [u8] or &mut [MaybeUninit<u8>]. It works well as a replacement for format!() in no-alloc code.

ChayimFriedman2 · 2023-11-29T17:15:59Z

Can we re-use core::io::BorrowedBuf and core::io::BorrowedCursor for this?

jmillikin · 2023-11-30T02:56:23Z

Can we re-use core::io::BorrowedBuf and core::io::BorrowedCursor for this?

Implementing fmt::Write for those types also seems like a reasonable idea, but they're not a substitute for this ACP because their API doesn't provide the same affordances -- in particular there's no way to obtain the written portion of their buffer via deconstruction.

You could use a BorrowedBuf in the implementation of fmt::WriteCursor, but the implementation is already so trivial that it doesn't seem worth the effort.

ChayimFriedman2 · 2023-11-30T04:20:07Z

in particular there's no way to obtain the written portion of their buffer via deconstruction

What do you mean? We can get the written portion via BorrowedBuf::filled(). The only problem is that we cannot return a &str using from_utf8_unchecked(), since the user might write invalid UTF-8 bytes manually.

jmillikin · 2023-11-30T05:01:13Z

in particular there's no way to obtain the written portion of their buffer via deconstruction

What do you mean? We can get the written portion via BorrowedBuf::filled(). The only problem is that we cannot return a &str using from_utf8_unchecked(), since the user might write invalid UTF-8 bytes manually.

That's not deconstruction, it's a reborrow.

ChayimFriedman2 · 2023-11-30T06:57:13Z

We can add a method for deconstruction if it's a common need (if not, it can be emulated by taking the len() of the BorrowedBuf, dropping it then slicing the original data according to it, but it will require unsafe code if the data is MaybeUninit).

jmillikin · 2023-11-30T07:28:33Z

We can add a method for deconstruction if it's a common need (if not, it can be emulated by taking the len() of the BorrowedBuf, dropping it then slicing the original data according to it, but it will require unsafe code if the data is MaybeUninit).

I think the ACP covers what I'm about to say, but just to clarify: I know a BorrowedBuf (or any &mut [u8]) can be used to store the result of write!() by writing a dozen lines of glue code. The goal of this ACP is to have that glue code be part of the standard library, not ad-hoc reimplemented in each crate that needs to do alloc-free formatting.

Consider C, which has the sprintf() family of functions. It is very easy to use sprintf() to format into a buffer:

#include <assert.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>

void fmt_u16(char *buf, size_t buf_len, uint16_t v) {
    assert(buf_len >= 6); 
    sprintf(buf, "%hu", v); 
}

int main() {
    char buf[6];
    fmt_u16(buf, sizeof buf, 12345);
    printf("formatted: %s\n", buf);
    return 0;
}

The equivalent in Rust would be written like this in current stable (lifetimes explicit for clarity):

use core::fmt;
use core::mem::MaybeUninit;

fn fmt_u16<'a>(buf: &'a mut [MaybeUninit<u8>], v: u16) -> &'a str {
    struct W<'a> { buf: &'a mut [MaybeUninit<u8>], idx: usize }
    impl fmt::Write for W<'_> {
        fn write_str(&mut self, s: &str) -> fmt::Result {
            self.buf[self.idx..self.idx+s.len()].copy_from_slice(unsafe {
                core::mem::transmute(s.as_bytes())
            });
            self.idx += s.len();
            Ok(())
        }
    }
    let mut w = W { buf, idx: 0 };
    let _ = fmt::Write::write_fmt(&mut w, format_args!("{v}"));
    unsafe {
        let utf8: &[u8] = core::mem::transmute(&w.buf[..w.idx]);
        core::str::from_utf8_unchecked(utf8)
    }
}

fn main() {
    let mut buf = [MaybeUninit::uninit(); 6];
    println!("formatted: {:?}", fmt_u16(&mut buf, 12345));
}

And it would be really nice if all that extra stuff wasn't needed -- if non-allocating format look like this:

fn fmt_u16<'a>(buf: &'a mut [MaybeUninit<u8>], v: u16) -> &'a str {
    let mut cursor = fmt::WriteCursor::new_uninit(buf);
    write!(cursor, "{v}");
    cursor.into_str()
}

fn main() {
    let mut buf = [MaybeUninit::uninit(); 6];
    println!("formatted: {:?}", fmt_u16(&mut buf, 12345));
}

SUPERCILEX · 2023-12-09T23:20:42Z

I just put out a crate that tries to solve the composition problem in a generalized fashion: https://github.com/SUPERCILEX/io-adapters

Example usage that (I believe) solves this proposal's use case:

//! ```cargo
//! [dependencies]
//! io-adapters = "0.1"
//! ```
#![feature(core_io_borrowed_buf)] // Or just use a normal array if you want to keep things simple

use core::fmt;
use std::{io::BorrowedBuf, mem::MaybeUninit, str::from_utf8};

use io_adapters::WriteExtension;

fn main() {
    let mut buf = [MaybeUninit::uninit(); 128];
    let mut buf = BorrowedBuf::from(buf.as_mut_slice());

    foo(buf.unfilled().write_adapter());

    println!("{:?}", from_utf8(buf.filled()));
}

fn foo(mut output: impl fmt::Write) {
    writeln!(output, "Hello, World!").unwrap();
}

jmillikin added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 13, 2023

jmillikin mentioned this issue Nov 15, 2023

Add core::fmt::WriteCursor for formatting into a borrowed buffer. rust-lang/rust#117931

Closed

SUPERCILEX mentioned this issue Dec 9, 2023

Add fmt::Write to io::Write adapter #133

Open

jmillikin closed this as not planned Won't fix, can't repro, duplicate, stale Jun 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ACP Implement `fmt::Write` wrapper for `&mut [u8]` (and maybe `&mut str`?) #278

ACP Implement `fmt::Write` wrapper for `&mut [u8]` (and maybe `&mut str`?) #278

jmillikin commented Oct 13, 2023 •

edited

Loading

m-ou-se commented Oct 17, 2023

jmillikin commented Oct 18, 2023

jmillikin commented Nov 15, 2023

ChayimFriedman2 commented Nov 29, 2023

jmillikin commented Nov 30, 2023

ChayimFriedman2 commented Nov 30, 2023

jmillikin commented Nov 30, 2023

ChayimFriedman2 commented Nov 30, 2023

jmillikin commented Nov 30, 2023 •

edited

Loading

SUPERCILEX commented Dec 9, 2023

ACP Implement fmt::Write wrapper for &mut [u8] (and maybe &mut str?) #278

ACP Implement fmt::Write wrapper for &mut [u8] (and maybe &mut str?) #278

Comments

jmillikin commented Oct 13, 2023 • edited Loading

Proposal

Problem statement

Motivating examples or use cases

Solution sketch

Alternatives

Links and related work

What happens now?

Possible responses

m-ou-se commented Oct 17, 2023

jmillikin commented Oct 18, 2023

jmillikin commented Nov 15, 2023

ChayimFriedman2 commented Nov 29, 2023

jmillikin commented Nov 30, 2023

ChayimFriedman2 commented Nov 30, 2023

jmillikin commented Nov 30, 2023

ChayimFriedman2 commented Nov 30, 2023

jmillikin commented Nov 30, 2023 • edited Loading

SUPERCILEX commented Dec 9, 2023

ACP Implement `fmt::Write` wrapper for `&mut [u8]` (and maybe `&mut str`?) #278

ACP Implement `fmt::Write` wrapper for `&mut [u8]` (and maybe `&mut str`?) #278

jmillikin commented Oct 13, 2023 •

edited

Loading

jmillikin commented Nov 30, 2023 •

edited

Loading