-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Implement String::drain(range) according to RFC 574 #25028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @brson (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see CONTRIBUTING.md for more information. |
Previous vec drain PR is #24781. NOTE: The iterator is really the side show. Iterating a slice out of the string can be done more flexibly with just The lead role here is the ability to remove a range from anywhere in the string. So this gives String, (just like just given to Vec), magic memmove (a.k.a. ptr::copy) capabilities. This is needed in the parser itself at a FIXME marked issue #12884 in src/libsyntax/codemap.rs, where we can avoid reallocating a whole file's source code. |
I added a change to libsyntax to put it into use immediately. Not a big thing, but it clears up a FIXME and avoided some other already avoidable string copying. |
Hrm, I wish this composed on Vec::drain better. You could just call |
(otherwise this basically LGTM) |
I considered that but I didn't want to take the chance that the iteration in the Vec::drain Drop wouldn't be compiled out in the u8 case. Maybe it will optimize well. |
This is the asm think it looks good enough? Panic code goes away only if we repeat the bounds checks #![feature(collections_drain)]
#![crate_type="lib"]
#[inline(never)]
pub fn remove_range(v: &mut Vec<u8>, a: usize, b: usize) {
if a <= b && b <= v.len() {
v.drain(a..b);
}
} .section .text._ZN12remove_range20h1888d273e1a01667eaaE,"ax",@progbits
.globl _ZN12remove_range20h1888d273e1a01667eaaE
.align 16, 0x90
.type _ZN12remove_range20h1888d273e1a01667eaaE,@function
_ZN12remove_range20h1888d273e1a01667eaaE:
.cfi_startproc
cmpq %fs:112, %rsp
ja .LBB0_2
movabsq $24, %r10
movabsq $0, %r11
callq __morestack
retq
.LBB0_2:
pushq %r15
.Ltmp0:
.cfi_def_cfa_offset 16
pushq %r14
.Ltmp1:
.cfi_def_cfa_offset 24
pushq %rbx
.Ltmp2:
.cfi_def_cfa_offset 32
.Ltmp3:
.cfi_offset %rbx, -32
.Ltmp4:
.cfi_offset %r14, -24
.Ltmp5:
.cfi_offset %r15, -16
movq %rsi, %rbx
movq %rdi, %r14
cmpq %rdx, %rbx
ja .LBB0_6
movq 8(%r14), %r15
subq %rdx, %r15
jb .LBB0_6
movq %rbx, 8(%r14)
je .LBB0_6
movq (%r14), %rdi
addq %rdi, %rdx
addq %rbx, %rdi
movq %rdx, %rsi
movq %r15, %rdx
callq memmove@PLT
addq %rbx, %r15
movq %r15, 8(%r14)
.LBB0_6:
popq %rbx
popq %r14
popq %r15
retq
.Ltmp6:
.size _ZN12remove_range20h1888d273e1a01667eaaE, .Ltmp6-_ZN12remove_range20h1888d273e1a01667eaaE
.cfi_endproc |
I have been told that this looks good enough for now, at least. |
} | ||
|
||
#[unstable(feature = "collections_drain", reason = "recently added")] | ||
impl<'a> ExactSizeIterator for Drain<'a> { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the underlying Chars
iterator itself isn't an exact size iterator, so we may not be able to provide this implementation just yet :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'll remove that, I just didn't think that through.
Currently this isn't quite as 0-cost as I also believe that our naming conventions for strings would call this method |
Running the iterator is optional, not needed for removing the range. (See drop). I would suggest dropping the iterator part and just naming it remove_range, but the RFC doesn't allow for that. As I said previously above, there are many iterator choices for strings and it's simple to do them just before you use remove_range. |
The API is unstable, so I'm happy to land Edit: this is in reply to remove_range, not whether it should be _chars or byte-oriented. |
I expect we will write another RFC to straighten out the use of ranges and to add drain to other collections |
One advantage implementation-wise of byte-oriented is that this just becomes a pure forwarding of Vec::drain. On the other hand that might make it less useful (just get the inner Vec)? |
I implemented another toy thing for String first, a remove_range that returns a value that does the memmove on drop, just like this iterator. But instead of returning an iterator, it just returns something that derefs to &str. You can use it to iterate with any of the &str iterators instead of just char, but it's too complicated to explain and it's cuter than useful. |
Right, of course!
Yes I agree, I don't want to hold it up too much on this point, mostly just thinking out loud. |
If you have a range:
Just use that range to both iterate with arbitrary iterator and then remove the range?
|
Thank you both. It always improves a lot with review. |
`.drain(range)` is unstable and under feature(collections_drain). This adds a safe way to remove any range of a String as efficiently as possible. As noted in the code, this drain iterator has none of the memory safety issues of the vector version. RFC tracking issue is rust-lang#23055
Avoid creating a new String when there is no BOM to strip, and otherwises use .drain(..3) to strip the BOM using the same allocation.
Pushed a squash of the review fix into the first patch. It was needed to keep it tidy. No changes, just the squash. |
⌛ Testing commit da03c9d with merge 700b4c1... |
collections: Implement String::drain(range) according to RFC 574 `.drain(range)` is unstable and under feature(collections_drain). This adds a safe way to remove any range of a String as efficiently as possible. As noted in the code, this drain iterator has none of the memory safety issues of the vector version. RFC tracking issue is #23055
collections: Implement String::drain(range) according to RFC 574
.drain(range)
is unstable and under feature(collections_drain).This adds a safe way to remove any range of a String as efficiently as
possible.
As noted in the code, this drain iterator has none of the memory safety
issues of the vector version.
RFC tracking issue is #23055