Closed
Description
From reddit: http://www.reddit.com/r/rust/comments/2bpenl/confused_by_the_purpose_of_str_and_string/cj7rt1u
- add a section for indexing - i.e. how do I compare just the first 3 characters of two strings? Or fetch the character in position 3 in the string? Or iterate through the characters in the string?
- comparing - i.e. how do I know if one string is greater than another - and on what basis is the ordering done (binary value of the strings, or based on character set)?
- applying regular expressions to strings, is &str preferred over String for this?
I like all of these, and they should be in the guide.
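For reference, the three indexing questions above can be sketched with the chars() iterator. This is only an illustration, assuming "character" means a Unicode scalar value (a char) rather than a grapheme; the helper names same_prefix and char_at are made up for this example and do not come from the guide.

```rust
// Hypothetical helpers for illustration only.

/// Compares up to the first `n` chars (Unicode scalar values) of `a` and `b`.
fn same_prefix(a: &str, b: &str, n: usize) -> bool {
    a.chars().take(n).eq(b.chars().take(n))
}

/// Returns the char at zero-based position `i`, walking the string from the
/// start. This is O(i), because UTF-8 is a variable-width encoding.
fn char_at(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i)
}

fn main() {
    assert!(same_prefix("string", "strong", 3));
    assert!(!same_prefix("string", "spring", 3));

    assert_eq!(char_at("héllo", 3), Some('l'));
    assert_eq!(char_at("hi", 3), None);

    // Iteration yields one char per Unicode scalar value, not one per byte.
    for c in "héllo".chars() {
        println!("{}", c);
    }
}
```

Both helpers walk the string from the start instead of indexing into it, which is why they return an Option or a bool rather than panicking on short input.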
Activity
Title changed from "String Guide suggetsions" to "String Guide suggestions"
chris-morgan commented on Jul 26, 2014
Anything about indexing should be minimal and should be very strongly suggesting that you shouldn’t be doing this in the first place. Almost always, a string should be opaque data. Iteration is just about the only way you should ever do such things, and even iteration should very seldom be done.
Alternatives that may serve certain purposes are begins_with and ends_with, and graphemes() will need to be mentioned.
As for comparison, UTF-8 has the convenient property that bitwise comparison yields the same answers as codepoint comparison. Of course there is still the question of composed versus decomposed characters and so on and so forth… the simple summary is that you really shouldn’t be doing comparisons, either.
People want to do all these operations on strings, but it seems to me that the more experienced you get, the more you realise that these sorts of operations are all unsound and should never really be done.
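A small sketch of the comparison points made in this comment, assuming current Rust, where the prefix test is spelled starts_with (as far as I know, the begins_with name is no longer in the standard library). The helper orders_agree is hypothetical and exists only to make the byte order versus code point order claim checkable.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: true when ordering by UTF-8 bytes and ordering by
/// code points give the same answer for this pair of strings.
fn orders_agree(a: &str, b: &str) -> bool {
    a.as_bytes().cmp(b.as_bytes()) == a.chars().cmp(b.chars())
}

fn main() {
    // str ordering is a lexicographic comparison of the UTF-8 bytes.
    assert_eq!("apple".cmp("banana"), Ordering::Less);
    // Uppercase ASCII sorts before lowercase because its byte values are smaller.
    assert_eq!("Zebra".cmp("apple"), Ordering::Less);

    // Byte order and code point order agree, even across encoded lengths.
    assert!(orders_agree("é", "z"));
    assert!(orders_agree("\u{FFFF}", "\u{10000}"));

    // Composed and decomposed forms of the same visible text are not equal.
    let composed = "\u{e9}"; // é as a single code point
    let decomposed = "e\u{301}"; // e followed by a combining acute accent
    assert_ne!(composed, decomposed);

    // Prefix and suffix tests avoid indexing entirely.
    assert!("s3.amazonaws.com".starts_with("s3."));
    assert!("s3.amazonaws.com".ends_with(".com"));
}
```

The composed versus decomposed case is the one that makes naive comparison unsound: both strings render the same, but they compare as different.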
steveklabnik commented on Jul 26, 2014
Agreed. This is a good place to explain that.
nielsle commented on Jul 26, 2014
In #15997 I tried to rearrange the sections to introduce String before &str. That makes it easier to introduce &str as a view into String. (I agree that indexing should be discouraged, but indexing makes it easy to explain how &str is different from String)
This PR is mostly meant as an experiment. Feel free to close it if you are already editing the chapter or if you are heading in a different direction.
pcn commented on Jul 27, 2014
The string guide should cover some common use cases and describe the rust-ish (rusty? oxidi-shous?) way to handle them. My case is this:
I want to take a string (e.g. a url) and use it to do something not too complex (e.g. authenticate to the AWS S3 api). This involves taking the url, and deciding based on the url which of the two available formats will be used, and returning the string that will be used to determine the signature of the request.
This means some slicing and dicing. Coming from python/ruby/go/clojure (even C) the easiest answer is to split the string and compare to known values (e.g. does the hostname bit of the URL start with "s3.amazonaws.com"), which lends itself naturally to a match. The odd part is that I pass in one kind of string (an &str), and get another kind out (a String), where I need to be familiar with a whole different set of traits vs. &str. My understanding is that I should prefer String types, and I can see this being a common idiom - so much so that there should probably be some agreement on how something like this could be made more obvious:
I would like to have documented where the convention should be to place type conversions via to_str() and collect() etc. It would be nice to just be able to say that e.g. I should just convert &str strings to String and document which operations on a String are similar to common string operations in other languages (comparisons, splitting, joining, tokenizing, etc.), explain the slice types and how to operate with them (and why they exist) and just overall make it so that there is a clear path to doing common things the easy way.
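One possible shape for the use case described above, as a sketch in current Rust: take a &str, split it into borrowed pieces, match on the host, and only allocate a String for the return value. The function name describe_s3_url and the returned labels are placeholders invented for this example; they are not the actual S3 signing rules.

```rust
/// Hypothetical helper: classifies a URL by its host and returns an owned
/// String. The returned labels are placeholders, not real S3 signing output.
fn describe_s3_url(url: &str) -> String {
    // Drop the scheme ("https://") if there is one. `split` yields &str
    // slices that borrow from `url`, so nothing is copied here.
    let rest = url.split("://").nth(1).unwrap_or(url);
    // The host is everything before the first '/'.
    let host = rest.split('/').next().unwrap_or("");

    match host {
        "s3.amazonaws.com" => "path-style".to_string(),
        other => match other.strip_suffix(".s3.amazonaws.com") {
            // Allocation happens only here, when the result is built.
            Some(bucket) => format!("bucket-in-host:{}", bucket),
            None => "unknown".to_string(),
        },
    }
}

fn main() {
    assert_eq!(
        describe_s3_url("https://s3.amazonaws.com/bucket/key"),
        "path-style"
    );
    assert_eq!(
        describe_s3_url("https://mybucket.s3.amazonaws.com/key"),
        "bucket-in-host:mybucket"
    );
    assert_eq!(describe_s3_url("http://example.com/"), "unknown");
}
```

The conversion to String sits at the boundary where ownership is actually needed; all the slicing and matching in between works on &str.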
samdoshi commented on Jul 28, 2014
Would it be a good idea to discuss std::str::MaybeOwned here? When it's appropriate to use it and when it isn't.
steveklabnik commented on Jul 28, 2014
@samdoshi it might. I know nothing about it.
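For context, a sketch of the "maybe owned" idea in current Rust. This assumes std::borrow::Cow, which as far as I know covers the role std::str::MaybeOwned had at the time of this thread; the function underscore_spaces is a made-up example.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces spaces with underscores, allocating a new
/// String only when the input actually contains a space.
fn underscore_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    let untouched = underscore_spaces("no_spaces");
    let rewritten = underscore_spaces("has some spaces");

    // The first call borrows the input; the second had to allocate.
    assert!(matches!(untouched, Cow::Borrowed(_)));
    assert!(matches!(rewritten, Cow::Owned(_)));

    // Either variant derefs to &str, so callers can treat them the same.
    assert_eq!(&*untouched, "no_spaces");
    assert_eq!(&*rewritten, "has_some_spaces");
}
```

This kind of type seems appropriate when most inputs pass through unchanged; if the function always allocates, returning String is simpler.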
lee-b commented on Jul 30, 2014
How come the strings guide doesn't mention the char type (utf32) at all? ;)
steveklabnik commented on Jul 30, 2014
Strings are UTF8, not UTF32.
lee-b commented on Jul 30, 2014
I know, but that makes it even more confusing and in need of explanation. Why is an str a u8 slice, rather than chars, and why IS there a char that's 32-bit, but not part of string, etc.? ;)
I get it, at a low level: char is a 32-bit value, capable of representing all (most?) unicode codepoints as a fixed-length binary value. But it's not clear why they called that char, why there's no "byte" type, why string is essentially a vector of bytes, but already converted from bytes to unicode (rather than using stronger typing, and calling it char8, for example). The low-level stuff is understandable, but the high-level design / reasoning, and how to use char along with all this... that stuff's not so clear.
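The relationship between char, bytes, and str can be checked directly. The sketch below assumes current Rust, where u8 serves as the byte type; the helper lengths is hypothetical and only there to put the two counts side by side.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (length in bytes, length in chars).
fn lengths(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // A char is a fixed-width 4-byte Unicode scalar value; u8 is one byte.
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(size_of::<u8>(), 1);

    // 'é' takes two bytes in UTF-8, so the two counts differ.
    assert_eq!(lengths("héllo"), (6, 5));

    // A str can be viewed as its underlying bytes at no cost.
    let bytes: &[u8] = "héllo".as_bytes();
    assert_eq!(bytes[1], 0xC3);

    // Going the other way is checked: not every byte sequence is valid UTF-8.
    assert!(std::str::from_utf8(&[0xC3]).is_err());
    assert_eq!(std::str::from_utf8(bytes).unwrap(), "héllo");

    // Decoding to chars gives fixed-width values at the cost of more memory.
    let chars: Vec<char> = "héllo".chars().collect();
    assert_eq!(chars[1], 'é');
}
```

So a str is stored as bytes but guaranteed to be valid UTF-8, while char is what comes out when those bytes are decoded one scalar value at a time.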
reem commented on Jul 30, 2014
It might be a good idea to bring up Str, which makes it easier to write APIs that are agnostic to the type of string they receive.
pcn commented on Aug 5, 2014
From the current state of the guide:
Insight into examples of both would be helpful.
I'd like to know what you'd think about this language:
Just below that it says:
That reads as a bit confusing to me. If I understand it, would this preserve the meaning and provide some more clarity?
steveklabnik commented on Aug 5, 2014
Yes, that means the same thing. I feel they're about equally clear, but if you feel that it's more...
pcn commented on Aug 5, 2014
Maybe there's a better phrasing? I feel like from the perspective of the un-initiated reader, the extra information helps by describing the mechanism and the context.
steveklabnik commented on Aug 11, 2014
Adding a section on c_str and FFI would be good as well.
http://doc.rust-lang.org/std/c_str/index.html
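A rough sketch of what such a section might show, assuming current Rust, where the C string types live in std::ffi (CString and CStr) rather than the c_str module linked above. The function c_style_len is a hypothetical stand-in for a real C function, which would normally be declared in an extern "C" block.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that reads a NUL-terminated string.
fn c_style_len(ptr: *const c_char) -> usize {
    // SAFETY: the caller must pass a valid, NUL-terminated string that stays
    // alive for the duration of this call.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_bytes().len()
}

fn main() {
    // CString owns a copy of the bytes with a trailing NUL appended.
    let owned = CString::new("hello").expect("no interior NUL bytes");
    assert_eq!(owned.as_bytes_with_nul(), &b"hello\0"[..]);

    // `owned` must outlive the pointer handed to the C side.
    assert_eq!(c_style_len(owned.as_ptr()), 5);

    // Rust strings may contain NUL bytes; C strings cannot.
    assert!(CString::new("he\0llo").is_err());

    // Converting back to &str re-validates the bytes as UTF-8.
    let back: &str = owned.to_str().expect("valid UTF-8");
    assert_eq!(back, "hello");
}
```

The two failure points worth documenting are visible here: interior NUL bytes on the way out, and UTF-8 validation on the way back in.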
l0kod commented on Aug 18, 2014
The guide should maybe add a note to highlight the Str trait, which can be used as a generic parameter if the function doesn't care about owning (or not) the string. This way, it's possible to use &str or String, which might be convenient:
l0kod commented on Aug 18, 2014
In general, the guide should encourage traits as function parameters instead of concrete types.
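A sketch of a string-agnostic function in current Rust. It assumes AsRef<str> as the bound, since to my knowledge the Str trait mentioned in this thread is no longer part of the standard library; the function shout is invented for this example.

```rust
/// Hypothetical function that accepts anything viewable as a &str,
/// without caring whether the caller owns the string.
fn shout<S: AsRef<str>>(text: S) -> String {
    text.as_ref().to_uppercase()
}

fn main() {
    let borrowed: &str = "hello";
    let owned: String = String::from("world");

    assert_eq!(shout(borrowed), "HELLO");
    // A reference to a String works too, and leaves `owned` usable afterwards.
    assert_eq!(shout(&owned), "WORLD");
    // Passing the String by value moves it into the function.
    assert_eq!(shout(owned), "WORLD");
}
```

A plain &str parameter is often enough, because a &String coerces to &str at the call site; the generic bound mainly helps when callers want to pass owned values directly.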
steveklabnik commented on Jan 12, 2015
I think that most of this has been tackled, if there are specific improvements, please open new issues with one per issue.