Skip to content

Write better module-level and type-level docs for String #30146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 2, 2015
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
198 changes: 194 additions & 4 deletions src/libcollections/string.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,14 @@
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! An owned, growable string that enforces that its contents are valid UTF-8.
//! A UTF-8 encoded, growable string.
//!
//! This module contains the [`String`] type, a trait for converting
//! [`ToString`]s, and several error types that may result from working with
//! [`String`]s.
//!
//! [`String`]: struct.String.html
//! [`ToString`]: trait.ToString.html

#![stable(feature = "rust1", since = "1.0.0")]

Expand All @@ -29,7 +36,190 @@ use str::{self, FromStr, Utf8Error, Chars};
use vec::Vec;
use boxed::Box;

/// A growable string stored as a UTF-8 encoded buffer.
/// A UTF-8 encoded, growable string.
///
/// The `String` type is the most common string type that has ownership over the
/// contents of the string. It has a close relationship with its borrowed
/// counterpart, the primitive [`str`].
///
/// [`str`]: ../primitive.str.html
///
/// # Examples
///
/// You can create a `String` from a literal string with `String::from`:
///
/// ```
/// let hello = String::from("Hello, world!");
/// ```
///
/// You can append a [`char`] to a `String` with the [`push()`] method, and
/// append a [`&str`] with the [`push_str()`] method:
///
/// ```
/// let mut hello = String::from("Hello, ");
///
/// hello.push('w');
/// hello.push_str("orld!");
/// ```
///
/// [`push()`]: #method.push
/// [`push_str()`]: #method.push_str
///
/// If you have a vector of UTF-8 bytes, you can create a `String` from it with
/// the [`from_utf8()`] method:
///
/// ```
/// // some bytes, in a vector
/// let sparkle_heart = vec![240, 159, 146, 150];
///
/// // We know these bytes are valid, so we'll use `unwrap()`.
/// let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();
///
/// assert_eq!("💖", sparkle_heart);
/// ```
///
/// [`from_utf8()`]: #method.from_utf8
///
/// # UTF-8
///
/// `String`s are always valid UTF-8. This has a few implications, the first of
/// which is that if you need a non-UTF-8 string, consider [`OsString`]. It is
/// similar, but without the UTF-8 constraint. The second implication is that
/// you cannot index into a `String`:
///
/// ```ignore
/// let s = "hello";
///
/// println!("The first letter of s is {}", s[0]); // ERROR!!!
/// ```
///
/// [`OsString`]: ../ffi/struct.OsString.html
///
/// Indexing is intended to be a constant-time operation, but UTF-8 encoding
/// does not allow us to do this. Furtheremore, it's not clear what sort of
/// thing the index should return: a byte, a codepoint, or a grapheme cluster.
/// The [`as_bytes()`] and [`chars()`] methods return iterators over the first
/// two, respectively.
///
/// [`as_bytes()`]: #method.as_bytes
/// [`chars()`]: #method.chars
///
/// # Deref
///
/// `String`s implement [`Deref`]`<Target=str>`, and so inherit all of [`str`]'s
/// methods. In addition, this means that you can pass a `String` to any
/// function which takes a [`&str`] by using an ampersand (`&`):
///
/// ```
/// fn takes_str(s: &str) { }
///
/// let s = String::from("Hello");
///
/// takes_str(&s);
/// ```
///
/// [`&str`]: ../primitive.str.html
/// [`Deref`]: ../ops/trait.Deref.html
///
/// This will create a [`&str`] from the `String` and pass it in. This
/// conversion is very inexpensive, and so generally, functions will accept
/// [`&str`]s as arguments unless they need a `String` for some specific reason.
///
///
/// # Representation
///
/// A `String` is made up of three components: a pointer to some bytes, a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A String is made up of Vec<u8>. This section ignores way too many abstraction levels :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That it's a literal Vec is an implementation detail, absolutely.

/// length, and a capacity. The pointer points to an internal buffer `String`
/// uses to store its data. The length is the number of bytes currently stored
/// in the buffer, and the capacity is the size of the buffer in bytes. As such,
/// the length will always be less than or equal to the capacity.
///
/// This buffer is always stored on the heap.
///
/// You can look at these with the [`as_ptr()`], [`len()`], and [`capacity()`]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is… pretty far away from the “these”. It should either be repeated what “these” are or paragraphs should be reorganised.

/// methods:
///
/// ```
/// use std::mem;
///
/// let story = String::from("Once upon a time...");
///
/// let ptr = story.as_ptr();
/// let len = story.len();
/// let capacity = story.capacity();
///
/// // story has thirteen bytes
/// assert_eq!(19, len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“thirteen” and 19 are pretty different.

///
/// // Now that we have our parts, we throw the story away.
/// mem::forget(story);
///
/// // We can re-build a String out of ptr, len, and capacity. This is all
/// // unsafe becuase we are responsible for making sure the components are
/// // valid:
/// let s = unsafe { String::from_raw_parts(ptr as *mut _, len, capacity) } ;
///
/// assert_eq!(String::from("Once upon a time..."), s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new string is unnecessary here, impl<'a> PartialEq<&'a str> for String exists.

/// ```
///
/// [`as_ptr()`]: #method.as_ptr
/// [`len()`]: # method.len
/// [`capacity()`]: # method.capacity
///
/// If a `String` has enough capacity, adding elements to it will not
/// re-allocate. For example, consider this program:
///
/// ```
/// let mut s = String::new();
///
/// println!("{}", s.capacity());
///
/// for _ in 0..5 {
/// s.push_str("hello");
/// println!("{}", s.capacity());
/// }
/// ```
///
/// This will output the following:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhh, I’m not sure this is something that can be shown nor that “will” is the correct word here. Especially if something like this lands. Also, play.rust-lang.org already produces different results to me on stable and beta.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was a little concerned about this, but I wasn't sure how else to show off this behavior. Ideas?

///
/// ```text
/// 0
/// 5
/// 10
/// 20
/// 20
/// 40
/// ```
///
/// At first, we have no memory allocated at all, but as we append to the
/// string, it increases its capacity appropriately. If we instead use the
/// [`with_capacity()`] method to allocate the correct capacity initially:
///
/// ```
/// let mut s = String::with_capacity(25);
///
/// println!("{}", s.capacity());
///
/// for _ in 0..5 {
/// s.push_str("hello");
/// println!("{}", s.capacity());
/// }
/// ```
///
/// [`with_capacity()`]: #method.with_capacity
///
/// We end up with a different output:
///
/// ```text
/// 25
/// 25
/// 25
/// 25
/// 25
/// 25
/// ```
///
/// Here, there's no need to allocate more memory inside the loop.
#[derive(PartialOrd, Eq, Ord)]
#[stable(feature = "rust1", since = "1.0.0")]
pub struct String {
Expand Down Expand Up @@ -139,7 +329,7 @@ impl String {
/// // some bytes, in a vector
/// let sparkle_heart = vec![240, 159, 146, 150];
///
/// // We know these bytes are valid, so just use `unwrap()`.
/// // We know these bytes are valid, so we'll use `unwrap()`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two changes were made to bring this example in line with the module-level version, cc #28835

/// let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();
///
/// assert_eq!("💖", sparkle_heart);
Expand Down Expand Up @@ -201,7 +391,7 @@ impl String {
/// // some bytes, in a vector
/// let sparkle_heart = vec![240, 159, 146, 150];
///
/// // We know these bytes are valid, so just use `unwrap()`.
/// // We know these bytes are valid, so we'll use `unwrap()`.
/// let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();
///
/// assert_eq!("💖", sparkle_heart);
Expand Down