-
Notifications
You must be signed in to change notification settings - Fork 6k
pipeline doc - #14414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pipeline doc - #14414
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick comments to get started
@mairaw I haven't started on the buffer doc yet. I've made some changes to pipeline. |
@BillWagner see Pipe basic usage Would you be able to create a project with the code in this PR (from this sample down). I'm having trouble getting it to work. I can then do snippet references and customer can have code they can download/test/modify. If you're too busy I could ask @KPixel to migrate the code. |
I'll own fixing the code samples, that's not a problem. |
@davidfowl really fantastic info I'm anxious to get published. If @pakrym No hurry as I'm doing this after my ASP.NET Core 3.0 migration work. |
docs/standard/io/Buffers.md
Outdated
} | ||
``` | ||
|
||
The above method searches each segment for a specific byte. If we need to keep track of each segment's `SequencePosition` then [`ReadOnlySequence.TryGet`](https://docs.microsoft.com/en-us/dotnet/api/system.buffers.readonlysequence-1.tryget?view=netcore-3.0) would be more appropriate. Let's change the above code to return a `SequencePosition` instead of an integer. This has the added benefit of allowing the caller to avoid a second scan to get the data at a specific index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a little while to work out what this meant:
"allowing the caller to avoid a second scan"
There are two kinds of scans going on here: the Span<byte>.IndexOf
calls, and then the scan through the segments in the sequence. My mental model was that the bulk of the scan-like work here is the IndexOf
because that's going to touch every byte until it finds what it's looking for. (Also, so far in my work with ReadOnlySequence<T>
, which I've been using in conjunction with pipelines, the majority have been single-segment, with multi-segment examples only appearing as you approach the end of a buffer. I've found my app gets better throughput with largish buffer sizes, ensuring that most of the time I'm in the single-segment case, so I naturally tend to think of that one as the hot path that's most perf-critical.)
So maybe qualify this as "...a second scan through the sequence's segments to get..."?
docs/standard/io/Pipelines.md
Outdated
|
||
* Is called to get memory from the underlying writer. | ||
* `PipeWriter.Advance(int)` is called to tell the `PipeWriter` how much data was written to the buffer. | ||
* `PipeWriter.FlushAsync` is called to make the data available to the `PipeReader`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a reader, a question I have at this point is: why are these separate operations?
In your example, the call to Advance
is followed immediately (as long as an exception didn't occur) by a call to FlushAsync
. This suggests that an alternative design was possible, in which these were rolled into a single method. But presumably there's a reason you didn't do that—there must be some scenario in which you might want to call Advance
more than once in between calls to FlushAsync
? But what is that scenario? As it stands, I don't understand this aspect of the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This suggests that an alternative design was possible, in which these were rolled into a single method. But presumably there's a reason you didn't do that—there must be some scenario in which you might want to call Advance more than once in between calls to FlushAsync? But what is that scenario? As it stands, I don't understand this aspect of the design.
Buffering. You can GetMemory()/GetSpan and Advance(..) over and over then when you have buffered up the appropriate message/amount you call FlushAsync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I don't think this is very discoverable right now, so I'm wondering if it would be helpful to add a short description of when this might be something you'd want to do. I think in the app where I'm using this I don't need to, but it's hard to be certain from the docs—my read loop reads fairly large blocks of data off disk and writes them into the pipe, and I flush every time. But with the socket example you've given I don't actually know how to judge whether the "flush on every iteration" approach the code you've shown would be right in any particular scenario. If I happen to know that I'm receiving from a source that sends data in small dribs and drabs (not a hypothetical question for me, by the way) should I be waiting until I've "buffered up the appropriate message/amount"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not useful when bytes are being transferred from one source to the other. In that scenario the writer has no idea what the bytes are and can't make sense of how much should be buffered before flushing.
There are cases like serialization that take in an IBufferWriter<byte>
or when you're directly writing via the PipeWriter where you can choose to write an "entire message" not pieces of a message.
Here's an example:
A simple length prefixed message, you could imagine writing code like this:
async Task WriteMessage(PipeWriter writer, byte[] payload)
{
writer.Write(GetLength(payload));
writer.Advance(4); // Write 4 bytes
writer.Write(payload);
writer.Advance(payload.Length);
await writer.FlushAsync();
}
docs/standard/io/Buffers.md
Outdated
|
||
The preceding method requests a buffer of at least 5 bytes from the `IBufferWriter<byte>` using `GetSpan(5)` then writes bytes for the ASCII string "Hello" to the returned `Span<byte>`. It then calls `Advance(written)` to indicate how many bytes were written to the buffer. | ||
|
||
This method of writing uses the `Memory<T>`/`Span<T>` buffer provided by the `IBufferWriter<T>`, but you can also use the [`Write`](https://docs.microsoft.com/en-us/dotnet/api/system.buffers.buffersextensions.write?view=netstandard-2.1) extension method to copy an existing buffer to the `IBufferWriter<T>`. `Write` does the work of calling `GetSpan`/`Advance` as appropriate, so there's no need to call `Advance` after writing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have links to the API doc for GetSpan/GetMemory/Advance. Ex: https://docs.microsoft.com/en-us/dotnet/api/system.buffers.ibufferwriter-1.getmemory?view=netstandard-2.1#System_Buffers_IBufferWriter_1_GetMemory_System_Int32_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PLEASE HOLD COMMENTS UNTIL IT'S READY FOR REVIEW
😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I stopped as soon as I saw that. Oops!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NP. This is taking a back seat to the ASP.NET Core 3.0 release so I'll be slow doing updates. At some point I'll need one of you to get the bulk of the pipe code into a working project on a Git repo. That way I can migrate the code snippets to this doc and not have embedded snippets.
…into ra/buffers/pipelines
Is there any mention of completing Pipes with an exception? Given the overall depth this doc, that seems like a pretty big omission to me. |
I’ll add it in the future. I want to get this in since it’s the first big conceptual doc. But you’re right. That is missing. All of the changes that don’t have suggested edits will happen later. |
Co-Authored-By: Stephen Halter <[email protected]>
|
||
* The entire message (end of line) might not be received in a single call to `ReadAsync`. | ||
* It's ignoring the result of `stream.ReadAsync`. `stream.ReadAsync` returns how much data was read. | ||
* It doesn't handle the case where multiple lines are read in a single `ReadAsync` call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also consider mentioning that the method is always allocating a byte array each time we read data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also consider mentioning that the method is always allocating a byte array each time we read data.
Added
- It allocates a
byte
array with each read.
* Returns an incomplete `ValueTask<FlushResult>` when the amount of data in the `Pipe` crosses `PauseWriterThreshold`. | ||
* Completes `ValueTask<FlushResult>` when it becomes lower than `ResumeWriterThreshold`. | ||
|
||
Two values are used to prevent rapid cycling, which can occur if one value is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd give an example of the rapid cycling if the same value is used for the threshold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotalik please suggest verbiage.
|
||
## Streams | ||
|
||
When reading or writing stream data, you typically read data using a de-serializer and write data using a serializer. Most of these read and write stream APIs have a `Stream` parameter. To make it easier to integrate with these existing APIs, `PipeReader` and `PipeWriter` expose an <xref:System.IO.Pipelines.PipeReader.AsStream%2A>. <xref:System.IO.Pipelines.PipeWriter.AsStream%2A> returns a `Stream` implementation around the `PipeReader` or `PipeWriter`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also mention that there are PipeReader.Create and PipeWriter.Create methods to convert a stream into a PipeReader/PipeWriter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is on me. I need to expand on the Stream interop with examples. I'll do this in a different PR
halter73 suggestions Co-Authored-By: Stephen Halter <[email protected]>
@davidfowl should I format the code/comments to minimize the horizontal scroll bars on a tablet? You can simulate a tablet on bigger displays by making the widest window without a left side and right side TOC. |
Yes we can wrap the comments and should align the code by the parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Feel free to squash and merge this when you're ready @Rick-Anderson. I imagine you can do the comment wrapping separately since it's a nice improvement but not blocking right? |
Where's the real URL? |
@mairaw @BillWagner when do you merge master into live? Let me know so I can tweat this before @davidfowl does. |
Usually by the end of my day shift 😆 |
Internal review URLs
Todo
Fixes #7516