-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Simple IDataView implementation sample. #3302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
|
||
/// <summary> | ||
/// This is an implementation of <see cref="IDataView"/> that wraps an <see cref="IEnumerable{T}"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my first attempt at this, I had written much of this material as separate XML docs for each member, but I found that this made it hard to tell a connected "story" about what is going on here, so I stuck to one big block on the IDataView
implementation and one block on the DataViewRowCursor
derived class. This made it easier to tell a cohesive story, while also having it I think look a lot less busy.
They are XML comments so I can use <see
tags, mostly so that someone inspecting this sample can sometimes F12 to see what I'm talking about...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few points:
-
The docs engine will paste this examples verbatim, so none of the xml docstrings will be parsed and the F12 for cref won't work as you hoped them to.
-
If there's a way to put these in the section for IDataView we can get all the xml features like cref. But you're writing a story that's attached to specific code blocks, so lumping it all in would probably break the story.
-
This is ideal to be written as mixed xml + code pages like below but those pages are editorials on a different repo and are manually updated.
https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/inspect-intermediate-data-ml-net
Having said all that, I think this works for V1 as is, with the only caveat that the xml features like cref won't work as you intended.
In reply to: 274705835 [](ancestors = 274705835)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good points @shmoradims. Thanks for keeping track of this. Fair enough. We can think about that later. TBH I find the documentation system perhaps more confusing than other parts of our infrastructure, so I'm grateful to have your guidance in this area.
I'm still not sure why the <see
tags wouldn't work if someone were to try to run the sample in VS -- certainly we reference types outside of our own library all the time, e.g., lots of things in the BCL -- but we can try as you say to work that out later.
@@ -5,6 +5,7 @@ | |||
<OutputType>Exe</OutputType> | |||
<SignAssembly>false</SignAssembly> | |||
<PublicSign>false</PublicSign> | |||
<RootNamespace>Samples</RootNamespace> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought I might opportunistically do this, since our new policy per #3205 is that samples not have Microsoft.ML
in their namespace, but by default we were still doing the "wrong" thing when we create a new file. I think it might be nice if we avoided the problem by default.
Not sure offhand who the typical sample reviewers are... @shmoradims and @sfilipi are obvious, but beyond that not entirely certain. Maybe @zeahmed ? |
docs/samples/Microsoft.ML.Samples/Dynamic/SimpleDataViewImplementation.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/SimpleDataViewImplementation.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/SimpleDataViewImplementation.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/SimpleDataViewImplementation.cs
Show resolved
Hide resolved
Codecov Report
@@ Coverage Diff @@
## master #3302 +/- ##
==========================================
+ Coverage 72.63% 72.63% +<.01%
==========================================
Files 807 807
Lines 145127 145127
Branches 16219 16219
==========================================
+ Hits 105413 105417 +4
+ Misses 35296 35293 -3
+ Partials 4418 4417 -1
|
Codecov Report
@@ Coverage Diff @@
## master #3302 +/- ##
==========================================
+ Coverage 72.63% 72.65% +0.01%
==========================================
Files 807 807
Lines 145127 145190 +63
Branches 16219 16223 +4
==========================================
+ Hits 105413 105485 +72
+ Misses 35296 35290 -6
+ Partials 4418 4415 -3
|
Looks great! Is this something that we want to link in our API docs? Or should it be somewhere else in the ML.NET documentation website? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure @artidoro, I can try to do that... I don't actually know how, so I just did some copy-pasta from one of @rogancarr's PRs and adapted it, let me know if you get a chance whether my |
/// to create pre-baked implementations, it is also useful to know how to create one completely from scratch. We also | ||
/// take this opportunity to illustrate and motivate the basic principles of how the IDataView system is architected, | ||
/// since people interested in implementing <see cref="IDataView"/> need at least some knowledge of those principles. | ||
/// </summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're using simple, non-xml comments since the sample displays as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why? Maybe we've had wildly difference experiences, but I've generally observed that people don't read samples as an intellectual exercise from beginning to end, they might read a little, but immediately fall to running them and see what the heck is going on. At least, such has been my observation. It's definitely true for me.
But, you're the expert, but I don't know that I like this idea of making the documentation non-browsable if someone actually tries to run the sample. Perhaps that's what you're doing, but I might prefer to have some understanding of why it's actually a good idea.
// while `tokensValue` is logically presented as a three element array, internally you will | ||
// see that the arrays internal to that structure have (at least) four items, specifically: | ||
// `Masterfully`, `done`, `hero!`, `listen.`. In this way we see a simple example of the details | ||
// of how buffer sharing from one iteration to the next actually works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next actually works [](start = 66, length = 20)
do you want to add a line about the Length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean describe that Masterfully
, done
, hero!
has three items, but Stay
, awhile
, and
, listen
has four items? I know I usually err on the side of overdescription so this may seem a bit out of character for me to say but, I kind of feel like this particular point is so completely obvious that including it explicitly in the narrative I've structured serves more to de-focus that narrative than clarify it. But maybe you mean something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixes #3301.