-
Notifications
You must be signed in to change notification settings - Fork 41
ML.NET Notes #74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ML.NET Notes #74
Conversation
Hi @terrajobst , thanks for posting this. In order to track the correspondence of work to these notes, I will mark as comments on the specific lines corresponding issues (or comments on issues, where appropriate), where they are posted. If you would prefer in future I provide these annotations in a different way, please let me know, thanks! |
|
||
## Notes | ||
|
||
* `Column` can be a struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added as a comment here, to address this and some other commetns.
way, but it's hard to make the much simpler as they would have to be | ||
generic or the user would have to handle the type | ||
* `IDataView` | ||
- `ISchematized` is being removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific line entered as 1502.
generic or the user would have to handle the type | ||
* `IDataView` | ||
- `ISchematized` is being removed | ||
- Should be an abstract class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line item entered as 1528.
implementers can still return `null`. | ||
- Either make it a property `long? Count` or make it | ||
`bool TryGetCount(out long count)`. | ||
- `GetRowCursor()` instead of taking `Func<int, bool>` make it `Predicate<Column>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line item entered as 1529.
* `IDataView` | ||
- `ISchematized` is being removed | ||
- Should be an abstract class | ||
- `GetRowCount()`. `lazy = true` means get if you know it, otherwise don't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line item entered as 1531.
which makes it easier to reason about. | ||
- We should look into whether we should have an async version | ||
- We want to get rid of the the `IRowCursorConsolidator` base | ||
* `ICounted` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the things like this related to IRowCursor
I have bundled into a single issue 1532 since I felt like this refactoring is something worth considering collectively, rather than individual items, since many of them are related. This actually includes all the rest of the file up to the ColumnType
item.
... Except for the UInt128
issue now that I see it.
* `IRowToRowMapper` | ||
- `Action disposer` should probably be replaced by making the right type | ||
(`IRow`?) implement `IDisposable`. | ||
- `Func<int, bool>` is hard to get right. Should this be `Predicate<Column>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested this as part of 1529, which I'd mentioned previously.
as well? | ||
* Can we remove `DataKind` entirely? If not, we should probably rename some of the | ||
members, remove the overlap, and add an entry for `0`, like `Unknown` or `None`. | ||
* `ColumnType` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Entered as 1533, with some expansion, please take a look and tell me what you think. This includes a solution to the DataKind
problem that I think will work pretty well and at fairly little trouble to anyone...
These notes are mostly published to help us (to remeber what happened in the meeting) and the community (to see what's going on). Commenting here is a fine. We could also file work items and link them from the notes, notes but that might be overkill. So whatever helps you stay organized is The Correct Process ™️ 😄 |
- `GetRowCursor()` instead of taking `Func<int, bool>` make it `Predicate<Column>` | ||
which makes it easier to reason about. | ||
- We should look into whether we should have an async version | ||
- We want to get rid of the the `IRowCursorConsolidator` base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost forgot about this one. Entered as 1867.
* `IRowCursor` extending `IRow` seems odd. Can these be collapsed? If not, can | ||
they have an inheritance relationship when we're making them classes? | ||
* `IRowToRowMapper` | ||
- `Action disposer` should probably be replaced by making the right type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Questions about disposability of IRow
(or now Row
) were fixed in 1867 so this strangeness around disposing methods has been resolved, thanks!
No description provided.