How do you load data from an in-memory data set? #91

Closed
isaacabraham opened this issue May 9, 2018 · 8 comments
Labels
enhancement New feature or request

Comments

@isaacabraham

The only sample I can see uses the TextLoader to import data from a CSV file. How do you load in arbitrary data from e.g. a record or array and specify labels etc. on there?

Thanks!

@markusweimer markusweimer added the enhancement New feature or request label May 9, 2018
@Ivanidzo4ka Ivanidzo4ka reopened this May 9, 2018
@Ivanidzo4ka
Contributor

Sorry, I fat-fingered the issue.
This is not part of the public API yet, but we are working on it.
Internally we have ArrayDataViewBuilder, and you can see an example of its usage here:

ArrayDataViewBuilder builder = new ArrayDataViewBuilder(Env);

@isaacabraham
Author

Yep, understood. Not expecting everything to be there, but this is definitely something you want to have - the ability to pass in arbitrary data.

@Ivanidzo4ka
Contributor

For whoever works on this in the future: we have an internal class, DataViewConstructionUtils, which does what is needed.

internal static class DataViewConstructionUtils

Maybe we need to add an ArrayDataView. We also need to decide whether to expose IHostEnvironment, and where this belongs in the new API.

@Ivanidzo4ka
Contributor

Also a duplicate of #10.

@isaacabraham
Author

@Ivanidzo4ka whichever way you go, make it as simple as possible to consume. This means either allowing F# Records / C# Classes, with either attributes or some guidance to specify labels / features etc., or (and this should definitely also be supported) support for naked arrays or sequences of either arrays (so a 2x2 array) or tuple data.
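
A minimal sketch of the "C# class with attributes" option described above. It assumes the `MLContext.Data.LoadFromEnumerable` API and the `ColumnName` / `VectorType` attributes from later ML.NET releases (1.0 onward), which did not exist when this issue was opened. The `HouseData` type and its values are hypothetical and only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type (not from this thread). Attributes map the
// properties to the "Label" and "Features" columns of the data view.
public class HouseData
{
    [ColumnName("Label")]
    public float Price { get; set; }

    // Fixed-size vector of two feature values per row.
    [VectorType(2)]
    [ColumnName("Features")]
    public float[] SizeAndRooms { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Arbitrary in-memory data; any IEnumerable<HouseData> would do.
        var rows = new List<HouseData>
        {
            new HouseData { Price = 1.2f, SizeAndRooms = new[] { 1.1f, 2f } },
            new HouseData { Price = 2.3f, SizeAndRooms = new[] { 1.9f, 3f } },
            new HouseData { Price = 3.1f, SizeAndRooms = new[] { 2.8f, 4f } },
        };

        // Assumption: ML.NET 1.0 or later, which exposes LoadFromEnumerable.
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Print the resulting schema to confirm the column names and types.
        foreach (var column in data.Schema)
        {
            Console.WriteLine($"{column.Name}: {column.Type}");
        }
    }
}
```

The resulting `IDataView` can be passed to a trainer or transform the same way as data loaded from a file.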

@danroot

danroot commented May 17, 2018

Looking for this as well. A couple of use cases I can see are:

  1. Loaders other than flat files: SqlLoader, etc.
  2. Scenarios where the schema is not known at compile time, i.e. a command-line tool that lets users pass in an arbitrary CSV file.

@TomFinley
Contributor

Hi @isaacabraham and @danroot, thanks for the feedback. @Ivanidzo4ka introduced an attempt to solve this issue in #106, though I see that PR did not link to this issue but rather to the pre-existing #10. Is it possible for you to try it and provide feedback on whether it does what you need?

@isaacabraham
Author

Will give it a bash in the coming days - thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Mar 31, 2022