[AutoML v0.16.0] InferColumn doesn't work on tricky csv file #4460

Closed
@LittleLittleCloud

Description

For some CSV files that contain double quotes in their fields, the InferColumns API doesn't work properly. This is probably because, when guessing the delimiter, AutoML counts delimiter candidates that appear inside double quotes, which should be ignored. (Or, when splitting lines, it splits on \n inside double quotes.)
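To illustrate the suspected cause, here is a minimal sketch of quote-aware delimiter guessing. It is not the AutoML implementation: `DelimiterSketch`, `SplitRecords`, `CountOutsideQuotes`, and `GuessDelimiter` are hypothetical names made up for this example, and the "same non-zero count on every record" rule is an assumed heuristic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of ML.NET: illustrates quote-aware delimiter guessing.
public static class DelimiterSketch
{
    // Splits text into records, treating newlines inside double quotes as field content.
    public static List<string> SplitRecords(string text)
    {
        var records = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        foreach (char c in text)
        {
            // An escaped quote ("") toggles twice, so the state is unchanged afterwards.
            if (c == '"') inQuotes = !inQuotes;
            if (c == '\n' && !inQuotes)
            {
                records.Add(current.ToString().TrimEnd('\r'));
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        if (current.Length > 0) records.Add(current.ToString());
        return records;
    }

    // Counts occurrences of a candidate delimiter that lie outside double quotes.
    public static int CountOutsideQuotes(string record, char candidate)
    {
        bool inQuotes = false;
        int count = 0;
        foreach (char c in record)
        {
            if (c == '"') inQuotes = !inQuotes;
            else if (c == candidate && !inQuotes) count++;
        }
        return count;
    }

    // Assumed heuristic: pick the first candidate whose count is non-zero
    // and identical on every record.
    public static char? GuessDelimiter(string text, char[] candidates)
    {
        var records = SplitRecords(text);
        foreach (char candidate in candidates)
        {
            var counts = records.Select(r => CountOutsideQuotes(r, candidate)).Distinct().ToList();
            if (counts.Count == 1 && counts[0] > 0) return candidate;
        }
        return null;
    }
}
```

For example, given the text `"id,comment\n1,\"a;b;c\nsecond; line\"\n2,\"x\"\n"` and the candidates `;`, `,` and tab, `GuessDelimiter` returns `,`. The semicolons and the newline sit inside a quoted field, so they are ignored.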

Steps to reproduce:
Download this dataset, then run:

using Microsoft.ML;
using Microsoft.ML.AutoML;
// TrainDataPath is the local path to the downloaded CSV file.
MLContext mlContext = new MLContext();
var inputColumnInformation = new ColumnInformation();
inputColumnInformation.LabelColumnName = @"review_scores_rating";
var train = mlContext.Auto().InferColumns(TrainDataPath, inputColumnInformation);

Updated

The dataset above actually works with the latest AutoML/ModelBuilder. To reproduce the error, please use this dataset instead:

jigsaw.txt


Labels

P2: Priority of the issue for triage purpose: Needs to be fixed at some point.
