In v0.11 Transforms.Conversion.ConvertType() does not properly convert numeric values if they are "in quotes" #2824

Closed
CESARDELATORRE opened this issue Mar 3, 2019 · 2 comments
Assignees
Labels
P1 Priority of the issue for triage purpose: Needs to be fixed soon.

Comments

@CESARDELATORRE
Contributor

CESARDELATORRE commented Mar 3, 2019

Since v0.11, when a numeric value in a dataset file column is wrapped in quotes, ML.NET's mlContext.Transforms.Conversion.ConvertType() does not handle it properly. For instance, take a column with the following values:

  • "1"
  • "0"

ConvertType() in a pipeline could not convert those values to Boolean (every value, whether "0" or "1", became 0) or to Float (every value became NaN).

The following transformer sets every value to 0 when converting to Boolean:
mlContext.Transforms.Conversion.ConvertType(outputColumnName: "LabelBool", inputColumnName: "Label", outputKind: DataKind.Boolean)

The following transformer sets every value to NaN when converting to Float:
mlContext.Transforms.Conversion.ConvertType(outputColumnName: "LabelFloat", inputColumnName: "Label", outputKind: DataKind.Single)

Interestingly, up to ML.NET v0.10, ML.NET could load such values directly into a Boolean type correctly.

@CESARDELATORRE CESARDELATORRE added the bug Something isn't working label Mar 3, 2019
@Ivanidzo4ka
Contributor

We turned off quote support as the default option.
More details can be found in PR #2630.

@CESARDELATORRE CESARDELATORRE removed the bug Something isn't working label Mar 4, 2019
@Lynx1820 Lynx1820 added P1 Priority of the issue for triage purpose: Needs to be fixed soon. and removed P1 Priority of the issue for triage purpose: Needs to be fixed soon. labels Jan 10, 2020
@najeeb-kazmi najeeb-kazmi self-assigned this Jan 30, 2020
@najeeb-kazmi
Member

By design. Setting allowQuoting = true will give the expected result.
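
For readers landing here, a minimal sketch of what that can look like. It assumes the ML.NET 1.x API names (LoadFromTextFile, CreateEnumerable); in v0.11 the loader method may be named differently. The QuotedRow and ConvertedRow classes, the file name, and the sample data are hypothetical and only there to make the example self-contained.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input schema: a single text column named "Label".
public class QuotedRow
{
    [LoadColumn(0)]
    public string Label { get; set; }
}

// Hypothetical output shape used only to read the converted columns back.
public class ConvertedRow
{
    public bool LabelBool { get; set; }
    public float LabelFloat { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Write a small sample file whose numeric values are wrapped in quotes.
        var path = Path.Combine(Path.GetTempPath(), "quoted-labels.csv");
        File.WriteAllLines(path, new[] { "Label", "\"1\"", "\"0\"" });

        var mlContext = new MLContext();

        // allowQuoting: true asks the text loader to strip the surrounding
        // quotes, so ConvertType sees 1 and 0 rather than "1" and "0".
        var data = mlContext.Data.LoadFromTextFile<QuotedRow>(
            path, separatorChar: ',', hasHeader: true, allowQuoting: true);

        // The same two conversions as in the original report.
        var pipeline = mlContext.Transforms.Conversion.ConvertType(
                outputColumnName: "LabelBool", inputColumnName: "Label", outputKind: DataKind.Boolean)
            .Append(mlContext.Transforms.Conversion.ConvertType(
                outputColumnName: "LabelFloat", inputColumnName: "Label", outputKind: DataKind.Single));

        var transformed = pipeline.Fit(data).Transform(data);

        // Print the converted values for inspection.
        foreach (var row in mlContext.Data.CreateEnumerable<ConvertedRow>(transformed, reuseRowObject: false))
        {
            Console.WriteLine($"{row.LabelBool} {row.LabelFloat}");
        }
    }
}
```

Switching allowQuoting back to false in this sketch should reproduce the behavior described in the report, since the quote characters then stay part of the text that ConvertType tries to parse.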

@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022