Exception on 'IgnoreColumns' in input #1613

Closed
daholste opened this issue Nov 13, 2018 · 2 comments
Labels: AutoML.NET (Automating various steps of the machine learning process), bug (Something isn't working)

Comments

@daholste
Contributor

daholste commented Nov 13, 2018

System information

  • OS version/distro: Windows 10
  • .NET Version (eg., dotnet --info): .NET Core 2.1

Issue

  • What did you do?
    Ran MML command line: execgraph "C:\Benchmarking\automl_graph.json"

Contents of automl_graph.json:

{
  "Inputs": {
    "file_train": "D:\\SplitDatasets\\ExcitementFG2_train.csv",
    "file_test": "D:\\SplitDatasets\\ExcitementFG2_valid.csv"
  },
  "Nodes": [
    {
      "Inputs": {
        "CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
        "InputFile": "$file_train"
      },
      "Name": "Data.CustomTextLoader",
      "Outputs": {
        "Data": "$data_train"
      }
    },
    {
      "Inputs": {
        "CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
        "InputFile": "$file_test"
      },
      "Name": "Data.CustomTextLoader",
      "Outputs": {
        "Data": "$data_test"
      }
    },
    {
      "Inputs": {
        "BatchSize": 3,
        "StateArguments": {
          "Name": "AutoMlState",
          "Settings": {
            "Engine": {
              "Name": "Rocket",
              "Settings": {}
            },
            "Metric": "Accuracy",
            "TerminatorArgs": {
              "Name": "IterationLimited",
              "Settings": {
                "FinalHistoryLength": 100
              }
            },
            "TrainerKind": "SignatureBinaryClassifierTrainer"
          }
        },
        "TestingData": "$data_test",
        "TrainingData": "$data_train",
		"IgnoreColumns": ["cost"]
      },
      "Name": "Models.PipelineSweeper",
      "Outputs": {
        "Results": "$output_data",
        "State": "$xyz"
      }
    }
  ],
  "Outputs": {
    "output_data": "C:\\Benchmarking\\01-ResultsOut.csv"
  }
}
  • What happened?
    The 'IgnoreColumns' input in the graph file is not accepted: loading the graph throws an exception (more details in the logs section below)

  • What did you expect?
    A run without an exception

Source code / logs

--- Command line args ---
dotnet MML.dll execgraph C:\Benchmarking\automl_graph.json

--- Exception message ---

(1) Unexpected exception: Unexpected input: 'IgnoreColumns', 'System.InvalidOperationException'

Exception context:
    Throwing component: Environment

   at Microsoft.ML.Runtime.EntryPoints.EntryPointNode.CheckAndSetInputValue(KeyValuePair`2 pair) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\EntryPointNode.cs:line 686
   at Microsoft.ML.Runtime.EntryPoints.EntryPointNode..ctor(IHostEnvironment env, IChannel ch, RunContext context, String id, String entryPointName, JObject inputs, JObject outputs, Boolean checkpoint, String stageId, Single cost, String label, String group, String weight, String name) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\EntryPointNode.cs:line 505
   at Microsoft.ML.Runtime.EntryPoints.EntryPointNode.ValidateNodes(IHostEnvironment env, RunContext context, JArray nodes, String label, String group, String weight, String name) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\EntryPointNode.cs:line 934
   at Microsoft.ML.Runtime.EntryPoints.EntryPointGraph..ctor(IHostEnvironment env, JArray nodes) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\EntryPointNode.cs:line 1008
   at Microsoft.ML.Runtime.EntryPoints.JsonUtils.GraphRunner..ctor(IHostEnvironment env, JArray nodes) in C:\MLDotNet\src\Microsoft.ML.Legacy\Runtime\EntryPoints\JsonUtils\GraphRunner.cs:line 32
   at Microsoft.ML.Runtime.EntryPoints.JsonUtils.ExecuteGraphCommand.Run() in C:\MLDotNet\src\Microsoft.ML.Legacy\Runtime\EntryPoints\JsonUtils\ExecuteGraphCommand.cs:line 62
   at Microsoft.ML.Runtime.Tools.Maml.MainCore(ConsoleEnvironment env, String args, Boolean alwaysPrintStacktrace) in C:\MLDotNet\src\Microsoft.ML.Maml\MAML.cs:line 139
@najeeb-kazmi
Member

I don't think this argument is supported at the node level. You can ignore a column by simply not loading it in the first place. Btw, you can load the features into one vector with col=Features:R4:0-77,79-202. Please also see my comment about future support in #1614.
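
For illustration, the suggestion above could look like the following loader node. This is only a sketch: it reuses the node from the graph in the issue and swaps in the single-vector column spec from this comment. The second loader (for $file_test) would need the same schema, and the "IgnoreColumns" entry would be dropped from the Models.PipelineSweeper inputs. Whether the sweeper then runs cleanly on this dataset is an assumption and has not been verified here.

```json
{
  "Inputs": {
    "CustomSchema": "sep=, col=Label:R4:78 col=Features:R4:0-77,79-202 header=+",
    "InputFile": "$file_train"
  },
  "Name": "Data.CustomTextLoader",
  "Outputs": {
    "Data": "$data_train"
  }
}
```

Note that the schema in the issue never defines a column named "cost", so a column to be skipped would simply be left out of the col= ranges.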

@justinormont justinormont added bug Something isn't working AutoML.NET Automating various steps of the machine learning process labels Nov 16, 2018
@rogancarr
Contributor

Closing: Microsoft.ML.PipelineInference has been removed from the repository.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022