Still problem with "Input string was not in a correct format" #1493

Closed
Darlanio opened this issue Jun 5, 2021 · 9 comments

@Darlanio

Darlanio commented Jun 5, 2021

Referring to closed issue: #845
Trying to follow this tutorial with a slight modification (my own data with 5 columns).
With smaller datasets (<10000 rows), like A,B,Result where A and B are random numbers 0-99 and Result is the sum or product,
it is possible to train for a minute on CPU without errors, but at ten minutes or with larger datasets the error occurs.

System Information (please complete the following information):

  • Model Builder Version: Preview installed 2021-06-02 (New GUI, can't find version number). ML.NET 1.5.5
  • Visual Studio Version: 16.10.0, Community Edition

Describe the bug

  • On which step of the process did you run into an issue: Train
  • Clear description of the problem:
    I tried to train a machine learning model using the attached file on a Swedish Windows installation with the English version of Visual Studio.
    I set the decimal separator in the file to '.' because ',' was being interpreted as a column separator. Training runs for about 10 minutes before the error is shown.
    I tried again with integers and only three columns. That works for smaller datasets but not for larger ones (the CSV file is still less than 30 KB!), and longer training periods fail even with smaller datasets. I train on CPU and have plenty of memory available (64 GB machine, more than 32 GB free).

To Reproduce
Steps to reproduce the behavior:

  1. Create a new .NET Core 3.1 project.
  2. Right click on the project and add machine learning.
  3. Select Value prediction
  4. Select Local (For this, I train on CPU, no GFX involved)
  5. Select the file and set prediction column as prediction
  6. Train for 600 seconds.
  7. Error should appear.

Expected behavior
I would have expected the model to be added to the project without an error occurring.
The file contains rounded floats approximating p = a x b x c x d.
I tried with simpler datasets and when training can be limited to less than 60 seconds, it works.
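
For anyone trying to reproduce this, a dataset of the shape described above can be generated roughly as follows. This is only a sketch: the column names (a, b, c, d, p), the value range [0, 1), the two-decimal rounding, and the helper names make_rows and write_tsv are assumptions, not details taken from the attached file.

```python
import csv
import io
import random


def make_rows(n_rows: int, seed: int = 0, digits: int = 2):
    """Yield (a, b, c, d, p) with p = a*b*c*d, every value rounded to `digits` places."""
    rng = random.Random(seed)
    for _ in range(n_rows):
        # Assumption: inputs are uniform in [0, 1); the real file may use another range.
        a, b, c, d = (round(rng.random(), digits) for _ in range(4))
        yield a, b, c, d, round(a * b * c * d, digits)


def write_tsv(stream, rows, digits: int = 2) -> None:
    """Write tab-separated rows; the 'f' format always emits '.' as the decimal point."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(["a", "b", "c", "d", "p"])  # hypothetical column names
    for row in rows:
        writer.writerow([format(value, f".{digits}f") for value in row])


# Example: write three rows to an in-memory buffer and show them.
buffer = io.StringIO()
write_tsv(buffer, make_rows(3))
print(buffer.getvalue())
```

Using a tab as the column separator and '.' as the decimal point keeps the file unambiguous regardless of the regional settings of the machine that reads it.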

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Debug Log:
2021-06-03 20:27:09.9301 DEBUG Set log file path to C:\Users\darla\AppData\Local\Temp\MLVSTools\logs\1c5e2112-e010-4acb-82a1-373f09101864.txt (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-06-03 21:59:48.7752 DEBUG C:\Users\darla\source\repos\NETCoreCreateDataSet\NETCoreCreateDataSet\bin\Debug\netcoreapp3.1\dataset1.tsv (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-06-04 00:12:06.5689 DEBUG C:\Users\darla\source\repos\NETCoreCreateDataSet\NETCoreCreateDataSet\bin\Debug\netcoreapp3.1\dataset1.tsv (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-06-04 00:13:35.5941 DEBUG Disposing TrainSession (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-06-04 00:13:38.1135 WARN GPU Service not found. Falling back to CPU AutoML Service. (Microsoft.ML.ModelBuilder.Utils.Logger.Warn)
2021-06-04 00:13:40.0998 INFO | Trainer RSquared Absolute-loss Squared-loss RMS-loss Duration #Iteration | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:13:43.2422 INFO |1 SdcaRegression 0,8842 0,15 0,04 0,21 2,5 1 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:13:46.1628 INFO |2 LightGbmRegression 0,9964 0,03 0,00 0,04 2,9 2 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:13:51.6378 INFO |3 FastTreeRegression 0,9960 0,03 0,00 0,04 5,5 3 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:13:58.9658 INFO |4 FastTreeTweedieRegression 0,9955 0,03 0,00 0,04 7,3 4 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:05.2862 INFO |5 FastForestRegression 0,7915 0,21 0,08 0,28 6,3 5 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:07.3135 INFO |6 LbfgsPoissonRegression 0,9808 0,06 0,01 0,09 2,0 6 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:09.1019 INFO |7 OnlineGradientDescentRegression 0,8808 0,15 0,05 0,21 1,8 7 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:10.7940 INFO |8 OlsRegression 0,8843 0,15 0,04 0,21 1,7 8 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:13.5810 INFO |9 LightGbmRegression 0,9805 0,06 0,01 0,09 2,8 9 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:33.8414 INFO |10 FastTreeRegression 0,9993 0,01 0,00 0,02 20,3 10 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:36.6655 INFO |11 FastTreeTweedieRegression 0,2031 0,42 0,30 0,55 2,8 11 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:38.6425 INFO |12 LightGbmRegression 0,9422 0,11 0,02 0,15 2,0 12 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:14:55.8813 INFO |13 FastTreeRegression 0,9992 0,01 0,00 0,02 17,2 13 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:05.5139 INFO |14 FastTreeTweedieRegression 0,9390 0,10 0,02 0,15 9,6 14 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:09.0996 INFO |15 LightGbmRegression 0,9982 0,02 0,00 0,03 3,6 15 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:18.7497 INFO |16 FastTreeRegression 0,9947 0,03 0,00 0,04 9,6 16 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:21.2727 INFO |17 FastTreeTweedieRegression 0,5571 0,30 0,17 0,41 2,5 17 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:23.9141 INFO |18 LightGbmRegression 0,9831 0,06 0,01 0,08 2,6 18 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:26.1774 INFO |19 FastTreeRegression -2,1707 0,93 1,20 1,10 2,3 19 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:47.5387 INFO |20 FastTreeTweedieRegression 0,9981 0,02 0,00 0,03 21,4 20 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:49.6557 INFO |21 LightGbmRegression 0,8827 0,16 0,04 0,21 2,1 21 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:51.6150 INFO |22 FastTreeRegression -1,5322 0,79 0,96 0,98 2,0 22 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:56.8352 INFO |23 FastTreeTweedieRegression 0,6924 0,26 0,12 0,34 5,2 23 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:15:58.7807 INFO |24 LightGbmRegression 0,7291 0,24 0,10 0,32 1,9 24 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:01.1459 INFO |25 FastTreeRegression 0,9614 0,09 0,01 0,12 2,4 25 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:04.4861 INFO |26 FastTreeTweedieRegression 0,4639 0,33 0,20 0,45 3,3 26 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:07.2304 INFO |27 LightGbmRegression 0,9782 0,07 0,01 0,09 2,7 27 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:34.3087 INFO |28 FastTreeRegression 0,9801 0,06 0,01 0,09 27,1 28 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:37.6194 INFO |29 FastTreeTweedieRegression 0,0800 0,45 0,35 0,59 3,3 29 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:40.1970 INFO |30 LightGbmRegression 0,8696 0,15 0,05 0,22 2,6 30 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:16:53.2113 INFO |31 FastTreeRegression 0,9935 0,03 0,00 0,05 13,0 31 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:16.3573 INFO |32 FastTreeTweedieRegression 0,9758 0,07 0,01 0,10 23,1 32 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:20.2456 INFO |33 LightGbmRegression 0,8822 0,15 0,04 0,21 3,9 33 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:23.0729 INFO |34 FastTreeRegression -0,0172 0,51 0,39 0,62 2,8 34 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:26.1219 INFO |35 FastTreeTweedieRegression 0,0263 0,47 0,37 0,61 3,0 35 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:28.4487 INFO |36 LightGbmRegression 0,9927 0,04 0,00 0,05 2,3 36 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:30.4403 INFO |37 FastTreeRegression 0,8027 0,20 0,07 0,27 2,0 37 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:32.5977 INFO |38 FastTreeTweedieRegression 0,8442 0,17 0,06 0,24 2,2 38 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:35.4974 INFO |39 LightGbmRegression 0,9787 0,06 0,01 0,09 2,9 39 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:43.2962 INFO |40 FastTreeRegression 0,8814 0,15 0,04 0,21 7,8 40 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:54.1076 INFO |41 FastTreeTweedieRegression 0,9773 0,06 0,01 0,09 10,8 41 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:17:56.4898 INFO |42 LightGbmRegression 0,7673 0,23 0,09 0,30 2,4 42 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:10.2398 INFO |43 FastTreeRegression 0,9385 0,11 0,02 0,15 13,7 43 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:17.8742 INFO |44 FastTreeTweedieRegression 0,9962 0,03 0,00 0,04 7,6 44 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:22.5084 INFO |45 LightGbmRegression 0,9990 0,01 0,00 0,02 4,6 45 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:25.2342 INFO |46 FastTreeRegression 0,9739 0,07 0,01 0,10 2,7 46 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:46.9190 INFO |47 FastTreeTweedieRegression 0,7017 0,25 0,11 0,34 21,7 47 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:50.1409 INFO |48 LightGbmRegression 0,9974 0,02 0,00 0,03 3,2 48 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:18:58.3642 INFO |49 FastTreeRegression 0,9988 0,02 0,00 0,02 8,2 49 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:08.1574 INFO |50 FastTreeTweedieRegression 0,7427 0,22 0,10 0,31 9,8 50 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:12.2166 INFO |51 LightGbmRegression 0,9972 0,02 0,00 0,03 4,1 51 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:21.8491 INFO |52 FastTreeRegression 0,9663 0,07 0,01 0,11 9,6 52 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:26.1101 INFO |53 FastTreeTweedieRegression 0,4846 0,32 0,20 0,44 4,3 53 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:28.2084 INFO |54 LightGbmRegression 0,5135 0,33 0,18 0,43 2,1 54 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:19:59.3728 INFO |55 FastTreeRegression 0,9985 0,02 0,00 0,02 31,2 55 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:20:02.2304 INFO |56 FastTreeTweedieRegression 0,8848 0,14 0,04 0,21 2,9 56 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:20:04.6358 INFO |57 LightGbmRegression 0,9496 0,10 0,02 0,14 2,4 57 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:20:23.4426 INFO |58 FastTreeRegression 0,9537 0,09 0,02 0,13 18,8 58 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:20:26.8705 INFO |59 FastTreeTweedieRegression 0,2804 0,39 0,27 0,52 3,4 59 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:20:29.9091 INFO |60 LightGbmRegression 0,9978 0,02 0,00 0,03 3,0 60 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:07.6770 INFO |61 FastTreeRegression 0,9990 0,01 0,00 0,02 37,8 61 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:41.7417 INFO |62 FastTreeTweedieRegression 0,7642 0,21 0,09 0,30 34,1 62 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:43.8605 INFO |63 LightGbmRegression 0,8582 0,18 0,05 0,23 2,1 63 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:51.7671 INFO |64 FastTreeRegression 0,8806 0,15 0,05 0,21 7,9 64 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:54.6391 INFO |65 FastTreeTweedieRegression 0,0938 0,45 0,34 0,59 2,9 65 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:21:57.5520 INFO |66 LightGbmRegression 0,9889 0,05 0,00 0,06 2,9 66 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:22:22.9859 INFO |67 FastTreeRegression 0,9991 0,01 0,00 0,02 25,4 67 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:22:26.5649 INFO |68 FastTreeTweedieRegression 0,1032 0,44 0,34 0,58 3,6 68 | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-06-04 00:22:26.5864 DEBUG Input string was not in a correct format.
at System.Number.ParseSingle(String value, NumberStyles options, NumberFormatInfo numfmt)
at Microsoft.ML.AutoML.SweeperProbabilityUtils.ParameterSetAsFloatArray(IValueGenerator[] sweepParams, ParameterSet ps, Boolean expandCategoricals)
at Microsoft.ML.AutoML.SmacSweeper.FitModel(IEnumerable`1 previousRuns)
at Microsoft.ML.AutoML.SmacSweeper.ProposeSweeps(Int32 maxSweeps, IEnumerable`1 previousRuns)
at Microsoft.ML.AutoML.PipelineSuggester.SampleHyperparameters(MLContext context, SuggestedTrainer trainer, IEnumerable`1 history, Boolean isMaximizingMetric)
at Microsoft.ML.AutoML.PipelineSuggester.GetNextInferredPipeline(MLContext context, IEnumerable`1 history, DatasetColumnInfo[] columns, TaskKind task, Boolean isMaximizingMetric, CacheBeforeTrainer cacheBeforeTrainer, IEnumerable`1 trainerWhitelist)
at Microsoft.ML.AutoML.Experiment`2.Execute()
at Microsoft.ML.AutoML.ExperimentBase`2.Execute(ColumnInformation columnInfo, DatasetColumnInfo[] columns, IEstimator`1 preFeaturizer, IProgress`1 progressHandler, IRunner`1 runner)
at Microsoft.ML.AutoML.ExperimentBase`2.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<>c__DisplayClass21_0.b__5() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 81
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 108
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.d__30.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 147 (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-06-04 11:10:18.5415 DEBUG Open Log FileC:\Users\darla\AppData\Local\Temp\MLVSTools\logs\1c5e2112-e010-4acb-82a1-373f09101864.txt (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)

@LittleLittleCloud
Contributor

@beccamc
Looks like a bug in the SMAC sweeper in the old AutoML.NET, which should no longer exist in our main branch. Maybe we can launch another preview release to make the fix available?
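
For readers hitting the same trace: the exception is thrown from System.Number.ParseSingle, and the training log in the report prints decimals with commas (0,8842), so one plausible reading is that a number written with '.' was parsed under a culture that expects ','. The sketch below only models that mismatch in Python. format_float, parse_float, and the two separator constants are hypothetical helpers, not ML.NET or .NET APIs, and the actual sweeper code is not shown in this thread.

```python
def format_float(value: float, decimal_sep: str) -> str:
    """Format a float using the given decimal separator."""
    return repr(value).replace(".", decimal_sep)


def parse_float(text: str, decimal_sep: str) -> float:
    """Parse text that uses `decimal_sep`; reject any other separator character."""
    allowed = set("0123456789+-eE" + decimal_sep)
    if not text or any(ch not in allowed for ch in text):
        raise ValueError("Input string was not in a correct format.")
    return float(text.replace(decimal_sep, "."))


INVARIANT = "."  # stand-in for a culture-independent convention
SWEDISH = ","    # stand-in for a culture whose decimal separator is ','

# A value is serialized with one convention...
text = format_float(0.25, INVARIANT)

# ...and parsed with another: this is the failure mode being illustrated.
try:
    parse_float(text, SWEDISH)
except ValueError as err:
    print("mismatch:", err)

# Using the same convention on both sides round-trips cleanly.
print(parse_float(text, INVARIANT))
```

The general remedy for this class of bug is to pin one convention for machine-to-machine strings on both the writing and the reading side, rather than relying on the current regional settings.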

@beccamc
Contributor

beccamc commented Jun 7, 2021

Sorry you ran into this issue @Darlanio. The Model Builder version number is available in Extensions -> Manage Extensions -> Installed. If you aren't on 16.3.0.2056001 you can try to update here.

We are validating a new release this week, which as Xiaoyun mentioned above has removed this code. I'll update this issue when that is released (will also be available from the marketplace link above).

@beccamc
Contributor

beccamc commented Jun 10, 2021

@Darlanio we just released the new version. Can you update and try again? https://marketplace.visualstudio.com/items?itemName=MLNET.07

@Darlanio
Author

Thanks! Will do.

@Darlanio
Author

Darlanio commented Jun 10, 2021

After updating Model Builder I am using version 16.6.0.2130907.
I have only tested training a handful of networks so far, but have not had any trouble. I will let you know if I get the error again, but for now this seems to be a bug that is solved.

Many thanks for the quick replies and the update!

@beccamc
Contributor

beccamc commented Jun 10, 2021

I'm going to close this issue. If you see the bug again feel free to reopen, and please @ mention me. Thanks for reporting!

@beccamc beccamc closed this as completed Jun 10, 2021
@idenchik1

idenchik1 commented Jul 1, 2021

@beccamc same issue, ML.NET 16.6.1.213190
(screenshot attached)
This was with ~150k rows and 60 sec; I get the same result with 50k rows and 10 sec.

@beccamc
Contributor

beccamc commented Jul 1, 2021

@idenchik1 Can you share a sample row of your dataset?

@idenchik1

idenchik1 commented Jul 2, 2021

https://hastebin.com/raw/ayunozupos
I used this dataset https://www.kaggle.com/fizzbuzz/cleaned-toxic-comments?select=train_preprocessed.csv
or something like this
0e2d592688eed0f3c6afd87b1b39477df11dfded50610a5b4b277c2b2f7414ca;False
a074cd0285dc1c121e2ea2c0e70168b86f9b83293f92d0bd1c0e57a377e1d46a;False
8c7567237067316295fe18748acc53bb81a990f2193091235889191ca454b570;True
