Skip to content

Fix bug in TextLoader #3011

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 19 commits into from
Closed

Conversation

yaeldekel
Copy link

Fixes #2996.

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@codecov
Copy link

codecov bot commented Mar 19, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@0831865). Click here to learn what that means.
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master    #3011   +/-   ##
=========================================
  Coverage          ?   72.35%           
=========================================
  Files             ?      803           
  Lines             ?   143297           
  Branches          ?    16154           
=========================================
  Hits              ?   103680           
  Misses            ?    35192           
  Partials          ?     4425
Flag Coverage Δ
#Debug 72.35% <100%> (?)
#production 68.06% <ø> (?)
#test 88.52% <100%> (?)
Impacted Files Coverage Δ
.../Microsoft.ML.Data/DataLoadSave/Text/TextLoader.cs 81.89% <ø> (ø)
test/Microsoft.ML.Tests/TextLoaderTests.cs 98.92% <100%> (ø)

singlis and others added 4 commits March 19, 2019 12:19
* Breaking down the LightGBM options class into separate option classes for
each LightGBM trainer.
* Refactored the Boost Option classes
* Hides the interface for the Booster Parameter Factory
(IBoosterParameterFactory).

Fixes dotnet#2559 
Fixes dotnet#2618
* Added samples for tree regression trainers.

* PR comments

* Added tab.
@@ -1291,11 +1291,6 @@ private sealed class LoaderHolder
if (loader == null || string.IsNullOrWhiteSpace(loader.Name))
goto LDone;

// Make sure the loader binds to us.
var info = host.ComponentCatalog.GetLoadableClassInfo<SignatureDataLoader>(loader.Name);
if (info.Type != typeof(ILegacyDataLoader) || info.ArgType != typeof(Options))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

info [](start = 20, length = 4)

Could we explain this change? As far as I can tell, we're considering not finding this loader in the component factory to be no longer an error condition. This may be right, but I'd feel a bit more comfortable with this change if we can explain why?

I did read the attached issue, but it was a little vague on precisely why removing this check is a right and desirable thing to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this check was there in order to make sure that the schema defined in the file has the correct syntax for creating a legacy TextLoader. The schema defined in the file should always be in the format TextLoader{col=... }, since it is generated by the TextSaver. Instead of the check that I deleted, should we check that loader.Name==TextLoader.LoaderSignature?


In reply to: 267105243 [](ancestors = 267105243)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is a text file, might be manually altered after saving it with the TextSaver?


In reply to: 267424076 [](ancestors = 267424076,267105243)

sfilipi and others added 13 commits March 19, 2019 14:37
* Refactoring the Old ModelStatistics to a class just for the modelStats, like MLR needs all the time, and lr needs when the stats are not calculated.
* Separting the Bias from the weights coefficients APIs.
…experimental nuget. (dotnet#2797)

* Clean up of cancellation mechanism.

* changes.

* Make cancellation mechanism internal and accessible via experiemtnal nuget.

* merge conflicts.

* PR feedback and fix analyzer test.

* Add test for public API in experimental nuget.

* PR feedback.

* fix tests.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.
* Adding functional tests for ONNX scenarios
…otnet#2976)

* Step 1: create two multi-class linear models

Step 2: Make SDCA trainers typed

Finish version 0.1

Delete commented lines

* Add some doc strings

More document

* Handle static extensions

* Rename several maximum entropy models and trainers

* Fix EP test

Fix two tests and address a comment

Add missing piece

* Address comments

* Improve option of MCSDCA

* Address comments

* Update code sample

* Refactorize saving family

* Rename a class following binary SDCA trainer
* Internalize some members of SchemaDefinition, and add tests

* Code review comments

* Fix build after rebase

* Fix failing test

* Fix build after rebase

* Internalize Column ctor

* Fix build after rebase
* adding XML to a public AP that had no documentation.

* adding a traintest split sample. Small corrections to the images doc.xml.
* Update tests to run on mac ci leg

* Updated OnnxRuntime version to 0.3.0

* Update OnnxTheory Attribute
* Binary FastTree/Forest samples using T4 templates.

* Addressed comments.
@yaeldekel yaeldekel closed this Mar 21, 2019
@yaeldekel
Copy link
Author

Closing this since other commits got included in it somehow. Will open a new one soon.

@yaeldekel yaeldekel deleted the tryparseschema branch March 21, 2019 21:37
@yaeldekel yaeldekel mentioned this pull request Mar 21, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.