Added tests for text featurizer options (Part2). #3036
var prediction = engine.Predict(data[0]);
Assert.Equal("this is some text in english", string.Join(" ", prediction.OutputTokens));
Assert.Equal(1.0f, prediction.Features[0]);
Assert.Equal(1.0f, prediction.Features[0])
Doesn't Assert have an option to compare arrays or enumerables? #Resolved
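For context on the question above: xUnit's `Assert.Equal` does have an overload that takes two `IEnumerable<T>` sequences, so the whole feature vector can be checked in one call, for example `Assert.Equal(expected, prediction.Features)`. The sketch below shows the same element-by-element comparison without a test-framework dependency, using LINQ's `SequenceEqual`. The class and method names are hypothetical and only serve the illustration.

```csharp
using System;
using System.Linq;

public static class SequenceComparisonSketch
{
    // Hypothetical helper: true when both vectors have the same length
    // and the same elements in the same order.
    public static bool SameFeatures(float[] expected, float[] actual)
    {
        return expected.SequenceEqual(actual);
    }
}
```

Exact float equality is acceptable here only because the expected values are exactly 0 or 1; a tolerance-based comparison would be the safer choice for normalized features.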
prediction = engine.Predict(data[1]);
Assert.Equal("xyz", string.Join(" ", prediction.OutputTokens));
Assert.Equal(1.0f, prediction.Features[0]);
Assert.Equal(1.0f, prediction.Features[0]);
So I expect the ngrams to be a, b, c, e, f, g, x, y, z, space, plus end of string and empty string (not sure, honestly), for a total of 12 ngrams.
Features 0 and 8 are the end-of-string and empty-string entries, right? #Resolved
Index 0 is the start marker and index 8 is the end marker. The remaining slots are the 10 distinct characters, including the space.
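The slot layout described in this thread can be reproduced with a small stand-alone sketch. It is not ML.NET code: it assumes slots are assigned in order of first appearance over the training rows "abc efg" and "xyz", that each text is wrapped in a start and an end marker (the `\u0002` and `\u0003` characters here are an assumption), and that features are raw, unnormalized counts. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CharUnigramSketch
{
    // Assumed marker characters; chosen for illustration only.
    private const char Start = '\u0002';
    private const char End = '\u0003';

    // Assigns one slot per distinct character, in order of first appearance.
    public static List<char> BuildVocabulary(IEnumerable<string> corpus)
    {
        var vocab = new List<char>();
        foreach (var text in corpus)
        {
            foreach (var ch in Start + text + End)
            {
                if (!vocab.Contains(ch))
                    vocab.Add(ch);
            }
        }
        return vocab;
    }

    // Counts how often each vocabulary character occurs in the marked text.
    public static float[] Featurize(string text, List<char> vocab)
    {
        var features = new float[vocab.Count];
        foreach (var ch in Start + text + End)
        {
            int slot = vocab.IndexOf(ch);
            if (slot >= 0)
                features[slot] += 1.0f;
        }
        return features;
    }
}
```

Under these assumptions the vocabulary has 12 slots: the start marker at index 0, then a, b, c, space, e, f, g, the end marker at index 8, and x, y, z at indices 9 to 11. Featurizing "abc efg" sets slots 0 through 8, and featurizing "xyz" sets slots 0, 8, and 9 through 11, which is consistent with `Features[0]` being 1.0f for both rows.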
Codecov Report
@@ Coverage Diff @@
## master #3036 +/- ##
==========================================
+ Coverage 72.5% 72.5% +<.01%
==========================================
Files 804 804
Lines 144077 144150 +73
Branches 16179 16179
==========================================
+ Hits 104462 104519 +57
- Misses 35198 35220 +22
+ Partials 4417 4411 -6
var engine = model.CreatePredictionEngine<TestClass, TestClass>(ML);

var prediction = engine.Predict(data[0]);
Assert.Equal("this is some text in english", string.Join(" ", prediction.OutputTokens));
"this is some text in english"
Nit: for maintainability, I'd create a variable for this. #Resolved
var prediction = engine.Predict(data[0]);
Assert.Equal("abc efg", string.Join(" ", prediction.OutputTokens));
var expected = new float[] { 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f, 0.0f, 0.0f };
1.0f
Should this be 0, or is this the end marker? #Resolved
var options = new TextFeaturizingEstimator.Options()
{
    CharFeatureExtractor = new WordBagEstimator.Options() { NgramLength = 1 },
CharFeatureExtractor = new WordBagEstimator.Options()
Is this correct? It doesn't read right to initialize a CharExtractor with the options of a WordBagEstimator... #Resolved
This PR finally fixes #2967. The tests created in this PR are for the following parameters in the options class
The intent here is to test that TextFeaturizer is instantiated for every parameter in the options class. We are not testing the internal components of TextFeaturizer here.