Added tests for text featurizer options (Part1). #3006

Merged: 4 commits, Mar 20, 2019

@zeahmed zeahmed commented Mar 19, 2019

This PR partially addresses #2967; see the list below. Further tests will be added in the next PR.

Tests were created for the following parameters in the options class:

  • StopWordsRemover
  • CaseMode
  • KeepDiacritics
  • KeepPunctuations
  • KeepNumbers
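
For readers unfamiliar with these options, here is a minimal sketch of how they might be set together. It assumes the ML.NET 1.x API surface (`OutputTokensColumnName` rather than the `OutputTokens` flag that appears in this PR's diff), and the `Row` class, the column names "Features" and "Tokens", and the sample sentence are hypothetical, not taken from the PR's tests.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Transforms.Text;

public static class TextFeaturizerOptionsSketch
{
    // Hypothetical input row; the PR's own TestClass is only partially visible in the diff.
    public class Row
    {
        public string A { get; set; }
    }

    // Sets the five options listed in the PR description, plus a tokens column
    // so the normalized text can be inspected (assumed 1.x property name).
    public static TextFeaturizingEstimator.Options BuildOptions() =>
        new TextFeaturizingEstimator.Options
        {
            StopWordsRemoverOptions = new StopWordsRemovingEstimator.Options(),
            CaseMode = TextNormalizingEstimator.CaseMode.Lower,
            KeepDiacritics = false,
            KeepPunctuations = false,
            KeepNumbers = false,
            OutputTokensColumnName = "Tokens",
        };

    public static void Main()
    {
        var ml = new MLContext(seed: 1);
        var data = ml.Data.LoadFromEnumerable(new[]
        {
            new Row { A = "This is some TEXT with 42 numbers, punctuation and a café." },
        });

        // Featurize column "A" into "Features" using the options above.
        var pipeline = ml.Transforms.Text.FeaturizeText("Features", BuildOptions(), "A");
        var transformed = pipeline.Fit(data).Transform(data);

        // Read back the tokens column to see the effect of the options.
        var tokensColumn = transformed.Schema["Tokens"];
        using (var cursor = transformed.GetRowCursor(new[] { tokensColumn }))
        {
            var getter = cursor.GetGetter<VBuffer<ReadOnlyMemory<char>>>(tokensColumn);
            var tokens = default(VBuffer<ReadOnlyMemory<char>>);
            while (cursor.MoveNext())
            {
                getter(ref tokens);
                Console.WriteLine(string.Join(" ", tokens.DenseValues().Select(t => t.ToString())));
            }
        }
    }
}
```

Printing the tokens column rather than the numeric feature vector makes the effect of each flag easier to check by eye, which is roughly what a per-option test would want to assert on.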

@zeahmed zeahmed requested review from Ivanidzo4ka and artidoro March 19, 2019 00:33
@codecov
codecov bot commented Mar 19, 2019

Codecov Report

Merging #3006 into master will increase coverage by 0.06%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3006      +/-   ##
==========================================
+ Coverage   72.39%   72.45%   +0.06%     
==========================================
  Files         803      803              
  Lines      143590   143837     +247     
  Branches    16164    16173       +9     
==========================================
+ Hits       103946   104222     +276     
+ Misses      35225    35199      -26     
+ Partials     4419     4416       -3

Flag          Coverage Δ
#Debug        72.45% <100%> (+0.06%) ⬆️
#production   68.11% <100%> (+0.02%) ⬆️
#test         88.65% <100%> (+0.09%) ⬆️

Impacted Files                                          Coverage Δ
...oft.ML.Transforms/Text/TextFeaturizingEstimator.cs   90.97% <100%> (+7.76%) ⬆️
...osoft.ML.Tests/Transformers/TextFeaturizerTests.cs   99.74% <100%> (+0.18%) ⬆️
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs   79.48% <0%> (-20.52%) ⬇️
...StandardTrainers/Standard/LinearModelParameters.cs   60.05% <0%> (-0.27%) ⬇️
...soft.ML.Functional.Tests/Datasets/FeatureColumn.cs
...tional.Tests/Datasets/FeatureContributionOutput.cs
test/Microsoft.ML.Functional.Tests/ONNX.cs              100% <0%> (ø)
...soft.ML.Functional.Tests/Datasets/CommonColumns.cs   100% <0%> (ø)
...ML.Transforms/Text/StopWordsRemovingTransformer.cs   86.1% <0%> (+0.47%) ⬆️
src/Microsoft.ML.Maml/MAML.cs                           26.21% <0%> (+1.45%) ⬆️
... and 2 more

@zeahmed zeahmed requested a review from sfilipi March 19, 2019 17:25
new TestClass() { A = "No stop words", OutputText=null } };
var dataView = ML.Data.LoadFromEnumerable(data);

var options = new TextFeaturizingEstimator.Options() { StopWordsRemoverOptions = new StopWordsRemovingEstimator.Options(), OutputTokens = true };

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

OutputTokens

You need to merge master into your code. #Resolved

private class TestClass
{
public string A;
[ColumnName("OutputText_TransformedText")]

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

OutputText_TransformedText

You can give it a name without the _ after merging with master. #Resolved

@Ivanidzo4ka Ivanidzo4ka left a comment

:shipit:

@artidoro artidoro left a comment

:shipit:

@zeahmed zeahmed merged commit e00d19d into dotnet:master Mar 20, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022