Add an example of random PCA using in-memory data structure #2780
8409d09 to 48c01d3
Fix build Polish example
Fix build
Codecov Report
@@ Coverage Diff @@
## master #2780 +/- ##
==========================================
+ Coverage 71.66% 71.67% +<.01%
==========================================
Files 809 809
Lines 142378 142416 +38
Branches 16119 16120 +1
==========================================
+ Hits 102031 102072 +41
+ Misses 35915 35912 -3
Partials 4432 4432
{
/// <summary>
/// Example with 3 feature values.
/// </summary>
Please don't use XML-style comments in samples; they are harder to read than simple comments. #Resolved
/// Class used to capture prediction of <see cref="DataPoint"/> in <see cref="Example"/>.
/// </summary>
// We disable this warning because the compiler doesn't realize the fields below are assigned somewhere.
#pragma warning disable 649
Can we disable this somewhere else, e.g. in the csproj? #Resolved
Yes, we can do it in the csproj:
<PropertyGroup>
<TargetFramework>netcoreapp2.1</TargetFramework>
<OutputType>Exe</OutputType>
+ <NoWarn>1701;1702;0649</NoWarn>
</PropertyGroup>
In reply to: 261047307
new DataPoint(){ Features= new float[3] {-100, 50, -100} }
};

// Convert native C# class to IDataView, a format consumable by ML.NET functions.
native C# class
the List #Closed
Super good to make it strongly typed. No ambiguity anymore.
In reply to: 261047430
// Apply the trained model on the training data.
var transformed = model.Transform(data);

// Read ML.NET predictions into C# class.
C# class
an IEnumerable #Resolved
// The i-th sample is predicted as an outlier.
Console.WriteLine("The {0}-th example with features [{1}] is an outlier with a score of being inlier {2}",
    i, featuresInText, result.Score);
}
Please add a comment below showing what the output of these WriteLine calls looks like. #Closed
/// <example>
/// <format type="text/markdown">
/// <]
RandomizedPcaSampleWithOptions
I don't think this exists. #Resolved
@@ -48,6 +50,98 @@ public void NoAnomalyTest()
Assert.Throws<ArgumentOutOfRangeException>(() => ML.AnomalyDetection.Evaluate(transformedData));
}

[Fact]
public static void RandomizedPcaInMemory()
RandomizedPcaInMemory
love it :) #WontFix
namespace Microsoft.ML.Samples.Dynamic.Trainers.AnomalyDetection
{
class RandomizedPcaSampleWithOptions
class
public static #Closed
/// <summary>
/// Class used to capture prediction of <see cref="DataPoint"/> in <see cref="ExecutePipelineWithGivenRandomizedPcaTrainer"/>.
/// </summary>
#pragma warning disable 649
I'm curious: what is this warning? #Resolved
Fields assigned at runtime are considered never assigned, which is wrong. #Resolved
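For readers who have not met this warning: 649 is compiler warning CS0649 ("Field is never assigned to, and will always have its default value"). Below is a minimal sketch of why it fires in a sample like this one. It assumes plain reflection as a stand-in for whatever ML.NET does internally to fill the prediction class; `Cs0649Demo`, `ResultWithField`, and `ResultWithProperty` are hypothetical names, not code from this PR.

```csharp
using System;
using System.Reflection;

public static class Cs0649Demo
{
    // Hypothetical prediction class: a public field inside a private nested
    // class that no C# statement ever assigns. The compiler is expected to
    // report CS0649 for this field.
    private class ResultWithField
    {
        public float Score;
    }

    // Same shape with an auto-property. CS0649 only applies to fields,
    // so no warning is expected here.
    private class ResultWithProperty
    {
        public float Score { get; set; }
    }

    // Assign the field at runtime through reflection, which the compiler
    // cannot see when it decides whether the field is ever assigned.
    public static float FillField(float value)
    {
        var result = new ResultWithField();
        FieldInfo field = typeof(ResultWithField).GetField(nameof(ResultWithField.Score));
        field.SetValue(result, value);
        return result.Score;
    }

    // Same runtime assignment, this time through the property setter.
    public static float FillProperty(float value)
    {
        var result = new ResultWithProperty();
        PropertyInfo property = typeof(ResultWithProperty).GetProperty(nameof(ResultWithProperty.Score));
        property.SetValue(result, value);
        return result.Score;
    }

    public static void Main()
    {
        Console.WriteLine(FillField(0.5f));
        Console.WriteLine(FillProperty(0.5f));
    }
}
```

Compiling this should produce CS0649 for `ResultWithField.Score` only, even though both members receive a value at runtime; that is the mismatch described in the reply above.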
new DataPoint(){ Features= new float[3] {-100, 50, -100} }
};

// Convert native C# class to IDataView, a format consumable by ML.NET functions.
native C# class
Nit: maybe just saying List would suffice? #Resolved
@@ -3,6 +3,7 @@
<PropertyGroup>
<TargetFramework>netcoreapp2.1</TargetFramework>
<OutputType>Exe</OutputType>
<NoWarn>1701;1702;0649</NoWarn>
1701;1702;0649
One concern I have is that users might not be aware of this, so they will be thrown off when they see the warning about an "uninitialized field".
An alternative is to make them properties, in which case we do not need this NoWarn and we do not see the "uninitialized field" warning either.
@sfilipi what do you think?
Just switched to properties. No need to disable the warning anymore. Your suggestion reduces the redundant information users need to know!
In reply to: 261304076
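A sketch of how the property-based sample classes could look in context. It assumes the ML.NET 1.x API names (`LoadFromEnumerable`, `CreateEnumerable`, `RandomizedPca`) from the Microsoft.ML NuGet package; the preview API this PR targeted may have used different method names. The class name `RandomizedPcaInMemorySketch` is hypothetical, and the four data points are the ones visible in the output comments of this thread, not the PR's full sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class RandomizedPcaInMemorySketch
{
    // Input row. Features is a property, so no CS0649 suppression is needed.
    private class DataPoint
    {
        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Output row, filled by ML.NET when the IDataView is enumerated.
    private class Result
    {
        public bool PredictedLabel { get; set; }
        public float Score { get; set; }
    }

    // Trains on the in-memory list and returns the number of scored rows.
    public static int Run()
    {
        var mlContext = new MLContext(seed: 0);

        // In-memory training data (assumption: only the points quoted in this thread).
        var samples = new List<DataPoint>
        {
            new DataPoint { Features = new float[3] { 1, 2, 3 } },
            new DataPoint { Features = new float[3] { 0, 1, 0 } },
            new DataPoint { Features = new float[3] { 0, 2, 1 } },
            new DataPoint { Features = new float[3] { -100, 50, -100 } }
        };

        // Convert the List to an IDataView that ML.NET can consume.
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        // Train a randomized PCA anomaly detector on the Features column.
        var trainer = mlContext.AnomalyDetection.Trainers.RandomizedPca(
            featureColumnName: nameof(DataPoint.Features), rank: 1, ensureZeroMean: false);
        var model = trainer.Fit(data);

        // Apply the trained model to the training data and read the
        // predictions back into an IEnumerable of Result.
        IDataView transformed = model.Transform(data);
        List<Result> results = mlContext.Data
            .CreateEnumerable<Result>(transformed, reuseRowObject: false)
            .ToList();

        // How to interpret Score depends on the ML.NET version, so it is
        // printed here without interpretation.
        for (int i = 0; i < samples.Count; ++i)
        {
            string featuresInText = string.Join(", ", samples[i].Features);
            Console.WriteLine("Example {0} [{1}]: predicted outlier = {2}, score = {3}",
                i, featuresInText, results[i].PredictedLabel, results[i].Score);
        }

        return results.Count;
    }

    public static void Main() => Run();
}
```

Because both `DataPoint` and `Result` expose properties rather than fields, this compiles without `#pragma warning disable 649` or a `NoWarn` entry in the csproj.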
// The 2-th example with features [1, 2, 3] is an inlier with a score of being inlier 0.8450122
// The 3-th example with features [0, 1, 0] is an inlier with a score of being inlier 0.9428905
// The 4-th example with features [0, 2, 1] is an inlier with a score of being inlier 0.9999999
// The 5-th example with features [-100, 50, -100] is an outlier with a score of being inlier 0
Nit: typically this goes below the code that generates it.
Sure.
// The 2-th example with features [1, 2, 3] is an inlier with a score of being inlier 0.8450122
// The 3-th example with features [0, 1, 0] is an inlier with a score of being inlier 0.9428905
// The 4-th example with features [0, 2, 1] is an inlier with a score of being inlier 0.9999999
// The 5-th example with features [-100, 50, -100] is an outlier with a score of being inlier 0
Same nit: put it below the code.
Fixed. Thanks.
As the title says. It also shows some benefits of using in-memory data, as described in #2726.