-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Lockdown Microsoft.ML.TimeSeries public surface #2344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8488c94
to
02978f0
Compare
@@ -15,7 +15,7 @@ namespace Microsoft.ML.TimeSeriesProcessing | |||
internal static class TimeSeriesProcessingEntryPoints | |||
{ | |||
[TlcModule.EntryPoint(Desc = ExponentialAverageTransform.Summary, UserName = ExponentialAverageTransform.UserName, ShortName = ExponentialAverageTransform.ShortName)] | |||
public static CommonOutputs.TransformOutput ExponentialAverage(IHostEnvironment env, ExponentialAverageTransform.Arguments input) | |||
internal static CommonOutputs.TransformOutput ExponentialAverage(IHostEnvironment env, ExponentialAverageTransform.Arguments input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal [](start = 8, length = 8)
This appears to already be part of an internal class, so the changes in this particular file were not helpful. Probably revert. #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure but these functions were showing up in API list but I could be wrong as well ... will do in the next iteration.
In reply to: 252812773 [](ancestors = 252812773)
At least in the current version, I do not see that we have made an effort to internalize and hide |
@TomFinley SequentialAnomalyDetectionTransformBase is the base class for IID and SSA spike and anomaly detectors. Can you please clarify on what needs to be internalized there? I believe the only thing we need to internalize is State object but if there is anything else please let me know. #Resolved |
Well, the only plausible way to internalize the state is to wrap the transformer implementation, since you will doubtless need these generic parameters still. This suggests internalizing the generic base class. In reply to: 459483255 [](ancestors = 459483255) |
Codecov Report
@@ Coverage Diff @@
## master #2344 +/- ##
==========================================
+ Coverage 71.24% 71.24% +<.01%
==========================================
Files 788 789 +1
Lines 141131 141200 +69
Branches 16115 16116 +1
==========================================
+ Hits 100545 100598 +53
- Misses 36121 36136 +15
- Partials 4465 4466 +1
|
Since the goal of this is to refine the public surface, it would be helpful to understand what the public surface now is. Could you post a listing of the public surface? Thanks. #Resolved |
3c9e562
to
8e4f596
Compare
@TomFinley Below is the new API list: N:Microsoft.ML.TimeSeriesT:Microsoft.ML.TimeSeries.PredictionFunctionExtensions #Resolved |
Can you add:
Refers to: src/Microsoft.ML.TimeSeries/IidChangePointDetector.cs:242 in b81ef6c. [](commit_id = b81ef6c, deletion_comment = False) |
Can you make these constructors internal, and allow to create this estimator only through MLContext? #Resolved Refers to: src/Microsoft.ML.TimeSeries/IidChangePointDetector.cs:221 in b81ef6c. [](commit_id = b81ef6c, deletion_comment = False) |
Same comment here. You can actually make the constructors internal and only allow them to be created from MLContext. #Resolved Refers to: src/Microsoft.ML.TimeSeries/IidSpikeDetector.cs:199 in b81ef6c. [](commit_id = b81ef6c, deletion_comment = False) |
b81ef6c
to
23fefa4
Compare
@@ -146,7 +146,7 @@ public abstract class PredictionEngineBase<TSrc, TDst> : IDisposable | |||
disposer = inputRow.Dispose; | |||
} | |||
|
|||
protected virtual Func<Schema, IRowToRowMapper> TransformerChecker(IExceptionContext ectx, ITransformer transformer) | |||
internal virtual Func<Schema, IRowToRowMapper> TransformerChecker(IExceptionContext ectx, ITransformer transformer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private protected
? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, what is the added benefit for pp? #Resolved
@@ -60,11 +58,11 @@ private sealed class Reconciler : EstimatorReconciler | |||
{ | |||
Contracts.Assert(toOutput.Length == 1); | |||
var outCol = (OutColumn)toOutput[0]; | |||
return new IidChangePointEstimator(env, | |||
return new MLContext().Transforms.IidChangePointEstimator( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should reuse the env
passed in. This new MLContext()
is also a kind of environment. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
public Schema GetOutputSchema(Schema inputSchema) => Base.GetOutputSchema(inputSchema); | ||
|
||
public IRowToRowMapper GetRowToRowMapper(Schema inputSchema) => Base.GetRowToRowMapper(inputSchema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to expose this function to users? If not, maybe consider explicit interface implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I agree that it is a bit confusing, I might consider doing what I am doing with ICanSaveModel, and explicit implement the interface for these methods, but it should be done in another PR, and further discussed.
In reply to: 254477452 [](ancestors = 254477452)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this needs to be its own change since it require creating and implementing a new interface.
In reply to: 254485970 [](ancestors = 254485970,254477452)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please open an issue?
In reply to: 255151564 [](ancestors = 255151564,254485970,254477452)
{ | ||
public const string Summary = "Applies a Exponential average on a time series."; | ||
public const string LoaderSignature = "ExpAverageTransform"; | ||
public const string UserName = "Exponential Average Transform"; | ||
public const string ShortName = "ExpAvg"; | ||
|
||
#pragma warning disable 0649 | ||
public sealed class Arguments : TransformInputBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will warn for variable not initialized error. This pattern is used across the codebase #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has to do with making the class internal.
In reply to: 254483842 [](ancestors = 254483842,254478686)
StateRef = new State(); | ||
StateRef.InitState(WindowSize, InitialWindowSize, this, Host); | ||
} | ||
public bool IsRowToRowMapper => Base.IsRowToRowMapper; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to expose this function to users? Maybe internal-best-friend it? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it comes from the IStatefulTransformer, so it has to be public.
In reply to: 254477753 [](ancestors = 254477753)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It comes from ITransformer which is public.
In reply to: 254485421 [](ancestors = 254485421,254477753)
|
||
public IRowToRowMapper GetRowToRowMapper(Schema inputSchema) => Base.GetRowToRowMapper(inputSchema); | ||
|
||
public IRowToRowMapper GetStatefulRowToRowMapper(Schema inputSchema) => ((IStatefulTransformer)Base).GetStatefulRowToRowMapper(inputSchema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I am not sure this method is good for being exposed. Could you take a look at other public attributes and functions in this class? I am definitely not the durid (which is @TomFinley) who can judge which function should be public or not, so feel free to blame me for stupid questions. #WontFix
@@ -82,11 +82,29 @@ public static Double SquaredDifference(Double actual, Double predicted) | |||
} | |||
|
|||
/// <summary> | |||
/// This base class that implements the general anomaly detection transform based on Singular Spectrum modeling of the time-series. | |||
/// The wrapper to the base class that implements the general anomaly detection transform based on Singular Spectrum modeling of the time-series. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use <see cref=>
to reference the base class? #Resolved
StateRef = new State(); | ||
StateRef.InitState(WindowSize, InitialWindowSize, this, Host); | ||
} | ||
public SsaAnomalyDetectionBaseWrapper(SsaArguments args, string name, IHostEnvironment env) { Base = new SsaAnomalyDetectionBase(args, name, env, this); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SsaAnomalyDetectionBaseWrapper [](start = 15, length = 30)
If this is a base class, and users don't need to instantiate it can we make the two constructors internal? #Resolved
internal IStatefulRowMapper MakeRowMapper(Schema schema) => Base.MakeRowMapper(schema); | ||
|
||
internal IDataTransform MakeDataTransform(IDataView input) => Base.MakeDataTransform(input); | ||
|
||
public abstract class SsaArguments : ArgumentsBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SsaArguments [](start = 30, length = 12)
Can you rename them to SsaOptions
? and also rename local variables to options
If the arguments are not used in other places can we make them internal? #Resolved
23fefa4
to
1c0061e
Compare
New API surface: IDN:Microsoft.ML #Resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have a few minor comments, but this is great work @codemzs!
@@ -98,7 +98,7 @@ private static VersionInfo GetVersionInfo() | |||
} | |||
|
|||
// Factory method for SignatureDataTransform. | |||
private static IDataTransform Create(IHostEnvironment env, Arguments args, IDataView input) | |||
private static IDataTransform Create(IHostEnvironment env, Options args, IDataView input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rename to options
? #Resolved
@@ -258,38 +258,45 @@ public sealed class SsaChangePointEstimator : IEstimator<SsaChangePointDetector> | |||
{ | |||
} | |||
|
|||
public SsaChangePointEstimator(IHostEnvironment env, SsaChangePointDetector.Arguments args) | |||
internal SsaChangePointEstimator(IHostEnvironment env, SsaChangePointDetector.Options args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, if you could rename to options that would be great. #Resolved
@@ -237,38 +237,45 @@ public sealed class SsaSpikeEstimator : IEstimator<SsaSpikeDetector> | |||
{ | |||
} | |||
|
|||
public SsaSpikeEstimator(IHostEnvironment env, SsaSpikeDetector.Arguments args) | |||
internal SsaSpikeEstimator(IHostEnvironment env, SsaSpikeDetector.Options args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here if you could rename args
to options
. #Resolved
@@ -148,7 +147,7 @@ public IidSpikeDetector(IHostEnvironment env, ModelLoadContext ctx) | |||
// *** Binary format *** | |||
// <base> | |||
|
|||
Host.CheckDecode(ThresholdScore == AlertingScore.PValueScore); | |||
InternalTransform.Host.CheckDecode(InternalTransform.ThresholdScore == AlertingScore.PValueScore); | |||
} | |||
private IidSpikeDetector(IHostEnvironment env, IidSpikeDetector transform) | |||
: base(new BaseArguments(transform), LoaderSignature, env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And since you are at it, why don't you add a line before this method. #Resolved
@@ -61,7 +63,7 @@ public abstract class StateBase | |||
/// </summary> | |||
protected long RowCounter { get; private set; } | |||
|
|||
private protected StateBase() | |||
public StateBase() | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even need this constructor? #Resolved
/// <param name="outputColumnName">Name of the column resulting from the transformation of <paramref name="inputColumnName"/>. | ||
/// <param name="inputColumnName">Name of column to transform. If set to <see langword="null"/>, the value of the <paramref name="outputColumnName"/> will be used as source.</param> | ||
/// Column is a vector of type double and size 4. The vector contains Alert, Raw Score, P-Value as first three values.</param> | ||
/// <param name="confidence">The confidence for spike detection in the range [0, 100].</param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this supposed to qualify the outputColumnName
? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, just wanted to make sure that we need to expose the following directly rather than through MLContext. I am not too familiar with the predictionengine/predictionfunction that's why I am just pointing out for you to double check:
|
These are exposed directly as extension methods for ITransformer, not through ML Context. In reply to: 462005919 [](ancestors = 462005919) |
e11cf53
to
312526d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢 on its way.
fixes #2281
This change reduces TimeSeries public API count from 371 to
1245857.