-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Explicit implementation for IsRowToRowMapper and GetRowToRowMapper #2673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2673 +/- ##
==========================================
- Coverage 71.57% 71.57% -0.01%
==========================================
Files 800 800
Lines 141789 141792 +3
Branches 16107 16107
==========================================
+ Hits 101488 101489 +1
- Misses 35851 35854 +3
+ Partials 4450 4449 -1
|
@@ -123,9 +121,9 @@ private static bool IsChainRowToRowMapper(IDataView view) | |||
return true; | |||
} | |||
|
|||
public bool IsRowToRowMapper { get; } | |||
bool ITransformer.IsRowToRowMapper => IsChainRowToRowMapper(_xf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsChainRowToRowMapper [](start = 46, length = 21)
This operation is somewhat involved, so I would prefer if you could have a backing field to store this result as we had in the initial implementation. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -162,7 +162,7 @@ internal SsaSpikeDetector(IHostEnvironment env, ModelLoadContext ctx) | |||
InternalTransform.Host.CheckDecode(InternalTransform.IsAdaptive == false); | |||
} | |||
|
|||
public override void Save(ModelSaveContext ctx) | |||
private protected override void SaveModel(ModelSaveContext ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private protected override void SaveModel [](start = 8, length = 41)
Ahh, thank you very much for finding more of these!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually wait, these are already internal, so probably not necessary... still, eh... I don't know. If you think it helps with consistency and readability.
In reply to: 258775986 [](ancestors = 258775986)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the containing classes are public. The classes in which I made the implementation of ICanSaveModel,Save
explicit derive from SsaAnomalyDetectionBaseWrapper
or IidAnomalyDetectionBaseWrapper
which are both public.
There is just one that does not, and that's the Mapper class in SequentialAnomalyDetectionTransformBase
which is internal. This was the only instance where a mapper class does not have explicit implementation of the Save
method. In the previous PR, we decided the that despite this being an internal class it was beneficial to do an explicit implementation for consistency as it implements ICanSaveModel
. So all mappers elsewhere have an explicit implementation of that method.
In reply to: 258776058 [](ancestors = 258776058,258775986)
Thank you @artidoro I finished a pass, the only real high level things I saw were not storing the result of that somewhat expensive call in the wrapper. More minor was the somewhat needless internalization of members of an already internal class, but I consider that less urgent (especially if you feel strongly that being consistent in that fashion is desirable, which I could understand). |
@@ -348,7 +348,7 @@ public DataViewSchema.DetachedColumn[] GetOutputColumns() | |||
return col => false; | |||
} | |||
|
|||
public void Save(ModelSaveContext ctx) => _parent.SaveModel(ctx); | |||
void ICanSaveModel.Save(ModelSaveContext ctx) => _parent.SaveModel(ctx); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one was internal already, but in every other Mapper elsewhere we have made this one explicit despite it being internal for consistency.
{ | ||
Contracts.CheckValue(inputSchema, nameof(inputSchema)); | ||
Contracts.Check(IsRowToRowMapper, nameof(GetRowToRowMapper) + " method called despite " + nameof(IsRowToRowMapper) + " being false."); | ||
Contracts.Check(((ITransformer)this).IsRowToRowMapper, nameof(ITransformer.GetRowToRowMapper) + " method called despite " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsRowToRowMapper [](start = 49, length = 16)
I feel we should hide IsRowToRowMapper
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are asking whether we can make IsRowToRowMapper
internal.
I discussed this with Tom at length and the answer is that we should not make it internal. The reason is that when a ITransformer
is a IRowToRowMapper
, the performance of the prediction engine can be significantly increased compared to when it is not. Tom talked of a difference of orders of magnitude.
Since it is such an important characteristics of the ITransformer
we think it is better to keep it as part of the public API. This is especially true when we think about possibly allowing expansion of the library with custom components in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If its for potential use, I feel we should hide for this version because I don't think users will be able to control the performance of ML.NET in the near future.
In reply to: 259442579 [](ancestors = 259442579)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will let @TomFinley explain his reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general the policy is a good one, that we should try to limit the scope of the public API. However, I feel very strongly that something like IRowToRowMapper
has to exist.
The reason why I think this is the case is that I think back to #973. Literally everything can be captured as mapping from IDataView
to IDataView
, as we have seen. But we see that if we have a more "specific" interface for this row-to-row mapping, we can gain some benefit -- in this case, orders of magnitude speedup and a far more clear interface from using this "bypass" mechanism, in the case where we can use it.
From this perspective, we might say, "very well, then IRowToRowMapper
should just be internal and prediction engine will still be free to use it." Which is of course possible. But then I think to myself, if PredictionEngine
is useful from a reflection-driven perspective (as it quite clearly is), what is the dynamic API's answer to this scenario? There is nothing except IRowToRowMapper
.
So, I think the case for something like IRowToRowMapper
is clear, so I think we should ship it. So does it have to be that specifically? I don't see any better proposal than that interface. (Though perhaps it could be refined, e.g., with @sfilipi making it operating over schema columns specifically rather than int => bool
delegates, etc.) But I think the principle has proven itself useful and should be made public, just like IDataView
has been around, proven itself useful, and should be made public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I know this already closed, but though I'd answer the direct question. 😄 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @artidoro !!
Fixes #2540.
As explained in the issue I make the implementations of two methods of the
ITransformer
interfaceIsRowToRowMapper
andGetRowToRowMapper
explicit.In a previous PR #2431 I made the implementation of
ICanSaveModel.Save
explicit.Save
is part ofITransformer
as it derives fromICanSaveModel
. I found a few instances that have been changed since my last PR in the TimeSeries assembly and that still had a regular implementation for `Save. I made their implementation explicit in this PR as it is very similar in spirit.