-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Mark VBuffer readonly and convert usages to use 'in' instead of 'ref` #1454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ML.Core and ML.Data.
ML.StandardLearners
@@ -424,11 +424,6 @@ public static void Copy(T[] src, int srcIndex, ref VBuffer<T> dst, int length) | |||
dst = new VBuffer<T>(length, values, dst.Indices); | |||
} | |||
|
|||
public static void Copy(ref VBuffer<T> src, ref VBuffer<T> dst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: there was only 1 usage of this method, and it is just a wrapper over src.CopyTo(ref dst)
, so I thought it was best to drop it in favor of only having a single way to copy from one buffer to another.
Let me know if this was a bad idea. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also used once in internal repos, so I think this is a good change.
In reply to: 229444488 [](ancestors = 229444488)
@@ -592,9 +592,9 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d | |||
/// <param name="dst">Argument vector, whose elements are only read</param> | |||
/// <param name="res">Result vector</param> | |||
/// <param name="manip">Function to apply to each pair of elements</param> | |||
public static void ApplyWithCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip) | |||
public static void ApplyWithCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref VBuffer dst [](start = 75, length = 21)
/// Argument vector, whose elements are only read #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -625,9 +625,9 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d | |||
/// <param name="dst">Argument vector, whose elements are only read</param> | |||
/// <param name="res">Result vector</param> | |||
/// <param name="manip">Function to apply to each pair of elements</param> | |||
public static void ApplyWithEitherDefinedCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip) | |||
public static void ApplyWithEitherDefinedCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref [](start = 88, length = 3)
/// Argument vector, whose elements are only read #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -908,7 +908,7 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d | |||
/// where necessary depending on whether this is an inner or outer join of the | |||
/// indices of <paramref name="src"/> on <paramref name="dst"/>. | |||
/// </summary> | |||
private static void ApplyWithCoreCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip, bool outer) | |||
private static void ApplyWithCoreCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip, bool outer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref [](start = 80, length = 3)
can it be "in"? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dst
(and the two calling methods above) cannot be made in
because of the case at this line of code:
machinelearning/src/Microsoft.ML.Core/Utilities/VBufferUtils.cs
Lines 1089 to 1093 in ef2bade
// result will become dense. | |
// This is unnecessary -- falling through to the sparse code will | |
// actually handle this case just fine -- but it is more efficient. | |
Densify(ref dst); | |
ApplyWithCoreCopy(ref src, ref dst, ref res, manip, outer); |
This method actually modifies dst
in that case, by making it dense. Thus, the dst
can't be marked in
.
In reply to: 229498975 [](ancestors = 229498975)
} | ||
|
||
/// <summary> | ||
/// Perform scalar vector addition | ||
/// <c><paramref name="res"/> = <paramref name="c"/> * <paramref name="src"/> + <paramref name="dst"/></c> | ||
/// </summary> | ||
public static void AddMult(ref VBuffer<Float> src, Float c, ref VBuffer<Float> dst, ref VBuffer<Float> res) | ||
public static void AddMult(in VBuffer<Float> src, Float c, ref VBuffer<Float> dst, ref VBuffer<Float> res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref [](start = 67, length = 3)
in #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dst
can't be marked in
because of the case that calls VBufferUtils.ApplyWithCopy
, which passes dst
into the ref dst
parameter.
In reply to: 229499582 [](ancestors = 229499582)
@@ -31,7 +31,7 @@ public static RoleMappedData SelectFeatures(IHost host, RoleMappedData data, Bit | |||
var name = data.Schema.Feature.Name; | |||
var view = LambdaColumnMapper.Create( | |||
host, "FeatureSelector", data.Data, name, name, type, type, | |||
(ref VBuffer<Single> src, ref VBuffer<Single> dst) => SelectFeatures(ref src, features, card, ref dst)); | |||
(ref VBuffer<Single> src, ref VBuffer<Single> dst) => SelectFeatures(in src, features, card, ref dst)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref [](start = 17, length = 3)
is it worth to modify ValueMapper to switch from public delegate void ValueMapper<TSrc, TDst>(ref TSrc src, ref TDst dst);
to
public delegate void ValueMapper<TSrc, TDst>(in TSrc src, ref TDst dst);
? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, yes, this seems like a good idea. Indeed in a change yesterday I see @eerhardt already did something like this by improving RefPredicate
to InPredicate
in #1405, and this seems quite similar. However I wonder if, like the RefPredicate
to InPredicate
change, it would be worthwhile to phrase it as a separate PR, since that seems orthogonal to this change unless I am much mistaken? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do this in a follow up PR. This change was necessary first, since there are a ton of places where ValueMapper
implementations call into these methods. Once these are in, ValueMapper
can be updated as well. #Pending
@@ -1034,10 +1034,10 @@ protected void GetFeatureIniContent(RoleMappedData data, ref VBuffer<ReadOnlyMem | |||
/// <param name="upperBounds">The bin upper bounds, maximum length will be <paramref name="maxBins"/></param> | |||
/// <returns>Whether finding the bins was successful or not. It will be unsuccessful iff <paramref name="values"/> | |||
/// has any missing values. In that event, the out parameters will be left as null.</returns> | |||
protected static bool CalculateBins(BinFinder binFinder, ref VBuffer<double> values, int maxBins, int minDocsPerLeaf, | |||
protected static bool CalculateBins(BinFinder binFinder, in VBuffer<double> values, int maxBins, int minDocsPerLeaf, | |||
ref double[] distinctValues, ref int[] distinctCounts, out double[] upperBounds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref double[] distinctValues, ref int[] distinctCounts, [](start = 12, length = 54)
Can you ask @TomFinley why we have them? I don't see any usages of them. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this from a prior time before @costin-eseanu (I think?) introduced the concept of the BinFinder
, which internally incorporates this concept of a reusable buffer. Yet somehow these arguments were not removed at the same time. Or maybe I made the mistake, since I also touched the disk implementation a lot, so maybe I did this. Anyway this was done many years ago.
Not quite relevant to the PR, but something we may want to remove opportunistically. It's not really hurting anything exactly, but it is a little silly to keep these around now that we do have bin finder. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove them. Might as well do some clean up while I'm here.
In reply to: 229529149 [](ancestors = 229529149)
@@ -1282,16 +1282,16 @@ public bool IsYinYangGloballyBound(int n) | |||
} | |||
|
|||
#if DEBUG | |||
public void AssertValidYinYangBounds(int n, ref VBuffer<float> features, VBuffer<float>[] centroids) | |||
public void AssertValidYinYangBounds(int n, in VBuffer<float> features, VBuffer<float>[] centroids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VBuffer[] centroids [](start = 84, length = 26)
also "in"? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an array of VBuffers. We pass the array like any other reference.
In reply to: 229506248 [](ancestors = 229506248)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @eerhardt thanks for doing this -- this looks very nice, and is a major improvement to the codebase. @Ivanidzo4ka had one additional comment I agree with regarding the so-called value mappers, but I wonder if that would be best explored in a separate PR.
This implements the 2nd and 3rd steps in the plan outlined in #608 (comment).
ref
toin
.Working towards #608
Note: this doesn't need to make it in for
v0.7
.