Skip to content

Mark VBuffer readonly and convert usages to use 'in' instead of 'ref` #1454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 31, 2018

Conversation

eerhardt
Copy link
Member

This implements the 2nd and 3rd steps in the plan outlined in #608 (comment).

  1. Make VBuffer a readonly struct.
    • It was always designed as readonly, we just didn't have the C# capability before now.
  2. In methods that only actually read values from a VBuffer, change that parameter from ref to in.
    • This guarantees that the method won't change the reference, as was always the intention. We just didn't have the C# capability before now.

Working towards #608

Note: this doesn't need to make it in for v0.7.

@@ -424,11 +424,6 @@ public static void Copy(T[] src, int srcIndex, ref VBuffer<T> dst, int length)
dst = new VBuffer<T>(length, values, dst.Indices);
}

public static void Copy(ref VBuffer<T> src, ref VBuffer<T> dst)
Copy link
Member Author

@eerhardt eerhardt Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: there was only 1 usage of this method, and it is just a wrapper over src.CopyTo(ref dst), so I thought it was best to drop it in favor of only having a single way to copy from one buffer to another.

Let me know if this was a bad idea. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also used once in internal repos, so I think this is a good change.


In reply to: 229444488 [](ancestors = 229444488)

@@ -592,9 +592,9 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d
/// <param name="dst">Argument vector, whose elements are only read</param>
/// <param name="res">Result vector</param>
/// <param name="manip">Function to apply to each pair of elements</param>
public static void ApplyWithCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip)
public static void ApplyWithCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref VBuffer dst [](start = 75, length = 21)

/// Argument vector, whose elements are only read #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this comment, since it is lying.


In reply to: 229498527 [](ancestors = 229498527)

@@ -625,9 +625,9 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d
/// <param name="dst">Argument vector, whose elements are only read</param>
/// <param name="res">Result vector</param>
/// <param name="manip">Function to apply to each pair of elements</param>
public static void ApplyWithEitherDefinedCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip)
public static void ApplyWithEitherDefinedCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref [](start = 88, length = 3)

/// Argument vector, whose elements are only read #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this comment, since it is lying.


In reply to: 229498807 [](ancestors = 229498807)

@@ -908,7 +908,7 @@ public static void CreateMaybeSparseCopy<T>(ref VBuffer<T> src, ref VBuffer<T> d
/// where necessary depending on whether this is an inner or outer join of the
/// indices of <paramref name="src"/> on <paramref name="dst"/>.
/// </summary>
private static void ApplyWithCoreCopy<TSrc, TDst>(ref VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip, bool outer)
private static void ApplyWithCoreCopy<TSrc, TDst>(in VBuffer<TSrc> src, ref VBuffer<TDst> dst, ref VBuffer<TDst> res, PairManipulatorCopy<TSrc, TDst> manip, bool outer)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref [](start = 80, length = 3)

can it be "in"? #ByDesign

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dst (and the two calling methods above) cannot be made in because of the case at this line of code:

// result will become dense.
// This is unnecessary -- falling through to the sparse code will
// actually handle this case just fine -- but it is more efficient.
Densify(ref dst);
ApplyWithCoreCopy(ref src, ref dst, ref res, manip, outer);

This method actually modifies dst in that case, by making it dense. Thus, the dst can't be marked in.


In reply to: 229498975 [](ancestors = 229498975)

}

/// <summary>
/// Perform scalar vector addition
/// <c><paramref name="res"/> = <paramref name="c"/> * <paramref name="src"/> + <paramref name="dst"/></c>
/// </summary>
public static void AddMult(ref VBuffer<Float> src, Float c, ref VBuffer<Float> dst, ref VBuffer<Float> res)
public static void AddMult(in VBuffer<Float> src, Float c, ref VBuffer<Float> dst, ref VBuffer<Float> res)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref [](start = 67, length = 3)

in #ByDesign

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dst can't be marked in because of the case that calls VBufferUtils.ApplyWithCopy, which passes dst into the ref dst parameter.


In reply to: 229499582 [](ancestors = 229499582)

@@ -31,7 +31,7 @@ public static RoleMappedData SelectFeatures(IHost host, RoleMappedData data, Bit
var name = data.Schema.Feature.Name;
var view = LambdaColumnMapper.Create(
host, "FeatureSelector", data.Data, name, name, type, type,
(ref VBuffer<Single> src, ref VBuffer<Single> dst) => SelectFeatures(ref src, features, card, ref dst));
(ref VBuffer<Single> src, ref VBuffer<Single> dst) => SelectFeatures(in src, features, card, ref dst));
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref [](start = 17, length = 3)

is it worth to modify ValueMapper to switch from public delegate void ValueMapper<TSrc, TDst>(ref TSrc src, ref TDst dst); to
public delegate void ValueMapper<TSrc, TDst>(in TSrc src, ref TDst dst); ? #Pending

Copy link
Contributor

@TomFinley TomFinley Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, yes, this seems like a good idea. Indeed in a change yesterday I see @eerhardt already did something like this by improving RefPredicate to InPredicate in #1405, and this seems quite similar. However I wonder if, like the RefPredicate to InPredicate change, it would be worthwhile to phrase it as a separate PR, since that seems orthogonal to this change unless I am much mistaken? #Pending

Copy link
Member Author

@eerhardt eerhardt Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this in a follow up PR. This change was necessary first, since there are a ton of places where ValueMapper implementations call into these methods. Once these are in, ValueMapper can be updated as well. #Pending

@@ -1034,10 +1034,10 @@ protected void GetFeatureIniContent(RoleMappedData data, ref VBuffer<ReadOnlyMem
/// <param name="upperBounds">The bin upper bounds, maximum length will be <paramref name="maxBins"/></param>
/// <returns>Whether finding the bins was successful or not. It will be unsuccessful iff <paramref name="values"/>
/// has any missing values. In that event, the out parameters will be left as null.</returns>
protected static bool CalculateBins(BinFinder binFinder, ref VBuffer<double> values, int maxBins, int minDocsPerLeaf,
protected static bool CalculateBins(BinFinder binFinder, in VBuffer<double> values, int maxBins, int minDocsPerLeaf,
ref double[] distinctValues, ref int[] distinctCounts, out double[] upperBounds)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref double[] distinctValues, ref int[] distinctCounts, [](start = 12, length = 54)

Can you ask @TomFinley why we have them? I don't see any usages of them. #Resolved

Copy link
Contributor

@TomFinley TomFinley Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this from a prior time before @costin-eseanu (I think?) introduced the concept of the BinFinder, which internally incorporates this concept of a reusable buffer. Yet somehow these arguments were not removed at the same time. Or maybe I made the mistake, since I also touched the disk implementation a lot, so maybe I did this. Anyway this was done many years ago.

Not quite relevant to the PR, but something we may want to remove opportunistically. It's not really hurting anything exactly, but it is a little silly to keep these around now that we do have bin finder. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove them. Might as well do some clean up while I'm here.


In reply to: 229529149 [](ancestors = 229529149)

@@ -1282,16 +1282,16 @@ public bool IsYinYangGloballyBound(int n)
}

#if DEBUG
public void AssertValidYinYangBounds(int n, ref VBuffer<float> features, VBuffer<float>[] centroids)
public void AssertValidYinYangBounds(int n, in VBuffer<float> features, VBuffer<float>[] centroids)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VBuffer[] centroids [](start = 84, length = 26)

also "in"? #ByDesign

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an array of VBuffers. We pass the array like any other reference.


In reply to: 229506248 [](ancestors = 229506248)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eerhardt thanks for doing this -- this looks very nice, and is a major improvement to the codebase. @Ivanidzo4ka had one additional comment I agree with regarding the so-called value mappers, but I wonder if that would be best explored in a separate PR.

@eerhardt eerhardt merged commit 2d2ed27 into dotnet:master Oct 31, 2018
@eerhardt eerhardt deleted the InVBuffer branch October 31, 2018 20:20
@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants