Introduce DenseVector<T> and convert one usage of VBuffer to it. #4

eerhardt · 2018-10-08T22:29:07Z

Working towards dotnet#608

This is an initial draft of what a "dense vector" could look like in ML.NET. I also updated one location to use it to show how it would work.

Please let me know if you think I'm going in the wrong direction.

@TomFinley @ericstj @Zruty0 @KrzysztofCwalina


KrzysztofCwalina · 2018-10-10T14:53:08Z

src/Microsoft.ML.Core/Data/DenseVector.cs

+    /// is passed to a row cursor getter, the callee is free to take ownership of
+    /// and re-use the backing data structure (Buffer).
+    /// </summary>
+    public readonly struct DenseVector<T>


Why do we need this type? Why not use Memory directly? If you need to be able to resize it, could the caller just hold on to the original buffer? Or the method filling data into the Memory could return an out parameter specifying how many items were filled in.

eerhardt · 2018-10-10

Take a look at https://github.com/dotnet/machinelearning/blob/master/docs/code/VBufferCareFeeding.md#buffer-re-use-as-a-user for the common usage of these vectors.

> Why not use Memory directly?

Because if you only had a Memory, then as soon as you returned a vector shorter than the buffer's total capacity, you could no longer get back to the full capacity of the buffer.
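The capacity concern can be sketched as follows. This is a minimal illustration only: `LengthAndBuffer<T>` and `CapacityDemo` are hypothetical names invented for this example, not the `DenseVector<T>` type from this PR. It shows that a sliced `Memory<T>` reports only its own length, while a type that pairs the full buffer with a logical length keeps the capacity available for re-use.

```csharp
using System;

// Hypothetical illustration only; not the DenseVector<T> type from this PR.
public readonly struct LengthAndBuffer<T>
{
    public LengthAndBuffer(Memory<T> buffer, int length)
    {
        if (length < 0 || length > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(length));
        Buffer = buffer;
        Length = length;
    }

    // The whole reusable buffer; its Length acts as the capacity.
    public Memory<T> Buffer { get; }

    // Logical number of valid items, which may be less than the capacity.
    public int Length { get; }

    // A view over the valid items only.
    public Memory<T> Values => Buffer.Slice(0, Length);
}

public static class CapacityDemo
{
    // Slicing a Memory<T> yields a view that knows only its own length;
    // the original capacity is not exposed by the Memory<T> API itself.
    public static int SlicedLength(int capacity, int used)
    {
        Memory<float> full = new float[capacity];
        Memory<float> sliced = full.Slice(0, used);
        return sliced.Length;
    }

    // Keeping the buffer and the length together preserves the capacity.
    public static int PreservedCapacity(int capacity, int used)
    {
        var vector = new LengthAndBuffer<float>(new float[capacity], used);
        return vector.Buffer.Length;
    }
}
```

With a capacity of 8 and 3 items in use, `SlicedLength(8, 3)` returns 3 and `PreservedCapacity(8, 3)` returns 8.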

> Could the caller just hold on to the original buffer?

It's my understanding that the caller doesn't always know how big the original buffer should be, and thus can't create it.

> Or the method filling data into the Memory could return an out parameter specifying how many items were filled in.

That won't work with the current IDataView GetGetter delegate design.

KrzysztofCwalina · 2018-10-10

What I meant is the following (it's a pattern we use in lots of Span/Memory APIs):

    Memory<byte> buffer = ...;
    while (true)
    {
        if (TryFillBuffer(buffer, out int written))
        {
            UseFilledInBuffer(buffer.Slice(0, written));
            break;
        }
        Enlarge(ref buffer);
    }

Could we do something like that?
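For reference, a self-contained sketch of that try-fill-then-enlarge pattern is below. `TryFillBuffer`, `Enlarge`, and `Fill` here are hypothetical stand-ins written for this example (the producer just copies from a source span, and the growth policy simply doubles the capacity); they are not existing ML.NET or BCL APIs, and the sketch is independent of the IDataView getter design discussed above.

```csharp
using System;

public static class FillPattern
{
    // Hypothetical producer: copies 'source' into 'destination' if it fits,
    // reporting how many items were written.
    public static bool TryFillBuffer(ReadOnlySpan<byte> source, Memory<byte> destination, out int written)
    {
        if (source.Length > destination.Length)
        {
            written = 0;
            return false;
        }
        source.CopyTo(destination.Span);
        written = source.Length;
        return true;
    }

    // Hypothetical growth policy: double the capacity (at least 1).
    // The old contents are discarded because the producer refills from scratch.
    public static void Enlarge(ref Memory<byte> buffer)
    {
        buffer = new byte[Math.Max(1, buffer.Length * 2)];
    }

    // The caller owns 'buffer' and keeps its full capacity for later re-use;
    // the return value is only the filled portion.
    public static Memory<byte> Fill(ReadOnlySpan<byte> source, ref Memory<byte> buffer)
    {
        while (true)
        {
            if (TryFillBuffer(source, buffer, out int written))
                return buffer.Slice(0, written);
            Enlarge(ref buffer);
        }
    }
}
```

In this sketch the caller holds on to the (possibly enlarged) buffer through the `ref` parameter, so the capacity is never lost even though the returned slice is shorter.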

KrzysztofCwalina · 2018-10-10T14:54:00Z

src/Microsoft.ML.Core/Data/DenseVector.cs

+
+using System;
+
+namespace Microsoft.ML.Runtime.Data


Will this type show up in the public surface area of mainline scenario APIs? If so, should it be in the Microsoft.ML root namespace?

KrzysztofCwalina · 2018-10-10T14:54:34Z

src/Microsoft.ML.Core/Utilities/Utils.cs

@@ -107,6 +108,16 @@ public static int Size<T>(SortedSet<T> x)
            return x == null ? 0 : x.Count;
        }

+        public static int Size<T>(Memory<T> x)


Why do we need these helpers?

KrzysztofCwalina · 2018-10-10T14:56:11Z

test/BaselineOutput/SingleDebug/Command/CommandTrainScoreEvaluateQuantileRegression-3-out.txt

@@ -0,0 +1,15 @@
+L1(avg):            1.926877


I assume these out.txt files should not be here?

eerhardt · 2018-10-23T17:10:54Z

After speaking with @TomFinley, this is not the direction we want to go with VBuffer. Closing.

Introduce DenseVector<T> and convert one usage of VBuffer to it.

4570cdc

Working towards dotnet#608

KrzysztofCwalina reviewed Oct 10, 2018


eerhardt closed this Oct 23, 2018
