Skip to content

Convert ValueMapper to use 'in' parameters #1475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 1, 2018

Conversation

eerhardt
Copy link
Member

ValueMapper takes in a source value, and maps it to a destination value. The source value that is passed "in" is current passed by ref, which isn't correct because we don't want ValueMapper implementations to modify this value.
Change ValueMapper (both versions) to use in parameters instead.

This is a follow-up to #1454 requested in the PR feedback.

Working towards #608


/// <summary>
/// Delegate type to map/convert among three values, for example, one input with two
/// outputs, or two inputs with one output.
/// </summary>
public delegate void ValueMapper<TVal1, TVal2, TVal3>(ref TVal1 val1, ref TVal2 val2, ref TVal3 val3);
public delegate void ValueMapper<TVal1, TVal2, TVal3>(in TVal1 val1, ref TVal2 val2, ref TVal3 val3);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inspected all the usages of this delegate, and most times it is used by the below interface IValueMapperDist, which takes 1 input and returns 2 outputs.
The only case where this was used for 2 inputs and 1 output was in one place in matrix factorization code (the two inputs were column and row I believe).

I didn't feel it was necessary to split this delegate. If someone does feel it is necessary, I think a separate issue should be logged and addressed separately.

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@@ -9,13 +9,13 @@ namespace Microsoft.ML.Runtime.Data
/// <summary>
/// Delegate type to map/convert a value.
/// </summary>
public delegate void ValueMapper<TSrc, TDst>(ref TSrc src, ref TDst dst);
public delegate void ValueMapper<TSrc, TDst>(in TSrc src, ref TDst dst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, is TSrc expected to always be a readonly struct? in causes hidden copies when calling an instance member for non-readonly structs and can cause unexpected perf-regressions in some cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a practical matter in most applications they will be some sort of VBuffer, with some isolated instances here and there as you've observed. Yet these other types from what I see tend to also be readonly structs. (E.g., you observed elsewhere TimeSpan, and here is Int32) I just ran a pretty simple benchmark like so (using BenchmarkDotNet, my new favorite project)...

private static int Foo(in byte a, in byte b) => a + b;
private static int Bar(byte a, byte b) => a + b;

[Benchmark]
public void Foo() { for (int i = 0; i < 1000; ++i) Foo(2, 5); }
[Benchmark]
public void Bar() { for (int i = 0; i < 1000; ++i) Bar(2, 5); }

At least on my computer this gives:

Method Mean Error StdDev
Foo 239.6 ns 2.8371 ns 2.6538 ns
Bar 238.2 ns 0.8780 ns 0.7332 ns

So it seems like it is at least not harmful for things like int, while still giving some benefit for things like VBuffer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my limited knowledge of the area, I would guess that >95% of the time ValueMapper will be used for primitives, reference types, and readonly struct.

The most common case is to use these types:

    using BL = Boolean;
    using DT = DateTime;
    using DZ = DateTimeOffset;
    using R4 = Single;
    using R8 = Double;
    using I1 = SByte;
    using I2 = Int16;
    using I4 = Int32;
    using I8 = Int64;
    using SB = StringBuilder;
    using TX = ReadOnlyMemory<char>;
    using TS = TimeSpan;
    using U1 = Byte;
    using U2 = UInt16;
    using U4 = UInt32;
    using U8 = UInt64;
    using UG = UInt128;

Along with VBuffer, which just got made readonly in #1454.

Skimming this list, I found that UInt128 was not marked readonly. So I'm opening #1496 to make that (and other obvious readonly structs) marked as readonly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any concern with the primitive types not being readonly structs for full framework and/or netstandard right now?

{
TextSaverUtils.MapText(src.Span, ref sb, Sep);
}

protected void MapTimeSpan(ref TimeSpan src, ref StringBuilder sb)
protected void MapTimeSpan(in TimeSpan src, ref StringBuilder sb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: some of these don't seem worthwhile to pass by in, since they are often smaller than a register in size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we see though we benefit from having a uniform signature no matter what the input type is, even if not all types will benefit, since we can then use the same high level code for all types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These have to be changed to in because these methods are used as ValueMapper delegates.

else if (type.IsTimeSpan)
{
ValueMapper<TimeSpan, StringBuilder> c = MapTimeSpan;
Conv = (ValueMapper<T, StringBuilder>)(Delegate)c;
}
else if (type.IsDateTime)
{
ValueMapper<DateTime, StringBuilder> c = MapDateTime;
Conv = (ValueMapper<T, StringBuilder>)(Delegate)c;
}
else if (type.IsDateTimeZone)
{
ValueMapper<DateTimeOffset, StringBuilder> c = MapDateTimeZone;
Conv = (ValueMapper<T, StringBuilder>)(Delegate)c;
}

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eerhardt looks great!

@TomFinley TomFinley merged commit a8bc268 into dotnet:master Nov 1, 2018
@eerhardt eerhardt deleted the InValueMapper branch November 1, 2018 14:02
@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants