Skip to content

Conversation

manofstick
Copy link
Contributor

@manofstick manofstick commented Sep 4, 2016

Discriminated Unions are represented internally as a shallow object hierarchy. Externally only the base class is exposed. Determining which actually type it is (i.e. some for of RTTI) is broadly handled using one of two mechanisms, the mechasim being determined by how many union cases exist (there are some optimizatins around using NULL, and when Nullary items exists).

The first mechanism uses .net RTTI for determining underlying cases, and the second uses a Tag field stored in the base class, which is populated by by the case on construction. When four or more cases exist, the second method is used.

The first method is lighter on memory, but somewhat slower, especially when threes cases exist, the second method is faster, but consumes an extra 4 bytes per object instance.

The PR introduces a third, opt-in method, which is a halfway house between the two methods. It creates an abstract Tag on the base class, and each of the derived types just return a constant value.

Speed wise this falls between the two existing methods, and memory wise it is as good as the first.

Opt is has been handled by adding another flag to CompilationRepresentationFlags (currently it overrides UseNullAsTrueValue - as in ORing them together doesn't work, although it could be used in conjunction, which would be faster for the null case;)

I'll create some performance number soon.

This is represented as a WIP to determine if this is a) a good idea, and b) using CompilationRepresentationFlags is a good idea. Also, I've only really just done this as a proof of concept really so I don't know if I'm broken things at the moment...

This currently just duplicates the IntegerTag functionality withing
EraseUnions.fs
@manofstick
Copy link
Contributor Author

OK - some numbers!

Not that it is probably practical, because it would cause all sorts of compatibility problems (maybe?), but one area that could benefit from such a change would be the humble Map.

So if we change the UseNullAsTrueValue to UseVirtualFlag in MapTree and then run some tests...

[<CompilationRepresentation(CompilationRepresentationFlags.UseVirtualTag)>]
[<NoEquality; NoComparison>]
type MapTree<'Key,'Value when 'Key : comparison > = 
    | MapEmpty 
    | MapOne of 'Key * 'Value
    | MapNode of 'Key * 'Value * MapTree<'Key,'Value> *  MapTree<'Key,'Value> * int
        // REVIEW: performance rumour has it that the data held in MapNode and MapOne should be
        // exactly one cache line. It is currently ~7 and 4 words respectively. 

So lets hammer the map with some data.

Previous Time (ms) Modified Time (ms) % of Original
x86 build 1622 1395 86%
x86 lookup 413 373 90%
x64 build 1758 1523 87%
x64 lookup 404 384 95%

So it's about ~ 10% faster. Pretty good?

@manofstick
Copy link
Contributor Author

  • the build errors, as far as I'm aware, are just because I have increased the public surface area with the new flag; I'm not sure if I should me modifying such things though (i.e. shouldn't change surface without someone with property authority I think?)
  • in some contrived testing I found that having the IntegerTag field was fastest, but when I forced IntegerTag encoding with the Map example as per the timing from the previous message I actually found negligible time difference. Thus I think that the VirtualTag is superior to both RuntimeTypes (better speed, equal space) and IntegerTag (better space, equal speed).

@forki
Copy link
Contributor

forki commented Sep 5, 2016

just because I have increased the public surface area

you should still change that test then. review on green is better

@manofstick
Copy link
Contributor Author

@forki

OK; well we'll see how she goes; my machine just doesn't run the test suite and I haven't bothered to find out why, and NUnit integration in Visual Studio doesn't work either, so I just ramshackled some stuff together to make the surface area check. Should have just manually put it in, would have been easier and quicker, but hey. (don't have much time to give this machine any love; just happy it compiles!)

But I guess I would have liked to hear some feedback, not particularly about my implementation, as that is neither here not there, but rather about the idea. Because to me this method seems like the way to go when the existing IntegerTag and RunTimeTypes methods could be deprecated? This could simplify things vastly, but I'm not sure of the ramifications of such a rollout, due to inlining especially which would mean version management? But maybe that's already an issue with any FSharp.Core change?

(Not that this matters in my currently implementation, as it's opt-in, but I would like to see some discussion in the general case....)

member EnclosingType : ILType
member GenericArgs : ILGenericArgs
member Alternatives : IlxUnionAlternative list
member EnclosingType : ILType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove these whitespace normalizations for the purposes of the PR, so we can see the minimal diff. Thanks

@dsyme
Copy link
Contributor

dsyme commented Sep 6, 2016

@manofstick The implementation look good and seems a pretty sane design. The performance improvements are convincing. Would like to see if overall compiler perf is improved by this (see also Zmap in the compiler)

@manofstick
Copy link
Contributor Author

@dsyme

Zmap (and Zset) are optimized for performance already; i.e. they are only a single nullary/single non-nullary case, and they include UseNullAsTrueValue, so they optimize down to just null checks - although that does mean from a space perspective they are a little more bloated as the don't contain the (MapOne|SetOne) case. (Must have been experimental at some stage, as it's wrapped in a conditional compiler directive "ONE")

I tried forcing all RuntimeTypes/IntegerType to VirtualTag, but FSharpOption was clashing with some members; I will have to look what combination causes issues. Otherwise I might just sprinkle it the flag around - do you have a list of most used types from a performance tests? Do you have any standard performance tests for the compiler?

And, if this was to be come non-opt-in (i.e. default for all disciminate unions - although keep the special cases for all nullary and single non-nullary/single nullary and structs) would this cause compatibility issues; i.e. with inlining of modules that were build pre-such a change? Or is this not an issue? I don't know; just trying to find out...

@manofstick
Copy link
Contributor Author

@dsyme

If you want to play with checking performance, then this diff appears to work, changing all Discriminate Unions to use virtual tag except for option and list... (I'm not recommending this... just hacked it to get it going)

(My machine is toasted for the test suite for some reason, and haven't had time to look at it, so I'm just going off that I could do a build with this diff applied, and then analysing the resultant assemblies to ensure that virtual tag had been applied)

Anyway, no more time for me; hopefully I'll get some time again soon...

@dsyme
Copy link
Contributor

dsyme commented Sep 8, 2016

And, if this was to be come non-opt-in ... would this cause compatibility issues

I think the biggest concern is that serialization format incompatibility (e.g. binary serialization) . Inlining etc. would be ok.

@manofstick
Copy link
Contributor Author

@dsyme

Would inlining be OK? If I have an assembly built with a previous version, and then tried to reference it in an assembly built with a new one, then the new assembly, in say a match expression would try to make explicit access to the underlying virtual Tag member, which just wouldn't exist? Or is this not how it would work?

Anyway, hadn't thought of binary serialization. Bummer. I guess that really does mean it will have to remain opt-in. A shame. Because RuntimeTypes is always worse, and I'm pretty sure the difference in the real world between the VirtualTag and IntegerTag is fairly negligible performance wise (caveat; I really need to do some better testing!) and you get your 4 bytes per object saving... Hmmm...

@manofstick
Copy link
Contributor Author

@forki

I saw you had FSharpPerf project, which sounds just like what I need, but I tried to do "build.cmd BuildProjects" with the default build.fsx but it failed; I pressed on anyway with the RunPerfTests, but it just failed too. Should I not be using defaults? Is there anything else I need? Or do I really need to modify the build.fsx?

@forki
Copy link
Contributor

forki commented Sep 8, 2016

from https://github.com/fsprojects/FSharpPerf#testing-on-appveyor

@manofstick
Copy link
Contributor Author

@forki

So just do a non-local build? OK.

@forki
Copy link
Contributor

forki commented Sep 8, 2016

yeah. that's what I recommend. otherwise your machine is occupied for a while.

@forki
Copy link
Contributor

forki commented Sep 8, 2016

fsprojects/FSharpPerf#7 show cool results for this.

@manofstick
Copy link
Contributor Author

@forki

Unfortunately I think these are the wrong way around! (Or at least that is what I would be expecting) I.e. I have probably slowed things down, as I have removed some exciting optimisations by just forcing everything to use virtual tag.

Anyway, I'll analyse and confirm, fix, and be more surgical now that I have a framework to work within.

@forki
Copy link
Contributor

forki commented Sep 8, 2016

9d5e36a is your commit and FSharpPerf shows this as faster.

@manofstick
Copy link
Contributor Author

@forki

Ahh but that commit was the baseline; the other one was the one where I forced all union types to use virtual tag! No great performance leap!

And I'm not really expecting much, if any, improvement in regards to the compiler, as I believe that people have manually optimized the cases where things were slow by cutting down the number of cases to a single non-nullary/single nullary cases such as what was done with Zmap/Zset. These end up as best case scenerios, as they only need to do null checks with no tags anyway. And in my original test I converted these to my less optimal VirtualTag just because I wanted to see what would happen if I just did sweeping changes.

But what I would like to see is, if I can now pull out all those special cases and leave them in their optimized state, and then change the rest of the unions to VirtualTag I'm hoping this will have minimal slowdown or possibly even a slight performance increases then that would be a win; as we're then saving 4 bytes per union cases allocation.

Anyway, we'll see how we go - Ballet and Mr Maker are providing me with a brief reprieve, but it won't last for long :-)

@manofstick
Copy link
Contributor Author

@dsyme

Do you think it is worth refactoring the EraseUnions to make the DiscriminationTechnique more explicit?

What I mean is extend it to:

type DiscriminationTechnique =
   | TailOrNull
   | SingleCase
   | SingleCaseStruct
   | Struct 
   | OneNullaryOneNonNullaryCase
   | VirtualTag
   | VirtualTagWithNull // not yet implemented
   | RuntimeTypes
   | RuntimeTypesWithNull
   | IntegerTag
   | IntegerTagAllNullaryTypes

Which I think is the "real" list of what actually goes on after all the helper functions are invoked. And I think the logic in determining which is captured by the following modification:

member repr.DiscriminationTechnique cu = 
    if isList cu then TailOrNull
    else
        let alts = getAlternatives cu
        if alts.Length = 1 then
            if isStruct cu then SingleCaseStruct else SingleCase
        elif isStruct cu then Struct
        elif alts.Length = 2 && nullPermitted cu && (isNullary alts.[0] <> isNullary alts.[1]) then OneNullaryOneNonNullaryCase
        elif alts |> Array.forall isNullary then IntegerTagAllNullaryTypes
        elif useVirtualTag cu then
            if nullPermitted cu && Array.existsOne isNullary alts then VirtualTagWithNull else VirtualTag
        elif alts.Length < TaggingThresholdFixedConstant then
            if nullPermitted cu && Array.existsOne isNullary alts then RuntimeTypesWithNull else RuntimeTypes
        else
            IntegerTag

(I haven't grokked the struct path yet, so am assuming that some logic in there would split along the single case/multi case...) Anyway, something like that...

I'm hoping that this will then allow me to be binary serialization
compatible with RuntimeTypes
@manofstick
Copy link
Contributor Author

@dsyme

OK; well I've added the combination of UseNullAsTrueValue and UseVirtualFlag. I think now this would mean that I could be binary serialization compatible for FSharp.Cores Map and Set which currently use UseNullAsTrueValue+RunTimeTypes (implicit) combination; (although maybe would need to add an attribute on underlying abstract VirtualTag property to ensure it is ignored (but only has a getter, so is maybe ignored anyway?))

Do you have any suggestions as to checking if this is true?

@KevinRansom
Copy link
Contributor

@manofstick @dsyme

Do you think you will get back to this PR?

It has not been updated or disussed since mid September.

Kevin

@manofstick
Copy link
Contributor Author

@KevinRansom

If is designing discriminated unions from scratch there is no way I would have gone with RTTI, this method is much better. In fact i probably just would have gone with the real Tag in all cases, which although a little excessive, you are already paying for an object which had loads of overhead anyway. And then I probably would have provided this virtual tag as a memory saving measure. But unfortunately we're not designing things from scratch.

Now? It does provide some improvement in the map/set cases, but not as much as I would like due to having to also support the existing null case, which I replicated to try to ensure that serialisation wasn't damaged.

In other situations it is a niche performance boost. I think there is just one occasion where I would have liked this to exist. I ended up adding a dummy case to force the creation of the Integer Tag; I would have preferred this solution.

Anyway, so where does this leave us? I think it is a reasonable PR, but the results are marginal I guess. Do with it as you see fit.

@KevinRansom
Copy link
Contributor

@dsyme @manofstick

Don this is an interesting PR but I'm not sure that we should take it. There seems to be too much uncertainty about compatibility what is your take on this?

Kevin

@manofstick
Copy link
Contributor Author

@KevinRansom

An alternative could be to remove the application of it in set and map and just provide it as an option for users. Then it doesn't mess with any code base.

Possibly this could be pushed up to a argument for the compiler?

I.e. no argument, current behavior.

Argument could force virtual tag, or Integer Tag where no attribute is specified on a discriminated union.

Maybe bit excessive?

@KevinRansom
Copy link
Contributor

@manofstick

A compiler switch I expect would effectively fork the libraries into those that enabled it for disc unions and those that didn't. So an attribute per type or a heuristic perhaps. Or we work through the compatibility issues and convince ourselves we can enable it as "the mechanism."

Having a better idea for an already shipped feature ... is always difficult ... :-)

@manofstick
Copy link
Contributor Author

manofstick commented Dec 14, 2016

@KevinRansom

The compiler flag wouldn't cause a forking. It would just apply the attribute to discriminate unions that don't specify it. Any consumer of them would read the meta data which would contain the attribute and function correctly.

So I think it would work fine, but just needs time and effort that is possibly not available!

@KevinRansom
Copy link
Contributor

@manofstick that doesn't sound too bad then. Why don't we just make an assembly level attribute in that case, to turn it on or off, we would also need the on the class attribute to turn it on and off too.

The assembly level attribute can go in the attributes.fs for new library templates, and the developer can enable disable it on the specific types where some compatibility is needed.

Then we may be good to go ... although Don would probably prefer to hold out until F# 4.2.

@manofstick
Copy link
Contributor Author

@KevinRansom

I agree that it definitely not for this release!

Bump it and we can think about it sometime in the future.

@manofstick
Copy link
Contributor Author

manofstick commented Dec 15, 2016

@KevinRansom

I'm thinking about the assembly attribute and I think I quite like it.

The option available would be "optimise for space" which uses virtual tag or "optimise for speed" using Integer Tag. And both of them could do the"correct thing"in regards to the other available options, like null or using base type which would be well defined in documentation.

I think this would mean that this would actually be used as only have to define it one place. Hmmm. Yeah i like it.

@KevinRansom
Copy link
Contributor

@manofstick It's a candidate for 4.2 so we have plenty of time :-)

@KevinRansom
Copy link
Contributor

@manofstick

How is progress on this going. Clearly it's a candidate for F# 4.2 and so there is no particular rush.

Kevin

@manofstick
Copy link
Contributor Author

@KevinRansom

I'll get back to this eventually, but currently it's third in my priorities...

@dsyme dsyme changed the title WIP: Virtual Tags for Discriminated Unions [WIP] Virtual Tags for Discriminated Unions Dec 1, 2017
@realvictorprm
Copy link
Contributor

@dotnet-bot Test this please

@zpodlovics
Copy link

zpodlovics commented May 20, 2018

FYI: type equality is now optimized "away" by CoreCLR [1][3][4] - in theory this could be also used to implement virtual tags (at least for value types). Struct interface call is now without boxing [5][6] (dotnet core 2.1) Adopted code from [2] in C#:

public interface ITag { }
public struct Case1 : ITag { }
public struct Case2 : ITag { }

[MethodImpl(MethodImplOptions.NoInlining)]
static bool IsCase1<T>() where T : struct, ITag => typeof(T) == typeof(Case1);
[MethodImpl(MethodImplOptions.NoInlining)]
static bool IsCase2<T>() where T : struct, ITag => typeof(T) == typeof(Case2);

Beware! The F# struct equality will box the value (#526)!

[1] https://github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/System/Numerics/Vector.cs#L18
[2] https://github.com/dotnet/coreclr/issues/10282#issuecomment-287568247
[3] https://github.com/dotnet/corefx/issues/17273#issuecomment-287551967
[4] https://github.com/dotnet/coreclr/issues/17981
[5] dotnet/coreclr#17006
[6] dotnet/coreclr#14698

@KevinRansom
Copy link
Contributor

@manofstick,

I'm trying to cut down the number of inactive PR's in our repo, there are so many. Can this be closed or do you think it should be pursued?

You can of course maintain the changes on your clone until you are ready to get back to it.

Thanks

Kevin

@manofstick
Copy link
Contributor Author

@KevinRansom

I think it's a good thing, but way too much time has passed so would be best restarted, rather than trying to merge this I reckon.

But probably too late in the day for F# anyway... maybe in G# :-)

Closing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants