Skip to content

Support "type granularity" option #854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

SteveSandersonMS
Copy link
Member

@SteveSandersonMS SteveSandersonMS commented Dec 12, 2019

@lewing @marek-safar @rynowak Following our discussions this week, I've tried two different approaches to linking more from the Microsoft.AspNetCore.* assemblies in Blazor apps.

Approach 1: Use XML configs

I created some new build tooling inside the aspnetcore repo that generates and embeds linker XML config files based on known patterns for Blazor. Its rules are:

  • preserve all IComponent types completely, since their constructors and property-setters are used through reflection only
  • preserve all [JSinterop] methods, since they are called through reflection only
  • preserve all DI service types, since their constructors are called through reflection only
  • preserve all EventArgs subclasses, since their property setters are called through reflection only

This does work, and allows us to enable linking for Microsoft.AspNetCore.* assemblies. It doesn't require any new linker features. However, it has two main drawbacks:

  • Creating, maintaining, and reasoning about these XML config files is going to be expensive. For example, we have to use Mono.Cecil and hard-code duplicate copies of some logic from the linker to generate method signature strings in exactly the same format as the linker expects. It also complicates our build process somewhat.
  • The actual size gains aren't as substantial as we hoped.
    • The core problem is that the XML rules aren't expressive enough to cover a requirement like "preserve IComponent types completely, but only if there is a reference to the type somewhere in user code". Since we have no choice but to just preserve all IComponent types completely, we end up with sizeable chains of references.
    • For example, the default app ends up including the Microsoft.AspNetCore.Components.Forms assembly even though it contains no forms components, simply because one of the types in that assembly is referenced by a component in Microsoft.AspNetCore.Components.Web that is not actually being used by default.

Approach 2: Define a new "type-level granularity" option

This is implemented by this PR, and the core logic is very simple. It's a per-assembly flag that means "if a type in this assembly is referenced at all, preserve that type completely". This is a perfect fit for common patterns in ASP.NET and Blazor applications, because for things like IComponent types and DI services, you will have a rooted reference to the type itself (e.g., in DI config) and not to any members in that type, but you do need to preserve those members.

When enabling this for the Microsoft.AspNetCore.* assemblies, we get two major benefits compared with the XML approach:

  • Virtually nothing to maintain. It needs no XML config files, and the resulting applications just work.
  • Better size reductions. See table below.
    • For example, in the scenario described above, it does manage to drop the Microsoft.AspNetCore.Components.Forms assembly completely.

Based on this, I consider the "type-level granularity" option to be preferable.

Size comparisons

These numbers are for the Blazor StandaloneApp test case in the ASP.NET Core repo, which is pretty much identical to a File->New Blazor WebAssembly app, built in Release configuration.

Scenario Total app transferred size (compressed)
Before any changes (i.e., HEAD of blazor-wasm branch) 2.17 MB
Approach 1 (XML configs) 2.11 MB
Approach 2 (type-level granularity) 1.98 MB

Note that for all of these, 0.95 MB of the compressed transferred data is things other than .NET dlls, so the percentage gains are higher if you only consider .dlls.

PR review notes

I know we didn't make a specific plan to add this feature to the linker. If you feel it's not an appropriate feature, or something is wrong with the implementation, I've very happy to discuss other options!

In particular, I tracked the "per-assembly granularity" option in a general-purpose dictionary of per-assembly flags, just to avoid over-specializing for this one feature. For example, on the command line you can pass -f typegranularity MyApp.SomeAssembly.

This might be useful for other per-assembly flags in the future. However if you disagree and think it's over-generalized, I'm happy to take the ActionFlags concept out and reduce it to a specialized option like -usetypegranularity MyApp.SomeAssembly.

@mrvoorhe
Copy link
Contributor

Your experience with approach 1 mirrors a post I made about the problems with link.xml awhile back. In that post I proposed new attributes that would be better suited than link.xml for scenarios like this.
#797 (comment)

Just thought I'd mention that in case you were holding out hope for a more maintainable way to strip more than your approach 2 would allow.

Your approach 2 is harmless enough to support in my opinion and does offer an effort/reward ratio that would not be filled by an attribute based api.

@sbomer
Copy link
Member

sbomer commented Dec 14, 2019

The implementation looks good, nice and simple. I agree that there's a need for some kind of type-level granularity option - it has been suggested or requested a few times already (see also #799).

My main concern is that we're introducing another linker "concept". We already have AssemblyAction, TypePreserve, and MethodAction to influence what is kept from Assemblies/Types/Methods. Historically this has been a source of bugs (particularly AssemblyAction IMO), and this introduces another orthogonal concept that we need to make sure combines well with the others. What do you think of making this AssemblyAction.LinkTypeGranularity (or similar) instead? I see that this sort of cuts across assemblies and type preservation behavior, so maybe it doesn't quite fit into AssemblyAction, but I wanted to ask the question at least.

@vitek-karas and I were also talking about how this would influence the diagnostic analyzer we're working on. Now when the linker attempts to detect a pattern like typeof(Foo).GetMethod("M"), when Foo comes from a TypeGranularity assembly, the logic doesn't need to look inside GetMethod calls, and so the pattern matching will need to take the granularity into account when reporting a pattern as understood or not. (It doesn't do this yet, so nothing for you to worry about - just wanted to share our thinking).

@SteveSandersonMS
Copy link
Member Author

What do you think of making this AssemblyAction.LinkTypeGranularity (or similar) instead?

I did originally start by adding a new AssemblyAction called LinkTypes, but backed away from that approach when I saw how much extra maintenance burden it was going to create for you. I think it would introduce many more opportunities for bugs if it's done that way, because then, every single time you check for the Link action, you also have to remember to check for LinkTypes too. If you ever forget to do that, you'll have a bug in the LinkTypes mode. In the current codebase, there are around 20 checks for Link mode, so all of them would need to be updated to check for "Link or LinkTypes".

I'm totally happy with doing it that way if you prefer. Let me know and I'll update this PR to work that way. However I do think the orthogonal flag approach will make this much more maintainable in the long run as you don't have to care about differing linking strategies in every place - there's just one concept of Link and the only place you need to care about the "type-level granularity" behavior is in the one place where the flag is checked for.

@marek-safar
Copy link
Contributor

I kind of look at this as a quick solution which we used to do in the past and migrated away from it because it didn't give us sufficient size saving. Some of the problems can be solved on much more granular bases today but I agree for some of the problems we'll still need to come up with the right approach.

Now, to your suggested solution. I'm fine to accept something like this but prefer to implement it in XML descriptors fully instead. They are the bazooka for these kinds of issues if someone does not want to do the extra work (btw, all frameworks dlls we worked with went through the same path).

Instead of adding another logic entry point and mess with logical ordering what takes precedence. We could expand existing XML preserve logic with a new attribute which controls marking behaviour. For example mark with values like always (the default today) and referenced.

The change on your side would be something like bellow with the flexibility to fully override any type with their logic.

<linker>
    <assembly fullname="Microsoft.AspNetCore.Components, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null" preserve="methods" mark="referenced" >
    </assembly>
</linker>

@SteveSandersonMS
Copy link
Member Author

@marek-safar I'm afraid I don't quite understand your comment. The config syntax you're suggesting doesn't seem to say anything about type-level marking granularity.

Are you saying that preserve="methods" mark="referenced" would produce the same behavior as the -f typegranularity option in this PR? If so I don't know why either preserve="methods" or mark="referenced" would imply that.

Could you clarify?

@SteveSandersonMS
Copy link
Member Author

@marek-safar Thinking about this more, my best guess is that you mean the syntax for per-type linking would look something like the following:

<linker>
    <assembly fullname="Microsoft.AspNetCore.Components">
        <type fullname="*" preserve="all" mark="referenced" />
    </assembly>
</linker>

That would make sense. Is that how it would work? Can you confirm also that it is (or would be) possible to specify <type fullname="*">? We don't want to be in a position where we have to maintain a list of all types in the assembly, as that could easily lead to bugs.

@marek-safar
Copy link
Contributor

The config syntax you're suggesting doesn't seem to say anything about type-level marking granularity.

It does. The XML descriptor already works on the hierarchical level. See for example this test https://github.com/mono/linker/blob/master/test/Mono.Linker.Tests.Cases/LinkXml/AssemblyWithPreserveAll.xml, https://github.com/mono/linker/blob/master/test/Mono.Linker.Tests.Cases/LinkXml/AssemblyWithPreserveAll.cs

About preserve="all" I don't think you need to use it. It should be preserve="methods" unless you are touching fields which are referenced only by reflection.

@SteveSandersonMS
Copy link
Member Author

OK, in that case the syntax <assembly ... preserve="methods" mark="referenced"> seems very ambiguous. Does this definitely imply that the thing being marked is a type and not (say) a method or everything in the assembly? The concept of types isn't mentioned at all so I find it surprising that it would mean that.

@marek-safar
Copy link
Contributor

@SteveSandersonMS I don't see that pattern ambiguous, you can have methods only inside types. Your example will work too as linker supports regex patterns for type names (it's just a bit more verbose).

@SteveSandersonMS
Copy link
Member Author

OK, thanks for clarifying! I guess what I saw as ambiguous is mark="referenced": it's not saying what is being referenced. What we need it to mean is "when the enclosing type is referenced" but that's not explicitly stated. Another person might read as meaning "preserve the methods when those methods themselves are referenced". Yes I know that would be a redundant definition, but it's still the most literal interpretation. Perhaps it should be mark="WhenTypeReferenced" to make that more explicit.

Regardless, it sounds clear now that you do expect that syntax to behave in the way we need. So, do you think this work could be scheduled to be done on the linker? Is there a particular milestone or ETA we could put this down for?

It looks like this is outside the scope of what I can do a PR for, given how much our discussion has shown there are subtleties about how XML config is meant to be interpreted that I don't know.

@marek-safar
Copy link
Contributor

I am fine with WhenTypeReferenced naming. I used reference as the starting point and I am glad you come up with better specification when the marking should apply in your case.

@vitek-karas
Copy link
Member

I'm curious about the intended usage of this. Problems like these tend to be "viral" that is they span multiple assemblies (potentially the entire application), and so per-assembly settings can leave holes in the solution.

For example in this case if I read it correctly any type which implements IComponent (and is actually referenced) needs to be kept in its entirety. But the proposed solution would do this on per-assembly level only. Does this mean that you will enable this new setting on all assemblies which may contain types implementing IComponent? How are you going to determine that fact? What if you don't own such assembly? What if the app implements such a type?

@mrvoorhe
Copy link
Contributor

It sounds like we should take a step back and reevaluate the ideal solution. Going back to the points listed in approach (1).

preserve all IComponent types completely, since their constructors and property-setters are used through reflection only

I described an attribute that would provide almost this exact behavior. See in RequireSubclasses in #797.

preserve all [JSinterop] methods, since they are called through reflection only

While I did not list an attribute that would help with this case in my reply's on #797, I do think this is another case where an attribute would make sense. I want something similar for [JsonProperty] (Newtonsoft.Json) and and [Serialized] (UnityEngine). The attribute could be something like [RequireMembersWithAttribute] (the name is horrible I know). You would put this on the attribute type and the linker would mark all members that have this attribute. We could add some sort of optional enum for controlling what should trigger the marking. i.e. Always or ParentMarked.

preserve all DI service types, since their constructors are called through reflection only

Not sure what DI is but this sounds addressable by something like the RequireSubclasses attribute idea.

preserve all EventArgs subclasses, since their property setters are called through reflection only

addressable by something like the RequireSubclasses attribute idea.

If approach 2 is feeling forced and not like a good fit, why not just shoot for the better approach?

@SteveSandersonMS
Copy link
Member Author

Thanks to help from @marek-safar we now understand this change is unnecessary, since the linker can already do this quite easily via XML config like the following:

<?xml version="1.0" encoding="utf-8" ?>
<linker>
    <assembly fullname="SomeAssemblyName">
        <type fullname="*" preserve="all" required="false" />
    </assembly>
</linker>

So, I'll close this as there's no reason to have a separate option to get the same results as this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants