Create methods not being called when loading models from disk #4385

Closed
@antoniovs1029

Some classes define a Create method that is supposed to be called when loading a model from disk, but that method is never actually called.

For example, I've noticed this unexpected behavior in the following classes:

  1. LinearBinaryModelParameters
  2. GamBinaryModelParameters
  3. FastTreeBinaryModelParameters
  4. FastForestBinaryModelParameters
  5. PlattCalibrator

When loading a model that uses any of these classes, their Create method is expected to be called, but, as stated, it is not.

I also noticed this behavior in three other classes in the Calibrator.cs file (namely, the ParameterMixingCalibratedModelParameters, ValueMapperCalibratedModelParameters, and FeatureWeightsCalibratedModelParameters classes), but I've fixed it for those specific classes in my recent PR #4306 (which, at the time of writing, is still waiting for approval). It was while working on that PR that I noticed this problem in these classes, and I commented about it there, but it seems appropriate to open this separate issue to document it better, since the problem affects different classes across different files.

In fact, as I will describe below, there's a certain code pattern that is related to this problem, and I've seen this pattern in other classes of the Calibrator.cs file as well. So, the problem I describe here might be affecting even more classes than the ones I've mentioned.

Cause of the problem

The CreateInstanceCore method in the ComponentCatalog.cs file is responsible for trying to call the Create method of a class when loading a model from disk.

The current implementation first checks whether the class has a constructor with parameters (IHostEnvironment env, ModelLoadContext ctx) and, if so, invokes that constructor through reflection. Only if the class doesn't have such a constructor does it check for a Create(IHostEnvironment env, ModelLoadContext ctx) method, which it then invokes through reflection.

This behavior is not desired for the classes I've mentioned (and potentially other classes), since they define both a constructor and a Create method with those parameters, and in these cases the Create method is expected to be called instead of the constructor. Thus, any class that follows the pattern of having a private or internal constructor with the (env, ctx) parameters and also a Create method might be affected by this problem.
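To make the pattern concrete, here is a minimal, self-contained sketch. It is not the actual ComponentCatalog code: IHostEnvironment and ModelLoadContext are redeclared as empty stand-ins so the snippet compiles on its own, and ExampleModelParameters, StubEnvironment, and LoaderSketch are hypothetical names I made up for illustration. It only imitates the "constructor first, Create second" lookup order described above.

```csharp
using System;
using System.Reflection;

// Empty stand-ins for the real ML.NET types (assumption: only their identity matters here).
public interface IHostEnvironment { }
public sealed class StubEnvironment : IHostEnvironment { }
public sealed class ModelLoadContext { }

// Hypothetical class following the pattern: private (env, ctx) constructor plus a Create method.
public sealed class ExampleModelParameters
{
    // Records which path produced the instance.
    public bool CreatedThroughCreate { get; private set; }

    private ExampleModelParameters(IHostEnvironment env, ModelLoadContext ctx) { }

    public static ExampleModelParameters Create(IHostEnvironment env, ModelLoadContext ctx)
    {
        // Checks like these are skipped when the constructor is invoked directly.
        if (env == null) throw new ArgumentNullException(nameof(env));
        if (ctx == null) throw new ArgumentNullException(nameof(ctx));
        return new ExampleModelParameters(env, ctx) { CreatedThroughCreate = true };
    }
}

public static class LoaderSketch
{
    private static readonly Type[] Signature = { typeof(IHostEnvironment), typeof(ModelLoadContext) };

    // Simplified imitation of the lookup order described in this issue.
    public static object CreateInstance(Type type, IHostEnvironment env, ModelLoadContext ctx)
    {
        // Step 1: look for an (env, ctx) constructor, including private and internal ones.
        ConstructorInfo ctor = type.GetConstructor(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, Signature, null);
        if (ctor != null)
            return ctor.Invoke(new object[] { env, ctx });

        // Step 2: only reached when no such constructor exists.
        MethodInfo create = type.GetMethod(
            "Create",
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic,
            null, Signature, null);
        if (create != null)
            return create.Invoke(null, new object[] { env, ctx });

        throw new InvalidOperationException("No (env, ctx) constructor or Create method on " + type);
    }
}
```

Calling `LoaderSketch.CreateInstance(typeof(ExampleModelParameters), new StubEnvironment(), new ModelLoadContext())` returns an instance whose `CreatedThroughCreate` is false: the private constructor is found in step 1, so step 2 never runs and the checks in Create are bypassed.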

Since the Create method typically only runs some security checks before calling the constructor, it turns out that the overall process of loading models doesn't seem affected by this issue. But the problem remains that these security checks are being missed along with whatever behavior the Create method adds to the process.

As explained by @yaeldekel in her comment on my recent PR #4306 (under "Answer 1"), this problem might have been introduced before the official release of ML.NET, when ComponentCatalog was modified in a way that permitted the CreateInstanceCore method to use private and internal constructors, which it didn't do before. So, before those changes were made, classes could have private or internal constructors and a Create method, and the latter would be called as intended. But now the constructor gets called instead, and this is the case for all the classes mentioned in this issue.

Since these changes were made while trying to internalize as many APIs as possible before the official ML.NET release, many constructors were also made private or internal, and thus the changes in ComponentCatalog that permit using those constructors are also necessary.

Because of this, further investigation is needed to know for sure which classes are affected, so that the problem can be fixed without affecting all the other classes that don't present this situation.
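One possible way to start that investigation is a reflection scan that lists every concrete class declaring both a non-public (env, ctx) constructor and a static Create method with the same parameters. The sketch below is only a starting point under that assumption: AffectedTypeScanner and its members are hypothetical names, and the caller passes in the assembly to scan and the two parameter types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class AffectedTypeScanner
{
    // Yields concrete classes that have both a private/internal (envType, ctxType)
    // constructor and a static Create(envType, ctxType) method declared on the class itself.
    public static IEnumerable<Type> FindCandidates(Assembly assembly, Type envType, Type ctxType)
    {
        Type[] signature = { envType, ctxType };
        const BindingFlags anyVisibility = BindingFlags.Public | BindingFlags.NonPublic;

        foreach (Type type in GetLoadableTypes(assembly))
        {
            if (!type.IsClass || type.IsAbstract)
                continue;

            ConstructorInfo ctor = type.GetConstructor(
                BindingFlags.Instance | anyVisibility, null, signature, null);
            MethodInfo create = type.GetMethod(
                "Create", BindingFlags.Static | anyVisibility, null, signature, null);

            // A public constructor is outside the pattern described in this issue.
            if (ctor != null && !ctor.IsPublic && create != null)
                yield return type;
        }
    }

    // GetTypes can throw when some types fail to load; keep the ones that did load.
    private static IEnumerable<Type> GetLoadableTypes(Assembly assembly)
    {
        try
        {
            return assembly.GetTypes();
        }
        catch (ReflectionTypeLoadException ex)
        {
            return ex.Types.Where(t => t != null);
        }
    }
}
```

The result is only a list of candidates. Each class would still need to be checked by hand, since a Create method that does nothing beyond calling the constructor is not really affected.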
