`ITransformer` yields `IRowToRowMapper`, make prediction engine faster

Things like `ITransformer` (or its predecessor, `IDataTransform`), given an `IDataView`, can produce another `IDataView`. This works well for streaming over billions of records, but for just one record, the whole machinery around setting up a cursor is a lot of overhead.

For example, consider the prediction engine.

https://github.com/dotnet/machinelearning/blob/ecb9126691401a3142e59139290bf78ed9bc68ad/src/Microsoft.ML.Api/PredictionEngine.cs#L149

What this currently does is compose an `IDataView` consisting of *one* item, then apply the transform chain to it, and so on. But this is pretty heavyweight. Setting up the dynamically typed delegates, binding to the appropriate types, and so on, for every single data point absolutely dwarfs any actual computation that happens in many pipelines. Again, this system is fine if you're doing what it was designed to do, streaming efficiently over billions of records, but on a small scale it's not great.

There is an existing `IRowToRowMapper` interface that we might be able to exploit.

https://github.com/dotnet/machinelearning/blob/ecb9126691401a3142e59139290bf78ed9bc68ad/src/Microsoft.ML.Core/Data/ISchemaBindableMapper.cs#L91

This interface is somewhat analogous to `IDataView`, and the `IRowToRowMapper.GetRow` method is somewhat analogous to `IDataView.GetRowCursor`. This is something many existing `IDataTransform` implementations would implement, to enable faster mapping. We can exploit this same functionality through `ITransformer`.
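To make the row-to-row idea concrete, here is a minimal, language-neutral sketch in Python rather than C#. Every name in it (`ScaleMapper`, `MutableInput`, `get_row`) is hypothetical and is not the ML.NET API; it only assumes that a row can be modeled as a set of getters. The point it illustrates is that the binding work happens once, when the output row is requested, and the resulting getters are then reused for every example.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a row: column name -> getter for the current value.
Row = Dict[str, Callable[[], float]]


class ScaleMapper:
    """Toy row-to-row mapper: adds a column holding `source * factor`."""

    def __init__(self, source: str, output: str, factor: float) -> None:
        self.source = source
        self.output = output
        self.factor = factor

    def get_row(self, input_row: Row) -> Row:
        # Binding happens once, here. The returned getters are reused afterwards.
        src = input_row[self.source]
        out = dict(input_row)
        out[self.output] = lambda: src() * self.factor
        return out


class MutableInput:
    """Holds the current example; its getters read whatever is set right now."""

    def __init__(self) -> None:
        self.values: Dict[str, float] = {}

    def row(self, columns: Iterable[str]) -> Row:
        return {c: (lambda c=c: self.values[c]) for c in columns}


def demo() -> List[float]:
    current = MutableInput()
    current.values = {"x": 0.0}
    # One-time setup: bind the mapper to the input row.
    out = ScaleMapper("x", "x2", 2.0).get_row(current.row(["x"]))
    results = []
    for x in (1.0, 2.5, 4.0):
        current.values["x"] = x      # swap in the next example
        results.append(out["x2"]())  # no cursor, no re-binding
    return results
```

In this toy version, the per-example cost is just updating the input and calling a getter, which is the property the prediction engine would want.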

So we can do this:

* Allow `ITransformer`s, in addition to providing `IDataView`s through transformation of datasets, to *optionally* provide `IRowToRowMapper` implementors as well.

* Exploit this new functionality to make `PredictionEngine` faster, on applicable pipelines.

This will also allow the prediction engine to check whether a pipeline really can be expressed in a row-to-row capacity.
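The proposal can be sketched roughly as follows, again in Python rather than C# and with entirely hypothetical names (`Transformer`, `try_get_row_mapper`, `is_row_to_row`); none of this is the actual ML.NET surface. It assumes a transformer may optionally hand back a row mapper, and that the engine uses the fast path only when every transformer in the chain does so, falling back to the dataset path otherwise.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]
RowMapper = Callable[[Row], Row]


class Transformer:
    """Hypothetical transformer: dataset in, dataset out."""

    def transform(self, rows: List[Row]) -> List[Row]:
        raise NotImplementedError

    def try_get_row_mapper(self) -> Optional[RowMapper]:
        return None  # default: no row-to-row capability


class AddOne(Transformer):
    """Maps each row independently, so it can offer a row mapper."""

    def transform(self, rows: List[Row]) -> List[Row]:
        return [{**r, "y": r["x"] + 1.0} for r in rows]

    def try_get_row_mapper(self) -> Optional[RowMapper]:
        return lambda r: {**r, "y": r["x"] + 1.0}


class DropNegative(Transformer):
    """A filter is not one-row-in, one-row-out, so it offers no mapper."""

    def transform(self, rows: List[Row]) -> List[Row]:
        return [r for r in rows if r["x"] >= 0.0]


class PredictionEngine:
    def __init__(self, chain: List[Transformer]) -> None:
        self.chain = chain
        mappers = [t.try_get_row_mapper() for t in chain]
        # The pipeline is row-to-row only if every step provides a mapper.
        self.is_row_to_row = all(m is not None for m in mappers)
        self._mappers = mappers if self.is_row_to_row else None

    def predict(self, example: Row) -> Optional[Row]:
        if self._mappers is not None:
            # Fast path: apply the mappers directly to the single row.
            row = example
            for mapper in self._mappers:
                row = mapper(row)
            return row
        # Slow path: wrap the example in a one-item dataset.
        rows = [example]
        for t in self.chain:
            rows = t.transform(rows)
        return rows[0] if rows else None
```

With this shape, `PredictionEngine([AddOne()]).is_row_to_row` is true, while a chain containing `DropNegative` reports false and still produces results through the dataset path.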


