ITransformer yields IRowToRowMapper, make prediction engine faster #986

Closed
@TomFinley

Description

Things like ITransformer (or its predecessor, IDataTransform), given an IDataView, can produce another IDataView. This works well for things like streaming over billions of records, but for just one record, the whole machinery around setting up a cursor is a lot of overhead.

For example, consider the prediction engine.

public sealed class PredictionEngine<TSrc, TDst>

What this currently does is compose an IDataView consisting of one item, apply the transform chain to it, and so on. But this is pretty heavyweight. Setting up the dynamically typed delegates, binding to the appropriate types, and so on, for every single data point absolutely dwarfs any actual computation that happens in many pipelines. Again, this system is fine if you're doing what it was designed to do, which is to stream efficiently over billions of records, but on a small scale it's not great.

There is an existing IRowToRowMapper interface that we might be able to exploit.

This interface is somewhat analogous to IDataView, and the IRowToRowMapper.GetRow method is somewhat analogous to IDataView.GetRowCursor. Many existing IDataTransform implementations also implement this interface, to enable faster mapping. We can exploit this same functionality through ITransformer.

So we can do this:

  • Allow ITransformer implementations, in addition to producing IDataViews by transforming datasets, to optionally provide an IRowToRowMapper as well.

  • Exploit this new functionality to make PredictionEngine faster, on applicable pipelines.

This will also let the prediction engine check whether a pipeline really can be expressed in a row-to-row capacity.
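
To make the idea concrete, here is a rough sketch of how the pieces could fit together. Everything in it is hypothetical: ISketchTransformer, ISketchRowMapper, TryGetRowMapper, ScaleTransformer and SketchPredictionEngine are simplified stand-ins invented for illustration, with a single float standing in for a row. None of them are the actual ML.NET types or signatures. The point is only that the getters are bound once, when the engine is constructed, and each prediction afterwards is just a delegate call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for IRowToRowMapper: binds to an input getter once
// and returns an output getter that can be invoked for every record.
public interface ISketchRowMapper
{
    Func<float> GetRow(Func<float> input);
}

// Hypothetical stand-in for ITransformer.
public interface ISketchTransformer
{
    // Bulk path: lazily transforms a whole stream (stand-in for IDataView to IDataView).
    IEnumerable<float> Transform(IEnumerable<float> data);

    // Optional per-row path: returns null if the transform is not row-to-row.
    ISketchRowMapper TryGetRowMapper();
}

// A trivially row-to-row transform: multiplies each value by a constant.
public sealed class ScaleTransformer : ISketchTransformer, ISketchRowMapper
{
    private readonly float _factor;

    public ScaleTransformer(float factor) { _factor = factor; }

    public IEnumerable<float> Transform(IEnumerable<float> data)
    {
        foreach (var x in data)
            yield return x * _factor;
    }

    public ISketchRowMapper TryGetRowMapper() => this;

    public Func<float> GetRow(Func<float> input) => () => input() * _factor;
}

// Hypothetical prediction engine: all binding happens once in the constructor.
public sealed class SketchPredictionEngine
{
    private float _current; // single input slot, overwritten on every prediction
    private readonly Func<float> _output;

    public SketchPredictionEngine(IReadOnlyList<ISketchTransformer> chain)
    {
        Func<float> getter = () => _current;
        foreach (var transformer in chain)
        {
            var mapper = transformer.TryGetRowMapper();
            if (mapper == null)
                throw new InvalidOperationException("Pipeline cannot be expressed row-to-row.");
            getter = mapper.GetRow(getter); // chain the mappers once, up front
        }
        _output = getter;
    }

    public float Predict(float input)
    {
        _current = input;
        return _output(); // no cursor, no per-call binding
    }
}

public static class Program
{
    public static void Main()
    {
        var chain = new ISketchTransformer[] { new ScaleTransformer(2f), new ScaleTransformer(3f) };
        var engine = new SketchPredictionEngine(chain);
        Console.WriteLine(engine.Predict(1.5f)); // prints 9
    }
}
```

In this sketch the constructor is also where the row-to-row check happens: a transformer that returns null from TryGetRowMapper causes the engine to fail at construction time rather than on the first prediction. A real design would need to decide whether to throw there or fall back to the IDataView path.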
