Training multiple random forests on common data set #256

Closed
@mjmckp

Description

I am attempting to train random forests on a fixed 6 GB feature set, with N different labels and M different random forest parameter settings, i.e. N * M models in total. The overwhelming majority of the wall-clock time appears to be spent in the disk transpose operation, which runs N * M times when it should ideally run only once, since the feature set is common to all models.

To rectify this, is there any way to either:

  • train multiple random forests in the same pipeline, or
  • share the transposed data object between multiple training pipelines?

(A rough sketch of the workflow I'm after is below.)
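For illustration, here is a minimal sketch of the intended access pattern, written against NumPy and scikit-learn as stand-ins (this project's pipeline API differs, and the names and shapes here are made up): the feature matrix is prepared once, and the same in-memory object is reused for every (label, parameter-setting) pair instead of being re-transposed N * M times.

```python
# Sketch only: scikit-learn used as a stand-in for the actual library.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Shared feature set, loaded/transposed once (hypothetical shapes).
X = rng.random((10_000, 100))

# N label columns and M parameter settings (illustrative values).
labels = [rng.integers(0, 2, 10_000) for _ in range(3)]
param_grid = [{"n_estimators": 50}, {"n_estimators": 200}]

models = {}
for i, y in enumerate(labels):                # N labels
    for j, params in enumerate(param_grid):   # M parameter settings
        # X is reused as-is; no per-model re-transform of the features.
        models[(i, j)] = RandomForestClassifier(**params).fit(X, y)
```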

Metadata

Labels

perf (Performance and Benchmarking related), question (Further information is requested)
