📝 Description
I would like to contribute a new module to the linfa-trees crate that implements the Random Forest algorithm for classification tasks. This will expand linfa-trees from single decision trees into ensemble learning, aligning closely with scikit-learn's functionality in Python.
🚀 Motivation
Random Forests are a powerful ensemble learning method used widely in classification tasks. They provide:
Robustness to overfitting
Better generalization than single trees
Feature importance estimates
Currently, linfa-trees provides support for single decision trees. By adding Random Forests, we unlock ensemble learning for the Rust ML ecosystem.
📐 Proposed Design
🔹 New Module
A new file will be added:
linfa-trees/src/decision_trees/random_forest.rs
This will include:
RandomForestClassifier<F: Float>
RandomForestParams<F> (unchecked)
RandomForestValidParams<F> (checked)
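The unchecked/checked split could look roughly like the sketch below. It uses only the standard library so it can be read in isolation; the field names (`n_trees`, `max_features`, `bootstrap`), the defaults, and the `ParamError` type are assumptions made for illustration, and the `F: Float` type parameter is dropped for brevity. In the real module the `check` step would go through linfa's `ParamGuard` rather than an inherent method.

```rust
use std::fmt;

/// Hypothetical unchecked parameter set (field names are illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub struct RandomForestParams {
    n_trees: usize,
    max_features: f64, // fraction of features considered per tree, in (0, 1]
    bootstrap: bool,
}

/// Checked counterpart: only obtainable through `RandomForestParams::check`.
#[derive(Debug, Clone, PartialEq)]
pub struct RandomForestValidParams {
    n_trees: usize,
    max_features: f64,
    bootstrap: bool,
}

/// Hypothetical validation error type.
#[derive(Debug, Clone, PartialEq)]
pub enum ParamError {
    ZeroTrees,
    InvalidMaxFeatures(f64),
}

impl fmt::Display for ParamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParamError::ZeroTrees => write!(f, "n_trees must be at least 1"),
            ParamError::InvalidMaxFeatures(v) => {
                write!(f, "max_features must be in (0, 1], got {}", v)
            }
        }
    }
}

impl RandomForestParams {
    /// Assumed defaults, chosen only for this sketch.
    pub fn new() -> Self {
        Self { n_trees: 100, max_features: 1.0, bootstrap: true }
    }

    pub fn n_trees(mut self, n_trees: usize) -> Self {
        self.n_trees = n_trees;
        self
    }

    pub fn max_features(mut self, fraction: f64) -> Self {
        self.max_features = fraction;
        self
    }

    pub fn bootstrap(mut self, bootstrap: bool) -> Self {
        self.bootstrap = bootstrap;
        self
    }

    /// Validate and convert into the checked type.
    pub fn check(self) -> Result<RandomForestValidParams, ParamError> {
        if self.n_trees == 0 {
            return Err(ParamError::ZeroTrees);
        }
        // Written as a negated range test so that NaN is rejected as well.
        if !(self.max_features > 0.0 && self.max_features <= 1.0) {
            return Err(ParamError::InvalidMaxFeatures(self.max_features));
        }
        Ok(RandomForestValidParams {
            n_trees: self.n_trees,
            max_features: self.max_features,
            bootstrap: self.bootstrap,
        })
    }
}

impl RandomForestValidParams {
    pub fn n_trees(&self) -> usize {
        self.n_trees
    }

    pub fn max_features(&self) -> f64 {
        self.max_features
    }

    pub fn bootstrap(&self) -> bool {
        self.bootstrap
    }
}
```

With this shape, `RandomForestParams::new().n_trees(10).max_features(0.5).check()` yields a validated value, while `n_trees(0)` or an out-of-range `max_features` is rejected before any training starts.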
🔹 Trait Implementations
I will implement the following traits according to linfa conventions:
ParamGuard for parameter validation
Fit to train the forest using bootstrapped data and random feature subsetting
PredictInplace and Predict to perform inference via majority voting
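The three building blocks named above (bootstrapped rows, random feature subsets, majority voting) can be sketched with the standard library alone. Everything here is an assumption made for illustration: the helper names, the small xorshift generator standing in for a proper RNG crate, and the tie-breaking rule (smallest label wins). The actual implementation would operate on linfa's `Dataset` and ndarray types.

```rust
use std::collections::HashMap;

/// Tiny deterministic xorshift generator so the sketch needs no external crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // A zero state would stay zero forever, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in `0..n`; `n` must be non-zero. Modulo bias is ignored in this sketch.
    pub fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Bootstrap: draw `n_samples` row indices with replacement.
pub fn bootstrap_indices(n_samples: usize, rng: &mut XorShift64) -> Vec<usize> {
    (0..n_samples).map(|_| rng.below(n_samples)).collect()
}

/// Random feature subset: `k` distinct column indices via a partial Fisher-Yates shuffle.
pub fn feature_subset(n_features: usize, k: usize, rng: &mut XorShift64) -> Vec<usize> {
    let k = k.min(n_features);
    let mut idx: Vec<usize> = (0..n_features).collect();
    for i in 0..k {
        let j = i + rng.below(n_features - i);
        idx.swap(i, j);
    }
    idx.truncate(k);
    idx
}

/// Majority vote over the labels predicted by each tree for one sample.
/// Ties go to the smallest label; an empty slice yields `None`.
pub fn majority_vote(votes: &[usize]) -> Option<usize> {
    let mut counts: HashMap<usize, usize> = HashMap::new();
    for &label in votes {
        *counts.entry(label).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
        .map(|(label, _)| label)
}
```

In a forest, each tree would be fit on the rows from `bootstrap_indices` restricted to the columns from `feature_subset`, and prediction would call `majority_vote` once per sample over the per-tree outputs.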
🔹 Example
An example will be added in:
linfa-trees/examples/iris_random_forest.rs
Using the Iris dataset from linfa-datasets.
🔹 Benchmark (Optional)
If approved, I can also add a benchmark using Criterion:
linfa-trees/benches/random_forest.rs
📁 File Integration Plan
src/lib.rs: Re-export random_forest::*
src/decision_trees/mod.rs: pub mod random_forest;
README.md: Update with a section on Random Forests and example usage
examples/iris_random_forest.rs: Demonstrates training and evaluation
📦 API Preview
✅ Conformity with CONTRIBUTING.md
Uses the Float trait for f32/f64 compatibility
Follows the Params → ValidParams validation pattern
Implements Fit, Predict, and PredictInplace using Dataset
Optional serde support via feature flag
Will include unit tests and optionally benchmarks
🙋‍♂️ Request
Please let me know if you're open to this contribution. I’d be happy to align with maintainers on:
Feature scope (classifier first, regressor later?)
Benchmarking standards
Integration strategy (e.g., reuse of DecisionTree)
Looking forward to your guidance!