Data Science: Implementing a Random Forest in Rust.
This story is part of my Data Science series.
Random forests come in many variations. The common idea is that instead of building a single tree model, one trains an ensemble of tree models, each with slightly different input parameters. These input parameters could, for instance, be a randomly selected subset of the features, or a random sample of the training data.
Predictions with such an ensemble are typically made by ‘majority voting’ in the case of classification, and by averaging in the case of regression.
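As a small illustration of that aggregation step, here is a minimal sketch of how per-tree predictions could be combined in Rust. The helper names `majority_vote` and `average`, and the use of `u32` class labels, are illustrative assumptions rather than part of the final implementation:

```rust
use std::collections::HashMap;

/// Combine the class predictions of the individual trees by majority voting.
/// (Hypothetical helper; the label type and tie-breaking behaviour are assumptions.)
fn majority_vote(predictions: &[u32]) -> Option<u32> {
    let mut counts: HashMap<u32, usize> = HashMap::new();
    for &label in predictions {
        *counts.entry(label).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .max_by_key(|&(_, count)| count)
        .map(|(label, _)| label)
}

/// For regression, the ensemble prediction is simply the mean of the tree outputs.
fn average(predictions: &[f64]) -> f64 {
    predictions.iter().sum::<f64>() / predictions.len() as f64
}
```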
In this story I present an implementation of a random forest for a classification problem based on the data set found here.
This data set is quite large and thus not well suited to training a single tree on all of it. For this reason I choose to build an ensemble of trees, each trained on a randomly selected sample of the training data.
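To make that subsampling step concrete, the following sketch draws a random subset of training rows for one tree, assuming the `rand` crate is available; the generic `Row` type and the choice of sampling without replacement are assumptions for illustration only:

```rust
use rand::seq::SliceRandom;
use rand::thread_rng;

/// Draw a random subsample of `sample_size` rows from the training data;
/// each tree in the forest is then trained on its own subsample.
/// (Sketch only; `Row` stands in for whatever record type the data set uses.)
fn random_sample<Row: Clone>(training_data: &[Row], sample_size: usize) -> Vec<Row> {
    let mut rng = thread_rng();
    training_data
        .choose_multiple(&mut rng, sample_size)
        .cloned()
        .collect()
}
```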
Implementation (Tree):
Before implementing a random forest, we first need to implement a tree model: