Member-only story
Data Science: Theory and Implementation of the Logistic Regression in Rust.
This story is part of my Data Science series.
This story attempts to explain the theory behind the logistic regression and to give a rudimentary implementation in Rust. Logistic regression is a classic algorithm in supervised machine learning that can be used to estimate conditional probabilities of a categorical outcome variable.
The feature variables for logistic regression are typically required to be continuous or binary. Note, it is always possible and straightforward to produce binary variables from a categorical variable. For continuous variables it is sometimes necessary to scale them to a common range. Alternatively one can partition such a variable by considering quantiles and then transform the quantile-based categories into binary variables.
The data we are going to use can be found here. They present a set of feature variables being mapped against a 0-1
outputs. For this reason, and also because it is the most typical usage, we restrict our attention in this story to outcomes of binary type.