Data Science: Implementing a Naive Bayesian Classifier using the Empirical Density.

applied.math.coding
5 min read · May 30, 2023

This story is part of my Data Science series.

In my previous story (see here), I provided an example implementation of a random forest for a classification problem based on this set of data.

As a further approach to solving this problem, I want to use a Naive Bayes classifier, whose implementation is the focus of this story.

In general, since the data set is quite large, the Naive Bayes approach is a good choice despite the rather strong assumptions it imposes on the data. You may find more details on this in the theoretical outline here.

As a quick reminder, Naive Bayes is based on the following relation between conditional probabilities:

P(C | X) = P(C) / P(X) * P(X | C)

where C is the categorical outcome random variable and X the feature random variable.
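The "naive" part of the method is the assumption that the individual features X_1, …, X_n are conditionally independent given the class, so that the likelihood factorizes as

P(X | C) = P(X_1 | C) * P(X_2 | C) * … * P(X_n | C)

Since P(X) does not depend on C, the predicted class is simply the one that maximizes P(C) * P(X_1 | C) * … * P(X_n | C).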

As we can see, in order to find an estimate for the probability of the outcome being 1 under the condition that the feature takes the value…
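Since the remainder of the original walkthrough is not shown here, the following is only a minimal sketch of how such an estimate could be computed from empirical densities: continuous features are discretized into bins and the conditional probabilities P(X_i | C) are taken as relative frequencies per class. The class name EmpiricalNaiveBayes, the binning strategy, and the Laplace smoothing are illustrative assumptions, not the story's actual implementation.

```python
# Minimal sketch (assumption, not the article's code): a Naive Bayes classifier
# whose densities are estimated empirically via per-class histograms.
import numpy as np


class EmpiricalNaiveBayes:
    def __init__(self, n_bins=10):
        self.n_bins = n_bins

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Bin edges per feature, shared across all classes.
        self.bin_edges_ = [
            np.histogram_bin_edges(X[:, j], bins=self.n_bins)
            for j in range(X.shape[1])
        ]
        # Class priors P(C) as empirical frequencies.
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        # Per-class, per-feature bin probabilities P(X_i | C), Laplace-smoothed
        # so that an empty bin does not force the posterior to zero.
        self.cond_probs_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            probs = []
            for j, edges in enumerate(self.bin_edges_):
                counts, _ = np.histogram(Xc[:, j], bins=edges)
                probs.append((counts + 1) / (counts.sum() + len(counts)))
            self.cond_probs_[c] = probs
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        preds = []
        for x in X:
            # Log-posterior up to the constant P(X), for each class.
            scores = {}
            for c in self.classes_:
                score = np.log(self.priors_[c])
                for j, edges in enumerate(self.bin_edges_):
                    # Map the feature value to its bin, clipped to the valid range.
                    k = np.clip(
                        np.searchsorted(edges, x[j], side="right") - 1,
                        0, len(edges) - 2,
                    )
                    score += np.log(self.cond_probs_[c][j][k])
                scores[c] = score
            preds.append(max(scores, key=scores.get))
        return np.array(preds)
```

Under these assumptions, the classifier would be used just like the random forest from the previous story: fit it on the training features and labels, predict on a hold-out set, and compare the accuracies. Working in log-space and smoothing the histogram counts are the two small design choices that keep the product of many small probabilities numerically stable.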
