Probability in Data Science

funmi somoye
3 min read · Oct 16, 2020

Whether you are just starting out, are further along in your data science journey, or simply want to understand probability, you have come to the right place.

Probability is the “opposite” of Statistics. Here’s what I mean:

Statistics starts from the data and tries to infer the possible relationships or causes behind it. That’s where you have your summary statistics, like the mean, median, mode, and standard deviation, as well as the five-number and seven-number summaries.

Probability, on the other hand, already has the causes and now wants to predict the data.

Examples

Let’s just say it is the outcomes from probability that form some data.

Basically, probability is trying to measure how likely it is for an event to occur. Got it?

So when a fair coin is tossed, it can come out either Head or Tail. That’s just two possible outcomes!

So, the probability of our coin landing as Head is 1/2 (1 out of 2 possible outcomes) and the probability of landing as Tail is also 1/2.

Thus,

Probability of an event happening = Number of ways it can happen / Total number of equally likely outcomes
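The formula above can be sketched in a few lines of Python. This is only a minimal illustration that assumes every outcome is equally likely, as with a fair coin; the function name `classical_probability` is my own and not from any library.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Favourable outcomes divided by all outcomes (equally likely outcomes assumed)."""
    sample_space = set(sample_space)
    # Only count outcomes that are actually possible in this experiment.
    favourable = set(event) & sample_space
    return Fraction(len(favourable), len(sample_space))


coin = {"Head", "Tail"}
p_head = classical_probability({"Head"}, coin)
p_tail = classical_probability({"Tail"}, coin)

print(p_head, p_tail)  # 1/2 1/2
```

Using `Fraction` keeps the answer exact, so 1/2 stays 1/2 instead of becoming 0.5.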

What about a die? There are 6 possible outcomes when a die is rolled: you can get a 1, 2, 3, 4, 5, or 6. This reminds me of the days when I frequently played Ludo.

Your chance of rolling a 4, when a 4 is all you need to win, is 1/6, because only one face of the die has a 4 on it.
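One way to convince yourself of the 1/6 is to roll a simulated die many times and count the 4s. The sketch below assumes a fair die; the seed and the number of rolls are arbitrary choices of mine, there only so the run is repeatable.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Exact answer: one favourable face out of six equally likely faces.
exact = Fraction(faces.count(4), len(faces))

# Simulated answer: roll many times and measure how often a 4 shows up.
rng = random.Random(42)  # fixed, arbitrary seed
rolls = [rng.choice(faces) for _ in range(60_000)]
estimate = rolls.count(4) / len(rolls)

print(f"exact = {exact}, simulated = {estimate:.3f}")
```

The simulated value should land close to 0.167, and it tends to get closer as the number of rolls grows.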

In its simplest form, events in probability are independent, like coin tosses. The outcome of your next toss does not depend on earlier tosses, so you cannot work it out from them, no matter how many times you have tossed before. Believing that you can, for example that a Tail is “due” after a run of Heads, is what people call the Gambler’s Fallacy.
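Independence can also be checked with a small experiment: toss a simulated fair coin many times, look only at the tosses that come right after three Heads in a row, and see how often the next one is a Head. This is a rough sketch, and the seed, the streak length of three, and the number of tosses are my own arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed, arbitrary seed so the run is repeatable
tosses = [rng.choice("HT") for _ in range(200_000)]

# Collect the toss that follows every run of three Heads.
after_streak = [
    tosses[i + 3]
    for i in range(len(tosses) - 3)
    if tosses[i:i + 3] == ["H", "H", "H"]
]

p_head_after_streak = after_streak.count("H") / len(after_streak)
print(f"P(Head right after HHH) is roughly {p_head_after_streak:.3f}")
```

If the coin "remembered" its streak, this number would drift away from 0.5. For a fair coin it should stay close to 0.5.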

Also, the examples I’ve given above, the coin and the die, have equally likely outcomes, and those likelihoods stay the same no matter how many times you repeat your experiment. A coin has exactly two faces, Head and Tail, just as a die shows each number from 1 to 6 on exactly one of its 6 faces.

Real Life

Going deeper, probability gets more complex. But just note these points as you continue your exploration into the world of statistics:

Many measurements and observations have an infinite number of possible outcomes. For example, you might want to keep tossing your coin until you get a Head. The number of tosses this takes could be n = 1, 2, 3, … with no upper limit.
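For the keep-tossing-until-a-Head example, each possible number of tosses n still has its own probability. With a fair coin, the first Head lands on toss n only if the first n - 1 tosses are all Tails, which gives (1/2)^n. The sketch below assumes a fair coin; the function name `p_first_head_on` is hypothetical.

```python
from fractions import Fraction


def p_first_head_on(n, p_head=Fraction(1, 2)):
    """Probability that the first Head shows up on toss number n."""
    # n - 1 Tails in a row, followed by one Head.
    return (1 - p_head) ** (n - 1) * p_head


probs = [p_first_head_on(n) for n in range(1, 11)]

print(probs[:3])   # [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
print(sum(probs))  # 1023/1024
```

Even though n has no upper limit, the probabilities shrink so quickly that the first ten values of n already account for 1023/1024 of the total.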

Examples of such measurements in real life include temperature, sound, website traffic, and marginal income. What I have also learnt from Britannica is that:

If the repeated measurements on different subjects or at different times on the same subject can lead to different outcomes, probability theory is a possible tool to study this variability.

Natural Language Processing and spam filter algorithms are also very practical examples of probability applied to real-life problems with code.
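To give a feel for the spam filter case, here is a toy sketch that uses Bayes’ rule to ask: given that an email contains the word “free”, how likely is it to be spam? Every number below is made up purely for illustration, and the function name `posterior_spam` is hypothetical. Real spam filters combine evidence from many words, so treat this as the idea only, not as how any particular filter works.

```python
from fractions import Fraction

# Hypothetical toy numbers, invented for this example only.
p_spam = Fraction(2, 10)             # assume 20% of all emails are spam
p_word_given_spam = Fraction(5, 10)  # assume "free" appears in 50% of spam
p_word_given_ham = Fraction(1, 20)   # assume "free" appears in 5% of normal mail


def posterior_spam(prior, likelihood_spam, likelihood_ham):
    """Bayes' rule: P(spam | word) from the prior and the two likelihoods."""
    # Overall chance of seeing the word, across spam and normal mail.
    evidence = likelihood_spam * prior + likelihood_ham * (1 - prior)
    return likelihood_spam * prior / evidence


p = posterior_spam(p_spam, p_word_given_spam, p_word_given_ham)
print(p, float(p))
```

With these made-up inputs the answer works out to 5/7, about 0.71, so seeing the word moves the chance of spam from 20% to roughly 71%.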

When next you look up from your phone or computer, I hope you can see more ways the concept of probability applies to our everyday life, like the probability of picking 4 groundnut seeds in a spoon of garri, an African delicacy.

Funmi Somoye is a Data Science enthusiast and is excited about sharing her experiences with others.