“Bayes’ Theorem is the quiet architect of artificial intelligence, showing us that true intelligence is not found in knowing everything, but in learning how to change our minds when new evidence appears.” – MJ Martin
Introduction
Do you understand how artificial intelligence works?
Artificial intelligence is built on statistics. Rather than producing a single fixed answer, an AI system generates multiple candidate answers to a question and then selects the one with the highest probability of being correct. Whether it is optimizing logistics, interpreting complex data, or interacting seamlessly with humans, probability theory runs through all of it. Numbers are the foundation of AI. Artificial intelligence is all about probability….
Probability is one of the most fascinating branches of mathematics because it attempts to measure uncertainty and predict outcomes in an unpredictable world. Among the many important concepts in probability, Bayes’ Theorem stands out as a powerful tool that links prior knowledge with new evidence to update our understanding of the likelihood of events.
Named after Reverend Thomas Bayes, an 18th-century mathematician and theologian, this theorem is both elegant and practical, with applications ranging from medicine to artificial intelligence.
To understand Bayes’ Theorem is to explore not just a formula, but also a way of thinking about the world that encourages careful reasoning, humility about our assumptions, and a structured method for revising our beliefs.

The Foundation of Probability
Is gambling a game of chance, or is it one of probability?
At its heart, probability is about quantifying how likely something is to happen. For example, when rolling a fair six-sided die, the probability of landing on a two is one out of six, or about 16.7 percent. Simple events like these are easy to calculate, but real-world situations are more complicated because they involve overlapping conditions and uncertain information.
This is where conditional probability becomes important. Conditional probability is the likelihood that an event occurs given that another event has already happened. For instance, what is the probability that it is raining outside given that you see people carrying umbrellas? Bayes’ Theorem provides a mathematical framework for calculating such probabilities. As the statistician E. T. Jaynes wrote, “Probability theory is nothing but common sense reduced to calculation” (Jaynes, 2003).
Stating Bayes’ Theorem
What is Bayes’ Theorem exactly?
Bayes’ Theorem can be expressed in words as follows: the probability of an event A occurring, given that event B has occurred, is equal to the probability of B occurring given A, multiplied by the probability of A, and divided by the probability of B. In symbolic form, it is written as:
P(A|B) = [P(B|A) × P(A)] ÷ P(B)
Each part of this equation has a specific meaning. P(A) represents the prior probability, or how likely we believe event A is before we see any new evidence. P(B|A) is the likelihood, the probability that we would observe event B if A were true. P(B) is the marginal probability, or the overall chance of seeing event B no matter what. Finally, P(A|B) is called the posterior probability, the updated probability of A after considering the new evidence B.
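The formula can be expressed directly in a few lines of code. The sketch below is illustrative: the function name and the example numbers are my own, chosen only to show how the four quantities fit together.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, marginal_b):
    """Return the posterior P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / marginal_b

# Example with made-up numbers: P(A) = 0.3, P(B|A) = 0.8, P(B) = 0.5
print(bayes_posterior(0.3, 0.8, 0.5))  # 0.48
```

Notice that the posterior (0.48) is higher than the prior (0.3): evidence B made A more believable, which is exactly the "updating" the theorem describes.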
Philosopher Richard Price, who helped publish Bayes’ work, observed that this process “gives us a rule to judge how much we should believe an event when informed of another” (Price, 1763).
An Everyday Example
Imagine you are in a city where 1 percent of the population has a rare illness. There is a medical test for this illness that is 99 percent accurate, meaning it correctly detects 99 percent of people who have the illness and wrongly flags only 1 percent of people who do not. If you test positive, what is the probability that you actually have the illness? At first glance, many people think the answer must be 99 percent. However, Bayes’ Theorem reveals a more nuanced truth.
The prior probability of having the illness is 1 percent. The likelihood that the test is positive if you have the illness is 99 percent. The probability of testing positive overall includes both true positives and false positives, and because healthy people vastly outnumber sick ones, the false positives are roughly as numerous as the true positives. Using Bayes’ Theorem, the posterior probability of actually having the illness given a positive test is therefore only about 50 percent. This surprising result shows how easy it is to misinterpret probabilities without a structured method for reasoning. Bayes’ Theorem forces us to balance prior odds with new evidence, which is why it has become a cornerstone of modern decision-making.
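The arithmetic behind that result can be checked in a short script. This is a sketch under one stated assumption: "99 percent accurate" is taken to mean both a 99 percent detection rate for the ill and a 1 percent false-positive rate for the healthy.

```python
prevalence = 0.01           # prior: 1% of the population has the illness
sensitivity = 0.99          # P(positive | ill)
false_positive_rate = 0.01  # P(positive | healthy), assuming 99% specificity

# P(positive) combines true positives and false positives
p_positive = (sensitivity * prevalence
              + false_positive_rate * (1 - prevalence))

# Bayes' Theorem: P(ill | positive)
posterior = sensitivity * prevalence / p_positive
print(round(posterior, 4))  # 0.5
```

With these numbers, 99 true positives and 99 false positives appear per 10,000 people tested, so a positive result is a coin flip between the two.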

Advantages of Bayes’ Theorem
One of the greatest strengths of Bayes’ Theorem is that it provides a systematic way to update beliefs when new information becomes available. This is incredibly valuable in areas such as medicine, where doctors must weigh test results against the background prevalence of diseases. It also plays a central role in artificial intelligence, where algorithms must make predictions based on incomplete data. For example, early spam filters used Bayesian reasoning to determine whether an email was junk by comparing the frequency of suspicious words to their occurrence in known spam messages.
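The spam-filter idea can be sketched as a toy naive Bayes classifier. All the word frequencies and priors below are invented for illustration; a real filter would estimate them from a corpus of labeled mail.

```python
import math

# Toy estimates of P(word | spam) and P(word | ham) — illustrative only.
p_word_spam = {"free": 0.30, "winner": 0.20, "meeting": 0.01}
p_word_ham  = {"free": 0.02, "winner": 0.01, "meeting": 0.20}

p_spam, p_ham = 0.4, 0.6  # assumed prior mix of incoming mail

def spam_probability(words):
    """Naive Bayes: combine per-word likelihoods in log space, then normalize."""
    log_spam = math.log(p_spam) + sum(math.log(p_word_spam[w]) for w in words)
    log_ham  = math.log(p_ham)  + sum(math.log(p_word_ham[w])  for w in words)
    odds = math.exp(log_spam - log_ham)   # posterior odds of spam vs ham
    return odds / (1 + odds)              # convert odds to a probability

print(round(spam_probability(["free", "winner"]), 3))  # ≈ 0.995
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together, which is the standard trick in real Bayesian text classifiers.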
Another advantage is that Bayes’ Theorem teaches critical thinking. It reminds us that no piece of evidence exists in isolation, but must be interpreted in context. A single test, observation, or clue is rarely definitive, but when combined with prior knowledge, it can shift our understanding in meaningful ways. This perspective aligns with how science itself operates: forming hypotheses, collecting evidence, and then revising conclusions as new information arrives. As computer scientist Judea Pearl argued, “Bayesian networks constitute the most significant advance in reasoning under uncertainty since the advent of probability theory” (Pearl, 1988). In this sense, Bayes’ Theorem is more than a mathematical formula. It is a philosophy of learning that values adaptability and careful reasoning.
Limitations and Criticisms
Despite its strengths, Bayes’ Theorem is not without limitations. A major challenge lies in choosing the prior probability. In many situations, we do not know the correct prior and must make assumptions. These assumptions can introduce bias, especially if they are chosen subjectively or without evidence. Critics argue that this makes Bayesian reasoning less objective than it appears. For instance, two people might apply Bayes’ Theorem to the same evidence but reach different conclusions because they started with different priors.
Another drawback is the complexity of real-world calculations. While Bayes’ Theorem is simple in form, applying it to large problems often requires advanced computation. In cases involving hundreds of variables, the mathematics can become so complex that only computers can manage it. This is one reason why Bayesian statistics did not see widespread use until the rise of modern computing. As historian Sharon Bertsch McGrayne noted, “For two centuries, Bayes’ Theorem was largely neglected because the calculations it required were practically impossible without computers” (McGrayne, 2011).
Finally, some situations involve evidence that is ambiguous or unreliable. If the evidence itself is flawed, then the results of Bayesian reasoning may also be flawed. As the saying goes, “garbage in, garbage out.” This is why careful attention must always be paid to the quality of the data before applying the theorem.

Bayes’ Theorem in Artificial Intelligence
Is Bayes’ Theorem still relevant today? After all, it was conceived in the mid-1700s.
Today, Bayes’ Theorem is deeply relevant in artificial intelligence and machine learning. Bayesian methods are used to help machines classify data, recognize patterns, and even learn from experience. Self-driving cars, for instance, apply Bayesian reasoning to interpret sensor data and predict the likelihood of obstacles appearing on the road. When the car’s sensors detect a shadow across the street, the system must weigh the probability that the shadow represents a harmless tree branch against the possibility that it is a pedestrian. Bayes’ Theorem provides a rational way of updating these probabilities in real time.
Voice assistants such as Siri or Alexa also rely on Bayesian models. When you speak into the device, the system analyzes sound waves and compares them against thousands of possible words. If the sound is unclear, the assistant does not just guess randomly. Instead, it uses Bayesian reasoning to weigh the likelihood of each possible word based on context. For instance, if you say “play the Beatles,” the system is more likely to interpret the sound as “Beatles” rather than “beetles,” because it updates its probabilities using prior knowledge about common music requests.
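The “Beatles” versus “beetles” choice comes down to comparing prior × likelihood for each candidate word. The numbers below are invented for illustration: both words match the ambiguous audio equally well, so the context prior decides.

```python
# Hypothetical numbers: (prior from usage context, acoustic likelihood).
# The two words sound identical, so the acoustic likelihoods are equal.
candidates = {
    "Beatles": (0.10, 0.5),
    "beetles": (0.001, 0.5),  # rarely requested after "play the"
}

# Score each word by prior x likelihood — the numerator of Bayes' Theorem.
# The denominator P(sound) is the same for all candidates, so it cancels.
scores = {word: prior * lik for word, (prior, lik) in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # "Beatles" wins purely on its higher prior
```

This is why context matters so much in speech recognition: when the evidence cannot distinguish two hypotheses, the prior does all the work.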
Another case study comes from medical AI. Researchers have developed Bayesian networks that assist doctors in diagnosing diseases. For example, if a patient presents with chest pain, the system evaluates prior probabilities of conditions such as heart disease, indigestion, or pneumonia. As new evidence is added, such as blood test results or X-ray images, the Bayesian model updates the likelihood of each condition. This does not replace the doctor but supports clinical decision-making by ensuring that evidence is weighed consistently and logically.
As AI pioneer Geoffrey Hinton explained, “The Bayesian framework is the correct way to do learning, if you can do the computations” (Hinton, 2007). His point underlines both the power and the challenge of Bayesian reasoning: it offers a logically sound method for updating beliefs, but it often requires enormous computational resources. In an age where artificial intelligence is shaping everything from healthcare to finance, the logic of Bayes’ Theorem continues to guide how machines make sense of uncertainty.

Summary
Bayes’ Theorem is a simple but profound idea that has transformed how we think about probability and decision-making. By combining prior knowledge with new evidence, it provides a rational framework for updating beliefs in the face of uncertainty. Its strengths lie in its clarity, adaptability, and wide range of applications, while its limitations remind us of the importance of careful assumptions and quality evidence. In the modern world, Bayes’ Theorem has found new life through artificial intelligence, where it helps computers learn, adapt, and make predictions in complex environments. For students in a high school classroom, Bayes’ Theorem is not just another formula to memorize, but a tool for learning how to think critically about information in a world that is often uncertain. As historian Sharon McGrayne concluded, “Bayes’ rule is a way of thinking, a way of reasoning with uncertainty, and it is shaping the twenty-first century” (McGrayne, 2011).
References
Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.
Price, R. (1763). An Essay Towards Solving a Problem in the Doctrine of Chances. Philosophical Transactions of the Royal Society of London.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
McGrayne, S. B. (2011). The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. Yale University Press.
Hinton, G. (2007). Lecture on Machine Learning, University of Toronto.
About the Author:
Michael Martin is the Vice President of Technology with Metercor Inc., a Smart Meter, IoT, and Smart City systems integrator based in Canada. He has more than 40 years of experience in systems design for applications that use broadband networks, optical fibre, wireless, and digital communications technologies. He is a business and technology consultant. He was a senior executive consultant for 15 years with IBM, where he worked in the GBS Global Center of Competency for Energy and Utilities and the GTS Global Center of Excellence for Energy and Utilities. He is a founding partner and President of MICAN Communications and before that was President of Comlink Systems Limited and Ensat Broadcast Services, Inc., both divisions of Cygnal Technologies Corporation (CYN: TSX).
Martin served on the Board of Directors for TeraGo Inc (TGO: TSX) and on the Board of Directors for Avante Logixx Inc. (XX: TSX.V). He has served as a Member, SCC ISO-IEC JTC 1/SC-41 – Internet of Things and related technologies, ISO – International Organization for Standardization, and as a member of the NIST SP 500-325 Fog Computing Conceptual Model, National Institute of Standards and Technology. He served on the Board of Governors of the University of Ontario Institute of Technology (UOIT) [now Ontario Tech University] and on the Board of Advisers of five different Colleges in Ontario – Centennial College, Humber College, George Brown College, Durham College, Ryerson Polytechnic University [now Toronto Metropolitan University]. For 16 years he served on the Board of the Society of Motion Picture and Television Engineers (SMPTE), Toronto Section.
He holds three master’s degrees, in business (MBA), communication (MA), and education (MEd). As well, he has three undergraduate diplomas and seven certifications in business, computer programming, internetworking, project management, media, photography, and communication technology. He has completed over 60 next generation MOOC (Massive Open Online Courses) continuous education in a wide variety of topics, including: Economics, Python Programming, Internet of Things, Cloud, Artificial Intelligence and Cognitive systems, Blockchain, Agile, Big Data, Design Thinking, Security, Indigenous Canada awareness, and more.