
You're reading from Causal Inference and Discovery in Python

Product type: Book
Published in: May 2023
Publisher: Packt
ISBN-13: 9781804612989
Edition: 1st Edition
Author: Aleksander Molak

Aleksander Molak is a Machine Learning Researcher and Consultant who gained experience working with Fortune 100, Fortune 500, and Inc. 5000 companies across Europe, the USA, and Israel, designing and building large-scale machine learning systems. On a mission to democratize causality for businesses and machine learning practitioners, Aleksander is a prolific writer, creator, and international speaker. As a co-founder of Lespire, an innovative provider of AI and machine learning training for corporate teams, Aleksander is committed to empowering businesses to harness the full potential of cutting-edge technologies that allow them to stay ahead of the curve.

From associations to logic and imagination – the Ladder of Causation

In this section, we’ll introduce the concept of the Ladder of Causation and summarize its building blocks. Figure 2.1 presents a symbolic representation of the Ladder of Causation. The higher the rung, the more sophisticated our capabilities become, but let’s start from the beginning:

Figure 2.1 – The Ladder of Causation. Image by the author, based on a picture by Laurie Shaw (https://www.pexels.com/photo/brown-wooden-door-frame-804394/)

The Ladder of Causation, introduced by Judea Pearl (Pearl, Mackenzie, 2019), is a helpful metaphor for understanding distinct levels of relationships between variables – from simple associations to counterfactual reasoning. Pearl's ladder has three rungs. Each rung is related to a different activity and offers answers to different types of causal questions. Each rung comes with a distinct set of mathematical tools...

Associations

In this section, we’ll demonstrate how to quantify associational relationships using conditional probability. Then, we’ll briefly introduce structural causal models. Finally, we’ll implement conditional probability queries using Python.

We have already learned a lot about associations. We know that associations are related to observing and that they allow us to generate predictions. Let's take a look at the mathematical tools that will allow us to talk about associations in a more formal way.

We can view the mathematics of rung one from a couple of angles. In this section, we’ll focus on the perspective of conditional probability.

Conditional probability

Conditional probability is the probability of one event given that another event has occurred. The mathematical symbol we use to express conditional probability is | (known as a pipe or vertical bar). We read P(X|Y) as the probability of X given Y. This notation is a bit simplified (or...
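To make this concrete, here is a minimal sketch (not an example from the book) of estimating a conditional probability from simulated data; the variable names and probabilities are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy binary data in which Y influences X:
# P(X = 1 | Y = 1) = 0.8 and P(X = 1 | Y = 0) = 0.2.
y = rng.binomial(1, 0.5, size=100_000)
x = rng.binomial(1, np.where(y == 1, 0.8, 0.2))

# Conditioning = restricting attention to the rows where Y = 1,
# then computing the frequency of X = 1 within that subset.
p_x_given_y = x[y == 1].mean()
print(f"P(X=1 | Y=1) = {p_x_given_y:.3f}")
```

The estimate should land close to the 0.8 we built into the simulation, illustrating that a conditional probability is simply a probability computed on a filtered subpopulation.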

What are interventions?

In this section, we’ll summarize what we’ve learned about interventions so far and introduce mathematical tools to describe them. Finally, we’ll use our newly acquired knowledge to implement an intervention example in Python.

The idea of intervention is very simple. We change one thing in the world and observe whether and how this change affects another thing in the world. This is the essence of scientific experiments. To describe interventions mathematically, we use a special do-operator. We usually express it in mathematical notation in the following way:

P(Y = 1 | do(X = 0))

The preceding formula denotes the probability of Y = 1, given that we set X to 0. The fact that we need to change X's value is critical here, and it highlights the inherent difference between intervening and conditioning (conditioning is the operation that we used to obtain conditional probabilities in the previous section). Conditioning only modifies our view of the data...
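The difference between conditioning and intervening can be demonstrated with a small simulation (a sketch of my own, not taken from the book): a confounder Z influences both X and Y, so filtering the data on X = 0 gives a different answer than setting X to 0. The structural equations and probabilities below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Toy structural model with a confounder: Z -> X, Z -> Y, and X -> Y.
z = rng.binomial(1, 0.5, size=n)
x = rng.binomial(1, np.where(z == 1, 0.9, 0.1))
y = rng.binomial(1, 0.2 + 0.3 * x + 0.3 * z)

# Conditioning: simply filter the observational data on X = 0.
p_cond = y[x == 0].mean()

# Intervening, do(X = 0): overwrite X's mechanism for every unit,
# keeping the rest of the system (Z and Y's equation) intact.
x_do = np.zeros(n, dtype=int)
y_do = rng.binomial(1, 0.2 + 0.3 * x_do + 0.3 * z)
p_do = y_do.mean()

print(f"P(Y=1 | X=0)     = {p_cond:.3f}")
print(f"P(Y=1 | do(X=0)) = {p_do:.3f}")
```

With these numbers, conditioning yields roughly 0.23 (the X = 0 subgroup is dominated by Z = 0 units), while the intervention yields roughly 0.35, because do(X = 0) leaves the population distribution of Z untouched.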

What are counterfactuals?

Have you ever wondered where you would be today if you had chosen something different in your life? Moved to another city 10 years ago? Studied art? Dated another person? Taken a motorcycle trip in Hawaii? Answering these types of questions requires us to create alternative worlds, worlds that we have never observed. If you’ve ever tried doing this for yourself, you already know intuitively what counterfactuals are.

Let’s try to structure this intuition. We can think about counterfactuals as a minimal modification to a system (Pearl, Glymour, and Jewell, 2016). In this sense, they are similar to interventions. Nonetheless, there is a fundamental difference between the two.

Counterfactuals can be thought of as hypothetical or simulated interventions that assume a particular state of the world (note that interventions do not require any assumptions about the state of the world). For instance, answering a counterfactual question such as ...
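One common way to formalize this is the three-step abduction-action-prediction procedure (Pearl, Glymour, and Jewell, 2016). The toy linear structural model below is an illustrative assumption of mine, not an example from the book:

```python
# Toy SCM: X := U_X, Y := 2*X + U_Y.
# We observed X = 1 and Y = 3 and ask: what would Y have been had X been 0?

x_obs, y_obs = 1.0, 3.0

# Step 1 - Abduction: recover the latent noise consistent with the observation.
u_y = y_obs - 2 * x_obs  # here, u_y = 1.0

# Step 2 - Action: perform the hypothetical intervention do(X = 0).
x_cf = 0.0

# Step 3 - Prediction: recompute Y using the recovered noise.
y_cf = 2 * x_cf + u_y
print(y_cf)  # 1.0
```

Note how step 1 uses what actually happened: the recovered noise encodes the particular state of the world that the counterfactual assumes, which is exactly what distinguishes it from a plain intervention.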

Extra – is all machine learning causally the same?

So far, when we have spoken about machine learning, we have mainly meant supervised methods. You might wonder what the relationship is between other types of machine learning and causality.

Causality and reinforcement learning

For many people, the first family of machine learning methods that comes to mind when thinking about causality is reinforcement learning (RL).

In the classic formulation of RL, an agent interacts with the environment. This suggests that an RL agent can make interventions in the environment. Intuitively, this possibility moves RL from an associative rung one to an interventional rung two. Bottou et al. (2013) amplify this intuition by proposing that causal models can be reduced to multi-armed bandit problems – in other words, that RL bandit algorithms are special cases of rung two causal models.

Although the idea that all RL is causal might seem intuitive at first, the reality is more nuanced...

Wrapping it up

In this chapter, we introduced the concept of the Ladder of Causation. We discussed each of the three rungs of the ladder: associations, interventions, and counterfactuals. We presented mathematical apparatus to describe each of the rungs and translated the ideas behind them into code. These ideas are foundational for causal thinking and will allow us to understand more complex topics further on in the book.

Additionally, we broadened our perspective on causality by discussing the relationships between causality and various families of machine learning algorithms.

In the next chapter, we’ll take a look at the link between observations, interventions, and linear regression to see the differences between rung one and rung two from yet another perspective. Ready?

References

Berrevoets, J., Kacprzyk, K., Qian, Z., & van der Schaar, M. (2023). Causal Deep Learning. arXiv preprint arXiv:2303.02186.

Bottou, L., Peters, J., Quiñonero-Candela, J., Charles, D. X., Chickering, D. M., Portugaly, E., Ray, D., Simard, P., & Snelson, E. (2013). Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising. J. Mach. Learn. Res., 14 (1), 3207–3260.

Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., & Smola, A. (2007). A Kernel Statistical Test of Independence. NIPS.

Holland, P. (1986). Statistics and Causal Inference. Journal of the American Statistical Association, 81, 945–960.

Huszár, F. (2019, January 24). Causal Inference 3: Counterfactuals. https://www.inference.vc/causal-inference-3-counterfactuals/.

Kaddour, J., Lynch, A., Liu, Q., Kusner, M. J., & Silva, R. (2022). Causal Machine Learning: A Survey and Open Problems. arXiv preprint arXiv:2206.15475.

Lee, S...

