You're reading from Causal Inference and Discovery in Python

Product type: Book
Published in: May 2023
Publisher: Packt
ISBN-13: 9781804612989
Edition: 1st Edition
Author: Aleksander Molak

Aleksander Molak is a Machine Learning Researcher and Consultant who gained experience working with Fortune 100, Fortune 500, and Inc. 5000 companies across Europe, the USA, and Israel, designing and building large-scale machine learning systems. On a mission to democratize causality for businesses and machine learning practitioners, Aleksander is a prolific writer, creator, and international speaker. As a co-founder of Lespire, an innovative provider of AI and machine learning training for corporate teams, Aleksander is committed to empowering businesses to harness the full potential of cutting-edge technologies that allow them to stay ahead of the curve.

Graphs, graphs, graphs

This section will be a quick refresher on graphs and basic graph theory. If you’re not familiar with graphs – don’t worry – you can treat this section as a crash course on the topic.

Let’s start!

Graphs can be defined in multiple ways. You can think of them as discrete mathematical structures, abstract representations of real-world entities and relations between them, or computational data structures. What all of these perspectives have in common are the basic building blocks of graphs: nodes (also called vertices) and edges (links) that connect the nodes.

Types of graphs

We can divide graphs into types based on several attributes. Let’s discuss the ones that are the most relevant from the causal point of view.

Undirected versus directed

In a directed graph, every edge has a direction: it points from one node to another. In an undirected graph, edges have no direction and simply connect two nodes. Figure 4.1 presents an example of a directed and an undirected graph:

...
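
To make the directed versus undirected distinction concrete, here is a minimal sketch using NetworkX. The node names and edges are hypothetical and chosen only for illustration. The same edge list is added to both graph types, and only the directed graph treats the order of the endpoints as meaningful.

```python
import networkx as nx

# Hypothetical edge list, used for both graphs
edges = [("A", "B"), ("B", "C")]

# Undirected graph: an edge between A and B has no orientation
undirected = nx.Graph()
undirected.add_edges_from(edges)

# Directed graph: each edge points from the first node to the second
directed = nx.DiGraph()
directed.add_edges_from(edges)

print(undirected.has_edge("B", "A"))  # True: direction is ignored
print(directed.has_edge("B", "A"))    # False: only A -> B exists
```

Note that `nx.Graph` and `nx.DiGraph` share most of their interface, so switching between the two representations usually only requires changing the constructor.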

What is a graphical model?

In this section, we’re going to discuss what graphical causal models (GCMs) are and how they can help in causal inference and discovery.

GCMs can be seen as a useful framework that integrates probabilistic, structural, and graphical aspects of causal inference.

Formally speaking, we can define a graphical causal model as a set consisting of a graph and a set of functions that induce a joint distribution over the variables in the model (Peters et al., 2017).
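
The definition above can be illustrated with a small sketch. It assumes a hypothetical two-variable model in which A causes B, with a linear function and Gaussian noise; the graph, the function names (`f_a`, `f_b`, `sample`), and the coefficient are illustrative assumptions rather than anything specified in the text. The graph says which variables feed into which function, and evaluating the functions on random noise produces samples from the joint distribution over A and B.

```python
import random

# Graph part of the model: each node maps to the list of its parents.
# Hypothetical structure: A -> B
graph = {"A": [], "B": ["A"]}

# Functional part of the model: one function per node.
def f_a(noise):
    # A has no parents, so it depends on its noise term only
    return noise

def f_b(a, noise):
    # B depends on its parent A and on its own noise term
    return 2.0 * a + noise

def sample(n, seed=0):
    """Draw n joint samples (a, b) by evaluating nodes in causal order."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = f_a(rng.gauss(0, 1))
        b = f_b(a, rng.gauss(0, 1))
        rows.append((a, b))
    return rows

rows = sample(1000)
```

Because B is computed from A, the samples of A and B are statistically dependent, which is the sense in which the graph and the functions together induce a joint distribution.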

The basic building blocks of GCM graphs are the same as the basic elements of any directed graph: nodes and directed edges. In a GCM, each node is associated with a variable.

Importantly, in GCMs, edges have a strictly causal interpretation, so that A → B means that A causes B (this is one of the differentiating factors between causal models and Bayesian networks; Pearl & Mackenzie, 2019, pp. 111-113). GCMs are very powerful because certain combinations of nodes and edges can reveal important...

DAG your pardon? Directed acyclic graphs in the causal wonderland

We’ll start this section by reviewing definitions of causality. Then, we’ll discuss the motivations behind DAGs and their limitations. Finally, we’ll formalize the concept of a DAG.
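
As a preview of what "acyclic" means in practice, the sketch below checks whether a directed graph contains a cycle. It uses Kahn's algorithm (repeatedly removing nodes with no incoming edges), which is one standard way to test acyclicity; the function name `is_acyclic` and the example edges are hypothetical.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    nodes = {node for edge in edges for node in edge}
    children = {node: [] for node in nodes}
    in_degree = {node: 0 for node in nodes}
    for parent, child in edges:
        children[parent].append(child)
        in_degree[child] += 1

    # Start from nodes that have no incoming edges
    queue = deque(node for node in nodes if in_degree[node] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    # If a cycle exists, its nodes never reach in-degree zero
    return visited == len(nodes)

print(is_acyclic([("A", "B"), ("B", "C")]))              # True
print(is_acyclic([("A", "B"), ("B", "C"), ("C", "A")]))  # False
```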

Definitions of causality

In the first chapter, we discussed a couple of historical definitions of causality. We started with Aristotle, then we briefly covered the ideas proposed by David Hume. We’ve seen that Hume’s definition (as we presented it) was focused on associations. This led us to look into how babies learn about the world using experimentation. We’ve seen how experimentation allows us to go beyond the realm of observations by interacting with the environment. The possibility of interacting with the environment is at the heart of another definition of causality that comes from Judea Pearl.

Pearl proposed something very simple yet powerful. His definition is short, ignores ontological...

Sources of causal graphs in the real world

We have discussed graphs from several perspectives now, yet we haven’t tackled an important practical question: what is the source of causal graphs in the real world?

In this section, we’ll provide a brief overview of such sources and we’ll leave a more detailed discussion for Part 3 of the book.

On a high level, we can group the ways of obtaining causal graphs into three classes:

  • Causal discovery
  • Expert knowledge
  • A combination of both

Let’s discuss them briefly.

Causal discovery

Causal discovery and causal structure learning are umbrella terms for various kinds of methods used to uncover causal structure from observational or interventional data. We devote the entirety of Part 3 of this book to this topic.

Expert knowledge

Expert knowledge is a term covering various types of knowledge that can help define or disambiguate causal relations between two or more variables. Depending...

Extra – is there causality beyond DAGs?

In this extra section, we’ll give a brief overview of some non-DAG-based approaches to causality. This is definitely an incomplete and somewhat subjective guide.

Dynamical systems

The scenario with two interacting partners that we discussed in the previous section describes a dynamical system. This particular example is inspired by the research by an American ex-rabbi turned psychologist called John Gottman, who studies human romantic relationships from a dynamical systems point of view (for an overview: Gottman & Notarius, 2000; Gottman et al., 1999).

Dynamical systems are often described using differential equations, which in many cases cannot be solved analytically and have to be studied numerically (for a toy example of differential equations applied to romantic relationships, check Strogatz, 1988).
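
To give a feel for how such systems are explored numerically, here is a minimal sketch that integrates a toy system of two coupled differential equations with the forward Euler method. The equations (dr/dt = -a·j, dj/dt = b·r), the coefficient values, and the `simulate` helper are illustrative assumptions and are not taken from the cited works. This particular linear system does have a closed-form solution, but the same loop applies unchanged to nonlinear systems that do not.

```python
def simulate(r0, j0, a=1.0, b=1.0, dt=0.001, steps=1000):
    """Integrate dr/dt = -a * j, dj/dt = b * r with forward Euler steps."""
    r, j = r0, j0
    trajectory = [(r, j)]
    for _ in range(steps):
        # Each variable's rate of change depends on the other variable,
        # which is the feedback loop a DAG cannot represent directly
        dr = -a * j
        dj = b * r
        r, j = r + dt * dr, j + dt * dj
        trajectory.append((r, j))
    return trajectory

# Hypothetical starting point: the first partner's feelings at 1, the second's at 0
trajectory = simulate(r0=1.0, j0=0.0, steps=1571)
final_r, final_j = trajectory[-1]
```

With these assumed coefficients, the two variables oscillate out of phase: after roughly a quarter of a cycle, the first variable has dropped to about zero while the second has risen to about one.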

Dynamical systems have been extensively studied in physics (Strogatz, 2018), biology (Cosentino & Bates, 2011), and psychology (Nowak & Vallacher, 1998), among...

Wrapping it up

We started this chapter by refreshing our knowledge of graphs and learned how to build simple graphs using Python and the NetworkX library. We introduced GCMs and DAGs and discussed some common limitations and challenges that we might face when using them.

Finally, we examined selected approaches to model causal systems with cycles.

Now you have the ability to translate between the visual representation of a graph and an adjacency matrix. The basic DAG toolkit that we’ve discussed in this chapter will allow you to work smoothly with many causal inference and causal discovery tools and will help you represent your own problems as graphs, which can bring a lot of clarity – even in your work with traditional (non-causal) machine learning.
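
As a quick reminder of that translation, the sketch below converts a list of directed edges into an adjacency matrix and back. It assumes the convention that rows index the source node and columns index the target node; the helper names `to_adjacency` and `to_edges` and the example graph are hypothetical.

```python
def to_adjacency(nodes, edges):
    """Build a 0/1 adjacency matrix; entry [i][j] is 1 if nodes[i] -> nodes[j]."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for parent, child in edges:
        matrix[index[parent]][index[child]] = 1
    return matrix

def to_edges(nodes, matrix):
    """Recover the list of directed edges from an adjacency matrix."""
    return [
        (nodes[i], nodes[j])
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value
    ]

nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]

matrix = to_adjacency(nodes, edges)
print(matrix)                   # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(to_edges(nodes, matrix))  # [('A', 'B'), ('B', 'C')]
```

For a directed graph the matrix is generally not symmetric, since an edge from A to B does not imply an edge from B to A.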

The knowledge you gained in this chapter will be critical to understanding the next chapter and the next two parts of this book. Feel free to review this chapter anytime you need.

In the next chapter, we’...

References

Cosentino, C., & Bates, D. (2011). Feedback control in systems biology. CRC Press.

Forré, P., & Mooij, J. M. (2017). Markov properties for graphical models with cycles and latent variables. arXiv preprint arXiv:1710.08775.

Forré, P., & Mooij, J. M. (2018). Constraint-based causal discovery for non-linear structural causal models with cycles and latent confounders. arXiv preprint arXiv:1807.03024.

Gottman, J. M., & Notarius, C. I. (2000). Decade review: Observing marital interaction. Journal of Marriage and Family, 62(4), 927-947.

Gottman, J., Swanson, C., & Murray, J. (1999). The mathematics of marital conflict: Dynamic mathematical nonlinear modeling of newlywed marital interaction. Journal of Family Psychology, 13(1), 3.

Magliacane, S., Van Ommen, T., Claassen, T., Bongers, S., Versteeg, P., & Mooij, J. M. (2018). Domain adaptation by using causal inference to predict invariant conditional distributions. Advances in neural...
