Packt+ | Advance your knowledge in tech

You're reading from Scala for Data Science

Product typeBook

Published inJan 2016

Reading LevelIntermediate

Publisher

ISBN-139781785281372

Edition1st Edition

Languages

Scala

Concepts

Application Development

Author (1)

Pascal Bugnion

Appendix A. Pattern Matching and Extractors

Pattern matching is a powerful tool for control flow in Scala. It is often underused and under-estimated by people coming to Scala from imperative languages.

Let's start with a few examples of pattern matching before diving into the theory. We start by defining a tuple:

scala> val names = ("Pascal", "Bugnion")
names: (String, String) = (Pascal,Bugnion)

We can use pattern matching to extract the elements of this tuple and bind them to variables:

scala> val (firstName, lastName) = names
firstName: String = Pascal
lastName: String = Bugnion

We just extracted the two elements of the names tuple, binding them to the variables firstName and lastName. Notice how the left-hand side defines a pattern that the right-hand side must match: we are declaring that the variable names must be a two-element tuple. To make the pattern more specific, we could also have specified the expected types of the elements in the tuple:

scala> val (firstName:String, lastName:String) = names
firstName: String = Pascal
lastName: String = Bugnion

What happens if the pattern on the left-hand side does not match the right-hand side?

scala> val (firstName, middleName, lastName) = names
<console>:13: error: constructor cannot be instantiated to expected type;
found   : (T1, T2, T3)
required: (String, String)
   val (firstName, middleName, lastName) = names

This results in a compile error. Other types of pattern matching failures result in runtime errors.

Pattern matching is very expressive. To achieve the same behavior without pattern matching, you would have to do the following explicitly:

Verify that the variable names is a two-element tuple
Extract the first element and bind it to firstName
Extract the second element and bind it to lastName

If we expect certain elements in the tuple to have specific values, we can verify this as part of the pattern match. For instance, we can verify that the first element of the names tuple matches "Pascal":

scala> val ("Pascal", lastName) = names
lastName: String = Bugnion

Besides tuples, we can also match on Scala collections:

scala> val point = Array(1, 2, 3)
point: Array[Int] = Array(1, 2, 3)

scala> val Array(x, y, z) = point
x: Int = 1
y: Int = 2
z: Int = 3

Notice the similarity between this pattern matching and array construction:

scala> val point = Array(x, y, z)
point: Array[Int] = Array(1, 2, 3)

Syntactically, Scala expresses pattern matching as the reverse process to instance construction. We can think of pattern matching as the deconstruction of an object, binding the object's constituent parts to variables.

When matching against collections, one is sometimes only interested in matching the first element, or the first few elements, and discarding the rest of the collection, whatever its length. The operator _* will match against any number of elements:

scala> val Array(x, _*) = point
x: Int = 1

By default, the part of the pattern matched by the _* operator is not bound to a variable. We can capture it as follows:

scala> val Array(x, xs @ _*) = point
x: Int = 1
xs: Seq[Int] = Vector(2, 3)

Besides tuples and collections, we can also match against case classes. Let's start by defining a case representing a name:

scala> case class Name(first: String, last: String)
defined class Name

scala> val name = Name("Martin", "Odersky")
name: Name = Name(Martin,Odersky)

We can match against instances of Name in much the same way we matched against tuples:

scala> val Name(firstName, lastName) = name
firstName: String = Martin
lastName: String = Odersky

All these patterns can also be used in match statements:

scala> def greet(name:Name) = name match {
  case Name("Martin", "Odersky") => "An honor to meet you"
  case Name(first, "Bugnion") => "Wow! A family member!"
  case Name(first, last) => s"Hello, $first"
}
greet: (name: Name)String

Pattern matching in for comprehensions

Pattern matching is useful in for comprehensions for extracting items from a collection that match a specific pattern. Let's build a collection of Name instances:

scala> val names = List(Name("Martin", "Odersky"), 
  Name("Derek", "Wyatt"))
names: List[Name] = List(Name(Martin,Odersky), Name(Derek,Wyatt))

We can use pattern matching to extract the internals of the class in a for-comprehension:

scala> for { Name(first, last) <- names } yield first
List[String] = List(Martin, Derek)

So far, nothing terribly ground-breaking. But what if we wanted to extract the surname of everyone whose first name is "Martin"?

scala> for { Name("Martin", last) <- names } yield last
List[String] = List(Odersky)

Writing Name("Martin", last) <- names extracts the elements of names that match the pattern. You might think that this is a contrived example, and it is, but the examples in Chapter 7, Web APIs demonstrate the usefulness and versatility of this language pattern, for instance, for extracting specific fields from JSON objects.

Pattern matching internals

If you define a case class, as we saw with Name, you get pattern matching against the constructor for free. You should be using case classes to represent your data as much as possible, thus reducing the need to implement your own pattern matching. It is nevertheless useful to understand how pattern matching works.

When you create a case class, Scala automatically builds a companion object:

scala> case class Name(first: String, last: String)
defined class Name

scala> Name.<tab>
apply   asInstanceOf   curried   isInstanceOf   toString   tupled   unapply

The method used (internally) for pattern matching is unapply. This method takes, as argument, an object and returns Option[T], where T is a tuple of the values of the case class.

scala> val name = Name("Martin", "Odersky")
name: Name = Name(Martin,Odersky)

scala> Name.unapply(name)
Option[(String, String)] = Some((Martin,Odersky))

The unapply method is an extractor. It plays the opposite role of the constructor: it takes an object and extracts the list of parameters needed to construct that object. When you write val Name(firstName, lastName), or when you use Name as a case in a match statement, Scala calls Name.unapply on what you are matching against. A value of Some[(String, String)] implies a pattern match, while a value of None implies that the pattern fails.

To write custom extractors, you just need an object with an unapply method. While unapply normally resides in the companion object of a class that you are deconstructing, this need not be the case. In fact, it does not need to correspond to an existing class at all. For instance, let's define a NonZeroDouble extractor that matches any non-zero double:

scala> object NonZeroDouble { 
  def unapply(d:Double):Option[Double] = {
    if (d == 0.0) { None } else { Some(d) }  
  }
}
defined object NonZeroDouble

scala> val NonZeroDouble(denominator) = 5.5
denominator: Double = 5.5

scala> val NonZeroDouble(denominator) = 0.0
scala.MatchError: 0.0 (of class java.lang.Double)
  ... 43 elided

We defined an extractor for NonZeroDouble, despite the absence of a corresponding NonZeroDouble class.

This NonZeroDouble extractor would be useful in a match object. For instance, let's define a safeDivision function that returns a default value when the denominator is zero:

scala> def safeDivision(numerator:Double, 
  denominator:Double, fallBack:Double) =
    denominator match {
      case NonZeroDouble(d) => numerator / d
      case _ => fallBack
    }
safeDivision: (numerator: Double, denominator: Double, fallBack: Double)Double

scala> safeDivision(5.0, 2.0, 100.0)
Double = 2.5

scala> safeDivision(5.0, 0.0, 100.0)
Double = 100.0

This is a trivial example because the NonZeroDouble.unapply method is so simple, but you can hopefully see the usefulness and expressiveness, if we were to define a more complex test. Defining custom extractors lets you define powerful control flow constructs to leverage match statements. More importantly, they enable the client using the extractors to think about control flow declaratively: the client can declare that they need a NonZeroDouble, rather than instructing the compiler to check whether the value is zero.

Extracting sequences

The previous section explains extraction from case classes, and how to write custom extractors, but it does not explain how extraction works on sequences:

scala> val Array(a, b) = Array(1, 2)
a: Int = 1
b: Int = 2

Rather than relying on an unapply method, sequences rely on an unapplySeq method defined in the companion object. This is expected to return an Option[Seq[A]]:

scala> Array.unapplySeq(Array(1, 2))
Option[IndexedSeq[Int]] = Some(Vector(1, 2))

Let's write an example. We will write an extractor for Breeze vectors (which do not currently support pattern matching). To avoid clashing with the DenseVector companion object, we will write our unapplySeq in a separate object, called DV. All our unapplySeq method needs to do is convert its argument to a Scala Vector instance. To avoid muddying the concepts with generics, we will write this implementation for [Double] vectors only:

scala> import breeze.linalg._
import breeze.linalg._

scala> object DV {
  // Just need to convert to a Scala vector.
  def unapplySeq(v:DenseVector[Double]) = Some(v.toScalaVector)
}
defined object DV

Let's try our new extractor implementation:

scala> val vec = DenseVector(1.0, 2.0, 3.0)
vec: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0)

scala> val DV(x, y, z) = vec
x: Double = 1.0
y: Double = 2.0
z: Double = 3.0

Summary

Pattern matching is a powerful tool for control flow. It encourages the programmer to think declaratively: declare that you expect a variable to match a certain pattern, rather than explicitly tell the computer how to check that it matches this pattern. This can save many lines of code and enhance clarity.

Reference

For an overview of pattern matching in Scala, there is no better reference than Programming in Scala, by Martin Odersky, Bill Venners, and Lex Spoon. An online version of the first edition is available at: https://www.artima.com/pins1ed/case-classes-and-pattern-matching.html.

Daniel Westheide's blog covers slightly more advanced Scala constructs, and is a very useful read: http://danielwestheide.com/blog/2012/11/21/the-neophytes-guide-to-scala-part-1-extractors.html.

The rest of the chapter is locked

You have been reading a chapter from

Scala for Data Science

Published in: Jan 2016Publisher: ISBN-13: 9781785281372

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Pascal Bugnion

Pascal Bugnion is a data engineer at the ASI, a consultancy offering bespoke data science services. Previously, he was the head of data engineering at SCL Elections. He holds a PhD in computational physics from Cambridge University. Besides Scala, Pascal is a keen Python developer. He has contributed to NumPy, matplotlib and IPython. He also maintains scikit-monaco, an open source library for Monte Carlo integration. He currently lives in London, UK.
Read more about Pascal Bugnion

Personalised recommendations for you

Based on your interests and search pattern

C++ Programming for Linux Systems

This book covers the essential system programming tools and helps you explore the features of C++20. It emphasizes important details to maintain code quality and tackle everyday challenges of developing software for high performance, optimization, and more.

BookSep 2023288 pages

Expert C++

Discover advanced programming techniques, the latest features of C++17 and C++20, and best practices for memory management, debugging, testing, and large-scale application design with Expert C++. Ideal for experienced developers advancing to proficient programmers and building professional-grade C++ applications.

BookAug 2023604 pages

iOS 17 Programming for Beginners

iOS 17 Programming for Beginners, Eighth Edition is your comprehensive guide to learning the art of iOS app development. Whether you dream of creating the next chart-topping app or simply want to enhance your programming skills, this book is your trusted companion on this exciting journey.

BookOct 2023604 pages4

Developer Career Masterplan

Written by industry experts that have spent the last 20+ years helping developers grow their career path towards senior developer positions and beyond. This book provides a comprehensive guide, sharing examples and stories from their global careers. By the end, you’ll have the knowledge to create a clear career progression plan as a technical professional.

BookSep 2023310 pages

Refactoring with C#

In Refactoring with C#, you’ll explore the process of safely refactoring modern .NET code using Visual Studio features, advanced unit tests, AI assistance, and custom Roslyn analyzers.

BookNov 2023434 pages

Python Real-World Projects

Amplify your developer journey by curating a dynamic project portfolio that outshines traditional resumes. Delve into the Python realm through immersive projects, mastering core concepts while constructing comprehensive modules and applications. From data acquisition prowess to impactful data visualization, Python Real-World Projects arms you with essential skills to beat the competition.

BookSep 2023478 pages5

The MVVM Pattern in .NET MAUI

The MVVM Pattern in .NET MAUI enables developers to master MVVM principles and effectively apply them to .NET MAUI. This book uses real-life examples and covers complex problems to help you successfully apply MVVM with .NET MAUI to confidently develop robust and high-performing cross-platform apps.

BookNov 2023386 pages

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Quantum Computing Algorithms

The book emphasizes intuitive ideas behind quantum algorithms in ways that other books don’t cover, striking a careful balance between no math and too much math. To get the most from this book, you should be comfortable with basic algebra and writing simple computer code. No prior understanding of quantum physics is needed to get started.

BookSep 2023342 pages

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5