Learn Scala Programming

3.8 (4 reviews total)
By Slava Schmidt
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. An Introduction to Scala 2.13

About this book

The second version of Scala has undergone multiple changes to support features and library implementations. Scala 2.13, with its main focus on modularizing the standard library and simplifying collections, brings with it a host of updates.

Learn Scala Programming addresses both technical and architectural changes to the redesigned standard library and collections, along with covering in-depth type systems and first-level support for functions. You will discover how to leverage implicits as a primary mechanism for building type classes and look at different ways to test Scala code. You will also learn about abstract building blocks used in functional programming, giving you sufficient understanding to pick and use any existing functional programming library out there. In the concluding chapters, you will explore reactive programming by covering the Akka framework and reactive streams.

By the end of this book, you will have built microservices and learned to implement them with the Scala and Lagom framework.

Publication date:
October 2018
Publisher
Packt
Pages
498
ISBN
9781788836302

 

An Introduction to Scala 2.13

At the moment of this writing, Scala 2.13 has reached its five-year milestone and approaches the first release candidate. At this point, its feature set is unlikely to change and it is safe to take a look at the new features of the update.

In this chapter, we will talk about the scope of the release, placing the main focus of the conversation on its centerpiece—the new collection library.

The following topics will be discussed in this chapter:

  • Introduction to Scala 2.13
  • New features of Scala 2.13
  • The Scala 2.13 collection library
 

Technical requirements

 

Introduction to Scala 2.13

Scala 2.13 is the latest minor update of the Scala programming language. Despite looking like a minor bump in the version number, this release is much more important than it might appear.

The reason for this is that its main focus is the reworked collection library, which is going to replace the current version introduced in version 2.8 and slightly redesigned in version 2.9.

The new collection framework is here to stay in Scala 2 and also will become a part of Scala 3.

As it is mostly a library release, the language itself is not changing a lot as compared to the previous version. Apart from the collections, the new version improves on three aspects:

  • Minimizes the core library
  • Speeds up the compiler
  • Improves user-friendliness

These details are outside of the scope of this book and we will not discuss them further.

Besides that, there is an addition of literal and singleton types, which we will discuss in detail in Chapter 2, Understanding Types in Scala, and a few minor changes to the standard library which we'll look at next, before diving into the sea of maps and lists.

Eager to look into the future? We'll take you there!

 

New features of Scala 2.13

In this section, we will discuss a few small improvements in the new version, which are not related to the collections topic and don't really belong to some bigger topic, such as optional parsing for string literals, adding names-reporting functions to case classes, methods for chaining operations, and automatic resource-management.

Optional parsing for string literals

In Scala 2.13, StringOps has been extended with methods that return Option for string-literals parsing. Supported types include all numeric types and Boolean.

The new methods can greatly simplify the processing of user-provided data without the need to wrap the calls with the exception-handling, as shown in the following example:

scala> "10".toIntOption
res3: Option[Int] = Some(10)
scala> "TrUe".toBooleanOption
res4: Option[Boolean] = Some(true)
scala> val bool = "Not True"
bool: String = Not True
scala> bool.toBooleanOption
res5: Option[Boolean] = None

The optional Boolean parsing ignores the case of the argument the same way the exception-throwing toBoolean method does.

Products can report the names of their element

This feature probably will be mostly useful for the case classes as it makes possible some generic programming without the need to resort to reflection or macros.

The following examples demonstrate how the new productElementName(idx) method can be used to build a naive JSON serializer for simple case classes:

case class User(name: String, surname: String, email: String)

def naiveToJsonString(p: Product): String =
(for { i <- 0 until p.productArity } yield
s""""${p.productElementName(i)}": "${p.productElement(i)}"""")
.mkString("{ ", ", ", " }")

Obviously, this simple iteration does not take nesting and escaping into account, but it already can produce valid results in elementary cases:

scala> val user = User("John", "Doe", "[email protected]")
user: User = User(John,Doe,[email protected])
scala> naiveToJsonString(user)
res1: String = { "name": "John", "surname": "Doe", "email": "[email protected]" }

Unfortunately, the method taking an index of the element throws an exception in the case that the index is invalid:

scala> user.productElementName(3)
java.lang.IndexOutOfBoundsException: 3
at User.productElementName(<console>:1)
... 38 elided

We will discuss why throwing exceptions is not the best approach, as well as viable alternatives, in Chapter 6, Exploring Built-In Effects.

Added methods for chaining operations

Via import scala.util.chaining._, it is now possible to add tap and pipe methods to instances of any type. The functionality is provided by an implicit conversion to ChainingOps. We will look at implicits in detail in Chapter 4, Getting to Know Implicits and Type Classes.

The pipe method applies a given function to the value and returns the result. It might be helpful in situations where it is desirable to convert nested function calls into the fluid-interface-like code. The following snippet shows an example of an imaginary user database with nested function calls chained via pipe.

Consider the following the database interface:

object UserDb {
def getById(id: Long): User = ???
def update(u: User): User = ???
def save(u: User): Boolean = ???
}

We could apply all three actions to the user at once:

import UserDb._
val userId = 1L
save(update(getById(userId)))

pipe allows us to represent this in a more readable format:

getById(userId).pipe(update).pipe(save)

Arguably the same (or an even clearer) result could be achieved by combining functions before applying them:

val doEverything = (getById _).andThen(update).andThen(save)
doEverything(userId)

We will look at functions in general, and function composition in particular, in Chapter 3, Deep Dive into Functions.

tap applies a function given as an argument solely for the side-effects it produces and returns the original value. It might be useful, for example, for logging purposes and the simplest kind of performance measurements.

The next snippet demonstrates an elementary side-effect-causing performance-tracking implementation that utilizes a global variable:

scala> import scala.util.chaining._
import scala.util.chaining._
scala> val lastTick = new java.util.concurrent.atomic.AtomicLong(0)
lastTick: java.util.concurrent.atomic.AtomicLong = 0
scala> def measure[A](a: A): Unit = {
| val now = System.currentTimeMillis()
| val before = lastTick.getAndSet(now)
| println(s"$a: ${now-before} ms elapsed")
| }
measure: [A](a: A)Unit
scala> def start(): Unit = lastTick.set(System.currentTimeMillis())
start: ()Unit
scala> start()
scala> val result = scala.io.StdIn.readLine().pipe(_.toIntOption).tap(measure)
None: 291 ms elapsed
result: Option[Int] = None
scala> val anotherResult = scala.io.StdIn.readLine().pipe(_.toIntOption).tap(measure)
Some(3456): 11356 ms elapsed
anotherResult: Option[Int] = Some(3456)

Here, we defined a global value of the AtomicLong type to store the last measured timestamp. Then we define a polymorphic measure method that captures the time between the moment of the last measurement and now, and a start method to reset the clock. After that, we can use the tap method to track the execution times of our actions.

We will talk about types and polymorphism in Chapter 2, Understanding Types in Scala, side-effects and more general concept of effects in Chapter 8, Dealing with Effects, and show drawbacks of having global variables and a global state in Chapter 9, Familiarizing Yourself with Basic Monads.

Automatic Resource Management

Scala 2.13 adds a practical way to automatically manage resources. We will discuss other ways to manage resources and implement dependency-injection in Chapter 9, Familiarizing Yourself with Basic Monads and 10. scala.util.Using allows us to do this in a familiar side-effecting way. All operations on the resource are wrapped in a Try, which we'll talk about in Chapter 6, Exploring Built-In Effects. If Exceptions is thrown, the first one is returned within a Try. The exception-handling is quite sophisticated in some corner cases and we invite the reader to consult ScalaDoc for a detailed description of it.

Using is a class that takes some resources as a by-name parameter. The resource can be anything that has a type class instance for scala.util.Resource available. Such an instance for java.lang.AutoCloseable is provided in the standard library. We will study type classes in Chapter 4, Getting to Know Implicits and Type Classes. Using also has a monadic interface, which allows us to combine multiple resources in for-comprehensions. We'll discuss monads in Chapter 9, Familiarizing Yourself with Basic Monads.

Here is an example of the practical application of Using. We will define a resource that implements AutoCloseable and a few of these resources in for-comprehension as a source of the data:

scala> import scala.util.{Try, Using}
import scala.util.{Try, Using}
scala> final case class Resource(name: String) extends AutoCloseable {
| override def close(): Unit = println(s"Closing $name")
| def lines = List(s"$name line 1", s"$name line 2")
| }
defined class Resource
scala> val List(r1, r2, r3) = List("first", "2", "3").map(Resource)
r1: Resource = Resource(first)
r2: Resource = Resource(2)
r3: Resource = Resource(3)

scala> val lines: Try[Seq[String]] = for {
| u1 <- Using(r1)
| u2 <- Using(r2)
| u3 <- Using(r3)
| } yield {
| u1.lines ++ u2.lines ++ u3.lines
| }
Closing 3
Closing 2
Closing first
lines: scala.util.Try[Seq[String]] = Success(List(first line 1, first line 2, 2 line 1, 2 line 2, 3 line 1, 3 line 2))

The output in the console demonstrates that the result contains lines from all of the resources, and the resources themselves are automatically closed in the reverse order.

Now, after this small warm-up, we are ready to dive into the foundation of version 2.13the new collection library.

 

The Scala 2.13 Collection Library

Scala 2.13 delivers a new collection library, for historical reasons it is also known as "collection - strawman". The refactoring of the library pursued a few main goals, such as fixing common gotchas of the previous version, simplifying its implementation and internal structure, as well as usage and backward-compatibility, achieving better integration with lazy collections and java streams and cleaner API separation between mutable and immutable collections, improving performance, and, last but not least, minimizing the migration effort from Scala 2.12's collections.

As a result, we have a library that is mostly source-compatible with the previous version, has many old methods and types (such as Traversable, TraversableOnce, and Stream) deprecated, and has a simpler internal hierarchy.

This book assumes that the reader has a rudimentary understanding of Scala collections. With this assumption, the next section will take a holistic approach and focus on giving a consistent overview of the new collection framework.

The next diagram represents the top-level hierarchy of the collection library:

Here and further, we will pretend to always have import scala.collections._ in the scope and use the following colour encoding in our diagrams:

Each of the traits describes the structure, the essence of the collection. As the name suggests, IterableOnce can be iterated over only one time. Iterable softens this constraint so that it is possible to iterate over the collection multiple times. Seq adds a notion of succession to the elements of the collection, Set adds a constraint of the uniqueness of its elements, and Map changes the type of the collection from a single element, A, to a pair of key, K, and value, V.

As mentioned, in the spirit of the separation of concerns, these traits cover only the structural characteristics. The operations defined for the specific type are placed in the helper traits carrying the Ops suffix in the name. These traits form a hierarchical structure similar to the previous one, as shown here:

Where the "normal" traits had only one type parameter, the type of the element, the Ops have three of them. In addition to the type of the element, A, the C type describes the specific representation type of the collection this trait is mixed into and thus to the return type of the first-order methods defined on this collection. The CC type refers to the representation type that can be returned by the higher-order methods, or the type constructor. We will see later in this chapter how this works in practice.

Because the inheritance tree is structured as it is, IterableOps and IterableOnceOps are effectively mixed into every collection implementation in the library. Three traits on the bottom just add some more methods, unique to the specific collection type, and override some of the definitions for efficiency. Both Iterable*Ops traits define more than a hundred of methods and they are the reason the Scala collection library is very consistent and homogenous.

Because of the importance of IterableOnceOps and IterableOps, we will take a detailed look at them in the next section. After that, we will explore the unique features of the specialized collections.

IterableOnceOps

IterableOnceOps represents a blueprint for the collections that can be traversed one or multiple times. It defines a few abstract methods that must be implemented by every collection and a number of concrete methods implemented in terms of an iterator available from IterableOnce. The concrete methods provide default, if possible, lazy, implementations and fall into one of the following categories:

  • Size operations: isEmpty, nonEmpty, size, knownSize, and isTraversableAgain check the collection for (non) emptiness or return its size. knownSize is an optimization that returns -1 if the size cannot be determined without iterating over the collection. isTraversableAgain returns false for IterableOnce.
  • Element tests: forall, exists, and count check whether all, at least one, or some number of elements satisfy the given predicate.
  • String operations: mkString and addString. These methods with different argument sets provide a possibility to build alternative string representation.
  • Conversions to another collections: copyToArray, toList, toMap, to, toSet, toSeq, toIndexedSeq, toBuffer, and toArray. These methods copy or convert an Iterable into another collection. The to method is special in this list because it allows us to return any type of the collection that has a Factory available. We will look at it in more detail soon.
  • Fold and reduce: foldLeft, foldRight, reduce, reduceLeft, reduceRight, reduceOption, reduceLeftOption, and reduceRightOption apply a binary operation to the elements of the collection. The reduce*Option methods handle the case of an empty collection gracefully by returning None.
  • Numeric combinations: sum and product calculate the sum or product of the elements if there is an implicit Numeric[B] such that B >: A is available.
  • Ordering combinations: min, minOption, max, maxOption, maxBy, maxByOption, minBy, and minByOption find an element of the collection that satisfies giving predicate if there is an implicit Ordering[B] with B >: A available. The *Option methods return None for an empty collection instead of throwing an exception.
  • Element retrieval: collectFirst and find. Choose an element that satisfies a given condition.
  • Equality: corresponds is an alternative way to compare collections. Satisfied if every element of this collection relates to matching element of another collection by given predicate.

Abstract methods fall into one of the following categories:

  • Subcollection retrieval: filter, filterNot, take, takeWhile, drop, dropWhile, slice, and span. Take or discard elements that satisfy the given predicate or range, from the whole collection or beginning of it.
  • Mapping: map, flatMap, collect, and scanLeft. Transforms elements of the collection by applying some function and possibly filtering the results.
  • Zippers: zipWithIndex adds an index to all elements of the collection.

IterableOps

IterableOps extends IterableOnceOps and contains methods that is impossible to implement without a possibility to iterate over the collection multiple times.

They fall into the following categories:

  • Element retrieval: head, headOption, last, and lastOption return the first or last element of the collection throwing NoSuchElementException or returning None for an empty collection.
  • Size: sizeCompare is an optimization that allows us to efficiently compare the size of the collection with given value.
  • Subcollection retrieval: partition, partitionWith, splitAt, takeRight, dropRight, grouped, sliding, tail, init, groupBy, groupMap, groupMapReduce, tails, and inits. These split the collection as defined by some predicate or index, take or drop elements from the end, group elements by some criteria or predicate possibly applying transformative function, and discard first or last elements.
  • Mapping: scanRight produces a collection containing the cumulative results of applying the giving function starting from the end of the collection.
  • Addition: concat, ++ returns another collection containing all elements of this collection and a collection provided as an argument.
  • Zippers: zip, zipAll, unzip, and unzip3 combine the elements of the collection with the elements of another collection into a product, or split them into separate collections.
  • Transformation: transpose transforms the collection of collections by turning rows into columns and vice versa.

The following methods that defined the abstract in IterableOnceOps got a concrete default implementation in IterableOps: filter, filterNot, take, takeWhile, span, drop, dropWile, slice, scanLeft, map, flatMap, flatten, collect, and zipWithIndex. isTraversableAgain is overriden to return true.

It's worth noting that Iterable and IterableOnce do not support a general-equality operation, it is defined on specific collection subtypes. Because of this, it is impossible to compare these types directly using the equality operation, as the following example suggests:

scala> Set(1,2,3) == Seq(1,2,3)
res4: Boolean = false

Also there are three special methods that deserve our additional attention because they introduce types we haven't met yet:

  • def withFilter(p: A => Boolean): collection.WithFilter[A, CC]
  • def iterableFactory: IterableFactory[CC]
  • def view: View[A]

Let's discuss them quickly before moving on to more specific collection types.

WithFilter

WithFilter is a template class that contains the map, flatMap, foreach, and withFilter methods of Iterable. It allows us to specialize mapping and filtering operations for distinguished collections.

Because of its technical nature, we won't go into further details here.

IterableFactory

trait IterableFactory[+CC[_]] is a base trait for companion objects for specific collections which provides a number of operations to create a specific collection with the type specified by the CC type constructor; this is sometimes called target-type driven building because the type of the source collection is ignored. Most of the companion objects in the collection library extend this trait, which makes it possible to use them in places where IterableFactory is expected.

As this is the main abstraction that allows to build a collection from scratch, it is useful to know which methods it supplies. All of them return CC[A]. The following table contains a short summary:

def from[A](source: IterableOnce[A]) Creates a target collection from existing IterableOnce.
def empty[A]: CC[A] An empty collection, often defined as an object.
def apply[A](elems: A*): CC[A] Creates a collection from the elems given as var-arg.
def iterate[A](start: A, len: Int)(f: A => A): CC[A] Fills the collection with values taken from the result of the application of f on start, then on produced values and so on, len number of times.
def range[A : Integral](start: A, end: A, step: A): CC[A] Collection containing increasing integers [start, end-1] with difference between successive numbers of step. There is also a version of this method with default value of step = 1.
def fill[A](n: Int)(elem: => A): CC[A] Fills the collection with n evaluations of elem. There are variations of this function that go up to five dimensions.
def tabulate[A](n: Int)(f: Int => A): CC[A] The same as fill, but using index as an argument for the evaluation. Similarly, there are variations of this function that go up to five dimensions.
def concat[A](xss: Iterable[A]*): CC[A] Concatenates all argument collections into a single collection.
def unfold[A, S](init: S)(f: S => Option[(A, S)]): CC[A] Calls f to produce the element of the collection using and modifying the internal state starting with the init state.

Arguably, IterableFactory provides a lot of different possibilities to create a collection of a desired type.

View

View has been reimplemented in the new version of the library. Now it represents a reified Iterator operations.

Reification is the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language (https://en.wikipedia.org/wiki/Reification_(computer_science)).

This means that the Iterator methods are represented as a subclasses of View and encapsulate transformations to apply. The evaluation happens at the moment the view is converted to the strict collection type, or traversed, for example using the foreach method. Views don't remember the type of the source collection. This can be demonstrated by the following example. First, we define a generic transformation that might be strict or lazy, depending on the type of the collection given as an argument:

def transform[C <: Iterable[Char]](i: C): Iterable[Char] = i 
map { c => print(s"-$c-"); c.toUpper }
take { println("\ntake"); 6 }

Next, for each transformation step, we print out its result in the console at the moment the step happens. Now we can compare lazy and strict collection behaviors:

val str = "Scala 2.13"
val view: StringView = StringView(str)
val transformed = transform(view) // A
val strict = transform(str.toList) // B
print("Lazy view constructed: ")
transformed.foreach(print) // C
print("\nLazy view forced: ")
println(transformed.to(List)) // D
println(s"Strict: $strict") // E

This snippet produces the following output in the REPL:

take
-S--c--a--l--a-- --2--.--1--3-
take
Lazy view constructed: -S-S-c-C-a-A-l-L-a-A- -
Lazy view forced: -S--c--a--l--a-- -List(S, C, A, L, A, )
Strict: List(S, C, A, L, A, )

In the first line, we can see that the take method is always evaluated strictly regardless of the underlying collection typethis is commented as A in the preceding code. The second and third lines show the strict evaluation for List[Char], line B in the code. Lines 4 and 5 demonstrate that View[Char] is then evaluated twice, each time at the moment it is forced, once by calling foreach (line C) and once by converting it to the List (line D). Also interesting is that map is only applied to the results of the take method even given the fact that map is the first transformation step in the chain.

Set

Set is a base trait for collections that have a notion of uniques of the elements. It is defined as trait Set[A] extends Iterable[A] with SetOps[A, Set, Set[A]] with Equals. We can see that it is effectively an Iterable that has some additional operations defined in SetOps and adds a notion of equality among sets. The sub-hierarchy of Set is represented in the following diagram:

The previously mentioned SetOps adds a few methods on top of IterableOps. These methods are shown here:

  • Element retrieval: contains and apply return true if this set contains a given element.
  • Equality: subsetOf and subsets check whether this set is a subset of another set, or return all subsets of this set, possibly with a given size.
  • Combinations with another set: intersect, &, diff, &~, concat, ++, union, and |. These methods compute the intersection, difference, or union of this and another set.

There are few classes in the hierarchy that augment Set with further properties:

  • SortedSet extends Set with SortedOps[A, +C] and has two immutable and two mutable implementations—two TreeSets and two BitSets. SortedOps embodies following methods that depend on the notion of Ordering:
  • Key retrieval: firstKey and lastKey return the first or last element of this collection.
  • Subcollection retrieval: range, rangeFrom, rangeUntil, and rangeTo create a ranged projection of this collection, satisfying given criteria.

Because of the overloading, SortedSet has many overloaded methods defined twice, with and without ordering. If an operation is intended to be applied to the underlying unsorted Set, the type has to be coerced:

scala> import scala.collection.SortedSet
import scala.collection.SortedSet
scala> val set = SortedSet(1,2,3)
set: scala.collection.SortedSet[Int] = TreeSet(1, 2, 3)
scala> val ordered = set.map(math.abs)
ordered: scala.collection.SortedSet[Int] = TreeSet(1, 2, 3)
scala> val unordered = set.to(Set).map(math.abs)
unordered: scala.collection.immutable.Set[Int] = Set(1, 2, 3)

Please note that direct type-ascription will not work in the case of Set because its definition is invariant:

scala> val set1: Set[Int] = SortedSet(1,2,3)
^
error: type mismatch;
found : scala.collection.SortedSet[Int]
required: Set[Int]

The invariance of Set is related to the fact that Set[A] extends a function, A => Boolean, that returns true if the set contains a given element. Thus, sets can be used in places where such one argument function is expected:

scala> ordered.forall(set)
res3: Boolean = true

There are four more specific set implementations in addition to TreeSet and BitSet:

  • ListSet implements immutable sets using a list-based data structure.
  • The immutable HashSet realizes immutable sets using a Compressed Hash-Array Mapped Prefix-tree (CHAMP)
  • The mutable HashSet and LinkedHashSet implement mutable sets using a hash table, storing the data unordered and ordered, respectively.

The sets are closely related to Map, which represents a collection with elements represented as a pair of keys and values.

Map

Map is defined as trait Map[K, +V] extends Iterable[(K, V)] with MapOps[K, V, Map, Map[K, V]] with Equals which makes it an Iterable over pairs of key, K, and value, V. It also defines a notion of equality among maps. The class hierarchy for Map is represented on the following diagram:

There are three different MapOps, one general for both mutable and immutable, and one specific for each of these forms.

MapOps extends IterableOps with the following specific operations:

  • Element retrieval: get, getOrElse, apply, applyOrElse, default, contains, and isDefinedAt. These methods allow us to retrieve a value or check whether a value is present by a given key, optionally returning a default value or throwing an exception if the key can't be found.
  • Subcollection retrieval: keySet, keys, values, keysIterator, and valuesIterator allow us to get a keys or values in different forms.
  • Mapping: map, flatMap, and collect are transforming and optionally filtering the pairs of keys and values.
  • Addition: concat returns a new collection with elements of both maps combined.

immutable.MapOps adds the following methods on top of MapOps:

  • Element removal: remove, removeAll, -- removes one or all given elements from the map returning new map.
  • Element updates: updated and + update an element with the given key returning new map.
  • Mapping: transform applies a given function to all elements, producing a new map with the returned results as values.

mutable.MapOps has a different set of methods as compared to the mutable one:

  • Element addition: put adds a new value or updates existing one.
  • Element update: updated, +, and getOrElseUpdate updates a value in place.
  • Element removal: remove and clear remove one or all elements of the map.
  • Filtering: filterInPlace retains only mappings that satisfy the predicate.
  • Mapping: mapValuesInPlace applies a transformation to the values of the map, storing returned results as values.

The general Map definition has quite a few specialized subtypes, as shown in the preceding diagram. We will take a quick look at them now.

SortedMap

SortedMap is similar to SortedSet. It has a two implementations, a mutable and immutable TreeMap, and provides a few methods defined in terms of SortedOps such as:

  • Subcollection retrieval: iteratorFrom, keysIteratorFrom, valuesIteratorFrom, and rangeTo give us a way to get elements of the map as an iterator.
  • Element retrieval: firstKey, lastKey, minAfter, and maxBefore allow us to retrieve an element that satisfies some ordering condition.

HashMap

HashMap is also available in two flavors—immutable and mutable.

The immutable HashMap is implemented using a CHAMP tree.

The mutable HashMap implements mutable maps using a hashtable. A hash table stores its elements in an array. The hash code of the item is used to calculate the position in the array.

MultiMap

MultiMap is a trait for mutable maps that have multiple values assigned to a key.

It defines the addBinding, removeBinding and entryExists methods, which can be used to query or manipulate entries for a key.

SeqMap

SeqMap is a generic abstraction for ordered immutable maps. SeqMap itself exists in mutable and immutable forms. These forms have few different implementations:

  • The immutable ListMap implements immutable maps using a list-based data structure. The methods traversing ListMap visit its elements in the order they were inserted.
  • The mutable ListMap is a simple mutable map backed by a list. It preserves insertion order as its immutable sibling does.
  • VectorMap exists only in immutable form. It is implemented using a vector/map-based data structure and preserves insertion order. It has constant lookup but slower other operations.
  • LinkedHashMap is a mutable map whose implementation is based on a hashtable and preserves the insertion order if iterated over.

Seq

Seq is probably the most ubiquitous collection in the library. Like Map, it has a notion of succession of elements and the elements have indices. It is defined as trait Seq[+A] extends Iterable[A] with PartialFunction[Int, A] with SeqOps[A, Seq, Seq[A]] with Equals. Also similar to map, Seq specifies support for the equality relation and also extends PartialFunction, which accepts an index of the element as a parameter. As there are a lots of classes implementing Seq, we will take a gradual approach and look at them level by level. The direct children of Seq are shown in the following diagram:

scala.Seq, known from previous Scala versions, is now replaced by scala.collection.immutable.Seq.

As with other collections, SeqOps extend IterableOps by adding quite a few methods:

  • Element retrieval: apply retrieves an element with a given index.
  • Indexing and search: segmentLength, isDefinedAt, indexWhere, indexOf, lastIndexOf, lastIndexWhere, indexOfSlice, lastIndexOfSlice, containsSlice, contains, startsWith, endsWith, indices, and search. These methods allow us to retrieve information about the presence or indexes of elements or subsequences given some predicate or element value.
  • Size: length and lengthCompare provide efficient operations to retrieve the length of the collection.
  • Addition: prepend, +, appended, :+, prependAll, ++, appendedAll, :++, concat, and union. Can be used to append or prepend one or multiple elements to the collection.
  • Filtering: distinct and distinctBy remove duplicates, possibly given some predicate.
  • Reversal: reverse and reverseIterator return a new collection with elements in the reversed order.
  • Sorting: sorted, sortWith, and sortBy sort this collection by some implicit Ordering, or by a given function, or both.
  • Equality: sameElements and corresponds check whether this collection contains the same elements in the same order as given, using equality-checking or the provided comparison function.
  • Subcollection retrieval: permutations and combinations. These methods allow us to retrieve a subcollection(s) that satisfies given conditions.
  • Updates: diff, intersect, patch, updated, and update (mutable). Modify elements of this collection using another collection or element and returning another collection (except the last method defined on mutable.Seq, which update elements in place).

Each of the Seq direct descendants has its own specific properties and a subtree of implementations. We'll breeze through them now.

IndexedSeq

IndexedSeq does not introduce new operations, at least not in its immutable incarnation, but overrides a lots of methods defined in SeqOps to provide more efficient implementations. There are four classes implementing it:

The mutable. IndexedSeq adds the following mutation options: mapInPlace, sortInPlace, sortInPlaceWith, and sortInPlaceBy.

The mutable. ArraySeq is a collection representing an Array. It defines an array method returning an underlying array, and an elemTag method that returns the tag of the element type needed to properly support different kinds of arrays as required by the JVM. Because of this requirement, it has separate implementations for all primitive types including numeric types (in addition to ofByte, there are implementations for all other numeric primitives, not shown on the diagram) and Boolean, AnyRef and Unit.

The immutable. ArraySeq was added in the version 2.13. It is an effectively an immutable sequence that wraps an array and is used to pass varargs parameters. It has the same descendants as its mutable cousin.

Range is an immutable structure that contains integer values. It is defined by start, end, and a step. There are two additional methods available: isInclusive, which is true for Range.Inclusive and false for Range.Exclusive, and by, which creates the new range with a different step but the same start and end.

Vector is an immutable structure that provides constant-time access and updates and fast append and prepend. Because of this, Vector is the default implementation for IndexedSeq, as demonstrated in the following snippet:

scala> IndexedSeq.fill(2)("A")
res6: IndexedSeq[String] = Vector(A, A)

WrappedString is an immutable wrapper over some String. It extends strings with all of the operations defined in IndexedSeqOps.

LinearSeq

Linear sequences have the notion of a head and a tail. The definition looks like trait LinearSeq[+A] extends Seq[A] with LinearSeqOps[A, LinearSeq, LinearSeq[A]] and the class diagram is shown here:

There are three representatives of LinearSeq, all are immutable:

  • List defines three symbolic methods which provide a nice syntax for pattern-matching and building lists. :: prepends element to the list, ::: prepends all elements of a given list, and reverse_::: prepends all elements of a given list but in the reverse order.
  • LazyList is a new immutable collection available since Scala 2.13. It implements a list with a head and a tail, which are not evaluated until needed. As it is superior to Stream, which is only lazy in the tail, Stream is now deprecated. LazyList has two additional methods, force which evaluates it, and lazyAppendAll which lazily appends a given collection to this list.
  • Queue in this hierarchy is also immutable. It allows a first-in-first-out (FIFO) insertion and retrieval of the elements. For this functionality, it defines the enqueue, enqueueAll, dequeue, dequeueOption, and front methods.

Buffers

Buffers conclude our rush through the collection library. In essence, Buffer is just a Seq that can grow and shrink. This sub-hierarchy exists only in immutable form, though IndexedBuffer inherits from both Buffer and IndexedSeq, as shown by the next diagram:


Let's take a look at the methods that these collections define, in addition to the definitions inherited from SeqOps:

  • Buffer defines methods to add or remove one or more elements in place or returning new buffer: prepend, append, appendAll, +=:, prependAll, ++=:, insert, insertAll, remove, subtractOne, trimStart, trimEnd, patchInPlace, dropInPlace, dropRightInPlace, takeRightInPlace, sliceInPlace, dropWhileInPlace, takeWhileInPlace, and padToInPlace.
  • ListBuffer is a concrete buffer implementation baked by a List. In addition to other discussed methods, it provides prependToList, which allows us to prepend this collection to another list and a triple of mapInPlace, flatMapInPlace, and filterInPlace, giving us a possibility to modify elements in place.
  • Take a Buffer, add an IndexedSeq, and you'll get an IndexedBuffer. Similar to ListBuffer, it provides the flatMapInPlace and filterInPlace methods.
  • ArrayBuffer is a concrete implementation of IndexedBuffer that uses an array to store its elements and has a constant time for append, update, and random access. It has a sizeHint method, which can be used to enlarge the underlying array. It is a default implementation instantiated if mutable.Seq is created.
  • ArrayDeque is another efficient collection that emerged in the 2.13 version. It implements a double-ended queue that internally uses a resizable circular buffer. This allows to have a constant time for append, prepend, remove first, remove last, and random-access operations. There are lots of additional methods available on this collection, mostly because of the notion of the second-end: removeHeadOption, removeHead, removeLastOption, removeLast, removeAll, removeAllReverse, removeHeadWhile, removeLastWhile, removeFirst, removeAll, clearAndShrink, copySliceToArray, and trimToSize.
  • The Queue in this hierarchy is mutable. It is based on ArrayDeque and allows us to insert and retrieve elements in the FIFO manner. The following methods are available for that: enqueue, enqueueAll, dequeue, dequeueFirst, dequeueAll, and dequeueWhile.
  • The Stack is similar to Queue, but it implements in last-in-first-out (LIFO) order instead of FIFO. The methods it defines are formulated in the corresponding terms: push, pushAll, pop, popAll, popWhile, and top.

Scala Collection Contrib library

Needless to say, the standard Scala collection library is very rich and provides collections for most of the common use cases. But of course, there are some other structures that might be useful in a number of specific situations. The Scala Collection Contrib module is Scala's way of having both a stable standard library and some extra features. In a sense, this module is an incubator for the new collection types; types that prove to be useful for a broad audience will presumably be incorporated into the standard library in further Scala versions.

Currently, there are four collection types available in the module, each both mutable and immutable:

  • MultiDict
  • SortedMultiDict
  • MultiSet
  • SortedMultiSet

Also this library provides a possibility to define additional operations on existing collections via an implicit enrichment. The following import is required to make it available:

import scala.collection.decorators._

And it delivers these methods:

  • On Seq: intersperse and replaced
  • On Map: zipByKey, join, zipByKeyWith, mergeByKey, mergeByKeyWith, fullOuterJoin, leftOuterJoin, and rightOUterJoing

Please consult the module documentation at https://github.com/scala/scala-collection-contrib for further details.

 

Summary

Scala 2.13 is a minor update of Scala with the main focus on the redesigned collection library. The few small additions to the standard library, such as automatic resource management, just accentuate this fact.

The new collection library mainly consists of two intermixed inheritance hierarchies with a similar shape. Members of the first hierarchy describe the structure of the collection and members of the second hierarchy—operations available on this collection type. Because of the inheritance relations, the collections situated lower in the tree define additional methods for more specific collections and override methods defined by the parent traits to provide more efficient implementation as required.

The three main collection types are Seq, Set, and Map. Each of these types has multiple implementations that are useful in specific situations. Set is also a function of one argument; Seq and Map are PartialFunctions.

Most of the collections are available in mutable and immutable forms.

In addition to the collection hierarchies, there is a concept of View, which is a reified definition of iterators’ operations and can be used to lazily apply transformations to the collection. Another related abstraction is IterableFactory, which implements some general ways to create collection instances and to perform conversions between collection representations.

In the next chapter, we will shift our focus from the new features of version 2.13 to a general exploration of Scala, starting with its type system.

 

Questions

  1. Describe two ways to make it possible for some resource, R, to use it together with scala.util.Using resource management utility.
  2. How can an instance of a Set and an instance of a List be compared to each other?
  3. Name the default concrete implementation for an immutable Seq.
  4. Name the default concrete implementation for an immutable indexed Seq.
  5. Name the default concrete implementation for a mutable Seq.
  6. Name the default concrete implementation for a mutable indexed Seq.
  7. It is sometimes said that List.flatMap is more powerful than it was expected to be. Can you explain why?
  8. Describe a way to map over a collection multiple times using different functions but without producing intermediate collections.
 

Further reading

  • Mads Hartmann, Ruslan Shevchenko, Professional Scala: Write concise and expressive, type-safe code in an environment that lets you build for the JVM, browser, and more.
  • Vikash Sharma, Learning Scala Programming: Learn how to write scalable and concurrent programs in Scala, a language that grows with you.

About the Author

  • Slava Schmidt

    Slava Schmidt is a software developer living in Germany. During his career, he has used a wide spectrum of programming languages, ranging from Assembler to Haskell, and progressed from an intern programmer, via head of engineering, to a Scala contractor. In his current role, he helps to develop and bring into production a number of Scala projects of various sizes and complexities for customers from different industries.

    He is a contributor to open source projects related to Scala and Akka. As a passionate conference speaker, he regularly gives talks about technologies to share his experiences with other developers.

    Browse publications by this author

Latest Reviews

(4 reviews total)
Not a rating of the book but the purchase experience.
Perfect. Not any problems
The book is very good, though not for beginners. Your customer service was very helpful in upgrading my eBook copy to the print version. But the shipment took exactly four(!) weeks, which is definitely too long.
Learn Scala Programming
Unlock this book and the full library FREE for 7 days
Start now