
The Clojure Workshop

By Joseph Fahey , Thomas Haratyk , Scott McCaughie and 2 more
About this book
The Clojure Workshop is a step-by-step guide to Clojure and ClojureScript, designed to quickly get you up and running as a confident, knowledgeable developer. Because of the functional nature of the language, Clojure programming is quite different to what many developers will have experienced. As hosted languages, Clojure and ClojureScript can also be daunting for newcomers because of complexities in the tooling and the challenge of interacting with the host platforms. To help you overcome these barriers, this book adopts a practical approach. Every chapter is centered around building something. As you progress through the book, you will progressively develop the 'muscle memory' that will make you a productive Clojure programmer, and help you see the world through the concepts of functional programming. You will also gain familiarity with common idioms and patterns, as well as exposure to some of the most widely used libraries. Unlike many Clojure books, this Workshop will include significant coverage of both Clojure and ClojureScript. This makes it useful no matter your goal or preferred platform, and provides a fresh perspective on the hosted nature of the language. By the end of this book, you'll have the knowledge, skills and confidence to creatively tackle your own ambitious projects with Clojure and ClojureScript.
Publication date:
January 2020
Publisher
Packt
Pages
800
ISBN
9781838825485

 

2. Data Types and Immutability

Overview

In this chapter, we start by discovering the concept of immutability and its relevance in modern programs. We then examine simple data types such as strings, numbers, and Booleans, highlighting subtle differences between environments such as Clojure and ClojureScript. After a first exercise, we move on to more elaborate data types: collections such as lists, vectors, maps, and sets, learning along the way which to use in different situations. After touching on the collection and sequence abstractions, we learn new techniques for working with nested data structures, before moving on to the final activity: implementing our very own in-memory database.

By the end of this chapter, you will be able to work with the commonly used data types in Clojure.

 

Introduction

Computer hardware has evolved dramatically in the last few decades. On a typical computer, storage and memory capacity have both increased a millionfold compared to the early 1980s. Nonetheless, standard industry practices in software development and mainstream ways of programming are not that different. Programming languages such as C++, Java, Python, and Ruby still typically encourage you to change things in place, and to use variables and mutate the state of a program, that is, to do things as if we were programming on a computer with a minimal amount of memory. However, in our quest for efficiency, better languages, and better tools, we reach for higher-level languages. We want to get further away from machine code. We want to write less code and let the computers do the tedious work.

We don't want to think about the computer's memory anymore, such as where a piece of information is stored and whether it's safe and shareable, any more than we want to know about the order of the instructions in the CPU. It is a distraction from the problems we are trying to solve, which are already complicated enough. If you have ever tried to do some multithreading in the languages cited previously, you will know the pain of sharing data between threads. Yet leveraging multicore CPUs with multithreaded applications is an essential part of optimizing a modern program's performance.

In Clojure, we work almost exclusively with immutable data types. They are safe to share, easy to fabricate, and improve the readability of our source code. Clojure provides the necessary tools to write programs with the functional programming paradigm: first-class functions, which we will discover in the next chapter, and immutable data types, which let us avoid mutating and sharing the state of an application.

Let's dust off the dictionary and look up the definition of immutable: "that cannot be changed; that will never change." This doesn't mean that a piece of information cannot change over time, but that we record those modifications as a series of new values. "Updating" an immutable data structure provides a new value derived from the original value, while the original value remains unchanged. Data structures that preserve previous versions of themselves in this way are called persistent data structures.

Intuitively, we may think that such a persistent data structure would negatively impact performance, but it's not as bad as it seems. They are optimized for performance, and techniques such as structural sharing bring the time complexity of all operations close to classic, mutable implementations.

In other words, unless you are programming an application that requires extraordinarily high performance, such as a video game, the benefits of using immutable data structures far outweigh the small loss in performance.
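
As a minimal sketch of what "updating" an immutable value looks like, the snippet below adds an element to a vector (a collection type covered later in this chapter) using conj from clojure.core. The names scores and more-scores are made up for illustration:

```clojure
;; A vector of numbers; the name is hypothetical.
(def scores [10 20 30])

;; conj does not modify scores: it returns a new vector
;; derived from the original one.
(def more-scores (conj scores 40))

(prn scores)       ; [10 20 30]
(prn more-scores)  ; [10 20 30 40]
```

The original vector is left untouched, so any other part of the program holding on to scores keeps seeing the same value.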

 

Simple Data Types

A data type designates what kind of value a piece of data holds; it is a fundamental way of classifying data. Different types allow different kinds of operations: we can concatenate strings, multiply numbers, and perform logical operations with Booleans. Clojure is dynamically typed, with a strong emphasis on practicality: we don't explicitly assign types to values, but those values still have a type.

Clojure is a hosted language and has three notable, major implementations, targeting Java, JavaScript, and .NET. Being a hosted language is a useful trait that allows Clojure programs to run in different environments and take advantage of the ecosystem of its host. Regarding data types, it means that each implementation has different underlying data types, but don't worry, as those are just implementation details. For a Clojure programmer, it does not make much difference: if you know how to do something in Clojure, you likely know how to do it in, say, ClojureScript.

In this topic, we will go through Clojure's simple data types, listed here. Note that all of the following types are immutable:

  • Strings
  • Numbers
  • Booleans
  • Symbols
  • Keywords
  • Nil

Strings

Strings are sequences of characters representing text. We have been using and manipulating strings since the first exercise of Chapter 1, Hello REPL!.

You can create a string by simply wrapping characters with double quotes ("):

user=> "I am a String"
"I am a String"
user=> "I am immutable"
"I am immutable"

String literals are only created with double quotes, and if you need to use double quotes in a string, you can escape them with the backslash character (\):

user=> (println "\"The measure of intelligence is the ability to change\" - Albert Einstein")
"The measure of intelligence is the ability to change" - Albert Einstein
nil

Strings cannot be changed; they are immutable. Any function that claims to transform a string actually yields a new value:

user=> (def silly-string "I am Immutable. I am a silly String")
#'user/silly-string
user=> (clojure.string/replace silly-string "silly" "clever")
"I am Immutable. I am a clever String"
user=> silly-string
"I am Immutable. I am a silly String"

In the preceding example, calling clojure.string/replace on silly-string returned a new string with the word "silly" replaced with "clever". However, when evaluating silly-string again, we can see that its value has not changed. The function returned a different value and did not change the original string.

Although a string is usually treated as a single unit of data representing text, strings are also collections of characters. In the JVM implementation of Clojure, strings are of the java.lang.String Java type, and their elements are of the java.lang.Character Java type, as shown by the following example, which returns a character:

user=> (first "a collection of characters")
\a
user=> (type *1)
java.lang.Character

first returns the first element of a collection. Here, the literal notation of a character is \a. The type function returns the type of a given value (here, a Java class), which the REPL then prints. Remember that we can use *1 to retrieve the last returned value in the REPL, so *1 evaluates to \a.

It is interesting to note that, in ClojureScript, strings are collections of one-character strings, because there is no character type in JavaScript. Here is a similar example in a ClojureScript REPL:

cljs.user=> (last "a collection of 1 character strings")
"s"
cljs.user=> (type *1)
#object[String]

As with the Clojure REPL, type returns the data type of the value. This time, in ClojureScript, the value returned by the last function (which returns the last element of a collection, here the last character of the string) is printed as #object[String], which means a JavaScript string.
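
Because strings are collections, the generic collection functions from clojure.core work on them too. The following sketch assumes JVM Clojure (in ClojureScript, the elements would be one-character strings rather than characters), and the name word is made up for illustration:

```clojure
;; Assumption: JVM Clojure, where the elements of a string are characters.
(def word "hello")

(prn (count word))                ; 5
(prn (seq word))                  ; (\h \e \l \l \o)
(prn (char? (first word)))        ; true

;; reverse returns a sequence of characters; str glues them back together.
(prn (apply str (reverse word)))  ; "olleh"
```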

You can find a few common functions for manipulating strings in the core namespace, such as str, which we used in Chapter 1, Hello REPL!, to concatenate (combine multiple strings together into one string):

user=> (str "That's the way you " "con" "ca" "te" "nate")
"That's the way you concatenate"
user=> (str *1 " - " silly-string)
"That's the way you concatenate - I am Immutable. I am a silly String"

Most functions for manipulating strings can be found in the clojure.string namespace. Here they are, listed with the REPL's dir utility:

user=> (dir clojure.string)
blank?
capitalize
ends-with?
escape
includes?
index-of
join
last-index-of
lower-case
re-quote-replacement
replace
replace-first
reverse
split
split-lines
starts-with?
trim
trim-newline
triml
trimr
upper-case

As a reminder, this is how you can use a function from a specific namespace:

user=> (clojure.string/includes? "potatoes" "toes")
true

We will not cover all the string functions, but feel free to try them out now. You can always look up the documentation of a string function from the preceding list with the doc function.
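
To give a feel for a few of them, here is a short sketch that chains several functions from the preceding list. The string alias and the names raw, trimmed, words, shouted, and dashed are our own choices for this example:

```clojure
;; The alias "string" is an arbitrary choice for this sketch.
(require '[clojure.string :as string])

(def raw "  Clojure is fun  ")

;; Each call returns a new string (or vector); raw itself never changes.
(def trimmed (string/trim raw))
(def words   (string/split trimmed #" "))
(def shouted (string/upper-case trimmed))
(def dashed  (string/join "-" words))

(prn trimmed)                             ; "Clojure is fun"
(prn words)                               ; ["Clojure" "is" "fun"]
(prn shouted)                             ; "CLOJURE IS FUN"
(prn dashed)                              ; "Clojure-is-fun"
(prn (string/starts-with? trimmed "Clo")) ; true
```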

Numbers

Clojure has good support for numbers, and you will most likely not have to worry about the underlying types, as Clojure picks a suitable one for you. However, it is important to note that there are a few differences between Clojure and ClojureScript in that regard.

In Clojure, by default, integers are implemented as the java.lang.Long Java type unless the number is too big for a Long. In that case, it is typed as clojure.lang.BigInt:

user=> (type 1)
java.lang.Long
user=> (type 1000000000000000000)
java.lang.Long
user=> (type 10000000000000000000)
clojure.lang.BigInt

Notice, in the preceding example, that the last number was too big to fit in the java.lang.Long Java type and, therefore, was implicitly typed as clojure.lang.BigInt.
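
The implicit typing described above applies to number literals. As a rough sketch of how the two types interact in arithmetic on JVM Clojure, the following uses the N literal suffix and the auto-promoting +' function, both part of the core language. The names biggest-long, small-bigint, and promoted are made up for this example:

```clojure
;; Assumption: JVM Clojure.
(def biggest-long Long/MAX_VALUE)

;; The N suffix creates a BigInt literal, whatever the size of the number.
(def small-bigint 42N)

;; +' is the auto-promoting variant of +: it returns a BigInt when the
;; result does not fit in a Long, where the regular + would throw an
;; ArithmeticException.
(def promoted (+' biggest-long 1))

(prn biggest-long)         ; 9223372036854775807
(prn (type biggest-long))  ; java.lang.Long
(prn (type small-bigint))  ; clojure.lang.BigInt
(prn promoted)             ; 9223372036854775808N
```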

Exact ratios are represented by Clojure as "Ratio" types, which have a literal representation. 5/4 cannot be reduced to an integer, so the output is the ratio itself:

user=> 5/4
5/4

The result of dividing 3 by 4 can be represented by the ratio 3/4:

user=> (/ 3 4)
3/4
user=> (type 3/4)
clojure.lang.Ratio

4/4 is equivalent to 1 and is evaluated as follows:

user=> 4/4
1
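
The benefit of ratios is that arithmetic on them stays exact. Here is a small sketch, assuming JVM Clojure (the names third and sum are ours):

```clojure
;; Assumption: JVM Clojure; ClojureScript has no Ratio type.
(def third 1/3)

;; Adding ratios yields another exact ratio.
(def sum (+ 1/2 third))

(prn sum)                           ; 5/6
(prn (ratio? sum))                  ; true

;; Three thirds add up to exactly one, with no rounding error.
(prn (== 1 (+ third third third)))  ; true

;; Convert to a floating-point number only when you need one.
(prn (double 3/4))                  ; 0.75
```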

Decimal numbers are "double" precision floating-point numbers:

user=> 1.2
1.2

If we take our division of 3 by 4 again, but this time mix in a "Double" type, we will not get a ratio as a result:

user=> (/ 3 4.0)
0.75

This is because floating-point numbers are "contagious" in Clojure. Any arithmetic operation involving a floating-point number results in a floating-point number:

user=> (* 1.0 2)
2.0
user=> (type (* 1.0 2))
java.lang.Double
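
This contagion matters because floating-point arithmetic is subject to rounding errors, whereas ratio arithmetic is not. The following sketch, which assumes JVM Clojure and uses the made-up names exact, inexact, and mixed, contrasts the two:

```clojure
;; Assumption: JVM Clojure.
(def exact   (+ 1/10 2/10))  ; ratios only: the result is exact
(def inexact (+ 0.1 0.2))    ; doubles only: subject to rounding error
(def mixed   (+ 1/4 0.25))   ; a ratio and a double yield a double

(prn exact)         ; 3/10
(prn inexact)       ; 0.30000000000000004
(prn mixed)         ; 0.5
(prn (type mixed))  ; java.lang.Double
```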

In ClojureScript, however, numbers are just "JavaScript numbers," which are all double-precision floating-point numbers. JavaScript does not define different types of numbers like Java and some other programming languages do (for example, long, integer, and short):

cljs.user=> 1
1
cljs.user=> 1.2
1.2
cljs.user=> (/ 3 4)
0.75
cljs.user=> 3/4
0.75
cljs.user=> (* 1.0 2)
2

Notice that, this time, every operation returns a floating-point number. The absence of a decimal point in 1 or 2 is just a formatting convenience.

We can make sure that all those numbers are JavaScript numbers (double-precision, floating-point) by using the type function:

cljs.user=> (type 1)
#object[Number]
cljs.user=> (type 1.2)
#object[Number]
cljs.user=> (type 3/4)
#object[Number]

If you need to do more than simple arithmetic, you can use the Java or JavaScript math libraries, which are similar except for a few minor exceptions.

You will learn more about host platform interoperability (how to interact with the host platform and its ecosystem) in Chapter 9, Host Platform Interoperability with Java and JavaScript, but the examples in this chapter will get you started with doing some more complicated math and with using the math library.

Reading a value from a constant can be done like this:

user=> Math/PI
3.141592653589793

Calling a function from the math library looks like calling a regular Clojure function:

user=> (Math/random)
0.25127992428738254
user=> (Math/sqrt 9)
3.0
user=> (Math/round 0.7)
1
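
Math functions and constants can be combined with regular Clojure arithmetic inside our own functions. The following is a small sketch assuming JVM Clojure, where Math refers to java.lang.Math; hypotenuse and circle-area are hypothetical helper names:

```clojure
;; Assumption: JVM Clojure, where Math is java.lang.Math.
(defn hypotenuse
  "Returns the length of the hypotenuse of a right triangle with sides a and b."
  [a b]
  (Math/sqrt (+ (* a a) (* b b))))

(defn circle-area
  "Returns the area of a circle of radius r."
  [r]
  (* Math/PI r r))

(prn (hypotenuse 3 4))  ; 5.0
(prn (circle-area 1))   ; 3.141592653589793
```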

Exercise 2.01: The Obfuscation Machine

You have been contacted by a secret government agency to develop an algorithm that encodes text into a secret string that only the owner of the algorithm can decode. Apparently, they don't trust other security mechanisms such as SSL and will only communicate sensitive information with their own proprietary technology.

You need to develop an encode function and a decode function. The encode function should replace letters with numbers that are not easily guessable. For that purpose, each letter will take the character's number value in the ASCII table, add another number to it (the number of words in the sentence to encode), and finally, compute the square value of that number. The decode function should allow the user to revert to the original string. Someone highly ranked in the agency came up with that algorithm so they trust it to be very secure.

In this exercise, we will put into practice some of the things we've learned about strings and numbers by building an obfuscation machine:

  1. Start your REPL and look up the documentation of the clojure.string/replace function:
    user=> (doc clojure.string/replace)
    -------------------------
    clojure.string/replace
    ([s match replacement])
      Replaces all instance of match with replacement in s.
       match/replacement can be:
       string / string
       char / char
       pattern / (string or function of match).
       See also replace-first.
       The replacement is literal (i.e. none of its characters are treated
       specially) for all cases above except pattern / string.
       For pattern / string, $1, $2, etc. in the replacement string are
       substituted with the string that matched the corresponding
       parenthesized group in the pattern.  If you wish your replacement
       string r to be used literally, use (re-quote-replacement r) as the
       replacement argument.  See also documentation for
       java.util.regex.Matcher's appendReplacement method.
       Example:
       (clojure.string/replace "Almost Pig Latin" #"\b(\w)(\w+)\b" "$2$1ay")
       -> "lmostAay igPay atinLay"

    Notice that the replace function can take a pattern and a function of the matching result as parameters. We don't know how to iterate over collections yet, but using the replace function with a pattern and a "replacement function" should do the job.

  2. Try and use the replace function with the #"\w" pattern (which means word character), replace it with the ! character, and observe the result:
    user=> (clojure.string/replace "Hello World" #"\w" "!")

    The output is as follows:

    "!!!!! !!!!!"
  3. Try and use the replace function with the same pattern, but this time passing an anonymous function that takes the matching letter as a parameter:
    user=> (clojure.string/replace "Hello World" #"\w" (fn [letter] (do (println letter) "!")))

    The output is as follows:

    H
    e
    l
    l
    o
    W
    o
    r
    l
    d
    "!!!!! !!!!!"

    Observe that the function was called for each letter, printing the match out to the console and finally returning the string with the matches replaced by the ! character. It looks like we should be able to write our encoding logic in that replacement function.

  4. Let's now see how we can convert a character to a number. We can use the int function, which coerces its parameter to an integer. It can be used like this:
    user=> (int \a)
    97
  5. It seems that the "replacement function" will take a string as a parameter, so let's convert our string to a character. Use the char-array function combined with first to convert our string to a character as follows:
    user=> (first (char-array "a"))
    \a
  6. Now, if we combine previous steps together and also compute the square value of the character's number, we should be approaching our obfuscation goal. Combine the code written previously to obtain a character code from a string and get its square value using the Math/pow function as follows:
    user=> (Math/pow (int (first (char-array "a"))) 2)
    9409.0
  7. Let's now convert this result to the string that will be returned from our replace function. First, let's remove the decimal part by coercing the result to an int, and put things together in an encode-letter function, as follows:
    user=>
    (defn encode-letter
      [s]
      (let [code (Math/pow (int (first (char-array s))) 2)]
        (str (int code))))
    #'user/encode-letter
    user=> (encode-letter "a")
    "9409"

    Great! It seems to work. Let's now test our function as part of the replace function.

  8. Create the encode function, which uses clojure.string/replace as well as our encode-letter function:
    user=>
    (defn encode
      [s]
      (clojure.string/replace s #"\w" encode-letter))
    #'user/encode
    user=> (encode "Hello World")
    "518410201116641166412321 756912321129961166410000"

    It seems to work but the resulting string will be hard to decode without being able to identify each letter individually.

    There is another thing that we did not take into account: the encode function should take an arbitrary number to add to the code before calculating the square value.

  9. First, add a separator as part of our encode-letter function, for example, the # character, so that we can identify each letter individually. Second, add an extra parameter to encode-letter, whose value is added to the character code before calculating the square value:
    user=>
    (defn encode-letter
      [s x]
      (let [code (Math/pow (+ x (int (first (char-array s)))) 2)]
        (str "#" (int code))))
    #'user/encode-letter
  10. Now, test the encode function another time:
    user=> (encode "Hello World")
    Execution error (ArityException) at user/encode (REPL:3).
    Wrong number of args (1) passed to: user/encode-letter

    Our encode function is now failing because it is expecting an extra argument.

  11. Modify the encode function to calculate the number of words in the text to obfuscate, and pass it to the encode-letter function. You can use the clojure.string/split function with a whitespace, as follows, to count the number of words:
    user=>
    (defn encode
      [s]
      (let [number-of-words (count (clojure.string/split s #" "))]
        (clojure.string/replace s #"\w" (fn [s] (encode-letter s number-of-words)))))
    #'user/encode
  12. Try your newly created function with a few examples and make sure it obfuscates strings properly:
    user=> (encode "Super secret")
    "#7225#14161#12996#10609#13456 #13689#10609#10201#13456#10609#13924"
    user=> (encode "Super secret message")
    "#7396#14400#13225#10816#13689 #13924#10816#10404#13689#10816#14161 #12544#10816#13924#13924#10000#11236#10816"

    What a beautiful, unintelligible, obfuscated string – well done! Notice how the numbers for the same letters are different depending on the number of words in the phrase to encode. It seems to work according to the specification!

    We can now start working on the decode function, for which we will need to use the following functions:

    • Math/sqrt to obtain the square root value of a number.
    • char to retrieve a letter from a character code (a number).
    • subs as in substring, to get a sub-portion of a string (and get rid of our # separator).
    • Integer/parseInt to convert a string to an integer.

  13. Write the decode-letter function using a combination of the preceding functions, to decode an obfuscated character:
    user=>
    (defn decode-letter
      [x y]
      (let [number (Integer/parseInt (subs x 1))
            letter (char (- (Math/sqrt number) y))]
        (str letter)))
    #'user/decode-letter
  14. Finally, write the decode function, which is similar to the encode function except that it should use decode-letter instead of encode-letter:
    user=>
    (defn decode [s]
      (let [number-of-words (count (clojure.string/split s #" "))]
        (clojure.string/replace s #"\#\d+" (fn [s] (decode-letter s number-of-words)))))
    #'user/decode
  15. Test your functions and make sure that they both work:
    user=> (encode "If you want to keep a secret, you must also hide it from yourself.")

    The output is as follows:

    "#7569#13456 #18225#15625#17161 #17689#12321#15376#16900 #16900#15625 #14641#13225#13225#15876 #12321 #16641#13225#12769#16384#13225#16900, #18225#15625#17161 #15129#17161#16641#16900 #12321#14884#16641#15625 #13924#14161#12996#13225 #14161#16900 #13456#16384#15625#15129 #18225#15625#17161#16384#16641#13225#14884#13456."
    user=> (decode *1)
    "If you want to keep a secret, you must also hide it from yourself."

In this exercise, we've put into practice working with numbers and strings by creating an encoding system. We can now move on to learning other data types, starting with Booleans.
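
For reference, the functions built step by step in this exercise can be gathered into one script, sketched below. It assumes JVM Clojure. The string alias, the word-count helper, and the message and secret names are our own additions for this summary and are not part of the exercise itself:

```clojure
;; Assumption: JVM Clojure. The alias "string" is an arbitrary choice.
(require '[clojure.string :as string])

(defn word-count
  "Hypothetical helper: counts the space-separated words in s."
  [s]
  (count (string/split s #" ")))

(defn encode-letter
  "Encodes the one-letter string s as # followed by (char code + x) squared."
  [s x]
  (let [code (Math/pow (+ x (int (first (char-array s)))) 2)]
    (str "#" (int code))))

(defn encode
  "Replaces every word character of s with its encoded form."
  [s]
  (let [number-of-words (word-count s)]
    (string/replace s #"\w" (fn [letter] (encode-letter letter number-of-words)))))

(defn decode-letter
  "Decodes a token such as \"#7225\" back to a one-letter string."
  [x y]
  (let [number (Integer/parseInt (subs x 1))
        letter (char (- (Math/sqrt number) y))]
    (str letter)))

(defn decode
  "Replaces every #number token of s with the letter it encodes."
  [s]
  (let [number-of-words (word-count s)]
    (string/replace s #"\#\d+" (fn [token] (decode-letter token number-of-words)))))

(def message "Super secret")
(def secret (encode message))

(prn secret)           ; "#7225#14161#12996#10609#13456 #13689#10609#10201#13456#10609#13924"
(prn (decode secret))  ; "Super secret"
```

Decoding works because spaces are left untouched by encode, so the encoded text contains the same number of words as the original.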

Booleans

Booleans are implemented as Java's java.lang.Boolean in Clojure or JavaScript's "Boolean" in ClojureScript. Their value can either be true or false, and their literal notations are simply the lowercase true and false.
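
Booleans are most often obtained as the result of comparisons, and they can be combined with and, or, and not from clojure.core. Here is a quick sketch, assuming JVM Clojure 1.9 or later for the boolean? predicate; the names comparison and combined are made up:

```clojure
;; Assumption: JVM Clojure 1.9+ (boolean? was added in 1.9).
(def comparison (> 2 1))          ; comparisons return Booleans
(def combined (and comparison (= 1 2)))

(prn comparison)             ; true
(prn combined)               ; false
(prn (or combined true))     ; true
(prn (not comparison))       ; false
(prn (boolean? comparison))  ; true
(prn (type comparison))      ; java.lang.Boolean
```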

Symbols

Symbols are identifiers referring to something else. We have already been using symbols when creating bindings or calling functions. For example, when using def, the first argument is a symbol that will refer to a value, and when calling a function such as +, + is a symbol referring to the function implementing the addition. Consider the following examples:

user=> (def foo "bar")
#'user/foo
user=> foo
"bar"
user=> (defn add-2 [x] (+ x 2))
#'user/add-2
user=> add-2
#object[user$add_2 0x4e858e0a "user$add_2@4e858e0a"]

Here, we have created the user/foo symbol, which refers to the "bar" string, and the add-2 symbol, which refers to the function that adds 2 to its parameter. We have created those symbols in the user namespace, hence the notation with /: user/foo.

If we try to evaluate a symbol that has not been defined, we'll get an error:

user=> marmalade
Syntax error compiling at (REPL:0:0).
Unable to resolve symbol: marmalade in this context

In the REPL Basics topic of Chapter 1, Hello REPL!, we were able to use the following functions because they are bound to a specific symbol:

user=> str
#object[clojure.core$str 0x7bb6ab3a "clojure.core$str@7bb6ab3a"]
user=> +
#object[clojure.core$_PLUS_ 0x1c3146bc "clojure.core$_PLUS_@1c3146bc"]
user=> clojure.string/replace
#object[clojure.string$replace 0xf478a81 "clojure.string$replace@f478a81"]

Those cryptic values are printed representations of the function objects: we are asking for the values bound to the symbols rather than invoking the functions, which would require wrapping them in parentheses.
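
To look at a symbol itself rather than the value it refers to, we can prevent its evaluation with a quote ('), which goes slightly beyond what we have covered so far. The sketch below assumes JVM Clojure, and the name plus is made up for illustration:

```clojure
;; Assumption: JVM Clojure.
(def foo "bar")

;; Evaluating the symbol yields the value it refers to.
(prn foo)             ; "bar"

;; Quoting the symbol prevents evaluation: we get the symbol itself.
(prn 'foo)            ; foo
(prn (symbol? 'foo))  ; true
(prn (type 'foo))     ; clojure.lang.Symbol

;; Functions are values too, so another symbol can refer to one.
(def plus +)
(prn (plus 1 2))      ; 3
```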

Keywords

You can think of a keyword as some kind of a special constant string. Keywords are a nice addition to Clojure because they are lightweight and convenient to use and create. You just need to use the colon character, :, at the beginning of a word to create a keyword:

user=> :foo
:foo
user=> :another_keyword
:another_keyword

Unlike symbols, they don't refer to anything else; as you can see in the preceding example, when evaluated, they just return themselves. Keywords are typically used as keys in a key-value associative map, as we will see in the next topic about collections.
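
A few functions from clojure.core make the relationship between keywords and strings more concrete. In this small sketch, the name status is made up for illustration:

```clojure
(def status :active)

(prn (keyword? status))       ; true
(prn (= status :active))      ; true

;; A keyword is not a string, but we can convert between the two.
(prn (= status "active"))     ; false
(prn (name status))           ; "active"
(prn (keyword "active"))      ; :active
```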

In this section, we went through simple data types such as strings, numbers, Booleans, symbols, and keywords. We highlighted how their underlying implementation depends on the host platform because Clojure is a hosted language. In the next section, we will see how those values can be aggregated into collections.

 

Collections

Clojure is a functional programming language in which we focus on building the computations of our programs in terms of the evaluation of functions, rather than building custom data types and their associated behaviors. In the other dominant programming paradigm, object-oriented programming, programmers define the data types and the operations available on them. Objects are supposed to encapsulate data and communicate with each other by passing messages around. But there is an unfortunate tendency to create classes and new types of objects to customize the shape of the data, instead of using more generic data structures, which cascades into creating specific methods to access and modify the data. We have to come up with decent names, which is difficult, and then we pass instances of objects around in our programs. We create new classes all the time, but more code means more bugs. It is a recipe for disaster; it is an explosion of code, with code that is very specific and benefits from little reuse.

Of course, it is not like that everywhere, and you can write clean object-oriented code, with objects being the little black boxes of functionality they were designed for. However, as programmers, whether it's through using other libraries or maintaining a legacy code base, we spend most of our time working with other people's code.

In functional programming, and more specifically, in Clojure, we tend to work with just a few data types. Types that are generic and powerful, types that every other "Clojurian" already knows and has mastered.

Collections are data types that can contain more than one thing and describe how those items relate to each other. The four main data structures for collections that you should know about are Maps, Sets, Vectors, and Lists. There are more available, including the data structures offered by your host platform (for example, Java or JavaScript) or by other libraries, but those four are your bread and butter for doing things in Clojure.

"Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming." - Rob Pike's Rule #5 of programming.

Maps

A Map is a collection of key-value pairs. Clojure provides – in a persistent and immutable fashion – the usual HashMap but also a SortedMap.

HashMaps are called "Hash" because they create a hash of the key and map it to a given value. Lookups, as well as other common operations (insert and delete), are fast.

HashMaps are used a lot in Clojure, notably for representing entities where we need to associate attributes with values. SortedMaps are different because they keep their entries sorted by key; otherwise, they have the same interface and are used in the same way as HashMaps. SortedMaps are not very common, so let's focus on HashMaps.
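Although we will focus on HashMaps, here is a brief sketch of a SortedMap for comparison. It uses the core sorted-map function; the scores name and its contents are made up for this example:

```clojure
;; sorted-map keeps its entries ordered by key, whatever the argument order.
(def scores (sorted-map :charlie 3 :alice 1 :bob 2))

scores             ;; => {:alice 1, :bob 2, :charlie 3}
(keys scores)      ;; => (:alice :bob :charlie)

;; It is read like any other map.
(get scores :bob)  ;; => 2
```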

You can create a HashMap with the literal notation using curly braces. Here is a Map with three key-value pairs, with the keys being the :artist, :song, and :year keywords:

user=> {:artist "David Bowtie" :song "The Man Who Mapped the World" :year 1970}
{:artist "David Bowtie", :song "The Man Who Mapped the World", :year 1970}

You might have noticed in the preceding example that the key-value pairs in the map literal are separated by spaces, but the REPL prints the map with the pairs separated by commas. Clojure treats commas as whitespace, so, as with other collections, you can choose to use a space or a comma to separate each entry. For maps, there's no best practice: if you think commas improve a map's readability, use them; otherwise, simply omit them. You can also separate entries with new lines.

Here's another map written with comma-separated entries:

user=> {:artist "David Bowtie", :song "Comma Oddity", :year 1969}
{:artist "David Bowtie", :song "Comma Oddity", :year 1969}

Notice that the values can be of any type, and not only simple values such as strings and numbers, but also vectors and even other maps, allowing you to create nested data structures and structure information as follows:

user=>
  {
  "David Bowtie" {
    "The Man Who Mapped the World" {:year 1970, :duration "4:01"}
    "Comma Oddity" {:year 1969, :duration "5:19"}
  }
  "Crosby Stills Hash" {
    "Helplessly Mapping" {:year 1969, :duration "2:38"}
    "Almost Cut My Hair" {:year 1970, :duration "4:29", :featuring ["Neil Young", "Rich Hickey"]}
  }
}
{"David Bowtie" {"The Man Who Mapped the World" {:year 1970, :duration "4:01"}, "Comma Oddity" {:year 1969, :duration "5:19"}}, "Crosby Stills Hash" {"Helplessly Mapping" {:year 1969, :duration "2:38"}, "Almost Cut My Hair" {:year 1970, :duration "4:29", :featuring ["Neil Young" "Rich Hickey"]}}}

Keys can be of different types too, so you could have strings, numbers, or even other types as a key; however, we generally use keywords.

Another way of creating a map is by using the hash-map function, passing in pairs of arguments as follows:

user=> (hash-map :a 1 :b 2 :c 3)
{:c 3, :b 2, :a 1}

Choose to use literal notation with curly braces when possible, but when HashMaps are programmatically generated, the hash-map function can come in handy.
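To illustrate what "programmatically generated" could look like, here is a minimal sketch. It relies on apply and zipmap, two core functions that have not been covered yet, and the flat-pairs name is hypothetical:

```clojure
;; A flat sequence of alternating keys and values, for example read from a file.
(def flat-pairs [:artist "David Bowtie" :year 1970])

;; apply calls hash-map with the items of flat-pairs as individual arguments.
(def song (apply hash-map flat-pairs))
(= song {:artist "David Bowtie" :year 1970})     ;; => true

;; zipmap pairs a sequence of keys with a sequence of values.
(= (zipmap [:a :b :c] [1 2 3]) {:a 1 :b 2 :c 3}) ;; => true
```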

Map keys are unique:

user=> {:name "Lucy" :age 32 :name "Jon"}
Syntax error reading source at (REPL:6:35).
Duplicate key: :name

The reader reported a syntax error because the :name key was present twice in the preceding literal map.

However, different keys can have the same value:

user=> {:name "Lucy" :age 32 :number-of-teeth 32}
{:name "Lucy", :age 32, :number-of-teeth 32}

Notice that both age and number-of-teeth have the same value, and that is both valid and convenient, to say the least.

Now that you know how to create maps, it is time for a bit of practice.

Exercise 2.02: Using Maps

In this exercise, we will learn how to access and modify simple maps:

  1. Start your REPL and create a map:
    user=> (def favorite-fruit {:name "Kiwi", :color "Green", :kcal_per_100g 61 :distinguish_mark "Hairy"})
    #'user/favorite-fruit
  2. You can read an entry from the map with the get function. Try to look up a key or two, as follows:
    user=> (get favorite-fruit :name)
    "Kiwi"
    user=> (get favorite-fruit :color)
    "Green"
  3. If the value for a given key cannot be found, get returns nil, but you can specify a fallback value with a third argument to get:
    user=> (get favorite-fruit :taste)
    nil
    user=> (get favorite-fruit :taste "Very good 8/10")
    "Very good 8/10"
    user=> (get favorite-fruit :kcal_per_100g 0)
    61
  4. Maps and keywords have the special ability to be used as functions. When positioned in the "operator position" (as the first item of the list), they are invoked as a function that can be used to look up a value in a map. Try it now by using the favorite-fruit map as a function:
    user=> (favorite-fruit :color)
    "Green"
  5. Try to use a keyword as a function to look up a value in a Map:
    user=> (:color favorite-fruit)
    "Green"

    As with the get function, those ways of retrieving a value return nil when the key cannot be found, and you can pass an extra argument to provide a fallback value.

  6. Provide a fallback value for a key that doesn't exist in the favorite-fruit map:
    user=> (:shape favorite-fruit "egg-like")
    "egg-like"
  7. We would like to store this value in the map. Use assoc to associate a new key, :shape, with a new value, "egg-like", in our map:
    user=> (assoc favorite-fruit :shape "egg-like")
    {:name "Kiwi", :color "Green", :kcal_per_100g 61, :distinguish_mark "Hairy", :shape "egg-like"}

    The assoc operation returns a new map containing our previous key-value pairs as well as the new association we've just added.

  8. Evaluate favorite-fruit and notice that it remains unchanged:
    user=> favorite-fruit
    {:name "Kiwi", :color "Green", :kcal_per_100g 61, :distinguish_mark "Hairy"}

    Because a map is immutable, the value bound to the favorite-fruit symbol has not changed. By using assoc, we have created a new version of the map.

    Now, the F3C ("Funny Fruity Fruits Consortium") have reverted their previous ruling and determined during their quarterly review of fruit specifications that the color of the kiwi fruit should be brown and not green. To make sure that your application is F3C compliant, you decide to update your system with the new value.

  9. Change the color of favorite-fruit by associating a new value to the :color key:
    user=> (assoc favorite-fruit :color "Brown")
    {:name "Kiwi", :color "Brown", :kcal_per_100g 61, :distinguish_mark "Hairy"}

    assoc replaces the existing value when a key already exists, because HashMaps cannot have duplicate keys.

  10. If we wanted to add more structured information, we could add a map as a value. Add production information as a nested map in our Kiwi map:
    user=> (assoc favorite-fruit :yearly_production_in_tonnes {:china 2025000 :italy 541000 :new_zealand 412000 :iran 311000 :chile 225000})
    {:name "Kiwi", :color "Green", :kcal_per_100g 61, :distinguish_mark "Hairy", :yearly_production_in_tonnes {:china 2025000, :italy 541000, :new_zealand 412000, :iran 311000, :chile 225000}}

    Having nested maps or other data types is commonly used to represent structured information.

    New research has found out that the Kiwi contains fewer calories than previously thought, and to stay compliant, the F3C requires organizations to reduce the current value of kcal per 100 g by 1.

  11. Decrement kcal_per_100g with the assoc function, as follows:
    user=> (assoc favorite-fruit :kcal_per_100g (- (:kcal_per_100g favorite-fruit) 1))
    {:name "Kiwi", :color "Green", :kcal_per_100g 60, :distinguish_mark "Hairy"}

    Great! It works, but there is a more elegant way to deal with this type of operation. When you need to change a value in a map based on a previous value, you can use the update function. While the assoc function lets you associate a completely new value to a key, update allows you to compute a new value based on the previous value of a key. The update function takes a function as its third parameter.

  12. Decrement kcal_per_100g with the update function and dec, as follows:
    user=> (update favorite-fruit :kcal_per_100g dec)
    {:name "Kiwi", :color "Green", :kcal_per_100g 60, :distinguish_mark "Hairy"}

    Notice how the value of :kcal_per_100g changed from 61 to 60.

  13. You can also pass arguments to the function provided to update; for example, if we wanted to lower :kcal_per_100g by 10 instead of 1, we could use the subtract function, -, and write the following:
    user=> (update favorite-fruit :kcal_per_100g - 10)
    {:name "Kiwi", :color "Green", :kcal_per_100g 51, :distinguish_mark "Hairy"}

    Like assoc, update does not change the immutable map; it returns a new map.

    This example illustrates the power of functions being "first-class citizens": we treat them like typical values; in this case, a function was passed as an argument to another function. We will elaborate on this concept in the next chapter while diving into functions in more depth.

  14. Finally, use dissoc (as in "dissociate") to remove one or multiple elements from a map:
    user=> (dissoc favorite-fruit :distinguish_mark)
    {:name "Kiwi", :color "Green", :kcal_per_100g 61}
    user=> (dissoc favorite-fruit :kcal_per_100g :color)
    {:name "Kiwi", :distinguish_mark "Hairy"}
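Because assoc, update, and dissoc each return a new map, their calls can be chained so that each step works on the result of the previous one. The sketch below uses the -> threading macro, which has not been introduced at this point, and a hypothetical updated-fruit name:

```clojure
(def favorite-fruit
  {:name "Kiwi" :color "Green" :kcal_per_100g 61 :distinguish_mark "Hairy"})

;; -> passes the result of each step as the first argument of the next one.
(def updated-fruit
  (-> favorite-fruit
      (assoc :shape "egg-like")
      (update :kcal_per_100g dec)
      (dissoc :distinguish_mark)))

(= updated-fruit
   {:name "Kiwi" :color "Green" :kcal_per_100g 60 :shape "egg-like"}) ;; => true

;; The original map is untouched.
(:kcal_per_100g favorite-fruit) ;; => 61
```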

Well done! Now that we know how to use maps, it is time to move on to the next data structure: sets.

Sets

A set is a collection of unique values. Clojure provides HashSet and SortedSet. Hash Sets are implemented as Hash Maps, with the key and the value of each entry being identical.

Hash Sets are fairly common in Clojure and have a literal notation: a hash sign followed by curly braces, #{}. For example:

user=> #{1 2 3 4 5}
#{1 4 3 2 5}

Notice in the preceding expression that when the set is evaluated, it does not return the elements of the set in the order in which they were defined in the literal expression. This is because of the internal structure of the HashSet. Each value is transformed into a hash, which allows fast access but does not keep the insertion order. If you care about the order in which the elements are added, you need to use a different data structure, for example, a sequential collection such as a vector (which we will soon discover). Use a HashSet to represent elements that logically belong together, for example, an enumeration of unique values.

As with maps, sets cannot have duplicate entries:

user=> #{:a :a :b :c}
Syntax error reading source at (REPL:135:15).
Duplicate key: :a

Hash Sets can be created from a list of values by passing those values to the hash-set function:

user=> (hash-set :a :b :c :d)
#{:c :b :d :a}

Hash Sets can also be created from another collection with the set function. Let's create a HashSet from a vector:

user=> (set [:a :b :c])
#{:c :b :a}

Notice that the order defined in the vector was lost.

The set function will not throw an error when converting a collection of non-unique values to a set, which can be useful for deduplicating values:

user=> (set ["No" "Copy" "Cats" "Cats" "Please"])
#{"Copy" "Please" "Cats" "No"}

Notice how one of the duplicate strings, "Cats", was silently removed to create a set.

A Sorted Set can be created with the sorted-set function. Unlike Hash Sets, Sorted Sets have no literal syntax:

user=> (sorted-set "No" "Copy" "Cats" "Cats" "Please")
#{"Cats" "Copy" "No" "Please"}

Notice that they are printed in the same way as Hash Sets, only the order looks different. Sorted Sets are sorted based on the natural order of elements they contain rather than the order of the arguments provided upon creation. You could instead provide your own sorting function, but we will focus on Hash Sets as they are far more common and useful.
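For completeness, here is a minimal sketch of a Sorted Set with a custom ordering. It assumes the core sorted-set-by function, which takes a comparator as its first argument; the descending name is made up for this example:

```clojure
;; The > comparator sorts numbers in descending order.
;; Duplicates (the second 3) are still dropped.
(def descending (sorted-set-by > 3 1 2 3))

descending          ;; => #{3 2 1}
(first descending)  ;; => 3
(count descending)  ;; => 3
```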

Exercise 2.03: Using Sets

In this exercise, we will use a Hash Set to represent a collection of supported currencies:

Note

A Hash Set is a good choice of data structure for a list of currencies because we typically want to store a collection of unique values and efficiently check for containment. Also, the order of the currencies probably doesn't matter. If you wanted to associate more data to a currency (such as ISO codes and countries), then you would more likely use nested Maps to represent each currency as an entity, keyed by a unique ISO code. Ultimately, the choice of the data structure depends on how you plan to use the data. In this exercise, we simply want to read it, check for containment, and add items to our set.

  1. Start a REPL. Create a set and bind it to the supported-currencies symbol:
    user=> (def supported-currencies #{"Dollar" "Japanese yen" "Euro" "Indian rupee" "British pound"})
    #'user/supported-currencies
  2. As with maps, you can use get to retrieve an entry from a set, which returns the entry passed as a parameter when present in the set. Use get to retrieve an existing entry as well as a missing entry:
    user=> (get supported-currencies "Dollar")
    "Dollar"
    user=> (get supported-currencies "Swiss franc")
    nil
  3. It is likely that you just want to check for containment, and contains? is, therefore, semantically better. Use contains? instead of get to check for containment:
    user=> (contains? supported-currencies "Dollar")
    true
    user=> (contains? supported-currencies "Swiss franc")
    false

    Notice that contains? returns a Boolean and that get returns the lookup value or nil when not found. There is the edge case of looking up nil in a set that will return nil both when found and not found. In that case, contains? is naturally more suitable.

  4. As with maps, sets and keywords can be used as functions to check for containment. Use the supported-currencies set as a function to look up a value in the set:
    user=> (supported-currencies "Swiss franc")
    nil

    "Swiss franc" isn't in the supported-currencies set; therefore, the preceding return value is nil.

  5. If you tried to use the "Dollar" string as a function to look itself up in the set, you would get the following error:
    user=> ("Dollar" supported-currencies)
    Execution error (ClassCastException) at user/eval7 (REPL:1).
    java.lang.String cannot be cast to clojure.lang.IFn

    We cannot use strings as a function to look up a value in a set or a Map. That's one of the reasons why keywords are a better choice in both sets and maps when possible.

  6. To add an entry to a set, use the conj function, as in "conjoin":
    user=> (conj supported-currencies "Monopoly Money")
    #{"Japanese yen" "Euro" "Dollar" "Monopoly Money" "Indian rupee" "British pound"}
  7. You can pass more than one item to the conj function. Try to add multiple currencies to our Hash Set:
    user=> (conj supported-currencies "Monopoly Money" "Gold dragon" "Gil")
    #{"Japanese yen" "Euro" "Dollar" "Monopoly Money" "Indian rupee" "Gold dragon" "British pound" "Gil"}
  8. Finally, you can remove one or more items with the disj function, as in "disjoin":
    user=> (disj supported-currencies "Dollar" "British pound")
    #{"Japanese yen" "Euro" "Indian rupee"}
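The nil edge case mentioned in step 3 can be reproduced with a short sketch (the maybe-values name is hypothetical):

```clojure
(def maybe-values #{nil :a})

;; get cannot tell "found nil" apart from "not found".
(get maybe-values nil)        ;; => nil (nil is in the set)
(get #{:a} nil)               ;; => nil (nil is not in the set)

;; contains? answers the containment question unambiguously.
(contains? maybe-values nil)  ;; => true
(contains? #{:a} nil)         ;; => false
```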

That's it for sets! If you ever need to, you can find more functions for working with sets in the clojure.set namespace (such as union and intersection), but this is more advanced usage, so let's move on to the next collection: vectors.
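As a quick preview of the clojure.set namespace, the sketch below calls union, intersection, and difference on two small sets. The supported and requested names are made up for this example, and difference is an additional function from the same namespace:

```clojure
(require '[clojure.set :as set])

(def supported #{"Dollar" "Euro" "British pound"})
(def requested #{"Euro" "Swiss franc"})

;; All currencies that appear in either set.
(= (set/union supported requested)
   #{"Dollar" "Euro" "British pound" "Swiss franc"})  ;; => true

;; Currencies present in both sets.
(set/intersection supported requested)  ;; => #{"Euro"}

;; Requested currencies that are not supported.
(set/difference requested supported)    ;; => #{"Swiss franc"}
```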

Vectors

A vector is another type of collection that is widely used in Clojure. You can think of vectors as powerful immutable arrays. They are collections of values efficiently accessible by their integer index (starting from 0), and they maintain the order of item insertion as well as duplicates.

Use a vector when you need to store and read elements in order, and when you don't mind duplicate elements. For example, a web browser history could be a good candidate, as you might want to easily go back to the recent pages but also remove older elements using a vector's index, and there would likely be duplicate elements in it. A map or a set wouldn't be of much help in that situation, as you don't have a specific key to look up a value with.

Vectors have a literal notation with square brackets ([]):

user=> [1 2 3]
[1 2 3]

Vectors can also be created with the vector function followed by a list of items as arguments:

user=> (vector 10 15 2 15 0)
[10 15 2 15 0]

You can create a vector from another collection using the vec function; for example, the following expression converts a Hash Set to a vector:

user=> (vec #{1 2 3})
[1 3 2]

As with other collections, vectors also can contain different types of values:

user=> [nil :keyword "String" {:answers [:yep :nope]}]
[nil :keyword "String" {:answers [:yep :nope]}]

We can now start practicing.

Exercise 2.04: Using Vectors

In this exercise, we will discover different ways of accessing and interacting with vectors:

  1. Start a REPL. You can look up values in a vector using their index (that is, their position in the collection) with the get function. Try to use the get function with a literal vector:
    user=> (get [:a :b :c] 0)
    :a
    user=> (get [:a :b :c] 2)
    :c
    user=> (get [:a :b :c] 10)
    nil

    Because vector indices start at 0, :a is at index 0 and :c is at index 2. When the lookup fails, get returns nil.

  2. Let's bind a vector to a symbol to make the practice more convenient:
    user=> (def fibonacci [0 1 1 2 3 5 8])
    #'user/fibonacci
    user=> (get fibonacci 6)
    8
  3. As with maps and sets, you can use the vector as a function to look up items, but for vectors, the parameter is the index of the value in the vector:
    user=> (fibonacci 6)
    8
  4. Add the next two values of the Fibonacci sequence to your vector with the conj function:
    user=> (conj fibonacci 13 21)
    [0 1 1 2 3 5 8 13 21]

    Notice that the items are added to the end of the vector, and the order of the sequence is kept the same.

  5. Each item in the Fibonacci sequence corresponds to the sum of the previous two items. Let's dynamically compute the next item of the sequence:
    user=>
    (let [size (count fibonacci)
           last-number (last fibonacci)
           second-to-last-number (fibonacci (- size 2))]
        (conj fibonacci (+ last-number second-to-last-number)))
    [0 1 1 2 3 5 8 13]

    In the preceding example, we used let to create three local bindings and improve the readability. We used count to calculate the size of the vector, last to retrieve its last element, 8, and finally, we used the fibonacci vector as a function to retrieve the element at index "size - 2" (which is the value 5 at index 5).

    In the body of the let block, we used the local bindings to add the sum of the last two items to the end of the Fibonacci sequence with conj, which returns a new vector ending with 13 (which is, indeed, 5 + 8).
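The let expression from step 5 could be wrapped in a function so that it works on any vector. This is only a sketch: append-next-fib is a hypothetical name, and it uses peek (which returns the last item of a vector) and nth in place of last and the vector-as-function call:

```clojure
(defn append-next-fib
  "Returns a new vector with the sum of the last two items appended.
  Assumes v holds at least two numbers."
  [v]
  (let [last-number (peek v)
        second-to-last-number (nth v (- (count v) 2))]
    (conj v (+ last-number second-to-last-number))))

(append-next-fib [0 1 1 2 3 5 8])
;; => [0 1 1 2 3 5 8 13]

;; Each call returns a new vector, so calls can be nested.
(append-next-fib (append-next-fib [0 1 1 2 3 5 8]))
;; => [0 1 1 2 3 5 8 13 21]
```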

Lists

Lists are sequential collections, similar to vectors, but items are added to the front (at the beginning). Also, they don't have the same performance properties, and random access by index is slower than with vectors. We mostly use lists to write code and macros, or in cases when we need a last-in, first-out (LIFO) type of data structure (for example, a stack), which can arguably also be implemented with a vector.

We create lists with the literal syntax, (), but to differentiate lists that represent code and lists that represent data, we need to use the single quote, ':

user=> (1 2 3)
Execution error (ClassCastException) at user/eval211 (REPL:1).
java.lang.Long cannot be cast to clojure.lang.IFn
user=> '(1 2 3)
(1 2 3)
user=> (+ 1 2 3)
6
user=> '(+ 1 2 3)
(+ 1 2 3)

In the preceding examples, we can see that a list that is not quoted with ' is evaluated as a function call, and it throws an error unless the first item of the list can be invoked as a function.

Lists can also be created with the list function:

user=> (list :a :b :c)
(:a :b :c)

To read the first element of a list, use first:

user=> (first '(:a :b :c :d))
:a

The rest function returns the list without its first item:

user=> (rest '(:a :b :c :d))
(:b :c :d)

We will not talk about iterations and recursion yet, but you could imagine that the combination of first and rest is all you need to "walk" or go through an entire list: simply by calling first on the rest of the list over and over again until there's no rest.
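To make the idea of "walking" a list a little more concrete, here is a minimal sketch that counts the items of a list using only rest and empty?. It relies on loop and recur, which are covered later, and count-items is a hypothetical name (in practice, you would simply call count):

```clojure
(defn count-items
  "Counts the items of a list by repeatedly taking the rest of it."
  [items]
  (loop [remaining items
         total 0]
    (if (empty? remaining)
      total                                   ; there is no rest left: we are done
      (recur (rest remaining) (inc total))))) ; move on to the rest of the list

(count-items '(:a :b :c :d))  ;; => 4
(count-items '())             ;; => 0
```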

You cannot use the get function to retrieve a list item by index: called on a list, get simply returns nil. You could use nth, but it is not efficient, as the list is iterated or "walked" until it reaches the desired position:

user=> (nth '(:a :b :c :d) 2)
:c
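The difference between get and nth on lists and vectors can be checked with a short sketch:

```clojure
;; On a list, get does not look items up by index and returns nil.
(get '(:a :b :c :d) 2)  ;; => nil

;; nth walks the list to the requested position.
(nth '(:a :b :c :d) 2)  ;; => :c

;; On a vector, both work.
(get [:a :b :c :d] 2)   ;; => :c
(nth [:a :b :c :d] 2)   ;; => :c
```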

Exercise 2.05: Using Lists

In this exercise, we will practice using lists by reading and adding elements to a to-do list.

  1. Start a REPL and create a to-do list with a list of actions that you need to do, using the list function as follows:
    user=> (def my-todo (list  "Feed the cat" "Clean the bathroom" "Save the world"))
    #'user/my-todo
  2. You can add items to your list by using the cons function, which operates on sequences:
    user=> (cons "Go to work" my-todo)
    ("Go to work" "Feed the cat" "Clean the bathroom" "Save the world")
  3. Similarly, you can use the conj function, which works on lists because a list is a collection:
    user=> (conj my-todo "Go to work")
    ("Go to work" "Feed the cat" "Clean the bathroom" "Save the world")

    Notice how the order of the parameters is different. cons is available on lists because a list is a sequence, and conj is available to use on lists because a list is a collection. conj is, therefore, slightly more "generic" and also has the advantage of accepting multiple elements as arguments.

  4. Add multiple elements at once to your list by using the conj function:
    user=> (conj my-todo "Go to work" "Wash my socks")
    ("Wash my socks" "Go to work" "Feed the cat" "Clean the bathroom" "Save the world")
  5. Now it's time to catch up with your task. Retrieve the first element in your to-do list with the first function:
    user=> (first my-todo)
    "Feed the cat"
  6. Once done, you can retrieve the rest of your tasks with the rest function:
    user=> (rest my-todo)
    ("Clean the bathroom" "Save the world")

    You could imagine then having to call first on the rest of the list (if you had to develop a fully blown to-do list application). Because the list is immutable, if you keep calling first on the same my-todo list, you will end up with the same element, "Feed the cat", over and over again, and also with a happy but very fat cat.

  7. Finally, you can also retrieve a specific element from the list using the nth function:
    user=> (nth my-todo 2)
    "Save the world"

    However, remember that retrieving an element at a specific position in a list is slower than with vectors because the list has to be "walked" until the nth element. In that case, you might be better off using a vector. One final note about nth is that it throws an exception when the element at position n is not found.
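The final note about nth can be sketched as follows. nth accepts an optional third argument that is returned when the position does not exist; the exception class shown assumes Clojure on the JVM:

```clojure
(def my-todo (list "Feed the cat" "Clean the bathroom" "Save the world"))

(nth my-todo 2)                   ;; => "Save the world"

;; With a fallback value, nth does not throw.
(nth my-todo 10 "Nothing to do")  ;; => "Nothing to do"

;; Without a fallback, an out-of-range position throws an exception.
(try
  (nth my-todo 10)
  (catch IndexOutOfBoundsException _
    :out-of-bounds))              ;; => :out-of-bounds
```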

That is all you need to know about lists for now and we can move on to the next section about collection and sequence abstractions.

Collection and Sequence Abstractions

Clojure's data structures are implemented in terms of powerful abstractions. You might have noticed that the operations we used on collections are often similar, but behave differently based on the type of the collection. For instance, get retrieves items from a map with a key, but from a vector with an index; conj adds elements to a vector at the back, but to a list at the front.

A sequence is a collection of elements in a particular order, where each item follows another. Maps, sets, vectors, and lists are all collections, but only vectors and lists are sequential, although we can easily obtain a sequence from a map or a set.
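This distinction can be checked at the REPL with the core predicates coll? and sequential?, which are not covered in the text and are shown here only as a sketch:

```clojure
;; All four data structures are collections.
(coll? {:a 1})          ;; => true
(coll? #{1 2})          ;; => true
(coll? [1 2 3])         ;; => true
(coll? '(1 2 3))        ;; => true

;; Only vectors and lists have an inherent order.
(sequential? [1 2 3])   ;; => true
(sequential? '(1 2 3))  ;; => true
(sequential? {:a 1})    ;; => false
(sequential? #{1 2})    ;; => false

;; seq gives an ordered view of a map or a set.
(sequential? (seq {:a 1}))  ;; => true
```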

Let's go through a few examples of useful functions to use with collections. Consider the following map:

user=> (def language {:name "Clojure" :creator "Rich Hickey" :platforms ["Java" "JavaScript" ".NET"]})
#'user/language

Use count to get the number of elements in a collection. Each element of this map is a key-value pair; therefore, it contains three elements:

user=> (count language)
3

More obviously, the following empty set contains no elements:

user=> (count #{})
0

We can test whether a collection is empty with the empty? function:

user=> (empty? language)
false
user=> (empty? [])
true

A map is not sequential because there is no logical order between its elements. However, we can convert a map to a sequence using the seq function:

user=> (seq language)
([:name "Clojure"] [:creator "Rich Hickey"] [:platforms ["Java" "JavaScript" ".NET"]])

It yielded a sequence of key-value pairs (tuples), which means that there is now a logical order and we can use sequence functions on this data structure:

user=> (nth (seq language) 1)
[:creator "Rich Hickey"]

A lot of functions just work on collections directly because they can be turned into a sequence, so you could omit the seq step and, for example, call first, rest, or last directly on a map or a set:

user=> (first #{:a :b :c})
:c
user=> (rest #{:a :b :c})
(:b :a)
user=> (last language)
[:platforms ["Java" "JavaScript" ".NET"]]

The value of using sequence functions such as first or rest on maps and sets seems questionable but treating those collections as sequences means that they can then be iterated. Many more functions are available for processing each item of a sequence, such as map, reduce, filter, and so on. We have dedicated entire chapters to learning about those in the second part of the book so that we can stay focused on the other core functions for now.

into is another useful operator that puts elements of one collection into another collection. The first argument for into is the target collection:

user=> (into [1 2 3 4] #{5 6 7 8})
[1 2 3 4 7 6 5 8]

In the preceding example, each element of the #{5 6 7 8} set was added into the [1 2 3 4] vector. The resulting vector is not in ascending order because Hash Sets are not sorted. We can also go the other way and put the elements of a vector into a set:

user=> (into #{1 2 3 4} [5 6 7 8])
#{7 1 4 6 3 2 5 8}

In the preceding example, the elements of the [5 6 7 8] vector were added to the #{1 2 3 4} set. Once again, Hash Sets do not keep insertion order and the resulting set is simply a logical collection of unique values.

One practical use is deduplicating a vector: just put it into an empty set:

user=> (into #{} [1 2 3 3 3 4])
#{1 4 3 2}

To put items into a map, you would need to pass a collection of tuples representing key-value pairs:

user=> (into {} [[:a 1] [:b 2] [:c 3]])
{:a 1, :b 2, :c 3}

Each item is "conjoined" in the collection, and so it follows the semantic of the target collection for inserting items with conj. Elements are added to a list at the front:

user=> (into '() [1 2 3 4])
(4 3 2 1)

To help you understand (into '() [1 2 3 4]), here is a step-by-step representation of what happened:

user=> (conj '() 1)
(1)
user=> (conj '(1) 2)
(2 1)
user=> (conj '(2 1) 3)
(3 2 1)
user=> (conj '(3 2 1) 4)
(4 3 2 1)

If you want to concatenate collections, concat might be more appropriate than into. See how they behave differently here:

user=> (concat '(1 2) '(3 4))
(1 2 3 4)
user=> (into '(1 2) '(3 4))
(4 3 1 2)

A lot of Clojure functions that operate on sequences will return sequences no matter what the input type was. concat is one example:

user=> (concat #{1 2 3} #{1 2 3 4})
(1 3 2 1 4 3 2)
user=> (concat {:a 1} ["Hello"])
([:a 1] "Hello")

sort is another example. sort can rearrange a collection to order its elements. It has the benefit of being slightly more obvious in terms of why you would want a sequence as a result:

user=> (def alphabet #{:a :b :c :d :e :f})
#'user/alphabet
user=> alphabet
#{:e :c :b :d :f :a}
user=> (sort alphabet)
(:a :b :c :d :e :f)
user=> (sort [3 7 5 1 9])
(1 3 5 7 9)

But what if you wanted a vector as a result? Well, now you know that you could use the into function (in the REPL, *1 refers to the result of the last evaluated expression):

user=> (sort [3 7 5 1 9])
(1 3 5 7 9)
user=> (into [] *1)
[1 3 5 7 9]
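Since *1 only makes sense in an interactive REPL session, a source file would compose the calls directly. The following sketch also shows that sort accepts an optional comparator, and uses sort-by, a related core function not covered in the text:

```clojure
;; Sort, then convert the resulting sequence back to a vector.
(into [] (sort [3 7 5 1 9]))      ;; => [1 3 5 7 9]
(vec (sort [3 7 5 1 9]))          ;; => [1 3 5 7 9]

;; Pass a comparator to sort in descending order.
(vec (sort > [3 7 5 1 9]))        ;; => [9 7 5 3 1]

;; sort-by orders items by the result of a function, here the string length.
(sort-by count ["ccc" "a" "bb"])  ;; => ("a" "bb" "ccc")
```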

It is interesting to note that conj can also be used on maps. For its arguments to be consistent with other types of collections, the new entry is represented by a tuple:

user=> (conj language [:created 2007])
{:name "Clojure", :creator "Rich Hickey", :platforms ["Java" "JavaScript" ".NET"], :created 2007}

Similarly, a vector is an associative collection of key-value pairs where the key is the index of the value:

user=> (assoc [:a :b :c :d] 2 :z)
[:a :b :z :d]
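A few consequences of treating the index as a key are sketched below. The exception class assumes Clojure on the JVM:

```clojure
;; Associating at the index just past the end appends to the vector.
(assoc [:a :b :c] 3 :d)      ;; => [:a :b :c :d]

;; contains? checks for a key, which for a vector is an index, not a value.
(contains? [:a :b :c] 1)     ;; => true
(contains? [:a :b :c] :a)    ;; => false

;; Associating further past the end is an error.
(try
  (assoc [:a :b :c] 5 :z)
  (catch IndexOutOfBoundsException _
    :out-of-bounds))         ;; => :out-of-bounds
```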

Exercise 2.06: Working with Nested Data Structures

For the purpose of this exercise, imagine that you are working with a little shop called "Sparkling," whose business is to trade gemstones. It turns out that the owner of the shop knows a bit of Clojure, and has been using a Clojure REPL to manage the inventory with some kind of homemade database. However, the owner has been struggling to work with nested data structures, and they require help from a professional: you. The shop won't share their database because it contains sensitive data – they have just given you a sample dataset so that you know about the shape of the data.

The shop owner read a blog post on the internet saying that pure functions are amazing and make for good quality code. So, they asked you to develop some pure functions that take their gemstone database as the first parameter of each function. The owner said you would only get paid if you provide pure functions. In this exercise, we will develop a few functions that will help us understand and operate on nested data structures.

Note

A pure function is a function where the return value is only determined by its input values. A pure function does not have any side effects, which means that it does not mutate a program's state nor generate any kind of I/O.

  1. Open up a REPL and create the following Hash Map representing the sample gemstone database:

    repl.clj

    (def gemstone-db {
        :ruby {
          :name "Ruby"
          :stock 480
          :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712]
          :properties {
            :dispersion 0.018
            :hardness 9.0
            :refractive-index [1.77 1.78]
            :color "Red"
          }
        }

    One of the most popular questions the shop gets from its customers is about the durability of a gem. This can be found in the properties of a gem, at the :hardness key. The first function that we need to develop is durability, which retrieves the hardness of a given gem.

  2. Let's start by using a function we already know, get, with the :ruby gem as an example:
    user=> (get (get (get gemstone-db :ruby) :properties) :hardness)
    9.0

    It works, but nesting get is not very elegant. We could use the map or keywords as functions and see how it improves the readability.

  3. Use the keywords as a function to see how it improves the readability of our code:
    user=> (:hardness (:properties (:ruby gemstone-db)))
    9.0

    This is slightly better. But it's still a lot of nested calls and parentheses. Surely, there must be a better way!

    When you need to fetch data in a deeply nested map such as this one, use the get-in function. It takes the map and a vector of keys, and digs into the map with just one function call.

  4. Use the get-in function with the [:ruby :properties :hardness] vector of keys to retrieve the deeply nested :hardness key:
    user=> (get-in gemstone-db [:ruby :properties :hardness])
    9.0

    Great! The vector of keys reads left to right and there is no nested expression. It will make our function a lot more readable.
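
One detail worth knowing before wrapping get-in in a function is how it behaves when a key is missing. The sketch below uses a small hypothetical map, sample-db, rather than the shop's data.

```clojure
;; A small hypothetical map, not the shop's real data.
(def sample-db {:ruby {:properties {:hardness 9.0}}})

;; A missing key anywhere along the path yields nil rather than an error.
(get-in sample-db [:opal :properties :hardness])          ;; => nil

;; An optional third argument supplies a value to return instead of nil.
(get-in sample-db [:opal :properties :hardness] :unknown) ;; => :unknown

;; Existing paths are unaffected by the default.
(get-in sample-db [:ruby :properties :hardness] :unknown) ;; => 9.0
```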

  5. Create the durability function that takes the database and the gem keyword as a parameter and returns the value of the hardness property:
    user=>
    (defn durability
      [db gemstone]
      (get-in db [gemstone :properties :hardness]))
    #'user/durability
  6. Test your newly created function to make sure that it works as expected:
    user=> (durability gemstone-db :ruby)
    9.0
    user=> (durability gemstone-db :moissanite)
    9.5

    Great! Let's move on to the next function.

    Apparently, a ruby is not simply "red" but "Near colorless through pink through all shades of red to a deep crimson." Who would have thought? The owner is now asking you to create a function to update the color of a gem, because they might want to change some other colors too, for marketing purposes. The function needs to return the updated database.

  7. Let's try to write the code to change the color property of a gem. We can try to use assoc:
    user=> (assoc (:ruby gemstone-db) :properties {:color "Near colorless through pink through all shades of red to a deep crimson"})
    {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:color "Near colorless through pink through all shades of red to a deep crimson"}}

    It seems to work, but all the other properties are gone! We replaced the existing Hash Map at the :properties key with a new Hash Map that contains only one entry: the color.

  8. We could use a trick. Do you remember the into function? It takes a collection and puts its values in another collection, like this:
    user=> (into {:a 1 :b 2} {:c 3})
    {:a 1, :b 2, :c 3}

    If we use the update function combined with into, we could obtain the desired result.

  9. Try to use update combined with into to change the :color property of the ruby gem:
    user=> (update (:ruby gemstone-db) :properties into {:color "Near colorless through pink through all shades of red to a deep crimson"})
    {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Near colorless through pink through all shades of red to a deep crimson"}}

    That's great, but there are two problems with this approach. First, the combination of update and into is not very readable or easy to understand. Second, we wanted to return the entire database, but we just returned the "Ruby" entry. We would have to add another operation to update this entry in the main database, perhaps by nesting another into, reducing readability even further.

    As with get-in, Clojure offers a simpler way of dealing with nested maps: assoc-in and update-in. They work like assoc and update, but take a vector of keys (as get-in does) instead of a single key.

    You would use update-in when you want to update a deeply nested value with a function (for example, to compute the new value with the previous value). Here, we simply want to replace the color with an entirely new value, so we should use assoc-in.
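
The difference between the two can be seen on a tiny example. The map sample-db below is hypothetical and only serves as an illustration.

```clojure
;; Hypothetical one-gem map for illustration.
(def sample-db {:opal {:stock 7 :properties {:color "White"}}})

;; assoc-in: the new value does not depend on the old one.
(assoc-in sample-db [:opal :properties :color] "Milky white")
;; => {:opal {:stock 7, :properties {:color "Milky white"}}}

;; update-in: the new value is computed from the old one by a function.
;; Extra arguments after the function are passed along, so this is (+ 7 5).
(update-in sample-db [:opal :stock] + 5)
;; => {:opal {:stock 12, :properties {:color "White"}}}
```

Note that the second result still shows the original color: each call derives a new map from sample-db, which itself never changes.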

  10. Use assoc-in to change the color property of the ruby gem:
    user=> (assoc-in gemstone-db [:ruby :properties :color] "Near colorless through pink through all shades of red to a deep crimson")
    {:ruby {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Near colorless through pink through all shades of red to a deep crimson"}}, :emerald {:name "Emerald", :stock 85, :sales [6605 2373 104 4764 9023], :properties {:dispersion 0.014, :hardness 7.5, :refractive-index [1.57 1.58], :color "Green shades to colorless"}}, :diamond {:name "Diamond", :stock 10, :sales [8295 329 5960 6118 4189 3436 9833 8870 9700 7182 7061 1579], :properties {:dispersion 0.044, :hardness 10, :refractive-index [2.417 2.419], :color "Typically yellow, brown or gray to colorless"}}, :moissanite {:name "Moissanite", :stock 45, :sales [7761 3220], :properties {:dispersion 0.104, :hardness 9.5, :refractive-index [2.65 2.69], :color "Colorless, green, yellow"}}}

    Notice how gemstone-db was returned entirely. Can you notice the value that has changed? There is a lot of data, so it is not very obvious. You can use the pprint function to "pretty print" the value.

    Use pprint on the last returned value to improve the readability and make sure that our assoc-in expression behaved as expected. In a REPL, the last returned value can be obtained with *1:

    Figure 2.1: Printing the output to REPL

    That is much more readable. We will not use pprint everywhere in this book as it takes a lot of extra space, but you should use it in your own REPL sessions whenever a value is hard to read.
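
As a rough sketch of what such a session might look like, the snippet below pretty prints a hypothetical nested value, sample-db. Many REPLs already make pprint available in the user namespace; the explicit require is an assumption that makes the snippet work elsewhere as well.

```clojure
;; The explicit require is harmless at the REPL and needed in other namespaces.
(require '[clojure.pprint :refer [pprint]])

;; Hypothetical nested value standing in for the database.
(def sample-db
  {:ruby {:name "Ruby"
          :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712]
          :properties {:dispersion 0.018 :hardness 9.0 :color "Red"}}})

;; Prints the value over several indented lines and returns nil.
(pprint sample-db)

;; At the REPL, *1 holds the most recent result, so right after evaluating
;; an expression you can inspect its value with:
;; user=> (pprint *1)
```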

  11. Create the change-color pure function, which takes three parameters: a database, a gemstone keyword, and a new color. This function updates the color in the given database and returns the new value of the database:
    user=>
    (defn change-color
      [db gemstone new-color]
      (assoc-in db [gemstone :properties :color] new-color))
    #'user/change-color
  12. Test that your newly created function behaves as expected:
    user=> (change-color gemstone-db :ruby "Some kind of red")
    {:ruby {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Some kind of red"}}, :emerald {:name "Emerald", :stock 85, :sales [6605 2373 104 4764 9023], :properties {:dispersion 0.014, :hardness 7.5, :refractive-index [1.57 1.58], :color "Green shades to colorless"}}, :diamond {:name "Diamond", :stock 10, :sales [8295 329 5960 6118 4189 3436 9833 8870 9700 7182 7061 1579], :properties {:dispersion 0.044, :hardness 10, :refractive-index [2.417 2.419], :color "Typically yellow, brown or gray to colorless"}}, :moissanite {:name "Moissanite", :stock 45, :sales [7761 3220], :properties {:dispersion 0.104, :hardness 9.5, :refractive-index [2.65 2.69], :color "Colorless, green, yellow"}}}

    The owner would like to add one last function to record the sale of a gem and update the inventory accordingly.

    When a sale occurs, the shop owner would like to call the sell function with the following arguments: a database, a gemstone keyword, and a client ID. client-id will be inserted in the sales vector and the stock value for that gem will be decreased by one. As with the other functions, the new value of the database will be returned so that the client can handle the update themselves.

  13. We can use the update-in function in combination with dec to decrement (decrease by one) the stock. Let's try it with the diamond gem:
    user=> (update-in gemstone-db [:diamond :stock] dec)
    {:ruby {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Red"}}, :emerald {:name "Emerald", :stock 85, :sales [6605 2373 104 4764 9023], :properties {:dispersion 0.014, :hardness 7.5, :refractive-index [1.57 1.58], :color "Green shades to colorless"}}, :diamond {:name "Diamond", :stock 9, :sales [8295 329 5960 6118 4189 3436 9833 8870 9700 7182 7061 1579], :properties {:dispersion 0.044, :hardness 10, :refractive-index [2.417 2.419], :color "Typically yellow, brown or gray to colorless"}}, :moissanite {:name "Moissanite", :stock 45, :sales [7761 3220], :properties {:dispersion 0.104, :hardness 9.5, :refractive-index [2.65 2.69], :color "Colorless, green, yellow"}}}

    The output is not very readable, and it is hard to verify that the value was correctly updated. Another useful command to improve readability in the REPL is the *print-level* option, which can limit the depth of the data structure printed to the terminal.

  14. Use the *print-level* option to set the depth level to 2, and observe how the result is printed:
    user=> (set! *print-level* 2)
    2
    user=> (update-in gemstone-db [:diamond :stock] dec)
    {:ruby {:name "Ruby", :stock 120, :sales #, :properties #}, :emerald {:name "Emerald", :stock 85, :sales #, :properties #}, :diamond {:name "Diamond", :stock 9, :sales #, :properties #}, :moissanite {:name "Moissanite", :stock 45, :sales #, :properties #}} 

    The diamond stock has indeed decreased by 1, from 10 to 9.
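
Calling set! on *print-level* changes the setting for the rest of the REPL session. If you only want to truncate a single value, one alternative is to rebind the variable temporarily with binding. This is a minimal sketch using a hypothetical value named nested.

```clojure
;; Hypothetical nested value for illustration.
(def nested {:a {:b {:c {:d 1}}}})

;; binding limits the change to the enclosed expressions; pr-str renders
;; the value to a string using the printer settings in effect at that point.
(binding [*print-level* 2]
  (pr-str nested))
;; => "{:a {:b #}}"

;; Once the binding form exits, *print-level* has its previous value again.
```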

  15. We can use the update-in function again, this time in combination with conj and a client-id to add in the sales vector. Let's try an example with the diamond gem and client-id 999:
    user=> (update-in gemstone-db [:diamond :sales] conj 999)
    {:ruby {:name "Ruby", :stock 120, :sales #, :properties #}, :emerald {:name "Emerald", :stock 85, :sales #, :properties #}, :diamond {:name "Diamond", :stock 10, :sales #, :properties #}, :moissanite {:name "Moissanite", :stock 45, :sales #, :properties #}}

    It might have worked, but we cannot see the sales vector as the data has been truncated by the *print-level* option.

  16. Set *print-level* to nil to reset the option, and reevaluate the previous expression:
    user=> (set! *print-level* nil)
    nil
    user=> (update-in gemstone-db [:diamond :sales] conj 999)
    {:ruby {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Red"}}, :emerald {:name "Emerald", :stock 85, :sales [6605 2373 104 4764 9023], :properties {:dispersion 0.014, :hardness 7.5, :refractive-index [1.57 1.58], :color "Green shades to colorless"}}, :diamond {:name "Diamond", :stock 10, :sales [8295 329 5960 6118 4189 3436 9833 8870 9700 7182 7061 1579 999], :properties {:dispersion 0.044, :hardness 10, :refractive-index [2.417 2.419], :color "Typically yellow, brown or gray to colorless"}}, :moissanite {:name "Moissanite", :stock 45, :sales [7761 3220], :properties {:dispersion 0.104, :hardness 9.5, :refractive-index [2.65 2.69], :color "Colorless, green, yellow"}}}

    Notice that our diamond sales vector now contains the value 999.

  17. Now let's write our pure function, which combines the two operations (adding the client to the sales vector and decrementing the stock):
    (defn sell
      [db gemstone client-id]
      (let [clients-updated-db (update-in db [gemstone :sales] conj client-id)]
        (update-in clients-updated-db [gemstone :stock] dec)))
  18. Test your newly created function by selling a :moissanite to client-id 123:
    user=> (sell gemstone-db :moissanite 123)
    {:ruby {:name "Ruby", :stock 120, :sales [1990 3644 6376 4918 7882 6747 7495 8573 5097 1712], :properties {:dispersion 0.018, :hardness 9.0, :refractive-index [1.77 1.78], :color "Red"}}, :emerald {:name "Emerald", :stock 85, :sales [6605 2373 104 4764 9023], :properties {:dispersion 0.014, :hardness 7.5, :refractive-index [1.57 1.58], :color "Green shades to colorless"}}, :diamond {:name "Diamond", :stock 10, :sales [8295 329 5960 6118 4189 3436 9833 8870 9700 7182 7061 1579], :properties {:dispersion 0.044, :hardness 10, :refractive-index [2.417 2.419], :color "Typically yellow, brown or gray to colorless"}}, :moissanite {:name "Moissanite", :stock 44, :sales [7761 3220 123], :properties {:dispersion 0.104, :hardness 9.5, :refractive-index [2.65 2.69], :color "Colorless, green, yellow"}}}

Notice that the sales vector of the moissanite entry now contains the value 123, and that its stock has decreased from 45 to 44.

In this exercise, we did not really "update" data but merely derived new data structures from others because of their immutability. Even if we work mostly with immutable data types, Clojure offers simple mechanisms that allow you to persist information. In the following activity, you will create a database that can be read and updated with the techniques acquired in this chapter, and we will even provide a helper function to make the database persistent.
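
The following sketch illustrates that point on a hypothetical miniature database, sample-db. It also rewrites the sell logic with the thread-first macro (->), which is not used in the exercise and is shown here only as an alternative way to chain the two update-in calls; the name sell-threaded is made up for this example.

```clojure
;; Hypothetical miniature database for illustration.
(def sample-db {:opal {:stock 7 :sales [101]}})

;; Same two steps as sell, written with ->, which passes the result of
;; each step as the first argument of the next one.
(defn sell-threaded
  [db gemstone client-id]
  (-> db
      (update-in [gemstone :sales] conj client-id)
      (update-in [gemstone :stock] dec)))

(def after-sale (sell-threaded sample-db :opal 202))

after-sale ;; => {:opal {:stock 6, :sales [101 202]}}
sample-db  ;; => {:opal {:stock 7, :sales [101]}}  (unchanged)
```

The original map is still intact after the call, which is why the caller has to keep the returned value if the sale is to be remembered.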

Activity 2.01: Creating a Simple In-Memory Database

In this activity, we are going to create our own implementation of an in-memory database. After all, if the "Sparkling" shop owner was able to do it, then it shouldn't be a problem for us!

Our database interface will live in the Clojure REPL. We will implement functions to create and drop tables, as well as to insert and read records.

For the purposes of this activity, we will provide a couple of helper functions to help you maintain the state of the database in memory:

(def memory-db (atom {}))
(defn read-db [] @memory-db)
(defn write-db [new-db] (reset! memory-db new-db))

We use an atom but you don't need to understand how atoms work for now, as they are explained in great detail later in the book. You just need to know that it will keep a reference to our database in memory, and use two helper functions, read-db and write-db, to read and persist a Hash Map in memory.
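
A short sketch of how the two helpers might be used together is shown below. The helpers are repeated so that the snippet is self-contained; the :greeting and :other keys are arbitrary examples.

```clojure
;; The three helpers from the text, repeated so the snippet runs on its own.
(def memory-db (atom {}))
(defn read-db [] @memory-db)
(defn write-db [new-db] (reset! memory-db new-db))

;; Read the current value, derive a new one, then store it.
(write-db (assoc (read-db) :greeting "hello"))
(read-db) ;; => {:greeting "hello"}

;; The map returned by read-db is an ordinary immutable value: deriving a
;; new map from it does not change what is stored until write-db is called.
(assoc (read-db) :other 1)
(read-db) ;; => {:greeting "hello"}
```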

As guidance, we would like the data structure to have this shape:

{:table-1 {:data [] :indexes {}} :table-2 {:data [] :indexes {}}}

For example, if we used our database in a grocery store to save clients, fruits, and purchases, we can imagine that it would contain the data in this manner:

{
  :clients {
    :data [{:id 1 :name "Bob" :age 30} {:id 2 :name "Alice" :age 24}]
    :indexes {:id {1 0, 2 1}}
  },
  :fruits {
    :data [{:name "Lemon" :stock 10} {:name "Coconut" :stock 3}]
    :indexes {:name {"Lemon" 0, "Coconut" 1}}
  },
  :purchases {
    :data [{:id 1 :user-id 1 :item "Coconut"} {:id 2 :user-id 2 :item "Lemon"}]
    :indexes {:id {1 0, 2 1}}
  }
}

Storing data and indexes separately allows multiple indexes to be created without having to duplicate the actual data.

For each indexed key, the indexes map associates every value of that key with the position of the corresponding record in the data vector. In the fruits table, "Lemon" is the first record of the data vector, so its value in the :name index is 0.
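
As an illustration of how such an index can be read, the sketch below looks up a record in two steps using the fruits table from the example. The names fruits and position are chosen for this illustration only.

```clojure
;; The fruits table from the example above.
(def fruits {:data [{:name "Lemon" :stock 10} {:name "Coconut" :stock 3}]
             :indexes {:name {"Lemon" 0, "Coconut" 1}}})

;; Step 1: find the position of the record through the :name index.
(def position (get-in fruits [:indexes :name "Coconut"])) ;; position is 1

;; Step 2: use that position to fetch the record from the data vector.
;; get-in accepts vector positions as well as map keys.
(get-in fruits [:data position]) ;; => {:name "Coconut", :stock 3}
```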

These steps will help you perform the activity:

  1. Create the helper functions. You can get the Hash Map by executing the read-db function with no arguments, and write to the database by executing the write-db function with a Hash Map as an argument.
  2. Start by creating the create-table function. This function should take one parameter: the table name. It should add a new key (the table name) at the root of our Hash Map database, and the value should be another Hash Map containing two entries: an empty vector at the data key and an empty Hash Map at the indexes key.
  3. Test that your create-table function works.
  4. Create a drop-table function that also takes one parameter: the table name. It should remove a table, including all its data and indexes, from our database.
  5. Test that your drop-table function works.
  6. Create an insert function. This function should take three parameters: table, record, and id-key. The record parameter is a Hash Map, and id-key corresponds to a key in the record map that will be used as a unique index. For now, we will not handle cases when a table does not exist or when an index key already exists in a given table.

    Try to use a let block to divide the work of the insert function into multiple steps:

    In a let statement, create a binding for the value of the database, retrieved with read-db.

    In the same let statement, create a second binding for the new value of the database (after adding the record in the data vector).

    In the same let statement, retrieve the index at which the record was inserted by counting the number of elements in the new data vector and subtracting one (vector positions start at 0).

    In the body of the let statement, update the index at id-key and write the resulting map to the database with write-db.

  7. To verify that your insert function works, try to use it multiple times to insert new records.
  8. Create a select-* function that will return all the records of a table passed as a parameter.
  9. Create a select-*-where function that takes three arguments: table-name, field, and field-value. The function should use the index map to retrieve the index of the record in the data vector and return the element.
  10. Modify the insert function to reject any duplicate index value. When a record with the same id-key value already exists in the indexes map, the function should leave the database unchanged and print an error message to the user.

    On completing the activity, the output should be similar to this:

    user=> (create-table :fruits)
    {:clients {:data [], :indexes {}}, :fruits {:data [], :indexes {}}}
    user=> (insert :fruits {:name "Pear" :stock 3} :name)
    Record with :name Pear already exists. Aborting
    user=> (select-* :fruits)
    [{:name "Pear", :stock 3} {:name "Apricot", :stock 30} {:name "Grapefruit", :stock 6}]
    user=> (select-*-where :fruits :name "Apricot")
    {:name "Apricot", :stock 30}

In this activity, we have used our new knowledge about reading and updating both simple and deeply nested data structures to implement a simple in-memory database. This was not an easy feat – well done!
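
If you want to compare notes, here is one possible shape for these functions. It is a minimal sketch written from the steps above, not the book's official solution, and it assumes that id-key and field are always keywords.

```clojure
;; Helpers from the activity text.
(def memory-db (atom {}))
(defn read-db [] @memory-db)
(defn write-db [new-db] (reset! memory-db new-db))

(defn create-table
  [table-name]
  (write-db (assoc (read-db) table-name {:data [] :indexes {}})))

(defn drop-table
  [table-name]
  (write-db (dissoc (read-db) table-name)))

(defn insert
  [table record id-key]
  ;; Position 0 is truthy in Clojure, so this check also catches the
  ;; first record of a table.
  (if (get-in (read-db) [table :indexes id-key (id-key record)])
    (println (str "Record with " id-key " " (id-key record)
                  " already exists. Aborting"))
    (let [db       (read-db)
          new-db   (update-in db [table :data] conj record)
          position (dec (count (get-in new-db [table :data])))]
      (write-db
       (update-in new-db [table :indexes id-key]
                  assoc (id-key record) position)))))

(defn select-*
  [table-name]
  (get-in (read-db) [table-name :data]))

(defn select-*-where
  [table-name field field-value]
  (let [db       (read-db)
        position (get-in db [table-name :indexes field field-value])]
    (get-in db [table-name :data position])))

;; Example session
(create-table :fruits)
(insert :fruits {:name "Pear" :stock 3} :name)
(insert :fruits {:name "Apricot" :stock 30} :name)
(insert :fruits {:name "Pear" :stock 99} :name) ;; prints the error message
(select-* :fruits)
;; => [{:name "Pear", :stock 3} {:name "Apricot", :stock 30}]
(select-*-where :fruits :name "Apricot")
;; => {:name "Apricot", :stock 30}
```

Each function reads the current map, derives a new one with the functions from this chapter, and writes it back, so the only stateful parts are the two helpers.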

Note

The solution for this activity can be found via this link.

 

Summary

In this chapter, we discovered the concept of immutability. We learned about Clojure's simple data types, as well as their implementation on different host platforms. We discovered the most common types of collections and sequences: maps, sets, vectors, and lists. We saw how to use them with generic collections and sequence operations. We learned how to read and update complex structures of nested collections. We also learned about the standard functions for using collection data structures, as well as more advanced usage with deeply nested data structures. In the next chapter, we will learn advanced techniques for working with functions.

About the Authors
  • Joseph Fahey

    Joseph Fahey has been a developer for nearly two decades. He got his start in the Digital Humanities in the early 2000s. Ever since then, he has been trying to hone his skills and expand his inventory of techniques. This led him to Common Lisp and then to Clojure when it was first introduced. As an independent developer, Joseph was able to quickly start using Clojure professionally. These days, Joseph gets to write Clojure for his day job at Empear AB.

  • Thomas Haratyk

    Thomas Haratyk graduated from Lille University of Science and Technology and has been a professional programmer for nine years. After studying computer science and starting his career in France, he is now working as a consultant in London, helping start-ups develop their products and scale their platforms with Clojure, Ruby, and modern JavaScript.

  • Scott McCaughie

    Scott McCaughie lives near Glasgow, Scotland where he works as a senior Clojure developer for Previse, a Fintech startup aiming to solve the problem of slow payments in the B2B space. Having graduated from Heriot-Watt University, his first 6 years were spent building out Risk and PnL systems for JP Morgan. A fortuitous offer of a role learning and writing Clojure came up and he jumped at the chance. 5 years of coding later and it's the best career decision he's made. In his spare time, Scott is an avid reader, enjoys behavioral psychology and financial independence podcasts, and keeps fit by commuting by bike, running, climbing, hill walking, snowboarding. You get the picture!

  • Yehonathan Sharvit

    Yehonathan Sharvit has been a software developer since 2001. He discovered functional programming in 2009. It has profoundly changed his view of programming and his coding style. He loves to share his discoveries and his expertise. He has been giving courses on Clojure and JavaScript since 2016. He holds a master's degree in Mathematics.

  • Konrad Szydlo

    Konrad Szydlo is a psychology and computing graduate from Bournemouth University. He has worked with Clojure for the last 8 years. Since January 2016, he has worked as a software engineer and team leader at Retailic, responsible for building a website for the biggest royalty program in Poland. Prior to this, he worked as a developer with Sky, developing e-commerce and sports applications, where he used Ruby, Java, and PHP. He is also listed in the Top 75 Datomic developers on GitHub.
