Chapter 5. Association Rule Learning

In many practical applications, data comes in the form of lists (ordered or unordered): grocery lists, playlists, visited locations or URLs, app logs, and so on. Sometimes those lists are generated as a byproduct of business processes, but they still contain potentially useful information and insights for process improvement. To extract some of that hidden knowledge, we can use a special kind of unsupervised learning known as association rule mining. In this chapter, we are going to build an app that analyzes your shopping lists to learn your preferences in the form of rules such as "If you've bought oatmeal and cornflakes, you are also likely to buy milk." Such rules can be used to create an adaptable user experience, for instance, contextual suggestions or reminders.

In this chapter, we will cover the following topics:

  • Association rules
  • Association measures
  • Association rule mining algorithms
  • Building an adaptable user experience

Seeing association rules


There are many situations where we are interested in patterns of co-occurring items. For example, marketers want to know which goods are often bought together, clinical personnel need to know which symptoms are associated with certain medical conditions, and in information security we want to know which activity patterns are associated with intrusion or fraud. All of these problems share a common structure: there are items (goods, symptoms, records in logs) organized into transactions (a shopping list, a medical case, a user session). Given data of this kind, we can analyze it to find association rules, such as If the client bought a lemon and some cookies, they are also likely to buy tea, or in more formal notation: (cookies, lemon → tea).
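To make this structure concrete, here is a minimal sketch of these notions in Swift; the type names and sample items are illustrative only:

typealias Item = String
typealias Transaction = Set<Item>

let transactions: [Transaction] = [
    ["cookies", "lemon", "tea"],
    ["cookies", "lemon", "tea", "milk"],
    ["bread", "milk"],
]
// The rule (cookies, lemon → tea) holds in this toy data: every transaction
// that contains cookies and lemon also contains tea.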

Note

We will use pictograms throughout this chapter to facilitate the visual notation of item sets and rules, for example: {🍪, 🍋 → 🍵}.

These rules allow us to make informed decisions, such as putting associated items on the same...

Defining data structures


By the end of this chapter, we want a working rule learning algorithm called Apriori. We will learn about the algorithm's details later; for now, we only want to define the data structures that we will work with throughout the chapter, along with some utility functions.

The generic structure for the algorithm is as follows:

public struct Apriori<Item: Hashable & Equatable> { 

In the simplest case, the ordering of the items in the transaction doesn't matter, and neither does their number nor the associated timestamps. This means that we consider our item sets and transactions as mathematical or Swift sets:

public typealias ItemSet = Set<Item> 

The type parameter Item represents the type of the items in your transactions. Next, we have to implement some structures for subsets and rules:

class Subsets: Sequence {
  var subsets: [ItemSet]
  init(_ set: ItemSet) {
    self.subsets = Array(set).combinations().map(Set.init)
  }
  func makeIterator() -> AnyIterator<ItemSet> {
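    // The method body is not shown in the original; a minimal completion
    // (an assumption, not necessarily the book's exact code):
    return AnyIterator(subsets.makeIterator())
  }
}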

Using association measures to assess rules


Look at these two rules:

  • {Oatmeal, cornflakes → Milk}
  • {Dog food, paperclips → Washing powder}

Intuitively, the second rule looks more unlikely than the first one, doesn't it? How can we tell that for sure, though? In this case, we need some quantitative measures that will show us how likely each rule is. What we are looking for here are association measures, as we call them in machine learning and data mining. Rule mining algorithms revolve around this notion in a similar manner to how distance-based algorithms revolve around distance metrics. In this chapter, we're going to use four association measures: support, confidence, lift, and conviction (see Table 5.1).
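Before we rely on these measures, it may help to see them spelled out. The following is a small, self-contained sketch of the four measures over a list of transactions; the function names and signatures are illustrative assumptions, not the API we build later in this chapter:

// Illustrative definitions of the four association measures.
func support(_ items: Set<String>, in transactions: [Set<String>]) -> Double {
    // Fraction of transactions that contain all of the given items.
    let count = transactions.filter { items.isSubset(of: $0) }.count
    return Double(count) / Double(transactions.count)
}

func confidence(_ ifPart: Set<String>, _ thenPart: Set<String>,
                in transactions: [Set<String>]) -> Double {
    // Of the transactions containing the if-part, the fraction that also
    // contain the then-part.
    return support(ifPart.union(thenPart), in: transactions)
        / support(ifPart, in: transactions)
}

func lift(_ ifPart: Set<String>, _ thenPart: Set<String>,
          in transactions: [Set<String>]) -> Double {
    // How much more often the two parts co-occur than expected if they
    // were statistically independent (1.0 means no association).
    return support(ifPart.union(thenPart), in: transactions)
        / (support(ifPart, in: transactions) * support(thenPart, in: transactions))
}

func conviction(_ ifPart: Set<String>, _ thenPart: Set<String>,
                in transactions: [Set<String>]) -> Double {
    // Expected frequency of the if-part occurring without the then-part
    // under independence, divided by the observed frequency of such misses.
    return (1 - support(thenPart, in: transactions))
        / (1 - confidence(ifPart, thenPart, in: transactions))
}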

Note that these measures tell us nothing about how useful or interesting the rules are, but only quantify their probabilistic characteristics. A rule's usefulness and practicality can be hard to grasp mathematically and often requires human judgment in each case. As usual in statistics, interpreting...

Decomposing the problem


The task of extracting all association rules with the given confidence and support from the dataset is non-trivial. Let's approach it by decomposing it into smaller subtasks:

  • Find all item sets with the support above the given threshold
  • Generate all possible rules from the item sets that have confidence above the given threshold

Generating all possible rules


We need a method that generates all possible combinations of an array's elements. Combinations are enumerated via the binary representation of subset indices, as shown in the following snippet:

import Foundation // for pow

public extension Array {
    // Enumerates all 2^count subsets of the array: the binary representation
    // of each number in 0..<2^count selects which elements are included.
    func combinations() -> [[Element]] {
        if isEmpty { return [] }
        let numberOfSubsets = Int(pow(2, Double(count)))
        var result = [[Element]]()
        for i in 0..<numberOfSubsets {
            var remainder = i
            var index = 0
            var combination = [Element]()
            // Read the bits of i from least to most significant; a set bit
            // includes the element at that position.
            while remainder > 0 {
                if remainder % 2 == 1 {
                    combination.append(self[index])
                }
                index += 1
                remainder /= 2
            }
            result.append(combination)
        }
        return result
    }
}

The following usage example:

let array = [1,2,3] 
print(array.combinations()) 

Produces:

[[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]

Finding frequent item sets


The first step of the algorithm that we implement is based on the support measure. This function returns a set of all item sets with support larger than minSupport:

func frequentItemSets(minSupport: Double) -> Set<ItemSet> {

    var itemSets = Set<ItemSet>()

    let emptyItemSet: ItemSet = ItemSet()

    // The empty item set occurs in every transaction, so its supporters
    // are all transaction indices; the search grows item sets from it.
    supporters[emptyItemSet] = Array(0 ..< transactions.count)

Here we use the priority queue data structure to keep track of possible extensions.

Note

There is no priority queue implementation in the Foundation or Swift standard libraries, and standard data structures are out of the scope of this book. We are using the open source implementation by David Kopec (MIT license): https://github.com/davecom/SwiftPriorityQueue.

To make it work with item sets, we had to change the code a bit: instead of being parameterized over Comparable types, it is now parameterized over types conforming to the Equatable protocol:

    var queue = PriorityQueue...
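The supporters dictionary filled in during the search maps each item set to the indices of the transactions that contain it. As a rough sketch of what this enables, computing the support of an item set reduces to counting stored indices (the method name below is an assumption, not necessarily the book's exact code):

func support(_ itemSet: ItemSet) -> Double {
    // Fraction of all transactions that contain the given item set.
    return Double(supporters[itemSet]?.count ?? 0) / Double(transactions.count)
}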

The Apriori algorithm


The most famous algorithm for association rule learning is Apriori. It was proposed by Agrawal and Srikant in 1994. The input of the algorithm is a dataset of transactions where each transaction is a set of items. The output is a collection of association rules whose support and confidence are greater than some specified thresholds. The name comes from the Latin phrase a priori (literally, "from what is before") because of one smart observation behind the algorithm: if an item set is infrequent, then we can be sure in advance that all its supersets are also infrequent. For example, if {oatmeal, cornflakes} occurs in too few transactions, no larger set such as {oatmeal, cornflakes, milk} can be frequent either, so all such extensions can be pruned without counting them.

You can implement Apriori with the following steps (a compact sketch follows the list):

  1. Count the support of all item sets of length 1; in other words, calculate the frequency of every item in the dataset.
  2. Drop the item sets that have support lower than the threshold.
  3. Store all the remaining item sets.
  4. Extend each stored item set by one element with all possible extensions. This step is known as candidate generation.
  5. Calculate the support value of each...
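Putting these steps together, here is a compact, self-contained sketch of the level-wise Apriori loop over string items. To keep it short, it uses naive one-item extensions for candidate generation rather than the classical join-and-prune step, and every name in it is illustrative; it is not the implementation we develop below:

func aprioriFrequentItemSets(_ transactions: [Set<String>],
                             minSupport: Double) -> Set<Set<String>> {
    // Fraction of transactions containing the given item set.
    func support(_ items: Set<String>) -> Double {
        Double(transactions.filter { items.isSubset(of: $0) }.count)
            / Double(transactions.count)
    }
    let allItems = Set(transactions.flatMap { $0 })
    // Steps 1-3: count singletons and store the frequent ones.
    var current = Set(allItems.map { Set([$0]) }.filter { support($0) >= minSupport })
    var frequent = current
    while !current.isEmpty {
        // Step 4: candidate generation, extending each stored set by one item.
        var candidates = Set<Set<String>>()
        for itemSet in current {
            for item in allItems where !itemSet.contains(item) {
                candidates.insert(itemSet.union([item]))
            }
        }
        // Step 5: keep the candidates whose support clears the threshold, repeat.
        current = candidates.filter { support($0) >= minSupport }
        frequent.formUnion(current)
    }
    return frequent
}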

Implementing Apriori in Swift


What follows is a simplified version of the code that can be found in the supplementary materials. We will skip some of the less important parts here.

The main method, which returns association rules with the given support and confidence, is as follows:

public func associationRules(minSupport: Double, minConfidence: Double) -> [Rule] {
    var rules = [Rule]()
    let frequent = frequentItemSets(minSupport: minSupport)

    for itemSet in frequent {
        // Split each frequent item set into every (if-part, then-part) pair
        // and keep the rules whose confidence clears the threshold.
        for (ifPart, thenPart) in nonOverlappingSubsetPairs(itemSet) {
            if confidence(ifPart, thenPart) >= minConfidence {
                let rule = Rule(ifPart: convertIndexesToItems(ifPart),
                                thenPart: convertIndexesToItems(thenPart))
                rules.append(rule)
            }
        }
    }

    return rules
}
func nonOverlappingSubsetPairs(_ itemSet: ItemSet) -> [(ItemSet, ItemSet)] { 
    var result = [(ItemSet, ItemSet)]() 
    let ifParts = Subsets(itemSet) 
 ...
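The listing is cut off at this point. A plausible way to finish the pairing, assuming Subsets enumerates every subset of the item set (an illustrative sketch, not necessarily the book's exact code), is:

    for ifPart in ifParts {
        // Skip the empty set and the full item set: either would leave
        // one side of the rule empty.
        if ifPart.isEmpty || ifPart.count == itemSet.count { continue }
        let thenPart = itemSet.subtracting(ifPart)
        result.append((ifPart, thenPart))
    }
    return result
}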

Running Apriori


And finally, this is how we use the algorithm with our toy example:

let transactions = [["
", "
", "
", "
"], ["
", "
", "
"], ["
", "
", "
"], ["
", "
"], ["
", "
"], ["
", "
"], ["
"] ] let apriori = Apriori<String>(transactions: transactions) let rules = apriori.associationRules(minSupport: 0.3, minConfidence: 0.5) for rule in rules { print(rule) print("Confidence: ", apriori.confidence(rule), "Lift: ", apriori.lift(rule), "Conviction: ", apriori.conviction(rule)) }

It produces the following:

(The item sets in each rule were rendered as pictograms that did not survive extraction.)

{ … → … } Confidence: 0.8 Lift: 1.4 Conviction: 2.14285714285714
{ … → … } Confidence: 1.0 Lift: 1.4 Conviction: inf
{ … → … } Confidence: 0.75 Lift: 1.3125 Conviction: 1.71428571428571
{ … → … } Confidence: 0.75 Lift: 1.3125 Conviction: 1.71428571428571

Let's analyze what's going on here. The second rule has the maximum confidence as well as conviction...

Running Apriori on real-world data


In this example, we collected real-world shopping lists from one household and composed a small but nevertheless realistic dataset. Let's see if we can extract any meaningful rules from it using our algorithm. Please note that this dataset is extremely small; for any production application of Apriori, you will need much larger datasets:

let transactions =  
[["Grapes", "Cheese"], 
["Cheese", "Milk"], 
["Apples", "Oranges", "Cheese", "Gingerbread", "Marshmallows", "Eggs", "Canned vegetables"], 
["Tea", "Apples", "Bagels", "Marshmallows", "Icecream", "Canned vegetables"], 
["Cheese", "Buckwheat", "Cookies", "Oatmeal", "Banana", "Butter", "Bread", "Apples", "Baby puree"], 
["Baby puree", "Cookies"], 
["Cookies"], 
["Chicken", "Grapes", "Pizza", "Cheese", "Marshmallows", "Cream"], 
["Potatoes"], 
["Chicken"], 
["Сornflakes", "Cookies", "Oatmeal"], 
["Tea"], 
["Chicken"], 
["Chicken", "Eggs", "Cheese", "Oatmeal", "Bell pepper", "Bread", "Chocolate...

The pros and cons of Apriori


The pros of Apriori are as follows:

  • It is the simplest and easiest-to-understand of the association rule learning algorithms
  • The resulting rules are intuitive and easy to communicate to an end user
  • It doesn't require labeled data as it is fully unsupervised; as a result, you can use it in many different situations because unlabeled data is often more accessible
  • Many extensions have been proposed on top of it for different use cases; for example, there are association rule learning algorithms that take into account the ordering of items, their number, and associated timestamps
  • The algorithm is exhaustive, so it finds all the rules with the specified support and confidence

The cons of Apriori are as follows:

  • If the dataset is small, the algorithm can find many false associations that happened simply by chance. You can address this issue by evaluating obtained rules on the held-out test data for the support, confidence, lift, and conviction values.
  • As Agrawal...

Building an adaptable user experience


Human-computer interaction is never easy. Computers don't understand speech, sentiments, or body language. However, we are all used to communicating with our smart devices using not-so-smart buttons, drop-downs, pickers, switches, checkboxes, sliders, and hundreds of other controls. Together, these controls form a kind of language that is commonly referred to as the UI. Slowly but unavoidably, machine learning has made its way into all areas where computers interact directly with humans: voice input, handwriting input, lip reading, gesture recognition, body pose estimation, facial emotion recognition, sentiment analysis, and so on. This may not be immediately obvious, but machine learning is the future of both UI and UX. Today, machine learning is already changing the way users interact with their devices. Machine learning-based solutions are likely to become widely adopted in UIs because of their convenience. Furthermore, ranking, contextual suggestions, automatic...

Summary


In this chapter, we explored association rule learning, which is a branch of unsupervised learning. We implemented the Apriori algorithm, which can be used to find patterns in the form of rules in different transactional datasets. Apriori's classical use case is market basket analysis. However, it is also important conceptually, because rule learning algorithms bridge the gap between classical artificial intelligence approaches (logic programming, concept learning, graph search, and so on) and logic-based machine learning (decision trees).

In the following chapter, we're going to return to supervised learning, but this time we will switch our attention from non-parametric models, such as KNN and k-means, to parametric linear models. We will also discuss linear regression and the gradient descent optimization method.

Bibliography


  1. Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, and Shalom Tsur, Dynamic itemset counting and implication rules for market basket data, in Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 255-264, Tucson, Arizona, USA, May 1997
  2. Rakesh Agrawal and Ramakrishnan Srikant, Fast Algorithms for Mining Association Rules, in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pages 487-499, Santiago, Chile, September 1994, available at: http://www.vldb.org/conf/1994/P487.PDF

 
