You're reading from  Python for Secret Agents - Volume II - Second Edition

Product type: Book
Published in: Dec 2015
Reading level: Intermediate
ISBN-13: 9781785283406
Edition: 2nd Edition
Author: Steven F. Lott

Steven Lott has been programming since computers were large, expensive, and rare. Working for decades in high tech has given him exposure to a lot of ideas and techniques, some bad, but most helpful to others. Since the 1990s, Steven has been engaged with Python, crafting an array of indispensable tools and applications. His profound expertise has led him to contribute significantly to Packt Publishing, penning notable titles like "Mastering Object-Oriented Python," "The Modern Python Cookbook," and "Functional Python Programming." A self-proclaimed technomad, Steven's unconventional lifestyle sees him residing on a boat, often anchored along the vibrant east coast of the US. He tries to live by the words “Don't come home until you have a story.”

Chapter 5. Data Collection Gadgets

We've looked at gathering intelligence from web server logs, from the social network, and from hard-to-examine PDF files. In this chapter, we'll see how we can gather data from the real world—the space that isn't inside our computers. This can be poetically termed meatspace to distinguish it from the cyberspace of networked computers.

There are a large number of small, programmable devices with sophisticated network interfaces, processors, and memory. We'll focus on the Arduino family of processors as a way to gather data for analysis. Some agents will prefer to use other processors because they're less expensive, more sophisticated, or have different programming language support.

Our goal is to build our own gadget using a simple sensor. We can easily find numerous online shops that sell a variety of sensors. We can even find kits with a collection of different kinds of sensors. Some agents will prefer to shop for individual sensors on a project-by-project...

Background briefing: Arduino basics


The history of the Arduino combines art and technology. For more information, visit http://aliciagibb.com/wp-content/uploads/2013/01/New-Media-Art-Design-and-the-Arduino-Microcontroller-2.pdf.

More relevant background will come from Getting Started with Arduino by Massimo Banzi, which is the standard starter book. Also, most publishers, like Packt Publishing, have dozens of books on Arduino. We won't duplicate any of that material. We will show how data collection and analysis work with customized gadgets.

The Arduino supports the idea of physical computing—computing that interacts with the real world through sensors and actuators. To this end, the Arduino board has a processor, some pins that the processor can sense, and some pins that the processor can control. There's also a small reset button and a USB connector, plus miscellaneous parts like a crystal for the clock and a connector for power that can be used instead of the USB connector.

Each...

Starting with the digital output pins


We'll use the digital output pins for our first mission. This will show us the basics of preparing a sketch—an Arduino program. We'll download it to an Arduino and watch it work.

The Arduino language is vaguely like Python, with some extraneous punctuation. The language is quite a bit simpler and is statically compiled into hardware-level instructions that are downloaded to the processor.

An Arduino sketch must define two functions: setup() and loop(). The setup() function will run just once when the board is reset. The loop() function will be evaluated repeatedly, as often as possible, by the Arduino processor. The exact timing will vary depending on what additional tasks the processor has to engage in to manage memory and deal with the various devices. It's almost as if the Arduino has an overall piece of code that looks like this:

int main() {
   setup();                  // runs once, after the board is reset
   while (true) { loop(); }  // then loop() is called over and over
}

We don't need to actually write code like this; our sketch is written as if this processing...

Mastering the Arduino programming language


The Arduino programming language is based on C++. In Python, we use indentation to identify the body of an if statement, while statement, a function, or a class. An Arduino sketch will use {} instead of indentation.

While the {} are required syntax, almost all Arduino code that we'll see will be nicely indented as if it were Python.

Similarly, Arduino statements are terminated by ; (semicolon). Python statements end at the end of the line, or the end of the matching (), [], or {}. It's challenging, at first, to remember the ; (semicolon). When we try to upload the sketch to our Arduino, the final syntax check will alert us to a missing ; (semicolon).

Arduino has two kinds of comments. Everything after // to the end of the line is a comment. This is similar to Python's # comment delimiter. Also, Arduino programs can have longer comments which begin with /* and end with */. This will often be used similarly to Python's ''' triple-quote strings. The Arduino /* */ comments can be used...

Seeing a better blinking light


The core blinking light sketch uses a delay(1000) to essentially stop all work for 1 second. If we want to have a more responsive gadget, this kind of delay can be a problem. This design pattern is called Busy Waiting or Spinning: we can do better.

The core loop() function is executed repeatedly. We can use the millis() function to see how long it's been since we turned the LED on or turned the LED off. By checking the clock, we can interleave LED blinking with other operations. We can gather sensor data as well as check for button presses, for example.

Here's a way to blink an LED that allows for additional work to be done:

const int LED = 13; // the on-board LED
void setup() {
    pinMode(LED, OUTPUT);
}
void loop() {
    // Other work goes here.
    heartbeat();
}
// Blinks LED 13 once per second.
void heartbeat() {
    static unsigned long last = 0;  // time of the last change; kept between calls
    unsigned long now = millis();   // milliseconds since the board was reset
    if (now - last > 1000) {        // more than one second has elapsed
        digitalWrite(LED, LOW);
        last = now;
    }
...
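
The timing logic can be tried out away from the hardware. What follows is a minimal Python sketch of the same idea, not the book's code. The Heartbeat class, its poll() method, and run_loop() are hypothetical names. The clock value is passed in as an argument (standing in for millis()) so that the behavior can be checked without waiting, and this model toggles the simulated LED on every expiry, which is an assumption about the elided part of the sketch.

```python
class Heartbeat:
    """Toggle a simulated LED whenever more than `period_ms` has elapsed.

    Hypothetical model of the non-blocking pattern; this is not Arduino code.
    """

    def __init__(self, period_ms=1000):
        self.period_ms = period_ms
        self.last = 0        # time of the most recent toggle, like the static `last`
        self.led_on = False  # simulated state of the LED pin

    def poll(self, now_ms):
        """Call on every pass through the main loop; True if the LED toggled."""
        if now_ms - self.last > self.period_ms:
            self.led_on = not self.led_on
            self.last = now_ms
            return True
        return False


def run_loop(clock_values, heartbeat):
    """Simulate repeated loop() calls; return the times at which the LED toggled."""
    toggles = []
    for now_ms in clock_values:
        # Other work would go here; poll() returns immediately either way.
        if heartbeat.poll(now_ms):
            toggles.append(now_ms)
    return toggles


if __name__ == "__main__":
    # Pretend loop() runs every 250 ms for four seconds.
    print(run_loop(range(0, 4001, 250), Heartbeat()))
```

Because poll() never sleeps, the simulated loop stays free to read sensors or check buttons between toggles, which is the point of replacing delay() with a clock check.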

Simple Arduino sensor data feed


A button is not the simplest kind of sensor to read. While the concept is simple, there's an interesting subtlety to reading a button. The problem is that buttons bounce: they make some intermittent contact before they make a final, solid connection.

There's a simplistic way to debounce that involves a Busy Waiting design. We'll avoid this and show a somewhat more sophisticated debounce algorithm. As with the LED blinking, we'll rely on the millis() function to see how long the button has stayed in a given state.

To debounce, we'll need to save the current state of the button. When the button is pressed, the signal on the input pin will become HIGH and we can save the time at which this leading edge event occurred. When the button is released, the signal will go back to LOW. We can subtract the two times to compute the duration. We can use this to determine if this was a proper press, just a bounce, or even a long press-and-hold event.
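
The duration-based classification can be modeled in Python before committing it to a sketch. This is an illustrative model under stated assumptions, not the Arduino function: the names classify() and press_events() are ours, and the two thresholds (20 ms for a bounce, 1000 ms for a long press) are guesses that a real button would need to be tuned against.

```python
BOUNCE_MS = 20        # assumed: anything shorter is treated as contact bounce
LONG_PRESS_MS = 1000  # assumed: anything at least this long is press-and-hold


def classify(duration_ms):
    """Label one HIGH interval by how long it lasted."""
    if duration_ms < BOUNCE_MS:
        return "bounce"
    if duration_ms >= LONG_PRESS_MS:
        return "long press"
    return "press"


def press_events(samples):
    """Turn (time_ms, level) samples into (start_ms, duration_ms, label) events.

    `level` is True for HIGH and False for LOW. The leading edge records the
    start time; the trailing edge computes the duration by subtraction.
    """
    events = []
    previous = False
    start = None
    for time_ms, level in samples:
        if level and not previous:        # leading edge: LOW -> HIGH
            start = time_ms
        elif previous and not level:      # trailing edge: HIGH -> LOW
            duration = time_ms - start
            events.append((start, duration, classify(duration)))
        previous = level
    return events


if __name__ == "__main__":
    # Made-up samples, for illustration only.
    samples = [
        (0, False), (100, True), (103, False),    # 3 ms blip
        (105, True), (400, False),                # 295 ms press
        (2000, True), (3500, False),              # 1500 ms hold
    ]
    for event in press_events(samples):
        print(event)
```

Keeping the classification separate from the edge detection makes it easy to adjust the thresholds without touching the state tracking.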

The function looks like...

Collecting analog data


Our goal is to gather data from an analog range sensor. This device must be connected to one of the analog input pins. We also need to connect it to power and ground pins. According to the documentation for the GP2Y0A21YK, the sensor has three connections: Vo, GND, and Vcc. With the sensor pointing up, the left-most pin is generally Vo, the output that will connect to Arduino A0 to provide analog input. The center pin is ground, which connects to one of the Arduino GND connections. The right-most pin will be Vcc, the supply voltage that will connect to the +5 V pin on the Arduino.

Many of these devices have a little JST three-wire socket. We can buy a nice little JST three-wire jumper. The color coding of the wires on the jumper may not fit our expectations very well. We may wind up with Vcc on the black-colored wire (black is often used for ground) and the ground connection on the red-colored wire (red is often used for +5 V).

If we connect things up wrong, there will...

Collecting bulk data with the Arduino


First, we'll expand our IR sensor reading sketch to wait for the sensor to show a stable reading. We'll use this stable reading algorithm to gather a batch of 16 samples from the sensor. This will give us some data that we can use for calibration.

The expanded version of the gather_data() function includes three separate features. We've left them in a single function, but they can be decomposed if required to make the Arduino more responsive. Here are the global variables shared by gather_data() and share_data():

unsigned long start_time, end_time;
int raw_reading;
#define MODE_LIMIT 6
int reading[MODE_LIMIT];
int count[MODE_LIMIT];

const int TIME_LIMIT = 5000; // Microseconds.

We defined three important features for our data collection: the start time for reading, the end time for reading, and the raw value that was read. We included two other pieces of data definition that are unique to the Arduino language.

The #define statement creates a symbol that...

Data modeling and analysis with Python


We will use the pyserial module to write a separate data gathering application in Python. For this to work, we'll have to shut down the Arduino IDE so that our Python program can access the USB serial port.

A serial interface will see a stream of individual bits that can be reassembled into bytes. The low-level sequence of signals flips between high and low voltage at a defined rate, called baud. In addition to the baud, there are a number of other parameters that define serial interface configuration.

In some contexts, we might summarize an interface configuration as 9600/8-N-1. This says that we will exchange bits at 9600 baud, using 8-bit bytes, no parity checking, and a single stop bit included after the data bits. The 8-N-1 specification after the "/" is a widely used default and can be safely assumed. The transmission speed of 9600 baud can't be assumed, and needs to be stated explicitly. Our Arduino Serial.begin(9600) in the setup() function specified...
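
To make the 9600/8-N-1 shorthand concrete, here is a hedged sketch of how those settings could map onto pyserial's serial.Serial constructor. The function names open_arduino() and bytes_per_second() are ours, and the port name is a placeholder that varies by operating system and board. The pyserial import sits inside the function so that the arithmetic helper runs even where pyserial isn't installed.

```python
def bytes_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Approximate byte throughput: each byte also carries one start bit."""
    bits_per_byte = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_byte


def open_arduino(port, baud=9600, timeout=2.0):
    """Open `port` with an explicit 8-N-1 configuration.

    `port` is a placeholder such as "/dev/ttyACM0" or "COM3"; the actual
    name depends on the operating system and the board.
    """
    import serial  # pyserial, imported lazily so the helper above has no dependency

    return serial.Serial(
        port,
        baudrate=baud,
        bytesize=serial.EIGHTBITS,     # 8 data bits
        parity=serial.PARITY_NONE,     # N: no parity
        stopbits=serial.STOPBITS_ONE,  # 1 stop bit
        timeout=timeout,               # seconds to wait in read calls
    )


if __name__ == "__main__":
    print(bytes_per_second(9600))
```

With a start bit and a stop bit around every 8 data bits, each byte costs 10 bit times, so 9600 baud moves at most about 960 bytes per second. That is a useful ceiling to keep in mind when deciding how much text the Arduino should print per reading.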

Reducing noise with a simple filter


Is there a way that we can reduce the variability in the output? One possibility is to use a moving average of the raw values. Using an Exponentially Weighted Moving Average (EWMA) algorithm will tend to damp out small perturbations in the data, providing a more stable reading.

This moving average is called exponentially weighted because the weights given to previous values fall off exponentially. The immediately previous value is weighted more heavily than the value before that. All values figure into the current value, but as we go back in time, the weights for those old values become very, very small.

The core calculation for a weighted data point, $\bar{x}_i$, from the raw data point, $x_i$, looks like this:

$$\bar{x}_i = w\,x_i + (1 - w)\,\bar{x}_{i-1}$$

We used a weighting value, w, that sets how much the current raw data point counts relative to the previous weighted values. If w is one, previous values have no influence. If w is zero, the initial value is the only one that matters and new values are ignored.

The very first data...
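
The effect of the weighting value is easy to explore in Python. The following is a minimal sketch, not the chapter's own code: the function name ewma() is ours, the first raw value is used to seed the average (an assumption of this sketch), and the sample values are made up purely for illustration.

```python
def ewma(raw_values, w):
    """Return the exponentially weighted moving average of `raw_values`.

    `w` weights the newest raw value; (1 - w) weights the previous average.
    The first raw value seeds the average (an assumption of this sketch).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must be between 0 and 1")
    smoothed = []
    current = None
    for x in raw_values:
        current = x if current is None else w * x + (1 - w) * current
        smoothed.append(current)
    return smoothed


if __name__ == "__main__":
    noisy = [100, 104, 96, 100, 140, 100]  # made-up readings with one spike
    print(ewma(noisy, 1.0))   # w = 1: output tracks the raw values exactly
    print(ewma(noisy, 0.0))   # w = 0: output stays at the initial value
    print(ewma(noisy, 0.25))  # in between: the spike is damped
```

Running the function with several values of w over the same readings shows the trade-off directly: smaller values give a steadier output but respond more slowly to a genuine change in distance.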

Solving problems adding an audible alarm


We used LEDs to provide feedback. We started with a heartbeat LED that showed us that our sketch was running properly. We can easily add LEDs based on other conditions. For example, we might add red and green LEDs to show when the distance being measured is outside certain limits.

We'll need to add appropriate resistors for these LEDs. We'll also need to allocate two pins to control these LEDs.

We can convert the raw measurement into a distance with a simple calculation in the Arduino program. We might add code somewhat like this:

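float next = debounce_ir();          // newest raw reading from the IR sensor
float raw = next*w + (1-w)*current;  // weighted moving average of the raw readings
float d = -0.12588*raw + 43.90;      // linear model: raw value to distance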

This depends on a debounce_ir() function that reads a single distance value from the IR device. This is a small change to our gather_data() function. We want to return a value instead of updating a global variable.

We used an EWMA algorithm to compute the weighted moving average of the sequence of raw values. This is saved in a global...
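
Before wiring up the extra LEDs, the decision logic can be prototyped in Python. This is a sketch under stated assumptions rather than the book's implementation: the slope and intercept are copied from the fragment above, while the limit values, the function names, and the choice of green for "inside the limits" and red for "outside" are all hypothetical.

```python
SLOPE = -0.12588     # coefficients copied from the Arduino fragment above
INTERCEPT = 43.90
NEAR_LIMIT = 10.0    # hypothetical limits, in the same units as the model output
FAR_LIMIT = 30.0


def distance(raw):
    """Apply the linear model to a (smoothed) raw reading."""
    return SLOPE * raw + INTERCEPT


def led_states(d, near=NEAR_LIMIT, far=FAR_LIMIT):
    """Return (red_on, green_on): green inside the limits, red outside."""
    in_range = near <= d <= far
    return (not in_range, in_range)


def step(current, next_raw, w):
    """One pass of the loop: smooth, convert, and pick the LED states."""
    smoothed = next_raw * w + (1 - w) * current
    d = distance(smoothed)
    return smoothed, d, led_states(d)


if __name__ == "__main__":
    smoothed, d, (red, green) = step(current=200.0, next_raw=200.0, w=0.5)
    print(smoothed, d, red, green)
```

Returning the new smoothed value from step() mirrors the idea of keeping the running average between passes, without relying on a global variable in the model.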

Summary


In this chapter, we looked at data collection at the gadget level. We used an Arduino board to build a data collection gadget. We looked at how we can build a simple display using LEDs. We also discussed providing simple input by debouncing a push button. We used Python programs to help us design some of these circuits.

We also used the analog input pins to gather data from an infrared distance sensor. To be sure that we've got reliable, usable data, we wrote a calibration process. We collected that raw data and analyzed it in Python. Thanks to Python's sophisticated statistics module, we were able to evaluate the quality of the results.

We used Python to build a linear model that maps the raw measurements to accurate distance measurements. We can use these Python analytical modules for ongoing calibration and fine-tuning the way this device works.

We also used Python to examine the parameters for an EWMA algorithm. We can use Python to explore the weighting factor. It's very easy to process...
