You're reading from NumPy Cookbook

Product type Book

Published in Oct 2012

Publisher Packt

ISBN-13 9781849518925

Pages 226 pages

Edition 1st Edition

Languages

Python

Concepts

Data Science

Chapter 3. Get to Grips with Commonly Used Functions

In this chapter, we will cover a number of commonly used functions:

sqrt, log, arange, astype, and sum
ceil, modf, where, ravel, and take
sort and outer
diff, sign, eig
histogram and polyfit
compress and randint

We will be discussing these functions through the following recipes:

Summing Fibonacci numbers
Finding prime factors
Finding palindromic numbers
The steady state vector determination
Discovering a power law
Trading periodically on dips
Simulating trading at random
Sieving integers with the Sieve of Eratosthenes

Introduction

This chapter covers commonly used NumPy functions, the ones you are likely to reach for on a daily basis, although your own usage may differ. NumPy has so many functions that it is virtually impossible to know them all, but the functions in this chapter are the bare minimum with which we must be familiar. You can download the source code for this chapter from the book website, http://www.packtpub.com.

Summing Fibonacci numbers

In this recipe, we will sum the even-valued terms in the Fibonacci sequence whose values do not exceed four million. The Fibonacci sequence is a sequence of integers that starts with zero and one, where every subsequent number is the sum of the previous two.

Note

For more information, read the Wikipedia article about Fibonacci numbers at http://en.wikipedia.org/wiki/Fibonacci_number .

This recipe uses a formula based on the golden ratio, which is an irrational number with special properties comparable to pi. We will use the sqrt, log, arange, astype, and sum functions.

How to do it...

The first thing to do is calculate the golden ratio (http://en.wikipedia.org/wiki/Golden_ratio), also called the golden section or golden mean.

Calculate the golden ratio.
We will be using the sqrt function to calculate the square root of five:
```
import numpy

phi = (1 + numpy.sqrt(5))/2
print("Phi", phi)
```
This prints the golden mean:
```
Phi 1.618033988749895
```
Find the index below...
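
The formula-based approach described in this recipe can be sketched as follows. This is a minimal illustration, not the book's own listing: the function name `sum_even_fibonacci` is hypothetical, and it relies on the standard identity that the n-th Fibonacci number equals phi**n / sqrt(5) rounded to the nearest integer. The index bound is estimated with a logarithm and then padded by one, so a final filter removes any term above the limit.

```python
import numpy as np

def sum_even_fibonacci(limit):
    """Sum the even Fibonacci numbers that do not exceed `limit` (hypothetical helper)."""
    phi = (1 + np.sqrt(5)) / 2
    # Estimate the largest index n with F(n) <= limit by inverting F(n) ~ phi**n / sqrt(5).
    n_max = int(np.log(limit * np.sqrt(5) + 0.5) / np.log(phi))
    # Pad the range by one index; terms above the limit are filtered out below.
    n = np.arange(1, n_max + 2)
    # Rounding phi**n / sqrt(5) gives F(n) exactly for the small n used here.
    fib = np.rint(phi ** n / np.sqrt(5)).astype(np.int64)
    fib = fib[fib <= limit]
    return int(fib[fib % 2 == 0].sum())

total = sum_even_fibonacci(4 * 10 ** 6)
print("Sum of even terms", total)
```

Rounding with `rint` before converting to integers avoids the truncation errors that a bare `astype(int)` could introduce when a floating-point result lands just below a whole number.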

Finding prime factors

Prime factors (http://en.wikipedia.org/wiki/Prime_factor) are prime numbers that divide an integer exactly, without a remainder. Finding prime factors can seem like an intractable problem. However, with the right algorithm, Fermat's factorization method (http://en.wikipedia.org/wiki/Fermat%27s_factorization_method), and NumPy, it becomes much more tractable. The idea is to factor a number N into two numbers c and d, according to the following equation:

$$N = cd = (a + b)(a - b) = a^2 - b^2$$

We can apply the factorization recursively, until we get the required prime factors.

How to do it...

The algorithm requires us to try a number of trial values for a.

Create an array of trial values.
It makes sense to create a NumPy array and eliminate the need for loops. However, you should be careful to not create an array that is too big in terms of memory requirements. On my system, an array of a million elements seems to be just the right size:
```
a = numpy.ceil(numpy.sqrt(n))
lim = min(n, LIM)
a = numpy.arange(a, a + lim)
b2 = a ** 2 - n
```
We...
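
As a rough sketch of how the trial values can be used, the following function searches for the first a for which a² - n is a perfect square and returns the two factors a - b and a + b. The name `fermat_factor` and the parameter `max_trials` are assumptions made for this illustration, and the code assumes n is odd and small enough for 64-bit arithmetic to be exact.

```python
import numpy as np

def fermat_factor(n, max_trials=10 ** 6):
    """Split an odd integer n into two factors with Fermat's method (hypothetical helper).

    Returns (c, d) with c * d == n, or None if no trial value works.
    """
    a0 = int(np.ceil(np.sqrt(n)))
    # Trial values for a, created in one go instead of looping.
    a = np.arange(a0, a0 + min(n, max_trials), dtype=np.int64)
    b2 = a ** 2 - n
    # b2 is a perfect square exactly when its rounded square root squares back to it.
    b = np.rint(np.sqrt(b2)).astype(np.int64)
    hits = np.flatnonzero(b * b == b2)
    if hits.size == 0:
        return None
    i = hits[0]
    return int(a[i] - b[i]), int(a[i] + b[i])

print(fermat_factor(5959))
```

For a prime input, the only split the method can find is the trivial one, 1 times the number itself, which is how a recursive factorization would know when to stop.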

Finding palindromic numbers

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 x 99. Let's try to find the largest palindrome made from the product of two 3-digit numbers.

How to do it...

We will create an array to hold 3-digit numbers from 100 to 999 using our favorite NumPy function arange.

Create a 3-digit numbers array.
Check the first and last element of the array with the assert_equal function from the numpy.testing package:
```
a = numpy.arange(100, 1000)
numpy.testing.assert_equal(100, a[0])
numpy.testing.assert_equal(999, a[-1])
```
Create the products array.
Now, we will create an array to hold all the possible products of the elements of the 3-digit array with itself. We can accomplish this with the outer function. The resulting array needs to be flattened with ravel, so that we can easily iterate over it. Call the sort method on the array to make sure the array is properly sorted. After that, we can do some sanity checks...
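
Putting the outer, ravel, and sort steps together, one possible sketch of the search looks like this. The function name `largest_palindrome_product` and its arguments are hypothetical; the palindrome test simply compares the decimal string of each product with its reverse, scanning from the largest product downward.

```python
import numpy as np

def largest_palindrome_product(lo, hi):
    """Largest palindromic product of two integers in [lo, hi) (hypothetical helper)."""
    a = np.arange(lo, hi)
    # All pairwise products, flattened to one dimension and sorted ascending.
    products = np.ravel(np.outer(a, a))
    products.sort()
    # Walk from the largest product down and stop at the first palindrome.
    for p in products[::-1]:
        s = str(int(p))
        if s == s[::-1]:
            return int(p)
    return None

print(largest_palindrome_product(10, 100))    # 2-digit case from the text
print(largest_palindrome_product(100, 1000))  # 3-digit case
```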

The steady state vector determination

A Markov chain is a system that has at least two states. For detailed information on Markov chains, please refer to http://en.wikipedia.org/wiki/Markov_chain. The state at time t depends on the state at time t-1, and only on the state at t-1. The system switches at random between these states. I would like to define a Markov chain for a stock. Let's say that we have the states flat F, up U, and down D. We can determine the steady state based on end-of-day close prices.

Far into the distant future, or in theory after infinite time, the state of our Markov chain system will not change anymore. This is also called a steady state (http://en.wikipedia.org/wiki/Steady_state). The stochastic matrix (http://en.wikipedia.org/wiki/Stochastic_matrix) A, which contains the state transition probabilities, yields the same state x when applied to the steady state. The mathematical notation for this will be as follows:

$$Ax = x$$

Another way to look at this is as the eigenvector...
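
The eigenvector view can be illustrated with a small sketch. The transition probabilities below are made up for illustration and are not derived from any stock data; the matrix is assumed to be column-stochastic (each column sums to one) so that the steady state satisfies Ax = x. The helper name `steady_state` is likewise hypothetical.

```python
import numpy as np

# Hypothetical column-stochastic transition matrix for the states (F, U, D).
# Entry A[i, j] is the assumed probability of moving from state j to state i.
A = np.array([[0.2, 0.3, 0.3],
              [0.4, 0.4, 0.3],
              [0.4, 0.3, 0.4]])

def steady_state(A):
    """Return the eigenvector of A for the eigenvalue closest to 1, scaled to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(A)
    i = np.argmin(np.abs(eigenvalues - 1.0))
    x = np.real(eigenvectors[:, i])
    # Normalize so the entries can be read as probabilities.
    return x / x.sum()

x = steady_state(A)
print("Steady state", x)
print("A applied to steady state", A @ x)
```

Dividing by the sum also fixes the sign, since eig may return the eigenvector with all entries negative.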

Discovering a power law

For the purpose of this recipe, imagine that we are operating a hedge fund. Let it sink in; you are part of the one percent now!

Power laws occur in a lot of places; see http://en.wikipedia.org/wiki/Power_law for more information. The Pareto principle (http://en.wikipedia.org/wiki/Pareto_principle), for instance, which is a power law, states that wealth is unevenly distributed. This principle tells us that if we group people by their wealth, the sizes of the groups will differ by orders of magnitude. To put it simply, there are not a lot of rich people, and there are even fewer billionaires; hence the one percent.

Assume that the log returns of closing stock prices follow a power law. This is a big assumption, of course, but power law assumptions seem to pop up all over the place.

We don't want to trade too often, because of the transaction costs involved in each trade. Let's say that we would prefer to buy and sell once a month based on a significant correction (in other words a big drop...
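
To make the power law idea concrete: if y = c·x^k, then log y = log c + k·log x, so the exponent can be estimated with a straight-line fit in log-log space. The sketch below uses polyfit on made-up, exactly power-law data; the helper name `fit_power_law` and the sample values are assumptions for illustration only, and real return data would be far noisier.

```python
import numpy as np

def fit_power_law(x, y):
    """Estimate k and c in y = c * x**k with a linear fit on log-log data (hypothetical helper)."""
    # polyfit returns the slope first, then the intercept, for a degree-1 fit.
    k, log_c = np.polyfit(np.log(x), np.log(y), 1)
    return k, np.exp(log_c)

# Made-up data that follows y = 3 * x**-2 exactly.
x = np.arange(1.0, 11.0)
y = 3.0 * x ** -2.0

k, c = fit_power_law(x, y)
print("Exponent", k, "Prefactor", c)
```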

Trading periodically on dips

Stock prices periodically dip and go up. We will have a look at the probability distribution of the stock price log returns.

Let's start by downloading the historical data for a stock; for instance, AAPL. Next, calculate the daily log returns (http://en.wikipedia.org/wiki/Rate_of_return) of the close prices. We will skip these steps because they were already done in the previous recipe.
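
For reference, the log return step amounts to taking the difference of the logarithms of consecutive close prices. The prices in this sketch are hypothetical placeholders rather than real AAPL data.

```python
import numpy as np

# Hypothetical close prices; in the recipe these would come from downloaded data.
close = np.array([100.0, 102.0, 101.0, 105.0, 103.0])

# Daily log returns: log(close[t] / close[t-1]) for each consecutive pair of days.
logreturns = np.diff(np.log(close))
print("Log returns", logreturns)
```

Log returns are additive, so their sum equals the log of the last price divided by the first.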

Getting ready

If necessary, install Matplotlib and SciPy. Refer to the See Also section for the corresponding recipes.

How to do it...

Now comes the interesting part.

Calculate breakout and pullback.
Let's say we want to trade five times per year, or roughly every 50 days. One strategy would be to buy when the price drops by a certain percentage (a pullback), and sell when the price increases by another percentage (a breakout).
By setting the percentile appropriate for our trading frequency, we can match the corresponding log returns. SciPy offers the scoreatpercentile function, which we...
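
The percentile matching can be sketched as follows. Note two assumptions: this sketch swaps in NumPy's `percentile` function in place of SciPy's scoreatpercentile, and the log returns are synthetic random numbers standing in for real data. Trading roughly every 50 days corresponds to a frequency of 1/50, that is, the 2nd and 98th percentiles.

```python
import numpy as np

# Synthetic stand-in for daily log returns (not real market data).
rng = np.random.default_rng(42)
logreturns = rng.normal(0.0, 0.02, size=1000)

every = 50            # trade roughly once every 50 days
freq = 1.0 / every    # fraction of days on which we expect a signal

# Log return thresholds matching the desired trading frequency.
pullback = np.percentile(logreturns, 100 * freq)
breakout = np.percentile(logreturns, 100 * (1 - freq))

# Days on which the return falls below the pullback or rises above the breakout.
buy_days = np.flatnonzero(logreturns < pullback)
sell_days = np.flatnonzero(logreturns > breakout)
print("Pullback", pullback, "Breakout", breakout)
print("Buy signals", len(buy_days), "Sell signals", len(sell_days))
```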

Simulating trading at random

In the previous recipe, we tried out a trading idea. However, we have no benchmark that can tell us if the result we got was any good. It is common in such cases to trade at random, under the assumption that we should be able to beat a random process. We will simulate trading by taking some random days from a trading year. This should illustrate working with random numbers using NumPy.

Getting ready

If necessary, install Matplotlib. Refer to the See Also section for the corresponding recipe.

How to do it...

First, we need an array filled with random integers.

Generate random indices.
Generate random integers with the NumPy randint function. This will be linked to random days of a trading year:
```
def get_indices(high, size):
    return numpy.random.randint(0, high, size)  # size random integers in [0, high)
```
Simulate trades.
Simulate trades with the random indices from the previous step. Use the NumPy take function to extract random close prices from the array of step 1:
```
buys = numpy.take(close, get_indices(len(close), nbuys))
sells = numpy...
```
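
A self-contained sketch of one simulation run might look like the following. The close prices are a made-up linear ramp, and measuring the outcome as total sell value minus total buy value is an assumption of this sketch rather than something stated in the excerpt; `simulate_random_trades` is a hypothetical name.

```python
import numpy as np

def get_indices(high, size):
    """Draw `size` random day indices from the range [0, high)."""
    return np.random.randint(0, high, size)

def simulate_random_trades(close, nbuys, nsells):
    """Buy and sell on random days; return total sell value minus total buy value."""
    buys = np.take(close, get_indices(len(close), nbuys))
    sells = np.take(close, get_indices(len(close), nsells))
    return sells.sum() - buys.sum()

# Hypothetical close prices for one trading year of about 250 days.
close = np.linspace(100.0, 120.0, 250)

# Repeat the simulation to see how the random outcome is distributed.
profits = np.array([simulate_random_trades(close, 5, 5) for _ in range(2000)])
print("Mean", profits.mean(), "Std", profits.std())
```

Repeating the run many times gives a distribution of outcomes against which a real strategy's result can be compared.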

Sieving integers with the Sieve of Eratosthenes

The Sieve of Eratosthenes (http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes) is an algorithm that finds prime numbers by filtering out composite numbers. It iteratively identifies the multiples of the primes found so far and removes them. This sieve is efficient for primes smaller than 10 million. Let's now try to find the 10001st prime number.

How to do it...

The first mandatory step is to create a list of natural numbers.

Create a list of consecutive integers.
NumPy has the arange function for that:
```
a = numpy.arange(i, i + LIM, 2)
```
Sieve out multiples of p.
We are not sure if this is what Eratosthenes wanted us to do, but it works. In the following code, we are passing a NumPy array and getting rid of all the elements that have a zero remainder, when divided by p:
```
a = a[a % p != 0]
```

The following is the entire code for this problem:

```
import numpy

LIM = 10 ** 6
N = 10 ** 9
P = 10001
primes = []
p = 2

# By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
# What...
```
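
Since the listing above is cut off, here is an independent, minimal sketch of the same idea. It is not the book's code: it uses the classic boolean-array form of the sieve rather than filtering with the modulo operator, and the names `primes_below`, `nth_prime`, and the default `limit` are assumptions chosen for this illustration.

```python
import numpy as np

def primes_below(limit):
    """Return all primes smaller than `limit` using a boolean Sieve of Eratosthenes."""
    is_prime = np.ones(limit, dtype=bool)
    is_prime[:2] = False  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p squared.
            is_prime[p * p::p] = False
    return np.flatnonzero(is_prime)

def nth_prime(n, limit=200_000):
    """Return the n-th prime (1-based), assuming it is smaller than `limit`."""
    primes = primes_below(limit)
    if n > len(primes):
        raise ValueError("limit is too small to contain the requested prime")
    return int(primes[n - 1])

print(nth_prime(6))
print(nth_prime(10001))
```

The slice assignment `is_prime[p * p::p] = False` crosses out all multiples of p in one vectorized step, which keeps the Python-level loop limited to candidates up to the square root of the limit.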