NumPy Essentials

NumPy Essentials: Boost your scientific and analytic capabilities in no time at all by discovering how to build real-world applications with NumPy

By Leo (Liang-Huan) Chin, Tanmay Dutta, Shane Holloway
Book Apr 2016 156 pages 1st Edition

Product Details


Publication date : Apr 28, 2016
Length 156 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781784393670

NumPy Essentials

Chapter 1. An Introduction to NumPy

 

"I'd rather do math in a general-purpose language than try to do general-purpose programming in a math language."                                                                                                                     

 
 -- John D Cook

Python has become one of the most popular programming languages in scientific computing over the last decade. The reasons for its success are numerous, and these will gradually become apparent as you proceed with this book. Unlike mathematical languages such as MATLAB, R, and Mathematica, Python is a general-purpose programming language. As such, it provides a suitable framework to build scientific applications and extend them further into any commercial or academic domain. For example, consider a (somewhat) simple application that requires you to write a piece of software that predicts the popularity of a blog post. Usually, these would be the steps that you'd take to do this:

  1. Generating a corpus of blog posts and their corresponding ratings (assuming that the ratings here are suitably quantifiable).
  2. Formulating a model that generates ratings based on content and other data associated with the blog post.
  3. Training a model on the basis of the data you found in step 1. Keep doing this until you are confident of the reliability of the model.
  4. Deploying the model as a web service.

Normally, as you move through these steps, you will find yourself jumping between different software stacks. Step 1 requires a lot of web scraping. Web scraping is a very common problem, and there are tools in almost every programming language to scrape the Web (if you are already using Python, you would probably choose Beautiful Soup or Scrapy). Steps 2 and 3 involve solving a machine learning problem and require the use of sophisticated mathematical languages or frameworks, such as Weka or MATLAB, which are only two of the many tools that provide machine learning functionality. Similarly, step 4 can be implemented in many ways using many different tools. There isn't one right answer. Since this problem has been amply studied and solved (to a reasonable extent) by many scientists and software developers, getting a working solution would not be difficult. However, issues such as stability and scalability might severely restrict your choice of programming languages, web frameworks, or machine learning algorithms in each step of the problem.

This is where Python wins over most other programming languages. All the preceding steps (and more) can be accomplished with only Python and a few third-party Python libraries. This flexibility and ease of developing software in Python is precisely what makes it a comfortable host for a scientific computing ecosystem. A very interesting interpretation of Python's prowess as a mature application development language can be found in Python Data Analysis, Ivan Idris, Packt Publishing. In short, Python is used both for rapid prototyping and for building production-quality software, thanks to the vast scientific ecosystem it has acquired over time. The cornerstone of this ecosystem is NumPy.

Numerical Python (NumPy) is a successor to the Numeric package. It was originally written by Travis Oliphant to be the foundation of a scientific computing environment in Python. It branched off from the much wider SciPy module in early 2005 and had its first stable release in mid-2006. Since then, it has enjoyed growing popularity among Pythonistas who work in the mathematics, science, and engineering fields. The goal of this book is to make you conversant enough with NumPy to build complex scientific applications with it.

The scientific Python stack


Let's begin by taking a brief tour of the Scientific Python (SciPy) stack.

Note

Note that SciPy can mean a number of things: the Python module named scipy (http://www.scipy.org/scipylib), the entire SciPy stack (http://www.scipy.org/about.html), or any of the three conferences on scientific Python that take place all over the world.

Figure 1: The SciPy stack, standard, and extended libraries

Fernando Perez, the primary author of IPython, said in his keynote at PyCon Canada 2012:

"Computing in science has evolved not only because software has evolved, but also because we, as scientists, are doing much more than just floating point arithmetic."

This is precisely why the SciPy stack boasts such rich functionality. The evolution of most of the SciPy stack is motivated by teams of scientists and engineers trying to solve scientific and engineering problems in a general-purpose programming language. A one-line explanation of why NumPy matters so much is that it provides the core multidimensional array object that is necessary for most tasks in scientific computing. This is why it is at the root of the SciPy stack, and why it is a dependency for every other SciPy package.

NumPy provides an easy way to interface with legacy Fortran and C/C++ numerical code using time-tested scientific libraries, which we know have been working well for decades. Companies and labs across the world use Python to glue together legacy code that has been around for a long time. In short, NumPy allows us to stand on the shoulders of giants; we do not have to reinvent the wheel. The NumPy ndarray object, which is the subject of the next chapter, is essentially a Pythonic interface to data structures used by libraries written in Fortran, C, and C++. In fact, the internal memory layouts used by NumPy ndarray objects implement C and Fortran layouts. This will be addressed in detail in upcoming chapters.
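As a rough illustration of the C and Fortran layouts mentioned above, the following sketch builds one small array in each layout and inspects how the elements are laid out in memory. The variable names are only illustrative, and the stride values in the comments assume 8-byte integers:

```python
import numpy as np

# A 2 x 3 array of 8-byte integers; NumPy uses C (row-major) order by default.
c_arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# The same values copied into Fortran (column-major) order.
f_arr = np.asfortranarray(c_arr)

# The flags report which memory layout each array uses.
c_is_c = c_arr.flags['C_CONTIGUOUS']
f_is_f = f_arr.flags['F_CONTIGUOUS']

# Strides are the byte steps between neighbouring elements along each axis.
c_strides = c_arr.strides   # (24, 8): moving one row down skips 3 elements
f_strides = f_arr.strides   # (8, 16): moving one column right skips 2 elements

# The layout changes where values sit in memory, not what the array contains.
same_values = bool(np.array_equal(c_arr, f_arr))

print(c_is_c, f_is_f, c_strides, f_strides, same_values)
```

Both arrays compare equal element by element; only the order in which the elements are stored differs, which is what lets the same ndarray interface hand data to C or Fortran routines.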

The next layer in the stack consists of SciPy, matplotlib, IPython (the interactive shell of Python; we will use it for the examples throughout the book, and details of its installation and usage will be provided in later sections), and SymPy modules. SciPy provides the bulk of the scientific and numerical functionality that a major part of the ecosystem relies on. Matplotlib is the de facto plotting and data visualization library in Python. IPython is an increasingly popular interactive environment for scientific computing in Python. In fact, the project has had such active development and enjoyed such popularity that it is no longer limited to Python and extends its features to other scientific languages, particularly R and Julia.

This layer in the stack can be thought of as a bridge between the core array-oriented functionality of NumPy and the domain-specific abstractions provided by the higher layers of the stack. These domain-specific tools are commonly called SciKits; popular ones among them are scikit-image (image processing), scikit-learn (machine learning), statsmodels (statistics), pandas (advanced data analysis), and so on. Listing every scientific package in Python would be nearly impossible since the scientific Python community is very active, and there is always a lot of development happening for a large number of scientific problems. The best way to keep track of projects is to get involved in the community. It is immensely useful to join mailing lists, contribute to code, use the software for your daily computational needs, and report bugs. One of the goals of this book is to get you interested enough to actively involve yourself in the scientific Python community.

The need for NumPy arrays


A fundamental question that beginners ask is: why are arrays necessary for scientific computing at all? Surely, one can perform complex mathematical operations on any abstract data type, such as a list. The answer lies in the numerous properties of arrays that make them significantly more useful. In this section, let's go over a few of these properties to emphasize why something such as the NumPy ndarray object exists at all.

Representation of matrices and vectors

The abstract mathematical concepts of matrices and vectors are central to many scientific problems. Arrays provide a direct semantic link to these concepts. Indeed, whenever a piece of mathematical literature makes reference to a matrix, one can safely think of an array as the software abstraction that represents the matrix. In scientific literature, an expression such as A_ij is typically used to denote the element in the i-th row and j-th column of matrix A. The corresponding expression in NumPy would simply be A[i,j], with indices counted from zero. For matrix operations, NumPy arrays also support vectorization (details are addressed in Chapter 3, Using NumPy Arrays), which speeds up execution greatly. Vectorization makes the code more concise, easier to read, and much more akin to mathematical notation. Like matrices, arrays can be multidimensional too. Every element of an array is addressable through a set of integers called indices, and the process of accessing elements of an array with sets of integers is called indexing. This functionality can indeed be implemented without using arrays, but this would be cumbersome and quite unnecessary.
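To make the link between the notation and the code concrete, here is a minimal sketch of indexing and vectorization on a small matrix. The array contents and variable names are made up for illustration:

```python
import numpy as np

# A 2 x 3 matrix stored as a two-dimensional array.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# A[i, j] reads the element in row i and column j (both counted from zero).
element = A[1, 2]          # second row, third column: 6.0

# Vectorized arithmetic applies to every element without an explicit loop.
B = 2 * A + 1

# The equivalent explicit loops are longer and further from the mathematics.
B_loop = np.empty_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        B_loop[i, j] = 2 * A[i, j] + 1

# A matrix-vector product written as a single method call.
x = np.array([1.0, 0.0, -1.0])
y = A.dot(x)               # [1 - 3, 4 - 6] = [-2., -2.]

print(element, np.array_equal(B, B_loop), y)
```

The vectorized line `B = 2 * A + 1` and the nested loops compute the same result; the former reads almost exactly like the formula it implements.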

Efficiency

Efficiency can mean a number of things in software. The term may be used to refer to the speed of execution of a program, its data retrieval and storage performance, its memory overhead (the memory consumed when a program is executing), or its overall throughput. NumPy arrays are better than most other data structures with respect to almost all of these characteristics (with a few exceptions such as pandas DataFrames or SciPy's sparse matrices, which we shall deal with in later chapters). Since NumPy arrays are statically typed and homogeneous, fast mathematical operations can be implemented in compiled languages (the default implementation uses C and Fortran). Efficiency (the availability of fast algorithms working on homogeneous arrays) makes NumPy popular and important.
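The following sketch shows what "statically typed and homogeneous" looks like in practice. It does not measure speed; it only illustrates, with made-up data, that every element shares one type and size, and that one array expression replaces a Python-level loop:

```python
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.float64)

# Every element has the same type, so each one occupies the same number of bytes.
element_type = arr.dtype       # float64
bytes_per_item = arr.itemsize  # 8 bytes per element
total_bytes = arr.nbytes       # 1000 elements * 8 bytes = 8000

# With a list, the loop over elements runs in the Python interpreter.
squares_list = [v * v for v in values]

# With an array, the same loop runs inside NumPy's compiled code.
squares_arr = arr * arr
total = squares_arr.sum()

print(element_type, bytes_per_item, total_bytes, total)
```

Because the element type and size are known up front, NumPy can hand the whole buffer to a compiled routine instead of inspecting each element's type one at a time.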

Ease of development

The NumPy module is a powerhouse of off-the-shelf functionality for mathematical tasks, and it adds greatly to Python's ease of development. A far more detailed treatment of the module is available in the definitive Guide to NumPy by Travis Oliphant. The NumPy API is so flexible that the scientific Python community has adopted it extensively as the standard API for building scientific applications. Examples of how this standard is applied across scientific disciplines can be found in The NumPy Array: A Structure for Efficient Numerical Computation by Van Der Walt and others. The following is a brief summary of what the module contains, most of which we shall explore in this book:

Submodule          Contents
numpy.core         Basic objects
numpy.lib          Additional utilities
numpy.linalg       Basic linear algebra
numpy.fft          Discrete Fourier transforms
numpy.random       Random number generators
numpy.distutils    Enhanced build and distribution
numpy.testing      Unit testing
numpy.f2py         Automatic wrapping of Fortran code
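As a quick taste of three of these submodules, the sketch below solves a small linear system, takes a discrete Fourier transform, and draws a few random numbers. The matrix, signal, and seed are arbitrary choices made for illustration, and the legacy np.random.seed and np.random.rand functions are used because they are available in the NumPy version this book targets:

```python
import numpy as np

# numpy.linalg: solve the linear system M x = b.
M = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(M, b)          # solution is x = [2, 3]

# numpy.fft: the transform of a unit impulse is flat (every coefficient is 1).
signal = np.array([1.0, 0.0, 0.0, 0.0])
spectrum = np.fft.fft(signal)
recovered = np.fft.ifft(spectrum).real

# numpy.random: five uniform samples in [0, 1), seeded for repeatability.
np.random.seed(0)
samples = np.random.rand(5)

print(x, spectrum, recovered, samples)
```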

NumPy in Academia and Industry


It is said that, if you stand at Times Square long enough, you will meet everyone in the world. By now, you should be convinced that NumPy is the Times Square of SciPy. If you are writing scientific applications in Python, there is not much you can do without digging into NumPy. Figure 2 shows the scope of SciPy in scientific computing at varying levels of abstraction. The red arrow denotes the various low-level functions that are expected of scientific software, and the blue arrow denotes the different application domains that exploit these functions. Python, armed with the SciPy stack, is at the forefront of the languages that provide these capabilities.

A Google Scholar search for NumPy returns about 6,280 results. Some of these are papers and articles about NumPy and the SciPy stack itself, and many more are about NumPy's applications in a wide variety of research problems. Academics love Python, as shown by the increasing adoption of Python and the SciPy stack as the primary tools for scientific programming in countless universities and research labs all over the world. The experiences of many scientists and software professionals have been published on the Python website.

Figure 2: Python versus other languages

Code conventions used in the book


Now that the credibility of Python and NumPy has been established, let's get our hands dirty.

The default environment used for all Python code in this book will be IPython. Instructions on how to install IPython and other tools follow in the next section. Throughout the book, you will only have to enter input in either the command window or the IPython prompt. Unless otherwise specified, code will refer to Python code, and command will refer to bash or DOS commands.

All Python input code will be formatted in snippets like these:

 In [42]: print("Hello, World!")

In [42]: in the preceding snippet indicates that this is the 42nd input to the IPython session. Similarly, all input to the command line will be formatted as follows:

 $ python hello_world.py 

On Windows systems, the same command will look something like this:

C:\Users\JohnDoe> python hello_world.py

For the sake of consistency, the $ sign will be used to denote the command-line prompt, regardless of OS. Prompts such as C:\Users\JohnDoe> will not appear in the book. While the $ sign conventionally indicates a Bash prompt on Unix systems, the same commands (typed without the dollar sign itself) can be used on Windows too. If you are using Cygwin or Git Bash, you can also use Bash commands on Windows.

Note that Git Bash is available by default if you install Git on Windows.

Installation requirements


Let's take a look at the various requirements we need to set up before we proceed.

Using Python distributions

The three most important Python modules you need for this book are NumPy, IPython, and matplotlib. The code in this book is compatible with Python 3.4 and 2.7 and is based on NumPy 1.9 and matplotlib 1.4.3. The easiest way to install these requirements (and more) is to install a complete Python distribution, such as Enthought Canopy, EPD, Anaconda, or Python(x,y). Once you have installed any one of these, you can safely skip the remainder of this section and should be ready to begin.

Note

Note for Canopy users: You can use the Canopy GUI, which includes an embedded IPython console, a text editor, and IPython notebook editors. When working with the command line, for best results use the Canopy Terminal found in Canopy's Tools menu.

Note for Windows OS users: Besides the Python distribution, you can also install the prebuilt Windows Python extension packages from Christoph Gohlke's website at http://www.lfd.uci.edu/~gohlke/pythonlibs/

Using Python package managers

You can also use Python package managers, such as enpkg, Conda, pip, or easy_install, to install the requirements using one of the following commands; replace numpy with any other package name you'd like to install, for example, ipython, matplotlib, and so on:

$ pip install numpy
$ easy_install numpy
$ enpkg numpy # for Canopy users
$ conda install numpy # for Anaconda users

Using native package managers

If the Python interpreter you want to use comes with the OS and is not a third-party installation, you may prefer using OS-specific package managers such as aptitude, yum, or Homebrew. The following table illustrates the package managers and the respective commands used to install NumPy:

Package manager    Command
Aptitude           $ sudo apt-get install python-numpy
Yum                $ yum install python-numpy
Homebrew           $ brew install numpy

Note that, when installing NumPy (or any other Python modules) on OS X systems with Homebrew, Python should have been originally installed with Homebrew.

Detailed installation instructions are available on the respective websites of NumPy, IPython, and matplotlib. As a precaution, to check whether NumPy was installed properly, open an IPython terminal and type the following commands:

 In [1]: import numpy as np 
 In [2]: np.test()

If the first statement looks like it does nothing, this is a good sign. If it executes without any output, this means that NumPy was installed and has been imported properly into your Python session. The second statement runs the NumPy test suite. It is not critically necessary, but one can never be too cautious. Ideally, it should run for a few minutes and produce the test results. It may generate a few warnings, but these are no cause for alarm. If you wish, you may run the test suites of IPython and matplotlib, too.
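If you prefer a quicker check than the full test suite, something along the following lines reports which of the three modules can be imported and which NumPy version is in use. The helper name is_installed is hypothetical and written here only for illustration; it is not part of NumPy or IPython:

```python
import numpy as np


def is_installed(module_name):
    """Return True if the named module can be imported (hypothetical helper)."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# Check the three modules this book relies on.
status = {name: is_installed(name) for name in ('numpy', 'IPython', 'matplotlib')}

# The version string of the NumPy installation that was actually imported.
numpy_version = np.__version__

# A tiny computation to confirm that NumPy does more than just import.
check = int(np.arange(5).sum())    # 0 + 1 + 2 + 3 + 4 = 10

print(status, numpy_version, check)
```

A module reported as False here simply could not be imported in the current Python session; installing it with one of the methods above should resolve that.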

Note

Note that the matplotlib test suite only runs reliably if matplotlib has been installed from source. However, testing matplotlib is not strictly necessary. If you can import matplotlib without any errors, it is ready for use.

Congratulations! We are now ready to begin.

Summary


In this chapter, we introduced ourselves to the NumPy module. We took a look at how NumPy is a useful software tool to have for those of you who are working in scientific computing. We installed the software required to proceed through the rest of this book.

In the next chapter, we will get to the powerful NumPy ndarray object and show you how to use it efficiently.


Key benefits

  • Optimize your Python scripts with powerful NumPy modules
  • Explore the vast opportunities to build outstanding scientific/analytical modules by yourself
  • Packed with rich examples to help you master NumPy arrays and universal functions

Description

In today’s world of science and technology, it’s all about speed and flexibility. When it comes to scientific computing, NumPy tops the list. NumPy gives you both the speed and high productivity you need. This book will walk you through NumPy using clear, step-by-step examples and just the right amount of theory. We will guide you through wider applications of NumPy in scientific computing and will then focus on the fundamentals of NumPy, including array objects, functions, and matrices, each of them explained with practical examples. You will then learn about different NumPy modules while performing mathematical operations such as calculating the Fourier Transform; solving linear systems of equations, interpolation, extrapolation, regression, and curve fitting; and evaluating integrals and derivatives. We will also introduce you to using Cython with NumPy arrays and writing extension modules for NumPy code using the C API. This book will give you exposure to the vast NumPy library and help you build efficient, high-speed programs using a wide range of mathematical features.

What you will learn

  • Manipulate the key attributes and universal functions of NumPy
  • Utilize matrix and mathematical computation using linear algebra modules
  • Implement regression and curve fitting for models
  • Perform time frequency / spectral density analysis using the Fourier Transform modules
  • Collate with the distutils and setuptools modules used by other Python libraries
  • Establish Cython with NumPy arrays
  • Write extension modules for NumPy code using the C API
  • Build sophisticated data structures using NumPy arrays with libraries such as pandas and SciKits


Table of Contents

16 Chapters
  • NumPy Essentials
  • Credits
  • About the Authors
  • About the Reviewers
  • www.PacktPub.com
  • Preface
  • An Introduction to NumPy
  • The NumPy ndarray Object
  • Using NumPy Arrays
  • NumPy Core and Libs Submodules
  • Linear Algebra in NumPy
  • Fourier Analysis in NumPy
  • Building and Distributing NumPy Code
  • Speeding Up NumPy with Cython
  • Introduction to the NumPy C-API
  • Further Reading

