You're reading from NumPy Essentials

Product type: Book
Published in: Apr 2016
Reading level: Intermediate
ISBN-13: 9781784393670
Edition: 1st Edition
Authors (3):
Leo (Liang-Huan) Chin

Leo (Liang-Huan) Chin is a data engineer with more than 5 years of experience in the field of Python. He works for Gogoro smart scooter, Taiwan, where his job entails discovering new and interesting biking patterns. His previous work experience includes ESRI, California, USA, which focused on spatial-temporal data mining. He loves data, analytics, and the stories behind data and analytics. He received an MA degree of GIS in geography from State University of New York, Buffalo. When Leo isn't glued to a computer screen, he spends time on photography, traveling, and exploring some awesome restaurants across the world. You can reach Leo at http://chinleock.github.io/portfolio/.

Tanmay Dutta

Tanmay Dutta is a seasoned programmer with expertise in programming languages such as Python, Erlang, C++, Haskell, and F#. He has extensive experience in developing numerical libraries and frameworks for investment banking businesses. He was also instrumental in the design and development of a risk framework in Python (pandas, NumPy, and Django) for a wealth fund in Singapore. Tanmay has a master's degree in financial engineering from Nanyang Technological University, Singapore, and a certification in computational finance from Tepper Business School, Carnegie Mellon University.

Shane Holloway

http://shaneholloway.com/resume/


Chapter 4. NumPy Core and Libs Submodules

After covering so many NumPy ufuncs in the previous chapter, I hope you still remember the very core of NumPy: the ndarray object. We are going to cover the last important attribute of ndarray, strides, which will complete the picture of its memory layout. It's also time to show that NumPy arrays can hold not only numbers but other types of data as well; we will talk about record arrays and date-time arrays. Lastly, we will show how to read and write NumPy arrays from and to files, and start doing some real-world analysis using NumPy.

The topics that will be covered in this chapter are:

  • The core of NumPy arrays: memory layout
  • Structured arrays (record arrays)
  • Date-time in NumPy arrays
  • File I/O in NumPy arrays

Introducing strides


Strides are the indexing scheme of NumPy arrays: for each dimension, they give the number of bytes to jump to reach the next element along that dimension. NumPy's performance comes largely from the numpy.ndarray object, a homogeneous multidimensional array with fixed-size items. We've talked about the shape (dimensions) of the ndarray object, its data type, and its order (C-style row-major arrays and Fortran-style column-major arrays). Now it's time to take a closer look at strides.
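As a quick illustration of how memory order relates to strides, here is a minimal sketch. The array shape, the int32 data type, and the variable names are assumptions chosen for this example, not taken from the book.

```python
import numpy as np

# The same 2x3 shape with 4-byte integers, laid out in both memory orders.
c_arr = np.zeros((2, 3), dtype=np.int32, order="C")  # row-major
f_arr = np.zeros((2, 3), dtype=np.int32, order="F")  # column-major

# C order: moving to the next row skips 3 items (12 bytes),
# moving to the next column skips 1 item (4 bytes).
print(c_arr.strides)  # (12, 4)

# Fortran order: moving to the next row skips 1 item (4 bytes),
# moving to the next column skips 2 items (8 bytes).
print(f_arr.strides)  # (4, 8)
```

In both cases the shape is identical; only the byte steps per dimension differ.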

Let's start by creating a NumPy array and changing its shape to see the differences in the strides.

  1. Create a NumPy array and take a look at the strides:

          In [1]: import numpy as np
          In [2]: x = np.arange(8, dtype=np.int8)
          In [3]: x
          Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int8)
          In [4]: x.strides
          Out[4]: (1,)
          In [5]: x.tobytes()
          Out[5]: b'\x00\x01\x02\x03\x04\x05\x06\x07'

    A one-dimensional array x is created and its...
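Continuing with the same array x, the following is a minimal sketch of how changing the shape changes the strides, and how the strides locate an element in the underlying bytes. The names y, i, j, offset, and raw are introduced here for illustration only.

```python
import numpy as np

x = np.arange(8, dtype=np.int8)

# Reshaping to 2 rows of 4 one-byte items: the next row is 4 bytes away,
# the next column is 1 byte away.
y = x.reshape(2, 4)
print(y.strides)    # (4, 1)

# Transposing swaps the strides instead of moving any data.
print(y.T.strides)  # (1, 4)

# The byte offset of y[i, j] is the index-weighted sum of the strides.
i, j = 1, 2
offset = i * y.strides[0] + j * y.strides[1]
raw = y.tobytes()
print(offset, raw[offset], y[i, j])  # 6 6 6
```

Because the items here are single bytes, the offset and the item position coincide; with a wider data type the strides would scale by the item size.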

Structured arrays


Structured arrays, or record arrays, are useful when you want to perform computations while keeping closely related data together. For example, when you process incident data in which each incident has geographic coordinates and an occurrence time, you can compute the final result and still easily find the associated location and time point for further visualization. NumPy lets you create such arrays of records, in which multiple data types live in one array. However, one NumPy principle still has to be honored: the data type within each field (think of a field as a column in the records) must be homogeneous. Here are some simple examples that show how it works:

In [20]: x = np.empty((2,), dtype='i4,f4,S10')
In [21]: x[:] = [(1, 0.5, 'NumPy'), (10, -0.5, 'Essential')]
In [22]: x
Out[22]: 
array([(1, 0.5, 'NumPy'), (10, -0.5, 'Essential')], 
      dtype=[('f0', '<...
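To make the idea of homogeneous fields concrete, here is a minimal sketch that builds a similar two-record array with named fields and reads the fields back as columns. The field names id, score, and label, and the use of a Unicode string field, are assumptions made for this example rather than the book's own code.

```python
import numpy as np

# Hypothetical field names chosen for illustration only.
dt = np.dtype([("id", "i4"), ("score", "f4"), ("label", "U10")])
rec = np.array([(1, 0.5, "NumPy"), (10, -0.5, "Essential")], dtype=dt)

# Each field behaves like a homogeneous column with its own data type.
print(rec["id"])            # [ 1 10]
print(rec["score"].dtype)   # float32

# Whole records stay together when you filter on one field.
positive = rec[rec["score"] > 0]
print(positive["label"])    # ['NumPy']

# One record occupies 4 + 4 + 40 bytes (10 Unicode characters at 4 bytes each).
print(rec.dtype.itemsize)   # 48
```

Naming the fields up front avoids relying on the default names f0, f1, and so on, which NumPy assigns when only a comma-separated format string is given.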

Summary


In this chapter, we covered the last important component of the ndarray object: strides. We saw a huge difference in memory layouts and also in performance when you use different ways to initialize your NumPy array. We also got to know the record array (structured array) and how to manipulate the date/time in NumPy. Most importantly, we saw how to read and write our data with NumPy.

NumPy is powerful not only because of its performance or ufuncs, but also because of how easy it can make your analysis. Use NumPy with your data as much as you can!

Next, we will look at linear algebra and matrix computation using NumPy.
