You're reading from  Learning NumPy Array

Product type: Book
Published in: Jun 2014
Reading Level: Intermediate
ISBN-13: 9781783983902
Edition: 1st Edition
Author (1)
Ivan Idris

Ivan Idris has an MSc in experimental physics. His graduation thesis had a strong emphasis on applied computer science. After graduating, he worked for several companies as a Java developer, data warehouse developer, and QA analyst. His main professional interests are business intelligence, big data, and cloud computing. He enjoys writing clean, testable code and interesting technical articles. He is the author of NumPy 1.5 Beginner's Guide and NumPy Cookbook, both published by Packt Publishing.

Describing data with pandas DataFrames


Luckily, pandas has descriptive statistics utilities. We will read the average wind speed, temperature, and pressure values from the KNMI De Bilt data file into a pandas DataFrame. This object is similar to the R data frame, which in turn resembles a data table in a spreadsheet or a database: the columns are labeled, the data can be indexed, and you can run computations on the data. We will then print out descriptive statistics and a correlation matrix as shown in the following steps:

  1. Read the CSV file with the pandas read_csv function. This function works in a similar fashion to the NumPy loadtxt function:

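    import sys
    from datetime import datetime as dt
    import numpy as np
    import pandas as pd

    # Scale each value by 0.1; blank fields become NaN
    to_float = lambda x: .1 * float(x.strip() or np.nan)
    to_date = lambda x: dt.strptime(x, "%Y%m%d")
    cols = [4, 11, 25]
    conv_dict = dict((col, to_float) for col in cols)

    # Column 1 holds the date; it is parsed and used as the index
    conv_dict[1] = to_date
    cols.append(1)

    headers = ['dates', 'avg_ws', 'avg_temp', 'avg_pres']
    df = pd.read_csv(sys.argv[1], usecols=cols, names=headers, index_col=[0], converters=conv_dict)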
  2. Print...
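The workflow described in this section (per-column converters in read_csv, followed by descriptive statistics and a correlation matrix) can be tried without the KNMI file. The following is a minimal sketch, not the book's listing: the four rows are invented sample data, the converters are keyed by column name rather than by position, and describe and corr are the standard pandas DataFrame methods assumed here for the summary statistics and the correlation matrix.

```python
import io
from datetime import datetime

import numpy as np
import pandas as pd

# Invented sample rows: date, wind speed, temperature, pressure.
# The blank wind speed field in the third row stands for a missing value.
raw = io.StringIO(
    "20130101,30,52,10150\n"
    "20130102,45,61,10098\n"
    "20130103,  ,70,10120\n"
    "20130104,25,48,10201\n"
)


def to_float(field):
    """Scale a field by 0.1; a blank field becomes NaN."""
    return 0.1 * float(field.strip() or np.nan)


def to_date(field):
    """Parse a YYYYMMDD string into a datetime."""
    return datetime.strptime(field, "%Y%m%d")


headers = ["dates", "avg_ws", "avg_temp", "avg_pres"]

# One converter per column, keyed by the column names passed to read_csv.
converters = {name: to_float for name in headers[1:]}
converters["dates"] = to_date

df = pd.read_csv(raw, names=headers, index_col="dates", converters=converters)

# Count, mean, standard deviation, min, quartiles, and max per column.
print(df.describe())

# Pairwise correlation between the columns; NaN entries are excluded.
print(df.corr())
```

Because the missing wind speed is stored as NaN, describe reports a count of 3 for avg_ws and 4 for the other two columns.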
