Learning Cython Programming (Second Edition)


Product type: Book
Published: Feb 2016
Publisher: Packt
ISBN-13: 9781783551675
Pages: 110
Edition: 2nd
Author: Philip Herron

Parsing large amounts of data


I want to show programmers how powerful natively compiled C types can be by comparing how long it takes to parse a large amount of XML. As test data for this experiment, we can use the geographic data published by the US Environmental Protection Agency (http://www.epa.gov/enviro/geospatial-data-download-service).

Let's look at the size of this XML data:

$ ls -liah
total 480184
7849156 drwxr-xr-x   5 redbrain  staff   170B 25 Jul 16:42 ./
5803438 drwxr-xr-x  11 redbrain  staff   374B 25 Jul 16:41 ../
7849208 -rw-r--r--@  1 redbrain  staff   222M  9 Mar 04:27 EPAXMLDownload.xml
7849030 -rw-r--r--@  1 redbrain  staff    12M 25 Jul 16:38 EPAXMLDownload.zip
7849174 -rw-r--r--   1 redbrain  staff    57B 25 Jul 16:42 README

It's huge! The download unpacks from a 12 MB archive into a 222 MB XML file. Before we write any programs, we need to understand a little about the structure of this data and decide what we want to do with it. It contains facility site locations with addresses. These records appear to make up the bulk of the file, so let's try to parse all of them...
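One way to get a first look at the structure of a file this size is to stream through it and count how often each element name occurs. The following is a minimal sketch in plain Python using the standard library's `xml.etree.ElementTree.iterparse`. It is not the book's own code. The `count_tags` helper and the tiny `SAMPLE` document are hypothetical and exist only for illustration, and the sketch makes no assumption about the real element names in the EPA file.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Stream through an XML document and count how often each tag occurs."""
    counts = Counter()
    # iterparse yields each element once its closing tag has been read,
    # so the whole document is never built up front.
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop any "{namespace}" prefix so the names are easier to read.
        counts[elem.tag.rsplit("}", 1)[-1]] += 1
        # Discard the element's text, attributes, and children to reduce memory use.
        elem.clear()
    return counts


# Hypothetical miniature document; the real file's tag names are not assumed.
SAMPLE = b"<sites><site><name>A</name></site><site><name>B</name></site></sites>"

if __name__ == "__main__":
    for tag, n in count_tags(io.BytesIO(SAMPLE)).most_common():
        print(tag, n)
```

To try this on the real download, pass the file name instead of the in-memory sample, for example `count_tags("EPAXMLDownload.xml")`. The most frequent tags give a rough picture of which records dominate the file.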
