Python Data Persistence using MySQL

Exclusive offer: get 50% off this eBook here
Expert Python Programming

Expert Python Programming — Save 50%

Best practices for designing, coding, and distributing your Python software

$26.99    $13.50
by Yuli Vasiliev | December 2008 | Open Source

Among other merits, Python is an ideal language for writing server-side scripts, allowing you to integrate interactive behavior with HTML. Persisting dynamic content to an underlying database is fairly straightforward. By installing an appropriate Python DB module, you get the ability to interact with the database of choice from within Python code, moving your application data in and out of the underlying persistent store.

This article by Yuli Vasiliev will walk you through the process of building a simple Python application that interacts with a MySQL database. In a nutshell, the application picks up some live data from a web site and then persists it to an underlying MySQL database.

 

To keep things simple though, the article doesn’t discuss how to implement database-backed web pages with Python, concentrating only on how to connect Python with MySQL.

Sample Application

The best way to learn new programming techniques is to write an application that exercises them. This article will walk you through the process of building a simple Python application that interacts with a MySQL database. In a nutshell, the application picks up some live data from a web site and then persists it to an underlying MySQL database. For the sake of simplicity, it doesn’t deal with a large dataset. Rather, it picks up a small subset of data, storing it as a few rows in the underlying database.

In particular, the application gets the latest post from the Packt Book Feed page available at http://feeds.feedburner.com/packtpub/sDsa?format=xml. Then, it analyzes the post’s title, finding appropriate tags for the article associated with the post, and finally inserts information about the post into the posts and posttags underlying database tables. As you might guess, a single post may be associated with more than one tag, meaning a record in the posts table may be related to several records in the posttags table.

Diagrammatically, the sample application components and their interactions might look like this:

Python Data Persistence using MySQL

Note the use of appsample.py. This script file will contain all the application code written in Python. In particular, it will contain the list of tags, as well as several Python functions packaging application logic.

Software Components

To build the sample discussed in the article you’re going to need the following software components installed on your computer:

  • Python 2.5.x
  • MySQLdb 1.2.x
  • MySQL 5.1

All these software components can be downloaded and used for free. Although you may already have these pieces of software installed on your computer, here’s a brief overview of where you can obtain them.

You can download an appropriate Python release from the Downloads page at Python’s web site at http://python.org/download/. You may be tempted to download the most recent release. Before you choose the release, however, it is recommended that you visit the Python for MySQL page at http://sourceforge.net/projects/mysql-python/ to check what Python releases are supported by the current MySQLdb module that will be used to connect your Python installation with MySQL.

MySQLdb is the Python DB API-2.0 interface for MySQL. You can pick up the latest MySQLdb package (version 1.2.2 at the time of writing) from the sourceforge.net’s Python for MySQL page at http://sourceforge.net/projects/mysql-python/. Before you can install it, though, make sure you have Python installed in your system.

You can obtain the MySQL 5.1 distribution from the mysql.com web site at http://dev.mysql.com/downloads/mysql/5.1.html, picking up the package designed for your operating system.

Setting up the Database

Assuming you have all the software components that were outlined in the preceding section installed in your system, you can now start building the sample application. The first step is to create the posts and posttags tables in your underlying MySQL database. As mentioned earlier, a single post may be associated with more than one tag. What this means in practice is that the posts and posttags tables should have a foreign key relationship. In particular, you might create these tables as follows:

CREATE TABLE posts (
title VARCHAR(256) PRIMARY KEY,
guid VARCHAR(1000),
pubDate VARCHAR(50)
)
ENGINE = InnoDB;

CREATE TABLE posttags (
title VARCHAR(256),
tag VARCHAR(20),
PRIMARY KEY(title,tag),
FOREIGN KEY(title) REFERENCES posts(title)
)
ENGINE = InnoDB;

As you might guess, you don’t need to populate above tables with data now. This will be automatically done later when you launch the sample.

Developing the Script

Now that you have the underlying database ready, you can move on and develop the Python code to complete the sample. In particular, you’re going to need to write the following components in Python:

  • tags nested list of tags that will be used to describe the posts obtained from the Packt Book Feed page.
  • obtainPost function that will be used to obtain the information about the latest post from the Packt Book Feed page.
  • determineTags function that will determine appropriate tags to be applied to the latest post obtained from the Packt Book Feed page.
  • insertPost function that will insert the information about the post obtained into the underlying database tables: posts and posttags.
  • execPr function that will make calls to the other, described above functions. You will call this function to launch the application.

All the above components will reside in a single file, say, appsample.py that you can create in your favorite text editor, such as vi or Notepad.

First, add the following import declarations to appsample.py:

import MySQLdb
import urllib2
import xml.dom.minidom

As you might guess, the first module is required to connect Python with MySQL, providing the Python DB API-2.0 interface for MySQL. The other two are needed to obtain and then parse the Packt Book Feed page’s data. You will see them in action in the obtainPost function in a moment.

But first let’s create a nested list of tags that will be used by the determineTags function that determines the tags appropriate for the post being analyzed. To save space here, the following list contains just a few tags. You may and should include more tags to this list, of course.

tags=["Python","Java","Drupal","MySQL","Oracle","Open Source"]

The next step is to add the obtainPost function responsible for getting the data from the Packt Book Feed page and generating the post dictionary that will be utilized in further processing:

def obtainPost():
addr = "http://feeds.feedburner.com/packtpub/sDsa?format=xml"
xmldoc = xml.dom.minidom.parseString(urllib2.urlopen(addr).read())
item = xmldoc.getElementsByTagName("item")[0]
title = item.getElementsByTagName("title")[0].firstChild.data
guid = item.getElementsByTagName("guid")[0].firstChild.data
pubDate = item.getElementsByTagName("pubDate")[0].firstChild.data
post ={"title": title, "guid": guid, "pubDate": pubDate}
return post

Now that you have obtained all the required information about the latest post on the Packt Book Feed page, you can analyze the post’s title to determine appropriate tags. For that, add the determineTags function to appsample.py:

def determineTags(title, tagslist):
curtags=[]
for curtag in tagslist:
if title.find(curtag)>-1:curtags.append(curtag)
return curtags

By now, you have both the post and tags to be persisted to the database. So, add the insertPost function that will handle this task (don’t forget to change the parameters specified to the MySQLdb.connect function for the actual ones):

def insertPost(title, guid, pubDate, curtags):
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
c.execute("""INSERT INTO posts (title, guid, pubDate) VALUES(%s,
%s,%s)""", (title, guid, pubDate))
db.commit()
for tag in curtags:
c.execute("""INSERT INTO posttags (title, tag) VALUES(%s,%s)""", (title, tag))
db.commit()
db.close()

All that is left to do is add the execPr function that brings all the pieces together, calling the above functions in the proper order:

def execPr():
p = obtainPost()
t = determineTags(p["title"],tags)
insertPost(p["title"], p["guid"], p["pubDate"], t)

Now let’s test the code we just wrote. The simplest way to do this is through Python’s interactive command line. To start an interactive Python session, you can type python at your system shell prompt.

It’s important to realize that since the sample discussed here is going to obtain some data from the web, you must connect to the Internet before you launch the application. Once you’re connected, you can launch the execPr function in your Python session, as follows:

>>>import appsample
>>>appsample.execPr()

If everything is okay, you should see no messages. To make sure that everything really went as planned, you can check the posts and posttags tables. To do this, you might connect to the database with the MySQL command-line tool and then issue the following SQL commands:

SELECT * FROM posts;

The above should generate the output that might look like this:

|title                                  |guid                        
|pubDate
------------------------------------------------------------------
Open Source CMS Award Voting Now Closed | http://www.packtpub.com/
article/2008-award-voting-closed | Tue, 21 Oct 2008 09:29:54 +0100

Then, you might want to check out the posttags table:

SELECT * FROM posttags;

This might generate the following output:

|title                                  |tag                        
Open Source CMS Award Voting Now Closed | Open Source

Please note that you may see different results since you are working with live data. Another thing to note here is that if you want to re-run the sample, you first need to empty the posts and posttags tables. Otherwise, you will encounter the problem related to the primary key constraints. However, that won’t be a problem at all if you re-run the sample in a few days, when a new post or posts appear on the Packt Book Feed page.

Conclusion

In this article you looked at a simple Python application persisting data to an underlying MySQL database. Although, for the sake of simplicity, the sample discussed here doesn’t offer a web interface, it illustrates how you can obtain data from the Internet, and then utilize it within your application, and finally store that data in the database.

Expert Python Programming Best practices for designing, coding, and distributing your Python software
Published: September 2008
eBook Price: $26.99
Book Price: $44.99
See more
Select your format and quantity:

About the Author :


Yuli Vasiliev

Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open-source development, Oracle technologies, and service-oriented architecture (SOA). He has over 10 years of software development experience as well as several years of technical writing experience. He wrote a series of technical articles for Oracle Technology Network (OTN) and Oracle Magazine.

Contact Yuli Vasiliev

Books From Packt

Expert Python Programming
Expert Python Programming

CherryPy Essentials: Rapid Python Web Application Development
CherryPy Essentials: Rapid Python Web Application Development

Professional Plone Development
Professional Plone Development

Learning Website Development with Django
Learning Website Development with Django

OpenCms 7 Development
OpenCms 7 Development

Zenoss Core Network and System Monitoring
Zenoss Core Network and System Monitoring

Building Websites with Joomla! 1.5
Building Websites with Joomla! 1.5

Building Powerful and Robust Websites with Drupal 6
Building Powerful and Robust Websites with Drupal 6

No votes yet

Post new comment

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
c
K
f
d
4
c
Enter the code without spaces and pay attention to upper/lower case.
Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software