Packt Publishing Community Experience, Distilled

Python Data Persistence using MySQL

HomeBooksSupportFreeAuthorsAward
WELCOME YOUR ACCOUNT NEWSLETTERS ARTICLES ABOUT US

 
Article Network FAQ

Want to know more about Packt's Article Network? Interested in contributing your article ideas?

Please visit our FAQ for more information.


See More

SEARCH

Search our Site


Automation with Python and STAF/STAX

The aim of this article by Javier Collado is to show how tasks may be automated using Python together with STAF (Software Testing Automation Framework) by means of an example. We will first see a problem and then derive its solution using classical Python-only as well as Python+STAF. The implementation of the solution will evolve in different stages. This will help us in comparing both the solution in terms of simplicity and efficiency.


See More
 
Python Data Persistence using MySQL

Among other merits, Python is an ideal language for writing server-side scripts, allowing you to integrate interactive behavior with HTML. Persisting dynamic content to an underlying database is fairly straightforward. By installing an appropriate Python DB module, you get the ability to interact with the database of choice from within Python code, moving your application data in and out of the underlying persistent store.

This article by Yuli Vasiliev will walk you through the process of building a simple Python application that interacts with a MySQL database. In a nutshell, the application picks up some live data from a web site and then persists it to an underlying MySQL database.

To keep things simple though, the article doesn’t discuss how to implement database-backed web pages with Python, concentrating only on how to connect Python with MySQL.

Sample Application

The best way to learn new programming techniques is to write an application that exercises them. This article will walk you through the process of building a simple Python application that interacts with a MySQL database. In a nutshell, the application picks up some live data from a web site and then persists it to an underlying MySQL database. For the sake of simplicity, it doesn’t deal with a large dataset. Rather, it picks up a small subset of data, storing it as a few rows in the underlying database.

In particular, the application gets the latest post from the Packt Book Feed page available at http://feeds.feedburner.com/packtpub/sDsa?format=xml. Then, it analyzes the post’s title, finding appropriate tags for the article associated with the post, and finally inserts information about the post into the posts and posttags underlying database tables. As you might guess, a single post may be associated with more than one tag, meaning a record in the posts table may be related to several records in the posttags table.

Diagrammatically, the sample application components and their interactions might look like this:

Note the use of appsample.py. This script file will contain all the application code written in Python. In particular, it will contain the list of tags, as well as several Python functions packaging application logic.

Software Components

To build the sample discussed in the article you’re going to need the following software components installed on your computer:

  • Python 2.5.x
  • MySQLdb 1.2.x
  • MySQL 5.1

All these software components can be downloaded and used for free. Although you may already have these pieces of software installed on your computer, here’s a brief overview of where you can obtain them.

You can download an appropriate Python release from the Downloads page at Python’s web site at http://python.org/download/. You may be tempted to download the most recent release. Before you choose the release, however, it is recommended that you visit the Python for MySQL page at http://sourceforge.net/projects/mysql-python/ to check what Python releases are supported by the current MySQLdb module that will be used to connect your Python installation with MySQL.

MySQLdb is the Python DB API-2.0 interface for MySQL. You can pick up the latest MySQLdb package (version 1.2.2 at the time of writing) from the sourceforge.net’s Python for MySQL page at http://sourceforge.net/projects/mysql-python/. Before you can install it, though, make sure you have Python installed in your system.

You can obtain the MySQL 5.1 distribution from the mysql.com web site at http://dev.mysql.com/downloads/mysql/5.1.html, picking up the package designed for your operating system.

Setting up the Database

Assuming you have all the software components that were outlined in the preceding section installed in your system, you can now start building the sample application. The first step is to create the posts and posttags tables in your underlying MySQL database. As mentioned earlier, a single post may be associated with more than one tag. What this means in practice is that the posts and posttags tables should have a foreign key relationship. In particular, you might create these tables as follows:

CREATE TABLE posts (
title VARCHAR(256) PRIMARY KEY,
guid VARCHAR(1000),
pubDate VARCHAR(50)
)
ENGINE = InnoDB;

CREATE TABLE posttags (
title VARCHAR(256),
tag VARCHAR(20),
PRIMARY KEY(title,tag),
FOREIGN KEY(title) REFERENCES posts(title)
)
ENGINE = InnoDB;

As you might guess, you don’t need to populate above tables with data now. This will be automatically done later when you launch the sample.

Developing the Script

Now that you have the underlying database ready, you can move on and develop the Python code to complete the sample. In particular, you’re going to need to write the following components in Python:

  • tags nested list of tags that will be used to describe the posts obtained from the Packt Book Feed page.
  • obtainPost function that will be used to obtain the information about the latest post from the Packt Book Feed page.
  • determineTags function that will determine appropriate tags to be applied to the latest post obtained from the Packt Book Feed page.
  • insertPost function that will insert the information about the post obtained into the underlying database tables: posts and posttags.
  • execPr function that will make calls to the other, described above functions. You will call this function to launch the application.

All the above components will reside in a single file, say, appsample.py that you can create in your favorite text editor, such as vi or Notepad.

First, add the following import declarations to appsample.py:

import MySQLdb
import urllib2
import xml.dom.minidom

As you might guess, the first module is required to connect Python with MySQL, providing the Python DB API-2.0 interface for MySQL. The other two are needed to obtain and then parse the Packt Book Feed page’s data. You will see them in action in the obtainPost function in a moment.

But first let’s create a nested list of tags that will be used by the determineTags function that determines the tags appropriate for the post being analyzed. To save space here, the following list contains just a few tags. You may and should include more tags to this list, of course.

tags=["Python","Java","Drupal","MySQL","Oracle","Open Source"]

The next step is to add the obtainPost function responsible for getting the data from the Packt Book Feed page and generating the post dictionary that will be utilized in further processing:

def obtainPost():
addr = "http://feeds.feedburner.com/packtpub/sDsa?format=xml"
xmldoc = xml.dom.minidom.parseString(urllib2.urlopen(addr).read())
item = xmldoc.getElementsByTagName("item")[0]
title = item.getElementsByTagName("title")[0].firstChild.data
guid = item.getElementsByTagName("guid")[0].firstChild.data
pubDate = item.getElementsByTagName("pubDate")[0].firstChild.data
post ={"title": title, "guid": guid, "pubDate": pubDate}
return post

Now that you have obtained all the required information about the latest post on the Packt Book Feed page, you can analyze the post’s title to determine appropriate tags. For that, add the determineTags function to appsample.py:

def determineTags(title, tagslist):
curtags=[]
for curtag in tagslist:
if title.find(curtag)>-1:curtags.append(curtag)
return curtags

By now, you have both the post and tags to be persisted to the database. So, add the insertPost function that will handle this task (don’t forget to change the parameters specified to the MySQLdb.connect function for the actual ones):

def insertPost(title, guid, pubDate, curtags):
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
c.execute("""INSERT INTO posts (title, guid, pubDate) VALUES(%s,%s,%s)""", (title, guid, pubDate))
db.commit()
for tag in curtags:
c.execute("""INSERT INTO posttags (title, tag) VALUES(%s,%s)""", (title, tag))
db.commit()
db.close()

All that is left to do is add the execPr function that brings all the pieces together, calling the above functions in the proper order:

def execPr():
p = obtainPost()
t = determineTags(p["title"],tags)
insertPost(p["title"], p["guid"], p["pubDate"], t)

Now let’s test the code we just wrote. The simplest way to do this is through Python’s interactive command line. To start an interactive Python session, you can type python at your system shell prompt.

It’s important to realize that since the sample discussed here is going to obtain some data from the web, you must connect to the Internet before you launch the application. Once you’re connected, you can launch the execPr function in your Python session, as follows:

>>>import appsample
>>>appsample.execPr()

If everything is okay, you should see no messages. To make sure that everything really went as planned, you can check the posts and posttags tables. To do this, you might connect to the database with the MySQL command-line tool and then issue the following SQL commands:

SELECT * FROM posts;

The above should generate the output that might look like this:

|title                                  |guid                        
|pubDate
------------------------------------------------------------------
Open Source CMS Award Voting Now Closed | http://www.packtpub.com/
article/2008-award-voting-closed | Tue, 21 Oct 2008 09:29:54 +0100

Then, you might want to check out the posttags table:

SELECT * FROM posttags;

This might generate the following output:

|title                                  |tag                        
Open Source CMS Award Voting Now Closed | Open Source

Please note that you may see different results since you are working with live data. Another thing to note here is that if you want to re-run the sample, you first need to empty the posts and posttags tables. Otherwise, you will encounter the problem related to the primary key constraints. However, that won’t be a problem at all if you re-run the sample in a few days, when a new post or posts appear on the Packt Book Feed page.

Conclusion

In this article you looked at a simple Python application persisting data to an underlying MySQL database. Although, for the sake of simplicity, the sample discussed here doesn’t offer a web interface, it illustrates how you can obtain data from the Internet, and then utilize it within your application, and finally store that data in the database.







Expert Python Programming
 
Expert Python Programming Best practices for designing, coding, and distributing your Python software
  • Learn Python development best practices from an expert, with detailed coverage of naming and coding conventions
  • Apply object-oriented principles, design patterns, and advanced syntax tricks
  • Manage your code with distributed version control
  • Profile and optimize your code
  • Proactive test-driven development and continuous integration
  • <
http://www.PacktPub.com/expert-python-programming/book


About the Author

Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open-source development, Oracle technologies, and service-oriented architecture (SOA). He has over 10 years of software development experience as well as several years of technical writing experience. He wrote a series of technical articles for Oracle Technology Network (OTN) and Oracle Magazine.


Books from Packt

Expert Python Programming
Expert Python Programming

CherryPy Essentials: Rapid Python Web Application Development
CherryPy Essentials: Rapid Python Web Application Development

Professional Plone Development
Professional Plone Development

Learning Website Development with Django
Learning Website Development with Django

OpenCms 7 Development
OpenCms 7 Development

Zenoss Core Network and System Monitoring
Zenoss Core Network and System Monitoring

Building Websites with Joomla! 1.5
Building Websites with Joomla! 1.5

Building Powerful and Robust Websites with Drupal 6
Building Powerful and Robust Websites with Drupal 6



 
Python LDAP Applications: Part 1 - Installing and Configuring the Python-LDAP Library and Binding to an LDAP Directory

This article mini-series by Matt Butcher will look at the Python application programmers interface (API) for the LDAP libraries, and using this API, we will connect to our OpenLDAP server and manipulate the directory information tree. More specifically, we will cover the following in this article series:

  • Installing and configuring the Python-LDAP library.
  • Binding to an LDAP directory.
  • Comparing attributes between the client and server.
  • Performing searches on the directory.
  • Modifying the directory information tree with add, delete, and modify operations.
  • Modifying directory passwords.
  • Working with LDAP schemas.

This first part will deal with installation and configuration of the Python-LDAP library. We will then see how the binding operation is performed.


See More

Article Network


Packt Article Network

Visit Packt's Article Network, for all the latest quality, relevant and free content.
See More


NEWSLETTER

Sign up for updates, offers, free downloads and you could win an iPod Shuffle.
Subscription center

Writing a Package in Python

This article by Tarek Ziadé focuses on a repeatable process to write and release Python packages. We will focus on how to install, uninstall, develop, test, register, and upload a package.


See More
 




© Packt Publishing Ltd 2010

RSS