Chapter 1. Getting to Know SimpleDB
Most developers would describe a modern database as relational with stored procedures and cross-table functions such as join. So why would you use a database that has none of these capabilities? The answer is scalability.
This morning, CNN ran a story on your new web application. Yesterday you had 10 concurrent users, and now your site is viral with 50,000 users signing on. Which database will handle 50,000 concurrent users without a complex expensive cluster? The answer is SimpleDB.
Why SimpleDB?
Challenges?
SimpleDB is one of the core Amazon Web Services, which include Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2). Amazon SimpleDB stores your structured data as key-value pairs in the Amazon Web Services (AWS) cloud and lets you run real-time queries against this data. You can scale it easily in response to increased load from your successful applications without the need for a costly cluster database server complex.
SimpleDB, as illustrated in the following diagram, is designed to be used either as an independent data storage component in your applications or in conjunction with some of the other services from Amazon's stable of Cloud Services, such as Amazon S3 and Amazon EC2.
The biggest challenge in SimpleDB is learning to think in its unique metaphor. Like speaking a new language, you need to stop translating and start thinking in that language. Rather than thinking of SimpleDB as a database, approach it as a spreadsheet with some XML characteristics.
SimpleDB functionality can be accessed from almost any programming language (such as Python, Ruby, Java, PHP, Erlang, and Perl) using super simple HTTP-based requests. You can get started anytime you like, and you pay for it based on how much you use it. It is very different from a relational database, and takes a completely different approach toward storing and querying data. It follows the convention of eventual consistency. Think of it as a single master database for updates and a large collection of read database slaves. Any changes made to your data will need to be propagated across all the different copies. This can sometimes take a few seconds depending upon the system load at that time and network latency, which means that a consumer of your domain and data may not see the changes immediately. The changes will eventually be propagated throughout SimpleDB, but this is an important consideration you need to think about when designing your application.