Welcome to Getting Started with Memcached, a handy guide that helps you boost your application's performance with memcached easily.
In this section, we will be covering the basic steps to get your memcached server up and running, either for testing or for a real production environment.
Let's get started with a basic installation of memcached on Ubuntu Linux 12.04 LTS (long-term support) using apt-get. We picked this particular version of Ubuntu because it was the latest LTS release at the time of writing; however, the installation steps are the same for any other Ubuntu version. LTS releases are generally recommended for production servers because they receive a long period of maintenance and support from the folks at Canonical.
You will need an administrator account on the Ubuntu box you are setting up. If you are just performing tests, almost any machine will do the job, but if you are setting this up as a production environment, you will need a machine with a decent amount of free memory for the caching job.
Update your apt local repository by using:
sudo apt-get update
When asked for the password, just enter your account password to give permission to the application to run as root.
Use apt-get to install the memcached service:
sudo apt-get install memcached
Now, let's verify that the memcached service has been started:
ps aux | grep memcached
You are supposed to see something similar to the following:
memcache 830 0.0 0.1 323220 1188 ? Sl 17:33 0:00 /usr/bin/memcached -m 64 -p 11211 -u memcache -l 127.0.0.1
First, we pulled the latest package information from the apt repository online to make sure we are downloading the latest version of memcached to our local server. Then we simply used the apt-get command to download and auto-install the memcached package.
The installation script also starts the memcached daemon and marks this service to be auto-started on every boot of our Ubuntu box.
We validated that memcached was properly started by checking the running processes with the ps command and grep-ing for processes with the word memcached in them.
It's also important to note here that the default configuration of memcached limits the memcached daemon to listen only on the loopback device (localhost). This means that you can connect to your memcached daemons only from local processes running on the same computer.
Let's take a look at the configuration of the memcached daemon installed on our Ubuntu server.
Open the configuration file located at /etc/memcached.conf
and locate the line where you see something like the following:
-l 127.0.0.1
This tells the memcached daemon to listen only on localhost. Note that this is essentially the only security measure memcached offers, so if you change it, make sure the daemon is listening on a firewalled interface.
Change it to the following if you want the daemon to listen on all interfaces.
-l 0.0.0.0
Also locate the following line:
-m 64
This parameter sets the upper cap on how large the in-memory storage can grow. The default is 64 megabytes, which means you can store up to 64 MB worth of data on your memcached daemon; it does not mean that the daemon will allocate all of this memory at startup.
In most cases you will only need memcached on Windows for development or testing; it's quite unlikely to see memcached installed on a Windows production server.
Memcached is written in C, so it's portable, but it is not officially supported or recommended on Windows. However, there have been a few ports to Windows; a popular one can be found at http://code.jellycan.com/memcached/ (see memcached for win32). As advertised, it comes with no promises or support.
Installation on Mac OS X is quite a straightforward process if you have the right tools installed.
We will be using a package manager for Mac OS X that is really a must-have tool for any Mac user and even more important if you are a developer or a system engineer.
The package manager we will be using is Homebrew, "the missing package manager for OS X".
You will need to install Homebrew first if you don't have it, installation is straightforward and all instructions are explained in different languages for your comfort at http://brew.sh/.
Or you can simply use this one liner to install Homebrew on your Mac:
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Update the Homebrew local repository:
brew update
Use Homebrew to install the memcached package:
brew install memcached
Memcached is not started by default after installation. If you want to start memcached manually, use:
/usr/local/opt/memcached/bin/memcached
If you want memcached to start on every boot, you will need to create a link:
ln -sfv /usr/local/opt/memcached/*.plist ~/Library/LaunchAgents
Then, you may want to start memcached immediately using launchctl:
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.memcached.plist
We started by updating the local copy of the Homebrew repository from the Internet using brew update; this ensures we are installing the latest version of any package we want. Then we installed the latest version of memcached.
Homebrew does not start memcached automatically after installation nor during boot time, so we had to do this ourselves.
You can always start memcached manually in the foreground by running the memcached executable. But if you want memcached to start on boot, the symlink we created lets launchctl pick it up on boot.
If you have configured your memcached daemon to start on boot as previously described, you might be wondering, where is the default configuration? Is it in the same place as Ubuntu? The answer is No!
Because the memcached service is started by launchctl, the configuration is controlled by it. You will find the configuration file, which is basically an XML (plist) file, at /usr/local/opt/memcached/homebrew.mxcl.memcached.plist.
See the following section in the file:
<array>
  <string>/usr/local/opt/memcached/bin/memcached</string>
  <string>-l</string>
  <string>localhost</string>
</array>
As you may have discovered yourself, this represents a list of parameters passed to the memcached executable at runtime.
You can edit the localhost
field, as previously described in the Ubuntu configuration section, but if you want to configure the amount of memory that memcached can use, you will need to insert a couple of directives for that directly after the localhost directive:
<string>-m</string>
<string>256</string>
This configures memcached to use up to 256 MB of memory for its in-memory storage.
Another option, if you are not a Homebrew user, is MacPorts (http://www.macports.org/); it works almost the same way, and you use the port command instead of brew.
Another interesting feature of brew is that you can specify options to control the way memcached is built (compiled), so most of the time you don't really need to compile memcached from source on Mac OS X; instead, you use brew options. For example, you can enable SASL support, or SASL with password database support, as listed by brew info memcached:
--enable-sasl
    Enable SASL support -- disables ASCII protocol!
--enable-sasl-pwdb
    Enable SASL with memcached's own plain text password db support -- disables ASCII protocol!
So, for example, if you want to enable SASL support during installation, use the following:
brew install memcached --enable-sasl
In some cases, you might want to enable some of the memcached features that have to be baked in at compile time. In this recipe, we will learn how to compile memcached from source on Ubuntu.
We will install the requirements of the package by using apt-get
:
sudo apt-get install g++ make libevent-dev
This installed the C++ compiler, make, and the libevent library headers needed to compile memcached.
Let's download the latest version of memcached into your home directory:
curl -O --location http://memcached.org/latest
mv latest memcached-latest.tar.gz
tar vxzf memcached-latest.tar.gz
Next, let's configure and compile:
cd memcached-*
./configure && make
If the compilation process went well, we install the binaries:
sudo make install
Now, since we have memcached installed, let's see what kind of commands memcached daemon supports and how simple the memcached protocol is.
We will be using the plain telnet tool to connect to the memcached daemon.
Remember that memcached has no persistent storage whatsoever; it's totally memory-based, and once we terminate the daemon, everything we have stored is simply gone!
You will need the telnet client installed on your machine. In most cases it is already installed, but if not, you can install it on your Ubuntu box using:
sudo apt-get install telnet
You also must make sure that the memcached daemon is actually running.
Connect to the running memcached daemon with telnet on port 11211:
telnet localhost 11211
You should see something like the following:
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
So, let's play with some basic storage commands to understand the main concepts behind memcached. We believe this is the best way to understand the features of the service and to truly appreciate its design simplicity and power.
Memcached supports a plain ASCII (text) protocol; you can find the protocol definition and specifications at https://github.com/memcached/memcached/blob/master/doc/protocol.txt.
The first command is stats, which requests some basic information about the running service:
stats
STAT pid 8141
STAT uptime 1926
STAT time 1380294691
STAT version 1.4.13
STAT libevent 2.0.16-stable
STAT pointer_size 64
STAT rusage_user 0.108006
...
Now, let's use it to see the daemon's settings:
stats settings
STAT maxbytes 67108864
STAT maxconns 1024
STAT tcpport 11211
STAT udpport 11211
STAT inter 127.0.0.1
STAT verbosity 0
STAT oldest 849
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 4
STAT num_threads_per_udp 4
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
STAT maxconns_fast no
STAT hashpower_init 0
STAT slab_reassign no
STAT slab_automove no
END
Now, let's store some value for a given key:
set mykey 0 300 16
I Love Memcached
After you hit the return key, you will see the STORED message. So the whole listing is as follows:
set mykey 0 300 16
I Love Memcached
STORED
Now, let's read this key back using the get command:
get mykey
VALUE mykey 0 16
I Love Memcached
END
This recipe gives you a glimpse of the kind of commands you can send to your memcached daemon using a simple tool like telnet.
We started by connecting to the memcached daemon on the default port 11211 using telnet. Then we used the stats command, which asks the daemon to send us some useful statistics from the service, such as the uptime, how many get requests actually returned data (get_hits), and how many get requests resulted in a miss (get_misses).
Then, we used stats settings, which prints out the settings and configuration of the currently running daemon. You will see things such as tcpport, the port it is listening on, and maxbytes, the maximum number of bytes this cache server is allowed to store.
Then, we moved to the storage commands set and get. Storage commands have the following format:
<command name> <key> <flags> <exptime> <bytes>
The <command name> field can be set, add, replace, append, or prepend.
The <key> field is the name of the key you are storing; in our case that was mykey.
The <flags> field is an arbitrary 16-bit unsigned number that the server stores along with the key and returns when the client requests the value. It's opaque to the server, which attaches no special meaning to it, but the client can use this number to give the key some special meaning if needed. In our case, we just passed 0 for this field.
The <exptime> field indicates the expiration time. If it's 0, the item never expires (although it might still get deleted when the server needs to free up space for another key). If it's non-zero, it is guaranteed that clients will not be able to retrieve the item after the expiration time arrives (measured by the server's clock); values up to 30 days (2592000 seconds) are treated as an offset from the current time, and larger values as absolute Unix timestamps.
The <bytes> field indicates the length of the value to be stored; in our case that was 16, the length of the string I Love Memcached.
After hitting your return (Enter) key, you feed the server the value to be stored along with the key. Then, after hitting another return, you receive a STORED message indicating that the key-value pair has been stored.
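The framing described above can be sketched as a tiny Python helper (build_set_command is our own illustrative name, not part of any client library): a storage command is a header line, exactly <bytes> bytes of data, and a trailing CRLF.

```python
# A minimal sketch of the ASCII storage framing: the header line
# "set <key> <flags> <exptime> <bytes>\r\n" is followed by exactly
# <bytes> bytes of data and a closing "\r\n".

def build_set_command(key, value, flags=0, exptime=0):
    """Build the raw bytes a client would send for a `set` (no noreply)."""
    data = value.encode() if isinstance(value, str) else value
    header = "set %s %d %d %d\r\n" % (key, flags, exptime, len(data))
    return header.encode() + data + b"\r\n"

cmd = build_set_command("mykey", "I Love Memcached", flags=0, exptime=300)
print(cmd)  # -> b'set mykey 0 300 16\r\nI Love Memcached\r\n'
```

Computing <bytes> from the actual value, as real clients do, is what saves you from the off-by-one errors that are easy to make when typing the command by hand in telnet.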
Then, we moved to get, which is very simple: you pass a <key> field, and the value comes back along with the <flags> field and the length of the value; the value itself is printed before the END sentinel.
VALUE mykey 0 16
I Love Memcached
END
How do you make sure that the memcached daemon is started by default on boot?
On server installations, we need to ensure that memcached is automatically started on boot if it's not already.
Check whether memcached is already running:
/etc/init.d/memcached status
 * memcached is running
If you want to disable starting memcached on boot:
sudo update-rc.d memcached disable
If you want to re-enable memcached to start on boot:
sudo update-rc.d memcached enable
To ensure it's running in the default run levels:
sudo update-rc.d memcached defaults
We are using the update-rc.d script to create and delete symbolic links in /etc/rcX.d/, where X is the runlevel number.
Those symlinks are scanned on boot and they control whether the service is going to be started or not, based on the initial letter.
The output of sudo update-rc.d memcached enable looks like the following:
Enabling system startup links for /etc/init.d/memcached.
Removing any system startup links for /etc/init.d/memcached:
   /etc/rc0.d/K20memcached
   /etc/rc1.d/K20memcached
   /etc/rc2.d/K80memcached
   /etc/rc3.d/K80memcached
   /etc/rc4.d/K80memcached
   /etc/rc5.d/K80memcached
   /etc/rc6.d/K20memcached
Adding system startup for /etc/init.d/memcached:
   /etc/rc0.d/K20memcached -> ../init.d/memcached
   /etc/rc1.d/K20memcached -> ../init.d/memcached
   /etc/rc6.d/K20memcached -> ../init.d/memcached
   /etc/rc2.d/S20memcached -> ../init.d/memcached
   /etc/rc3.d/S20memcached -> ../init.d/memcached
   /etc/rc4.d/S20memcached -> ../init.d/memcached
   /etc/rc5.d/S20memcached -> ../init.d/memcached
You will see that the symlinks start with K or S, which indicates that in a certain runlevel the system should Kill or Start the service, respectively.
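The naming convention is easy to decode mechanically; here is a tiny illustrative Python helper (our own, not part of any system tool) that splits a link name into the action letter, the ordering sequence, and the service name:

```python
# Decode an /etc/rcX.d symlink name: the first letter says whether the
# service is Killed or Started in that runlevel, the two digits give the
# ordering sequence, and the rest is the service name.

def parse_rc_link(name):
    action = {"K": "kill", "S": "start"}[name[0]]
    return action, int(name[1:3]), name[3:]

print(parse_rc_link("S20memcached"))  # -> ('start', 20, 'memcached')
print(parse_rc_link("K80memcached"))  # -> ('kill', 80, 'memcached')
```

The two-digit sequence matters when services depend on each other: lower numbers are started earlier in the runlevel.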
One of the most common use cases for memcached is building a distributed cache over multiple machines in a cluster. This setup lets you scale memcached horizontally: by adding more machines to the cluster, you expand the total memory available to your application as a cache. The benefit of horizontally scalable caching is that you are no longer limited by the amount of RAM you can install in a single server. It also means you can utilize some of the free memory on, say, your web servers, and collectively you will have a distributed memcached environment with one large virtual memory pool for your caching needs.
Building a distributed memcached environment is far simpler than you might think. The memcached daemon is unaware of the cluster setup and needs no special server-side configuration; it is actually the client that distributes the data, not the server.
So, it all starts when a single server cannot hold your entire cache and you need to split the cache pool across several servers.
If you are running multiple instances of the memcached daemon on the same server, make sure you are running them on different ports.
memcached -p 3030
memcached -p 3031
The server installation goes as previously described; the cluster configuration lives in your clients, and you set it up by adding the list of servers to each of them.
It's important to note that, to keep the cluster sane, all of your clients must list the servers in the same order.
As an example, I'll be using Python's pylibmc library to communicate with the memcached cluster:
import pylibmc

mc = pylibmc.Client(["127.0.0.1:3030", "127.0.0.1:3031"], binary=True,
                    behaviors={"tcp_nodelay": True, "ketama": True})
mc["ahmed"] = "Hello World"
mc["tek"] = "Hello World"
What happens is that you specify a list of your servers to your client configuration and the client library uses consistent hashing to decide which server a certain key-value should go to.
The constructor of the client object here was fed with a couple of interesting parameters:
binary=True: This configures pylibmc to use the memcached binary protocol instead of the ASCII protocol.
behaviors={"tcp_nodelay": True, "ketama": True}: This sets the tcp_nodelay socket option, which disables Nagle's algorithm (http://en.wikipedia.org/wiki/Nagle%27s_algorithm) at the socket level, and "ketama": True, which means that pylibmc uses md5 hashing and consistent hashing for key distribution.
After creating the client object, we set two keys, ahmed and tek, with the value Hello World. Behind the scenes, each key-value pair is stored on a different daemon, according to the consistent hash of the key.
Sometimes you want your caching server to be persistent; there are several very good alternatives to memcached that can help you achieve that.
You can check out Redis at http://redis.io and Kyoto Tycoon at http://fallabs.com/kyototycoon/.
There are basically two memcached clients for PHP right now (memcache and memcached; note the d at the end of the latter). The memcache extension is older, lightweight, most commonly used, and easier to install. The memcached extension is feature-rich but not yet as widely adopted.
For the sake of simplicity, we will be using the memcache PHP extension in this recipe.
PHP is one of the most popular languages used for web development today; it's very likely that you use software written in PHP on a daily basis without knowing it.
I'm assuming you are using Ubuntu; you will need a basic setup of Apache 2 and PHP 5, which you can get with this command:
sudo apt-get install apache2 php5
First, we need to install the PHP memcache extension using apt-get
:
sudo apt-get install php5-memcache
This automatically installs the extension and gets everything wired and configured for you. If you want to use the memcached extension instead, all you need to do is replace php5-memcache with php5-memcached, and voila, everything just works!
If you are using Mac OS X, it's a slightly different story, and you will need to install apache2 and php5.
One of the quickest ways to do so is to install a nice package called MAMP (http://www.mamp.info/en/index.html); it will make life a lot easier for you. But if you are an advanced user and want to take the more manual route, you will find really detailed instructions for getting Apache, MySQL, and PHP ready on OS X at (http://jason.pureconcepts.net/2012/10/install-apache-php-mysql-mac-os-x/).
First, we are going to start with a connection test to the memcached daemon:
<?php
$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die ("Could not connect");
$version = $memcache->getVersion();
echo "Server's version: ".$version."<br/>\n";
?>
The output of this script actually depends on the current version of the memcached server running, in my case the output is:
Server's version: 1.4.13
Next, let's set and get some keys from the connected memcached server:
<?php
$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die ("Could not connect");

$sample_obj = new stdClass;
$sample_obj->str_attr = 'Memcache in PHP is cool';
$sample_obj->int_attr = 2468;

$memcache->set('sample_user', $sample_obj, false, 15) or die ("Failed to store data in memcached");
echo "Data stored in Memcached (will expire in 15 seconds)<br/>\n";

$get_result = $memcache->get('sample_user');
echo "Object from the cache:<br/>\n";
var_dump($get_result);
?>
First, we create the Memcache object; that's the object we will use to communicate with our memcached server in an object-oriented manner.
Then, we initialize the connection to the memcached server using the connect method, which takes the host, port, and connection timeout:
bool Memcache::connect ( string $host [, int $port [, int $timeout ]] )
The connection opened by connect is closed automatically at the end of the script's execution.
Then, we used the getVersion method to retrieve the version of the memcached server we are connected to; we use this method only to test our connection.
We then moved to the real work: we created an instance of PHP's stdClass and added two attributes to the object, to be serialized and stored in memcached under the key "sample_user". We set the timeout to 15 seconds, which means that the memcached server will delete the key after 15 seconds. We also passed false for the flags, since we don't need compression or any other setting at the moment.
Then, we retrieved the value back from memcached using the get method of the Memcache object and printed it on the screen. The output of the script would be as follows:
Data stored in Memcached (will expire in 15 seconds)
Object from the cache:
object(stdClass)#3 (2) {
  ["str_attr"]=>
  string(23) "Memcache in PHP is cool"
  ["int_attr"]=>
  int(2468)
}
If you are planning to connect to a cluster of memcached servers, you will need to add all the servers using the addServer method:
<?php
/* OO API */
$memcache = new Memcache;
$memcache->addServer('memcached_host1', 11211);
$memcache->addServer('memcached_host2', 11211);
?>
Then, start using your memcache instance as usual and the magic will happen.
If you are planning to connect to memcached server(s) from your Python application, there are several clients available for you. The most popular ones are:
python-memcached: This is a pure-Python implementation of the memcached client. It offers good performance and is extremely simple to install and use.
pylibmc: This is a Python wrapper around the libmemcached C/C++ library. It offers excellent performance, thread safety, and light memory usage, yet it's not as simple as python-memcached to install, since you need the libmemcached library compiled and installed on your system.
Twisted memcache: This client is part of Twisted, the event-driven networking engine for Python. It offers a reactive code structure and excellent performance as well, but it is not as simple to use as pylibmc or python-memcached; it fits perfectly if your entire application is built on Twisted.
In this recipe, we will be using python-memcached for the sake of simplicity; since the other clients have almost the same API, it does not make much difference from a developer's perspective.
It's always a good idea to create a virtualenv for your experiments, to keep them contained and avoid polluting the global system with the packages you install. You can create a virtualenv easily:
virtualenv memcache_experiments
source memcache_experiments/bin/activate
We will need to install python-memcached first, using the pip package manager:
sudo pip install python-memcached
Let's start with a simple set and get script:
import memcache

client = memcache.Client([('127.0.0.1', 11211)])
sample_obj = {"name": "Soliman", "lang": "Python"}
client.set("sample_user", sample_obj, time=15)
print "Stored to memcached, will auto-expire after 15 seconds"
print client.get("sample_user")
Save the script into a file called memcache_test1.py and run it using python memcache_test1.py. On running the script, you should see something like the following:
Stored to memcached, will auto-expire after 15 seconds
{'lang': 'Python', 'name': 'Soliman'}
Let's now try other memcached features:
import memcache

client = memcache.Client([('127.0.0.1', 11211)])
client.set("counter", "10")
client.incr("counter")
print "Counter was incremented on the server by 1, now it's %s" % client.get("counter")
client.incr("counter", 9)
print "Counter was incremented on the server by 9, now it's %s" % client.get("counter")
client.decr("counter")
print "Counter was decremented on the server by 1, now it's %s" % client.get("counter")
The output of the script looks like the following:
Counter was incremented on the server by 1, now it's 11
Counter was incremented on the server by 9, now it's 20
Counter was decremented on the server by 1, now it's 19
The incr and decr methods allow you to specify a delta value; by default they increment or decrement by 1.
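The server-side semantics behind these methods can be sketched in pure Python (MiniCounterStore is our own illustrative stand-in, not the real client or server): on the real server, counters are stored as decimal strings, decr never goes below zero, and incr wraps at 64 bits.

```python
# A sketch of memcached's incr/decr semantics: values live as decimal
# strings, the delta defaults to 1, decr floors at 0, and incr wraps
# at 2**64 (as the real server does).

class MiniCounterStore:
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = str(value)

    def incr(self, key, delta=1):
        self._data[key] = str((int(self._data[key]) + delta) % 2**64)
        return int(self._data[key])

    def decr(self, key, delta=1):
        self._data[key] = str(max(int(self._data[key]) - delta, 0))
        return int(self._data[key])

store = MiniCounterStore()
store.set("counter", "10")
print(store.incr("counter"))       # -> 11
print(store.incr("counter", 9))    # -> 20
print(store.decr("counter"))       # -> 19
print(store.decr("counter", 100))  # -> 0, never negative
```

The floor-at-zero behavior is worth remembering: a decrement past zero silently clamps rather than failing.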
Alright, now let's sync a Python dict to memcached with a certain key prefix:
import memcache

client = memcache.Client([('127.0.0.1', 11211)])
data = {"some_key1": "value1", "some_key2": "value2"}
client.set_multi(data, time=15, key_prefix="pfx_")
print "saved the dict with prefix pfx_"
print "getting one key: %s" % client.get("pfx_some_key1")
print "Getting all values: %s" % client.get_multi(["some_key1", "some_key2"], key_prefix="pfx_")
In the first script, we connect to the memcached server(s) using the Client constructor, and then we use the set method to store a standard Python dict as the value of the "sample_user" key. After that, we use the get method to retrieve the value.
Note
The client automatically serializes the Python dict when storing it and deserializes the object after getting it back from the memcached server.
In the second script, we play with some features we haven't tried before. The incr and decr methods allow you to increment and decrement integer values directly on the server, atomically.
Then, we use another awesome feature we haven't played with before: set_multi/get_multi, which lets us set or get multiple key-value pairs in a single request. It also allows us to add a certain prefix to all the keys during the set or get operations.
The output of the last script should look like the following:
saved the dict with prefix pfx_ getting one key: value1 Getting all values: {'some_key1': 'value1', 'some_key2': 'value2'}
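Roughly, key_prefix behaves like the following sketch (prefixed_set and prefixed_get are our own illustrative helpers, not the library's actual internals): the prefix is glued onto every key on the way to the server and the caller keeps working with the bare names.

```python
# Sketch of set_multi/get_multi key_prefix handling: keys are prefixed
# before hitting the store and results are keyed by the bare names.

def prefixed_set(store, mapping, key_prefix=""):
    for key, value in mapping.items():
        store[key_prefix + key] = value

def prefixed_get(store, keys, key_prefix=""):
    return {
        key: store[key_prefix + key]
        for key in keys
        if key_prefix + key in store
    }

store = {}  # a plain dict stands in for the memcached server
prefixed_set(store, {"some_key1": "value1", "some_key2": "value2"}, key_prefix="pfx_")
print(sorted(store))  # -> ['pfx_some_key1', 'pfx_some_key2']
print(prefixed_get(store, ["some_key1", "some_key2"], key_prefix="pfx_"))
```

Prefixing like this is a cheap way to namespace keys per application or per version while sharing one cache cluster.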
In the Client constructor, we specified each server's hostname and port as a (host, port) tuple and passed them in a list of servers. This allows you to connect to a cluster of memcached servers by adding more servers to the list. For example:
client = memcache.Client([('host1', 1121), ('host2', 1121), ('host3', 1122)])
You can also specify custom picklers/unpicklers to tell the memcached client how to serialize or deserialize Python types using your own algorithm.
Ruby is also one of the most popular languages used today by web developers to build brilliant applications. The rise of the Rails framework was one of the main reasons the language gained such popularity; however, Ruby is also a Swiss Army knife language that is often used by system administrators for orchestration and automation.
There are, of course, several memcached clients for Ruby, but here we will focus on one of the most recent and stable ones, which delivers a high-performance, pure-Ruby implementation of the memcached protocol: Dalli!
Dalli was written by the maintainer of memcache-client and is currently stable and being actively maintained.
The good thing about Dalli is that it can be integrated with Rails 3.x but, unfortunately, it does not integrate with the more popular Rails 2.x.
We need to install the Dalli gem using the following command:
gem install dalli
If you don't have the gem tool, then most likely you don't have Ruby properly installed. On Ubuntu, you can always use the following to get Ruby installed:
sudo apt-get install ruby
Let's start by doing a very basic set/get operation on the memcached server from Ruby:
require 'dalli'

dc = Dalli::Client.new('localhost:11211', :threadsafe => true, :compress => true)
dc.set('somekey', 123)
puts("the value from cache is: #{dc.get('somekey')}")
This looks great; now let's store more complex structures in the memcached server:
require 'dalli'

dc = Dalli::Client.new('localhost:11211', :threadsafe => true, :compress => true)
user = {:name => "Ahmed", :job => "Engineer"}
dc.set('user1', user, 20)
puts("user from cache: #{dc.get('user1')}")
Now, let's use a new feature which we did not use before (replace):
require 'dalli'

dc = Dalli::Client.new('localhost:11211', :threadsafe => true, :compress => true)
user = {:name => "Ahmed", :job => "Engineer"}
dc.set('user1', user, 20)
puts("user from cache: #{dc.get('user1')}")

user[:age] = 31
dc.replace("user1", user, 5)
puts("user from cache: #{dc.get('user1')}")
First, we are importing Dalli into our namespace by using the require
statement. Then, we are creating a client that connects to the memcached server and we are also setting some options. The following are some of the interesting options:
:compress => true: This asks Dalli to compress values larger than 1024 bytes.
:threadsafe => true: This ensures that only a single thread uses the connection socket at a time; it is actually enabled by default, and we added it to the snippet for clarity only.
:namespace => "app": This adds a prefix to all keys set on this connection.
:expires_in => 100: This sets the default TTL (timeout), in seconds, for all keys where you don't specify a TTL.
Then, we used the simple set method to store a basic integer value; afterwards we retrieved that value and printed it to the console using puts.
In the second snippet, we created a standard Ruby Hash and used the built-in serializer to store this hash as the value of the "user1" key.
In the third snippet, we introduced a new memcached feature, replace. It was used to replace the entire hash stored under the "user1" key with a modified version (we added an age to it).
While replacing the value, we also respecified the TTL, changing it to only 5 seconds. The replace operation fails if the key is not already stored on the memcached server; that's the main difference between it and the set method you are already familiar with.
The constructor of the Client class can also accept a list of servers, in case you have a memcached cluster:
Dalli::Client.new(['localhost:11211:10', 'cache-2.example.com:11211:5', '192.168.0.1:22122:5'], :threadsafe => true, :failover => true, :expires_in => 300)
For every server, the format is server:port:weight, where the weight allows you to distribute the cache unevenly; both weight and port are optional. If you pass in nil, Dalli will use the MEMCACHE_SERVERS environment variable, or default to localhost:11211 if it is not present.
In this recipe, we will be using Java to talk to our memcached server. Java is a very powerful programming language and is famous for enterprise-class applications.
There are, of course, a variety of memcached clients written for Java. We have chosen spymemcached, a powerful client that is also easy to use.
The spymemcached library has artifacts published on the Maven central repository. If you are using Maven, you will need to add the following dependency to your pom.xml file:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.sample</groupId>
  <artifactId>spycache</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>spycache</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>net.spy</groupId>
      <artifactId>spymemcached</artifactId>
      <version>2.10.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
In our project (named spycache), we will be showing snippets placed inside the src/main/java/com/sample/ directory. To build the application, run mvn package.
Then, after seeing BUILD SUCCESS, run the application by using the following:
java -cp target/spycache-1.0-SNAPSHOT.jar com.sample.App
Let's start by adding the following snippet into our main function:
// Requires imports: java.io.IOException, java.net.InetSocketAddress,
//                   net.spy.memcached.MemcachedClient
try {
    MemcachedClient client = new MemcachedClient(
            new InetSocketAddress("127.0.0.1", 11211));
    client.set("city", 20, "Istanbul");
    System.out.println((String) client.get("city"));
    client.shutdown();
} catch (IOException e) {
    e.printStackTrace();
}
We have just created a connection and did a simple set/get operation. Let's now store a more complex object:
class Employee implements Serializable {
    private static final long serialVersionUID = 2620538145665245947L;
    private String name;
    private int age;

    public Employee(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    @Override
    public String toString() {
        return "Employee(\"" + name + "\", " + age + ")";
    }
}
Then, let's store and retrieve an instance of this class:
Employee sample = new Employee("Ihab", 26);
client.set("engineer", 20, sample);
System.out.println((Employee) client.get("engineer"));
The output will be like the following:
2013-10-29 19:22:26.928 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=/127.0.0.1:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-10-29 19:22:26.935 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@680e62df
Employee("Ihab", 26)
2013-10-29 19:22:26.964 INFO net.spy.memcached.MemcachedConnection: Shut down memcached client
Basically, we are creating a MemcachedClient object that establishes our memcached connection; note that this object is quite smart and reconnects automatically on connection failure. By default, the client speaks the plain-text protocol; you can switch to the more efficient binary protocol by passing a BinaryConnectionFactory to the constructor. The constructor can also take a list of addresses to connect to multiple memcached servers (a cluster), in which case automatic data distribution is done for you.
The first method we used on the client object is set, which is actually asynchronous; it works in a fire-and-forget manner and does not block the current thread until the actual set operation happens. Then, we used get, which is synchronous (the opposite); it blocks until the data is retrieved and returns it as an Object, which is why we need a cast to get a reference of the correct type.
Java already has a very stable data serialization mechanism, and that is exactly what we used. Standard types are serializable by default, but what happens if you try to store your custom Employee object in memcached? You need to implement the Serializable interface for that; then, magically, everything works as presented earlier.
Spymemcached offers an asynchronous API for get as well; it is quite interesting actually, and if you are familiar with java.util.concurrent.Future, you will find it very easy to follow and understand.
The idea is that client.asyncGet
returns a Future<Object>
which you can use to retrieve the value asynchronously. You can use this future object to get the actual value later and set a timeout on your get request, as follows:
Future<Object> fobject = client.asyncGet("engineer");
try {
    fobject.get(10, TimeUnit.SECONDS);
} catch (InterruptedException e) {
    e.printStackTrace();
} catch (ExecutionException e) {
    e.printStackTrace();
} catch (TimeoutException e) {
    fobject.cancel(false);
}
What you can see here is that we tried to get the future's value, allowing 10 seconds for the attempt. If the timeout expires, we get a TimeoutException, and then we can safely cancel the operation (by calling cancel() on the future), which means that we are no longer interested in it.
You can also catch multiple exceptions to handle different types of potential failures: InterruptedException, which means that the waiting thread was interrupted, or ExecutionException, which means that an exception was thrown while executing the background job.
The Ruby on Rails web development framework is extremely popular for rapid development of web applications; at some point in time, it sparked an actual movement toward convention-over-configuration frameworks, and if you are a seasoned web developer, you have most likely used one of them recently.
We are assuming that you are familiar with Rails and already have some experience using the Rails caching API; even if you are not, this recipe is a good introduction to caching in Rails anyway.
We will be using Dalli as the memcached client and we will be configuring a simple Rails application to use it as a backend for Rails Caching. If you want more information about Rails caching in general, you are advised to visit http://guides.rubyonrails.org/caching_with_rails.html.
You will need to have a working Rails installation on your system for that, you can find great tutorials on the Web for that, such as this one https://www.digitalocean.com/community/articles/how-to-install-ruby-on-rails-on-ubuntu-12-04-lts-precise-pangolin-with-rvm.
Let's now start by creating a really simple Rails application as a mock, to be the test base for our caching experiments:
rails new cachesample
In a few minutes, you will have your empty Rails application ready; you can run it using:
cd cachesample/
rails server
You should see something like the following:
=> Booting WEBrick
=> Rails 4.0.0 application starting in development on http://0.0.0.0:3000
=> Run `rails server -h` for more startup options
=> Ctrl-C to shutdown server
[2013-10-29 16:40:56] INFO WEBrick 1.3.1
[2013-10-29 16:40:56] INFO ruby 2.0.0 (2013-06-27) [x86_64-linux]
[2013-10-29 16:40:56] INFO WEBrick::HTTPServer#start: pid=18450 port=3000
Now, we need to configure our Rails application to use the Dalli gem as a dependency.
Edit your Gemfile and add this line at the end of the file:
gem 'dalli'
Then, you will need to run:
bundle install
Now, we need to edit the application configuration to actually use memcached as the caching backend for the rails caching API.
Normally, you don't enable caching during development; you only turn it on in production. So, we will be editing the config/environments/production.rb file (you may also want to add this to config/environments/development.rb if you want to see caching in development mode). Let's add:
config.cache_store = :dalli_store
You will now be able to use all of Rails' automatic and manual caching features, and the default cache store will be Dalli (the memcached client).
Rails has different features for caching, such as the following:
Action caching: It caches an action's response based on the input parameters. Every request still goes through all the before filters, which means that authentication, for example, is still verified.
Fragment caching: Specific pieces of template code can be cached. This is very useful, especially if you are building a huge page that contains pieces that are dynamic and constantly changing; you can still modularize parts of the page and cache those parts as fragments. For example, suppose you want to cache a fragment that generates a list of products:
<% Order.find_recent.each do |o| %>
  <%= o.buyer.name %> bought <%= o.product.name %>
<% end %>

<% cache do %>
  All available products:
  <% Product.all.each do |p| %>
    <%= link_to p.name, product_url(p) %>
  <% end %>
<% end %>
SQL caching: This feature allows Rails to cache the results of a SQL query; if it encounters the same query again within the same request, it uses the cached result set instead of running the query against the database server again.
To point the cache store at a memcached cluster and tune Dalli's defaults, you can pass a server list and an options hash when configuring the cache store, for example in config/environments/production.rb:
config.cache_store = :dalli_store, 'cache-1.example.com', 'cache-2.example.com',
  { :namespace => NAME_OF_RAILS_APP, :expires_in => 1.day, :compress => true }
Of course, you will need to substitute the correct host names and ports of your caching cluster's memcached servers for cache-1.example.com and cache-2.example.com.
In the Python world, Django is the de facto standard choice as the most popular rich MVC/MVP framework around. It has a fantastic caching framework as well.
Django comes with a robust caching framework that lets you save dynamic pages so they don't have to be calculated for each request. Not only that, but also Django offers an abstract caching API that hides the specific implementation of the caching backend and offers a clean API to cache whatever you feel right, whenever you want to.
In this recipe, we are assuming you are a seasoned Django developer with some experience building Django applications. Our goal here is to configure Django to use memcached as a caching backend and to introduce you to some of the features of Django's caching framework.
You will need to have a simple Django application to play with, if you don't have one you can create an empty project with an empty application by using the following:
django-admin.py startproject djangocache
cd djangocache/
python manage.py startapp cachista
python manage.py runserver
This will create a project called djangocache
and a simple app (module) inside your project that we called cachista
.
If you don't have python-memcached installed already, you can simply use pip
for that:
pip install python-memcached
Let's start by editing the settings.py file in your Django project (djangocache/settings.py in our case). We will be using python-memcached for this recipe (you can use pylibmc too, if you like).
The caching configuration is controlled by the CACHES variable in the settings file. By default, you will not find this variable in your settings.py file, so we will need to add it; the BACKEND key in the 'default' dict indicates the memcached client that you are planning to use:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}
In this example, we used MemcachedCache, which uses the python-memcached library. If you are planning to use the faster pylibmc, you will need to replace this with django.core.cache.backends.memcached.PyLibMCCache.
The LOCATION key in the 'default' dict is where your memcached server is located; if you have a memcached cluster, you can change the value to a list, as follows:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': [
            'cache-1.example.com:11211',
            'cache-2.example.com:11211',
        ]
    }
}
It's very important to understand that if you are planning to use multiple Django servers as a cluster, all of those servers need to list the memcached servers in exactly the same order.
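Why does the order matter? Clients typically pick a server by hashing each key against the server list, so two application servers with differently ordered lists would route the same key to different memcached hosts. A rough sketch of the idea (simple modulo hashing; real backends use more elaborate schemes):

```python
import zlib

def server_for(key, servers):
    # Pick a server by hashing the key over the list; the chosen
    # host depends on the order of the list.
    return servers[zlib.crc32(key.encode()) % len(servers)]

a = ['cache-1.example.com:11211', 'cache-2.example.com:11211']
b = list(reversed(a))  # same servers, different order

host_a = server_for('user:42', a)
host_b = server_for('user:42', b)
# host_a and host_b differ: an instance configured with list b would
# miss (or stale-read) every key written by an instance using list a
```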
Now, let's tell Django to cache one of our views; it will automatically cache the view's response for us. You will first need to import the cache_page decorator:
from django.views.decorators.cache import cache_page
@cache_page(60 * 15, key_prefix="site1")
def my_view(request):
    """ my view code goes here """
Piece of cake! We told Django to cache this view for 15 minutes, using "site1" as the key prefix in the cache store.
Now, do you remember the 'default' key we wrote in our CACHES setting? That is how you set up multiple caching backends for Django! Yes, you can cache certain pages on certain caching backends; you can specify the caching backend in your cache_page decorator:
@cache_page(60 * 15, cache="memory_cache")
The "memory_cache"
value must correspond to a key in your CACHES
setting where you specify the caching backend settings. Fantastic!
As in Rails, you can specify fragments of your template to be cached.
{% load cache %}
{% cache 500 sidebar %}
  .. sidebar ..
{% endcache %}
Now, let's use the caching API to manually cache a value in our view code, as follows:
from django.core.cache import get_cache

cache = get_cache('default')
cache.set('key', 'Hello Memcached!', 15)
print cache.get('key')
This looks very similar to the direct memcached API, but it's not! It's an abstract API that can use multiple backends; memcached is one of them, as configured in the CACHES setting.
We started by defining the CACHES variable in the settings.py file, where we can define multiple cache regions with different backends. Django supports multiple cache backends: file-based, memory-based, and database-based, among others. In our case, we used the python-memcached backend and specified it for the 'default' cache region.
Of course, it's very popular to use memcached as a cluster and to specify the list of servers to your configuration.
You can also specify some interesting options along with the LOCATION and BACKEND keys, for example:
TIMEOUT: The default timeout, in seconds, to use for the cache. The default value is 300 seconds (5 minutes).
KEY_PREFIX: A string that will automatically be prefixed to all cache keys.
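Combining these options, a CACHES entry might look like the following (the specific timeout and prefix values are illustrative):

```python
# Illustrative settings.py fragment: TIMEOUT and KEY_PREFIX sit alongside
# BACKEND and LOCATION in the cache-region dict.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
        'TIMEOUT': 60,           # default expiry of 60 seconds
        'KEY_PREFIX': 'site1',   # prepended to every cache key
    }
}
```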
Then, we played with the cache_page decorator, which automatically caches a view for us; you can specify the key prefix or the cache region you are planning to use for that particular page.
Then, we saw template caching: you can cache pieces/fragments of your template code with the cache tag, specifying an identifier and an expiration for the cached fragment.
In our case we used the sidebar
identifier as stated in the following line:
{% cache 500 sidebar %}
The expiration is set to 500 seconds but, interestingly, you can specify more keywords for your identifier for the same fragment:
{% cache 500 sidebar welcome %}
Also, you can use the low-level caching API if you want more granular control over your caching and that was described in the last code snippet.
Play Framework is a modern Java/Scala framework that promises a lightweight, stateless, Web-friendly architecture.
It's built on Akka and it's very reliable for building highly-scalable applications with predictable resource consumption.
Play 2 is the next generation of the framework; it has gone through almost a complete rewrite, and it's now fully written in Scala but offers a good Java API. We will be focusing on Scala examples right here.
Play 2 uses Ehcache as its caching backend by default. You can always replace the backend by writing a plugin for Play 2; fortunately, someone has already done that with a plugin that uses the spymemcached Java client for memcached.
We are using Play 2.2.X for this recipe which uses sbt 0.13.X.
Let's start by creating a simple play project.
play new playcache
When prompted, select the option to create a simple Scala application.
Then, we need to configure our project's build.sbt to use play2-memcached as a dependency. Edit your build.sbt to look like the following:
name := "playcache"

version := "1.0-SNAPSHOT"

libraryDependencies ++= Seq(
  jdbc,
  anorm,
  cache,
  "com.github.mumoshu" %% "play2-memcached" % "0.3.0.2"
)

resolvers += "Spy Repository" at "http://files.couchbase.com/maven2"

play.Project.playScalaSettings
We need to configure our Play application to use memcached instead of the default Ehcache backend for Play's caching API. Start by adding the play2-memcached plugin to conf/play.plugins (create the file if it does not exist already):
5000:com.github.mumoshu.play2.memcached.MemcachedPlugin
Then, let's edit the conf/application.conf configuration file and add the following line near the end of the file to disable the ehcache plugin:
ehcacheplugin=disabled
Now, let's configure the memcached plugin to the memcached server:
memcached.host="127.0.0.1:11211"
After that, you are ready to use memcached, start the application server by running:
play run
Then, let's edit the controller at app/controllers/Application.scala to look like the following snippet:
package controllers

import play.api._
import play.api.mvc._
import play.api.cache.Cache
import play.api.Play.current

object Application extends Controller {
  def index = Action {
    Cache.getAs[String]("key") match {
      case Some(v) =>
        Ok(s"Got the value from cache: $v")
      case None =>
        Cache.set("key", "Fantastic Value", 50)
        Ok("Setting value in Cache")
    }
  }
}
From your browser, visit http://localhost:9000/ and see what happens. On your first request, you should see something like the following:
Setting value in Cache
Then, if you refresh the page, you will see the following line:
Got the value from cache: Fantastic Value
One more thing you can do is cache the entire action response by using the Cached object:
def index = Cached("homePage") {
  Action {
    Ok("Hello world")
  }
}
Congratulations, Play 2 is now connected and uses memcached as the caching backend.
First, we needed to edit the build script (build.sbt) to add play2-memcached as a dependency; we did that by appending "com.github.mumoshu" %% "play2-memcached" % "0.3.0.2" to the libraryDependencies setting key.
Then, we added a resolver to tell sbt where to get this plugin from. Next, we created/edited the conf/play.plugins file to register the plugin with Play, and we configured conf/application.conf to point to our memcached server.
In the controller code, we used the play.api.cache.Cache
object to get and set values from the cache.
The last thing is that you can use the Cached object to cache the entire action response under the named cache key "homePage".
If you are planning to use Play2 with a memcached cluster, you will need to configure the list of your servers in conf/application.conf
.
The configuration is really straightforward. Just replace memcached.host="127.0.0.1:11211" with the list of servers, as follows:
memcached.1.host="cache-1.example.com:11211"
memcached.2.host="cache-2.example.com:11211"
As mentioned several times before, it's important to keep this list in-sync for all your Play servers.