Running Your First Apache Thrift Service and Client

In this article by Krzysztof Rakowski, the author of Learning Apache Thrift, you'll first create the necessary project files. After a brief configuration, you will be able to run the service and connect to it with a client by yourself. We will be using a server written in PHP and a client written in Python. The code is very simple, so you can adapt it to any other programming language if you wish.

At the end of this article, we will discuss the code and what exactly it means. Don't worry if you don't understand everything at the beginning. The goal of this article is to give you a running client and service that you can easily manipulate and change.

Creating necessary project files

Let's make a fresh start by creating a new directory on your disk. In this directory, we will keep all the files related to the current article's mini project.

Creating local copy of the Apache Thrift libraries

To make things simpler, we will make a local copy of the Apache Thrift libraries.

Copy the Apache Thrift source archive (thrift-0.9.2.tar.gz in this example) to your newly created directory and decompress it:

$ tar -xvzf thrift-0.9.2.tar.gz

Note for Windows users

In the examples, we will use Unix-style commands as this is the most popular platform for Apache Thrift. Use Windows equivalents when needed.

To decompress .tar.gz archives under Windows, I recommend suitable open source (free) software, for example, 7-Zip, which you can download from http://www.7-zip.org/.

Now we have the full Apache Thrift distribution in the thrift-0.9.2 directory (the name may differ depending on the version; substitute it in all the commands). We will be using the PHP and Python libraries that are in the thrift-0.9.2/lib directory.

The Python library needs to be built. Enter its directory and run the setup command. This library will not be installed system-wide; it will just be built in place:

$ cd thrift-0.9.2/lib/py
$ python setup.py build
$ cd ../../..

Defining our first service and generating files

Now we're back in our main directory and ready for the real work. Here is the description of the service, MyFirstService, that we will be working on. Note that Apache Thrift's interface description language (IDL) only describes the interface without providing any information about what the methods will do. It is the service developer's responsibility to make sure that the names of the services and methods are consistent with what they actually do.

You don't have to type all the code by yourself. You can download it from Packt's website and use it in your project.

Let's look at the interface description:

// namespaces are used in resulting packages
namespace php myfirst
namespace py myfirst

// you can define names for your types.
// Usable primarily for integers.
typedef i32 int

// simple exception
exception MyError {
    1: int error_code,
    2: string error_description
}

// here starts the service definition
service MyFirstService {

    // log method - it saves timestamp to given file
    oneway void log(1:string filename),

    // multiply method - returns result of multiplication of two integers; it returns integer as well
    int multiply(1:int number1, 2:int number2),

    // get_log_size method -  returns size of log file; throws exception when problems occur
    int get_log_size(1:string filename) throws (1:MyError error),

}

As you can see, in MyFirstService, we described three methods:

  • oneway void log(1:string filename): This will save some value (timestamp in our example) to the given file
  • int multiply(1:int number1, 2:int number2): This will multiply two integers and return the result
  • int get_log_size(1:string filename) throws (1:MyError error): This will return the size of given log file and in case of any trouble, it will throw an exception

We will get to the details later. Now save this code in the myfirst.thrift file. As the next step, use the Apache Thrift compiler to generate the PHP and Python files:

$ thrift --gen py --gen php:server myfirst.thrift

This command will generate lots of PHP files in the gen-php directory and Python files in the gen-py directory. You can browse them and admire how much work Apache Thrift does for you. Of main interest to us are the gen-php/myfirst/MyFirstService.php and gen-py/myfirst/MyFirstService.py files (the myfirst subdirectory comes from the namespace declared in the IDL file), which contain, among other things, the definition of the interface that we have to implement ourselves in PHP, and the client classes.
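To get a feeling for what "an interface that we have to implement" means, here is a minimal hand-written sketch in Python 3. It is not the code that the Thrift compiler generates, only an approximation of the contract it expresses; the class names MyFirstServiceIface and MultiplyOnly are made up for this illustration.

```python
from abc import ABC, abstractmethod


class MyFirstServiceIface(ABC):
    """Illustrative stand-in for the generated interface (not real generated code)."""

    @abstractmethod
    def log(self, filename):
        """Save a timestamp to the given file (oneway, nothing is returned)."""

    @abstractmethod
    def multiply(self, number1, number2):
        """Return the product of two integers."""

    @abstractmethod
    def get_log_size(self, filename):
        """Return the size of the given log file."""


class MultiplyOnly(MyFirstServiceIface):
    """Hypothetical handler that forgets two of the three methods."""

    def multiply(self, number1, number2):
        return number1 * number2


try:
    MultiplyOnly()
    incomplete_rejected = False
except TypeError:
    # abc refuses to instantiate a class with unimplemented abstract methods
    incomplete_rejected = True
```

The PHP handler shown later in this article plays the same role: it has to provide all three methods described in the IDL file.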

The service code in PHP

At this point, we have everything that Apache Thrift can offer automatically. The next step is to prepare the code for our service. In a real-world scenario, this would probably be part of a larger application. For now, as a demonstration, we will use an extremely basic setup whose main purpose is to be portable and easy to run.

Note that in order to be simple and comprehensible, the code in this article may lack features that need to be taken care of in professional, real-world applications. For example, there is not much validation or input sanitization, errors are not always properly handled, and the application may not perform well under heavy load.

If you plan to use this or similar code in your applications, please apply general knowledge of application security and performance that is specific to your programming language.

Let's have a look at our server's code:

#!/usr/bin/env php
<?php

error_reporting(E_ERROR);
date_default_timezone_set('UTC');

define('THRIFT_PHP_LIB', __DIR__.'/thrift-0.9.2/lib/php/lib');
define('GEN_PHP_DIR', __DIR__.'/gen-php');

require_once THRIFT_PHP_LIB.'/Thrift/ClassLoader/ThriftClassLoader.php';

use Thrift\ClassLoader\ThriftClassLoader;

$loader = new ThriftClassLoader();
$loader->registerNamespace('Thrift', THRIFT_PHP_LIB);
$loader->registerDefinition('myfirst', GEN_PHP_DIR);
$loader->register();


use Thrift\Protocol\TBinaryProtocol;
use Thrift\Transport\TPhpStream;
use Thrift\Transport\TBufferedTransport;

class MyFirstHandler implements \myfirst\MyFirstServiceIf {

    public function log($filename) {
        // 'i' is the minutes format character in date(); 'm' would print the month
        $time = date('Y-m-d H:i:s');
        file_put_contents(__DIR__."/".$filename, $time."\n", FILE_APPEND);
        error_log("Written " . $time . " to " . $filename);
    }

    public function multiply($number1, $number2) {
        error_log("multiply " . $number1 . " by " . $number2);
        return $number1 * $number2;
    }


    public function get_log_size($filename) {
        $filesize = filesize(__DIR__."/".$filename);
        if ($filesize === false)
            {
                $e = new \myfirst\MyError();
                $e->error_code = 1;
                $e->error_description = "Can't get size information for file " . $filename;
                error_log($e->error_description);
                throw $e;
            }
        error_log("size of log file " . $filename . " is " . $filesize . "B");
        return $filesize;
    } 

};

header('Content-Type: application/x-thrift');
echo "\r\n";


$handler = new MyFirstHandler();
$processor = new \myfirst\MyFirstServiceProcessor($handler);

$transport = new TBufferedTransport(new TPhpStream(TPhpStream::MODE_R | TPhpStream::MODE_W));
$protocol = new TBinaryProtocol($transport, true, true);

$transport->open();
$processor->process($protocol, $protocol);
$transport->close();

Don't be put off by the length of this file. We will discuss it in detail later. Basic knowledge of any programming language will let you easily see that here we are implementing the three methods (log, multiply, and get_log_size) that were briefly described in the IDL file. There is some PHP code that performs the basic operations promised by the methods' names.

Save this code to the MyFirstServer.php file. Make sure that the path to the Apache Thrift library in line 7 is correct:

define('THRIFT_PHP_LIB', __DIR__.'/thrift-0.9.2/lib/php/lib');

If you want the timestamps in the log file to reflect your time zone (helpful for debugging), set the proper identifier in line 5:

date_default_timezone_set('UTC');

You may check the list of available time zone identifiers at http://php.net/manual/en/timezones.php.

You can run this code via a regular web server (for example, Apache HTTP Server or nginx) if you have one set up. If not, there is a simpler solution in a few lines of Python that is suggested in Apache Thrift's code library:

#!/usr/bin/env python

import os
import BaseHTTPServer
import CGIHTTPServer

class Handler(CGIHTTPServer.CGIHTTPRequestHandler):
    cgi_directories  = ['/']

print "Starting server on port 8080..."

BaseHTTPServer.HTTPServer(('', 8080), Handler).serve_forever()

This code uses Python's native capabilities to serve files from the local directory via HTTP on port 8080. They will be run as CGI scripts, that is, the PHP code will be interpreted by the PHP interpreter, and that's exactly what we need. Like the client script later in this article, it is written in Python 2 syntax (note the print statement and the BaseHTTPServer module), so run it with a Python 2 interpreter. Save this code to the runserver.py file.

Note that you shouldn't use this method for running PHP scripts in a production environment. It's not reliable and doesn't provide adequate performance. Also, it may be vulnerable in terms of security. It's intended only as a helper for developers.

Instead, you should run your PHP scripts in one of the ways recommended and extensively explained in the PHP documentation at http://php.net/manual/en/install.php.

The client code in Python

Now that we have the service code, it's time to write some code for the client. This small Python script will connect to the service exposed by PHP and run some methods from MyFirstService. Let's see how simple that is:

#!/usr/bin/env python

import sys, glob
sys.path.append('gen-py')
sys.path.insert(0, glob.glob('thrift-0.9.2/lib/py/build/lib.*')[0])

from myfirst import MyFirstService

from thrift import Thrift
from thrift.transport import THttpClient
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from random import randint

try:

    socket = THttpClient.THttpClient('localhost', 8080, '/MyFirstServer.php')
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = MyFirstService.Client(protocol)
    transport.open()

    # calling log method
    client.log("logfile.log")
    print 'logged current time to logfile (not waiting for response)'

    # calling multiply method with random parameters
    number1 = randint(1,100)
    number2 = randint(1,100)
    product = client.multiply(number1,number2)
    print '%dx%d=%d' % (number1, number2, product)

    # calling get_log_size method
    print "current size of logfile is: %d Bytes" % client.get_log_size("logfile.log")

    # calling get_log_size method second time, but this time with wrong parameter
    print "current size of logfile is: %d Bytes" % client.get_log_size("no_such_file.log")


    transport.close()

except Thrift.TException, e:
    print 'Received following error:\n  error code: %d\n  error desc: %s' % (e.error_code, e.error_description)

 

After a brief analysis of the code, you can see that we are using an instance of the MyFirstService.Client class, which implements the same methods that we defined in our PHP service. Running remote code is as easy as calling methods of this instance.

Note that we send random values to the multiply method to illustrate that every request is different. Also, in the second call to the get_log_size method, we provide the name of a file that doesn't exist, so we have an opportunity to see how errors are handled.

Save this code to the client.py file.
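The sys.path line near the top of the client takes the first match of glob.glob(...), which raises a bare IndexError if the Python library was never built. As a sketch (in Python 3 syntax, with find_thrift_build being a helper name invented here), a more talkative lookup could look like this:

```python
import glob
import os


def find_thrift_build(thrift_dir="thrift-0.9.2"):
    """Return the first build/lib.* directory of the local Thrift Python library.

    find_thrift_build is a hypothetical helper; the pattern mirrors the one
    used in client.py.
    """
    pattern = os.path.join(thrift_dir, "lib", "py", "build", "lib.*")
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise RuntimeError(
            "No built Thrift library matches %r; "
            "run 'python setup.py build' in %s first"
            % (pattern, os.path.join(thrift_dir, "lib", "py"))
        )
    return matches[0]
```

In the client, sys.path.insert(0, find_thrift_build()) could then take the place of the glob line.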

Running the code

Now it is time to run the scripts and see the outcome. It is best to use two separate terminal windows so that you can observe the result of the operation on both the client and the server side.

To start your PHP service, you need to run it through the Python wrapper that you wrote. To do this, run the following command:

$ python runserver.py
Starting server on port 8080...

If you see a message like this and no other error, it means that your PHP service is listening on port 8080 on your computer. In this window, you will see information about all the incoming connections and their results.

Now, let's try to call our service with the Python client script. To do this, run the following command:

$ python client.py

If everything went as expected, you will see some output. It will look similar to this:

logged current time to logfile (not waiting for response)
70x7=490
current size of logfile is: 760 Bytes
Received following error:
  error code: 1
  error desc: Can't get size information for file no_such_file.log

In the first terminal (the one where you ran your server), you will see messages similar to this:

127.0.0.1 - - [04/Aug/2015 23:05:59] "POST /MyFirstServer.php HTTP/1.0" 200 -
Written 2015-08-04 23:05:59 to logfile.log
127.0.0.1 - - [04/Aug/2015 23:05:59] "POST /MyFirstServer.php HTTP/1.0" 200 -
multiply 22 by 52
127.0.0.1 - - [04/Aug/2015 23:05:59] "POST /MyFirstServer.php HTTP/1.0" 200 -
size of log file logfile.log is 20B
127.0.0.1 - - [04/Aug/2015 23:05:59] "POST /MyFirstServer.php HTTP/1.0" 200 -
Can't get size information for file no_such_file.log

It is a regular access log with extra information delivered by the PHP script.

You can run the client script multiple times to see different results. You will also notice that logfile.log was created on disk in our scripts' directory, and that the current time is appended to it every time you run the client script.

What really happened?

Let's talk a little about what really happened here.

At the beginning, you ran the runserver.py script. It doesn't play a huge role here; it is just a little helper Python script. Its purpose is to run continuously, listen on port 8080 of your computer, and serve files from the current directory over HTTP without all the hassle of setting up a regular web server. It runs them as CGI scripts, so our PHP file is executed by the PHP interpreter installed on your system.

When we have the server running, we can connect to it. To do this, you ran the client.py script. This piece of code benefits from the capabilities of Apache Thrift. Using autogenerated libraries, it allows you to call remote commands from your Python script. In this case, you asked to run the following methods:

  • log("logfile.log")
  • multiply(number1, number2), where both numbers were randomly chosen from the range of 1 to 100
  • get_log_size("logfile.log")
  • get_log_size("no_such_file.log")

Those requests were sent to the http://localhost:8080/MyFirstServer.php address and passed by our helper script to the MyFirstServer.php script, where they were executed. Running the log method resulted in appending the current time to logfile.log (you can see it for yourself); multiply yielded the product of the two numbers that you provided; and get_log_size returned the size of the log file on the first call, while on the second call (where a nonexistent filename was given), it threw an exception, which was transferred from the PHP server to the Python client and handled there.

In general, what you have seen here is procedures being called remotely. They were implemented in the PHP script, but they were invoked from an independent Python script, which also processed the results. In this simple example, everything occurred on one machine, but it could work equally well between computers spread all over the world.

The purpose of Apache Thrift in this example is to provide the communication framework and automatically generate libraries so that developers don't have to care about serialization and transfer methods (and lots of other stuff) and can focus on implementing the service and the client.

Analyzing the code

Before we go further, let's have a quick look at the most important parts of the code that we used in this article and explain them.

The service description – IDL

The most important part of the service description begins with the following line:

service MyFirstService {

In this block, there are descriptions of methods exposed by this service. Let's have a look at the first two:

oneway void log(1:string filename),
int multiply(1:int number1, 2:int number2),

The syntax bears a strong resemblance to method definitions in C++ (and other popular languages). The oneway keyword means that the client only makes the request and won't wait for the result of the log method. This method doesn't provide any return value, which is marked by the void keyword (note that all oneway methods should return void). You may also notice the numbered argument notation that is typical for Apache Thrift.
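The difference between a oneway call and a regular call can be illustrated with a small conceptual model in Python 3. This is not how Apache Thrift implements its transports; the queue-based "wire" and the names inbox, call, and call_oneway are assumptions made purely for the illustration.

```python
import queue
import threading

logged = []  # stands in for the log file


def log(filename):
    logged.append(filename)


def multiply(number1, number2):
    return number1 * number2


HANDLERS = {"log": log, "multiply": multiply}
inbox = queue.Queue()  # hypothetical "wire" between client and server


def serve():
    """Process requests until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        method, args, reply_box = item
        result = HANDLERS[method](*args)
        if reply_box is not None:  # regular call: send the result back
            reply_box.put(result)


def call(method, *args):
    """Regular call: send the request, then block until the reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((method, args, reply_box))
    return reply_box.get()


def call_oneway(method, *args):
    """Oneway call: send the request and return immediately, with no reply."""
    inbox.put((method, args, None))


worker = threading.Thread(target=serve)
worker.start()
call_oneway("log", "logfile.log")   # returns at once, nothing to wait for
product = call("multiply", 70, 7)   # waits for the server's answer
inbox.put(None)
worker.join()
```

Because call_oneway never reads a reply, it has nowhere to deliver a return value, which is one way to see why such methods are declared void.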

The multiply method, on the other hand, returns int, which is the name we defined for the less intuitively named i32 type, a 32-bit signed integer.
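To make "32-bit signed integer" more tangible, here is a short Python 3 sketch of the value range and of a four-byte, big-endian encoding of the kind the binary protocol uses for i32 values. The helper names encode_i32 and fits_in_i32 are invented for this example and are not part of any Thrift library.

```python
import struct

I32_MIN = -2**31
I32_MAX = 2**31 - 1


def encode_i32(value):
    """Pack a value as a 4-byte big-endian signed integer."""
    return struct.pack("!i", value)


def fits_in_i32(value):
    """Tell whether a Python integer is representable as an i32."""
    return I32_MIN <= value <= I32_MAX


encoded = encode_i32(70 * 7)  # b'\x00\x00\x01\xea'

try:
    encode_i32(I32_MAX + 1)
    overflow_rejected = False
except struct.error:
    # struct refuses values outside the signed 32-bit range
    overflow_rejected = True
```

Our client only multiplies numbers between 1 and 100, so the largest possible product (10000) is far from the limit, but a service that accepts arbitrary input would have to consider it.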

The get_log_size method also returns int, but has one distinct feature:

int get_log_size(1:string filename) throws (1:MyError error),

It throws MyError, which means that in case of some failure, it will throw an exception that we defined earlier in the file.

The server script – PHP

Our service is implemented in PHP. Lots of Apache Thrift-specific code surrounds the most important part, which is the implementation of our service's interface:

class MyFirstHandler implements \myfirst\MyFirstServiceIf {

The MyFirstHandler class has the same methods as the ones described in our IDL file. It is essential that the method names, the parameter lists, and the parameter names are the same; otherwise, you won't be able to call them.

Implementations of the methods are very simplistic and lack lots of essential features (such as error handling or input validation), but serve their purpose of providing some output to the client. Extra output is also sent through the error_log function to the terminal window in which you run runserver.py.

The code preceding and following the MyFirstHandler class prepares the environment for running the script and defines how the data will be transferred.
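Since the handler logic is independent of the Thrift plumbing around it, it is easy to port. The following Python 3 sketch mirrors the three PHP methods; MyError here is a hand-written stand-in for the class that Thrift would generate, and the base_dir parameter is an assumption that replaces PHP's __DIR__.

```python
import os
import tempfile
import time


class MyError(Exception):
    """Stand-in for the exception type generated from the IDL file."""

    def __init__(self, error_code, error_description):
        super().__init__(error_description)
        self.error_code = error_code
        self.error_description = error_description


class MyFirstHandler:
    """Plain-Python version of the three service methods."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def log(self, filename):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
        path = os.path.join(self.base_dir, filename)
        # newline="\n" keeps the line ending one byte long on every platform
        with open(path, "a", newline="\n") as f:
            f.write(stamp + "\n")

    def multiply(self, number1, number2):
        return number1 * number2

    def get_log_size(self, filename):
        try:
            return os.path.getsize(os.path.join(self.base_dir, filename))
        except OSError:
            raise MyError(1, "Can't get size information for file " + filename)


# Quick local check, without any network involved
with tempfile.TemporaryDirectory() as tmp:
    handler = MyFirstHandler(tmp)
    handler.log("logfile.log")
    size = handler.get_log_size("logfile.log")  # 19 characters + newline
    try:
        handler.get_log_size("no_such_file.log")
        error_code = None
    except MyError as e:
        error_code = e.error_code
```

One log entry takes 20 bytes, which matches the "20B" reported in the server output shown earlier.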

The client script – Python

You may have noticed that in the client script, the transfer details are the same as in the server script. It's the same transport (TBufferedTransport) and protocol (TBinaryProtocol). The script will connect to port 8080 on localhost.

The most important part of this script is the client object. It has the same methods as those provided by the server, so you can just run them without giving a second thought to how the data is transferred between the client and the server. The return values are native to Python, and exceptions are handled in just the same way as in any other application in this language.
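In the client script, transport.close() sits inside the try block, so it is skipped when an exception is raised. A common Python idiom for this is try/except/finally. The sketch below (Python 3 syntax) uses FakeTransport and FakeClient, two made-up stand-ins, so that it runs without a server; with the real classes, only the function fetch_log_size (also a name invented here) would be needed.

```python
class MyError(Exception):
    """Stand-in for the exception type generated from the IDL file."""

    def __init__(self, error_code, error_description):
        super().__init__(error_description)
        self.error_code = error_code
        self.error_description = error_description


class FakeTransport:
    """Hypothetical transport that only remembers whether it is open."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def close(self):
        self.is_open = False


class FakeClient:
    """Hypothetical client that knows about exactly one log file."""

    def get_log_size(self, filename):
        if filename != "logfile.log":
            raise MyError(1, "Can't get size information for file " + filename)
        return 20


def fetch_log_size(client, transport, filename):
    """Call get_log_size and close the transport whatever happens."""
    transport.open()
    try:
        return client.get_log_size(filename)
    except MyError as e:  # "except ... as e" also works in Python 2.6 and later
        print("error code: %d, error desc: %s" % (e.error_code, e.error_description))
        return None
    finally:
        transport.close()
```

Catching the specific MyError type instead of the generic exception also guarantees that the error_code and error_description attributes are really there.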

Activity for you

As an extra task, I encourage you to play with the service and the client. Try to change the code of the methods in the PHP file. Issue different calls from the Python client file. Add extra methods to your service and handle them in the client (remember to add them to the service definition in the myfirst.thrift file as well and to regenerate the needed libraries using the thrift command). You may even want to add some extra error handling or throw an exception or two.

Summary

In this article, you implemented your first service in PHP and a client script in Python. You ran the client, called the service's methods remotely, and examined the output. I hope that you experimented a little bit with your scripts.

After reading this article, you are now able to prepare the development environment, implement services, and make use of them through the client application.
