With Cloud being the new buzzword, a correct definition is hard to pin down, and it is easy to get lost trying to choose among the Cloud offers that [put your favorite stuff here] vendors advertise on the Internet. Even talking about actual Cloud implementations is difficult, since they exist at multiple levels.
The concept of the Cloud is the ability to get resources on demand, with no limit other than the related cost, and without delays or human intervention. Amazon Web Services is one of the most popular Cloud services; with a properly configured account, anyone can get an EC2 (Elastic Compute Cloud) instance running to host applications or use an S3 (Simple Storage Service) bucket to store files.
Cloud services are commonly organized into three categories, namely Infrastructure as a Service (IaaS), Software as a Service (SaaS), and Platform as a Service (PaaS).
Amazon EC2 is a typical IaaS. Using simple API calls, users can lease servers to deploy applications, storage, or network routers. It only gives you the hardware, which you then have to manage to get your whole technical stack up and running. You have to select (or build) a virtual machine image (such as an AMI) with your preferred operating system, configure network and routing, attach disks for persistent data, and so on. It is like going to your favorite hardware dealer to buy PC components and build your own computer. The main benefit is that you only pay for what you actually use, so you can change your mind and get a bigger or smaller server, or just drop everything at any time.
IaaS was a required, but low-level, step in the Cloud revolution. The flexibility it gives you is huge compared to bare-metal hardware, even with existing rental options. You can get dozens of servers available in a few clicks, at a ridiculously low cost that only relates to the duration for which you actually use them.
The main drawback is that you only get the hardware. Operating system setup, low-level configuration, middleware installation, security, monitoring, and maintenance are your responsibilities. This makes sense if you have some very specialized software that you want to run, but for common, standard technical stacks it doesn't make much sense. If you need your own patched version of the Linux kernel, IaaS is for you. If you want to run a Java application under the latest version of Tomcat, you will end up spending hours of engineering time just to set up and maintain the basic runtime that your developers are expecting.
Another well-known actor in the Cloud ecosystem is Google Mail. Such software doesn't require installation; you access it with a standard browser over HTTPS on the Internet. You can create a new Gmail address using a fully automated subscription process. Such services are called Software as a Service (SaaS) since they provide a fully running product, with some options to customize it, but focused on a specific use case. You can customize Gmail's style for your company and set some default filters for all the users, but you can't convert Gmail into a CMS: it's a mailbox service, period.
If the project you're working on matches an existing SaaS offer, don't look any further; just use it. The time you save can be invested in lots of useful things to make your business successful. If your business is successful and you really hit a technical limit, you will be able to switch to a custom solution, but don't try to implement your own general-purpose service if you don't have highly specialized requirements. Gmail users would never consider writing their own mailing system.
The drawback of SaaS is that you have limited options to customize the software. Such services typically expose an API so that you can programmatically interact with them, integrate with third-party tools, and extend them to your own needs, but you can't change the general purpose of the service.
Platform as a Service (PaaS) is a crossover between IaaS and SaaS. This is a fuzzy definition, but it describes well both the existing actors in this industry and the possible confusions. PaaS is commonly presented using a pyramid. Depending on what the graphic tries to demonstrate, the pyramid can be drawn upside down, as shown in the following diagram:
The pyramid on the left-hand side shows XaaS platforms based on the target users' profiles. It demonstrates that IaaS is the basis for all Cloud services. It provides the required flexibility for PaaS to support applications that are exposed as SaaS to the end users. Some SaaS actually don't use a PaaS and directly rely on IaaS, but that doesn't really matter here.
The pyramid on the right-hand side represents the providers, and the three levels suggest the number of providers in each category. IaaS only makes sense for highly concentrated, large-scale providers. PaaS can have more actors, probably focused on some ecosystem, but the need is to have a neutral and standard platform that is actually attractive for developers. SaaS is about all the possible applications running in the Cloud. The top-level shape should then be far larger than what the graphic shows.
The previous definition only gives you a faint idea: PaaS is more than IaaS and less than SaaS. What is still missing is a definition of what the platform itself is about.
A platform is a standardization of the runtime that a developer expects in order to do their job. This depends on the software ecosystem you're considering. For a Java EE developer, a platform means having at least a servlet container, a managed DataSource to access the database, and a few comparable resources wrapped as standard Java EE APIs. A Play! framework developer will consider this overweight and only ask for a JVM with WebSocket support. A PHP developer will expect a Linux/Apache/MySQL/PHP (LAMP) stack, similar to the one they have been using for years with a traditional server hosting service.
So, depending on the development ecosystem you're considering, platforms don't have the exact same meaning, but they all share a common principle. A platform is the common denominator for a software language ecosystem, where the application is all that a specific developer will write or choose on their own. Java EE developers will ask for a container, and Ruby developers will ask for an RVM environment. What they run on top is their own choice.
With this definition, you understand that a platform is about the standardization of runtime for a software ecosystem. Maybe some of you have patched OpenJDK to enable some magic features in the JVM (really?), but most of us just use the standard Oracle Java distribution. Such standardization makes it possible to share resources and engineering skills on a large scale, to reduce cost, and to provide a reliable runtime.
Another consideration for a platform is clustering. The Cloud is based on slicing resources into small virtual elements and letting the users select as many as they need. In most cases, this requires the application to support a clustering mode, as using more resources will require you to scale out on multiple hosts.
Clustering has never been a trivial thing, and many developers aren't familiar with the related constraints. The platform can help them by providing specialized services to distribute the load around the cluster's nodes. Some PaaS such as CloudBees or Google App Engine provide such features, while some don't. This is the major difference between PaaS offers. Some are IaaS-like preinstalled middleware services, while some offer a highly integrated platform.
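As a minimal sketch of what distributing the load around the cluster's nodes can mean, the following Java class hands out incoming requests to nodes in round-robin order. The class name, method name, and node names are hypothetical illustrations and not part of any CloudBees or Google App Engine API; a real platform would also take node health and capacity into account.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: not a real PaaS API.
class RoundRobinBalancer {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = List.copyOf(nodes);
    }

    // Returns the node that should serve the next request,
    // cycling through the node list in order.
    String nextNode() {
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```

With three nodes, successive calls to nextNode() return the first, second, and third node, then start again from the first. This simple strategy only works as is when any node can serve any request, which is exactly where state management becomes an issue.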
A typical issue is state management. Java EE developers rely on HttpSession to store user data and retrieve it on subsequent interactions. Modern frameworks tend to be stateless, but the state needs to be managed anyway. A PaaS has to provide options to developers, so that they can choose the best strategy to match their own business requirements. This is a typical clustering issue that is well addressed by PaaS because the technical solutions (sticky session, session replication, distributed storage engines, and so on) have been implemented once, with all the required skills to do it right, and can be used by all platform users.
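To make the sticky session strategy more concrete, here is a minimal sketch in plain Java. It does not use the servlet API; the class names, node names, and session IDs are hypothetical, and each node's session memory is modeled as a simple map. The idea shown is only that requests carrying the same session ID always reach the same node, so state kept in that node's memory is found again.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of sticky routing: the same session ID
// always maps to the same node while the node list is unchanged.
class StickySessionRouter {
    private final List<String> nodes;

    StickySessionRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = List.copyOf(nodes);
    }

    String nodeFor(String sessionId) {
        return nodes.get(Math.floorMod(sessionId.hashCode(), nodes.size()));
    }
}

// Models a cluster in which every node keeps its own sessions in memory.
class ClusterSketch {
    private final StickySessionRouter router;
    private final Map<String, Map<String, String>> storeByNode = new HashMap<>();

    ClusterSketch(List<String> nodes) {
        router = new StickySessionRouter(nodes);
        for (String node : nodes) {
            storeByNode.put(node, new HashMap<>());
        }
    }

    // Stores the session data on the node the router selects.
    void put(String sessionId, String data) {
        storeByNode.get(router.nodeFor(sessionId)).put(sessionId, data);
    }

    // Looks the session up on the same node; returns null if unknown.
    String get(String sessionId) {
        return storeByNode.get(router.nodeFor(sessionId)).get(sessionId);
    }
}
```

The trade-off of this approach is that the state lives on a single node: if that node goes away, the session is lost. That is why session replication and distributed storage engines exist as alternative strategies.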
Thanks to a PaaS, you don't need to be a clustering guru. This doesn't mean that it will magically let your legacy application scale out, but it gives you adequate tools to design the application for scalability.
Going back to the comparison with electricity production made in the Preface, building your own private Cloud may make sense if you're very well established. It could make sense for Amazon or Google to have private power plants to supply their giant data centers, although it doesn't seem that they do, except as backups. For most companies, this would be a surprising choice.
The main reason is that the principle of the Cloud relies on the last letter of XaaS (S), which stands for Service. You can install an OpenStack or VMware farm in your data center, but then you won't have an IaaS. You will have some virtualization and flexibility that is probably far better than traditional dedicated hardware, but you miss the major change. You will still have to hire operators to administer the servers and software stack. You will even have a more complex software stack (search for an OpenStack administrator and you'll understand). Using the Cloud makes sense because there are thousands of users all around the world sharing the same lower-level resources, and a centralized, highly specialized team to manage them all.
Building your own private PaaS is yet another challenge. This is not a simple middleware stack. This is not about providing virtual machine images with a preinstalled Tomcat server. What about maintenance, application scalability, deployment APIs, clustering, backup, data replication, high availability, monitoring, and support?
Support is a major added value of Cloud services. I'm not just saying this because I'm a support engineer, but because when something fails, you need someone to help. You can't just wait on the promise of a patch provided by the community. The person running your application needs to have significant knowledge of the platform. That's one reason that CloudBees is focusing on Java first, as this is the ecosystem and environment we know best (even though we have some Erlang and Ruby engineers whose favorite game is to troll about this displeasing language).
With a private Cloud, you can probably provide level-one support with an internal team, but you can't handle all the issues. As with resource concentration, working in support with thousands of customers allows a public platform to build an impressive knowledge base.
All those topics are ignored in most cases, as people only focus on the app:deploy automation, as opposed to the old-style deployments to dedicated hardware. If this is all you're looking for, you should know that Maven has been able to do this for years on all the Java EE containers using Cargo; see http://cargo.codehaus.org. Cloud isn't just about abstracting the runtime behind an API; it's about changing the way in which developers manage and access the runtime so that it becomes a service they can consume without any need to worry about what's happening behind the scenes.
The reason that companies usually give for preferring a private Cloud solution is security. Yet Amazon datacenters are arguably far more secure than most private datacenters, due to both a strong security policy and anonymous user data. Security is not about breaking encryption algorithms, like in Hollywood movies, but about social attacks, against which organizations are far more fragile. Few companies take care of administrative, financial, familial, or personal safety.
Thanks to the combination of VPN, HTTPS, fixed IPs, and firewall filters, you can safely deploy an application on the Amazon Cloud as an extension to your own network, to access data from your legacy Oracle or SAP mainframe hosted in your datacenter. As mobile applications demonstrate, your data is already leaving your private network. There's no concrete reason why your backend application can't be hosted outside your walls.
The CloudBees PaaS has something special in its DNA that you won't find in other PaaS offerings: by focusing on the Java ecosystem first, even with polyglot support, CloudBees understands the Java ecosystem's complexity and its underlying practices well.
Heroku was one of the first successful PaaS, focusing on the Ruby runtime. Deploying a Ruby application is just about sending source code to the platform using the following command:
git push heroku master
Ruby is a pleasant ecosystem because there are no such long debates on build and provisioning tools as the ones we know in the Java world: Gemfile and Rake, period.
In the Java ecosystem, there is a need to generate and compile source code, and then sometimes to post-process classes, hence a large set of build tools is required. There's also a need to provision the runtime with dozens of dependencies, so a set of tools for dependency management, inter-project relations, and so on is required. With Agile development practices, automated testing has introduced a huge set of test frameworks that developers want to integrate into the deployment process.
The Java platform is not just about hosting a JVM or a servlet container; it's about managing Ant, Maven, SBT, or Gradle builds, as well as Grails-, Play-, Clojure-, and Scala-specific tooling. It's about hosting dependency repositories. It's about handling complex build processes to include multiple levels of testing and code analysis.
Jenkins is not the subject of this book, but it is the de facto standard for continuous integration in the Java ecosystem, and its use is not limited to that. With a large set of plugins, it can be extended to support a wide range of tools, processes, and views of your project.
The CloudBees team includes major Jenkins committers (including myself #selfpromotion), so it has deep knowledge of the Jenkins ecosystem and is best placed to offer it as a Cloud service. We can also help you diagnose your project workflow by applying the best continuous integration and deployment practices. This helps you become more efficient and stay focused on your actual business development.
With some CloudBees-specific plugins to help, DEV@cloud Jenkins creates a smooth code-build-deploy pipeline, comparable to Heroku's Git push, but with full control over the intermediary process to convert your source code to a runnable application. This is such a significant component to build a full stack for Java developers that CloudBees is the official provider for the continuous integration service for Google App Engine (http://googleappengine.blogspot.fr/2012/10/jenkins-meet-google-app-engine.html), Cloud Foundry (http://blog.cloudfoundry.com/2013/02/28/continuous-integration-to-cloud-foundry-com-using-jenkins-in-the-cloud/), and Amazon Beanstalk (to be announced as I'm writing this chapter).
This chapter introduced Cloud principles and benefits and compared CloudBees to its competitors.
We will cover the CloudBees platform in detail in the next chapters. We hope that you will like it as much as we do and give it a try. If you prefer another PaaS, never mind; experiment with the Cloud and let competitors give you the best service they can.