What is Keras?

Janu Verma

June 02nd, 2017

Keras is a high-level library for deep learning, which is built on top of Theano and Tensorflow. It is written in Python, and provides a scikit-learn type API for building neural networks. It enables developers to quickly build neural networks without worrying about the mathematical details of tensor algebra, optimization methods, and numerical techniques. The key idea behind Keras is to facilitate fast prototyping and experimentation. In the words of Francois Chollet, the creator of Keras: Being able to go from idea to result with the least possible delay is key to doing good research.

This is a huge advantage, especially for scientists and beginner developers because one can jump right into deep learning without getting their hands dirty. Because deep learning is currently on the rise, the demand for people trained in deep learning is ever increasing. Organizations are trying to incorporate deep learning into their workflows, and Keras offers an easy API to test and build deep learning applications without considerable effort. Deep learning research is such a hot topic right now, scientists need a tool to quickly try out their ideas, and they would rather spend time on coming up with ideas than putting together a neural network model. I use Keras in my own research, and I know a lot of other researchers relying on Keras for its easy and flexible API.

Salient features of Keras

  • Keras is a high-level interface to Theano or Tensorflow and any of these can be used in the backend. It is extremely easy to switch from one backend to another.
  • Training deep neural networks is a memory and time intensive task. Modern deep learning frameworks like Tensorflow, Caffe, Torch, etc. can also run on GPU, though there might be some overhead in setting up and running the GPU. Keras runs seamlessly on both CPU and GPU.
  • Keras supports most of the neural layer types e.g. fully connected, convolution, pooling, recurrent, embedding, dropout, etc., which can be combined in any which way to build complex models.
  • Keras is modular in nature in the sense that each component of a neural network model is a separate, standalone, fully-configurable module, and these modules can be combined to create new models. Essentially, layers, activation, optimizers, dropout, loss, etc. are all different modules that can be assembled to build models. A key advantage of modularity is that new features are easy to add as separate modules. This makes Keras fully expressive, extremely flexible, and well-suited for innovative research.
  • Coding in Keras is extremely easy. The API is very user-friendly with the least amount of cognitive load. Keras is a full Python framework, and all coding is done in Python, which makes it easy to debug and explore. The coding style is very minimalistic, and operations are added in very intuitive python statements.

Architecture of Keras

The core component of Keras architecture is a model. Essentially, a model is a neural network model with layers, activations, optimization, and loss. The simplest Keras model is Sequential, which is just a linear stack of layers; other layer arrangements can be formed using the Functional model. We first initialize a model, add layers to it one by one, each layer followed by its activation function (and regularization, if desired), and then the cost function is added to the model. The model is then compiled. A compiled model can be trained, using the simple API (model.fit()), and once trained the model can be used to make predictions (model.predict()). The similarity to scikit-learn API can be noted here. Two models can be combined sequentially or parallel. A model trained on some data can be saved as an HDF5 file, which can be loaded at a later time. This eliminates the need to train a model again and again; train once and make predictions whenever desired.

Keras provides an API for most common types of layers. You can also merge or concatenate layers for a parallel model. It is also possible to write your own layers. Other ingredients of a neural network model like loss function, metric, optimization method, activation function, and regularization are all available with most common choices. Another very useful component of Keras is the preprocessing module with support for manipulating and processing image, text, and sequence data.

A number of deep learning models and their weights obtained by training on a big dataset are made available. For example, we have VGG16, VGG19, InceptionV3, Xception, ResNet50 image recognition models with their weights after training on ImageNet data. These models can be used for direct prediction, feature building, and/or transfer learning.

One of the greatest advantage of Keras is a huge list of example code available on the Keras GitHub repository (with discussions on accompanying blog) and on the wider Internet. You can learn how to use Keras for text classification using a LSTM model, generate inceptionistic art using deep dream, using pre-trained word embeddings, building variational autoencoder, or train a Siamese network, etc.

There is also a visualization module, which provides functionality to draw a Keras model. This uses the graphviz library to plot and save the model graph to a file. All in all, Keras is a library worth exploring, if you haven't already. 

About the author

Janu Verma is a Researcher in the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational biology and healthcare analytics. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute.  He has written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journals etc.  His current focus is on the development of visual analytics systems for prediction and understanding. He advises startups and companies on data science and machine learning in the Delhi-NCR area, email to schedule a meeting. Check out his personal website at http://jverma.github.io/.