

Working with NAV and Azure App Service

Packt
30 Jan 2017
12 min read
In this article by Stefano Demiliani, the author of the book Building ERP Solutions with Microsoft Dynamics NAV, we will see how to quickly solve a complex architectural scenario and make Microsoft Dynamics NAV data available to external applications. (For more resources related to this topic, see here.)

The business scenario

Imagine a requirement where many Microsoft Dynamics NAV instances (physically located in different places around the world) have to interact with an external application. A typical scenario could be the headquarters of a big enterprise company that has a business application (called HQAPP) that must collect data about item shipments from the ERP of the subsidiary companies around the world (Microsoft Dynamics NAV).

The cloud can help us handle this scenario efficiently. Why not place the interface layer in the Azure cloud and use the scalability features that Azure offers? Azure App Service could be the solution. We can implement an architecture like the following schema: the interface layer is placed on Azure App Service. Every NAV instance has its business logic (in our scenario, a query that retrieves the desired data) exposed as a NAV web service, and each NAV instance can have an Azure VPN in place for security. HQAPP sends a request with the correct parameters to the interface layer in Azure App Service; the cloud service then redirects the request to the correct NAV instance, retrieves the data, and forwards it back to HQAPP. Azure App Service can be scaled (manually or automatically) based on the resources needed to perform the data retrieval process.

Azure App Service overview

Azure App Service is a PaaS offering for building scalable web and mobile apps and enabling interaction with on-premises or on-cloud data.
With Azure App Service, you can deploy your application to the cloud, quickly scale it to handle high traffic loads, and manage traffic and application availability without interacting with the underlying infrastructure. This is the main difference from an Azure VM, where you can run a web application in the cloud but in an IaaS environment (you control the infrastructure: OS, configuration, installed services, and so on).

Some key features of Azure App Service are as follows:

Support for many languages and frameworks
Global scale with high availability (scaling up and out, manually or automatically)
Security
Visual Studio integration for creating, deploying, and debugging applications
Application templates and connectors

Azure App Service offers different types of resources for running a workload, which are as follows:

Web Apps: These host websites and web applications
Mobile Apps: These host mobile app backends
API Apps: These host RESTful APIs
Logic Apps: These automate business processes across the cloud

Azure App Service has the following service plans, which you can scale between depending on your requirements in terms of resources:

Free: This is ideal for testing and development; there are no custom domains or SSL, and you can deploy up to 10 applications.
Shared: This has a fixed per-hour charge. It is ideal for testing and development; it supports custom domains and SSL, and you can deploy up to 100 applications.
Basic: This has a per-hour charge based on the number of instances. It runs on a dedicated instance and is ideal for low-traffic requirements; you can deploy an unlimited number of apps. It supports only a single SSL certificate per plan (not ideal if you need to connect to an Azure VPN or use deployment slots).
Standard: This has a per-hour charge based on the number of instances and provides full SSL support.
This provides up to 10 instances with auto-scaling, automated backups, and up to five deployment slots; it is ideal for production environments.
Premium: This has a per-hour charge based on the number of instances. It provides up to 50 instances with auto-scaling, up to 20 deployment slots, daily backups, and a dedicated App Service Environment; it is ideal for enterprise scale and integration.

Regarding application deployment, Azure App Service supports the concept of a deployment slot (only on the Standard and Premium tiers). A deployment slot is a feature that lets you run a separate instance of an application on the same VM, isolated from the other deployment slots and the production slot active in the App Service. Always remember that all deployment slots share the same VM instance and the same server resources.

Developing the solution

Our solution is essentially composed of two parts:

The NAV business logic
The interface layer (cloud service)

The following steps will help you retrieve the required data from an external application. In the NAV instances of the subsidiary companies, we need to retrieve the sales shipment data for every item. To do so, we need to create a Query object that reads Sales Shipment Header and Sales Shipment Line and expose it as a web service (OData). The Query object will be designed as follows: for every Sales Shipment Header DataItem, we retrieve the corresponding Sales Shipment Line DataItems. I've renamed the No. field of the Sales Shipment Line DataItem to ItemNo because the default name was already in use in the Sales Shipment Header DataItem. Compile and save the Query object (here, I've used Object ID 50009 and Service Name Item Shipments).
Now, we will publish the Query object as a web service in NAV, so open the Web Services page and create the following entry:

Object Type: Query
Object ID: 50009
Service Name: Item Shipments
Published: TRUE

When published, NAV returns the OData service URL. This Query object must be published as a web service on every NAV instance in the subsidiary companies.

To develop our interface layer, we first need to download and install (if not already present) the Azure SDK for Visual Studio from https://azure.microsoft.com/en-us/downloads/. After that, we can create a new Azure Cloud Service project: open Visual Studio, navigate to File | New | Project, select the Cloud templates, and choose Azure Cloud Service. Enter the project's name (here, it is NAVAzureCloudService) and click on OK. After clicking on OK, Visual Studio asks you to select a service type. Select WCF Service Web Role, as shown in the following screenshot.

Visual Studio now creates a template for our solution. Right-click on the NAVAzureCloudService project and select New Web Role Project; in the Add New .NET Framework Role Project window, select WCF Service Web Role and give it a proper name (here, we have named it WCFServiceWebRoleNAV). Then, rename Service1.svc to a better name (here, it is NAVService.svc).

Our WCF Service Web Role must have references to all the NAV web service URLs for the various NAV instances in our scenario and (if we want to use impersonation) the credentials to access each NAV instance. Right-click the WCFServiceWebRoleNAV project, select Properties, and then open the Settings tab. Here you can add the URL for each NAV instance and the relative web service credentials. Let's start writing our service code.
We create a class called SalesShipment that defines our data model as follows:

public class SalesShipment
{
    public string No { get; set; }
    public string CustomerNo { get; set; }
    public string ItemNo { get; set; }
    public string Description { get; set; }
    public string Description2 { get; set; }
    public string UoM { get; set; }
    public decimal? Quantity { get; set; }
    public DateTime? ShipmentDate { get; set; }
}

In the next step, we have to define our service contract (interface). Our service will have a single method that retrieves the shipments for a NAV instance, filtered by shipment date. The service contract will be defined as follows:

[ServiceContract]
public interface INAVService
{
    [OperationContract]
    [WebInvoke(Method = "GET",
               ResponseFormat = WebMessageFormat.Xml,
               BodyStyle = WebMessageBodyStyle.Wrapped,
               UriTemplate = "getShipments?instance={NAVInstanceName}&date={shipmentDateFilter}")]
    // Date format for the shipmentDateFilter parameter: YYYY-MM-DD
    List<SalesShipment> GetShipments(string NAVInstanceName, string shipmentDateFilter);
}

The WCF service class will implement the previously defined interface as follows:

public class NAVService : INAVService
{
}

The GetShipments method is implemented as follows:

public List<SalesShipment> GetShipments(string NAVInstanceName, string shipmentDateFilter)
{
    try
    {
        DataAccessLayer.DataAccessLayer DAL = new DataAccessLayer.DataAccessLayer();
        List<SalesShipment> list = DAL.GetNAVShipments(NAVInstanceName, shipmentDateFilter);
        return list;
    }
    catch (Exception ex)
    {
        // You can handle exceptions here...
        throw;
    }
}

This method creates an instance of the DataAccessLayer class (which we will discuss in detail later) and calls a method called GetNAVShipments, passing the NAV instance name and the shipment date filter.
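The UriTemplate above means a client calls the service with two query string parameters. As a quick illustration of the request format (this Python helper is not part of the WCF project, and the instance name used is hypothetical), the URL for a given instance and date filter can be composed like this:

```python
from urllib.parse import urlencode

def get_shipments_url(service_base, instance_name, date_filter):
    """Compose a getShipments request URL matching the UriTemplate
    'getShipments?instance={NAVInstanceName}&date={shipmentDateFilter}'.
    date_filter must be in the YYYY-MM-DD format the service expects."""
    query = urlencode({"instance": instance_name, "date": date_filter})
    return "{0}/getShipments?{1}".format(service_base, query)

# Hypothetical NAV instance name; the service address is the one
# used later in this article.
url = get_shipments_url(
    "http://navazurecloudservice.cloudapp.net/NAVService.svc",
    "NAV-IT", "2017-01-01")
```

Calling this helper yields a URL that HQAPP could issue with any HTTP client.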
To call the NAV business logic, we need a reference to the NAV OData web service (only to generate a proxy class; the real service URL will be selected dynamically in code), so right-click on your project (WCFServiceWebRoleNAV) and navigate to Add | Service Reference. In the Add Service Reference window, paste the OData URL that comes from NAV and, when the service is discovered, give it a reference name (here, it is NAVODATAWS). Visual Studio automatically adds a service reference to your project.

The DataAccessLayer class will be responsible for handling calls to the NAV OData web service. This class defines a method called GetNAVShipments with the following two parameters:

NAVInstanceName: The name of the NAV instance to call
shipmentDateFilter: The filter date for the NAV shipment lines (greater than or equal to)

According to NAVInstanceName, the method retrieves the correct NAV OData URL and credentials from the project settings (stored in web.config), calls the NAV query (passing the filter), and retrieves the data as a list of SalesShipment records (our data model).
The GetNAVShipments method of the DataAccessLayer class is defined as follows:

public List<SalesShipment> GetNAVShipments(string NAVInstanceName, string shipmentDateFilter)
{
    try
    {
        // Read the OData URL and credentials for the requested NAV instance.
        string URL = Properties.Settings.Default[NAVInstanceName].ToString();
        string WS_User = Properties.Settings.Default[NAVInstanceName + "_User"].ToString();
        string WS_Pwd = Properties.Settings.Default[NAVInstanceName + "_Pwd"].ToString();
        string WS_Domain = Properties.Settings.Default[NAVInstanceName + "_Domain"].ToString();

        NAVODATAWS.NAV NAV = new NAVODATAWS.NAV(new Uri(URL));
        NAV.Credentials = new System.Net.NetworkCredential(WS_User, WS_Pwd, WS_Domain);

        DataServiceQuery<NAVODATAWS.ItemShipments> q =
            NAV.CreateQuery<NAVODATAWS.ItemShipments>("ItemShipments");
        if (shipmentDateFilter != null)
        {
            string FilterValue = string.Format("Shipment_Date ge datetime'{0}'", shipmentDateFilter);
            q = q.AddQueryOption("$filter", FilterValue);
        }

        List<NAVODATAWS.ItemShipments> list = q.Execute().ToList();
        List<SalesShipment> sslist = new List<SalesShipment>();
        foreach (NAVODATAWS.ItemShipments shpt in list)
        {
            SalesShipment ss = new SalesShipment();
            ss.No = shpt.No;
            ss.CustomerNo = shpt.Sell_to_Customer_No;
            ss.ItemNo = shpt.ItemNo;
            ss.Description = shpt.Description;
            ss.Description2 = shpt.Description_2;
            ss.UoM = shpt.Unit_of_Measure;
            ss.Quantity = shpt.Quantity;
            ss.ShipmentDate = shpt.Shipment_Date;
            sslist.Add(ss);
        }
        return sslist;
    }
    catch (Exception)
    {
        throw;
    }
}

The method returns a list of SalesShipment objects. It creates an instance of the NAV OData web service proxy, applies the OData filter to the NAV query, reads the results, and loads the list of SalesShipment objects.
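Under the hood, the AddQueryOption call simply appends a $filter expression to the OData query URL. As an illustration of what the resulting request looks like (this Python helper and the OData root URL below are hypothetical, not part of the project):

```python
from urllib.parse import quote

def item_shipments_url(odata_root, shipment_date_filter=None):
    """Build the OData request URL for the published ItemShipments query.
    The filter expression matches the C# code above:
    Shipment_Date ge datetime'YYYY-MM-DD' (greater than or equal)."""
    url = odata_root.rstrip("/") + "/ItemShipments"
    if shipment_date_filter is not None:
        expr = "Shipment_Date ge datetime'{0}'".format(shipment_date_filter)
        url += "?$filter=" + quote(expr)
    return url
```

For example, with a date filter of 2017-01-01, the helper produces an ItemShipments URL whose $filter option contains the percent-encoded expression.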
Deployment to Azure App Service

Now that your service is ready, you have to deploy it to Azure by performing the following steps. Right-click on the NAVAzureCloudService project and select Package… as shown in the following screenshot. In the Package Azure Application window, select Cloud as the Service configuration and Release as the Build configuration, and then click on Package. This operation creates two files in the <YourProjectName>\bin\Release\app.publish folder, as shown in the following screenshot. These are the packages that must be deployed to Azure.

To do so, log in to the Azure Portal and navigate to Cloud Services | Add from the hub menu on the left. In the next window, set the following cloud service parameters:

DNS name: The name of your cloud service (yourname.cloudapp.net)
Subscription: The Azure subscription where the cloud service will be added
Resource group: Creates a new resource group for your cloud service or uses an existing one
Location: The Azure location where the cloud service is to be added

Finally, click on the Create button to create your cloud service. Now, deploy the previously created cloud packages to the cloud service you just created.
In the cloud services list, click on NAVAzureCloudService, and in the next window, select the desired slot (for example, the Production slot) and click on Upload, as shown in the following screenshot. In the Upload a package window, provide the following parameters:

Storage account: A previously created storage account in your subscription
Deployment label: The name of your deployment
Package: Select the .cspkg file previously created for your cloud service
Configuration: Select the .cscfg file previously created for your cloud service configuration

You can take a look at the preceding parameters in the following screenshot. Select the Start deployment checkbox and click on the OK button at the bottom to start the deployment process to Azure. Now you can start your cloud service and manage it (swap, scaling, and so on) directly from the Azure Portal.

When running, you can use your deployed service by reaching this URL: http://navazurecloudservice.cloudapp.net/NAVService.svc

This is the URL that HQAPP, in our business scenario, has to call to retrieve data from the various NAV instances of the subsidiary companies around the world. In this way, you have deployed a service to the cloud, you can manage its resources centrally (via the Azure Portal), and you can easily maintain different environments by using slots.

Summary

In this article, you learned how to enable NAV instances placed at different locations to interact with an external application through Azure App Service, and you also saw the features that Azure App Service provides.

Resources for Article:

Further resources on this subject:

Introduction to NAV 2017 [article]
Code Analysis and Debugging Tools in Microsoft Dynamics NAV 2009 [article]
Exploring Microsoft Dynamics NAV – An Introduction [article]

Creating Dynamic Maps

Packt
27 Jan 2017
15 min read
In this article by Joel Lawhead, author of the book QGIS Python Programming Cookbook - Second Edition, we will cover the following recipes:

Setting a transparent layer fill
Using a filled marker symbol
Rendering a single band raster using a color ramp algorithm
Setting a feature's color using a column in a CSV file
Creating a complex vector layer symbol
Using an outline for font markers
Using arrow symbols

(For more resources related to this topic, see here.)

Setting a transparent layer fill

Sometimes, you may just want to display the outline of a polygon in a layer and have the inside of the polygon render transparently, so you can see the other features and background layers inside that space. For example, this technique is common with political boundaries. In this recipe, we will load a polygon layer onto the map and then interactively change it to just an outline of the polygon.

Getting ready

Download the zipped shapefile from https://github.com/GeospatialPython/Learn/raw/master/Mississippi.zip and extract it to your qgis_data directory into a folder named ms.

How to do it…

In the following steps, we'll load a vector polygon layer, set up a properties dictionary to define the color and style, apply the properties to the layer's symbol, and repaint the layer. In the Python Console, execute the following:

Create the polygon layer:
lyr = QgsVectorLayer("/qgis_data/ms/mississippi.shp", "Mississippi", "ogr")

Load the layer onto the map:
QgsMapLayerRegistry.instance().addMapLayer(lyr)

Now, we'll create the properties dictionary:
properties = {}

Next, set each property for the fill color, border color, border width, and a style of 'no', meaning no brush.
Note that we'll still set a fill color; we are just making it transparent:

properties["color"] = '#289e26'
properties["color_border"] = '#289e26'
properties["width_border"] = '2'
properties["style"] = 'no'

Now, we create a new symbol with these properties:
sym = QgsFillSymbolV2.createSimple(properties)

Next, we access the layer's renderer:
renderer = lyr.rendererV2()

Then, we set the renderer's symbol to the new symbol we created:
renderer.setSymbol(sym)

Finally, we repaint the layer to show the style updates:
lyr.triggerRepaint()

How it works…

In this recipe, we used a simple dictionary to define our properties, combined with the createSimple method of the QgsFillSymbolV2 class. Note that we could have changed the symbology of the layer before adding it to the canvas, but adding it first allows you to see the change take place interactively.

Using a filled marker symbol

A newer feature of QGIS is filled marker symbols. Filled marker symbols are powerful features that allow you to use other symbols, such as point markers, lines, and shapebursts, as a fill pattern for a polygon, allowing an endless set of options for rendering a polygon. In this recipe, we'll create a very simple filled marker symbol that paints a polygon with stars.

Getting ready

Download the zipped shapefile from https://github.com/GeospatialPython/Learn/raw/master/Mississippi.zip and extract it to your qgis_data directory into a folder named ms.

How to do it…

A filled marker symbol requires us to first create the representative star point marker symbol. Then, we'll add that symbol to the filled marker symbol and swap it with the layer's default symbol.
Finally, we'll repaint the layer to update the symbology: First, create the layer with our polygon shapefile: lyr = QgsVectorLayer("/qgis_data/ms/mississippi.shp", "Mississippi", "ogr") Next, load the layer onto the map: QgsMapLayerRegistry.instance().addMapLayer(lyr) Now, set up the dictionary with the properties of the star marker symbol: marker_props = {} marker_props["color"] = 'red' marker_props["color_border"] = 'black' marker_props["name"] = 'star' marker_props["size"] = '3' Now, create the star marker symbol: marker = QgsMarkerSymbolV2.createSimple(marker_props) Then, we create our filled marker symbol: filled_marker = QgsPointPatternFillSymbolLayer() We need to set the horizontal and vertical spacing of the filled markers in millimeters: filled_marker.setDistanceX(4.0) filled_marker.setDistanceY(4.0) Now, we can add the simple star marker to the filled marker symbol: filled_marker.setSubSymbol(marker) Next, access the layer's renderer: renderer = lyr.rendererV2() Now, we swap the first symbol layer of the first symbol with our filled marker using zero indexes to reference them: renderer.symbols()[0].changeSymbolLayer(0, filled_marker) Finally, we repaint the layer to see the changes: lyr.triggerRepaint() Verify that the result looks similar to the following screenshot: Rendering a single band raster using a color ramp algorithm A color ramp allows you to render a raster using just a few colors to represent different ranges of cell values that have a similar meaning in order to group them. The approach that will be used in this recipe is the most common way to render elevation data. Getting ready You can download a sample DEM from https://github.com/GeospatialPython/Learn/raw/master/dem.zip, which you can unzip in a directory named rasters in your qgis_data directory. How to do it... 
In the following steps, we will set up objects to color a raster, create a list establishing the color ramp ranges, apply the ramp to the layer renderer, and finally, add the layer to the map. To do this, we need to perform the following: First, we import the QtGui library for color objects in Python Console: from PyQt4 import QtGui Next, we load the raster layer, as follows: lyr = QgsRasterLayer("/qgis_data/rasters/dem.asc", "DEM") Now, we create a generic raster shader object: s = QgsRasterShader() Then, we instantiate the specialized ramp shader object: c = QgsColorRampShader() We must name a type for the ramp shader. In this case, we use an INTERPOLATED shader: c.setColorRampType(QgsColorRampShader.INTERPOLATED) Now, we'll create a list of our color ramp definitions: i = [] Then, we populate the list with the color ramp values that correspond to the elevation value ranges: i.append(QgsColorRampShader.ColorRampItem(400, QtGui.QColor('#d7191c'), '400')) i.append(QgsColorRampShader.ColorRampItem(900, QtGui.QColor('#fdae61'), '900')) i.append(QgsColorRampShader.ColorRampItem(1500, QtGui.QColor('#ffffbf'), '1500')) i.append(QgsColorRampShader.ColorRampItem(2000, QtGui.QColor('#abdda4'), '2000')) i.append(QgsColorRampShader.ColorRampItem(2500, QtGui.QColor('#2b83ba'), '2500')) Now, we assign the color ramp to our shader: c.setColorRampItemList(i) Now, we tell the generic raster shader to use the color ramp: s.setRasterShaderFunction(c) Next, we create a raster renderer object with the shader: ps = QgsSingleBandPseudoColorRenderer(lyr.dataProvider(), 1, s) We assign the renderer to the raster layer: lyr.setRenderer(ps) Finally, we add the layer to the canvas in order to view it: QgsMapLayerRegistry.instance().addMapLayer(lyr) How it works… While it takes a stack of four objects to create a color ramp, this recipe demonstrates how flexible the PyQGIS API is. 
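To make the ramp's banding behavior concrete, here is a small pure-Python sketch of how the ColorRampItem list above maps elevation values to colors (the helper is illustrative only: it models discrete banding, while QGIS's INTERPOLATED mode additionally blends colors between breakpoints):

```python
import bisect

# Breakpoints and colors from the recipe's ColorRampItem list.
RAMP = [(400, '#d7191c'), (900, '#fdae61'), (1500, '#ffffbf'),
        (2000, '#abdda4'), (2500, '#2b83ba')]

def ramp_color(elevation):
    """Return the color of the ramp band containing `elevation`.

    Each item covers values from its own breakpoint up to (but not
    including) the next item's breakpoint; values below the first
    breakpoint are unclassified and return None."""
    breakpoints = [value for value, _ in RAMP]
    i = bisect.bisect_right(breakpoints, elevation) - 1
    return RAMP[i][1] if i >= 0 else None
```

For example, an elevation of 450 falls in the 400 band and gets '#d7191c', while 900 starts the next band and gets '#fdae61'.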
Typically, the more objects it takes to accomplish an operation in QGIS, the richer the API is, giving you the flexibility to make complex maps. Notice that in each ColorRampItem object, you specify a starting elevation value, the color, and a label as a string. The range for each color ramp item ends at any value less than the following item's value. So, in this case, the first color will be assigned to the cells with a value between 400 and 899. The following screenshot shows the applied color ramp:

Setting a feature's color using a column in a CSV file

Comma Separated Value (CSV) files are an easy way to store basic geospatial information, but you can also store styling properties alongside the geospatial data for QGIS to use in order to dynamically style the feature data. In this recipe, we'll load some points into QGIS from a CSV file and use one of the columns to determine the color of each point.

Getting ready

Download the sample zipped CSV file from https://github.com/GeospatialPython/Learn/raw/master/point_colors.csv.zip, extract it, and place it in your qgis_data directory in a directory named shapes.

How to do it…

We'll load the CSV file into QGIS as a vector layer and create a default point symbol. Then we'll specify the property and the CSV column we want to control it with. Finally, we'll assign the symbol to the layer and add the layer to the map. First, create the URI string needed to load the CSV:

uri = "file:///qgis_data/shapes/point_colors.csv?"
uri += "type=csv&"
uri += "xField=X&yField=Y&"
uri += "spatialIndex=no&"
uri += "subsetIndex=no&"
uri += "watchFile=no&"
uri += "crs=epsg:4326"

Next, create the layer using the URI string:
lyr = QgsVectorLayer(uri, "Points", "delimitedtext")

Now, create a default symbol for the layer's geometry type:
sym = QgsSymbolV2.defaultSymbol(lyr.geometryType())

Then, we access the layer's symbol layer:
symLyr = sym.symbolLayer(0)

Now, we perform the key step, which is to assign a symbol layer property to a CSV column:
symLyr.setDataDefinedProperty("color", '"COLOR"')

Then, we replace the existing symbol layer with our data-driven symbol layer:
lyr.rendererV2().symbols()[0].changeSymbolLayer(0, symLyr)

Finally, we add the layer to the map and verify that each point has the correct color, as defined in the CSV:
QgsMapLayerRegistry.instance().addMapLayers([lyr])

How it works…

In this example, we pulled feature colors from the CSV, but you could control any symbol layer property in this manner. CSV files can be a simple alternative to databases for lightweight applications or for testing key parts of a large application before investing the overhead of setting up a database.

Creating a complex vector layer symbol

The true power of QGIS symbology lies in its ability to stack multiple symbols in order to create a single complex symbol. This ability makes it possible to create virtually any type of map symbol you can imagine. In this recipe, we'll merge two symbols to create a single symbol and begin unlocking the potential of complex symbols.

Getting ready

For this recipe, we will need a line shapefile, which you can download and extract from https://github.com/GeospatialPython/Learn/raw/master/paths.zip. Add this shapefile to a directory named shapes in your qgis_data directory.

How to do it…

Using the Python Console, we will create a classic railroad line symbol by placing a series of short, rotated line markers along a regular line symbol.
To do this, we need to perform the following steps: First, we load our line shapefile:
lyr = QgsVectorLayer("/qgis_data/shapes/paths.shp", "Route", "ogr")

Next, we get the symbol list and reference the default symbol:
symbolList = lyr.rendererV2().symbols()
symbol = symbolList[0]

Then, we create a shorter variable name for the symbol layer registry:
symLyrReg = QgsSymbolLayerV2Registry

Now, we set up the line style for a simple line using a Python dictionary:
lineStyle = {'width':'0.26', 'color':'0,0,0'}

Then, we create an abstract symbol layer for a simple line:
symLyr1Meta = symLyrReg.instance().symbolLayerMetadata("SimpleLine")

We instantiate a symbol layer from the abstract layer using the line style properties:
symLyr1 = symLyr1Meta.createSymbolLayer(lineStyle)

Now, we add the symbol layer to the layer's symbol:
symbol.appendSymbolLayer(symLyr1)

Now, in order to create the rails on the railroad, we begin building a marker line style with another Python dictionary, as follows:
markerStyle = {}
markerStyle['width'] = '0.26'
markerStyle['color'] = '0,0,0'
markerStyle['interval'] = '3'
markerStyle['interval_unit'] = 'MM'
markerStyle['placement'] = 'interval'
markerStyle['rotate'] = '1'

Then, we create the marker line abstract symbol layer for the second symbol:
symLyr2Meta = symLyrReg.instance().symbolLayerMetadata("MarkerLine")

We instantiate the symbol layer, as shown here:
symLyr2 = symLyr2Meta.createSymbolLayer(markerStyle)

Now, we must work with a subsymbol that defines the markers along the marker line:
sybSym = symLyr2.subSymbol()

We must delete the default subsymbol:
sybSym.deleteSymbolLayer(0)

Now, we set up the style for our rail marker using a dictionary:
railStyle = {'size':'2', 'color':'0,0,0', 'name':'line', 'angle':'0'}

Now, we repeat the process of building a symbol layer and add it to the subsymbol:
railMeta = symLyrReg.instance().symbolLayerMetadata("SimpleMarker")
rail = railMeta.createSymbolLayer(railStyle)
sybSym.appendSymbolLayer(rail)

Then, we
add the subsymbol to the second symbol layer: symbol.appendSymbolLayer(symLyr2) Finally, we add the layer to the map: QgsMapLayerRegistry.instance().addMapLayer(lyr) How it works… First, we must create a simple line symbol. The marker line, by itself, will render correctly, but the underlying simple line will be a randomly chosen color. We must also change the subsymbol of the marker line because the default subsymbol is a simple circle. Using an outline for font markers Font markers open up broad possibilities for icons, but a single-color shape can be hard to see across a varied map background. Recently, QGIS added the ability to place outlines around font marker symbols. In this recipe, we'll use font marker symbol methods to place an outline around the symbol to give it contrast and, therefore, visibility on any type of background. Getting ready Download the following zipped shapefile. Extract it and place it in a directory named ms in your qgis_data directory: https://github.com/GeospatialPython/Learn/raw/master/tourism_points.zip How to do it… This recipe will load a layer from a shapefile, set up a font marker symbol, put an outline on it, and then add it to the layer. 
We'll use a simple text character, an @ sign, as our font marker to keep things simple: First, we need to import the QtGui library so that we can work with color objects:
from PyQt4.QtGui import *

Now, we create a path string to our shapefile:
src = "/qgis_data/ms/tourism_points.shp"

Next, we can create the layer:
lyr = QgsVectorLayer(src, "Points of Interest", "ogr")

Then, we can create the font marker symbol, specifying the font size and color in the constructor:
symLyr = QgsFontMarkerSymbolLayerV2(pointSize=16, color=QColor("cyan"))

Now, we can set the font family, character, outline width, and outline color:
symLyr.setFontFamily("'Arial'")
symLyr.setCharacter("@")
symLyr.setOutlineWidth(.5)
symLyr.setOutlineColor(QColor("black"))

We are now ready to assign the symbol to the layer:
lyr.rendererV2().symbols()[0].changeSymbolLayer(0, symLyr)

Finally, we add the layer to the map:
QgsMapLayerRegistry.instance().addMapLayer(lyr)

Verify that your map looks similar to the following image.

How it works…

We used class methods to set this symbol up, but we could have used a property dictionary just as easily. Note that the font size and color were set in the object constructor for the font marker symbol instead of using setter methods; QgsFontMarkerSymbolLayerV2 doesn't have methods for these two properties.

Using arrow symbols

Line features convey location, but sometimes you also need to convey a direction along a line. QGIS recently added a symbol that does just that by turning lines into arrows. In this recipe, we'll symbolize some line features showing historical human migration routes around the world. This data requires directional arrows for us to understand it.

Getting ready

We will use two shapefiles in this example. One is a world boundaries shapefile and the other is a routes shapefile.
You can download the countries shapefile from https://github.com/GeospatialPython/Learn/raw/master/countries.zip and the routes shapefile from https://github.com/GeospatialPython/Learn/raw/master/human_migration_routes.zip. Download these ZIP files and unzip the shapefiles into your qgis_data directory.

How to do it…

We will load the countries shapefile as a background reference layer and then the routes shapefile. Before we display the layers on the map, we'll create the arrow symbol layer, configure it, and then add it to the routes layer. Finally, we'll add the layers to the map. First, we'll create the URI strings for the paths to the two shapefiles:
countries_shp = "/qgis_data/countries.shp"
routes_shp = "/qgis_data/human_migration_routes.shp"

Next, we'll create our countries and routes layers:
countries = QgsVectorLayer(countries_shp, "Countries", "ogr")
routes = QgsVectorLayer(routes_shp, "Human Migration Routes", "ogr")

Now, we'll create the arrow symbol layer:
symLyr = QgsArrowSymbolLayer()

Then, we'll configure the layer. We'll use the default configuration except for two parameters: to curve the arrow and to not repeat the arrow symbol for each line segment:
symLyr.setIsCurved(True)
symLyr.setIsRepeated(False)

Next, we add the symbol layer to the map layer:
routes.rendererV2().symbols()[0].changeSymbolLayer(0, symLyr)

Finally, we add the layers to the map:
QgsMapLayerRegistry.instance().addMapLayers([routes, countries])

Verify that your map looks similar to the following image.

How it works…

The symbol calculates the arrow's direction based on the order of the feature's points. You may find that you need to edit the underlying feature data to produce the desired visual effect, especially when using curved arrows. You have limited control over the arc of the curve using the end points plus an optional third vertex.
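The point-order rule can be illustrated with a tiny bearing calculation (plain Python for illustration; this helper is not part of the QGIS API): an arrow drawn from a feature's first vertex toward its last has a direction determined entirely by vertex order, so reversing the vertices flips the arrow.

```python
import math

def arrow_bearing(first_vertex, last_vertex):
    """Compass bearing in degrees from the first vertex to the last:
    0 = up/north, 90 = right/east. Reversing the two vertices
    rotates the result by 180 degrees, flipping the arrow."""
    dx = last_vertex[0] - first_vertex[0]
    dy = last_vertex[1] - first_vertex[1]
    return math.degrees(math.atan2(dx, dy)) % 360
```

This is why editing a line's digitizing order (for example, with QGIS's reverse-line tool) changes the arrow's direction without changing its geometry.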
This symbol is one of several powerful new visual effects added to QGIS, producing results that would previously have required a vector illustration program after you produced a map.

Summary

In this article, we programmatically created dynamic maps, using Python to control every aspect of the QGIS map canvas. We learnt to dynamically apply symbology from data in a CSV file. We also learnt how to use some of the newer QGIS custom symbology, including font markers, arrow symbols, null symbols, and the powerful new 2.5D renderer for buildings. We saw that every aspect of QGIS is up for grabs with Python when writing your own application. Sometimes the PyQGIS API may not directly support our application goal, but there is nearly always a way to accomplish what you set out to do with QGIS.

Resources for Article:

Further resources on this subject:

Normal maps [article]
Putting the Fun in Functional Python [article]
Revisiting Linux Network Basics [article]

Packt
24 Jan 2017
17 min read

Understanding Container Scenarios and Overview of Docker

Docker is one of the most successful recent open source projects; it provides packaging, shipping, and running of any application as lightweight containers. We can compare Docker containers to shipping containers, which provide a standard, consistent way of shipping any application. Docker is a fairly new project, and with the help of this article it will be easy to troubleshoot some of the common problems that Docker users face while installing and using Docker containers. In this article by Rajdeep Dua, Vaibhav Kohli, and John Wooten, authors of the book Troubleshooting Docker, the emphasis will be on the following topics:

Decoding containers
Diving into Docker
Advantages of Docker containers
Docker lifecycle
Docker design patterns
Unikernels

(For more resources related to this topic, see here.)

Decoding containers

Containerization is an alternative to virtual machines that involves encapsulating applications and providing each with its own operating environment. The basic foundation for containers is Linux Containers (LXC), a user-space interface for the Linux kernel's containment features. With the help of a powerful API and simple tools, it lets Linux users create and manage application containers. LXC containers sit in between chroot and a full-fledged virtual machine. Another key difference between containerization and traditional hypervisors is that containers share the Linux kernel used by the operating system running on the host machine, so multiple containers running on the same machine use the same Linux kernel. This gives containers the advantage of being fast, with almost zero performance overhead compared to VMs. The major use cases of containers are listed in the following sections.

OS containers

OS containers can easily be imagined as virtual machines (VMs), but unlike a VM they share the kernel of the host operating system while providing user-space isolation.
As with a VM, dedicated resources can be assigned to containers, and we can install, configure, and run different applications, libraries, and so on, just as we would on any VM. OS containers are helpful for scalability testing, where a fleet of containers can be deployed easily with different flavors of distros, which is much less expensive than deploying VMs. Containers are created from templates or images that determine the structure and contents of the container. This allows us to create containers with identical environments, the same package versions, and the same configuration across all containers, which is mostly used for development environment setups. Various container technologies, such as LXC, OpenVZ, Docker, and BSD jails, are suitable for OS containers.

Figure 1: OS-based container

Application containers

Application containers are designed to run a single service in the package, whereas the OS containers explained previously can support multiple processes. Application containers have attracted a lot of attention since the launch of Docker and Rocket. Whenever an application container is launched, it runs a single process, which runs the application; an OS container, by contrast, runs multiple services on the same OS. Containers usually take a layered approach, as with Docker containers, which helps reduce duplication and increase reuse. A container can be started from a base image common to all components, and we can then keep adding layers to the filesystem that are specific to each component. The layered filesystem also makes it easy to roll back changes, as we can simply switch to the old layers if required. Each run command specified in the Dockerfile adds a new layer to the container. The main purpose of application containers is to package the different components of an application in separate containers; the components then interact with the help of APIs and services.
Deploying a distributed multi-component system in this way is the basic implementation of a microservice architecture. In this approach, the developer gets the freedom to package the application as per their requirements, and the IT team gets the privilege of deploying the containers on multiple platforms in order to scale the system both horizontally and vertically. A hypervisor is a virtual machine monitor (VMM) that allows multiple operating systems to run and share the hardware resources of the host; each virtual machine is termed a guest machine. The following simple example explains the difference between application containers and OS containers:

Figure 2: Docker layers

Let's consider a three-tier web architecture: a database tier such as MySQL, Nginx as the load balancer, and Node.js as the application tier:

Figure 3: OS container

In the case of an OS container, we can pick Ubuntu as the base container and install the MySQL, Nginx, and Node.js services using a Dockerfile. This type of packaging is good for a testing or development setup, where all the services are packaged together and can be shipped and shared across developers. But deploying this architecture in production cannot be done with OS containers, as there is no consideration of data scalability and isolation. Application containers help to meet this use case, as we can scale the required component by deploying more application-specific containers; they also help to meet the load-balancing and recovery use cases. For the preceding three-tier architecture, each of the services will be packaged into a separate container in order to fulfill the architecture deployment use case.
Figure 4: Application containers scaled up

The main differences between OS and application containers are:

OS container | Application container
Meant to run multiple services on the same OS container | Meant to run a single service
Natively, no layered filesystem | Layered filesystem
Examples: LXC, OpenVZ, BSD jails | Examples: Docker, Rocket

Diving into Docker

Docker is a container implementation that has gathered enormous interest in recent years. It neatly bundles various Linux kernel features and services, such as namespaces, cgroups, SELinux, and AppArmor profiles, with union filesystems such as AUFS and BTRFS to make modular images. These images provide a highly configurable virtualized environment for applications and follow the write-once-run-anywhere principle. An application can be as simple as a single process, or it can be a highly scalable set of distributed processes working together. Docker is getting a lot of traction in industry because of its performance-savvy and universally replicable architecture, while providing the following four cornerstones of modern application development:

Autonomy
Decentralization
Parallelism
Isolation

Furthermore, wide-scale adoption of ThoughtWorks's microservices architecture, or Lots of Small Applications (LOSA), is further realizing the potential of Docker technology. As a result, big companies such as Google, VMware, and Microsoft have already ported Docker to their infrastructure, and the momentum is continued by the launch of a myriad of Docker startups, namely Tutum, Flocker, Giant Swarm, and so on. Since Docker containers replicate their behavior anywhere, be it your development machine, a bare-metal server, a virtual machine, or a datacenter, application designers can focus their attention on development, while the operational semantics are left to DevOps. This makes team workflows modular, efficient, and productive. Docker is not to be confused with a VM, even though they are both virtualization technologies.
Docker shares the host OS while providing a sufficient level of isolation and security to the applications running in containers; a VM, by contrast, completely abstracts away the OS and gives strong isolation and security guarantees. Docker's resource footprint is minuscule in comparison to a VM's, and hence it is preferred for economy and performance. However, it still cannot completely replace VMs, and is therefore complementary to VM technology:

Figure 5: VM and Docker architecture

Advantages of Docker containers

Listed below are some of the advantages of using Docker containers in a microservice architecture:

Rapid application deployment: Containers can be deployed quickly because of their reduced size and minimal runtime, as only the application is packaged.

Portability: An application, with its operating environment (dependencies), can be bundled into a single Docker container that is independent of the OS version or deployment model. Docker containers can easily be transferred to another machine that runs Docker and executed there without any compatibility issues. Windows support is also going to be part of future Docker releases.

Easily shareable: Pre-built container images can easily be shared via public repositories, as well as hosted private repositories for internal use.

Lightweight footprint: Docker images are very small and have a minimal footprint for deploying new applications.

Reusability: Successive versions of Docker containers can easily be built, and rolled back to previous versions whenever required. They are noticeably lightweight, as components from the pre-existing layers can be reused.

Docker lifecycle

These are some of the basic steps involved in the lifecycle of a Docker container:

Build the Docker image with the help of a Dockerfile, which contains all of the commands required to package the application.
It can be run in the following way:

docker build .

A tag name can be added in the following way:

docker build -t username/my-imagename .

If the Dockerfile exists at a different path, then the docker build command can be executed by providing the -f flag:

docker build -t username/my-imagename -f /path/Dockerfile .

After the image has been created, docker run can be used to deploy the container. The running containers can be checked with the help of the docker ps command, which lists the currently active containers. There are two more commands to be discussed:

docker pause: This command uses the cgroups freezer to suspend all of the processes running in a container; internally, it uses the SIGSTOP signal. Using this command, processes can easily be suspended and resumed whenever required.

docker start: This command is used to start a paused or stopped container.

After the container has served its purpose, it can either be stopped or killed. The docker stop command gracefully stops the running container by sending SIGTERM and then SIGKILL. In this case, the container can still be listed by using the docker ps -a command. docker kill kills the running container by sending SIGKILL to the main process running inside the container. If changes made to the container while it was running are to be preserved, the container can be converted back into an image by using docker commit after the container has been stopped.

Figure 6: Docker lifecycle

Docker design patterns

Listed below are some Docker design patterns, with examples. The Dockerfile is the base structure from which we define a Docker image; it contains all of the commands needed to assemble an image. Using the docker build command, we can create an automated build that executes all of the previously mentioned command-line instructions to create an image:

$ docker build
Sending build context to Docker daemon 6.51 MB
...
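The lifecycle just described (run, pause, start, stop, kill, commit) can be summarized as a small state machine. The following Python sketch is purely illustrative: the state names and transition table are my own teaching aid built from the commands above, not Docker's actual implementation.

```python
# Illustrative model of the Docker container lifecycle described above.
# Transitions mirror the CLI commands; e.g. `docker stop` sends SIGTERM
# then SIGKILL, and `docker commit` turns a stopped container into an image.
TRANSITIONS = {
    ("created", "run"): "running",
    ("running", "pause"): "paused",    # cgroups freezer / SIGSTOP
    ("paused", "start"): "running",    # resume a paused container
    ("running", "stop"): "stopped",    # SIGTERM, then SIGKILL
    ("running", "kill"): "stopped",    # SIGKILL to the main process
    ("stopped", "start"): "running",
    ("stopped", "commit"): "image",    # preserve changes as a new image
}

def apply(state, command):
    """Return the next container state, or raise on an invalid transition."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError("cannot %s a %s container" % (command, state))
```

For example, walking a container through run, pause, start, stop, and commit ends in the "image" state, while trying to pause a stopped container raises an error, just as the real CLI would refuse.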
The design patterns listed below can help in creating Docker images that persist in volumes, and they provide various other kinds of flexibility so that the images can be re-created or replaced easily at any time.

The base image sharing

For creating a web-based application or blog, we can create a base image that can be shared and will help deploy the application with ease. This pattern helps because it packages all of the required services on top of one base image, so that this web application or blog image can be reused anywhere:

FROM debian:wheezy
RUN apt-get update
RUN apt-get -y install ruby ruby-dev build-essential git
# For debugging
RUN apt-get install -y gdb strace
# Set up my user
RUN useradd vkohli -u 1000 -s /bin/bash --no-create-home
RUN gem install -n /usr/bin bundler
RUN gem install -n /usr/bin rake
WORKDIR /home/vkohli/
ENV HOME /home/vkohli
VOLUME ["/home"]
USER vkohli
EXPOSE 8080

The preceding Dockerfile shows the standard way of creating an application-based image. A Docker image is a zipped file that is a snapshot of all of the configuration parameters, as well as the changes made to the base image (the kernel of the OS). It installs some specific tools (the Ruby tools rake and bundler) on top of the Debian base image. It creates a new user, adds it to the container image, and specifies the working directory by mounting the /home directory from the host, which is explained in detail in the next section.

Shared volume

Sharing a volume at the host level allows other containers to pick up the shared content they require. This helps in faster rebuilding of Docker images, and when adding, modifying, or removing dependencies.
For example, if we are creating the homepage deployment of the previously mentioned blog, the only directory that needs to be shared with the web app container is /home/vkohli/src/repos/homepage, through the Dockerfile, in the following way:

FROM vkohli/devbase
WORKDIR /home/vkohli/src/repos/homepage
ENTRYPOINT bin/homepage web

For creating the dev version of the blog, we can share the folder /home/vkohli/src/repos/blog, where all of the related developer files can reside. And for creating the dev-version image, we can take the base image from the pre-created devbase:

FROM vkohli/devbase
WORKDIR /
USER root
# For Graphviz integration
RUN apt-get update
RUN apt-get -y install graphviz xsltproc imagemagick
USER vkohli
WORKDIR /home/vkohli/src/repos/blog
ENTRYPOINT bundle exec rackup -p 8080

Dev-tools container

For development purposes, we have separate dependencies for the dev and production environments, which easily get mingled at some point. Containers can be helpful in keeping the dependencies separate by packaging them separately. As shown in the following example, we can derive the dev-tools container image from the base image and install development dependencies on top of it, even allowing an SSH connection so that we can work on the code:

FROM vkohli/devbase
RUN apt-get update
RUN apt-get -y install openssh-server emacs23-nox htop screen
# For debugging
RUN apt-get -y install sudo wget curl telnet tcpdump
# For 32-bit experiments
RUN apt-get -y install gcc-multilib
# Man pages and "most" viewer:
RUN apt-get install -y man most
RUN mkdir /var/run/sshd
ENTRYPOINT /usr/sbin/sshd -D
VOLUME ["/home"]
EXPOSE 22
EXPOSE 8080

As can be seen above, basic tools such as wget, curl, and tcpdump, which are required during development, are installed. The SSHD service is also installed, which allows an SSH connection into the dev container.

Test environment container

Testing code in different environments always eases the process and helps to find more bugs in isolation.
We can create a Ruby environment in a separate container to spawn a new Ruby shell and use it to test the code base:

FROM vkohli/devbase
RUN apt-get update
RUN apt-get -y install ruby1.8 git ruby1.8-dev

In the preceding Dockerfile, we use devbase as the base image, and with the help of just one docker run command we can easily create a new environment, using the image created from this Dockerfile, to test the code.

The build container

Our applications have build steps that are sometimes expensive. In order to overcome this, we can create a separate build container that uses the dependencies needed during the build process. The following Dockerfile can be used to run a separate build process:

FROM sampleapp
RUN apt-get update
RUN apt-get install -y build-essential [assorted dev packages for libraries]
VOLUME ["/build"]
WORKDIR /build
CMD ["bundler", "install", "--path", "vendor", "--standalone"]

/build is the shared directory that can be used to provide the compiled binaries; we can also mount the /build/source directory in the container to provide updated dependencies. Thus, by using a build container, we can decouple the build process and the final packaging into separate containers. This still encapsulates both the process and the dependencies, while breaking the previous single process into separate containers.

The installation container

The purpose of this container is to package the installation steps in a separate container, basically in order to support deployment of containers into a production environment. A sample Dockerfile that packages an installation script inside a Docker image is as follows:

ADD installer /installer
CMD /installer.sh

The installer.sh script can contain the specific installation commands to deploy containers in a production environment, and it can also provide a proxy setup, with DNS entries, in order to have a cohesive environment deployed.
Service-in-a-box container

In order to deploy a complete application in a container, we can bundle multiple services to provide a complete deployment container. In this case, we bundle the web app, an API service, and the databases together in one definition. This helps to ease the pain of interlinking various separate containers:

services:
  web:
    git_url: git@github.com:vkohli/sampleapp.git
    git_branch: test
    command: rackup -p 3000
    build_command: rake db:migrate
    deploy_command: rake db:migrate
    log_folder: /usr/src/app/log
    ports: ["3000:80:443", "4000"]
    volumes: ["/tmp:/tmp/mnt_folder"]
    health: default
  api:
    image: quay.io/john/node
    command: node test.js
    ports: ["1337:8080"]
    requires: ["web"]
databases:
  - "mysql"
  - "redis"

Infrastructure container

While we have talked about container usage in the development environment, one big category is missing: the use of containers for infrastructure services, such as a proxy setup, which provides a cohesive environment for accessing an application. In the following Dockerfile example, we can see that haproxy is installed and a link to its configuration file is provided:

FROM debian:wheezy
ADD wheezy-backports.list /etc/apt/sources.list.d/
RUN apt-get update
RUN apt-get -y install haproxy
ADD haproxy.cfg /etc/haproxy/haproxy.cfg
CMD ["haproxy", "-db", "-f", "/etc/haproxy/haproxy.cfg"]
EXPOSE 80
EXPOSE 443

haproxy.cfg is the configuration file responsible for authenticating a user:

backend test
  acl authok http_auth(adminusers)
  http-request auth realm vkohli if !authok
  server s1 192.168.0.44:8084

Unikernels

Unikernels compile source code into a custom operating system that includes only the functionality required by the application logic, producing a specialized, single-address-space machine image and eliminating unnecessary code.
Unikernels are built using a library operating system, which has the following benefits compared to a traditional OS:

Fast boot time: Unikernels make provisioning highly dynamic and can boot in less than a second.
Small footprint: A unikernel code base is smaller than its traditional OS equivalent and is much easier to manage.
Improved security: As unnecessary code is not deployed, the attack surface is drastically reduced.
Fine-grained optimization: Unikernels are constructed using compile toolchains and are optimized for the device drivers and application logic to be used.

Unikernels match very well with the microservices architecture, as both source code and generated binaries can easily be version-controlled and are compact enough to be rebuilt. On the other hand, modifying VMs is not permitted; changes can only be made to the source code, which is time-consuming and hectic. For example, if an application doesn't require disk access or a display facility, unikernels can help to remove these unnecessary device drivers and display functionality from the kernel. Thus, the production system becomes minimalistic, packaging only the application code, the runtime environment, and the OS facilities, which is the basic concept of immutable application deployment, where a new image is constructed if any application change is required on the production servers:

Figure 7: Transition from traditional containers to unikernel-based containers

Containers and unikernels are a best fit for each other. Recently, the Unikernel Systems team became part of Docker, and the collaboration of these two technologies will be seen soon in a future Docker release. As explained in the preceding diagram, the first image shows the traditional way of packaging: one VM supporting multiple Docker containers. The next step shows a 1:1 map (one container per VM), which allows each application to be self-contained and gives better resource usage, but creating a separate VM for each container adds overhead.
In the last step, we can see the collaboration of unikernels with the existing Docker tools and ecosystem, where a container will get a low-level kernel library environment specific to its needs. The adoption of unikernels in the Docker toolchain will accelerate the progress of unikernels; they will be widely used and understood as a packaging model and runtime framework, making unikernels another type of container. After the unikernel abstraction for Docker developers, we will be able to choose either the traditional Docker container or the unikernel container to create the production environment.

Summary

In this article, we studied the basic containerization concepts with the help of application and OS-based containers. The differences between them explained in this article will help developers choose the containerization approach that fits their system perfectly. We have thrown some light on Docker technology, its advantages, and the lifecycle of a Docker container. The eight Docker design patterns explained in this article clearly show the way to implement Docker containers in a production environment.

Resources for Article:

Further resources on this subject:

Orchestration with Docker Swarm [article]
Benefits and Components of Docker [article]
Docker Hosts [article]

Packt
24 Jan 2017
11 min read

Common PHP Scenarios

Introduction

In this article by Tim Butler, author of the book Nginx 1.9 Cookbook, we'll go through examples of the more common PHP scenarios and how to implement them with Nginx. PHP is a thoroughly tested product to use with Nginx because it is the most popular web-based programming language. It powers sites such as Facebook, Wikipedia, and every WordPress-based site, and its popularity hasn't faded as other languages have grown.

(For more resources related to this topic, see here.)

As WordPress is the most popular of the PHP systems, I've included some additional information to help with troubleshooting. Even if you're not using WordPress, some of this information may be helpful if you run into issues with other PHP frameworks. Most of the recipes expect that you have a working understanding of PHP systems, so not all of the setup steps for each system will be covered. In order to keep the configurations as simple as possible, I haven't included details such as cache headers or SSL configurations in these recipes.

Configuring Nginx for WordPress

Covering nearly 30 percent of all websites, WordPress is certainly the Content Management System (CMS) of choice for many. Although it came from a blogging background, WordPress is a very powerful CMS for all content types and powers some of the world's busiest websites. By combining it with Nginx, you can deploy a highly scalable web platform. You can view the official WordPress documentation on Nginx at https://codex.wordpress.org/Nginx. We'll also cover some of the more complex WordPress scenarios, including multisite configurations with subdomains and directories. Let's get started.

Getting ready

To compile PHP code and run it via Nginx, the preferred method is via PHP-FPM, a high-speed FastCGI Process Manager. We'll also need to install PHP itself, and for the sake of simplicity, we'll stick with the OS-supplied version.
Those seeking the highest possible performance should ensure they're running PHP 7 (released December 3, 2015), which can offer a 2-3x speed improvement for WordPress. To install PHP-FPM, you should run the following on a Debian/Ubuntu system:

sudo apt-get install php5-fpm

For those running CentOS/RHEL, you should run the following:

sudo yum install php-fpm

As PHP itself is a prerequisite for the php-fpm packages, it will also be installed.

Note: Other packages, such as MySQL, will be required if you intend to run this on a single VPS instance. Consult the WordPress documentation for a full list of requirements.

How to do it…

In this instance, we're simply using a standalone WordPress site, as would be deployed in many personal and business scenarios. This is the typical deployment for WordPress. For ease of management, I've created a dedicated config file just for the WordPress site (/etc/nginx/conf.d/generic-wordpress.conf):

server {
    listen 80;
    server_name wordpressdemo.nginxcookbook.com;
    access_log /var/log/nginx/access.log combined;

    location / {
        root /var/www/html;
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ \.php$ {
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}

Restart Nginx to pick up the new configuration file, and then check your log files for any errors. If you're installing WordPress from scratch, you should see the following:

You can complete the WordPress installation if you haven't already.

How it works…

For the root URL call, we have a new try_files directive, which will attempt to load the files in the order specified, but will fall back to the last parameter if they all fail. For this WordPress example, it means that any static files will be served if they exist on the system, falling back to /index.php?$args if they don't. This can also be very handy for automatic maintenance pages.
The $args rewrite allows the permalinks of the site to be in a much more human-readable form. For example, if you have a working WordPress installation, you can see links such as the one shown in the following image:

Lastly, we process all PHP files via the FastCGI interface to PHP-FPM. In the preceding example, we're referencing the Ubuntu/Debian standard; if you're running CentOS/RHEL, then the path will be /var/run/php-fpm.sock. Nginx simply proxies the connection to the PHP-FPM instance, rather than PHP being part of Nginx itself. This separation allows for greater resource control, especially since the number of incoming requests to the web server doesn't necessarily match the number of PHP requests for a typical website.

There's more…

Take care when copying and pasting any configuration files. It's very easy to miss something and have one thing slightly different in your environment, which will cause issues with the website working as expected. Here's a quick lookup table of various other issues that you may come across:

Error | What to check
502 Bad Gateway | File ownership permissions for the PHP-FPM socket file
404 File Not Found | Check for the missing index index.php directive
403 Forbidden | Check for the correct path in the root directive

Your error log (which defaults to /var/log/nginx/error.log) will generally contain a lot more detail regarding the issue you're seeing than what's displayed in the browser. Make sure you check the error log if you receive any errors.

Hint: Nginx does not support .htaccess files. If you see examples on the web referencing a .htaccess file, these are Apache-specific. Make sure any configurations you're looking at are for Nginx.

WordPress multisite with Nginx

WordPress multisites (also referred to as network sites) allow you to run multiple websites from one codebase. This can reduce the management burden of having separate WordPress installs when you have similar sites. For example, if you have a sporting site with separate news and staff for different regions, you can use a multisite install to accomplish this.
For example, if you have a sporting site with separate news and staff for different regions, you can use a Multisite install to accomplish this. How to do it... To convert a WordPress site into a multisite, you need to add the configuration variable into your config file: define( 'WP_ALLOW_MULTISITE', true ); Under the Tools menu, you'll now see an extra menu called Network Setup. This will present you with two main options, Sub-domains and Sub-directories. This is the two different ways the multisite installation will work. The Sub-domains option have the sites separated by domain names, for example, site1.nginxcookbook.com and site2.nginxcookbook.com. The Sub-directories option mean that the sites are separated by directories, for example, www.nginxcookbook.com/site1 and www.nginxcookbook.com/site2. There's no functional difference between the two, it's simply an aesthetic choice. However, once you've made your choice, you cannot return to the previous state. Once you've made the choice, it will then provide the additional code to add to your wp-config.php file. Here's the code for my example instance, which is subdirectory based: define('MULTISITE', true); define('SUBDOMAIN_INSTALL', false); define('DOMAIN_CURRENT_SITE', 'wordpress.nginxcookbook.com'); define('PATH_CURRENT_SITE', '/'); define('SITE_ID_CURRENT_SITE', 1); define('BLOG_ID_CURRENT_SITE', 1); Because Nginx doesn't support .htaccess files, the second part of the WordPress instructions will not work. Instead, we need to modify the Nginx configuration to provide the rewrite rules ourselves. 
In the existing /etc/nginx/conf.d/wordpress.conf file, you'll need to add the following just after the location / directive: if (!-e $request_filename) { rewrite /wp-admin$ $scheme://$host$uri/ permanent; rewrite ^(/[^/]+)?(/wp-.*) $2 last; rewrite ^(/[^/]+)?(/.*.php) $2 last; } Although the if statements are normally avoided if possible, at this instance, it will ensure the subdirectory multisite configuration works as expected. If you're expecting a few thousand concurrent users on your site, then it may be worthwhile investigating the static mapping of each site. There are plugins to assist with the map generations for this, but they are still more complex compared to the if statement. Subdomains If you've selected subdomains, your code to put in wp-config.php will look like this: define('MULTISITE', true); define('SUBDOMAIN_INSTALL', true); define('DOMAIN_CURRENT_SITE', 'wordpressdemo.nginxcookbook.com'); define('PATH_CURRENT_SITE', '/'); define('SITE_ID_CURRENT_SITE', 1); define('BLOG_ID_CURRENT_SITE', 1); You'll also need to modify the Nginx config as well to add the wildcard in for the server name: server_name *.wordpressdemo.nginxcookbook.com wordpressdemo.nginxcookbook.com; You can now add in the additional sites such as site1.wordpressdemo.nginxcookbook.com and there won't be any changes required for Nginx. See also Nginx recipe page: https://www.nginx.com/resources/wiki/start/topics/recipes/wordpress/ WordPress Codex page: https://codex.wordpress.org/Nginx Running Drupal using Nginx With version 8 recently released and a community of over 1 million supporters, Drupal remains a popular choice when it comes to a highly flexible and functional CMS platform. Version 8 has over 200 new features compared to version 7, aimed at improving both the usability and manageability of the system. This cookbook will be using version 8.0.5. Getting ready This example assumes you already have a working instance of Drupal or are familiar with the installation process. 
You can also follow the installation guide available at https://www.drupal.org/documentation/install.

How to do it...

This recipe is for a basic Drupal configuration, with the Drupal files located in /var/www/vhosts/drupal. Here's the configuration to use:

```
server {
    listen 80;
    server_name drupal.nginxcookbook.com;
    access_log /var/log/nginx/drupal.access.log combined;
    index index.php;
    root /var/www/vhosts/drupal/;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ (^|/)\. {
        return 403;
    }
    location ~ /vendor/.*\.php$ {
        deny all;
        return 404;
    }
    location ~ \.php$|^/update.php {
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_split_path_info ^(.+?\.php)(|/.*)$;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}
```

How it works…

Based on a simple PHP-FPM structure, we make a few key changes specific to the Drupal environment. The first change is as follows:

```
location ~ (^|/)\. {
    return 403;
}
```

We put a block in place for any files beginning with a dot, which are normally hidden and/or system files. This is to prevent accidental information leakage:

```
location ~ /vendor/.*\.php$ {
    deny all;
    return 404;
}
```

Any PHP file within the vendor directory is also blocked, as such files shouldn't be called directly. Blocking the PHP files limits any potential exploit opportunity which could be discovered in third-party code. Lastly, Drupal 8 changed the way the PHP functions are called for updates, which causes any old configuration to break. The location directive for the PHP files looks like this:

```
location ~ \.php$|^/update.php {
```

This is to allow the distinct pattern that Drupal uses, where the PHP filename could be midway through the URI.
We also modify how the FastCGI process splits the string, so that we always get the correct answer:

```
fastcgi_split_path_info ^(.+?\.php)(|/.*)$;
```

See also

Nginx recipe: https://www.nginx.com/resources/wiki/start/topics/recipes/drupal/

Using Nginx with MediaWiki

MediaWiki, most recognized by its use with Wikipedia, is the most popular open source wiki platform available. With features heavily focused on the ease of editing and sharing content, MediaWiki makes a great system to store information you want to continually edit.

Getting ready

This example assumes you already have a working instance of MediaWiki or are familiar with the installation process. For those unfamiliar with the process, it's available online at https://www.mediawiki.org/wiki/Manual:Installation_guide.

How to do it...

The basic Nginx configuration for MediaWiki is very similar to many other PHP platforms. It has a flat directory structure which easily runs with basic system resources. Here's the configuration:

```
server {
    listen 80;
    server_name mediawiki.nginxcookbook.com;
    access_log /var/log/nginx/mediawiki.access.log combined;
    index index.php;
    root /var/www/vhosts/mediawiki/;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ \.php$ {
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}
```

The default installation doesn't use any rewrite rules, which means you'll get URLs such as index.php?title=Main_Page instead of the neater (and more readable) /wiki/Main_Page. To enable the neater format, we need to edit the LocalSettings.php file and add the following lines:

```
$wgArticlePath = "/wiki/$1";
$wgUsePathInfo = TRUE;
```

This allows the URLs to be rewritten in a much neater format.

See also

Nginx recipe: https://www.nginx.com/resources/wiki/start/topics/recipes/mediawiki/

Summary

In this article we learned common PHP scenarios and how to configure them with Nginx.
The first recipe covered how to configure Nginx for WordPress. We then learned how to set up a WordPress multisite. The third recipe discussed how to configure and run Drupal using Nginx, and in the last recipe we learned how to configure Nginx for MediaWiki.
Packt
23 Jan 2017
42 min read

The Storage - Apache Cassandra

In this article by Raúl Estrada, the author of the book Fast Data Processing Systems with SMACK Stack, we will learn about Apache Cassandra. We have reached the part where we talk about storage: the C in the SMACK stack refers to Cassandra. The reader may wonder, why not use a conventional database? The answer is that Cassandra is the database that propels giants like Walmart, CERN, Cisco, Facebook, Netflix, and Twitter. Spark uses a lot of Cassandra's power; application efficiency is greatly increased using the Spark Cassandra Connector. This article has the following sections:

A bit of history
NoSQL
Apache Cassandra installation
Authentication and authorization (roles)
Backup and recovery
Spark + a connector

(For more resources related to this topic, see here.)

A bit of history

In Greek mythology, there was a priestess who was chastised for her treason against the god Apollo. She asked for the power of prophecy in exchange for a carnal meeting; however, she failed to fulfill her part of the deal. So, she received a punishment: she would have the power of prophecy, but no one would ever believe her forecasts. This priestess's name was Cassandra.

Moving to more recent times, say 50 years ago, the world of computing has seen big changes. In 1960, the HDD (Hard Disk Drive) took precedence over magnetic tapes, which facilitated data handling. In 1966, IBM created the Information Management System (IMS) for the Apollo space program, from whose hierarchical model IBM DB2 later developed. In the 1970s, a model appeared that fundamentally changed existing data storage methods: the relational data model, devised by Codd as an alternative to IBM's IMS and its mode of organizing and storing data. In 1985, his work presented 12 rules that a database should meet in order to be considered a relational database. Then the Web (especially social networks) appeared and demanded the storage of large amounts of data.
The Relational Database Management System (RDBMS) scales poorly against the actual costs of databases: the number of users, the amount of data, and the response time, that is, the time it takes to execute a specific query on a database. In the beginning, it was possible to solve this through vertical scaling: the server machine is upgraded with more RAM, faster processors, and larger and faster HDDs. This mitigates the problem, but does not make it disappear. When the same problem occurs again and the server cannot be upgraded any further, the only solution is to add a new server, which itself may hide unplanned costs: OS licenses, Database Management System (DBMS) licenses, and so on, not to mention data replication, transactions, and data consistency under normal use.

One solution to such problems is the use of NoSQL databases. NoSQL was born from the need to process large amounts of data on large hardware platforms built by clustering servers. The term NoSQL is perhaps not precise; a more appropriate term would be Not Only SQL. It is used for several non-relational databases such as Apache Cassandra, MongoDB, Riak, and Neo4J, which have become more widespread in recent years.

NoSQL

We will read NoSQL as Not Only SQL (SQL: Structured Query Language). NoSQL is a distributed database with an emphasis on scalability, high availability, and ease of administration; the opposite of established relational databases. Don't think of it as a direct replacement for RDBMS; rather, it is an alternative or a complement. The focus is on avoiding unnecessary complexity: a solution for data storage according to today's needs, without a fixed schema. Due to its distributed nature, cloud computing is a great NoSQL sponsor.

A NoSQL database model can be:

Key-value/tuple-based: For example, Redis, Oracle NoSQL (ACID compliant), Riak, Tokyo Cabinet/Tyrant, Voldemort, Amazon Dynamo, and Memcached; used by LinkedIn, Amazon, BestBuy, GitHub, and AOL.
Wide row/column-oriented: For example, Google BigTable, Apache Cassandra, HBase/Hypertable, and Amazon SimpleDB; used by Amazon, Google, Facebook, and RealNetworks.

Document-based: For example, CouchDB (ACID compliant), MongoDB, TerraStore, and Lotus Notes (possibly the oldest); used in various financial and other relevant institutions: the US Army, SAP, MTV, and SourceForge.

Object-based: For example, db4o, Versant, Objectivity, and NEO; used by Siemens, China Telecom, and the European Space Agency.

Graph-based: For example, Neo4J, InfiniteGraph, VertexDB, and FlockDB; used by Twitter, Nortler, Ericsson, Qualcomm, and Siemens.

XML, multivalue, and others.

In Table 4-1, we have a comparison of the mentioned data models:

Model | Performance | Scalability | Flexibility | Complexity | Functionality
key-value | high | high | high | low | depends
column | high | high | high | low | depends
document | high | high | high | low | depends
graph | depends | depends | high | high | graph theory
RDBMS | depends | depends | low | moderate | relational algebra

Table 4-1: Categorization and comparison of NoSQL data models by Scofield and Popescu

NoSQL or SQL? This is the wrong question. It would be better to ask: what do we need? Basically, it all depends on the application's needs. Nothing is black and white. If consistency is essential, use an RDBMS. If we need high availability, fault tolerance, and scalability, then use NoSQL. The recommendation for a new project is to evaluate the best of each world. It doesn't make sense to force NoSQL where it doesn't fit, because its benefits (scalability, read/write speeds an entire order of magnitude higher, a soft data model) are conditional advantages, achieved only on the set of problems they were designed to solve. It is necessary to carefully weigh, beyond marketing, what exactly is needed, what kind of strategy is required, and how it will be applied to solve our problem. Consider using a NoSQL database only when you decide that this is a better solution than SQL.
The challenges for NoSQL databases are: elastic scaling, cost-effectiveness, simplicity, and flexibility. In Table 4-2, we compare the two models:

NoSQL | RDBMS
Schema-less | Relational schema
Scalable read/write | Scalable read
Auto high availability | Custom high availability
Limited queries | Flexible queries
Eventual consistency | Consistency
BASE | ACID

Table 4-2: Comparison of NoSQL and RDBMS

CAP Brewer's theorem

In 2000, Portland, Oregon, in the United States, hosted the nineteenth international symposium on principles of distributed computing, where the keynote speaker was Eric Brewer, a professor at UC Berkeley. In his presentation, among other things, he said that there are three basic system requirements that have a special relationship when designing and implementing applications in a distributed environment, and that a distributed system can have at most two of the three properties (which is the basis of his theorem). The three properties are:

Consistency: The data on one node must be the same data when read from a second node; the second node must show exactly the same data (there could be a delay if someone in between is performing an update, but not different data).

Availability: A failure on one node doesn't mean the loss of its data; the system must be able to return the requested data.

Partition tolerance: In the event of a breakdown in communication between two nodes, the system should still work, meaning the data will still be available.

In Figure 4-1, we show the CAP Brewer's theorem with some examples.

Figure 4-1: CAP Brewer's theorem

Apache Cassandra installation

In the Facebook laboratories, although not visible to the public, new software is developed; for example, the junction of two concepts involving the development departments of Google and Amazon. In short, Cassandra is defined as a distributed database.
Since the beginning, the authors took on the task of creating a massively scalable, decentralized database, optimized for read operations when possible, allowing data structures to be modified painlessly, and, with all this, not difficult to manage. The solution was found by combining two existing technologies: Google's BigTable and Amazon's Dynamo. One of the two authors, A. Lakshman, had earlier worked on BigTable, and he borrowed the data model layout from it, while Dynamo contributed the overall distributed architecture.

Cassandra is written in Java, and for good performance it requires the latest possible JDK version. Cassandra 1.0 used another open source project, Thrift, for client access, which also came from Facebook and is currently an Apache Software project. In Cassandra 2.0, Thrift was removed in favor of CQL. Thrift was not made just for Cassandra; it is a software library tool and code generator for accessing backend services.

Cassandra administration is done with the command-line tools or via the JMX console; the default installation allows us to use additional client tools. Since this is a server cluster, it has different administration rules, and it is always good to review the documentation to take advantage of other people's experiences.

Cassandra has managed very demanding tasks successfully. It is often used on sites serving a huge number of users (such as Twitter, Digg, Facebook, and Cisco) that relatively often change their complex data models to meet the challenges that come later, and usually do not have to deal with expensive hardware or licenses. At the time of writing, the Cassandra homepage (http://cassandra.apache.org) says that Apple Inc., for example, has a 75,000-node cluster storing 10 petabytes.

Data model

The storage model of Cassandra can be seen as a sorted HashMap of sorted HashMaps. Cassandra is a database that stores rows in key-value form.
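A toy way to picture this sorted-map-of-sorted-maps model is sketched below in Python. The names and layout are illustrative only; this is not Cassandra's on-disk format:

```python
import time

# Toy sketch: a keyspace as a map of row keys, where each row is a map of
# column names to (value, timestamp) pairs, mirroring the column triplet.
keyspace = {}

def insert(row_key, column, value):
    # The client supplies the value; here we also stamp the write time.
    keyspace.setdefault(row_key, {})[column] = (value, time.time())

insert("employee:1", "name", "Raul")
insert("employee:1", "last_name", "Estrada")

# Reading a row yields its columns sorted by column name:
row = {name: keyspace["employee:1"][name][0]
       for name in sorted(keyspace["employee:1"])}
print(row)  # {'last_name': 'Estrada', 'name': 'Raul'}
```

The nested-dictionary shape makes it clear why two rows of the same "table" need not share the same set of columns.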
In this model, the number of columns is not predefined in advance as in standard relational databases; a single row can contain any number of columns. The column (Figure 4-2: Column) is the smallest atomic unit of the model. Each column consists of a triplet: a name, a value (stored as a series of bytes without regard to the source type), and a timestamp (the time used to determine the most recent record).

Figure 4-2: Column

All data in the triplet comes from the client, even the timestamp. Thus, a row consists of a key and a set of data triplets (Figure 4-3). Here is how the super column looks:

Figure 4-3: Super column

In addition, columns can be grouped into so-called column families (Figure 4-4: Column family), which would be somewhat equivalent to a table and can be indexed:

Figure 4-4: Column family

A higher logical unit is the super column family (as shown in Figure 4-5: Super column family), in which columns contain other columns:

Figure 4-5: Super column family

Above all is the key space (as shown in Figure 4-6: Cluster with key spaces), which is equivalent to a relational schema and is typically used by one application. The data model is simple, but at the same time very flexible, and it takes some time to become accustomed to the new way of thinking while rejecting all of SQL's syntactic luxury. The replication factor is unique per keyspace. Moreover, a keyspace can span multiple clusters and have different replication factors for each of them; this is used in geo-distributed deployments.

Figure 4-6: Cluster with key spaces

Data storage

Apache Cassandra is designed to process large amounts of data in a short time; this way of storing data is taken from its big brother, Google's BigTable. Cassandra has a commit log file in which all new data is recorded in order to ensure durability.
When data is successfully written to the commit log file, the freshest data is stored in a memory structure called the memtable (Cassandra considers a write successful only once the same information is in both the commit log and the memtable). Data within a memtable is sorted by row key. When a memtable is full, its contents are copied to the hard drive into a structure called a Sorted String Table (SSTable). The process of copying content from the memtable into an SSTable is called a flush. Flushes are performed periodically, although one can also be carried out manually (for example, before restarting a node) through the nodetool flush command.

The SSTable provides a fixed, sorted map of row keys and values. Data entered into one SSTable cannot be changed, but it is possible to enter new data. The internal structure of an SSTable consists of a series of blocks of 64 KB (the block size can be changed); internally, an SSTable holds a block index used to locate blocks. One data row is usually stored within several SSTables, so reading a single data row is performed in the background by combining the SSTables and the memtable (which has not yet been flushed). In order to optimize this process, Cassandra uses a memory structure called a Bloom filter. Every SSTable has a Bloom filter that checks whether the requested row key might be in the SSTable before looking it up on disk.

In order to reduce row fragmentation across several SSTables, Cassandra performs another background process: compaction, a merge of several SSTables into a single SSTable. Fragmented data is combined based on the values of the row key. After creating a new SSTable, the old SSTables are labeled as outdated and marked for deletion by the garbage collection process. Compaction has different strategies, size-tiered compaction and leveled compaction, and both have their own benefits for different scenarios.

Installation

To install Cassandra, go to http://www.planetcassandra.org/cassandra/. Installation is simple.
After downloading the compressed files, extract them and change a couple of settings in the configuration files (set the new directory path). Run the startup scripts to activate a single node and the database server. Of course, it is possible to use Cassandra on only one node, but then we lose its main power: distribution.

The process of adding new servers to the cluster is called bootstrapping and is generally not a difficult operation. Once all the servers are active, they form a ring of nodes, none of which is central, meaning there is no main server. Within the ring, information is propagated to all servers through a gossip protocol. In short, one node transmits information about new instances to only some of its known colleagues, and if one of them already knows about the new node from other sources, the first node's propagation is stopped. Thus, information about a node is propagated through the network in an efficient and rapid way. For a new node's activation, it is necessary to seed its information to at least one existing server in the cluster so that the gossip protocol works.

Each server receives a numeric identifier, and each of the ring nodes stores part of the data. Which nodes store a given piece of information depends on the MD5 hash of the key (a key-value combination), as shown in Figure 4-7: Nodes within a cluster.

Figure 4-7: Nodes within a cluster

The nodes are in a circular stack, that is, a ring, and each record is stored on multiple nodes. In case one of them fails, the data is still available. Nodes take ownership according to their identifier's integer range; that is, if the calculated value falls into a node's range, then the data is saved there. Saving is not performed on only one node (more is better); an operation is considered a success if the data is correctly stored on as many nodes as possible. All of this is parameterized.
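The hash-based placement described above can be sketched as follows. This is an illustrative toy: the node names are invented, and tokens here are derived from node names, whereas a real cluster assigns token ranges through configuration:

```python
import hashlib
from bisect import bisect_right

# Hypothetical four-node ring; each node's token is the MD5 hash of its name.
nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = sorted((int(hashlib.md5(n.encode()).hexdigest(), 16), n) for n in nodes)
tokens = [token for token, _ in ring]

def primary_replica(row_key):
    # Hash the key, then walk clockwise to the first node whose token is
    # >= the key's hash, wrapping around at the end of the ring.
    key_hash = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
    index = bisect_right(tokens, key_hash) % len(ring)
    return ring[index][1]

print(primary_replica("employee:1"))
```

In a real deployment the record would then also be copied to the next nodes clockwise until the keyspace's replication factor is satisfied.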
In this way, Cassandra achieves sufficient data consistency and provides greater robustness of the entire system: if one node in the ring fails, it is always possible to retrieve valid information from the other nodes. In the event that a node comes back online again, it is necessary to synchronize its data, which is achieved through the read operation. The data is read from all the ring servers, and a node keeps just the data accepted as valid, that is, the most recent data; the comparison is made according to the timestamps of the records. The nodes that don't have the latest information refresh their data in a low-priority background process. Although this brief description of the architecture makes it sound like it is full of holes, in reality everything works flawlessly. Indeed, more servers in the game implies a better general situation.

DataStax OpsCenter

In this section, we perform the Cassandra installation on a computer with a Windows operating system (to prove that nobody is excluded). Installing software under the Apache open license can be complicated on a Windows computer, especially if it is new software, such as Cassandra. To make things simpler, we will use a distribution package for easy installation, start-up, and work with Cassandra on a Windows computer. The distribution used in this example is called DataStax Community Edition. DataStax contains Apache Cassandra, along with the Cassandra Query Language (CQL) tool and the free version of DataStax OpsCenter for managing and monitoring the Cassandra cluster. We can say that OpsCenter is a kind of DBMS console for NoSQL databases. After downloading the installer from DataStax's official site, the installation process is quite simple; just keep in mind that DataStax supports Windows 7 and Windows Server 2008, and that DataStax on a Windows computer requires the Chrome or Firefox web browser (Internet Explorer is not supported).
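The timestamp comparison used during the read described earlier amounts to a last-write-wins rule. The sketch below illustrates the idea; the values and timestamps are invented for demonstration:

```python
# Last-write-wins sketch: each replica answers with a (value, timestamp)
# pair for the same column, and the newest timestamp is accepted as valid.
def resolve(replica_answers):
    return max(replica_answers, key=lambda pair: pair[1])[0]

# Three replicas hold different versions of the same column:
answers = [("Raul", 100), ("R. Estrada", 250), ("R.", 180)]
print(resolve(answers))  # R. Estrada
```

The replicas that answered with older timestamps would then be repaired in the low-priority background process mentioned above.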
When starting DataStax on a Windows computer, DataStax will open as in Figure 4-8: DataStax OpsCenter.

Figure 4-8: DataStax OpsCenter

DataStax consists of a control panel (dashboard), in which we review the events, performance, and capacity of the cluster and also see how many nodes belong to our cluster (in this case, a single node). In cluster control, we can switch between different types of views (ring, physical, list). Adding a new key space (the equivalent of creating a database in a classic DBMS) is done through the CQL shell using CQL, or using DataStax data modeling. Also, using the data explorer, we can view the column families and the database.

Creating a key space

The main tool for managing Cassandra, CQL, runs in a console interface, and this tool is used to add new key spaces, within which we will create a column family. A key space is created as follows:

```
cqlsh> create keyspace hr with strategy_class='SimpleStrategy'
       and strategy_options:replication_factor=1;
```

After opening the CQL shell, the create keyspace command makes a new key space; the strategy_class='SimpleStrategy' parameter specifies the replication strategy class used when creating the new key space. The strategy_options:replication_factor=1 option controls how many copies of each row are kept in the cluster; a replication factor of 1 produces only one copy of each row (if we set it to 2, we will have two copies of each row).

```
cqlsh> use hr;
cqlsh:hr> create columnfamily employee (sid int primary key,
      ... name varchar,
      ... last_name varchar);
```

There are two types of replication strategies, SimpleStrategy and NetworkTopologyStrategy, whose syntax is as follows:

```
{ 'class' : 'SimpleStrategy', 'replication_factor' : <integer> };
{ 'class' : 'NetworkTopologyStrategy'[, '<data center>' : <integer>, '<data center>' : <integer>] . . . };
```

When NetworkTopologyStrategy is configured as the replication strategy, we set up one or more virtual data centers.
To create a new column family, we use the create columnfamily command: select the desired key space and, with the command create columnfamily employee shown above, we create a new table in which we define sid, an integer, as the primary key, along with the other attributes, name and last_name. To make a data entry in the column family, we use the insert command:

```
insert into <table_name> (<attribute_1>, <attribute_2>, ... <attribute_n>) values (...);
```

When filling data tables, we use the common SQL syntax:

```
cqlsh:hr> insert into employee (sid, name, last_name) values (1, 'Raul', 'Estrada');
```

So we enter data values. With the select command we can review our insert:

```
cqlsh:hr> select * from employee;

 sid | name | last_name
-----+------+-----------
   1 | Raul | Estrada
```

Authentication and authorization (roles)

In Cassandra, authentication and authorization must be configured in the cassandra.yaml file and two additional files. The first file assigns rights to users over key spaces and column families, while the second assigns passwords to users. These files are called access.properties and passwd.properties, and they are located in the Cassandra installation directory. They can be opened using our favorite text editor in order to be configured.

Setting up a simple authentication and authorization

The steps are as follows:

In the access.properties file, we add the access rights of users and their permissions to read and write certain key spaces and column families.

Syntax:
```
keyspace.columnfamily.permits = users
```

Example 1:
```
hr <rw> = restrada
```

Example 2:
```
hr.cars <ro> = restrada, raparicio
```

In example 1, we give full rights on the key space hr to restrada, while in example 2 we give the users read-only rights on the column family cars.
In the passwd.properties file, usernames are matched to passwords; on the left side of the equals sign we write the username, and on the right side the password.

Example:
```
restrada = Swordfish01
```

After we change the files, before restarting Cassandra, it is necessary to type the following command in the terminal in order for the changes to be reflected in the database:

```
$ cd <installation_directory>
$ sh bin/cassandra -f -Dpasswd.properties=conf/passwd.properties -Daccess.properties=conf/access.properties
```

Note: this third step of setting up authentication and authorization doesn't work on Windows computers and is only needed on Linux distributions. Also, note that user authentication and authorization should not be solved through Cassandra; for safety reasons, this function is not included in the latest Cassandra versions.

Backup

The point of Cassandra being a NoSQL database is that when we create a single node, copies of it are made. Copying the database to other nodes, and the exact number of copies, depends on the replication factor established when we create a new key space. But like any other standard SQL database, Cassandra also offers the creation of a backup on the local computer. Cassandra creates a copy of the database using a snapshot. It is possible to make a snapshot of all the key spaces, or of just one column family. It is also possible to make a snapshot of the entire cluster using the parallel SSH tool (pssh). If the user decides to snapshot the entire cluster, it can be reinitiated using incremental backups on each node. Incremental backups provide a way to configure each node separately, by setting the incremental_backups flag to true in cassandra.yaml. When incremental backups are enabled, Cassandra hard-links each flushed SSTable to a backups directory under the keyspace data directory. This allows storing backups offsite without transferring entire snapshots.
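The hard-link trick behind incremental backups can be demonstrated with a few lines of Python. The file names and directories below are invented for the demonstration; the point is that linking a flushed SSTable into a backups directory duplicates no data:

```python
import os
import tempfile

# Toy sketch of the incremental-backup idea: hard-link a "flushed SSTable"
# into a backups directory, so the backup costs no extra disk space.
data_dir = tempfile.mkdtemp()
backups = os.path.join(data_dir, "backups")
os.makedirs(backups)

sstable = os.path.join(data_dir, "employee-1-Data.db")
with open(sstable, "w") as f:
    f.write("sstable contents")

# The backup entry is a second name for the same inode, not a copy.
os.link(sstable, os.path.join(backups, "employee-1-Data.db"))

print(os.stat(sstable).st_nlink)  # 2
```

Since SSTables are immutable once flushed, the linked file can later be shipped offsite at leisure without fear of it changing underneath the transfer.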
To snapshot a key space, we use the nodetool command.

Syntax:
```
nodetool snapshot -cf <ColumnFamily> <keyspace> -t <snapshot_name>
```

Example:
```
nodetool snapshot -cf cars hr -t snapshot1
```

The snapshot is stored in the Cassandra installation directory:

```
C:\Program Files\DataStax Community\data\data\en\example\snapshots
```

Compression

Compression increases the capacity of the cluster nodes by reducing the data size on disk. It also enhances the server's disk performance. Compression in Cassandra works best when compressing a column family with a lot of columns, when each row has the same columns, or when we have a lot of common columns with the same data. A good example of this is a column family that contains user information such as username and password, because it is likely to have the same data repeated. The more the same data is spread across the rows, the higher the compression ratio is.

Column family compression is configured with the Cassandra-CLI tool. It is possible to update existing column families or create a new column family with specific compression settings, for example, the compression shown here:

```
CREATE COLUMN FAMILY users
  WITH comparator = 'UTF8Type'
  AND key_validation_class = 'UTF8Type'
  AND column_metadata = [
    (column_name: name, validation_class: UTF8Type)
    (column_name: email, validation_class: UTF8Type)
    (column_name: country, validation_class: UTF8Type)
    (column_name: birth_date, validation_class: LongType)
  ]
  AND compression_options=(sstable_compression:SnappyCompressor, chunk_length_kb:64);
```

We will see this output:

```
Waiting for schema agreement....
... schemas agree across the cluster
```

After opening the Cassandra-CLI, we need to choose the key space where the new column family will be. When creating a column family, it is necessary to state that the comparator (UTF8Type) and key_validation_class are of the same type.
With this we ensure that when executing the command we won't get an exception (generated by a bug). After listing the column names, we set compression_options, which has two possible classes: SnappyCompressor, which provides faster data compression, or DeflateCompressor, which provides a higher compression ratio. The chunk_length adjusts the compression block size in kilobytes.

Recovery

Recovering a key space snapshot requires all the snapshots made for a certain column family. If you use incremental backups, it is also necessary to provide the incremental backups created after the snapshot. There are multiple ways to perform a recovery from a snapshot. We can use the SSTable loader tool (used exclusively on Linux distributions) or recreate the installation.

Restart node

If the recovery is running on one node, we must first shut down the node. If the recovery is for the entire cluster, it is necessary to restart each node in the cluster. Here is the procedure:

1. Shut down the node.
2. Delete all the log files in: C:\Program Files\DataStax Community\logs
3. Delete all .db files within the specified key space and column family: C:\Program Files\DataStax Community\data\data\en\cars
4. Locate all snapshots related to the column family: C:\Program Files\DataStax Community\data\data\en\cars\snapshots\1351279613842
5. Copy them to: C:\Program Files\DataStax Community\data\data\en\cars
6. Restart the node.

Printing schema

Through DataStax OpsCenter or the Apache Cassandra CLI we can obtain the schemas (key spaces) with their associated column families, but there is no way to export or print them. Apache Cassandra is not an RDBMS, and it is not possible to obtain a relational model schema from the key space database.

Logs

Apache Cassandra and DataStax OpsCenter both use the Apache log4j logging service API.
In the directory where DataStax is installed, under apache-cassandra and opscenter, is the conf directory, where the files log4j-server.properties and log4j-tools.properties are located for apache-cassandra, and log4j.properties for OpsCenter. The parameters of a log4j file can be modified using a text editor. Log files are stored in plain text in the ...\DataStax Community\logs directory; here it is possible to change the directory where the log files are stored.

Configuring log4j

log4j configuration files are divided into several parts, where the parameters specify how collected data is processed and written to the log files.

For the root logger:

```
# RootLogger level
log4j.rootLogger=INFO,stdout,R
```

This section defines the level of the events recorded in the log file. As we can see in Table 4-3, the log level can be:

Level | Record
ALL | The lowest level; all events are recorded in the log file
DEBUG | Detailed information about events
ERROR | Information about runtime errors or unexpected events
FATAL | Critical error information
INFO | Information about the state of the system
OFF | The highest level; logging to the file is turned off
TRACE | Detailed debug information
WARN | Information about potentially adverse events (unwanted/unexpected runtime errors)

Table 4-3: log4j log levels

For standard output (stdout):

```
# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n
```

Through the standard output writer, we define the appearance of the logged data. The ConsoleAppender class is used to write the log entries, and the ConversionPattern defines the layout of the data written to the log. With this configuration in place, we can see what the data stored in a log file looks like.

Log file rotation

In this example, we rotate the log when it reaches 20 MB, and we retain just 50 log files.
# rolling log file
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.maxFileSize=20MB
log4j.appender.R.maxBackupIndex=50
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n

This part configures the rolling log files. The RollingFileAppender class inherits from FileAppender, and its role is to back up the log file when it reaches a given size (in this case, 20 MB). The RollingFileAppender class has several methods; these two are the most used:

public void setMaxFileSize( String value )

Defines the maximum file size and can take a value from 0 to 2^63 using the abbreviations KB, MB, or GB. The value is converted automatically (in the example, the file size is limited to 20 MB).

public void setMaxBackupIndex( int maxBackups )

Defines how many backup files are kept before the oldest log file is deleted (in this case, 50 log files are retained).

To set the location where the log files will be stored, use:

# Edit the next line to point to your logs directory
log4j.appender.R.File=C:/Program Files (x86)/DataStax Community/logs/cassandra.log

User activity log

The log4j API has the ability to store user activity logs. In production, it is not recommended to use the DEBUG or TRACE log levels.

Transaction log

As mentioned earlier, any new data is first stored in the commit log file. Within the cassandra.yaml configuration file, we can set the location where the commit log files will be stored:

# commit log
commitlog_directory: "C:/Program Files (x86)/DataStax Community/data/commitlog"

SQL dump

It is not possible to make a SQL dump of the database; we can only snapshot it.

CQL

CQL (Cassandra Query Language) is a language similar to SQL. With this language, we run queries on a key space. There are several ways to interact with a key space; in the previous section, we showed how to do it using a shell called the CQL shell.
Since CQL is the primary way to interact with Cassandra, Table 4-4, Shell command summary, lists the main commands that can be used in the CQL shell:

Command       Description
cqlsh         Starts the Cassandra query language shell.
CAPTURE       Captures command output and appends it to a file.
CONSISTENCY   Shows the current consistency level, or given a level, sets it.
COPY          Imports and exports CSV (comma-separated values) data to and from Cassandra.
DESCRIBE      Provides information about the connected Cassandra cluster, or about the data objects stored in the cluster.
EXPAND        Formats the output of a query vertically.
EXIT          Terminates cqlsh.
PAGING        Enables or disables query paging.
SHOW          Shows the Cassandra version, host, or tracing information for the current cqlsh client session.
SOURCE        Executes a file containing CQL statements.
TRACING       Enables or disables request tracing.

Table 4-4: Shell command summary

For more detailed information on shell commands, visit: http://docs.datastax.com/en/cql/3.1/cql/cql_reference/cqlshCommandsTOC.html

CQL commands

CQL is very similar to SQL, as we have already seen in this article. Table 4-5, CQL command summary, lists the language commands. CQL, like SQL, is based on statements. These statements manipulate data and work with its logical container, the key space. Like SQL statements, they must end with a semicolon (;).

Command           Description
ALTER KEYSPACE    Change property values of a keyspace.
ALTER TABLE       Modify the column metadata of a table.
ALTER TYPE        Modify a user-defined type. Cassandra 2.1 and later.
ALTER USER        Alter existing user options.
BATCH             Write multiple DML statements.
CREATE INDEX      Define a new index on a single column of a table.
CREATE KEYSPACE   Define a new keyspace and its replica placement strategy.
CREATE TABLE      Define a new table.
CREATE TRIGGER    Register a trigger on a table.
CREATE TYPE       Create a user-defined type. Cassandra 2.1 and later.
CREATE USER       Create a new user.
DELETE            Remove entire rows or one or more columns from one or more rows.
DESCRIBE          Provide information about the connected Cassandra cluster, or about the data objects stored in the cluster.
DROP INDEX        Drop the named index.
DROP KEYSPACE     Remove the keyspace.
DROP TABLE        Remove the named table.
DROP TRIGGER      Remove the registration of a trigger.
DROP TYPE         Drop a user-defined type. Cassandra 2.1 and later.
DROP USER         Remove a user.
GRANT             Provide access to database objects.
INSERT            Add or update columns.
LIST PERMISSIONS  List permissions granted to a user.
LIST USERS        List existing users and their superuser status.
REVOKE            Revoke user permissions.
SELECT            Retrieve data from a Cassandra table.
TRUNCATE          Remove all data from a table.
UPDATE            Update columns in a row.
USE               Connect the client session to a keyspace.

Table 4-5: CQL command summary

For more detailed information on CQL commands, visit: http://docs.datastax.com/en/cql/3.1/cql/cql_reference/cqlCommandsTOC.html

DBMS Cluster

Cassandra is designed as a database working in a cluster, that is, a database spanning multiple nodes. Although Cassandra clusters are primarily built on Linux servers, Cassandra also offers the possibility of building clusters on Windows computers. The first task that must be done before setting up a cluster on Windows computers is opening the firewall for the Cassandra DBMS and DataStax OpsCenter. The ports that must be open for Cassandra are 7000 and 9160; for OpsCenter, the ports are 7199, 8888, 61620, and 61621. These are the default ports when installing Cassandra and OpsCenter; however, we can specify different ports if necessary. Immediately after installing Cassandra and OpsCenter on a Windows computer, it is necessary to stop the DataStax OpsCenter service and the DataStax OpsCenter agent, as shown in Figure 4-9, Microsoft Windows display services.
Figure 4-9: Microsoft Windows display services

One of Cassandra's advantages is that it automatically distributes incoming data among the computers of the cluster using its partitioning algorithm. To perform this successfully, it is necessary to assign a token to each computer in the cluster. The token is a numeric identifier that indicates the computer's position in the cluster and the range of data that computer is responsible for. For token generation, we can use the Python interpreter that ships with the Cassandra installation, located in DataStax's installation directory. In the code for generating tokens, the variable num=2 refers to the number of computers in the cluster:

$ python -c "num=2; print \"\n\".join([(\"token %d: %d\" % (i, (i*(2**127)/num))) for i in range(0,num)])"

We will see an output like this:

token 0: 0
token 1: 85070591730234615865843651857942052864

It is necessary to preserve the token values because they will be required in the following steps. We now need to configure the cassandra.yaml file, which we already met in the authentication and authorization section. The cassandra.yaml file must be configured separately on each computer in the cluster. After opening the file, you need to make the following changes:

initial_token: On each computer in the cluster, copy one of the generated tokens. Start from token 0 and assign each computer a unique token.
listen_address: Enter the IP address of the computer being configured.
seeds: Enter the IP address of the primary (main) node in the cluster.

Once the file is modified and saved, you must restart the DataStax Community Server, as we already saw. This should be done only on the primary node. After that, it is possible to check whether the cluster nodes can communicate using the node tool.
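The one-liner above is easier to read and verify as a short script. This is only an illustration of the same formula (evenly spacing the tokens over the 0 to 2**127 ring of the RandomPartitioner), not a DataStax utility:

```python
def generate_tokens(num_nodes):
    """Evenly partition the 0..2**127 token ring among num_nodes nodes,
    using the same formula as the one-liner above."""
    return [i * (2 ** 127) // num_nodes for i in range(num_nodes)]

for i, token in enumerate(generate_tokens(2)):
    print("token %d: %d" % (i, token))
```

For two nodes, the formula yields token 0: 0 and token 1: 2**126, that is, 85070591730234615865843651857942052864.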
In the node tool, enter the following command:

nodetool -h localhost ring

If the cluster works, we will see a result like the following:

Address  DC           Rack   Status  State   Load      Owns    Token
         datacenter1  rack1  Up      Normal  13.41 KB  50.0%   0
         datacenter1  rack1  Up      Normal  6.68 KB   50.0%   85070591730234615865843651857942052864

If the cluster is operating normally, select which computer will be the primary OpsCenter (it does not have to be the primary node). Then, on that computer, open opscenter.conf, which can be found in DataStax's installation directory. In that file, find the webserver section and set the interface parameter to the value 0.0.0.0. After that, in the agent section, change the incoming_interface parameter to your computer's IP address. In DataStax's installation directory (on each computer in the cluster), we must also configure the address.yaml file. Within these files, set the stomp_interface and local_interface parameters to the IP address of the computer where the file is configured. Now the primary computer should run the DataStax OpsCenter Community and DataStax OpsCenter agent services. After that, run the DataStax OpsCenter agent service on all the other nodes. At this point, it is possible to open DataStax OpsCenter in an Internet browser; OpsCenter should look like Figure 4-10, Display cluster in OpsCenter.

Figure 4-10: Display cluster in OpsCenter

Deleting the database

In Apache Cassandra, there are several ways to delete the database (key space) or parts of the database (a column family, individual rows within a column family, and so on). Although the easiest way to perform a deletion is using the DataStax OpsCenter data modeling tool, there are also commands that can be executed through the Cassandra CLI or the CQL shell.
CLI delete commands

In Table 4-6, we have the CLI delete commands:

CLI command        Function
del                Delete a super column, a column from the column family, or rows within certain columns
drop columnfamily  Delete a column family and all the data contained in it
drop keyspace      Delete the key space, all its column families, and the data contained in them
truncate           Delete all the data from the selected column family

Table 4-6: CLI delete commands

CQL shell delete commands

In Table 4-7, we have the CQL shell delete commands:

CQL shell command   Function
alter_drop          Delete a specified column from the column family
delete              Delete one or more columns from one or more rows of the selected column family
delete_columns      Delete columns from the column family
delete_where        Delete individual rows
drop_table          Delete the selected column family and all the data contained in it
drop_columnfamily   Delete the column family and all the data contained in it
drop_keyspace       Delete the key space, all its column families, and all the data contained in them
truncate            Delete all data from the selected column family

Table 4-7: CQL shell delete commands

DB and DBMS optimization

Cassandra optimization parameters are specified in the cassandra.yaml file. These properties are used to adjust performance and to specify the use of system resources such as disk I/O, memory, and CPU.

column_index_size_in_kb
Initial value: 64 KB
Range of values: -
A column index is added to each row once the row data exceeds the default size of 64 KB.

commitlog_segment_size_in_mb
Initial value: 32 MB
Range of values: 8-1024 MB
Determines the size of a commit log segment. A commit log segment may be archived, deleted, or recycled after its data has been flushed to SSTables.

commitlog_sync
Initial value: -
Range of values: -
The method Cassandra uses to acknowledge writes. It is closely correlated with commitlog_sync_period_in_ms, which controls how often the commit log is synchronized with the disk.
commitlog_sync_period_in_ms
Initial value: 1000 ms
Range of values: -
Determines how often the commit log is sent to disk when commitlog_sync is in periodic mode.

commitlog_total_space_in_mb
Initial value: 4096 MB
Range of values: -
When the size of the commit log reaches this value, Cassandra removes the oldest parts of the commit log. This reduces the amount of data to replay and speeds up node startup.

compaction_preheat_key_cache
Initial value: true
Range of values: true/false
When set to true, cached row keys are tracked during compaction and, afterwards, re-cached at their new location in the compacted SSTable.

compaction_throughput_mb_per_sec
Initial value: 16
Range of values: 0-32
Throttles compaction to the given total throughput across the system. Faster data insertion requires faster compaction.

concurrent_compactors
Initial value: 1 per CPU core
Range of values: depends on the number of CPU cores
Sets the number of simultaneous compaction processes on the node.

concurrent_reads
Initial value: 32
Range of values: -
When there is more data than can fit in memory, the bottleneck is reading data from disk.

concurrent_writes
Initial value: 32
Range of values: -
Writes in Cassandra do not depend on I/O limitations; concurrent writes depend on the number of CPU cores. The recommended number of cores is 8.

flush_largest_memtables_at
Initial value: 0.75
Range of values: -
Flushes the biggest memtable to free heap space. This parameter can be used as an emergency measure to prevent out-of-memory errors.

in_memory_compaction_limit_in_mb
Initial value: 64
Range of values: -
Limits the size of rows compacted in memory; larger rows use a slower compaction method.

index_interval
Initial value: 128
Range of values: 128-512
Controls the sampling of entries from the primary row index as a trade-off between space and time: the larger the sampling interval, the less effective the index, but the less memory it uses.
In technical terms, the interval corresponds to the number of index entries skipped between samples.

memtable_flush_queue_size
Initial value: 4
Range of values: at minimum, set to the maximum number of secondary indexes created on a single column family
Indicates the number of full memtables allowed to wait for a flush writer thread.

memtable_flush_writers
Initial value: 1 (per data directory)
Range of values: -
Number of memtable flush writer threads. These threads are blocked by disk I/O, and each thread holds a memtable in memory while it is blocked.

memtable_total_space_in_mb
Initial value: 1/3 of the Java heap
Range of values: -
Total amount of memory used for all the column family memtables on the node.

multithreaded_compaction
Initial value: false
Range of values: true/false
Useful only on nodes using solid-state disks.

reduce_cache_capacity_to
Initial value: 0.6
Range of values: -
Used in combination with reduce_cache_sizes_at. When the Java heap reaches the value of reduce_cache_sizes_at, this value is the percentage to which the total cache size is reduced (in this case, the cache is reduced to 60%). Used to avoid unexpected out-of-memory errors.

reduce_cache_sizes_at
Initial value: 0.85
Range of values: 1.0 (disabled)
When the Java heap occupancy measured after a full garbage collection reaches the percentage stated in this variable (85%), Cassandra reduces the cache size to the value of reduce_cache_capacity_to.

stream_throughput_outbound_megabits_per_sec
Initial value: disabled, suggested 400 Mbps (50 MB/s)
Range of values: -
Throttles the outbound streaming of files on a node to the given throughput in Mbps. This is necessary because Cassandra mostly does sequential I/O when it streams data during bootstrap or repair, which can saturate the network and affect Remote Procedure Call performance.

Bloom filter

Every SSTable has a Bloom filter.
When data is requested, the Bloom filter checks whether the requested row exists in the SSTable before performing any disk I/O. If the Bloom filter value is set too low, it may consume a large amount of memory; a higher Bloom filter value means less memory use, at the price of more false positives. The Bloom filter accepts values from 0.000744 to 1.0; it is recommended to keep the value below 0.1. The Bloom filter value of a column family is adjusted through the CQL shell as follows:

ALTER TABLE <column_family> WITH bloom_filter_fp_chance = 0.01;

Data cache

Apache Cassandra has two caches with which it achieves highly efficient data caching:

Key cache (default: enabled): caches the index primary keys of the column families
Row cache (default: disabled): holds rows in memory so that reads can be served without touching the disk

If both the key cache and the row cache are set, a data query is served as shown in Figure 4-11, Apache Cassandra cache.

Figure 4-11: Apache Cassandra cache

When information is requested, the row cache is checked first; if the information is available, the row cache returns the result without reading from the disk. If the row cache cannot return a result, Cassandra checks whether the data location can be retrieved through the key cache, which is more efficient than reading the index from disk; the retrieved data is finally written to the row cache. As the key cache stores the key locations of an individual column family, any increase in the key cache has a positive impact on reads for that column family. If the situation permits, a combination of key cache and row cache increases efficiency. It is recommended to size the key cache in relation to the size of the Java heap. The row cache is used in situations where data access patterns follow a normal (Gaussian) distribution, rows contain often-read data, and queries often return most or all of the columns.
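The lookup order just described (row cache, then key cache, then disk) can be sketched as follows. All names here, such as read_row and the dictionary-shaped sstable, are our own illustration and not Cassandra's internal API:

```python
def read_row(key, row_cache, key_cache, sstable):
    """Sketch of the read path: row cache first, then key cache for the
    row's location, only then the on-disk index; the row cache is
    populated on the way out."""
    if key in row_cache:                  # served entirely from memory
        return row_cache[key]
    # the key cache stores the row's location, avoiding an index read from disk
    location = key_cache.get(key)
    if location is None:
        location = sstable["index"][key]  # slow path: read the index from disk
        key_cache[key] = location
    row = sstable["data"][location]
    row_cache[key] = row                  # cache the row for future reads
    return row
```

A second read of the same key never touches the table again, which is exactly why enabling the row cache helps workloads that re-read hot rows.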
Within the cassandra.yaml file, we have the following options to configure the data cache:

key_cache_size_in_mb
Initial value: empty, meaning "auto" (min(5% of the heap in MB, 100 MB))
Range of values: blank, or 0 to disable the key cache
Defines the key cache size per node.

row_cache_size_in_mb
Initial value: 0 (disabled)
Range of values: -
Defines the row cache size per node.

key_cache_save_period
Initial value: 14400 (that is, 4 hours)
Range of values: -
Defines how often the key cache is saved to disk.

row_cache_save_period
Initial value: 0 (disabled)
Range of values: -
Defines how often the row cache is saved to disk.

row_cache_provider
Initial value: SerializingCacheProvider
Range of values: ConcurrentLinkedHashCacheProvider or SerializingCacheProvider
Defines the row cache implementation.

Java heap tune up

Apache Cassandra interacts with the operating system through the Java virtual machine, so the Java heap size plays an important role. When Cassandra starts, the Java heap size is set automatically based on the total amount of RAM (Table 4-8, Determination of the Java heap relative to the amount of RAM). The Java heap size can be adjusted manually by changing the values of the following variables in the cassandra-env.sh file, located in the directory ...\apache-cassandra\conf:

# MAX_HEAP_SIZE="4G"
# HEAP_NEWSIZE="800M"

Total system memory   Java heap size
< 2 GB                Half of the system memory
2 GB - 4 GB           1 GB
> 4 GB                One quarter of the system memory, but no more than 8 GB

Table 4-8: Determination of the Java heap relative to the amount of RAM

Java garbage collection tune up

Apache Cassandra has a GC inspector, which collects information on each garbage collection process that takes longer than 200 ms.
Garbage collection processes that occur frequently and take a long time (such as a concurrent mark-sweep taking several seconds) indicate that there is great pressure on garbage collection and on the JVM. The recommendations to address these issues include:

Add new nodes
Reduce the cache size
Adjust the JVM options related to garbage collection

Views, triggers, and stored procedures

By definition (in an RDBMS), a view is a virtual table that acts like a real table but does not itself contain any data; its content is the result of a SELECT query combining rows and columns from one or more tables. In Cassandra's NoSQL model, by contrast, all the data for a key-value row is placed in one column family. Since there are no JOIN commands and no flexible ad hoc queries, the SELECT command lists the actual data, and there is no way to present a virtual table, that is, a view. Since Cassandra does not belong to the RDBMS group, there is also no possibility of creating triggers or stored procedures. Referential integrity (RI) restrictions can be enforced only in the application code. Likewise, as Cassandra is not an RDBMS, we cannot apply Codd's rules.

Client-server architecture

At this point, we have probably already noticed that Apache Cassandra runs on a client-server architecture. By definition, the client-server architecture allows distributed applications, since the tasks are divided into two main parts:

On one hand, the service providers: the servers
On the other hand, the service petitioners: the clients

In this architecture, several clients are allowed to access the server; the server is responsible for meeting requests and handling each one according to its own rules. So far, we have only used one client, managed from the same machine, that is, from the same data network. The CQL shell allows us to connect to Cassandra, access a key space, and send CQL statements to the Cassandra server.
This is the most immediate method, but in daily practice it is common to access key spaces from different execution contexts (other systems and other programming languages). Thus, we require clients other than the CQL shell; in the Apache Cassandra context, we require connection drivers.

Drivers

A driver is simply a software component that provides access to a key space in order to run CQL statements. Fortunately, there are already a lot of drivers for creating Cassandra clients in almost any modern programming language; you can see an extensive list at this URL: http://wiki.apache.org/cassandra/ClientOptions. Typically, in a client-server architecture, different clients access the server from different machines, distributed across different networks. Our implementation needs will dictate the required clients.

Summary

NoSQL is not just hype, nor merely a young technology; it is an alternative, with known limitations and capabilities. It is not an RDBMS killer. It is more like a younger brother who is slowly growing up and taking on some of the burden. Acceptance is increasing, and it will get even better as NoSQL solutions mature. Skepticism may be justified, but only for concrete reasons. Since Cassandra is an easy and free working environment, suitable for application development, it is recommended, especially with the additional utilities that ease and accelerate database administration. Cassandra has some faults (for example, user authentication and authorization are still insufficiently supported in Windows environments) and is best used when there is a need to store large amounts of data. For start-up companies that need to manipulate large amounts of data while reducing costs, implementing Cassandra in a Linux environment is a must-have.
Packt
23 Jan 2017
26 min read

Get Familiar with Angular

This article by Minko Gechev, the author of the book Getting Started with Angular - Second Edition, will help you understand what was required for the development of a new version of Angular from scratch and why its new features make intuitive sense for the modern Web in building high-performance, scalable, single-page applications. Some of the topics that we'll discuss are as follows:

Semantic versioning and what changes to expect from Angular 2, 3, 5, and so on.
How the evolution of the Web influenced the development of Angular.
What lessons we learned from using Angular 1 in the wild for the last couple of years.
What TypeScript is and why it is a better choice for building scalable single-page applications than JavaScript.

(For more resources related to this topic, see here.)

Angular adopted semantic versioning, so before going any further, let's take an overview of what this actually means.

Angular and semver

Angular 1 was rewritten from scratch and replaced with its successor, Angular 2. A lot of us were bothered by this big step, which didn't allow a smooth transition between these two versions of the framework. Right after Angular 2 got stable, Google announced that they want to follow so-called semantic versioning (also known as semver). Semver defines the version of a given software project as the triple X.Y.Z, where Z is called the patch version, Y the minor version, and X the major version. A change in the patch version means that there are no intended breaking changes between two versions of the same project, only bug fixes. The minor version of a project will be incremented when new functionality is introduced with no breaking changes. Finally, the major version will be increased when incompatible changes are introduced in the API. This means that between versions 2.3.1 and 2.10.4, there are no breaking changes, only a few added features and bug fixes.
However, if we have version 2.10.4 and we want to change any of the already existing public APIs in a backward-incompatible manner (for instance, change the order of the parameters that a method accepts), we need to increment the major version and reset the patch and minor versions, so we get version 3.0.0. The Angular team also follows a strict schedule. According to it, a new patch version is released every week; there are three monthly minor releases after each major release; and finally, there is one major release every six months. This means that by the end of 2018, we will have at least Angular 6. However, this doesn't mean that every six months we'll have to go through the same migration path we did between Angular 1 and Angular 2. Not every major release will introduce breaking changes that impact our projects. For instance, requiring a newer version of TypeScript or changing the last optional argument of a method will be considered a breaking change. We can think of such breaking changes in a way similar to what happened between Angular 1.2 and Angular 1.3. We'll refer to Angular 2 as either Angular 2 or simply Angular. If we explicitly mention Angular 2, this doesn't mean that the given paragraph is not valid for Angular 4 or Angular 5; it most likely is. In case you're interested in the changes between the different versions of the framework, you can take a look at the changelog at https://github.com/angular/angular/blob/master/CHANGELOG.md. When discussing Angular 1, we will be more explicit by mentioning a version number, or the context will make it clear that we're talking about a particular version. Now that we have introduced Angular's semantic versioning and the conventions for referring to the different versions of the framework, we can officially start our journey!

The evolution of the Web - time for a new framework

In the past couple of years, the Web has evolved in big steps.
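The versioning rules described above are mechanical enough to capture in a few lines. The sketch below (Python, purely for illustration; it is not part of Angular's release tooling) bumps a version string according to the kind of change:

```python
def bump(version, change):
    """Increment a semver X.Y.Z string according to the kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":    # backward-incompatible API change
        return "%d.0.0" % (major + 1)
    if change == "minor":    # new, backward-compatible functionality
        return "%d.%d.0" % (major, minor + 1)
    if change == "patch":    # bug fixes only
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("unknown change kind: %s" % change)
```

For example, a breaking change on top of 2.10.4 produces 3.0.0, exactly as described above: the major version is incremented and the minor and patch versions are reset.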
During the implementation of ECMAScript 5, the ECMAScript 6 standard started its development (now known as ECMAScript 2015 or ES2015). ES2015 introduced many changes in JavaScript, such as adding built-in language support for modules, block scope variable definition, and a lot of syntactical sugar, such as classes and destructuring. Meanwhile, Web Components were invented. Web Components allow us to define custom HTML elements and attach behavior to them. Since it is hard to extend the existing set of HTML elements with new ones (such as dialogs, charts, grids, and more), mostly because of the time required for consolidation and standardization of their APIs, a better solution is to allow developers to extend the existing elements the way they want. Web Components provide us with a number of benefits, including better encapsulation, the better semantics of the markup we produce, better modularity, and easier communication between developers and designers. We know that JavaScript is a single-threaded language. Initially, it was developed for simple client-side scripting, but over time, its role has shifted quite a bit. Now, with HTML5, we have different APIs that allow audio and video processing, communication with external services through a two-directional communication channel, transferring and processing big chunks of raw data, and more. All these heavy computations in the main thread may create a poor user experience. They may introduce freezing of the user interface when time-consuming computations are being performed. This led to the development of WebWorkers, which allow the execution of the scripts in the background that communicate with the main thread through message passing. This way, multithreaded programming was brought to the browser. Some of these APIs were introduced after the development of Angular 1 had begun; that's why the framework wasn't built with most of them in mind. 
Taking advantage of these APIs gives developers many benefits, such as the following:

Significant performance improvements
Development of software with better quality characteristics

Now, let's briefly discuss how each of these technologies has been made part of the new Angular core and why.

The evolution of ECMAScript

Nowadays, browser vendors release new features in short iterations, and users receive updates quite often. This helps developers take advantage of bleeding-edge Web technologies. ES2015 is already standardized, and its implementation has started in the major browsers. Learning the new syntax and taking advantage of it will not only increase our productivity as developers but will also prepare us for the near future, when all browsers will have full support for it. This makes it essential to start using the latest syntax now. Some projects' requirements may force us to support older browsers that do not support any ES2015 features. In this case, we can directly write ECMAScript 5, which has different syntax but equivalent semantics to ES2015. A better approach, on the other hand, is to take advantage of transpilation. Using a transpiler in our build process allows us to write ES2015 and translate it to a target language supported by the browsers. Angular has been around since 2009. Back then, the frontend of most websites was powered by ECMAScript 3, the last main release of ECMAScript before ECMAScript 5. This automatically meant that the language used for the framework's implementation was ECMAScript 3. Taking advantage of the new version of the language would require porting the entirety of Angular 1 to ES2015. From the beginning, Angular 2 took into account the current state of the Web by bringing the latest syntax into the framework.
Although new Angular is written with a superset of ES2016 (TypeScript), it allows developers to use a language of their own preference. We can use ES2015, or if we prefer not to have any intermediate preprocessing of our code and simplify the build process, we can even use ECMAScript 5. Note that if we use JavaScript for our Angular applications we cannot use Ahead-of-Time (AoT) compilation. Web Components The first public draft of Web Components was published on May 22, 2012, about three years after the release of Angular 1. As mentioned, the Web Components standard allows us to create custom elements and attach behavior to them. It sounds familiar; we've already used a similar concept in the development of the user interface in Angular 1 applications. Web Components sound like an alternative to Angular directives; however, they have a more intuitive API and built-in browser support. They introduced a few other benefits, such as better encapsulation, which is very important, for example, in handling CSS-style collisions. A possible strategy for adding Web Components support in Angular 1 is to change the directives implementation and introduce primitives of the new standard in the DOM compiler. As Angular developers, we know how powerful but also complex the directives API is. It includes a lot of properties, such as postLink, preLink, compile, restrict, scope, controller, and much more, and of course, our favorite transclude. Approved as standard, Web Components will be implemented on a much lower level in the browsers, which introduces plenty of benefits, such as better performance and native API. During the implementation of Web Components, a lot of web specialists met with the same problems the Angular team did when developing the directives API and came up with similar ideas. Good design decisions behind Web Components include the content element, which deals with the infamous transclusion problem in Angular 1. 
Since both the directives API and Web Components solve similar problems in different ways, keeping the directives API on top of Web Components would have been redundant and would have added unnecessary complexity. That's why the Angular core team decided to start from the beginning by building a framework compatible with Web Components that takes full advantage of the new standard. Web Components involve new features, some of which are not yet implemented by all browsers. In case our application runs in a browser that does not support any of these features natively, Angular emulates them. An example of this is the content element, polyfilled with the ng-content directive.

WebWorkers

JavaScript is known for its event loop. Usually, JavaScript programs are executed in a single thread, and different events are scheduled by being pushed into a queue and processed sequentially, in the order of their arrival. However, this computational strategy is not effective when one of the scheduled events requires a lot of computational time. In such cases, the event's handling will block the main thread, and all other events will not be handled until the time-consuming computation is complete and passes the execution to the next one in the queue. A simple example of this is a mouse click that triggers an event, in whose callback we do some audio processing using the HTML5 audio API. If the processed audio track is big and the algorithm running over it is heavy, this will affect the user's experience by freezing the UI until the execution is complete.

The WebWorker API was introduced in order to prevent such pitfalls. It allows execution of heavy computations inside the context of a different thread, which leaves the main thread of execution free and capable of handling user input and rendering the user interface.

How can we take advantage of this in Angular? In order to answer this question, let's think about how things work in Angular 1.
What if we have an enterprise application that processes a huge amount of data that needs to be rendered on the screen using data binding? For each binding, the framework will create a new watcher. Once the digest loop is run, it will loop over all the watchers, execute the expressions associated with them, and compare the returned results with the results gained from the previous iteration. We have a few slowdowns here:

- The iteration over a large number of watchers
- The evaluation of the expression in a given context
- The copy of the returned result
- The comparison between the current result of the expression's evaluation and the previous one

All these steps could be quite slow, depending on the size of the input. If the digest loop involves heavy computations, why not move it to a WebWorker? Why not run the digest loop inside a WebWorker, get the changed bindings, and then apply them to the DOM?

There were experiments by the community that aimed for this result. However, their integration into the framework wasn't trivial. One of the main reasons behind the lack of satisfying results was the coupling of the framework with the DOM. Often, inside the watchers' callbacks, Angular 1 directly manipulates the DOM, which makes it impossible to move the watchers inside WebWorkers, since WebWorkers are invoked in an isolated context, without access to the DOM. Furthermore, in Angular 1, we may have implicit or explicit dependencies between the different watchers, which require multiple iterations of the digest loop in order to get stable results. Combining the last two points, it is quite hard to achieve practical results in calculating the changes in threads other than the main thread of execution.

Fixing this in Angular 1 would have introduced a great deal of complexity in the internal implementation. The framework simply was not built with this in mind. Since WebWorkers were introduced before the Angular 2 design process started, the core team took them into account from the beginning.
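The watcher bookkeeping described above can be sketched as a toy dirty-checking loop (a simplified illustration invented here, not Angular's actual implementation; the Watcher type and digest function are made up for this example). Note how two interdependent bindings force a second pass before the results stabilize:

```typescript
// Toy sketch of Angular 1-style dirty checking: each watcher evaluates an
// expression, compares the result with the previous one, and fires a
// listener on change. The loop repeats until no watcher reports a change.
type Watcher = { get: () => any; last?: any; onChange: (v: any) => void };

function digest(watchers: Watcher[], maxIterations = 10): number {
  let iterations = 0;
  let dirty = true;
  while (dirty && iterations < maxIterations) {
    dirty = false;
    iterations++;
    for (const w of watchers) {
      const value = w.get();
      if (value !== w.last) {
        w.last = value;
        w.onChange(value);
        dirty = true; // a listener may have changed other bindings
      }
    }
  }
  return iterations;
}

// The second binding depends on the first, so the loop needs an extra
// pass to reach a stable state.
const model = { count: 1, double: 0 };
const watchers: Watcher[] = [
  { get: () => model.count, onChange: v => (model.double = v * 2) },
  { get: () => model.double, onChange: () => {} },
];

const passes = digest(watchers);
console.log(passes); // prints 2
```

With many watchers and expensive expressions, every step of this loop adds up, which is exactly the cost enumerated above.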
Lessons learned from Angular 1 in the wild

It's important to remember that we're not starting completely from scratch. We're taking what we've learned from Angular 1 with us. In the period since 2009, the Web is not the only thing that has evolved. We also started building more and more complex applications. Today, single-page applications are not something exotic, but more like a strict requirement for all web applications solving business problems that aim for high performance and a good user experience. Angular 1 helped us to efficiently build large-scale, single-page applications. However, by applying it in various use cases, we've also discovered some of its pitfalls. Learning from the community's experience, Angular's core team worked on new ideas aiming to answer the new requirements.

Controllers

Angular 1 follows the Model View Controller (MVC) micro-architectural pattern. Some may argue that it looks more like Model View ViewModel (MVVM) because of the view model attached as properties to the scope, or to the current context in the case of the "controller as" syntax. It could be approached differently again if we use the Model View Presenter (MVP) pattern. Because of all the different variations of how we can structure the logic in our applications, the core team called Angular 1 a Model View Whatever (MVW) framework.

The view in any Angular 1 application is supposed to be a composition of directives. The directives collaborate together in order to deliver fully functional user interfaces. Services are responsible for encapsulating the business logic of the applications. That's the place where we should put the communication with RESTful services through HTTP, real-time communication with WebSockets, and even WebRTC. Services are the building block where we should implement the domain models and business rules of our applications.
There's one more component, which is mostly responsible for handling user input and delegating the execution to the services: the controller. Although the services and directives have well-defined roles, we can often see the anti-pattern of the Massive View Controller, which is common in iOS applications. Occasionally, developers are tempted to access or even manipulate the DOM directly from their controllers. Initially, this happens when you want to achieve something simple, such as changing the size of an element, or making quick-and-dirty changes to an element's styles. Another noticeable anti-pattern is the duplication of business logic across controllers. Developers often tend to copy and paste logic that should be encapsulated inside services.

The best practices for building Angular 1 applications state that controllers should not manipulate the DOM at all; instead, all DOM access and manipulation should be isolated in directives. If we have some repetitive logic between controllers, most likely we want to encapsulate it into a service and inject that service, using the dependency injection mechanism of Angular, into all the controllers that need the functionality.

This is where we're coming from in Angular 1. All this said, it seems that the functionality of controllers could be moved into the directives' controllers. Since directives support the dependency injection API, after receiving the user's input, we can directly delegate the execution to a specific service that has already been injected. This is the main reason why Angular now uses a different approach, removing the ability to put controllers everywhere with the ng-controller directive.

Scope

Data binding in Angular 1 is achieved using the scope object. We can attach properties to it and explicitly declare in the template that we want to bind to these properties (one-way or two-way).
Although the idea of the scope seems clear, it has two more responsibilities, including event dispatching and the change detection-related behavior. Angular beginners have a hard time understanding what scope really is and how it should be used. Angular 1.2 introduced something called the "controller as" syntax. It allows us to add properties to the current context inside the given controller (this), instead of explicitly injecting the scope object and later adding properties to it. This simplified syntax can be demonstrated through the following snippet:

<div ng-controller="MainCtrl as main">
  <button ng-click="main.clicked()">Click</button>
</div>

function MainCtrl() {
  this.name = 'Foobar';
}

MainCtrl.prototype.clicked = function () {
  alert('You clicked me!');
};

The latest Angular took this even further by removing the scope object. All the expressions are evaluated in the context of the given UI component. Removing the entire scope API introduces greater simplicity; we don't need to explicitly inject it anymore; instead, we add properties to the UI components, to which we can later bind. This API feels much simpler and more natural.

Dependency injection

Maybe the first framework on the market that included inversion of control (IoC) through dependency injection (DI) in the JavaScript world was Angular 1. DI provides a number of benefits, such as easier testability, better code organization and modularization, and simplicity. Although the DI in the first version of the framework does an amazing job, Angular 2 took it even further. Since the latest Angular is built on top of the latest Web standards, it uses the ECMAScript 2016 decorators' syntax for annotating code for DI. Decorators are quite similar to the decorators in Python or annotations in Java. They allow us to decorate the behavior of a given object, or add metadata to it, using reflection.
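The core idea of DI (registering providers against tokens and resolving them, with services shared as singletons) can be sketched with a toy injector. This is an invented illustration, not Angular's actual injector API; the Injector class and the 'Http'/'UserService' tokens are made up for this example:

```typescript
// A minimal DI container sketch: factories are registered per token and
// resolved lazily; resolved instances are cached (singleton services).
type Token = string;
type Factory = (injector: Injector) => any;

class Injector {
  private providers = new Map<Token, Factory>();
  private instances = new Map<Token, any>();

  provide(token: Token, factory: Factory): void {
    this.providers.set(token, factory);
  }

  get<T>(token: Token): T {
    if (!this.instances.has(token)) {
      const factory = this.providers.get(token);
      if (!factory) throw new Error(`No provider for ${token}`);
      this.instances.set(token, factory(this)); // cache the instance
    }
    return this.instances.get(token);
  }
}

const injector = new Injector();
injector.provide('Http', () => ({ get: (url: string) => `GET ${url}` }));
// 'UserService' declares its dependency on 'Http' via the injector.
injector.provide('UserService', i => {
  const http: any = i.get('Http');
  return { load: () => http.get('/users') };
});

const users: any = injector.get('UserService');
console.log(users.load()); // prints "GET /users"
```

In Angular itself, decorator metadata plays the role of the explicit token registration shown here: the injector reads the constructor's annotated parameter types to decide what to resolve.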
Since decorators are not yet standardized and supported by the major browsers, their usage requires an intermediate transpilation step; however, if you don't want to take that step, you can directly write slightly more verbose code with ECMAScript 5 syntax and achieve the same semantics. The new DI is much more flexible and feature-rich. It also fixes some of the pitfalls of Angular 1, such as the inconsistent APIs; in the first version of the framework, some objects are injected by position (such as the scope, element, attributes, and controller in the directives' link function) and others by name (using parameter names in controllers, directives, services, and filters).

Server-side rendering

The bigger the requirements of the Web get, the more complex web applications become. Building a real-life, single-page application requires writing a huge amount of JavaScript, and including all the required external libraries may increase the size of the scripts on our page to a few megabytes. The initialization of the application may take up to several seconds, or even tens of seconds on mobile, until all the resources get fetched from the server, the JavaScript is parsed and executed, the page gets rendered, and all the styles are applied. On low-end mobile devices that use a mobile Internet connection, this process may make users give up on visiting our application. Although there are a few practices that speed up this process, in complex applications there's no silver bullet.

In the process of trying to improve the user experience, developers discovered something called server-side rendering. It allows us to render the requested view of a single-page application on the server and directly provide the HTML for the page to the user. Later, once all the resources are processed, the event listeners and bindings can be added by the script files. This sounds like a good way to boost the performance of our application.
One of the pioneers in this area was React, which allowed prerendering of the user interface on the server side using Node.js DOM implementations. Unfortunately, the architecture of Angular 1 does not allow this. The showstopper is the strong coupling between the framework and the browser APIs, the same issue we had in running the change detection in WebWorkers.

Another typical use case for server-side rendering is building Search Engine Optimization (SEO)-friendly applications. There were a couple of hacks used in the past for making Angular 1 applications indexable by the search engines. One such practice, for instance, is the traversal of the application with a headless browser, which executes the scripts on each page and caches the rendered output into HTML files, making it accessible to the search engines. Although this workaround for building SEO-friendly applications works, server-side rendering solves both of the aforementioned issues, improving the user experience and allowing us to build SEO-friendly applications much more easily and far more elegantly. The decoupling of Angular from the DOM allows us to run our Angular applications outside the context of the browser.

Applications that scale

MVW has been the default choice for building single-page applications since Backbone.js appeared. It allows separation of concerns by isolating the business logic from the view, allowing us to build well-designed applications. Taking advantage of the observer pattern, MVW allows listening for model changes in the view and updating it when changes are detected. However, there are some explicit and implicit dependencies between these event handlers, which make the data flow in our applications non-obvious and hard to reason about. In Angular 1, we are allowed to have dependencies between the different watchers, which requires the digest loop to iterate over all of them a couple of times until the expressions' results get stable.
The new Angular makes the data flow one-directional; this has a number of benefits:

- More explicit data flow
- No dependencies between bindings, so no time to live (TTL) of the digest
- Better performance of the framework: the digest loop is run only once
- We can create apps that are friendly to immutable or observable models, which allows us to make further optimizations

The change in the data flow introduces one more fundamental change to the Angular 1 architecture.

We may take another perspective on this problem when we need to maintain a large codebase written in JavaScript. Although JavaScript's duck typing makes the language quite flexible, it also makes its analysis and support by IDEs and text editors harder. Refactoring large projects gets very hard and error-prone because, in most cases, static analysis and type inference are impossible. The lack of a compiler makes typos all too easy, and they are hard to notice until we run our test suite or the application itself. The Angular core team decided to use TypeScript because of the better tooling possible with it and the compile-time type checking, which helps us to be more productive and less error-prone. As the following diagram shows, TypeScript is a superset of ECMAScript; it introduces explicit type annotations and a compiler:

Figure 1

The TypeScript language is compiled to plain JavaScript, supported by today's browsers. Since version 1.6, TypeScript implements the ECMAScript 2016 decorators, which makes it the perfect choice for Angular. The usage of TypeScript allows much better IDE and text editor support, with static code analysis and type checking. All this increases our productivity dramatically by reducing the mistakes we make and simplifying the refactoring process. Another important benefit of TypeScript is the performance improvement we implicitly get from static typing, which allows runtime optimizations by the JavaScript virtual machine.
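As a small illustration of the static typing benefits described above, here is a sketch with invented User and describe names; the interface documents the shape of the data, and the compiler rejects code that violates it before the application ever runs:

```typescript
// The interface makes the expected shape explicit and machine-checkable.
interface User {
  id: number;
  name: string;
}

function describe(user: User): string {
  return `#${user.id}: ${user.name}`;
}

const u: User = { id: 1, name: 'Ada' };
console.log(describe(u)); // prints "#1: Ada"

// The following would be a compile-time error, caught without running tests:
// describe({ id: '1', name: 'Ada' }); // Type 'string' is not assignable to 'number'
```

With plain JavaScript, the same typo would only surface at runtime, if a test happened to exercise that path.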
Templates

Templates are one of the key features in Angular 1. They are simple HTML and do not require any intermediate translation, unlike most template engines, such as mustache. Templates in Angular combine simplicity with power by allowing us to extend HTML by creating an internal domain-specific language (DSL) inside it, with custom elements and attributes. This is one of the main purposes of Web Components as well. We already mentioned how and why Angular takes advantage of this new technology. Although Angular 1 templates are great, they can still get better! The new Angular templates took the best parts of the ones in the previous release of the framework and enhanced them by fixing some of their confusing parts.

For example, let's say we have a directive and we want to allow the user to pass a property to it using an attribute. In Angular 1, we can approach this in the following three different ways:

<user name="literal"></user>
<user name="expression"></user>
<user name="{{interpolate}}"></user>

In the user directive, we pass the name property using three different approaches. We can pass a literal (in this case, the string "literal"); a string, which will be evaluated as an expression (in our case, "expression"); or an expression inside {{ }}. Which syntax should be used depends entirely on the directive's implementation, which makes its API tangled and hard to remember. It is a frustrating task to deal with a large number of components with different design decisions on a daily basis. By introducing a common convention, we can handle such problems. However, in order to have good results and consistent APIs, the entire community needs to agree with it. The new Angular deals with this problem by providing special syntax for attributes whose values need to be evaluated in the context of the current component, and a different syntax for passing literals.
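To make the "evaluated in the context of the current component" idea concrete, here is a toy interpolation function, a simplified sketch invented for illustration (it is not Angular's template parser, which handles full expressions rather than simple property lookups):

```typescript
// Replaces {{prop}} markers with values looked up on a context object,
// mimicking how interpolation binds template text to component state.
function interpolate(template: string, context: Record<string, any>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, key) => String(context[key]));
}

console.log(interpolate('Hello, {{ name }}!', { name: 'Angular' })); // prints "Hello, Angular!"
```

In the real framework, the context is the component instance itself, and the compiler turns such bindings into change-detected code rather than string replacement.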
Another thing we're used to, based on our Angular 1 experience, is the microsyntax in template directives, such as ng-if and ng-for. For instance, if we want to iterate over a list of users and display their names in Angular 1, we can use:

<div ng-for="user in users">{{user.name}}</div>

Although this syntax looks intuitive to us, it allows limited tooling support. However, Angular 2 approached this differently by bringing slightly more explicit syntax with richer semantics:

<template ngFor let-user [ngForOf]="users">
  {{user.name}}
</template>

The preceding snippet explicitly defines the property that has to be created in the context of the current iteration (user) and the one we iterate over (users). Since this syntax is too verbose for typing, developers can use the following syntax, which later gets translated to the more verbose one:

<li *ngFor="let user of users">
  {{user.name}}
</li>

The improvements in the new templates will also allow better tooling for advanced support by text editors and IDEs.

Change detection

We already mentioned the opportunity to run the digest loop in the context of a different thread, instantiated as a WebWorker. However, the implementation of the digest loop in Angular 1 is not very memory-efficient and prevents the JavaScript virtual machine from doing further code optimizations that would allow significant performance improvements. One such optimization is inline caching (http://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inline-caches.html). The Angular team did a lot of research in order to discover different ways in which the performance and efficiency of the change detection could be improved. This led to the development of a brand new change detection mechanism. As a result, Angular performs change detection in code that the framework directly generates from the components' templates. The code is generated by the Angular compiler.
There are two built-in code generation (also known as compilation) strategies:

- Just-in-Time (JiT) compilation: At runtime, Angular generates code that performs change detection on the entire application. The generated code is optimized for the JavaScript virtual machine, which provides a great performance boost.
- Ahead-of-Time (AoT) compilation: Similar to JiT, with the difference that the code is generated as part of the application's build process. It can be used for speeding up rendering by not performing the compilation in the browser, and also in environments that disallow eval(), such as CSP (Content-Security-Policy) environments and Chrome extensions.

Summary

In this article, we considered the main reasons behind the decisions taken by the Angular core team and the lack of backward compatibility between the last two major versions of the framework. We saw that these decisions were fueled by two things: the evolution of the Web and the evolution of frontend development, together with the lessons learned from the development of Angular 1 applications. We learned why we need to use the latest version of the JavaScript language, why we should take advantage of Web Components and WebWorkers, and why it wasn't worth it to integrate all these powerful tools into version 1.

We observed the current direction of frontend development and the lessons learned in the last few years. We described why the controller and scope were removed from Angular 2, and why Angular 1's architecture was changed in order to allow server-side rendering for SEO-friendly, high-performance, single-page applications. Another fundamental topic we took a look at was building large-scale applications, and how that motivated one-directional data flow in the framework and the choice of the statically typed language, TypeScript. The new Angular reuses some of the naming of the concepts introduced by Angular 1, but generally changes the building blocks of our single-page applications completely.
We will take a peek at the new concepts and compare them with the ones in the previous version of the framework. We'll make a quick introduction to modules, directives, components, routers, pipes, and services, and describe how they could be combined for building classy, single-page applications. Resources for Article: Further resources on this subject: Angular.js in a Nutshell [article] Angular's component architecture [article] AngularJS Performance [article]
Welcome to the New World

Packt
23 Jan 2017
8 min read
We live in very exciting times. Technology is changing at a pace so rapid that it is becoming near impossible to keep up with these new frontiers as they arrive. And they seem to arrive on a daily basis now. Moore's Law continues to stand, meaning that technology is getting smaller and more powerful at a constant rate. As I said, very exciting. In this article by Jason Odom, the author of the book HoloLens Beginner's Guide, we will discuss one of these new emerging technologies, one that is finally reaching a place more material than science fiction stories: Augmented, or Mixed, Reality.

(For more resources related to this topic, see here.)

Imagine a world where our communication and entertainment devices are worn, and the digital tools we use, as well as the games we play, are holographic projections in the world around us. These holograms know how to interact with our world and change to fit our needs. Microsoft has led the charge by releasing such a device... the HoloLens.

The Microsoft HoloLens changes the paradigm of what we know as personal computing. We can now have our Word window up on the wall (this is how I am typing right now); we can have research material floating around it; we can have our communication tools, like Gmail and Skype, in the area as well. We are finally no longer trapped at a virtual desktop, on a screen, sitting on a physical desktop. We aren't even trapped by the confines of a room anymore.

What exactly is the HoloLens?

The HoloLens is a first-of-its-kind, head-worn standalone computer with a sensor array (which includes microphones and multiple types of cameras), a spatial sound speaker array, a light projector, and an optical waveguide. The HoloLens is not only a wearable computer; it is also a complete replacement for the standard two-dimensional display. The HoloLens has the capability of using holographic projection to create multiple screens throughout an environment, as well as fully 3D-rendered objects.
With the HoloLens sensor array, these holograms can fully interact with the environment you are in. The sensor array allows the HoloLens to see the world around it, to see input from the user's hands, and to hear voice commands. While Microsoft has been very quiet about what the entire sensor array includes, we have a good general idea about the components used; let's have a look at them:

- One IMU: The Inertial Measurement Unit (IMU) is a sensor array that includes an accelerometer, a gyroscope, and a magnetometer. This unit handles head-orientation tracking and compensates for the drift that comes from the gyroscope's eventual lack of precision.
- Four environment understanding sensors: Together, these form the spatial mapping that the HoloLens uses to create a mesh of the world around the user.
- One depth camera: Also known as a structured-light 3D scanner, this device is used for measuring the three-dimensional shape of an object using projected light patterns and a camera system. Microsoft first used this type of camera inside the Kinect for the Xbox 360 and Xbox One.
- One ambient light sensor: Ambient light sensors, or photosensors, are used for ambient light sensing as well as proximity detection.
- One 2 MP photo/HD video camera: For taking pictures and video.
- Four-microphone array: These do a great job of listening to the user and not the sounds around them. Voice is one of the primary input types with HoloLens.

Putting all of these elements together forms a holographic computer that allows the user to see, hear, and interact with the world around them in new and unique ways.

What you need to develop for the HoloLens

The HoloLens development environment breaks down into two primary tools: Unity and Visual Studio. Unity is the 3D environment that we will do most of our work in. This includes adding holograms, creating user interface elements, and adding sound, particle systems, and other things that bring a 3D program to life.
If Unity is the meat on the bone, Visual Studio is the skeleton. Here we write scripts, or code, to make our 3D creations come to life and to add a level of control and immersion that Unity cannot produce on its own.

Unity

Unity is a software framework designed to speed up the creation of games and 3D-based software. Generally speaking, Unity is known as a game engine, but as the holographic world becomes more prevalent, the more we will use such a development environment for many different kinds of applications. Unity is an application that allows us to take 3D models, 2D graphics, particle systems, and sound, and make them interact with each other and with our user. Many elements are drag-and-drop, plug-and-play, what-you-see-is-what-you-get. This can simplify the iteration and testing process. As developers, we most likely do not want to build and compile for every little change we make in the development process. Unity allows us to see the changes in context to make sure they work; then, once we hit a group of changes, we can test on the HoloLens ourselves. This does not work for every aspect of HoloLens-Unity development, but it does work for a good 80% - 90%.

Visual Studio Community

Microsoft Visual Studio Community is a great free Integrated Development Environment (IDE). Here we use programming languages such as C# or JavaScript to code changes in the behavior of objects and, generally, make things happen inside of our programs.

HoloToolkit - Unity

The HoloToolkit - Unity is a repository of samples, scripts, and components to help speed up the process of development. It covers a large selection of areas in HoloLens development, such as:

- Input: Gaze, gesture, and voice are the primary ways in which we interact with the HoloLens.
- Sharing: The sharing repository helps allow users to share holographic spaces and connect to each other via the network.
- Spatial Mapping: This is how the HoloLens sees our world.
A large 3D mesh of our space is generated, which gives our holograms something to interact with or bounce off of.

- Spatial Sound: The speaker array inside the HoloLens does an amazing job of giving the illusion of space. Objects behind us sound like they are behind us.

HoloLens emulator

The HoloLens emulator is an extension to Visual Studio that simulates how a program will run on the HoloLens. This is great for those who want to get started with HoloLens development but do not have an actual HoloLens yet. This software does require the use of Microsoft Hyper-V, a feature only available inside the Windows 10 Pro operating system. Hyper-V is a virtualization environment, which allows the creation of a virtual machine. This virtual machine emulates specific hardware, so one can test without the actual hardware.

Visual Studio Tools for Unity

This collection of tools adds IntelliSense and debugging features to Visual Studio. If you use Visual Studio and Unity, this is a must-have:

- IntelliSense: An intelligent code-completion tool for Microsoft Visual Studio, designed to speed up many processes when writing code. The version that comes with Visual Studio Tools for Unity has Unity-specific updates.
- Debugging: Until this extension existed, debugging Unity apps proved to be a little tedious. With this tool, we can now debug Unity applications inside Visual Studio, speeding up the bug-squashing process considerably.

Other useful tools

The following are some other useful tools:

- Image editor: Photoshop or GIMP are both good examples of programs that allow us to create 2D UI elements and textures for objects in our apps.
- 3D modeling software: 3D Studio Max, Maya, and Blender are all programs that allow us to make 3D objects that can be imported into Unity.
- Sound editing software: There are a few resources for free sounds on the web; with that in mind, Sound Forge is a great tool for editing those sounds and layering sounds together to create new ones.

Summary

In this article, we have gotten to know a little bit about the HoloLens, so we can begin our journey into this new world. Here, the only limitations are our imaginations.

Resources for Article: Further resources on this subject: Creating a Supercomputer [article] Building Voice Technology on IoT Projects [article] C++, SFML, Visual Studio, and Starting the first game [article]
How to Add Custom Slot Types to Intents

Antonio Cucciniello
20 Jan 2017
6 min read
Have you created an intent for use with Alexa where you wanted to add your own slot types to it? If so, after following this guide you should be on your way to creating as many custom slot types as you like. Before we go any further, let us define what a slot is in Amazon Echo skill development. A slot is essentially a way to access what the user says when requesting something, and then use it in your code to execute the skill's functionality properly. For example, let's say I created a skill that repeated a name back to me, and the request was "Alexa, ask Repeater to respond with John." Alexa would then respond with "John." In this case, the slot value would be John. In order to make slots, you need to do a couple of things in your Developer Portal and in your code.

Amazon Developer Portal

First, we will visit the steps to implement a slot in your Developer Portal. There are three individual parts to this.

Intent Schema

Once you have logged in to your Developer Portal for the skill you would like to add a slot to, click on the Interaction Model tab on the left-hand side. Go to the Intent Schema section. This is a JSON object that holds all of your intents. Here we are going to create a new intent for our skill. For example, if our skill's name was Repeater and we wanted Alexa to respond with a name the user said back to them, our intent schema would look like this:

{
  "intents": [
    {
      "intent": "RepeatNameIntent",
      "slots": [
        {
          "name": "repeatName",
          "type": "REPEAT_NAME"
        }
      ]
    }
  ]
}

Here we specify the intent name as RepeatNameIntent, then specify that the intent will have one slot named repeatName that is of the custom slot type REPEAT_NAME. Now we have created an intent for Alexa to handle while using this skill. It is time to define what the custom slot type of REPEAT_NAME is.

Custom Slot Types

Now that the intent has been added, in order to define the custom slot type of REPEAT_NAME, scroll down to the Custom Slot Type section.
Now click on Add Slot Type. Enter the type as the same type you used in your Intent Schema (for this example, we will create it with the type REPEAT_NAME). Then it will ask you to enter values. Here Amazon is looking for examples of things a user would say for this slot, in order to know when the slot is being used. In the case of REPEAT_NAME, I have placed a bunch of different names as values for Alexa to handle. Here is an image of the custom slot type with its values:

Sample Utterances

Once you have defined what your custom slot type will look like by giving it sample values, it is time to create a couple of Sample Utterances so the skill knows where the slot will be when invoking the intent. In order to specify where the slot will be in a Sample Utterance, you must use the format {nameOfSlot}. For example, here are a couple of Sample Utterances I implemented for the RepeatNameIntent:

RepeatNameIntent to repeat the name {repeatName}
RepeatNameIntent to say the name {repeatName}
RepeatNameIntent to respond with the name {repeatName}

By giving these Sample Utterances, Alexa knows that what the user says after the word "name" in the skill is what you would like repeated back to you. In order to implement this, we need to access the slot in our code.

Code

Now that you have the logistics of setting up the intent and custom slot type in your Developer Portal, you can move on to implementing the intent's functionality:

```javascript
// repeat-name-intent.js
module.exports = RepeatNameIntent

function RepeatNameIntent (intent, session, response) {
  var name = intent.slots.repeatName.value
  response.tell(name)
  return
}
```

The way to access the value of what the user is saying is through intent.slots.slotName.value. For your skill, you would replace slotName with the actual name of the slot that you used in your Intent Schema in the Developer Portal. For this example, we accessed the value, stored it in a variable called name, and then had Alexa respond with the name to the user.
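As an aside, the same lookup works if your Lambda handler is written in Python rather than Node.js. The sketch below is purely illustrative; the extract_slot helper is a hypothetical name, and only the request/intent/slots layout comes from the standard Alexa request JSON:

```python
# Hypothetical helper for a Python Lambda handler; the event layout
# mirrors the standard Alexa IntentRequest JSON.
def extract_slot(event, slot_name):
    """Return the spoken value of a slot, or None if it is absent."""
    intent = event.get("request", {}).get("intent", {})
    return intent.get("slots", {}).get(slot_name, {}).get("value")

# A trimmed-down IntentRequest, as Alexa would send it:
event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "RepeatNameIntent",
            "slots": {"repeatName": {"name": "repeatName", "value": "John"}},
        },
    }
}

print(extract_slot(event, "repeatName"))  # -> John
```

Using .get with defaults means a missing or empty slot yields None instead of raising, which is handy because Alexa can invoke an intent without filling every slot.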
Now, to make sure that the intent is handled in your code, head over to the main js file for this skill (the file that AWS Lambda points to). Add the following line to pull your intent function into your main file:

```javascript
// main.js
var RepeatNameIntent = require('./repeat-name-intent.js')
```

Once you have added that line, add this next line to your intentHandlers so that your skill knows which intent in your Developer Portal's intent schema relates to which function in your code:

```javascript
RepeaterService.prototype.intentHandlers = {
  'RepeatNameIntent': RepeatNameIntent
}
```

This takes the form of 'IntentNameInDevPortal': IntentNameInCode.

Conclusion

If you have made it this far, you have successfully added a custom slot type to your intent! Briefly, here is what happens when the skill is invoked with your custom slot type:

The user says "Alexa, ask Repeater to say the name Joe."
Alexa listens to what you are saying.
She recognizes that you are invoking the RepeatNameIntent and that the slot value should be "Joe."
She executes the function RepeatNameIntent, because your intent handler tells her that is how you would like that intent handled.
She responds with "Joe."

Possible Resources

Use my skill here: Edit Docs
Check out the code for my skill on GitHub
Alexa Skills Kit Custom Interaction Model Reference
Migrating to the Improved Built-in and Custom Slot Types

About the author

Antonio is a software engineer with a background in C, C++, and JavaScript (Node.js) from New Jersey. His most recent project, called Edit Docs, is an Amazon Echo skill that allows users to edit Google Drive files using their voice. He loves building cool things with software, and reading books on self-help and improvement, finance, and entrepreneurship. To contact Antonio, e-mail him at Antonio.cucciniello16@gmail.com, follow him on Twitter @antocucciniello, and follow him on GitHub.

Packt
20 Jan 2017
4 min read

Installing QuickSight Application

In this article by Rajesh Nadipalli, the author of the book Effective Business Intelligence with QuickSight, we will see how you can install the Amazon QuickSight app from the Apple iTunes store at no cost. You can search for the app in the iTunes store and then proceed to download and install it, or alternatively you can follow this link to download the app. (For more resources related to this topic, see here.)

The Amazon QuickSight app is certified to work with iOS devices running iOS v9.0 and above. Once you have the app installed, you can proceed to log in to your QuickSight account as shown in the following screenshot:

Figure 1.1: QuickSight sign in

The Amazon QuickSight app is designed for accessing dashboards and analyses on your mobile device. All interactions in the app are read-only, and changes you make on your device are not applied to the original visuals, so you can explore without any worry.

Dashboards on the go

After you log in to the QuickSight app, you will first see the list of dashboards associated with your QuickSight account for easy access. If you don't see dashboards, then click on the Dashboards icon in the menu at the bottom of your mobile device as shown in the following screenshot:

Figure 1.2: Accessing dashboards

You will now see the list of dashboards associated with your user ID.

Dashboard detailed view

From the dashboard listing, select the USA Census Dashboard, which will redirect you to the detailed dashboard view. In the detailed dashboard view you will see all the visuals that are part of that dashboard. You can click on the arrow at the extreme top right of each visual to open the specific chart in full-screen mode, as shown in the following screenshot. In the scatter plot analysis shown in the following screenshot, you can further click on any of the dots to get specific values about that bubble.
In the following screenshot, the selected circle is for zip code 94027, which has a PopulationCount of 7,089, a MedianIncome of $216,905, and a MeanIncome of $336,888:

Figure 1.3: Dashboard visual

Dashboard search

The QuickSight mobile app also provides a search feature, which is handy if you know only part of the name of a dashboard. Follow these steps to search for a dashboard:

First, ensure you are in the dashboards tab by clicking on the Dashboards icon in the bottom menu.
Next, click on the search icon in the top right corner.
Next, type the partial name. In the following example, I have typed Usa.

QuickSight now searches for all dashboards that have the word Usa in them and lists them out. You can then click on a dashboard to get details about that specific dashboard, as shown in the following screenshot:

Figure 1.4: Dashboard search

Favorite a dashboard

QuickSight provides a convenient way to bookmark your dashboards by setting them as favorites. To use this feature, first identify which dashboards you often use and click on the star icon to their right, as shown in the following screenshot. Then, to access all of your favorites, click on the Favorites tab; the list is then refined to only those dashboards you had previously marked as favorites:

Figure 1.5: Dashboard favorites

Limitations of the mobile app

While dashboards are fairly easy to interact with in the mobile app, there are key limitations when compared to the standard browser version, which I am listing as follows:

You cannot share dashboards with others using the mobile app.
You cannot zoom in/out of a visual, which would be really useful in scenarios where the charts are dense.
Chart legends are not shown.

Summary

We have seen how to install the Amazon QuickSight app; using this app, you can browse, search, and view dashboards. We have covered how to access dashboards, search them, favorite them, and open their detailed view. We have also seen some limitations of the mobile app.
Resources for Article: Further resources on this subject: Introduction to Practical Business Intelligence [article] MicroStrategy 10 [article] Making Your Data Everything It Can Be [article]

Packt
19 Jan 2017
7 min read

Clustering Model with Spark

In this article by Manpreet Singh Ghotra and Rajdeep Dua, coauthors of the book Machine Learning with Spark, Second Edition, we will analyze the case where we do not have labeled data available. Supervised learning methods are those where the training data is labeled with the true outcome that we would like to predict (for example, a rating for recommendations, a class assignment for classification, or a real target variable in the case of regression). (For more resources related to this topic, see here.)

In unsupervised learning, the model is not supervised with the true target label. The unsupervised case is very common in practice, since obtaining labeled training data can be very difficult or expensive in many real-world scenarios (for example, having humans label training data with class labels for classification). However, we would still like to learn some underlying structure in the data and use it to make predictions. This is where unsupervised learning approaches can be useful. Unsupervised learning models are also often combined with supervised models; for example, applying unsupervised techniques to create new input features for supervised models.

Clustering models are, in many ways, the unsupervised equivalent of classification models. With classification, we would try to learn a model that would predict which class a given training example belonged to. The model is essentially a mapping from a set of features to the class. In clustering, we would like to segment the data in such a way that each training example is assigned to a segment called a cluster. The clusters act much like classes, except that the true class assignments are unknown.
Clustering models have many use cases that are the same as classification; these include the following:

Segmenting users or customers into different groups based on behavior characteristics and metadata
Grouping content on a website or products in a retail business
Finding clusters of similar genes
Segmenting communities in ecology
Creating image segments for use in image analysis applications such as object detection

Types of clustering models

There are many different forms of clustering models available, ranging from simple to extremely complex ones. The Spark ML library currently provides K-means clustering, which is among the simplest approaches available. However, it is often very effective, and its simplicity makes it relatively easy to understand and scalable.

K-means clustering

K-means attempts to partition a set of data points into K distinct clusters (where K is an input parameter for the model). More formally, K-means tries to find clusters so as to minimize the sum of squared errors (or distances) within each cluster. This objective function is known as the within-cluster sum of squared errors (WCSS): it is the sum, over each cluster, of the squared distances between each point and the cluster center. Starting with a set of K initial cluster centers (each computed as the mean vector of the data points assigned to the cluster), the standard method for K-means iterates between two steps:

Assign each data point to the cluster that minimizes the WCSS. Since the sum of squares is equivalent to the squared Euclidean distance, this equates to assigning each point to the closest cluster center as measured by the Euclidean distance metric.
Compute the new cluster centers based on the cluster assignments from the first step.

The algorithm proceeds until either a maximum number of iterations has been reached or convergence has been achieved.
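The two-step loop just described can be sketched in a few lines of NumPy. This is a toy, in-memory illustration of the algorithm only, not Spark's distributed implementation:

```python
import numpy as np

def kmeans(points, k, iterations=20, seed=0):
    """Toy K-means over an (n, d) array; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Random initialization: pick k distinct points as starting centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Step 1: assign each point to its closest center
        # (squared Euclidean distance, the WCSS criterion).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 2: recompute each center as the mean of its assigned points.
        centers = np.array([points[labels == j].mean(axis=0)
                            for j in range(k)])
    return centers, labels

# Two well-separated groups on a line; K-means should recover them.
points = np.array([[0.0], [0.1], [10.0], [10.1]])
centers, labels = kmeans(points, k=2)
print(np.sort(centers.ravel()))
```

With this well-separated toy data, any random choice of two distinct starting points converges to centers near 0.05 and 10.05, with each pair of nearby points sharing a label.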
Convergence means that the cluster assignments no longer change during the first step; therefore, the value of the WCSS objective function does not change either. For more details, refer to Spark's documentation on clustering at http://spark.apache.org/docs/latest/mllib-clustering.html or refer to http://en.wikipedia.org/wiki/K-means_clustering.

To illustrate the basics of K-means, we will use a simple dataset. We have five classes, which are shown in the following figure:

Multiclass dataset

However, assume that we don't actually know the true classes. If we use K-means with five clusters, then after the first step, the model's cluster assignments might look like this:

Cluster assignments after the first K-means iteration

We can see that K-means has already picked out the centers of each cluster fairly well. After the next iteration, the assignments might look like those shown in the following figure:

Cluster assignments after the second K-means iteration

Things are starting to stabilize, but the overall cluster assignments are broadly the same as they were after the first iteration. Once the model has converged, the final assignments could look like this:

Final cluster assignments for K-means

As we can see, the model has done a decent job of separating the five clusters. The leftmost three are fairly accurate (with a few incorrect points). However, the two clusters in the bottom-right corner are less accurate. This illustrates the following:

The iterative nature of K-means
The model's dependency on the method used to initially select the clusters' centers (here, we used a random approach)
How the final cluster assignments can be very good for well-separated data but can be poor for data that is more difficult

Initialization methods

The standard initialization method for K-means, usually simply referred to as the random method, starts by randomly assigning each data point to a cluster before proceeding with the first update step.
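For contrast with that plain random method, the k-means++ seeding idea (sample each subsequent center with probability proportional to its squared distance from the nearest center already chosen) can be sketched as follows. Again, this is a toy illustration, not Spark's parallel variant:

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """k-means++ seeding: spread the initial centers apart."""
    rng = np.random.default_rng(seed)
    # The first center is chosen uniformly at random.
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Squared distance from every point to its nearest chosen center.
        diff = points[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diff ** 2).sum(axis=2).min(axis=1)
        # The next center is sampled proportionally to those distances,
        # so far-away points are far more likely to be picked.
        centers.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    return np.array(centers)

points = np.array([[0.0], [0.1], [10.0], [10.1]])
init = kmeans_pp_init(points, k=2)
```

Because far-away points dominate the sampling weights, the two seeds will usually land in different groups, which is exactly what makes this seeding effective for hard cases like the bottom-right clusters above.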
Spark ML provides the smarter K-means++ seeding approach, along with a parallel variant of it called K-means||, which is the default initialization method used. Refer to http://en.wikipedia.org/wiki/K-means_clustering#Initialization_methods and http://en.wikipedia.org/wiki/K-means%2B%2B for more information. The results of using K-means++ are shown here. Note that this time, the difficult bottom-right points have been mostly correctly clustered.

Final cluster assignments for K-means++

Variants

There are many other variants of K-means; they focus on initialization methods or the core model. One of the more common variants is fuzzy K-means. This model does not assign each point to one cluster as K-means does (a so-called hard assignment). Instead, it is a soft version of K-means, where each point can belong to many clusters and is represented by its relative membership to each cluster. So, for K clusters, each point is represented as a K-dimensional membership vector, with each entry in this vector indicating the membership proportion in each cluster.

Mixture models

A mixture model is essentially an extension of the idea behind fuzzy K-means; however, it makes the assumption that there is an underlying probability distribution that generates the data. For example, we might assume that the data points are drawn from a set of K independent Gaussian (normal) probability distributions. The cluster assignments are also soft, so each point is represented by K membership weights in each of the K underlying probability distributions. Refer to http://en.wikipedia.org/wiki/Mixture_model for further details and for a mathematical treatment of mixture models.

Hierarchical clustering

Hierarchical clustering is a structured clustering approach that results in a multilevel hierarchy of clusters, where each cluster may contain many subclusters (or child clusters). Each child cluster is, thus, linked to its parent cluster. This form of clustering is often also called tree clustering.
Agglomerative clustering is a bottom-up approach where we have the following:

Each data point begins in its own cluster
The similarity (or distance) between each pair of clusters is evaluated
The pair of clusters that are most similar is found; this pair is then merged to form a new cluster
The process is repeated until only one top-level cluster remains

Divisive clustering is a top-down approach that works in reverse, starting with one cluster and, at each stage, splitting a cluster into two, until all data points are allocated to their own bottom-level cluster. You can find more information at http://en.wikipedia.org/wiki/Hierarchical_clustering.

Summary

In this article, we explored a new class of model that learns structure from unlabeled data: unsupervised learning. You learned about various clustering models, such as the K-means model, mixture models, and the hierarchical clustering model. We also considered a simple dataset to illustrate the basics of K-means.

Resources for Article: Further resources on this subject: Spark for Beginners [article] Setting up Spark [article] Holistic View on Spark [article]
Jean Jung
19 Jan 2017
7 min read

Background jobs on Django with Celery

While building web applications, you usually need to run some operations in the background, either to improve the application's performance or because a job really needs to run outside of the application environment. In both cases, if you are on Django, you are in good hands, because you have Celery, the Distributed Task Queue written in Python. Celery is a tiny but complete project; you can find more information on the project page. In this post, we will see how easy it is to integrate Celery with an existing project, and although we are focusing on Django here, creating a standalone Celery worker is a very similar process.

Installing Celery

The first step is installing Celery. If you already have it, please move on to the next section! Like every good Python package, Celery is distributed via pip. You can install it just by entering:

```shell
pip install celery
```

Choosing a message broker

The second step is choosing a message broker to act as the job queue. Celery can talk to a great variety of brokers; the main ones are:

RabbitMQ
Redis ¹
Amazon SQS ²

Check for support on other brokers here. If you're already using any of these brokers for other purposes, choose it as your primary option. In this section there is nothing more you have to do. Celery is very transparent and does not require any source modification to move from one broker to another, so feel free to try more than one after we are done here. But first, do not forget to look at the little notes below.

¹: For Redis (a great choice in my opinion), you have to install the celery[redis] package.
²: Celery has great features, like web monitoring, that do not work with this broker.

Celery worker entrypoint

When running Celery in a directory, it will search for a file called celery.py, which is the application entrypoint, where the configs are loaded and the application object resides.
Working with Django, this file is commonly stored in the project directory, along with the settings.py file; your file structure should look like this:

```
your_project_name
    your_project_name
        __init__.py
        settings.py
        urls.py
        wsgi.py
        celery.py
    your_app_name
        __init__.py
        models.py
        views.py
        ....
```

The settings read by that file will be in the same settings.py file that Django uses. At this point we can take a look at the official documentation's celery.py file example. This code is basically the same for every project; just replace proj with your project name and save the file. Each part is described in the file's comments.

```python
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

app = Celery('proj')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
# This is not required, but as you can have more than one app
# with tasks, it's better to do the autoload than to declare all
# tasks in this same file.
app.autodiscover_tasks()
```

Settings

By default, Celery depends only on the broker_url setting to work. As we've seen in the previous section, your settings will be stored alongside the Django ones, but with the 'CELERY_' prefix. The broker_url format is as follows:

CELERY_BROKER_URL = 'broker://[[user]:[password]@]host[:port[/resource]]'

Here, broker is an identifier that specifies the chosen broker, like amqp or redis; user and password are the authentication for the service, if needed; host and port are the addresses of the service; and resource is a broker-specific path to the component resource.
For example, if you've chosen a local Redis server as your broker, your broker URL will be:

CELERY_BROKER_URL = 'redis://localhost:6379/0' ¹

¹: Considering a default Redis installation, with database 0 being used.

With this, we have a functioning Celery worker. How lucky! It's so simple! But wait, what about the tasks? How do we write and execute them? Let's see.

Creating and running tasks

Because of the superpowers Celery has, it can autoload tasks from Django app directories, as we've seen before; you just have to declare your app's tasks in a file called tasks.py in the app directory:

```
your_project_name
    your_project_name
        __init__.py
        settings.py
        urls.py
        wsgi.py
        celery.py
    your_app_name
        __init__.py
        models.py
        views.py
        tasks.py
        ....
```

In that file you just need to put functions decorated with the celery.shared_task decorator. Suppose we want to build a background mailer; the source will look like this:

```python
from __future__ import absolute_import, unicode_literals
from celery import shared_task
from django.core.mail import send_mail

@shared_task
def mailer(subject, message, recipient_list, from_email='default@admin.com'):
    # Django's send_mail expects (subject, message, from_email, recipient_list);
    # note that "from" is a reserved word in Python, so the parameter is
    # named from_email instead.
    send_mail(subject, message, from_email, recipient_list)
```

Then, anywhere in the Django application where you have to send an e-mail in the background, just do the following:

```python
from __future__ import absolute_import
from app.tasks import mailer
....

def send_email_to_user(request):
    if request.user:
        mailer.delay('Alert Foo', 'The foo message', [request.user.email])
```

delay is probably the most used way to submit a job to a Celery worker, but it is not the only one. Check this reference to see what is possible to do; there are many features, like task chaining, future schedules, and more! As you may have noticed, in the great majority of these files we have used the from __future__ import absolute_import statement. This is very important, mainly with Python 2, because of the way Celery serializes messages to post tasks on brokers.
You need to follow the same convention when creating and using tasks; otherwise, the namespace of the task will differ and the task will not get executed. The absolute import module forces you to use absolute imports, so you will avoid these problems. Check this link for more information.

Running the worker

If you take the source code above, put everything in the right place, and run the Django development server to test your background jobs, they will not work! Wait. This is because you don't have a Celery worker started yet. To start one, cd to the project's main directory (the same one where you run python manage.py runserver, for example) and run:

```shell
celery -A your_project_name worker -l info
```

Replace your_project_name with your project name and info with the desired log level. Keep this process running, start the Django server, and yes: now you can see that everything works!

Where to go now?

Explore the Celery documentation and see all the available features, caveats, and help you can get from it. There is also an example project on the Celery GitHub page that you can use as a template for new projects, or as a guide for adding Celery to your existing project.

Summary

We've seen how to install and configure Celery to run alongside a new or existing Django project. We explored some of the broker options we have, and how simple it is to change between them. There are some hints about brokers that don't offer all of the features Celery has. We have seen an example of a mailer task, and how it was created and called from the Django application. Finally, I provided instructions to start the worker and get things done.

References

[1] - Django project documentation
[2] - Celery project documentation
[3] - Redis project page
[4] - RabbitMQ project page
[5] - Amazon SQS page

About the author

Jean Jung is a Brazilian developer passionate about technology. He is currently a system analyst at EBANX, an international payment processing company for Latin America.
He's very interested in Python and artificial intelligence, specifically machine learning, compilers, and operating systems. As a hobby, he's always looking for IoT projects with Arduino.

Packt
19 Jan 2017
12 min read

Normal maps

In this article by Raimondas Pupius, the author of the book Mastering SFML Game Development, we will learn about normal maps and specular maps. (For more resources related to this topic, see here.)

Lighting can be used to create visually complex and breath-taking scenes. One of the massive benefits of having a lighting system is the ability it provides to add extra details to your scene, which wouldn't have been possible otherwise. One way of doing so is using normal maps. Mathematically speaking, the word "normal" in the context of a surface is simply a directional vector that is perpendicular to the said surface. Consider the following illustration:

In this case, the normal is facing up because that's the direction perpendicular to the plane. How is this helpful? Well, imagine you have a really complex model with many vertices; it'd be extremely taxing to render the said model because of all the geometry that would need to be processed with each frame. A clever trick to work around this, known as normal mapping, is to take the information of all of those vertices and save them on a texture that looks similar to this one:

It probably looks extremely funky, especially if you are viewing it in grayscale, but try not to think of this in terms of colors, but directions. The red channel of a normal map encodes the -x and +x values. The green channel does the same for -y and +y values, and the blue channel is used for -z to +z. Looking back at the previous image, it's now easier to confirm which direction each individual pixel is facing. Using this information on geometry that's completely flat would still allow us to light it in such a way that it would make it look like it has all of the detail in there; yet, it would still remain flat and light on performance:

These normal maps can be hand-drawn or simply generated using software such as Crazybump. Let's see how all of this can be done in our game engine.
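The channel encoding described above is easy to verify numerically: an 8-bit color component c maps to the direction component c / 255 * 2 - 1. Here is a quick NumPy sketch of that decoding (illustrative only, not engine code):

```python
import numpy as np

def decode_normal(rgb):
    """Map an 8-bit RGB normal-map texel to a unit direction vector."""
    n = np.asarray(rgb, dtype=float) / 255.0 * 2.0 - 1.0
    return n / np.linalg.norm(n)

# The classic "flat" normal-map color (128, 128, 255) decodes to a
# vector pointing almost exactly straight out of the surface, along +z.
print(decode_normal((128, 128, 255)))
```

This is why untouched areas of a normal map are that characteristic light blue: a texel of (128, 128, 255) means "no deviation, the surface points straight out."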
Implementing normal map rendering

In the case of maps, implementing normal map rendering is extremely simple. We already have all the material maps integrated and ready to go, so at this point, it's simply a matter of sampling the texture of the tile-sheet normals:

```cpp
void Map::Redraw(sf::Vector3i l_from, sf::Vector3i l_to) {
  ...
  if (renderer->UseShader("MaterialPass")) {
    // Material pass.
    auto shader = renderer->GetCurrentShader();
    auto textureName = m_tileMap.GetTileSet().GetTextureName();
    auto normalMaterial = m_textureManager->
      GetResource(textureName + "_normal");
    for (auto x = l_from.x; x <= l_to.x; ++x) {
      for (auto y = l_from.y; y <= l_to.y; ++y) {
        for (auto layer = l_from.z; layer <= l_to.z; ++layer) {
          auto tile = m_tileMap.GetTile(x, y, layer);
          if (!tile) { continue; }
          auto& sprite = tile->m_properties->m_sprite;
          sprite.setPosition(
            static_cast<float>(x * Sheet::Tile_Size),
            static_cast<float>(y * Sheet::Tile_Size));
          // Normal pass.
          if (normalMaterial) {
            shader->setUniform("material", *normalMaterial);
            renderer->Draw(sprite, &m_normals[layer]);
          }
        }
      }
    }
  }
  ...
}
```

The process is exactly the same as drawing a tile to a diffuse map, except that here we have to provide the material shader with the texture of the tile-sheet normal map. Also note that we're now drawing to a normal buffer texture. The same is true for drawing entities as well:

```cpp
void S_Renderer::Draw(MaterialMapContainer& l_materials,
  Window& l_window, int l_layer)
{
  ...
  if (renderer->UseShader("MaterialPass")) {
    // Material pass.
    auto shader = renderer->GetCurrentShader();
    auto textures = m_systemManager->
      GetEntityManager()->GetTextureManager();
    for (auto &entity : m_entities) {
      auto position = entities->GetComponent<C_Position>(
        entity, Component::Position);
      if (position->GetElevation() < l_layer) { continue; }
      if (position->GetElevation() > l_layer) { break; }
      C_Drawable* drawable = GetDrawableFromType(entity);
      if (!drawable) { continue; }
      if (drawable->GetType() != Component::SpriteSheet) { continue; }
      auto sheet = static_cast<C_SpriteSheet*>(drawable);
      auto name = sheet->GetSpriteSheet()->GetTextureName();
      auto normals = textures->GetResource(name + "_normal");
      // Normal pass.
      if (normals) {
        shader->setUniform("material", *normals);
        drawable->Draw(&l_window,
          l_materials[MaterialMapType::Normal].get());
      }
    }
  }
  ...
}
```

You can try obtaining a normal texture through the texture manager. If you find one, you can draw it to the normal map material buffer. Dealing with particles isn't much different from what we've seen already, except for one little piece of detail:

```cpp
void ParticleSystem::Draw(MaterialMapContainer& l_materials,
  Window& l_window, int l_layer)
{
  ...
  if (renderer->UseShader("MaterialValuePass")) {
    // Material pass.
    auto shader = renderer->GetCurrentShader();
    for (size_t i = 0; i < container->m_countAlive; ++i) {
      if (l_layer >= 0) {
        if (positions[i].z < l_layer * Sheet::Tile_Size) { continue; }
        if (positions[i].z >= (l_layer + 1) * Sheet::Tile_Size) {
          continue;
        }
      } else if (positions[i].z <
                 Sheet::Num_Layers * Sheet::Tile_Size) {
        continue;
      }
      // Normal pass.
      shader->setUniform("material", sf::Glsl::Vec3(0.5f, 0.5f, 1.f));
      renderer->Draw(drawables[i],
        l_materials[MaterialMapType::Normal].get());
    }
  }
  ...
}
```

As you can see, we're actually using the material value shader in order to give the particles static normals, which always point straight at the camera.
A normal map buffer should look something like this after you render all the normal maps to it:

Changing the lighting shader

Now that we have all of this information, let's actually use it when calculating the illumination of the pixels inside the light pass shader:

```glsl
uniform sampler2D LastPass;
uniform sampler2D DiffuseMap;
uniform sampler2D NormalMap;
uniform vec3 AmbientLight;
uniform int LightCount;
uniform int PassNumber;

struct LightInfo {
  vec3 position;
  vec3 color;
  float radius;
  float falloff;
};

const int MaxLights = 4;
uniform LightInfo Lights[MaxLights];

void main()
{
  vec4 pixel = texture2D(LastPass, gl_TexCoord[0].xy);
  vec4 diffusepixel = texture2D(DiffuseMap, gl_TexCoord[0].xy);
  vec4 normalpixel = texture2D(NormalMap, gl_TexCoord[0].xy);
  vec3 PixelCoordinates =
    vec3(gl_FragCoord.x, gl_FragCoord.y, gl_FragCoord.z);
  vec4 finalPixel = gl_Color * pixel;
  vec3 viewDirection = vec3(0, 0, 1);
  if(PassNumber == 1) { finalPixel *= vec4(AmbientLight, 1.0); } // IF FIRST PASS ONLY!
  vec3 N = normalize(normalpixel.rgb * 2.0 - 1.0);
  for(int i = 0; i < LightCount; ++i) {
    vec3 L = Lights[i].position - PixelCoordinates;
    float distance = length(L);
    float d = max(distance - Lights[i].radius, 0);
    L /= distance;
    float attenuation = 1 / pow(d / Lights[i].radius + 1, 2);
    attenuation = (attenuation - Lights[i].falloff) /
      (1 - Lights[i].falloff);
    attenuation = max(attenuation, 0);
    float normalDot = max(dot(N, L), 0.0);
    finalPixel += (diffusepixel *
      ((vec4(Lights[i].color, 1.0) * attenuation))) * normalDot;
  }
  gl_FragColor = finalPixel;
}
```

First, the normal map texture needs to be passed to the shader as well as sampled, which is where the first two highlighted lines of code come in. Once this is done, for each light we're drawing on the screen, the normal direction vector is calculated. This is done by first making sure that it can go into the negative range and then normalizing it. A normalized vector only represents a direction.
Since the stored color channels range from 0 to 255 (0.0 to 1.0 once sampled in the shader), negative values cannot be directly represented. This is why we first bring the sampled normal into the right range by multiplying it by 2.0 and subtracting 1.0, remapping [0, 1] to [-1, 1]. A dot product is then calculated between the normal vector and the normalized L vector, which represents the direction from the pixel to the light. How much a pixel is lit up by a specific light is directly contingent upon this dot product, clamped by max() to the range 0.0 to 1.0. The dot product is an algebraic operation that takes two vectors and, for unit vectors, yields the cosine of the angle between them: a scalar between -1.0 and 1.0 that essentially represents how closely they align. We use this property to light pixels less and less as the angle between their normals and the light direction grows. Finally, the dot product is used again when calculating the final pixel value. The entire influence of the light is multiplied by it, which allows every pixel to be drawn differently, as if it had some underlying geometry pointing in a different direction. The last thing left to do now is to pass the normal map buffer to the shader in our C++ code: void LightManager::RenderScene() { ... if (renderer->UseShader("LightPass")) { // Light pass. ... shader->setUniform("NormalMap", m_materialMaps[MaterialMapType::Normal]->getTexture()); ... } ... } This effectively enables normal mapping and gives us beautiful results such as this: The leaves, the character, and pretty much everything in this image now look like they have definition, ridges, and crevices; everything is lit as if it had geometry, although it's paper-thin. Note the lines around each tile in this particular instance. This is one of the main reasons why normal maps for pixel art, such as tile sheets, shouldn't be automatically generated; a generator can sample the tiles adjacent to each tile and incorrectly add bevelled edges. 
Specular maps While normal maps provide us with the possibility to fake how bumpy a surface is, specular maps allow us to do the same with the shininess of a surface. This is what the same segment of the tile sheet we used as an example for a normal map looks like in a specular map: It's not as complex as a normal map since it only needs to store one value: the shininess factor. We can leave it up to each light to decide how much shine it will cast upon the scenery by letting it have its own values: struct LightBase { ... float m_specularExponent = 10.f; float m_specularStrength = 1.f; }; Adding support for specularity Similar to normal maps, we need to use the material pass shader to render to a specularity buffer texture: void Map::Redraw(sf::Vector3i l_from, sf::Vector3i l_to) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. ... auto specMaterial = m_textureManager->GetResource( textureName + "_specular"); for (auto x = l_from.x; x <= l_to.x; ++x) { for (auto y = l_from.y; y <= l_to.y; ++y) { for (auto layer = l_from.z; layer <= l_to.z; ++layer) { ... // Normal pass. // Specular pass. if (specMaterial) { shader->setUniform("material", *specMaterial); renderer->Draw(sprite, &m_speculars[layer]); } } } } } ... } The texture for specularity is once again attempted to be obtained; it is passed down to the material pass shader if found. The same is true when you render entities: void S_Renderer::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. ... for (auto &entity : m_entities) { ... // Normal pass. // Specular pass. if (specular) { shader->setUniform("material", *specular); drawable->Draw(&l_window, l_materials[MaterialMapType::Specular].get()); } } } ... } Particles, on the other hand, also use the material value pass shader: void ParticleSystem::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... 
if (renderer->UseShader("MaterialValuePass")) { // Material pass. auto shader = renderer->GetCurrentShader(); for (size_t i = 0; i < container->m_countAlive; ++i) { ... // Normal pass. // Specular pass. shader->setUniform("material", sf::Glsl::Vec3(0.f, 0.f, 0.f)); renderer->Draw(drawables[i], l_materials[MaterialMapType::Specular].get()); } } } For now, we don't want any of them to be specular at all. This can obviously be tweaked later on, but the important thing is that we have that functionality available and yielding results, such as the following: This specularity texture needs to be sampled inside a light-pass shader, just like a normal texture. Let's see what this involves. Changing the lighting shader Just as before, a uniform sampler2D needs to be added to sample the specularity of a particular fragment: uniform sampler2D LastPass; uniform sampler2D DiffuseMap; uniform sampler2D NormalMap; uniform sampler2D SpecularMap; uniform vec3 AmbientLight; uniform int LightCount; uniform int PassNumber; struct LightInfo { vec3 position; vec3 color; float radius; float falloff; float specularExponent; float specularStrength; }; const int MaxLights = 4; uniform LightInfo Lights[MaxLights]; const float SpecularConstant = 0.4; void main() { ... vec4 specularpixel = texture2D(SpecularMap, gl_TexCoord[0].xy); vec3 viewDirection = vec3(0, 0, 1); // Looking at positive Z. ... for(int i = 0; i < LightCount; ++i){ ... float specularLevel = 0.0; specularLevel = pow(max(0.0, dot(reflect(-L, N), viewDirection)), Lights[i].specularExponent * specularpixel.a) * SpecularConstant; vec3 specularReflection = Lights[i].color * specularLevel * specularpixel.rgb * Lights[i].specularStrength; finalPixel += (diffusepixel * ((vec4(Lights[i].color, 1.0) * attenuation)) + vec4(specularReflection, 1.0)) * normalDot; } gl_FragColor = finalPixel; } We also need to add in the specular exponent and strength to each light's struct, as it's now part of it. 
Once the specular pixel is sampled, we need to set up the direction of the camera as well. Since that's static, we can leave it as is in the shader. The specularity of the pixel is then calculated by taking into account the dot product between the pixel’s normal and the light, the color of the specular pixel itself, and the specular strength of the light. Note the use of a specular constant in the calculation. This is a value that can and should be tweaked in order to obtain best results, as 100% specularity rarely ever looks good. Then, all that's left is to make sure the specularity texture is also sent to the light-pass shader in addition to the light's specular exponent and strength values: void LightManager::RenderScene() { ... if (renderer->UseShader("LightPass")) { // Light pass. ... shader->setUniform("SpecularMap", m_materialMaps[MaterialMapType::Specular]->getTexture()); ... for (auto& light : m_lights) { ... shader->setUniform(id + ".specularExponent", light.m_specularExponent); shader->setUniform(id + ".specularStrength", light.m_specularStrength); ... } } } The result may not be visible right away, but upon closer inspection of moving a light stream, we can see that correctly mapped surfaces will have a glint that will move around with the light: While this is nearly perfect, there's still some room for improvement. Summary Lighting is a very powerful tool when used right. Different aspects of a material may be emphasized depending on the setup of the game level, additional levels of detail can be added in without too much overhead, and the overall aesthetics of the project will be leveraged to new heights. The full version of “Mastering SFML Game Development” offers all of this and more by not only utilizing normal and specular maps, but also using 3D shadow-mapping techniques to create Omni-directional point light shadows that breathe new life into the game world. 
Resources for Article: Further resources on this subject: Common Game Programming Patterns [article] Sprites in Action [article] Warfare Unleashed Implementing Gameplay [article]
Installing and running LXC
Packt
19 Jan 2017
22 min read
In this article by Konstantin Ivanov, the author of the book Containerization with LXC, we will see how to install and run LXC. LXC takes advantage of kernel namespaces and cgroups to create the process isolation we often refer to as containers. As such, LXC is not a separate software component in the Linux kernel, but rather a set of userspace tools, the liblxc library, and various language bindings. In this article, we are going to cover the following topics: Installing LXC on Ubuntu Building and starting containers using the provided templates and configuration files Showcasing various LXC operations (For more resources related to this topic, see here.) Installing LXC At the time of writing there are two long-term support versions of LXC: 1.0 and 2.0. The userspace tools that they provide have some minor differences in command-line flags and deprecations that I'll point out as we use them. Installing LXC on Ubuntu with apt Let's start by installing LXC 1.0 on Ubuntu 14.04 Trusty: Install the main LXC package, tooling and dependencies: root@ubuntu:~# lsb_release -dc Description: Ubuntu 14.04.5 LTS Codename: trusty root@ubuntu:~# apt-get install -y lxc bridge-utils debootstrap libcap-dev cgroup-bin libpam-systemd root@ubuntu:~# The package version that Trusty provides at this time is 1.0.8: root@ubuntu:~# dpkg --list | grep lxc | awk '{print $2,$3}' liblxc1 1.0.8-0ubuntu0.3 lxc 1.0.8-0ubuntu0.3 lxc-templates 1.0.8-0ubuntu0.3 python3-lxc 1.0.8-0ubuntu0.3 root@ubuntu:~# To install LXC 2.0 we'll need the backports repository: Add the following two lines to the apt sources file: root@ubuntu:~# vim /etc/apt/sources.list deb http://archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse deb-src http://archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse Resynchronize the package index files from their sources: root@ubuntu:~# apt-get update Install the main LXC package, tooling and dependencies: 
root@ubuntu:~# apt-get install -y lxc=2.0.3-0ubuntu1~ubuntu14.04.1 lxc1=2.0.3-0ubuntu1~ubuntu14.04.1 liblxc1=2.0.3-0ubuntu1~ubuntu14.04.1 python3-lxc=2.0.3-0ubuntu1~ubuntu14.04.1 cgroup-lite=1.11~ubuntu14.04.2 lxc-templates=2.0.3-0ubuntu1~ubuntu14.04.1 bridge-utils root@ubuntu:~# Ensure the package versions are on the 2.x branch, in this case 2.0.3: root@ubuntu:~# dpkg --list | grep lxc | awk '{print $2,$3}' liblxc1 2.0.3-0ubuntu1~ubuntu14.04.1 lxc 2.0.3-0ubuntu1~ubuntu14.04.1 lxc-common 2.0.3-0ubuntu1~ubuntu14.04.1 lxc-templates 2.0.3-0ubuntu1~ubuntu14.04.1 lxc1 2.0.3-0ubuntu1~ubuntu14.04.1 lxcfs 2.0.2-0ubuntu1~ubuntu14.04.1 python3-lxc 2.0.3-0ubuntu1~ubuntu14.04.1 root@ubuntu:~# LXC directory installation layout The following table shows the directory layout of LXC that is created after package and source installation. The directories vary depending on distribution and installation method.

Ubuntu package | CentOS package | Source installation | Description
/usr/share/lxc | /usr/share/lxc | /usr/local/share/lxc | LXC base directory
/usr/share/lxc/config | /usr/share/lxc/config | /usr/local/share/lxc/config | Collection of distribution-based LXC configuration files
/usr/share/lxc/templates | /usr/share/lxc/templates | /usr/local/share/lxc/templates | Collection of container template scripts
/usr/bin | /usr/bin | /usr/local/bin | Location for most LXC binaries
/usr/lib/x86_64-linux-gnu | /usr/lib64 | /usr/local/lib | Location of the liblxc libraries
/etc/lxc | /etc/lxc | /usr/local/etc/lxc | Location of default LXC config files
/var/lib/lxc/ | /var/lib/lxc/ | /usr/local/var/lib/lxc/ | Location of the root filesystem and config for created containers
/var/log/lxc | /var/log/lxc | /usr/local/var/log/lxc | LXC log files

We will explore most of these directories while building, starting, and terminating LXC containers. 
Building and manipulating LXC containers Managing the container life cycle with the provided userspace tools is quite convenient compared to manually creating namespaces and applying resource limits with cgroups. In essence, this is exactly what the LXC tools do: they create and manipulate the namespaces and cgroups through calls to the liblxc API. LXC comes packaged with various templates for building root file systems for different Linux distributions. We can use them to create a variety of container flavors. Building our first container We can create our first container by using a template. The lxc-download file, like the rest of the templates in the templates directory, is a script written in bash: root@ubuntu:~# ls -la /usr/share/lxc/templates/ drwxr-xr-x 2 root root 4096 Aug 29 20:03 . drwxr-xr-x 6 root root 4096 Aug 29 19:58 .. -rwxr-xr-x 1 root root 10557 Nov 18 2015 lxc-alpine -rwxr-xr-x 1 root root 13534 Nov 18 2015 lxc-altlinux -rwxr-xr-x 1 root root 10556 Nov 18 2015 lxc-archlinux -rwxr-xr-x 1 root root 9878 Nov 18 2015 lxc-busybox -rwxr-xr-x 1 root root 29149 Nov 18 2015 lxc-centos -rwxr-xr-x 1 root root 10486 Nov 18 2015 lxc-cirros -rwxr-xr-x 1 root root 17354 Nov 18 2015 lxc-debian -rwxr-xr-x 1 root root 17757 Nov 18 2015 lxc-download -rwxr-xr-x 1 root root 49319 Nov 18 2015 lxc-fedora -rwxr-xr-x 1 root root 28253 Nov 18 2015 lxc-gentoo -rwxr-xr-x 1 root root 13962 Nov 18 2015 lxc-openmandriva -rwxr-xr-x 1 root root 14046 Nov 18 2015 lxc-opensuse -rwxr-xr-x 1 root root 35540 Nov 18 2015 lxc-oracle -rwxr-xr-x 1 root root 11868 Nov 18 2015 lxc-plamo -rwxr-xr-x 1 root root 6851 Nov 18 2015 lxc-sshd -rwxr-xr-x 1 root root 23494 Nov 18 2015 lxc-ubuntu -rwxr-xr-x 1 root root 11349 Nov 18 2015 lxc-ubuntu-cloud root@ubuntu:~# If you examine the scripts closely you'll notice that most of them create chroot environments, where packages and various configuration files are then installed to create the root filesystem for the selected distribution. 
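To make that structure concrete, here is a heavily stripped-down sketch of what such a template does. This is illustrative only and not one of the shipped templates; real templates parse the --path, --name, and --rootfs options that lxc-create passes in, and bootstrap a full distribution instead of writing a single file. The demonstration below runs against throwaway temp directories standing in for /var/lib/lxc:

```shell
#!/bin/bash
# Illustrative template skeleton (an assumption for demonstration,
# not a shipped LXC template).

install_rootfs() {
    # Stand-in for debootstrap/yum: create a minimal directory tree.
    local rootfs=$1
    mkdir -p "$rootfs"/bin "$rootfs"/etc "$rootfs"/dev "$rootfs"/proc "$rootfs"/sys
    echo "template-demo" > "$rootfs/etc/hostname"
}

copy_configuration() {
    # Append container settings to the config file lxc-create created.
    local path=$1 name=$2
    cat >> "$path/config" <<EOF
lxc.utsname = $name
EOF
}

# Demonstration against throwaway directories instead of /var/lib/lxc:
rootfs=$(mktemp -d)
path=$(mktemp -d)
install_rootfs "$rootfs"
copy_configuration "$path" "demo"
cat "$rootfs/etc/hostname"    # → template-demo
grep utsname "$path/config"   # → lxc.utsname = demo
```

The real lxc-ubuntu and lxc-download templates follow this same two-phase shape: build the root filesystem, then write out the container configuration.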
Let's start by building a container using the lxc-download template, which will ask for the distribution, release and architecture, then use the appropriate template to create the file system and configuration for us: root@ubuntu:~# lxc-create -t download -n c1 Setting up the GPG keyring Downloading the image index --- DIST RELEASE ARCH VARIANT BUILD --- centos 6 amd64 default 20160831_02:16 centos 6 i386 default 20160831_02:16 centos 7 amd64 default 20160831_02:16 debian jessie amd64 default 20160830_22:42 debian jessie arm64 default 20160824_22:42 debian jessie armel default 20160830_22:42 ... ubuntu trusty amd64 default 20160831_03:49 ubuntu trusty arm64 default 20160831_07:50 ubuntu yakkety s390x default 20160831_03:49 --- Distribution: ubuntu Release: trusty Architecture: amd64 Unpacking the rootfs --- You just created an Ubuntu container (release=trusty, arch=amd64, variant=default) To enable sshd, run: apt-get install openssh-server For security reason, container images ship without user accounts and without a root password. Use lxc-attach or chroot directly into the rootfs to set a root password or create user accounts. root@ubuntu:~# Let's list all containers: root@ubuntu:~# lxc-ls -f NAME STATE IPV4 IPV6 AUTOSTART ---------------------------------------------------- c1 STOPPED - - NO root@nova-perf:~# Depending on the version of LXC some of the command options might be different, read the man page for each of the tools if you encounter errors Our container is currently not running, let's start it in the background and increase the log level to DEBUG: root@ubuntu:~# lxc-start -n c1 -d -l DEBUG On some distributions LXC does not create the host bridge when building the first container, which results in an error. 
If this happens you can create it by running: brctl addbr virbr0 root@ubuntu:~# lxc-ls -f NAME STATE IPV4 IPV6 AUTOSTART ---------------------------------------------------------- c1 RUNNING 10.0.3.190 - NO root@ubuntu:~# To obtain more information about the container run: root@ubuntu:~# lxc-info -n c1 Name: c1 State: RUNNING PID: 29364 IP: 10.0.3.190 CPU use: 1.46 seconds BlkIO use: 112.00 KiB Memory use: 6.34 MiB KMem use: 0 bytes Link: vethVRD8T2 TX bytes: 4.28 KiB RX bytes: 4.43 KiB Total bytes: 8.70 KiB root@ubuntu:~# The new container is now connected to the host bridge lxcbr0: root@ubuntu:~# brctl show bridge name bridge id STP enabled interfaces lxcbr0 8000.fea50feb48ac no vethVRD8T2 root@ubuntu:~# ip a s lxcbr0 4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether fe:a5:0f:eb:48:ac brd ff:ff:ff:ff:ff:ff inet 10.0.3.1/24 brd 10.0.3.255 scope global lxcbr0 valid_lft forever preferred_lft forever inet6 fe80::465:64ff:fe49:5fb5/64 scope link valid_lft forever preferred_lft forever root@ubuntu:~# ip a s vethVRD8T2 8: vethVRD8T2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master lxcbr0 state UP group default qlen 1000 link/ether fe:a5:0f:eb:48:ac brd ff:ff:ff:ff:ff:ff inet6 fe80::fca5:fff:feeb:48ac/64 scope link valid_lft forever preferred_lft forever root@ubuntu:~# By using the download template and not specifying any network settings, the container obtains its IP address from a dnsmasq server that runs on a private network, 10.0.3.0/24 in this case. 
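If no dnsmasq server is running on the host, the container can be given a static address and gateway in its config file instead. A minimal sketch using the LXC 1.x/2.0 network keys; the address, gateway, and bridge name here are assumptions to adjust for your own setup:

```
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.ipv4 = 10.0.3.50/24
lxc.network.ipv4.gateway = 10.0.3.1
```

These lines go in /var/lib/lxc/<name>/config; with them in place the container no longer depends on DHCP for its address.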
The host allows the container to connect to the rest of the network and the Internet by using NAT rules in iptables: root@ubuntu:~# iptables -L -n -t nat Chain PREROUTING (policy ACCEPT) target prot opt source destination Chain INPUT (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination MASQUERADE all -- 10.0.3.0/24 !10.0.3.0/24 root@ubuntu:~# Other containers connected to the bridge will have access to each other and to the host, as long as they are all connected to the same bridge and are not tagged with different VLAN IDs. Let's see what the process tree looks like after starting the container: root@ubuntu:~# ps axfww … 1552 ? S 0:00 dnsmasq -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --conf-file= --listen-address 10.0.3.1 --dhcp-range 10.0.3.2,10.0.3.254 --dhcp-lease-max=253 --dhcp-no-override --except-interface=lo --interface=lxcbr0 --dhcp-leasefile=/var/lib/misc/dnsmasq.lxcbr0.leases --dhcp-authoritative 29356 ? Ss 0:00 lxc-start -n c1 -d -l DEBUG 29364 ? Ss 0:00 _ /sbin/init 29588 ? S 0:00 _ upstart-udev-bridge --daemon 29597 ? Ss 0:00 _ /lib/systemd/systemd-udevd --daemon 29667 ? Ssl 0:00 _ rsyslogd 29688 ? S 0:00 _ upstart-file-bridge --daemon 29690 ? S 0:00 _ upstart-socket-bridge --daemon 29705 ? Ss 0:00 _ dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0 29775 pts/6 Ss+ 0:00 _ /sbin/getty -8 38400 tty4 29777 pts/1 Ss+ 0:00 _ /sbin/getty -8 38400 tty2 29778 pts/5 Ss+ 0:00 _ /sbin/getty -8 38400 tty3 29787 ? Ss 0:00 _ cron 29827 pts/7 Ss+ 0:00 _ /sbin/getty -8 38400 console 29829 pts/0 Ss+ 0:00 _ /sbin/getty -8 38400 tty1 root@ubuntu:~# Notice the new init child process that was cloned from the lxc-start command. This is PID 1 in the actual container. 
Next, let's attach to the container, list all processes, network interfaces and check connectivity: root@ubuntu:~# lxc-attach -n c1 root@c1:~# ps axfw PID TTY STAT TIME COMMAND 1 ? Ss 0:00 /sbin/init 176 ? S 0:00 upstart-udev-bridge --daemon 185 ? Ss 0:00 /lib/systemd/systemd-udevd --daemon 255 ? Ssl 0:00 rsyslogd 276 ? S 0:00 upstart-file-bridge --daemon 278 ? S 0:00 upstart-socket-bridge --daemon 293 ? Ss 0:00 dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0 363 lxc/tty4 Ss+ 0:00 /sbin/getty -8 38400 tty4 365 lxc/tty2 Ss+ 0:00 /sbin/getty -8 38400 tty2 366 lxc/tty3 Ss+ 0:00 /sbin/getty -8 38400 tty3 375 ? Ss 0:00 cron 415 lxc/console Ss+ 0:00 /sbin/getty -8 38400 console 417 lxc/tty1 Ss+ 0:00 /sbin/getty -8 38400 tty1 458 ? S 0:00 /bin/bash 468 ? R+ 0:00 ps ax root@c1:~# ip a s 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 7: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 00:16:3e:b2:34:8a brd ff:ff:ff:ff:ff:ff inet 10.0.3.190/24 brd 10.0.3.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::216:3eff:feb2:348a/64 scope link valid_lft forever preferred_lft forever root@c1:~# ping -c 3 google.com PING google.com (216.58.192.238) 56(84) bytes of data. 
64 bytes from ord30s26-in-f14.1e100.net (216.58.192.238): icmp_seq=1 ttl=52 time=1.77 ms 64 bytes from ord30s26-in-f14.1e100.net (216.58.192.238): icmp_seq=2 ttl=52 time=1.58 ms 64 bytes from ord30s26-in-f14.1e100.net (216.58.192.238): icmp_seq=3 ttl=52 time=1.75 ms --- google.com ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2003ms rtt min/avg/max/mdev = 1.584/1.705/1.779/0.092 ms root@c1:~# exit exit root@ubuntu:~# On some distributions like CentOS, or if installed from source, the dnsmasq server is not configured and started by default. You can either install it and configure it manually, or configure the container with an IP address and a default gateway instead, as I demonstrate later in this article. Notice how the hostname changed on the terminal once we attached to the container. This is an example of how LXC uses the UTS namespaces. Let's examine the directory that was created after building the c1 container: root@ubuntu:~# ls -la /var/lib/lxc/c1/ total 16 drwxrwx--- 3 root root 4096 Aug 31 20:40 . drwx------ 3 root root 4096 Aug 31 21:01 .. -rw-r--r-- 1 root root 516 Aug 31 20:40 config drwxr-xr-x 21 root root 4096 Aug 31 21:00 rootfs root@ubuntu:~# The rootfs directory looks like a regular Linux filesystem. You can manipulate the container directly by making changes to the files there, or by using chroot. To demonstrate this, let's change the root password of the c1 container not by attaching to it, but by using chroot rootfs: root@ubuntu:~# cd /var/lib/lxc/c1/ root@ubuntu:/var/lib/lxc/c1# chroot rootfs root@ubuntu:/# ls -al total 84 drwxr-xr-x 21 root root 4096 Aug 31 21:00 . drwxr-xr-x 21 root root 4096 Aug 31 21:00 .. 
drwxr-xr-x 2 root root 4096 Aug 29 07:33 bin drwxr-xr-x 2 root root 4096 Apr 10 2014 boot drwxr-xr-x 4 root root 4096 Aug 31 21:00 dev drwxr-xr-x 68 root root 4096 Aug 31 22:12 etc drwxr-xr-x 3 root root 4096 Aug 29 07:33 home drwxr-xr-x 12 root root 4096 Aug 29 07:33 lib drwxr-xr-x 2 root root 4096 Aug 29 07:32 lib64 drwxr-xr-x 2 root root 4096 Aug 29 07:31 media drwxr-xr-x 2 root root 4096 Apr 10 2014 mnt drwxr-xr-x 2 root root 4096 Aug 29 07:31 opt drwxr-xr-x 2 root root 4096 Apr 10 2014 proc drwx------ 2 root root 4096 Aug 31 22:12 root drwxr-xr-x 8 root root 4096 Aug 31 20:54 run drwxr-xr-x 2 root root 4096 Aug 29 07:33 sbin drwxr-xr-x 2 root root 4096 Aug 29 07:31 srv drwxr-xr-x 2 root root 4096 Mar 13 2014 sys drwxrwxrwt 2 root root 4096 Aug 31 22:12 tmp drwxr-xr-x 10 root root 4096 Aug 29 07:31 usr drwxr-xr-x 11 root root 4096 Aug 29 07:31 var root@ubuntu:/# passwd Enter new UNIX password: Retype new UNIX password: passwd: password updated successfully root@ubuntu:/# exit exit root@ubuntu:/var/lib/lxc/c1# Notice how the path changed on the console when we used chroot and after exiting the jailed environment. To test the root password, let's install an SSH server in the container by first attaching to it and then using ssh to connect: root@ubuntu:~# lxc-attach -n c1 root@c1:~# apt-get update && apt-get install -y openssh-server root@c1:~# sed -i 's/without-password/yes/g' /etc/ssh/sshd_config root@c1:~# service ssh restart root@c1:/# exit exit root@ubuntu:/var/lib/lxc/c1# ssh 10.0.3.190 root@10.0.3.190's password: Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 3.13.0-91-generic x86_64) * Documentation: https://help.ubuntu.com/ Last login: Wed Aug 31 22:25:39 2016 from 10.0.3.1 root@c1:~# exit logout Connection to 10.0.3.190 closed. root@ubuntu:/var/lib/lxc/c1# We were able to ssh to the container and use the root password that was manually set earlier. Autostarting LXC containers By default LXC containers do not start after a server reboot. 
To change that, we can use the lxc-autostart tool and the container's configuration file. To demonstrate this, let's create a new container first: root@ubuntu:~# lxc-create --name autostart_container --template ubuntu root@ubuntu:~# lxc-ls -f NAME STATE AUTOSTART GROUPS IPV4 IPV6 autostart_container STOPPED 0 - - - root@ubuntu:~# Next, add the lxc.start.auto stanza to its config file: root@ubuntu:~# echo "lxc.start.auto = 1" >> /var/lib/lxc/autostart_container/config root@ubuntu:~# List all containers that are configured to start automatically: root@ubuntu:~# lxc-autostart --list autostart_container root@ubuntu:~# Now we can use the lxc-autostart command again to start all containers configured to autostart, in this case just one: root@ubuntu:~# lxc-autostart --all root@ubuntu:~# lxc-ls -f NAME STATE AUTOSTART GROUPS IPV4 IPV6 autostart_container RUNNING 1 - 10.0.3.98 - root@ubuntu:~# Two other useful autostart configuration parameters are adding a delay to the start and defining a group in which multiple containers can start as a single unit. Stop the container and add the following configuration options: root@ubuntu:~# lxc-stop --name autostart_container root@ubuntu:~# echo "lxc.start.delay = 5" >> /var/lib/lxc/autostart_container/config root@ubuntu:~# echo "lxc.group = high_priority" >> /var/lib/lxc/autostart_container/config root@ubuntu:~# Next, let's list the containers configured to autostart again: root@ubuntu:~# lxc-autostart --list root@ubuntu:~# Notice that no containers show in the preceding output. This is because our container now belongs to an autostart group. 
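Taken together, the three echo commands above leave the following autostart-related lines in /var/lib/lxc/autostart_container/config:

```
lxc.start.auto = 1
lxc.start.delay = 5
lxc.group = high_priority
```

Keeping all three together in the config file makes it easy to see at a glance how, when, and with which group a container will come up.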
Let's specify it: root@ubuntu:~# lxc-autostart --list --group high_priority autostart_container 5 root@ubuntu:~# Similarly, to start all containers belonging to a given autostart group: root@ubuntu:~# lxc-autostart --group high_priority root@ubuntu:~# lxc-ls -f NAME STATE AUTOSTART GROUPS IPV4 IPV6 autostart_container RUNNING 1 high_priority 10.0.3.98 - root@ubuntu:~# In order for lxc-autostart to automatically start containers after a server reboot, it first needs to be run at boot. This can be achieved by either adding the preceding command to crontab, or by creating an init script. Finally, in order to clean up, run: root@ubuntu:~# lxc-destroy --name autostart_container Destroyed container autostart_container root@ubuntu:~# lxc-ls -f root@ubuntu:~# LXC container hooks LXC provides a convenient way to execute programs during the life cycle of containers. The following table summarizes the various configuration options available to allow for this feature:

Option | Description
lxc.hook.pre-start | A hook to be run in the host namespace before the container ttys, consoles, or mounts are loaded.
lxc.hook.pre-mount | A hook to be run in the container's filesystem namespace, but before the rootfs has been set up.
lxc.hook.mount | A hook to be run in the container after mounting has been done, but before the pivot_root.
lxc.hook.autodev | A hook to be run in the container after mounting has been done and after any mount hooks have run, but before the pivot_root.
lxc.hook.start | A hook to be run in the container right before executing the container's init.
lxc.hook.stop | A hook to be run in the host's namespace after the container has been shut down.
lxc.hook.post-stop | A hook to be run in the host's namespace after the container has been shut down.
lxc.hook.clone | A hook to be run when the container is cloned.
lxc.hook.destroy | A hook to be run when the container is destroyed.
To demonstrate this, let's create a new container and write a simple script that will output the values of four LXC variables to a file, during container start. First, create the container and add the lxc.hook.pre-start option to its configuration file: root@ubuntu:~# lxc-create --name hooks_container --template ubuntu root@ubuntu:~# echo "lxc.hook.pre-start = /var/lib/lxc/hooks_container/pre_start.sh" >> /var/lib/lxc/hooks_container/config root@ubuntu:~# Next, create a simple bash script and make it executable: root@ubuntu:~# cat /var/lib/lxc/hooks_container/pre_start.sh #!/bin/bash LOG_FILE=/tmp/container.log echo "Container name: $LXC_NAME" | tee -a $LOG_FILE echo "Container mounted rootfs: $LXC_ROOTFS_MOUNT" | tee -a $LOG_FILE echo "Container config file $LXC_CONFIG_FILE" | tee -a $LOG_FILE echo "Container rootfs: $LXC_ROOTFS_PATH" | tee -a $LOG_FILE root@ubuntu:~# root@ubuntu:~# chmod u+x /var/lib/lxc/hooks_container/pre_start.sh root@ubuntu:~# Start the container and check the contents of the file that the bash script should have written to, ensuring the script got triggered: root@ubuntu:~# lxc-start --name hooks_container root@ubuntu:~# lxc-ls -f NAME STATE AUTOSTART GROUPS IPV4 IPV6 hooks_container RUNNING 0 - 10.0.3.237 - root@ubuntu:~# cat /tmp/container.log Container name: hooks_container Container mounted rootfs: /usr/lib/x86_64-linux-gnu/lxc Container config file /var/lib/lxc/hooks_container/config Container rootfs: /var/lib/lxc/hooks_container/rootfs root@ubuntu:~# From the preceding output we can see that the script got triggered when we started the container and the value of the LXC variables got written to the temp file. Attaching directories from the host OS and exploring the running filesystem of a container The root filesystem of LXC containers is visible from the host OS as a regular directory tree. We can directly manipulate files in a running container by just making changes in that directory. 
LXC also allows for attaching directories from the host OS inside of the container using a bind mount. A bind mount is a different view of the directory tree. It achieves this by replicating the existing directory tree under a different mount point. To demonstrate this, let's create a new container, a directory, and a file on the host: root@ubuntu:~# mkdir /tmp/export_to_container root@ubuntu:~# hostname -f > /tmp/export_to_container/file root@ubuntu:~# lxc-create --name mount_container --template ubuntu root@ubuntu:~# Next, we are going to use the lxc.mount.entry option in the configuration file of the container, telling LXC what directory to bind mount from the host and the mount point inside the container to bind to: root@ubuntu:~# echo "lxc.mount.entry = /tmp/export_to_container/ /var/lib/lxc/mount_container/rootfs/mnt none ro,bind 0 0" >> /var/lib/lxc/mount_container/config root@ubuntu:~# Once the container is started we can see that the /mnt inside of it now contains the file that we created in the /tmp/export_to_container directory on the host OS earlier: root@ubuntu:~# lxc-start --name mount_container root@ubuntu:~# lxc-attach --name mount_container root@mount_container:~# cat /mnt/file ubuntu root@mount_container:~# exit exit root@ubuntu:~# When an LXC container is in a running state some files are only visible from /proc on the host OS. To examine the running directory of a container, first grab its PID: root@ubuntu:~# lxc-info --name mount_container Name: mount_container State: RUNNING PID: 8594 IP: 10.0.3.237 CPU use: 1.96 seconds BlkIO use: 212.00 KiB Memory use: 8.50 MiB KMem use: 0 bytes Link: vethBXR2HO TX bytes: 4.74 KiB RX bytes: 4.73 KiB Total bytes: 9.46 KiB root@ubuntu:~# With the PID in hand we can examine the running directory of the container: root@ubuntu:~# ls -la /proc/8594/root/run/ total 44 drwxr-xr-x 10 root root 420 Sep 14 23:28 . drwxr-xr-x 21 root root 4096 Sep 14 23:28 .. 
-rw-r--r-- 1 root root 4 Sep 14 23:28 container_type -rw-r--r-- 1 root root 5 Sep 14 23:28 crond.pid ---------- 1 root root 0 Sep 14 23:28 crond.reboot -rw-r--r-- 1 root root 5 Sep 14 23:28 dhclient.eth0.pid drwxrwxrwt 2 root root 40 Sep 14 23:28 lock -rw-r--r-- 1 root root 112 Sep 14 23:28 motd.dynamic drwxr-xr-x 3 root root 180 Sep 14 23:28 network drwxr-xr-x 3 root root 100 Sep 14 23:28 resolvconf -rw-r--r-- 1 root root 5 Sep 14 23:28 rsyslogd.pid drwxr-xr-x 2 root root 40 Sep 14 23:28 sendsigs.omit.d drwxrwxrwt 2 root root 40 Sep 14 23:28 shm drwxr-xr-x 2 root root 40 Sep 14 23:28 sshd -rw-r--r-- 1 root root 5 Sep 14 23:28 sshd.pid drwxr-xr-x 2 root root 80 Sep 14 23:28 udev -rw-r--r-- 1 root root 5 Sep 14 23:28 upstart-file-bridge.pid -rw-r--r-- 1 root root 4 Sep 14 23:28 upstart-socket-bridge.pid -rw-r--r-- 1 root root 5 Sep 14 23:28 upstart-udev-bridge.pid drwxr-xr-x 2 root root 40 Sep 14 23:28 user -rw-rw-r-- 1 root utmp 2688 Sep 14 23:28 utmp root@ubuntu:~# Make sure you replace the PID with the output of lxc-info from your host, as it will differ from the above example. In order to make persistent changes in the root filesystem of a container, modify the files in /var/lib/lxc/mount_container/rootfs/ instead. Freezing a running container LXC takes advantage of the freezer cgroup to freeze all the processes running inside of a container. The processes will be in a blocked state until thawed. Freezing a container can be useful in cases where the system load is high and you want to free some resources without actually stopping the container and preserve its running state. Ensure you have a running container and check its state from the freezer cgroup: root@ubuntu:~# lxc-ls -f NAME STATE AUTOSTART GROUPS IPV4 IPV6 hooks_container RUNNING 0 - 10.0.3 root@ubuntu:~# cat /sys/fs/cgroup/freezer/lxc/hooks_container/freezer.state THAWED root@ubuntu:~# Notice how a currently running container shows as thawed. 
Let's freeze it:
root@ubuntu:~# lxc-freeze -n hooks_container
root@ubuntu:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
hooks_container FROZEN 0 - 10.0.3.237 -
root@ubuntu:~#
The container state shows as frozen; let's check the cgroup file:
root@ubuntu:~# cat /sys/fs/cgroup/freezer/lxc/hooks_container/freezer.state
FROZEN
root@ubuntu:~#
To unfreeze it, run:
root@ubuntu:~# lxc-unfreeze --name hooks_container
root@ubuntu:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
hooks_container RUNNING 0 - 10.0.3.237 -
root@ubuntu:~# cat /sys/fs/cgroup/freezer/lxc/hooks_container/freezer.state
THAWED
root@ubuntu:~#
We can monitor the state change by running the lxc-monitor command on a separate console while freezing and unfreezing a container. The change of the container's state will show as the following:
root@ubuntu:~# lxc-monitor --name hooks_container
'hooks_container' changed state to [FREEZING]
'hooks_container' changed state to [FROZEN]
'hooks_container' changed state to [THAWED]
Limiting container resource usage
LXC comes with tools for limiting resource usage that are just as straightforward and easy to use. Let's start by limiting the memory available to a container to 512 MB:
root@ubuntu:~# lxc-cgroup -n hooks_container memory.limit_in_bytes 536870912
root@ubuntu:~#
We can verify that the new setting has been applied by directly inspecting the memory cgroup for the container:
root@ubuntu:~# cat /sys/fs/cgroup/memory/lxc/hooks_container/memory.limit_in_bytes
536870912
root@ubuntu:~#
Changing the value only requires running the same command again.
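The values passed to memory.limit_in_bytes are plain byte counts. A small helper function (an illustrative sketch, not part of LXC itself) makes them easier to compute:

```shell
# Convert a megabyte count to the byte value expected by memory.limit_in_bytes
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 512   # 536870912, the 512 MB limit set above
mb_to_bytes 256   # 268435456
```

The computed value can then be passed straight to lxc-cgroup, for example: lxc-cgroup -n hooks_container memory.limit_in_bytes "$(mb_to_bytes 512)".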
Let's change the available memory to 256 MB and inspect the container by attaching to it and running the free utility:
root@ubuntu:~# lxc-cgroup -n hooks_container memory.limit_in_bytes 268435456
root@ubuntu:~# cat /sys/fs/cgroup/memory/lxc/hooks_container/memory.limit_in_bytes
268435456
root@ubuntu:~# lxc-attach --name hooks_container
root@hooks_container:~# free -m
total used free shared buffers cached
Mem: 256 63 192 0 0 54
-/+ buffers/cache: 9 246
Swap: 0 0 0
root@hooks_container:~# exit
root@ubuntu:~#
As the preceding output shows, the container only sees 256 MB of total available memory. Similarly, we can pin a CPU core to the container. In the next example, our test server has two cores. Let's allow the container to only run on core 0:
root@ubuntu:~# cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
root@ubuntu:~#
root@ubuntu:~# lxc-cgroup -n hooks_container cpuset.cpus 0
root@ubuntu:~# cat /sys/fs/cgroup/cpuset/lxc/hooks_container/cpuset.cpus
0
root@ubuntu:~# lxc-attach --name hooks_container
root@hooks_container:~# cat /proc/cpuinfo | grep processor
processor : 0
root@hooks_container:~# exit
exit
root@ubuntu:~#
By attaching to the container and checking the available CPUs, we see that only one is presented, as expected. To make changes persist across server reboots, we need to add them to the configuration file of the container:
root@ubuntu:~# echo "lxc.cgroup.memory.limit_in_bytes = 536870912" >> /var/lib/lxc/hooks_container/config
root@ubuntu:~#
Setting various other cgroup parameters is done in a similar way. For example, let's set the CPU shares and the block IO weight on a container:
root@ubuntu:~# lxc-cgroup -n hooks_container cpu.shares 512
root@ubuntu:~# lxc-cgroup -n hooks_container blkio.weight 500
root@ubuntu:~#
Summary
In this article, we demonstrated how to install LXC, build containers using the provided templates, and perform some basic operations to manage the instances.
Oliver Blumanski
18 Jan 2017
5 min read
Using the Firebase Real-Time Database

In this post, we are going to look at how to use the Firebase real-time database, along with an example. We will write data to and read data from the database across multiple platforms. To do this, we first need a server script that adds data, and second, a component that pulls the data from the Firebase database.
Step 1 - Server Script to collect data
Digest an XML feed and transfer the data into the Firebase real-time database. The script runs as a cron job to refresh the data frequently.
Step 2 - App Component
Subscribe to the data from a JavaScript component, in this case, React-Native.
About Firebase
Now that those two steps are outlined, let's take a step back and talk about Google Firebase. Firebase offers a range of services such as a real-time database, authentication, cloud notifications, storage, and much more. You can find the full feature list here. Firebase covers three platforms: iOS, Android, and Web. The server script uses the Firebase JavaScript Web API. Having data in this real-time database allows us to query it from all three platforms (iOS, Android, Web), and in addition, the real-time database allows us to subscribe (listen) to a database path (query), or to query a path once.
Step 1 - Digest XML feed and transfer into Firebase
Firebase Set Up
The first thing you need to do is to set up a Google Firebase project here. In the app, click on "Add another App" and choose Web; a pop-up will show you the configuration. You can copy and paste your config into the example script. Now you need to set the rules for your Firebase database. You should make yourself familiar with the database access rules. In my example, the path latestMarkets/ is open for write and read. In a real-world production app, you would have to secure this, requiring authentication for write permissions.
Here are the database rules to get started:
{ "rules": { "users": { "$uid": { ".read": "$uid === auth.uid", ".write": "$uid === auth.uid" } }, "latestMarkets": { ".read": true, ".write": true } } }
The Server Script Code
The XML feed contains stock market data and is frequently changing, except on the weekend. To build the server script, some NPM packages are needed:
firebase
request
xml2json
babel-preset-es2015
Require the modules and configure the Firebase Web API:
const Firebase = require('firebase');
const request = require('request');
const parser = require('xml2json');
// firebase access config
const config = {
  apiKey: "apikey",
  authDomain: "authdomain",
  databaseURL: "dburl",
  storageBucket: "optional",
  messagingSenderId: "optional"
}
// init firebase
Firebase.initializeApp(config)
I write JavaScript code in ES6. It is much more fun. It is a simple script, so let's have a look at the code that is relevant to Firebase. The code below inserts or overwrites data in the database. For this script, I am happy to overwrite data:
Firebase.database().ref('latestMarkets/'+value.Symbol).set({
  Symbol: value.Symbol,
  Bid: value.Bid,
  Ask: value.Ask,
  High: value.High,
  Low: value.Low,
  Direction: value.Direction,
  Last: value.Last
})
.then((response) => {
  // callback
  callback(true)
})
.catch((error) => {
  // callback
  callback(error)
})
The Firebase call first references the database path:
Firebase.database().ref('latestMarkets/'+value.Symbol)
And then chains the action you want to perform:
// insert/overwrite (promise)
Firebase.database().ref('latestMarkets/'+value.Symbol).set({}).then((result))
// get data once (promise)
Firebase.database().ref('latestMarkets/'+value.Symbol).once('value').then((snapshot))
// listen to db path, get data on change (callback)
Firebase.database().ref('latestMarkets/'+value.Symbol).on('value', ((snapshot) => {}))
// ......
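Once the script has run, the data in the real-time database is keyed by symbol under latestMarkets/. A hypothetical snapshot (the symbol and price values below are made up for illustration) might look like this:

```json
{
  "latestMarkets": {
    "EURUSD": {
      "Symbol": "EURUSD",
      "Bid": "1.0721",
      "Ask": "1.0724",
      "High": "1.0760",
      "Low": "1.0705",
      "Direction": "up",
      "Last": "1.0722"
    }
  }
}
```

A read of latestMarkets/EURUSD would then return just the inner object for that symbol.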
Here is the Github repository.
Displaying the data in a React-Native app
The code below listens to a database path; on data change, all connected devices synchronize the data:
Firebase.database().ref('latestMarkets/').on('value', snapshot => {
  // do something with snapshot.val()
})
To close the listener, or unsubscribe from the path, you can use "off":
Firebase.database().ref('latestMarkets/').off()
I've created an example react-native app to display the data: The Github repository
Conclusion
In mobile app development, one big question is: "What database and cache solution can I use to provide online and offline capabilities?" Suppose you are starting a project from scratch: if you can fit your data into Firebase, it is a great solution, and you can use it for both web and mobile apps. The great thing is that you don't need to write a particular API, and you can access data straight from JavaScript. On the other hand, if you have a project that uses MySQL, for example, the Firebase real-time database won't help you much. You would need a remote API to connect to your database in this case. But even if the Firebase database isn't a good fit for your project, there are still other features, such as Firebase Storage or Cloud Messaging, which are very easy to use, and even though they are beyond the scope of this post, they are worth checking out.
About the author
Oliver Blumanski is a developer based out of Townsville, Australia. He has been a software developer since 2000, and can be found on GitHub at @blumanski.
Packt
18 Jan 2017
33 min read
Building Scalable Microservices

In this article by Vikram Murugesan, the author of the book Microservices Deployment Cookbook, we will see a brief introduction to the concept of microservices.
Writing microservices with Spring Boot
Now that our project is ready, let's look at how to write our microservice. There are several Java-based frameworks that let you create microservices. One of the most popular frameworks from the Spring ecosystem is the Spring Boot framework. In this article, we will look at how to create a simple microservice application using Spring Boot.
Getting ready
Any application requires an entry point to start the application. For Java-based applications, you can write a class that has the main method and run that class as a Java application. Similarly, Spring Boot requires a simple Java class with the main method to run it as a Spring Boot application (microservice). Before you start writing your Spring Boot microservice, you will also require some Maven dependencies in your pom.xml file.
How to do it…
Create a Java class called com.packt.microservices.geolocation.GeoLocationApplication.java and give it an empty main method:
package com.packt.microservices.geolocation;
public class GeoLocationApplication {
  public static void main(String[] args) {
    // left empty intentionally
  }
}
Now that we have our basic template project, let's make our project a child project of Spring Boot's spring-boot-starter-parent POM module. This module has a lot of prerequisite configurations in its pom.xml file, thereby reducing the amount of boilerplate code in our pom.xml file. At the time of writing this, 1.3.6.RELEASE was the most recent version:
<parent> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-parent</artifactId> <version>1.3.6.RELEASE</version> </parent>
After this step, you might want to run a Maven update on your project as you have added a new parent module.
If you see any warnings about the version of the maven-compiler plugin, you can either ignore it or just remove the <version>3.5.1</version> element. If you remove the version element, please perform a Maven update afterward. Spring Boot has the ability to enable or disable Spring modules such as Spring MVC, Spring Data, and Spring Caching. In our use case, we will be creating some REST APIs to consume the geolocation information of the users. So we will need Spring MVC. Add the following dependencies to your pom.xml file: <dependencies> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> </dependencies> We also need to expose the APIs using web servers such as Tomcat, Jetty, or Undertow. Spring Boot has an in-memory Tomcat server that starts up as soon as you start your Spring Boot application. So we already have an in-memory Tomcat server that we could utilize. Now let's modify the GeoLocationApplication.java class to make it a Spring Boot application: package com.packt.microservices.geolocation; import org.springframework.boot.SpringApplication; import org.springframework.boot.autoconfigure.SpringBootApplication; @SpringBootApplication public class GeoLocationApplication { public static void main(String[] args) { SpringApplication.run(GeoLocationApplication.class, args); } } As you can see, we have added an annotation, @SpringBootApplication, to our class. The @SpringBootApplication annotation reduces the number of lines of code written by adding the following three annotations implicitly: @Configuration @ComponentScan @EnableAutoConfiguration If you are familiar with Spring, you will already know what the first two annotations do. @EnableAutoConfiguration is the only annotation that is part of Spring Boot. The AutoConfiguration package has an intelligent mechanism that guesses the configuration of your application and automatically configures the beans that you will likely need in your code. 
You can also see that we have added one more line to the main method, which actually tells Spring Boot the class that will be used to start this application. In our case, it is GeoLocationApplication.class. If you would like to add more initialization logic to your application, such as setting up the database or setting up your cache, feel free to add it here. Now that our Spring Boot application is all set to run, let's see how to run our microservice. Right-click on GeoLocationApplication.java from Package Explorer, select Run As, and then select Spring Boot App. You can also choose Java Application instead of Spring Boot App. Both options ultimately do the same thing. You should see something like this on your STS console: If you look closely at the console logs, you will notice that Tomcat is being started on port number 8080. In order to make sure our Tomcat server is listening, let's run a simple curl command. cURL is a command-line utility available on most Unix and Mac systems. For Windows, use tools such as Cygwin or even Postman. Postman is a Google Chrome extension that gives you the ability to send and receive HTTP requests. For simplicity, we will use cURL. Execute the following command on your terminal:
curl http://localhost:8080
This should give us an output like this:
{"timestamp":1467420963000,"status":404,"error":"Not Found","message":"No message available","path":"/"}
This error message is being produced by Spring. It verifies that our Spring Boot microservice is ready for us to start building more features on. There is more configuration needed for Spring Boot, which we will perform later in this article along with Spring MVC.
Writing microservices with WildFly Swarm
WildFly Swarm is a J2EE application packaging framework from RedHat that utilizes the in-memory Undertow server to deploy microservices. In this article, we will create the same GeoLocation API using WildFly Swarm and JAX-RS.
To avoid confusion and dependency conflicts in our project, we will create the WildFly Swarm microservice as its own Maven project. This article is just here to help you get started on WildFly Swarm. When you are building your production-level application, it is your choice to either use Spring Boot, WildFly Swarm, Dropwizard, or SparkJava based on your needs.
Getting ready
Similar to how we created the Spring Boot Maven project, create a Maven WAR module with the groupId com.packt.microservices and name/artifactId geolocation-wildfly. Feel free to use either your IDE or the command line. Be aware that some IDEs complain about a missing web.xml file. We will see how to fix that in the next section.
How to do it…
Before we set up the WildFly Swarm project, we have to fix the missing web.xml error. The error message says that Maven expects to see a web.xml file in your project as it is a WAR module, but this file is missing in your project. In order to fix this, we have to add and configure maven-war-plugin. Add the following code snippet to your pom.xml file's project section:
<build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-war-plugin</artifactId> <version>2.6</version> <configuration> <failOnMissingWebXml>false</failOnMissingWebXml> </configuration> </plugin> </plugins> </build>
After adding the snippet, save your pom.xml file and perform a Maven update. Also, if you see that your project is using a Java version other than 1.8, perform a Maven update again for the changes to take effect. Now, let's add the dependencies required for this project. As we know that we will be exposing our APIs, we have to add the JAX-RS library. JAX-RS is the standard JSR-compliant API for creating RESTful web services. JBoss has its own version of JAX-RS.
So let's add that dependency to the pom.xml file:
<dependencies> <dependency> <groupId>org.jboss.spec.javax.ws.rs</groupId> <artifactId>jboss-jaxrs-api_2.0_spec</artifactId> <version>1.0.0.Final</version> <scope>provided</scope> </dependency> </dependencies>
The one thing that you have to note here is the provided scope. The provided scope in general means that this JAR need not be bundled with the final artifact when it is built. Usually, the dependencies with provided scope will be available to your application either via your web server or application server. In this case, when WildFly Swarm bundles your app and runs it on the in-memory Undertow server, your server will already have this dependency. The next step toward creating the GeoLocation API using WildFly Swarm is creating the domain object. Use the com.packt.microservices.geolocation.GeoLocation.java file. Now that we have the domain object, there are two classes that you need to create in order to write your first JAX-RS web service. The first of those is the Application class. The Application class in JAX-RS is used to define the various components that you will be using in your application. It can also hold some metadata about your application, such as your basePath (or ApplicationPath) to all resources listed in this Application class. In this case, we are going to use /geolocation as our basePath. Let's see how that looks:
package com.packt.microservices.geolocation;
import javax.ws.rs.ApplicationPath;
import javax.ws.rs.core.Application;
@ApplicationPath("/geolocation")
public class GeoLocationApplication extends Application {
  public GeoLocationApplication() {}
}
There are two things to note in this class; one is the Application class and the other is the @ApplicationPath annotation—both of which we've already talked about. Now let's move on to the resource class, which is responsible for exposing the APIs. If you are familiar with Spring MVC, you can compare Resource classes to Controllers.
They are responsible for defining the API for any specific resource. The annotations are slightly different from those of Spring MVC. Let's create a new resource class called com.packt.microservices.geolocation.GeoLocationResource.java that exposes a simple GET API:
package com.packt.microservices.geolocation;
import java.util.ArrayList;
import java.util.List;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
@Path("/")
public class GeoLocationResource {
  @GET
  @Produces("application/json")
  public List<GeoLocation> findAll() {
    return new ArrayList<>();
  }
}
All three annotations, @GET, @Path, and @Produces, are self-explanatory. Before we start writing the APIs and the service class, let's test the application from the command line to make sure it works as expected. With the current implementation, any GET request sent to the /geolocation URL should return an empty JSON array. So far, we have created the RESTful APIs using JAX-RS. It's just another JAX-RS project. In order to make it a microservice using WildFly Swarm, all you have to do is add the wildfly-swarm-plugin to the Maven pom.xml file. This plugin will be tied to the package phase of the build so that whenever the package goal is triggered, the plugin will create an uber JAR with all required dependencies. An uber JAR is just a fat JAR that has all dependencies bundled inside itself. It also deploys our application in an in-memory Undertow server. Add the following snippet to the plugins section of the pom.xml file:
<plugin> <groupId>org.wildfly.swarm</groupId> <artifactId>wildfly-swarm-plugin</artifactId> <version>1.0.0.Final</version> <executions> <execution> <id>package</id> <goals> <goal>package</goal> </goals> </execution> </executions> </plugin>
Now execute the mvn clean package command from the project's root directory, and wait for the Maven build to be successful.
If you look at the logs, you can see that wildfly-swarm-plugin will create the uber JAR, which has all its dependencies. You should see something like this in your console logs: After the build is successful, you will find two artifacts in the target directory of your project. The geolocation-wildfly-0.0.1-SNAPSHOT.war file is the final WAR created by the maven-war-plugin. The geolocation-wildfly-0.0.1-SNAPSHOT-swarm.jar file is the uber JAR created by the wildfly-swarm-plugin. Execute the following command in the same terminal to start your microservice:
java -jar target/geolocation-wildfly-0.0.1-SNAPSHOT-swarm.jar
After executing this command, you will see that Undertow has started on port number 8080, exposing the geolocation resource we created. You will see something like this: Execute the following cURL command in a separate terminal window to make sure our API is exposed. The response of the command should be [], indicating there are no geolocations:
curl http://localhost:8080/geolocation
Now let's build the service class and finish the APIs that we started. For simplicity purposes, we are going to store the geolocations in a collection in the service class itself. In a real-time scenario, you will be writing repository classes or DAOs that talk to the database that holds your geolocations. Get the com.packt.microservices.geolocation.GeoLocationService.java interface. We'll use the same interface here.
Create a new class called com.packt.microservices.geolocation.GeoLocationServiceImpl.java that implements the GeoLocationService interface:
package com.packt.microservices.geolocation;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class GeoLocationServiceImpl implements GeoLocationService {
  private static List<GeoLocation> geolocations = new ArrayList<>();
  @Override
  public GeoLocation create(GeoLocation geolocation) {
    geolocations.add(geolocation);
    return geolocation;
  }
  @Override
  public List<GeoLocation> findAll() {
    return Collections.unmodifiableList(geolocations);
  }
}
Now that our service classes are implemented, let's finish building the APIs. We already have a very basic stubbed-out GET API. Let's introduce the service class to the resource class and call its findAll method. Similarly, let's use the service's create method for POST API calls. Add the following snippet to GeoLocationResource.java:
private GeoLocationService service = new GeoLocationServiceImpl();
@GET
@Produces("application/json")
public List<GeoLocation> findAll() {
  return service.findAll();
}
@POST
@Produces("application/json")
@Consumes("application/json")
public GeoLocation create(GeoLocation geolocation) {
  return service.create(geolocation);
}
We are now ready to test our application. Go ahead and build your application. After the build is successful, run your microservice. Let's try to create two geolocations using the POST API and later try to retrieve them using the GET method.
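Before firing the requests, it can help to sanity-check the JSON body locally; the sketch below assumes python3 is on your PATH (any JSON validator works) and uses the same payload as the cURL examples that follow:

```shell
# Hypothetical pre-flight check: validate the request body before POSTing it
payload='{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 41.803488, "longitude": -88.144040}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
```

The same $payload variable can then be reused in the request itself: curl -H "Content-Type: application/json" -X POST -d "$payload" http://localhost:8080/geolocation.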
Execute the following cURL commands in your terminal one by one: curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 41.803488, "longitude": -88.144040}' http://localhost:8080/geolocation This should give you something like the following output (pretty-printed for readability): { "latitude": 41.803488, "longitude": -88.14404, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "timestamp": 1468203975 } curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 9.568012, "longitude": 77.962444}' http://localhost:8080/geolocation This command should give you an output similar to the following (pretty-printed for readability): { "latitude": 9.568012, "longitude": 77.962444, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "timestamp": 1468203975 } To verify whether your entities were stored correctly, execute the following cURL command: curl http://localhost:8080/geolocation This should give you an output like this (pretty-printed for readability): [ { "latitude": 41.803488, "longitude": -88.14404, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "timestamp": 1468203975 }, { "latitude": 9.568012, "longitude": 77.962444, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "timestamp": 1468203975 } ] Whatever we have seen so far will give you a head start in building microservices with WildFly Swarm. Of course, there are tons of features that WildFly Swarm offers. Feel free to try them out based on your application needs. I strongly recommend going through the WildFly Swarm documentation for any advanced usages. Writing microservices with Dropwizard Dropwizard is a collection of libraries that help you build powerful applications quickly and easily. The libraries vary from Jackson, Jersey, Jetty, and so on. You can take a look at the full list of libraries on their website. 
This ecosystem of libraries that help you build powerful applications could be utilized to create microservices as well. As we saw earlier, it utilizes Jetty to expose its services. In this article, we will create the same GeoLocation API using Dropwizard and Jersey. To avoid confusion and dependency conflicts in our project, we will create the Dropwizard microservice as its own Maven project. This article is just here to help you get started with Dropwizard. When you are building your production-level application, it is your choice to either use Spring Boot, WildFly Swarm, Dropwizard, or SparkJava based on your needs.
Getting ready
Similar to how we created other Maven projects, create a Maven JAR module with the groupId com.packt.microservices and name/artifactId geolocation-dropwizard. Feel free to use either your IDE or the command line. After the project is created, if you see that your project is using a Java version other than 1.8, perform a Maven update for the change to take effect.
How to do it…
The first thing that you will need is the dropwizard-core Maven dependency. Add the following snippet to your project's pom.xml file:
<dependencies> <dependency> <groupId>io.dropwizard</groupId> <artifactId>dropwizard-core</artifactId> <version>0.9.3</version> </dependency> </dependencies>
Guess what? This is the only dependency you will need to spin up a simple Jersey-based Dropwizard microservice. Before we start configuring Dropwizard, we have to create the domain object, service class, and resource class:
com.packt.microservices.geolocation.GeoLocation.java
com.packt.microservices.geolocation.GeoLocationService.java
com.packt.microservices.geolocation.GeoLocationServiceImpl.java
com.packt.microservices.geolocation.GeoLocationResource.java
Let's see what each of these classes does. The GeoLocation.java class is our domain object that holds the geolocation information.
The GeoLocationService.java class defines our interface, which is then implemented by the GeoLocationServiceImpl.java class. If you take a look at the GeoLocationServiceImpl.java class, we are using a simple collection to store the GeoLocation domain objects. In a real-time scenario, you will be persisting these objects in a database. But to keep it simple, we will not go that far. To be consistent with the previous recipes, let's change the path of GeoLocationResource to /geolocation. To do so, replace @Path("/") with @Path("/geolocation") on line number 11 of the GeoLocationResource.java class. We have now created the service classes, domain object, and resource class. Let's configure Dropwizard. In order to make your project a microservice, you have to do two things: Create a Dropwizard configuration class. This is used to store any meta-information or resource information that your application will need during runtime, such as DB connections, Jetty server, logging, and metrics configurations. These configurations are ideally stored in a YAML file, which will then be mapped to your Configuration class using Jackson. In this application, we are not going to use the YAML configuration as it is out of scope for this article. If you would like to know more about configuring Dropwizard, refer to their Getting Started documentation page at http://www.dropwizard.io/0.7.1/docs/getting-started.html. Let's create an empty Configuration class called GeoLocationConfiguration.java:
package com.packt.microservices.geolocation;
import io.dropwizard.Configuration;
public class GeoLocationConfiguration extends Configuration {
}
The YAML configuration file has a lot to offer. Take a look at a sample YAML file from Dropwizard's Getting Started documentation page to learn more. The name of the YAML file is usually derived from the name of your microservice.
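Although we skip the YAML file in this recipe, for reference a minimal Dropwizard configuration file (the file name geolocation.yml is illustrative; the ports shown are Dropwizard's defaults) typically looks like this:

```yaml
# Hypothetical geolocation.yml sketch
server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 8081
```

Such a file would then be passed as the second command-line argument, for example: java -jar target/geolocation-dropwizard-0.0.1-SNAPSHOT.jar server geolocation.yml.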
The microservice name is usually identified by the return value of the overridden method public String getName() in your Application class. Now let's create the GeoLocationApplication.java application class:
package com.packt.microservices.geolocation;
import io.dropwizard.Application;
import io.dropwizard.setup.Environment;
public class GeoLocationApplication extends Application<GeoLocationConfiguration> {
  public static void main(String[] args) throws Exception {
    new GeoLocationApplication().run(args);
  }
  @Override
  public void run(GeoLocationConfiguration config, Environment env) throws Exception {
    env.jersey().register(new GeoLocationResource());
  }
}
There are a lot of things going on here. Let's look at them one by one. Firstly, this class extends Application with the GeoLocationConfiguration generic. This makes an instance of your GeoLocationConfiguration.java class available, so that you have access to all the properties you have defined in your YAML file, mapped into the Configuration class. The next one is the run method. The run method takes two arguments: your configuration and environment. The Environment instance is a wrapper to other library-specific objects such as MetricsRegistry, HealthCheckRegistry, and JerseyEnvironment. For example, we could register our Jersey resources using the JerseyEnvironment instance. The env.jersey().register(new GeoLocationResource()) line does exactly that. The main method is straightforward. All it does is call the run method. Before we can start the microservice, we have to configure this project to create a runnable uber JAR. Uber JARs are just fat JARs that bundle their dependencies in themselves. For this purpose, we will be using the maven-shade-plugin. Add the following snippet to the build section of the pom.xml file.
If this is your first plugin, you might want to wrap it in a <plugins> element under <build>: <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>2.3</version> <configuration> <createDependencyReducedPom>true</createDependencyReducedPom> <filters> <filter> <artifact>*:*</artifact> <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> </configuration> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" /> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <mainClass>com.packt.microservices.geolocation.GeoLocationApplication</mainClass> </transformer> </transformers> </configuration> </execution> </executions> </plugin> The previous snippet does the following: It creates a runnable uber JAR that has a reduced pom.xml file that does not include the dependencies that are added to the uber JAR. To learn more about this property, take a look at the documentation of maven-shade-plugin. It utilizes com.packt.microservices.geolocation.GeoLocationApplication as the class whose main method will be invoked when this JAR is executed. This is done by updating the MANIFEST file. It excludes all signatures from signed JARs. This is required to avoid security errors. Now that our project is properly configured, let's try to build and run it from the command line. To build the project, execute mvn clean package from the project's root directory in your terminal. This will create your final JAR in the target directory. Execute the following command to start your microservice: java -jar target/geolocation-dropwizard-0.0.1-SNAPSHOT.jar server The server argument instructs Dropwizard to start the Jetty server. 
After you issue the command, you should be able to see that Dropwizard has started the in-memory Jetty server on port 8080. If you see any warnings about health checks, ignore them. Your console logs should look something like this:

We are now ready to test our application. Let's try to create two geolocations using the POST API and later try to retrieve them using the GET method. Execute the following cURL commands in your terminal one by one:

curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 41.803488, "longitude": -88.144040}' http://localhost:8080/geolocation

This should give you an output similar to the following (pretty-printed for readability):

{
  "latitude": 41.803488,
  "longitude": -88.14404,
  "userId": "f1196aac-470e-11e6-beb8-9e71128cae77",
  "timestamp": 1468203975
}

curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 9.568012, "longitude": 77.962444}' http://localhost:8080/geolocation

This should give you an output like this (pretty-printed for readability):

{
  "latitude": 9.568012,
  "longitude": 77.962444,
  "userId": "f1196aac-470e-11e6-beb8-9e71128cae77",
  "timestamp": 1468203975
}

To verify whether your entities were stored correctly, execute the following cURL command:

curl http://localhost:8080/geolocation

It should give you an output similar to the following (pretty-printed for readability):

[
  {
    "latitude": 41.803488,
    "longitude": -88.14404,
    "userId": "f1196aac-470e-11e6-beb8-9e71128cae77",
    "timestamp": 1468203975
  },
  {
    "latitude": 9.568012,
    "longitude": 77.962444,
    "userId": "f1196aac-470e-11e6-beb8-9e71128cae77",
    "timestamp": 1468203975
  }
]

Excellent! You have created your first microservice with Dropwizard. Dropwizard offers more than what we have seen so far. Some of it is out of scope for this article.
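The run command above starts the server with Dropwizard's built-in defaults. As an aside, the server command can also take a YAML configuration file as a second argument; the sketch below is a minimal, hypothetical config.yml (this recipe does not ship one) that pins the application and admin ports explicitly:

```yaml
# Hypothetical config.yml for the geolocation service (not part of this recipe).
# Dropwizard deserializes this file into the Configuration subclass
# (GeoLocationConfiguration) that is passed to the run method.
server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 8081
```

You would then start the service with java -jar target/geolocation-dropwizard-0.0.1-SNAPSHOT.jar server config.yml, and any custom properties you add to this file would be surfaced through GeoLocationConfiguration, as described earlier.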
I believe the metrics API that Dropwizard uses could be used in any type of application.

Writing your Dockerfile

So far in this article, we have seen how to package our application and how to install Docker. Now that we have our JAR artifact and Docker set up, let's see how to Dockerize our microservice application.

Getting ready

In order to Dockerize our application, we will have to tell Docker how our image is going to look. This is exactly the purpose of a Dockerfile. A Dockerfile has its own syntax (or Dockerfile instructions) and will be used by Docker to create images. Throughout this article, we will try to understand some of the most commonly used Dockerfile instructions as we write our Dockerfile for the geolocation tracker microservice.

How to do it…

First, open your STS IDE and create a new file called Dockerfile in the geolocation project. The first line of a Dockerfile is always the FROM instruction, followed by the base image that you would like to create your image from. There are thousands of images on Docker Hub to choose from. In our case, we need something that already has Java installed on it. Some images are official, meaning they are well documented and maintained. Docker Official Repositories follow best practices and standards, and Docker has its own team to maintain them, which keeps the repositories trustworthy and helps users make the right choice of repository. To read more about Docker Official Repositories, take a look at https://docs.docker.com/docker-hub/official_repos/

We will be using the Java official repository. To find the official repository, go to hub.docker.com and search for java. You have to choose the one that says official. At the time of writing this, the Java image documentation says it will soon be deprecated in favor of the openjdk image.
So the first line of our Dockerfile will look like this:

FROM openjdk:8

As you can see, we have used version (or tag) 8 for our image. If you are wondering what type of operating system this image uses, take a look at the Dockerfile of this image, which you can get from the Docker Hub page. Docker images are usually tagged with the version of the software they are written for; that way, it is easy for users to pick from them. The next step is creating a directory for our project where we will store our JAR artifact. Add this as your next line:

RUN mkdir -p /opt/packt/geolocation

This is a simple Unix command that creates the /opt/packt/geolocation directory. The -p flag instructs it to create the intermediate directories if they don't exist. Now let's create an instruction that will add the JAR file that was created on your local machine into the container at /opt/packt/geolocation:

ADD target/geolocation-0.0.1-SNAPSHOT.jar /opt/packt/geolocation/

As you can see, we are picking up the uber JAR from the target directory and dropping it into the /opt/packt/geolocation directory of the container. Take a look at the / at the end of the target path. That says that the JAR has to be copied into the directory.

Before we can start the application, there is one thing we have to do, that is, expose the ports that we would like to be mapped to the Docker host ports. In our case, the in-memory Tomcat instance is running on port 8080. In order to be able to map port 8080 of our container to any port on our Docker host, we have to expose it first. For that, we will use the EXPOSE instruction. Add the following line to your Dockerfile:

EXPOSE 8080

Now that we are ready to start the app, let's go ahead and tell Docker how to start a container for this image. For that, we will use the CMD instruction:

CMD ["java", "-jar", "/opt/packt/geolocation/geolocation-0.0.1-SNAPSHOT.jar"]

There are two things we have to note here.
One is the way we are starting the application, and the other is how the command is broken down into comma-separated strings. First, let's talk about how we start the application. You might be wondering why we haven't used the mvn spring-boot:run command to start the application. Keep in mind that this command will be executed inside the container, and our container does not have Maven installed, only OpenJDK 8. If you would like to use the maven command, take that as an exercise, and try to install Maven on your container and use the mvn command to start the application. Now that we know we have Java installed, we are issuing a very simple java -jar command to run the JAR. In fact, the Spring Boot Maven plugin internally issues the same command.

The next thing is how the command has been broken down into comma-separated strings. This is a standard that the CMD instruction follows. To keep it simple, keep in mind that whatever command you would like to run upon starting the container, just break it down into comma-separated strings, splitting at whitespace. Your final Dockerfile should look something like this:

FROM openjdk:8
RUN mkdir -p /opt/packt/geolocation
ADD target/geolocation-0.0.1-SNAPSHOT.jar /opt/packt/geolocation/
EXPOSE 8080
CMD ["java", "-jar", "/opt/packt/geolocation/geolocation-0.0.1-SNAPSHOT.jar"]

This Dockerfile is one of the simplest implementations. Dockerfiles can sometimes get bigger due to the fact that you need a lot of customizations to your image. In such cases, it is a good idea to break it down into multiple images that can be reused and maintained separately. There are some best practices to follow whenever you create your own Dockerfile and image. Though we haven't covered them here, as they are out of the scope of this article, you still should take a look at and follow them. To learn more about the various Dockerfile instructions, go to https://docs.docker.com/engine/reference/builder/.
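As a design note, the command given by CMD is replaced entirely by any arguments passed to docker run after the image name, which is not always what you want for a service image. The sketch below is a hypothetical variation on the Dockerfile above, not the one used in this recipe; it assumes the same artifact path and swaps in ENTRYPOINT for the fixed part of the command, plus a build-time ARG so the JAR name is not hardcoded twice:

```dockerfile
FROM openjdk:8
RUN mkdir -p /opt/packt/geolocation
# Build-time argument so the artifact name can be changed without editing
# the file, e.g.: docker build --build-arg JAR_FILE=target/other.jar .
ARG JAR_FILE=target/geolocation-0.0.1-SNAPSHOT.jar
ADD ${JAR_FILE} /opt/packt/geolocation/app.jar
EXPOSE 8080
# ENTRYPOINT (exec form) fixes the command; anything passed to `docker run`
# after the image name is appended after the JAR, reaching the application
# as program arguments instead of replacing the command.
ENTRYPOINT ["java", "-jar", "/opt/packt/geolocation/app.jar"]
```

With this variant, docker run packt/geolocation still starts the JVM exactly as before, while extra arguments on the docker run command line are forwarded to the application rather than overriding the java command.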
Building your Docker image

We created the Dockerfile, which will be used in this article to create an image for our microservice. If you are wondering why we need an image, it is the only way we can ship our software to any system. Once you have your image created and uploaded to a common repository, it will be easier to pull your image from any location.

Getting ready

Before you jump right into it, it might be a good idea to get yourself familiar with some of the most commonly used Docker commands. In this article, we will use the build command. Take a look at this URL to understand the other commands: https://docs.docker.com/engine/reference/commandline/#/image-commands. After familiarizing yourself with the commands, open up a new terminal, and change your directory to the root of the geolocation project. Make sure your docker-machine instance is running. If it is not running, use the docker-machine start command to run your docker-machine instance:

docker-machine start default

If you have to configure your shell for the default Docker machine, go ahead and execute the following command:

eval $(docker-machine env default)

How to do it…

From the terminal, issue the following docker build command:

docker build -t packt/geolocation .

We'll try to understand the command later. For now, let's see what happens after you issue the preceding command. You should see Docker downloading the openjdk image from Docker Hub. Once the image has been downloaded, you will see that Docker tries to validate each and every instruction provided in the Dockerfile. When the last instruction has been processed, you will see a message saying Successfully built. This means that your image has been successfully built. Now let's try to understand the command. There are three things to note here:

The first thing is the docker build command itself. The docker build command is used to build a Docker image from a Dockerfile.
It needs at least one input, which is usually the location of the Dockerfile. Dockerfiles can be named something other than Dockerfile and can be referred to using the -f option of the docker build command. One instance of this being used is when teams have different Dockerfiles for different build environments, for example, DockerfileDev for the dev environment, DockerfileStaging for the staging environment, and DockerfileProd for the production environment. It is still encouraged as a best practice to use other Docker options in order to keep the same Dockerfile for all environments.

The second thing is the -t option. The -t option takes the name of the repo and a tag. In our case, we have not mentioned the tag, so by default, it will pick up latest as the tag. If you look at the repo name, it is different from the official openjdk image name. It has two parts: packt and geolocation. It is always a good practice to put the Docker Hub account name followed by the actual image name as the name of your repo. For now, we will use packt as our account name; later, we will see how to create our own Docker Hub account and use that account name here.

The third thing is the dot at the end. The dot operator says that the Dockerfile is located in the current directory, or the present working directory to be more precise.

Let's go ahead and verify whether our image was created. In order to do that, issue the following command on your terminal:

docker images

The docker images command is used to list all images available in your Docker host. After issuing the command, you should see something like this:

As you can see, the newly built image is listed as packt/geolocation in your Docker host. The tag for this image is latest, as we did not specify any. The image ID uniquely identifies your image. Note the size of the image. It is a few megabytes bigger than the openjdk:8 image. That is most probably because of the size of our executable uber JAR inside the container.
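Because the dot at the end makes the whole project directory the build context, everything in it is sent to the Docker daemon before the build starts. A small, hypothetical .dockerignore file at the project root (not part of the original recipe) can keep the context lean; for this build, only the Dockerfile and the JAR under target are actually needed:

```
# Hypothetical .dockerignore; paths are relative to the build context root.
# Anything matched here is excluded from the context sent to the daemon.
.git
src/
*.md
```

Trimming the context this way can speed up docker build noticeably on large working trees, since less data is transferred to the daemon on every build.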
Now that we know how to build an image using an existing Dockerfile, we are at the end of this article. This is just a very quick intro to the docker build command. There are more options that you can provide to the command, such as CPUs and memory. To learn more about the docker build command, take a look at this page: https://docs.docker.com/engine/reference/commandline/build/

Running your microservice as a Docker container

We successfully created our Docker image in the Docker host. Keep in mind that if you are using Windows or Mac, your Docker host is the VirtualBox VM and not your local computer. In this article, we will look at how to spin off a container for the newly created image.

Getting ready

To spin off a new container for our packt/geolocation image, we will use the docker run command. This command is used to run any command inside your container, given the image. Open your terminal and go to the root of the geolocation project. If you have to start your Docker machine instance, do so using the docker-machine start command, and set the environment using the docker-machine env command.

How to do it…

Go ahead and issue the following command on your terminal:

docker run packt/geolocation

Right after you run the command, you should see something like this:

Yay! We can see that our microservice is running as a Docker container. But wait: there is more to it. Let's see how we can access our microservice's in-memory Tomcat instance. Try to run a cURL command to see if our app is up and running. Open a new terminal instance and execute the following cURL command in that shell:

curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 41.803488, "longitude": -88.144040}' http://localhost:8080/geolocation

Did you get an error message like this?

curl: (7) Failed to connect to localhost port 8080: Connection refused

Let's try to understand what happened here.
Why would we get a connection refused error when our microservice logs clearly say that it is running on port 8080? Yes, you guessed it right: the microservice is not running on your local computer; it is actually running inside the container, which in turn is running inside your Docker host. Here, your Docker host is the VirtualBox VM called default. So we have to replace localhost with the IP of the container. But getting the IP of the container is not straightforward. That is the reason we are going to map port 8080 of the container to the same port on the VM. This mapping will make sure that any request made to port 8080 on the VM is forwarded to port 8080 of the container.

Now go to the shell that is currently running your container, and stop your container. Usually, Ctrl + C will do the job. After your container is stopped, issue the following command:

docker run -p 8080:8080 packt/geolocation

The -p option does the port mapping from Docker host to container. The port number to the left of the colon indicates the port number of the Docker host, and the port number to the right of the colon indicates that of the container. In our case, both of them are the same. After you execute the previous command, you should see the same logs that you saw before.

We are not done yet. We still have to find the IP that we have to use to hit our RESTful endpoint. The IP we have to use is the IP of our Docker Machine VM. To find the IP of the docker-machine instance, execute the following command in a new terminal instance:

docker-machine ip default

This should give you the IP of the VM. Let's say the IP that you received was 192.168.99.100.
Now, replace localhost in your cURL command with this IP, and execute the cURL command again:

curl -H "Content-Type: application/json" -X POST -d '{"timestamp": 1468203975, "userId": "f1196aac-470e-11e6-beb8-9e71128cae77", "latitude": 41.803488, "longitude": -88.144040}' http://192.168.99.100:8080/geolocation

This should give you an output similar to the following (pretty-printed for readability):

{
  "latitude": 41.803488,
  "longitude": -88.14404,
  "userId": "f1196aac-470e-11e6-beb8-9e71128cae77",
  "timestamp": 1468203975
}

This confirms that you are able to access your microservice from the outside. Take a moment to understand how the port mapping is done. The following figure shows how your machine, VM, and container are orchestrated:

Summary

We looked at an example of a geolocation tracker application to see how it can be broken down into smaller and manageable services. Next, we saw how to create the GeoLocationTracker service using the Spring Boot framework.

Resources for Article:

Further resources on this subject:

Domain-Driven Design [article]
Breaking into Microservices Architecture [article]
A capability model for microservices [article]