How-To Tutorials


Massive Graphs on Big Data

Packt
11 Jul 2017
19 min read
In this article by Rajat Mehta, author of the book Big Data Analytics with Java, we will learn about graphs. Graph theory is one of the most important and interesting concepts in computer science, and graphs are used in a lot of real-life use cases. If you use a GPS on your phone or a GPS device and it shows you driving directions to a place, behind the scenes an efficient graph is working to give you the best possible directions. In a social network you are connected to your friends, your friends are connected to other friends, and so on; this is a massive graph running in production in all the social networks that you use. You can send messages to your friends, follow them, or get followed, all within this graph. Social networks and databases storing driving directions all involve massive amounts of data, and this is not data that can be stored on a single machine; instead it is distributed across a cluster of thousands of nodes or machines. This massive data is nothing but big data, and in this article we will learn how data can be represented in the form of a graph so that we can make analyses or deductions on top of these massive graphs.

In this article, we will cover:

- A small refresher on graphs and their basic concepts
- A small introduction to graph analytics, its advantages, and how Apache Spark fits in
- An introduction to the GraphFrames library that is used on top of Apache Spark

Before we dive deeply into each individual section, let's look at the basic graph concepts in brief.

(For more resources related to this topic, see here.)

Refresher on graphs

In this section, we will cover some of the basic concepts of graphs; this is meant as a refresher, so readers who already know this material can skip it. Graphs are used in many important concepts in our day-to-day lives. Before we dive into the ways of representing a graph, let's look at some of the popular use cases of graphs (though this is not a complete list):

- Graphs are used heavily in social networks
- In finding driving directions via GPS
- In many recommendation engines
- In fraud detection in many financial companies
- In search engines and in network traffic flows
- In biological analysis

As you must have noted, graphs are used in many applications that we might be using on a daily basis. A graph is a data structure in computer science that depicts entities and the connections between them. So, if there are two entities such as Airport A and Airport B and they are connected by a flight that takes, say, a few hours, then Airport A and Airport B are the two entities, and the flight connecting them, taking those specific hours, depicts the weight of their connection. In formal terms, these entities are called vertices and the relationships between them are called edges. So, in mathematical terms, a graph G = {V, E}, that is, a graph is a function of vertices and edges.

Let's look at the following diagram for a simple example of a graph:

As you can see, the preceding graph is a set of six vertices and eight edges, as shown next:

Vertices = {A, B, C, D, E, F}
Edges = {AB, AF, BC, CD, CF, DF, DE, EF}

These vertices can represent any entities; for example, they can be places, with the edges being distances between the places, or they could be people in a social network, with the edges being the type of relationship, for example, friends or followers.
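To make the G = {V, E} idea concrete, here is a minimal, illustrative Java sketch (not code from the book) that stores the six vertices and eight edges of the example above as plain sets and derives a neighbour lookup from them; the class name is chosen only for this example:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SimpleGraphExample {
    public static void main(String[] args) {
        // G = {V, E} for the six-vertex, eight-edge example graph above.
        Set<String> vertices = new HashSet<>(Arrays.asList("A", "B", "C", "D", "E", "F"));
        Set<String> edges = new HashSet<>(Arrays.asList(
                "AB", "AF", "BC", "CD", "CF", "DF", "DE", "EF"));

        // Derive a neighbour lookup from the edge set.
        Map<String, Set<String>> neighbours = new HashMap<>();
        for (String v : vertices) {
            neighbours.put(v, new TreeSet<>());
        }
        for (String e : edges) {
            String u = e.substring(0, 1);
            String w = e.substring(1);
            neighbours.get(u).add(w);   // the graph is bidirected (undirected),
            neighbours.get(w).add(u);   // so record each edge both ways
        }
        System.out.println(neighbours.get("F")); // [A, C, D, E]
    }
}
```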
As these examples show, graphs can represent real-world entities directly. The preceding graph is also called a bidirected (undirected) graph because its edges can be traversed in either direction: the edge from A to B can be traversed both from A to B and from B to A. Thus, in the preceding diagram, the edge AB can also be read as BA, and AF as FA. There are other types of graphs, called directed graphs, in which each edge goes in one direction only and cannot be retraced. A simple example of a directed graph is shown as follows:

As seen in the preceding graph, the edge from A to B goes in only one direction, as does the edge from B to C. Hence, this is a directed graph. A simple linked list or a tree data structure is also a form of graph. In a tree, nodes can only have children and there are no loops, while there is no such rule in a general graph.

Representing graphs

Visualizing a graph makes it easy to comprehend, but depicting it in a program requires one of two different approaches:

Adjacency matrix: Representing a graph as a matrix is easy, and it has its own advantages and disadvantages. Let's look at the bidirected graph that we showed in the preceding diagram. If you were to represent this graph as a matrix, it would look like this:

The preceding diagram is a simple representation of our graph in matrix form. The concept of the matrix representation of a graph is simple: if there is an edge between two nodes we mark the value as 1, else, if the edge is not present, we mark it as 0. As this is a bidirected graph, its edges flow in both directions, so the matrix is symmetric. In the matrix, the rows and columns depict the vertices. So, if you look at vertex A, it has an edge to vertex B, and the corresponding matrix value is 1. As you can see, it takes just one step, or O(1), to figure out whether an edge exists between two nodes: we just need the indexes (row and column) in the matrix and we can read that value directly. However, if you look at the matrix closely, you will see that most of the entries are zero, hence this is a sparse matrix. This approach therefore wastes a lot of computer memory marking even those pairs of elements that do not have an edge between them, and this is its main disadvantage.

Adjacency list: The adjacency list solves the space-wastage problem of the adjacency matrix. To solve this problem, it stores each node and its neighbors in a list (linked list), as shown in the following diagram:

For brevity, we have not shown all the vertices, but you can see from the diagram that each vertex stores its neighbors in a linked list. So, when you want to find the neighbors of a particular vertex, you can directly iterate over its list. Of course, this has the disadvantage that you must iterate when you need to figure out whether an edge exists between two nodes. This approach is also widely used in many algorithms in computer science.

We have briefly seen how graphs can be represented; let's now see some important terms that are used heavily with graphs.

Common terminology on graphs

We will now introduce some common terms and concepts in graphs that you can use in your analytics on top of graphs:

Vertices: As we mentioned earlier, vertices is the mathematical term for the nodes in a graph. For analytic purposes, the vertex count shows the number of nodes in the system, for example, the number of people involved in a social graph.

Edges: As we mentioned earlier, edges are the connections between vertices, and edges can carry weights.
The number of edges represents the number of relations in a graph. The weight on an edge represents the intensity of the relationship between the nodes involved; for example, in a social network, a "friend" relationship is stronger than a "follows" relationship between nodes.

Degree: Represents the total number of connections flowing into as well as out of a node. For example, in the previous diagram the degree of node F is four. The degree count is useful; for example, in a social network graph it can indicate how well connected a person is if their degree count is very high.

Indegree: Represents the number of connections flowing into a node. For example, in the previous diagram, the indegree of node F is three. In a social network graph, this might represent how many people can send messages to this person or node.

Outdegree: Represents the number of connections flowing out of a node. For example, in the previous diagram, the outdegree of node F is one. In a social network graph, this might represent how many people this person or node can send messages to.

Common algorithms on graphs

Let's look at three common algorithms that are frequently run on graphs and some of their uses:

Breadth first search: Breadth first search is an algorithm for graph traversal or searching. As the name suggests, the traversal occurs across the breadth of the graph; that is, the neighbors of the node from where the traversal starts are visited first, before exploring further in the same manner. We will refer to the same graph we used earlier. If we start at vertex A, then according to breadth first search we next visit the neighbors of A, which are B and F. After that, we visit the neighbor of B, which is C, and then the neighbors of F, which are E and D. We go through each node only once, and this mimics real-life travel as well: to get from one point to another we seldom cover the same road or path again. Thus, our breadth first traversal starting from A is {A, B, F, C, D, E}. Breadth first search is very useful in graph analytics and can tell us things such as which friends are not your immediate friends but are just at the next level after your immediate friends in a social network, or, in the case of a graph of a flight network, which flights have just one or two stops to the destination.

Depth first search: This is another way of searching, where we start from the source vertex and keep searching until we reach the end node or leaf node, and then we backtrack. This algorithm is not as performant as breadth first search for this kind of query, as it can require many traversals. So, if you want to know whether a node A is connected to a node B, you might end up searching through a lot of wasteful nodes that have nothing to do with the original nodes A and B before arriving at the appropriate solution.

Dijkstra's shortest path: This is a greedy algorithm to find the shortest path in a graph network. In a weighted graph, if you need to find the shortest path between two nodes, you can start from the starting node and keep greedily picking the next node on the path as the one with the least weight (in the case of weights being distances between nodes, as in city graphs depicting interconnected cities and roads). So, in a road network, you can find the shortest path between two cities using this algorithm.
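To illustrate that greedy selection step, here is a compact, illustrative Dijkstra implementation in Java. It is not code from the book: the class name, the tiny road network, and its weights are all made up for the example, and the graph is stored as the adjacency map described earlier.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class ShortestPathExample {

    // Returns the shortest distance from 'source' to every reachable vertex.
    static Map<String, Integer> dijkstra(Map<String, Map<String, Integer>> graph, String source) {
        Map<String, Integer> distance = new HashMap<>();
        for (String vertex : graph.keySet()) {
            distance.put(vertex, Integer.MAX_VALUE);
        }
        distance.put(source, 0);

        // The queue holds (vertex, distance-at-insertion-time) pairs, smallest distance first.
        PriorityQueue<Map.Entry<String, Integer>> queue =
                new PriorityQueue<>(Map.Entry.comparingByValue());
        queue.add(new AbstractMap.SimpleEntry<>(source, 0));

        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> head = queue.poll();
            String current = head.getKey();
            if (head.getValue() > distance.get(current)) {
                continue; // stale entry: a shorter path was already found
            }
            // Greedy step: relax every edge leaving the closest unsettled vertex.
            for (Map.Entry<String, Integer> edge : graph.get(current).entrySet()) {
                int candidate = distance.get(current) + edge.getValue();
                if (candidate < distance.get(edge.getKey())) {
                    distance.put(edge.getKey(), candidate);
                    queue.add(new AbstractMap.SimpleEntry<>(edge.getKey(), candidate));
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        // A tiny made-up road network: edge weights are distances between cities.
        Map<String, Map<String, Integer>> roads = new HashMap<>();
        roads.put("A", Map.of("B", 4, "C", 1));
        roads.put("B", Map.of("D", 1));
        roads.put("C", Map.of("B", 2, "D", 5));
        roads.put("D", Map.of());
        System.out.println(dijkstra(roads, "A")); // e.g. {A=0, B=3, C=1, D=4}
    }
}
```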
PageRank algorithm: This is a very popular algorithm that came out of Google, and it is essentially used to find the importance of a web page by figuring out how connected it is to other important websites. It gives a PageRank score to each website based on this approach, and the search results are finally built based on this score. The best part about this algorithm is that it can be applied to other areas of life too, for example, figuring out the important airports in a flight graph, or the most important people in a social network group.

So much for the basics and the refresher on graphs. In the next sections, we will see how graphs can be used in the real world on massive datasets such as social network data or data used in the field of biology. We will also study how graph analytics can be used on top of these graphs to derive useful deductions.

Plotting graphs

There is a handy open source Java library called GraphStream, which can be used to plot graphs; this is very useful, especially if you want to view the structure of your graphs. While viewing, you can also figure out whether some of the vertices are very close to each other (clustered), or in general how they are placed. Using the GraphStream library is easy. Just download the JAR from http://graphstream-project.org and put it in the classpath of your project. Next, we will show a simple example demonstrating how easy it is to plot a graph using this library.

Just create an instance of a graph. For our example, we will create a simple DefaultGraph and name it SimpleGraph. Next, we will add the nodes or vertices of the graph. We will also add a label attribute that is displayed on each vertex.

```java
Graph graph = new DefaultGraph("SimpleGraph");
graph.addNode("A").setAttribute("ui.label", "A");
graph.addNode("B").setAttribute("ui.label", "B");
graph.addNode("C").setAttribute("ui.label", "C");
```

After building the nodes, it's now time to connect these nodes using edges. The API is simple to use: on the graph instance we can define the edges, provided an ID is given to each edge and the starting and ending nodes are also given.

```java
graph.addEdge("AB", "A", "B");
graph.addEdge("BC", "B", "C");
graph.addEdge("CA", "C", "A");
```

All the information about nodes and edges is present on the graph instance. It's now time to plot this graph on the UI; we can just invoke the display method on the graph instance, as shown next, to display it on the UI.

```java
graph.display();
```

This would plot the graph on the UI as follows:

This library is extensive, and it will be a good learning experience to explore it further; we urge readers to do so on their own.

Massive graphs on big data

Big data comprises huge amounts of data distributed across a cluster of thousands (if not more) of machines. Building graphs based on this massive data poses different challenges, shown as follows:

- Due to the vast amount of data involved, the data for the graph is distributed across a cluster of machines. Hence, in actuality, it's not a single-node graph; we have to build a graph that spans a cluster of machines.
- A graph that spans a cluster of machines has its vertices and edges spread across different machines, and this graph data won't fit into the memory of one single machine. Consider your friend list on Facebook: some of your friends' data in your Facebook friend-list graph might lie on different machines, and this data might be tremendous in size.
Look at the example diagram of a graph of 10 Facebook friends and their network, shown as follows:

As you can see in the preceding diagram, even for just 10 friends the data can be huge, and since the graph is drawn by hand we have not shown many of the connections, to keep the image comprehensible; in real life each person can easily have more than a thousand connections. So imagine what happens to a graph with thousands (if not more) of people in it. For the reasons we just saw, building massive graphs on big data is a different ball game altogether, and there are a few main approaches for building these massive graphs. From the big data perspective, building massive graphs involves running computations and storing data in parallel on many nodes. The two main approaches are bulk synchronous parallel and the Pregel approach; Apache Spark follows the Pregel approach. Covering these approaches in detail is out of the scope of this book; if you are interested in these topics, refer to other books and Wikipedia.

Graph analytics

The biggest advantage of using graphs is that you can analyze them and use them to study complex datasets. You might ask what is so special about graph analytics that we can't do with relational databases. Let's try to understand this using an example. Suppose we want to analyze your friend network on Facebook and pull information about your friends such as their names, their birth dates, their recent likes, and so on. If Facebook had a relational database, then this would mean firing a query on some table using the foreign key of the user requesting this information. From the relational database perspective, this first-level query is easy. But what if we now ask you to go to the friends at level four in your network and fetch their data (as shown in the following diagram)? The query to get this becomes more and more complicated from a relational database perspective, but it is a trivial task on a graph or a graph database (such as Neo4j). Graphs are extremely good at operations where you want to pull information from one node to another node that lies behind a lot of joins and hops. As such, graph analytics is good for certain use cases (but not for all use cases; relational databases are still better for many others).

As you can see, the preceding diagram depicts a huge social network (though it might only be depicting a network of a few friends). The dots represent actual people in a social network. So if somebody asks you to pick one user on the left-most side of the diagram and follow their connections to the right-most side to pull the friends at, say, the tenth level or more, this is something very difficult to do in a normal relational database, and doing it and maintaining it could easily get out of hand.

There are four particular use cases where graph analytics is extremely useful and used frequently (though there are plenty more use cases too):

Path analytics: As the name suggests, this analytical approach is used to figure out the paths as you traverse along the nodes of a graph. There are many fields where this can be used, the simplest being road networks, for figuring out details such as the shortest path between cities, or flight analytics, for figuring out the quickest or direct flights.

Connectivity analytics: As the name suggests, this approach outlines how the nodes within a graph are connected to each other.
Using this, you can figure out how many edges flow into a node and how many flow out of it. This kind of information is very useful in analysis. For example, in a social network, if there is a person who receives just one message but sends out, say, ten messages within their network, then this person can be used to market their favorite products, as they are very good at responding to messages.

Community analytics: Some graphs on big data are huge. But within these huge graphs there might be nodes that are very close to each other and are almost stacked in a cluster of their own. This is useful information, as based on it you can extract communities from your data. For example, in a social network, if there are people who are part of some community, say marathon runners, then they can be clubbed into a single community and tracked further.

Centrality analytics: This kind of analytical approach is useful in finding the central nodes in a network or graph. It is useful in figuring out sources that are single-handedly connected to many other sources, for example, influential people in a social network, or a central computer in a computer network.

From the perspective of this article, we will be covering some of these use cases in our sample case studies, and for this we will be using a library on Apache Spark called GraphFrames.

GraphFrames

The GraphX library is advanced and performs well on massive graphs but, unfortunately, it is currently only implemented in Scala and does not have a direct Java API. GraphFrames is a relatively new library that is built on top of Apache Spark and provides support for DataFrame (now Dataset) based graphs. It contains a lot of methods that are direct wrappers over the underlying GraphX methods. As such, it provides functionality similar to GraphX, except that GraphX acts on Spark RDDs while GraphFrames works on DataFrames, so GraphFrames is more user friendly (DataFrames are simpler to use). All the advantages of firing Spark SQL queries, joining datasets, and filtering queries are supported on it. To understand GraphFrames and the representation of massive big data graphs, we will take small baby steps first, building some simple programs using GraphFrames before building full-fledged case studies.

Summary

In this article, we learned about graph analytics. We saw how graphs can be built even on top of massive big datasets. We learned how Apache Spark can be used to build these massive graphs, and in the process we learned about the new GraphFrames library that helps us build them.

Resources for Article:

Further resources on this subject:

- Saying Hello to Java EE [article]
- Object-Oriented JavaScript [article]
- Introduction to JavaScript [article]


Android UIs with Custom Views

Packt
10 Jul 2017
11 min read
In this article by Raimon Rafols Montane, the author of the book Building Android UIs with Custom Views, we will see the very basic steps we need to get started building Android custom views, where we should use them, and where we should simply rely on the standard Android widgets.

(For more resources related to this topic, see here.)

Why do we need custom views?

There are lovely Android applications on Google Play and in other markets, such as Amazon, built only using the standard Android UI widgets and layouts. There are also many other applications that have that small additional feature that makes our interaction with them easier or simply more pleasing. There is no magic formula, but maybe by just adding something different, something that makes the user feel "hey, it's not just another app", we might increase our user retention. It might not be the deal breaker, but it can definitely make the difference sometimes.

A few years ago, the author was working in the pre-sales department of a mobile app development agency, and when the app Path was launched to the market, they got many requests from customers to build menus like the Path app menu. It wasn't a critical feature for the applications they were building, but at that point it was a cool thing, and many other applications and products wanted to have something similar. Even if it was a simple detail, it made the Path application special at that time. It wasn't the case back then, but nowadays there are many implementations of that menu published on GitHub as open source, although they're not really maintained anymore:

- https://github.com/daCapricorn/ArcMenu
- https://github.com/oguzbilgener/CircularFloatingActionMenu

One of the main reasons to create our own custom views for a mobile application is, precisely, to have something special. It might be a menu, a component, a screen, something that might be really needed, or even the main functionality of our application, or just an additional feature. In addition, by creating a custom view we can actually optimize the performance of our application. We can create a specific way of laying out widgets that would otherwise need many hierarchy layers using only standard Android layouts, or a custom view that simplifies rendering or user interaction.

On the other hand, we can easily fall into the error of trying to build everything custom. Android provides an awesome set of widgets and layout components that manage a lot of things for us. If we ignore the basic Android framework and try to build everything ourselves, it would be a lot of work, we would potentially struggle with a lot of issues and errors that the Android OS developers have already faced, or at least very similar ones, and, to put it in one sentence, we would be reinventing the wheel.

Examples in the market

We all probably use great apps that are built only using the standard Android UI widgets and layouts, but there are many others that have custom views we don't know about or haven't really noticed. Custom views or layouts can sometimes be very subtle and hard to spot. We wouldn't be the first ones to have a custom view or layout in our application. In fact, many popular apps have some custom elements in them. For example, the Etsy application had a custom layout called StaggeredGridView. It was even published as open source on GitHub. It has been deprecated since 2015 in favor of Google's own StaggeredGridLayoutManager, used together with RecyclerView.
For more information, refer to:

- https://github.com/etsy/AndroidStaggeredGrid
- https://developer.android.com/reference/android/support/v7/widget/StaggeredGridLayoutManager.html

Parameterizing our custom view

We now have a custom view that adapts to multiple sizes; that's good, but what happens if we need another custom view that paints the background blue instead of red? And yellow? We shouldn't have to copy the custom view class for each customization. Luckily, we can set parameters on the XML layout and read them from our custom view. First, we need to define the type of parameters we will use in our custom view. We have to create a file called attrs.xml:

```xml
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <declare-styleable name="OwnCustomView">
        <attr name="fillColor" format="color"/>
    </declare-styleable>
</resources>
```

Then, we have to add a new namespace to the layout file where we want to use the new parameter:

```xml
<?xml version="1.0" encoding="utf-8"?>
<ScrollView
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <LinearLayout
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:orientation="vertical"
        android:padding="@dimen/activity_vertical_margin">

        <com.packt.rrafols.customview.OwnCustomView
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            app:fillColor="@android:color/holo_blue_dark"/>
    </LinearLayout>
</ScrollView>
```

Now that we have this defined, let's see how we can read it from our custom view class:

```java
int fillColor;
TypedArray ta = context.getTheme().obtainStyledAttributes(attributeSet,
        R.styleable.OwnCustomView, 0, 0);
try {
    fillColor = ta.getColor(R.styleable.OwnCustomView_fillColor, DEFAULT_FILL_COLOR);
} finally {
    ta.recycle();
}
```

By getting a TypedArray using the styleable attribute ID that the Android tools created for us after saving the attrs.xml file, we will be able to query for the values of those parameters set in the XML layout file. More information about how to obtain a TypedArray can be found in the Android documentation: https://developer.android.com/reference/android/content/res/Resources.Theme.html#obtainStyledAttributes(android.util.AttributeSet, int[], int, int). For more information about TypedArray, refer to: https://developer.android.com/reference/android/content/res/TypedArray.html.

In this example, we created an attribute named fillColor, which will be formatted as a color. This format, or basically the type of the attribute, is very important to limit the kind of values we can set and how these values can be retrieved afterwards from our custom view. Also, for each parameter we define, we get an R.styleable.<name>_<parameter_name> index in the TypedArray. In the code above, we're querying for the fillColor using the R.styleable.OwnCustomView_fillColor index. We shouldn't forget to recycle the TypedArray after using it so it can be reused; however, once recycled, we can't use it again.

Many custom views will only need to draw something in a special way (that's the reason we created them as custom views), but many others will need to react to user events. For example, how will our custom view behave when the user clicks or drags on top of it?

Basic event handling

Let's start by adding some basic event handling to our custom views.

Reacting to touches

One of the first things we need to implement is reacting to touch events. Android provides us with the method onTouchEvent() that we can override in our custom view.
By overriding this method, we'll get any touch event happening on top of the view. To see how it works, let's add it to the custom view:

```java
@Override
public boolean onTouchEvent(MotionEvent event) {
    Log.d(TAG, "touch: " + event);
    return super.onTouchEvent(event);
}
```

Let's also add a log call to see what events we receive. If we run this code and touch on top of our view, we'll get:

```
D/com.packt.rrafols.customview.CircularActivityIndicator: touch: MotionEvent { action=ACTION_DOWN, actionButton=0, id[0]=0, x[0]=644.3645, y[0]=596.55804, toolType[0]=TOOL_TYPE_FINGER, buttonState=0, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=30656461, downTime=30656461, deviceId=9, source=0x1002 }
```

As we can see, there is a lot of information in the event (the coordinates, the action type, the time), but even if we perform more actions on the view, we'll only get ACTION_DOWN events. That's because the default implementation of View is not clickable. If we don't set the clickable flag on the view, onTouchEvent() will return false and ignore further events.

The onTouchEvent() method has to return true if the event has been processed or false if it hasn't. If we receive an event in our custom view and we don't know what to do with it, or we're not interested in that kind of event, we should return false so it can be processed by our view's parent or by any other component or the system.

To receive more types of events, we can do two things:

- Set the view as clickable by using setClickable(true)
- Implement our own logic and process the events ourselves in our custom class

As we'll implement more complex events, we'll go for the second option. Let's do a quick test: change the method to simply return true instead of calling the parent method:

```java
@Override
public boolean onTouchEvent(MotionEvent event) {
    Log.d(TAG, "touch: " + event);
    return true;
}
```

Now we should receive many other types of events, as follows:

```
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_DOWN,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_UP,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_DOWN,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_MOVE,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_MOVE,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_MOVE,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_UP,
...CircularActivityIndicator: touch: MotionEvent { action=ACTION_DOWN,
```

Using the Paint class

We've been drawing some primitives until now, but Canvas provides us with many more primitive rendering methods. We'll briefly cover some of them, but first let's talk about the Paint class, as we haven't introduced it properly. According to the official definition, the Paint class holds the style and color information about how to draw primitives, text, and bitmaps. If we check the examples we've been building, we created a Paint object in our class constructor, or in the onCreate method, and used it to draw primitives later in our onDraw method. For instance, as we set our background Paint instance's style to Paint.Style.FILL, it will fill the primitive, but we can change it to Paint.Style.STROKE if we only want to draw the border or the strokes of the silhouette. We can draw both using Paint.Style.FILL_AND_STROKE. To see Paint.Style.STROKE in action, we'll draw a black border on top of the selected colored bar in our custom view.
Let's start by defining a new Paint object called indicatorBorderPaint and initializing it in our class constructor:

```java
indicatorBorderPaint = new Paint();
indicatorBorderPaint.setAntiAlias(false);
indicatorBorderPaint.setColor(BLACK_COLOR);
indicatorBorderPaint.setStyle(Paint.Style.STROKE);
indicatorBorderPaint.setStrokeWidth(BORDER_SIZE);
indicatorBorderPaint.setStrokeCap(Paint.Cap.BUTT);
```

We also defined a constant with the size of the border line and set the stroke width to this size. If we set the width to 0, Android guarantees it will use a single pixel to draw the line. As we want to draw a thick black border, this is not our case right now. In addition, we set the stroke cap to Paint.Cap.BUTT to avoid the stroke overflowing its path. There are two more caps we can use, Paint.Cap.SQUARE and Paint.Cap.ROUND; these last two end the stroke with a circle (rounding the stroke) or a square, respectively. Let's see the differences between the three caps and also introduce the drawLine primitive.

First of all, we create an array with all three caps, so we can easily iterate over them and write more compact code:

```java
private Paint.Cap[] CAPS = new Paint.Cap[] {
        Paint.Cap.BUTT,
        Paint.Cap.ROUND,
        Paint.Cap.SQUARE
};
```

Now, in our onDraw method, let's draw a line using each of the caps with the drawLine(float startX, float startY, float stopX, float stopY, Paint paint) method:

```java
int xPos = (getWidth() - 100) / 2;
int yPos = getHeight() / 2 - BORDER_SIZE * CAPS.length / 2;
for (int i = 0; i < CAPS.length; i++) {
    indicatorBorderPaint.setStrokeCap(CAPS[i]);
    canvas.drawLine(xPos, yPos, xPos + 100, yPos, indicatorBorderPaint);
    yPos += BORDER_SIZE * 2;
}
indicatorBorderPaint.setStrokeCap(Paint.Cap.BUTT);
```

Summary

In this article we have seen the reasoning behind why we might want to build custom views and layouts. Android provides a great basic framework for creating UIs, and not using it would be a mistake. Not every component, button, or widget has to be developed completely custom, but by doing so in the right spot we can add an extra feature that might make our application remembered. We have also seen the basics of event handling.

Resources for Article:

Further resources on this subject:

- Offloading work from the UI Thread on Android [article]
- Getting started with Android Development [article]
- Practical How-To Recipes for Android [article]


HPC Cluster Computing

Packt
10 Jul 2017
8 min read
In this article by Raja Malleswara Rao Pattamsetti, the author of the book Distributed Computing in Java 9, we look at how the processing requirements of organizational applications have outgrown what a single sequential computer configuration can provide. These requirements can be addressed to an extent by increasing processor capacity and other resource allocations. While this can improve performance for a while, future computational needs are limited both by how much more powerful individual processors can be made and by the cost incurred in producing such powerful systems. There is also a need for efficient algorithms and practices to get the best results out of the available hardware. A practical and economical substitute for these single high-power computers is to establish multiple lower-power processors that work collectively and pool their processing capabilities. The result is a powerful system, that is, a parallel computer, which permits processing activities to be distributed among multiple low-capacity computers that together obtain the best expected result.

In this article, we will cover the following topics:

- Era of computing
- Commanding parallel system architectures
- MPP: Massively Parallel Processors
- SMP: Symmetric Multiprocessors
- CC-NUMA: Cache Coherent Nonuniform Memory Access
- Distributed systems
- Clusters
- Java support for High-Performance Computing

(For more resources related to this topic, see here.)

Era of computing

Technological developments in the computing industry are rapid, helped by advancements in system software and hardware. The hardware advancements centre on processors and their fabrication techniques, and high-performance processors are being produced at amazingly low cost. These advancements are further boosted by high-bandwidth, low-latency network systems. Very Large Scale Integration (VLSI) as a concept brought several phenomenal advancements in producing commanding sequential and parallel processing computers. Simultaneously, system software advancements have improved operating system capabilities and advanced software development techniques. Two commanding computing eras have been observed, namely the sequential and the parallel era of computing. The following diagram shows the advancements in both eras of computing, from the last few decades to the forecast for the next two decades.

In each of these eras, it is observed that hardware architecture growth is trailed by system software advancements: as more powerful hardware evolved, the corresponding software programs and operating system capacities doubled their strength. Applications and problem-solving environments were added on top of this with the advent of parallel computing, the move from mini to microprocessors, and clustered computers. Let's now review some of the commanding parallel system architectures from the last few years.

Commanding parallel system architectures

Over the last few years, multiple varieties of computing models providing great processing performance have been developed. They are classified depending on how memory is allocated to their processors and how the processors are aligned.
Some of the important parallel systems are as follows:

- MPP: Massively Parallel Processors
- SMP: Symmetric Multiprocessors
- CC-NUMA: Cache Coherent Nonuniform Memory Access
- Distributed systems
- Clusters

Let's now look at each of these system architectures in a little more detail and review some of their important performance characteristics.

Massively Parallel Processors (MPP)

Massively Parallel Processors (MPP), as the name suggests, are huge parallel processing systems built on a shared-nothing architecture. Such systems usually contain a large number of processing nodes integrated by a high-speed interconnect or network switch. A node is simply a computing element with a diverse combination of hardware components, usually containing a memory unit and one or more processors. Some special-purpose nodes are designed to have backup disks or additional storage capacity. The following diagram represents the massively parallel processor architecture.

Symmetric Multiprocessors (SMP)

Symmetric Multiprocessors (SMP), as the name suggests, contain a limited set of processors, ranging from 2 to 64, that share most of the system's resources. A single instance of the operating system runs on all the connected processors, while they commonly share the I/O, memory, and the network bus. This nature of a set of similar processors connected and acting together under one operating system is the essential behavior of symmetric multiprocessors. The following diagram depicts the symmetric multiprocessor architecture.

Cache Coherent Nonuniform Memory Access (CC-NUMA)

Cache Coherent Nonuniform Memory Access (CC-NUMA) is a special type of multiprocessor system with scalable additional processor capacity. In CC-NUMA, the SMP system and the other remote nodes communicate through a remote interconnection link. Each of the remote nodes contains local memory and processors of its own. The following diagram represents the cache coherent nonuniform memory access architecture.

The nature of memory access is nonuniform. Just like in symmetric multiprocessors, a CC-NUMA system has a comprehensive view of the entire system memory and, as the name suggests, it takes a nonuniform amount of time to access close or distant memory locations.

Distributed systems

Distributed systems, as we have been discussing in previous articles, are traditionally individual sets of computers interconnected through an Internet/intranet, each running its own operating system. Any diverse set of computer systems can be part of a distributed system, including combinations of massively parallel processors, symmetric multiprocessors, distinct computers, and clusters.

Clusters

A cluster is an assembly of terminals or computers connected over a network to address a large processing requirement through parallel processing. Hence, clusters are usually configured with terminals or computers of higher processing capability connected by a high-speed network. Usually, a cluster is treated as a single image that integrates a number of nodes with pooled resource utilization.
The following diagram is a sample clustered Tomcat server for a typical J2EE web application deployment.

The following table shows some of the important performance characteristics of the different parallel architecture systems discussed so far.

Network of workstations

A network of workstations is a group of connected resources, such as system processors, network interfaces, storage, and data disks, that opens up space for newer combinations such as:

- Parallel computing: As discussed in the previous sections, a group of system processors can be connected as MPP or DSM, which provides parallel processing ability.
- RAM network: As a number of systems are connected, each system's memory can collectively work as a DRAM cache, which greatly expands the virtual memory of the entire system and improves its processing ability.
- Software Redundant Array of Inexpensive Disks (RAID): A group of systems connected in an array over the local area network improves system stability, availability, and storage capacity using multiple low-cost machines. This also provides simultaneous I/O support.
- Multipath communication: Multipath communication is a technique of using more than one network connection between the workstations to allow simultaneous information exchange among system nodes.

Java support for High-Performance Computing

Java provides numerous advantages for HPC (High-Performance Computing) as a programming language, particularly with grid computing. Some of the key benefits of using Java in HPC include the following:

- Portability: The capability to write a program on one platform and port it to run on any other platform has always been the biggest strength of the Java language. This continues to be an advantage when porting Java applications to HPC systems. In grid computing, this is an essential feature, as the execution environment is decided at the time the program runs. This is possible because Java bytecode executes in the Java Virtual Machine (JVM), which itself acts as an abstract operating environment.
- Network centricity: As discussed in previous articles, Java provides great support for distributed systems with its network-centric features of remote procedure calls (RPC) through RMI and CORBA services. Along with these, Java's support for socket programming is another great fit for grid computing.
- Software engineering: Another great feature of Java is the way it can be used to engineer solutions. We can produce object-oriented, loosely coupled software that can be independently added as an API through JAR files to multiple other systems. Encapsulation and interface-based programming are great advantages in such application development.
- Security: Java is a highly secure programming language, with features like bytecode verification to limit resource utilization by untrusted intruder software or programs. This is a great advantage when running applications in a distributed environment with remote communication established over a shared network.
- GUI development: Java's support for platform-independent GUI development is a great advantage for developing and deploying enterprise web applications that interact with an HPC environment.
- Availability: Java is well supported on multiple operating systems, including Windows, Linux, SGI, and Compaq. It is easily available to consume and develop with, thanks to a great set of open source frameworks built on top of it.
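As a small illustration of the shared-memory parallelism Java offers out of the box on a multicore node, here is a self-contained fork/join example that sums a large array across the available cores. This is a generic JDK sketch, not code from the book; the class name and the threshold value are chosen only for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;
    private final long[] data;
    private final int from, to;

    ParallelSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {          // small enough: sum sequentially
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) / 2;             // otherwise split the work in two
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                           // run the left half asynchronously
        return right.compute() + left.join();  // compute the right half here, then join
    }

    public static void main(String[] args) {
        long[] numbers = new long[1_000_000];
        java.util.Arrays.fill(numbers, 1L);
        long total = ForkJoinPool.commonPool().invoke(new ParallelSum(numbers, 0, numbers.length));
        System.out.println(total); // 1000000
    }
}
```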
While the points listed above argue in favor of using Java for HPC environments, some concerns need to be reviewed when designing Java applications, including its numerics (complex numbers, fastfp, multidimensional arrays), performance, and parallel computing designs.

Summary

Through this article, you have learned about the era of computing, commanding parallel system architectures, MPP (Massively Parallel Processors), SMP (Symmetric Multiprocessors), CC-NUMA (Cache Coherent Nonuniform Memory Access), distributed systems, clusters, and Java support for High-Performance Computing.

Resources for Article:

Further resources on this subject:

- The Amazon S3 SDK for Java [article]
- Gradle with the Java Plugin [article]
- Getting Started with Sorting Algorithms in Java [article]


Queues and topics

Packt
10 Jul 2017
8 min read
In this article by Luca Stancapiano, the author of the book Mastering Java EE Development with WildFly, we will see how to implement Java Message Service (JMS) in a queue channel using the WildFly console.

(For more resources related to this topic, see here.)

JMS works with channels of messages that are managed asynchronously. These channels contain messages that are collected or removed according to the configuration and the type of channel. The channels are of two types, queues and topics, and they are highly configurable through the WildFly console. As with all components in WildFly, they can be installed through the console, the command line, or directly with the Maven plugins of the project. In the next two sections we will show what they are and all the possible configurations.

Queues

Queues collect the sent messages that are waiting to be read. The messages are delivered in the order they are sent and, once read, they are removed from the queue.

Create the queue from the web console

Let's see the steps to create a new queue through the web console:

1. Connect to http://localhost:9990/.
2. Go to Configuration | Subsystems/Messaging - ActiveMQ/default and click on Queues/Topics.
3. Select the Queues menu and click on the Add button. You will see this screen:

The parameters to insert are as follows:

- Name: The name of the queue.
- JNDI Names: The JNDI names the queue will be bound to.
- Durable?: Whether the queue is durable or not.
- Selector: The queue selector.

As with all enterprise components, JMS components can be looked up through the Java Naming and Directory Interface (JNDI). Durable queues keep messages around persistently for any suitable consumer to consume them; they do not need to concern themselves with which consumer is going to consume the messages at some point in the future. There is just one copy of each message that any consumer can consume in the future.

Message selectors allow you to filter the messages that a MessageConsumer will receive. The filter is a relatively complex language, similar to the syntax of an SQL WHERE clause. The selector can use all the message headers and properties for filtering, but cannot use the message content. Selectors are mostly useful for channels that broadcast a very large number of messages to their subscribers. On queues, only messages that match the selector are returned; the others stay in the queue (and thus can be read by a MessageConsumer with a different selector). The following SQL elements are allowed in our filters and can be put in the Selector field of the form:

| Element | Description | Example selector |
| --- | --- | --- |
| AND, OR, NOT | Logical operators | (releaseYear < 1986) AND NOT (title = 'Bad') |
| String literals | String literals in single quotes; duplicate the quote to escape it | title = 'Tom''s' |
| Number literals | Numbers in Java syntax; they can be double or integer | releaseYear = 1982 |
| Properties | Message properties that follow Java identifier naming | releaseYear = 1983 |
| Boolean literals | TRUE and FALSE | isAvailable = FALSE |
| ( ) | Round brackets | (releaseYear < 1981) OR (releaseYear > 1990) |
| BETWEEN | Checks whether a number is in a range (both bounds inclusive) | releaseYear BETWEEN 1980 AND 1989 |
| Header fields | Any headers except JMSDestination, JMSExpiration and JMSReplyTo | JMSPriority = 10 |
| =, <>, <, <=, >, >= | Comparison operators | (releaseYear < 1986) AND (title <> 'Bad') |
| LIKE | String comparison with wildcards '_' and '%' | title LIKE 'Mirror%' |
| IN | Finds a value in a set of strings | title IN ('Piece of mind', 'Somewhere in time', 'Powerslave') |
| IS NULL, IS NOT NULL | Checks whether a value is null or not null | releaseYear IS NULL |
| *, +, -, / | Arithmetic operators | releaseYear * 2 > 2000 - 18 |
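To show where a selector is actually applied, here is a small illustrative sketch (not from the book) of a consumer bean that only receives messages matching a selector on the standard JMSPriority header. JMSContext.createConsumer(Destination, String) and JMSConsumer.receiveBody(Class, long) are standard JMS 2.0 APIs; the bean and method names are invented for this example, and the queue JNDI name matches the GPS queue created in this article.

```java
import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.inject.Inject;
import javax.jms.JMSConsumer;
import javax.jms.JMSContext;
import javax.jms.Queue;

@Stateless
public class FilteredQueueReceiver {

    @Inject
    private JMSContext context;

    @Resource(mappedName = "java:/jms/queue/GPS")
    private Queue queue;

    public String receiveImportantMessage() {
        // Only messages matching the selector are delivered to this consumer;
        // the rest stay in the queue for other consumers.
        try (JMSConsumer consumer = context.createConsumer(queue, "JMSPriority > 5")) {
            return consumer.receiveBody(String.class, 1000); // wait up to one second
        }
    }
}
```

With selectors covered, let's return to the web console form.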
Now fill in the form. In this article we will implement a messaging service that sends the coordinates of buses. The queue is created and shown in the queues list:

Create the queue using the CLI and the Maven WildFly plugin

The same thing can be done with the Command Line Interface (CLI). Start a WildFly instance, go to the bin directory of WildFly, and execute the following script:

```
bash-3.2$ ./jboss-cli.sh
You are disconnected at the moment. Type 'connect' to connect to the server or 'help' for the list of supported commands.
[disconnected /] connect
[standalone@localhost:9990 /] /subsystem=messaging-activemq/server=default/jms-queue=gps_coordinates:add(entries=["java:/jms/queue/GPS"])
{"outcome" => "success"}
```

The same thing can be done through Maven. Simply add this snippet to your pom.xml:

```xml
<plugin>
  <groupId>org.wildfly.plugins</groupId>
  <artifactId>wildfly-maven-plugin</artifactId>
  <version>1.0.2.Final</version>
  <executions>
    <execution>
      <id>add-resources</id>
      <phase>install</phase>
      <goals>
        <goal>add-resource</goal>
      </goals>
      <configuration>
        <resources>
          <resource>
            <address>subsystem=messaging-activemq,server=default,jms-queue=gps_coordinates</address>
            <properties>
              <durable>true</durable>
              <entries>!!["gps_coordinates", "java:/jms/queue/GPS"]</entries>
            </properties>
          </resource>
        </resources>
      </configuration>
    </execution>
    <execution>
      <id>del-resources</id>
      <phase>clean</phase>
      <goals>
        <goal>undeploy</goal>
      </goals>
      <configuration>
        <afterDeployment>
          <commands>
            <command>/subsystem=messaging-activemq/server=default/jms-queue=gps_coordinates:remove</command>
          </commands>
        </afterDeployment>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The Maven WildFly plugin lets you perform admin operations in WildFly using the same custom protocol used by the command line. Two executions are configured:

- add-resources: It hooks the install Maven phase and adds the queue, passing the name, JNDI, and durable parameters seen in the previous section.
- del-resources: It hooks the clean Maven phase and removes the chosen queue by name.

Create the queue through an Arquillian test case

Alternatively, we can add and remove the queue through an Arquillian test case:
```java
@RunWith(Arquillian.class)
@ServerSetup(MessagingResourcesSetupTask.class)
public class MessageTestCase {

    ...

    private static final String QUEUE_NAME = "gps_coordinates";
    private static final String QUEUE_LOOKUP = "java:/jms/queue/GPS";

    static class MessagingResourcesSetupTask implements ServerSetupTask {

        @Override
        public void setup(ManagementClient managementClient, String containerId) throws Exception {
            getInstance(managementClient.getControllerClient()).createJmsQueue(QUEUE_NAME, QUEUE_LOOKUP);
        }

        @Override
        public void tearDown(ManagementClient managementClient, String containerId) throws Exception {
            getInstance(managementClient.getControllerClient()).removeJmsQueue(QUEUE_NAME);
        }
    }

    ...
}
```

The Arquillian org.jboss.as.arquillian.api.ServerSetup annotation lets you use an external setup manager to install or remove components inside WildFly. In this case we are installing the queue declared with the two variables QUEUE_NAME and QUEUE_LOOKUP. When the test ends, the tearDown method is started automatically and removes the installed queue. To use Arquillian, it's important to add the WildFly test suite dependency to your pom.xml project:

```xml
...
<dependencies>
  <dependency>
    <groupId>org.wildfly</groupId>
    <artifactId>wildfly-testsuite-shared</artifactId>
    <version>10.1.0.Final</version>
    <scope>test</scope>
  </dependency>
</dependencies>
...
```

Going into standalone-full.xml, we will find the created queue as:

```xml
<subsystem ...>
  <server name="default">
    ...
    <jms-queue name="gps_coordinates" entries="java:/jms/queue/GPS"/>
    ...
  </server>
</subsystem>
```

JMS is available in the standalone-full configuration. By default WildFly supports four standalone configurations. They can be found in the standalone/configuration directory:

- standalone.xml: It supports all components except messaging and CORBA/IIOP
- standalone-full.xml: It supports all components
- standalone-ha.xml: It supports all components except messaging and CORBA/IIOP, with clustering enabled
- standalone-full-ha.xml: It supports all components, with clustering enabled

To start WildFly with the chosen configuration, simply add -c with the configuration to the standalone.sh script. Here is a sample that starts the standalone full configuration:

```
./standalone.sh -c standalone-full.xml
```

Create the Java client for the queue

Let's now see how to create a client that sends a message to the queue. JMS 2.0 simplifies the creation of clients very much. Here is a sample of a client inside a stateless Enterprise Java Bean (EJB):

```java
@Stateless
public class MessageQueueSender {

    @Inject
    private JMSContext context;

    @Resource(mappedName = "java:/jms/queue/GPS")
    private Queue queue;

    public void sendMessage(String message) {
        context.createProducer().send(queue, message);
    }
}
```

The javax.jms.JMSContext is injectable from any EE component. We will see the JMS context in detail in the next section, The JMS context. The queue is represented in JMS by the javax.jms.Queue class. It can be injected as a JNDI resource through the @Resource annotation. The JMS context, through the createProducer method, creates a producer, represented by the javax.jms.JMSProducer class, which is used to send the messages. We can now create a client that injects the stateless bean and sends the string message "hello!":

```java
...
@EJB
private MessageQueueSender messageQueueSender;
...
messageQueueSender.sendMessage("hello!");
```

Summary

In this article we have seen how to implement the Java Message Service in a queue channel using the web console, the Command Line Interface, the Maven WildFly plugins, and Arquillian test cases, and how to create Java clients for the queue.
Resources for Article:

Further resources on this subject:

- WildFly – the Basics [article]
- WebSockets in Wildfly [article]
- Creating Java EE Applications [article]


Thread of Execution

Packt
10 Jul 2017
6 min read
In this article by Anton Polukhin Alekseevic, the author of the book Boost C++ Application Development Cookbook - Second Edition, we will look at the concept of multithreading. Multithreading means that multiple threads of execution exist within a single process. Threads may share process resources and also have their own resources. Those threads of execution may run independently on different CPUs, leading to faster and more responsive programs. Let's see how to create a thread of execution.

(For more resources related to this topic, see here.)

Creating a thread of execution

On modern multicore computers, to achieve maximal performance (or just to provide a good user experience), programs usually must use multiple threads of execution. Here is a motivating example in which we need to create and fill a big file in a thread that draws the user interface:

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>

bool is_first_run();

// Function that executes for a long time.
void fill_file(char fill_char, std::size_t size, const char* filename);

// ...
// Somewhere in the thread that draws a user interface:
if (is_first_run()) {
    // This will be executing for a long time during which
    // the user interface freezes.
    fill_file(0, 8 * 1024 * 1024, "save_file.txt");
}
```

Getting ready

This recipe requires knowledge of boost::bind or std::bind.

How to do it...

Starting a thread of execution was never so easy:

```cpp
#include <boost/thread.hpp>

// ...
// Somewhere in the thread that draws a user interface:
if (is_first_run()) {
    boost::thread(boost::bind(
        &fill_file, 0, 8 * 1024 * 1024, "save_file.txt"
    )).detach();
}
```

How it works...

The boost::thread variable accepts a functional object that can be called without parameters (we provided one using boost::bind) and creates a separate thread of execution. That functional object is copied into the constructed thread of execution and run there.

We are using version 4 of Boost.Thread in all recipes (BOOST_THREAD_VERSION is defined to 4). Important differences between Boost.Thread versions are highlighted.

After that, we call the detach() function, which does the following:

- The thread of execution is detached from the boost::thread variable but continues its execution
- The boost::thread variable starts to hold a Not-A-Thread state

Note that without a call to detach(), the destructor of boost::thread will notice that it still holds an OS thread and will call std::terminate. std::terminate terminates our program without calling destructors, freeing up resources, or doing other cleanup. Default constructed threads also have a Not-A-Thread state, and they do not create a separate thread of execution.

There's more...

What if we want to make sure that the file was created and written before doing some other job? In that case we need to join the thread in the following way:

```cpp
// ...
// Somewhere in the thread that draws a user interface:
if (is_first_run()) {
    boost::thread t(boost::bind(
        &fill_file, 0, 8 * 1024 * 1024, "save_file.txt"
    ));

    // Do some work.
    // ...

    // Waiting for the thread to finish.
    t.join();
}
```

After the thread is joined, the boost::thread variable holds a Not-A-Thread state and its destructor does not call std::terminate. Remember that the thread must be joined or detached before its destructor is called. Otherwise, your program will terminate! With BOOST_THREAD_VERSION=2 defined, the destructor of boost::thread calls detach(), which does not lead to std::terminate.
But doing so breaks compatibility with std::thread, and some day, when your project moves to the C++ standard library threads or when BOOST_THREAD_VERSION=2 is no longer supported, this will give you a lot of surprises. Version 4 of Boost.Thread is more explicit and strict, which is usually preferable in C++.

Beware that std::terminate() is called when any exception that is not of type boost::thread_interrupted leaves the boundary of the functional object that was passed to the boost::thread constructor.

There is a very helpful RAII wrapper around the thread that allows you to emulate the BOOST_THREAD_VERSION=2 behavior; it is called boost::scoped_thread<T>, where T can be one of the following classes:

boost::interrupt_and_join_if_joinable: To interrupt and join the thread at destruction
boost::join_if_joinable: To join the thread at destruction
boost::detach: To detach the thread at destruction

Here is a small example:

#include <boost/thread/scoped_thread.hpp>

void some_func();

void example_with_raii() {
    boost::scoped_thread<boost::join_if_joinable> t(
        boost::thread(&some_func)
    );
    // 't' will be joined at scope exit.
}

The boost::thread class was accepted as a part of the C++11 standard and you can find it in the <thread> header in the std:: namespace. There is no big difference between Boost's version 4 and the C++11 standard library version of the thread class. However, boost::thread is available on C++03 compilers, so its usage is more versatile.

There is a very good reason for calling std::terminate instead of joining by default! C and C++ are often used in life-critical software. Such software is controlled by other software, called a watchdog. A watchdog can easily detect that an application has terminated, but it cannot always detect deadlocks, or it detects them only after long delays. For example, for defibrillator software it is safer to terminate than to hang on join() for a few seconds waiting for a watchdog reaction. Keep that in mind while designing such applications.

See also

All the recipes in this chapter use Boost.Thread. You may continue reading to get more information about the library. The official documentation has a full list of the boost::thread methods and remarks about their availability in the C++11 standard library. The official documentation can be found at http://boost.org/libs/thread. The Interrupting a thread recipe will give you an idea of what the boost::interrupt_and_join_if_joinable class does.

Summary

We saw how to create a thread of execution using some easy techniques.

Resources for Article:
Further resources on this subject:
Introducing the Boost C++ Libraries [article]
Boost.Asio C++ Network Programming [article]
Application Development in Visual C++ - The Tetris Application [article]
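To round off the joining discussion from the There's more... section, the following sketch combines the recipe's fill_file example with boost::scoped_thread so that the join is guaranteed even if the intervening work throws an exception. The do_other_work function is a placeholder added for illustration; everything else uses only the Boost.Thread facilities shown above.

#include <boost/thread/scoped_thread.hpp>
#include <boost/bind.hpp>
#include <cstddef>

// Declared elsewhere in the recipe.
void fill_file(char fill_char, std::size_t size, const char* filename);

// Placeholder for whatever the UI thread does while the file is written.
void do_other_work();

void first_run_with_guaranteed_join() {
    // scoped_thread joins at scope exit, so the join happens even if
    // do_other_work() throws.
    boost::scoped_thread<boost::join_if_joinable> t(
        boost::thread(boost::bind(&fill_file, 0, 8 * 1024 * 1024, "save_file.txt"))
    );

    do_other_work();
}   // 't' is joined here.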

When do we use R over Python?

Packt
10 Jul 2017
16 min read
In the article Prabhanjan Tattar, author of book Practical Data Science Cookbook - Second Edition, explainsPython is an interpreted language (sometimes referred to as a scripting language), much like R. It requires no special IDE or software compilation tools and is therefore as fast as R to develop with and prototype. Like R, it also makes use of C shared objects to improve computational performance. Additionally, Python is a default system tool on Linux, Unix, and Mac OS X machines and is available on Windows. Python comes with batteries included, which means that the standard library is widely inclusive of many modules, from multiprocessing to compression toolsets. Python is a flexible computing powerhouse that can tackle any problem domain. If you find yourself in need of libraries that are outside of the standard library, Python also comes with a package manager (like R) that allows the download and installation of other code bases. (For more resources related to this topic, see here.) Python’s computational flexibility means that some analytical tasks take more lines of code than their counterpart in R. However, Python does have the tools that allow it to perform the same statistical computing. This leads to an obvious question: When do we use R over Python and vice versa? This article attempts to answer this question by taking an application-oriented approach to statistical analyses. From books to movies to people to follow on Twitter, recommender systems carve the deluge of information on the Internet into a more personalized flow, thus improving the performance of e-commerce, web, and social applications. It is no great surprise, given the success of Amazon-monetizing recommendations and the Netflix Prize, that any discussion of personalization or data-theoretic prediction would involve a recommender. What is surprising is how simple recommenders are to implement yet how susceptible they are to vagaries of sparse data and overfitting. Consider a non-algorithmic approach to eliciting recommendations; one of the easiest ways to garner a recommendation is to look at the preferences of someone we trust. We are implicitly comparing our preferences to theirs, and the more similarities you share, the more likely you are to discover novel, shared preferences. However, everyone is unique, and our preferences exist across a variety of categories and domains. What if you could leverage the preferences of a great number of people and not just those you trust? In the aggregate, you would be able to see patterns, not just of people like you, but also anti-recommendations— things to stay away from, cautioned by the people not like you. You would, hopefully, also see subtle delineations across the shared preference space of groups of people who share parts of your own unique experience. Understanding the data Understanding your data is critical to all data-related work. In this recipe, we acquire and take a first look at the data that we will be using to build our recommendation engine. Getting ready To prepare for this recipe, and the rest of the article, download the MovieLens data from the GroupLens website of the University of Minnesota. You can find the data at http://grouplens.org/datasets/movielens/. In this recipe, we will use the smaller MoveLens 100k dataset (4.7 MB in size) in order to load the entire model into the memory with ease. 
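If you prefer to script the download instead of fetching the archive by hand, a few lines of Python 2 (matching the rest of the recipes) will do it. The archive URL below is an assumption; verify the exact ml-100k.zip link on the GroupLens downloads page before running this.

import os
import urllib
import zipfile

# Assumed download location; confirm it at http://grouplens.org/datasets/movielens/
ML_100K_URL = "http://files.grouplens.org/datasets/movielens/ml-100k.zip"

def fetch_movielens_100k(target_dir="data"):
    """Download and unpack the MovieLens 100k dataset into target_dir."""
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    archive = os.path.join(target_dir, "ml-100k.zip")
    if not os.path.exists(archive):
        urllib.urlretrieve(ML_100K_URL, archive)
    with zipfile.ZipFile(archive) as z:
        z.extractall(target_dir)

fetch_movielens_100k()

After extraction you should find u.data and u.item under data/ml-100k/, which is where the later recipes expect them.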
How to do it… Perform the following steps to better understand the data that we will be working with throughout: Download the data from http://grouplens.org/datasets/movielens/.The 100K dataset is the one that you want (ml-100k.zip): Unzip the downloaded data into the directory of your choice. The two files that we are mainly concerned with are u.data, which contains the user movie ratings, and u.item, which contains movie information and details. To get a sense of each file, use the head command at the command prompt for Mac and Linux or the more command for Windows: head -n 5 u.item Note that if you are working on a computer running the Microsoft Windows operating system and not using a virtual machine (not recommended), you do not have access to the head command; instead, use the following command: moreu.item 2 n The preceding command gives you the following output: 1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0 2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0 3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0 4|Get Shorty (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)|0|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|0|0|0 5|Copycat (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Copycat%20(1995)|0|0|0|0|0|0|1|0|1|0|0|0|0|0|0|0|1|0|0 The following command will produce the given output: head -n 5 u.data For Windows, you can use the following command: moreu.item 2 n 196 242 3 881250949 186 302 3 891717742 22 377 1 878887116 244 51 2 880606923 166 346 1 886397596 How it works… The two main files that we will be using are as follows: u.data: This contains the user moving ratings u.item: This contains the movie information and other details Both are character-delimited files; u.data, which is the main file, is tab delimited, and u.item is pipe delimited. For u.data, the first column is the user ID, the second column is the movie ID, the third is the star rating, and the last is the timestamp. The u.item file contains much more information, including the ID, title, release date, and even a URL to IMDB. Interestingly, this file also has a Boolean array indicating the genre(s) of each movie, including (in order) action, adventure, animation, children, comedy, crime, documentary, drama, fantasy, film-noir, horror, musical, mystery, romance, sci-fi, thriller, war, and western. There’s more… Free, web-scale datasets that are appropriate for building recommendation engines are few and far between. As a result, the movie lens dataset is a very popular choice for such a task but there are others as well. The well-known Netflix Prize dataset has been pulled down by Netflix. However, there is a dump of all user-contributed content from the Stack Exchange network (including Stack Overflow) available via the Internet Archive (https://archive.org/details/stackexchange). Additionally, there is a book-crossing dataset that contains over a million ratings of about a quarter million different books (http://www2.informatik.uni-freiburg.de/~cziegler/BX/). Ingesting the movie review data Recommendation engines require large amounts of training data in order to do a good job, which is why they’re often relegated to big data projects. 
However, to build a recommendation engine, we must first get the required data into memory and, due to the size of the data, must do so in a memory-safe and efficient way. Luckily, Python has all of the tools to get the job done, and this recipe shows you how. Getting ready You will need to have the appropriate movie lens dataset downloaded, as specified in the preceding recipe. If you skipped the setup in you will need to go back and ensure that you have NumPy correctly installed. How to do it… The following steps guide you through the creation of the functions that we will need in order to load the datasets into the memory: Open your favorite Python editor or IDE. There is a lot of code, so it should be far simpler to enter directly into a text file than Read-Eval-Print Loop (REPL). We create a function to import the movie reviews: In [1]: import csv ...: import datetime In [2]: defload_reviews(path, **kwargs): ...: “““ ...: Loads MovieLens reviews ...: “““ ...: options = { ...: ‘fieldnames’: (‘userid’, ‘movieid’, ‘rating’, ‘timestamp’), ...: ‘delimiter’: ‘t’, ...: } ...: options.update(kwargs) ...: ...: parse_date = lambda r,k: datetime.fromtimestamp(float(r[k])) ...: parse_int = lambda r,k: int(r[k]) ...: ...: with open(path, ‘rb’) as reviews: ...: reader = csv.DictReader(reviews, **options) ...: for row in reader: ...: row[‘movieid’] = parse_int(row, ‘movieid’) ...: row[‘userid’] = parse_int(row, ‘userid’) ...: row[‘rating’] = parse_int(row, ‘rating’) ...: row[‘timestamp’] = parse_date(row, ‘timestamp’) ...: yield row We create a helper function to help import the data: In [3]: import os ...: defrelative_path(path): ...: “““ ...: Returns a path relative from this code file ...: “““ ...: dirname = os.path.dirname(os.path.realpath(‘__file__’)) ...: path = os.path.join(dirname, path) ...: return os.path.normpath(path)   We create another function to load the movie information: In [4]: defload_movies(path, **kwargs): ...: ...: options = { ...: ‘fieldnames’: (‘movieid’, ‘title’, ‘release’, ‘video’, ‘url’), ...: ‘delimiter’: ‘|’, ...: ‘restkey’: ‘genre’, ...: } ...: options.update(kwargs) ...: ...: parse_int = lambda r,k: int(r[k]) ...: parse_date = lambda r,k: datetime.strptime(r[k], ‘%d-%b-%Y’) if r[k] else None ...: ...: with open(path, ‘rb’) as movies: ...: reader = csv.DictReader(movies, **options) ...: for row in reader: ...: row[‘movieid’] = parse_int(row, ‘movieid’) ...: row[‘release’] = parse_date(row, ‘release’) ...: row[‘video’] = parse_date(row, ‘video’) ...: yield row Finally, we start creating a MovieLens class that will be augmented later : In [5]: from collections import defaultdict In [6]: class MovieLens(object): ...: “““ ...: Data structure to build our recommender model on. ...: “““ ...: ...: def __init__(self, udata, uitem): ...: “““ ...: Instantiate with a path to u.data and u.item ...: “““ ...: self.udata = udata ...: self.uitem = uitem ...: self.movies = {} ...: self.reviews = defaultdict(dict) ...: self.load_dataset() ...: ...: defload_dataset(self): ...: “““ ...: Loads the two datasets into memory, indexed on the ID. 
...: “““ ...: for movie in load_movies(self.uitem): ...: self.movies[movie[‘movieid’]] = movie ...: ...: for review in load_reviews(self.udata): ...: self.reviews[review[‘userid’]][review[‘movieid’]] = review Ensure that the functions have been imported into your REPL or the IPython workspace, and type the following, making sure that the path to the data files is appropriate for your system: In [7]: data = relative_path(‘../data/ml-100k/u.data’) ...: item = relative_path(‘../data/ml-100k/u.item’) ...: model = MovieLens(data, item) How it works… The methodology that we use for the two data-loading functions (load_reviews and load_movies) is simple, but it takes care of the details of parsing the data from the disk. We created a function that takes a path to our dataset and then any optional keywords. We know that we have specific ways in which we need to interact with the csv module, so we create default options, passing in the field names of the rows along with the delimiter, which is t. The options.update(kwargs) line means that we’ll accept whatever users pass to this function. We then created internal parsing functions using a lambda function in Python. These simple parsers take a row and a key as input and return the converted input. This is an example of using lambda as internal, reusable code blocks and is a common technique in Python. Finally, we open our file and create a csv.DictReader function with our options. Iterating through the rows in the reader, we parse the fields that we want to be int and datetime, respectively, and then yield the row. Note that as we are unsure about the actual size of the input file, we are doing this in a memory-safe manner using Python generators. Using yield instead of return ensures that Python creates a generator under the hood and does not load the entire dataset into the memory. We’ll use each of these methodologies to load the datasets at various times through our computation that uses this dataset. We’ll need to know where these files are at all times, which can be a pain, especially in larger code bases; in the There’s more… section, we’ll discuss a Python pro-tip to alleviate this concern. Finally, we created a data structure, which is the MovieLens class, with which we can hold our reviews’ data. This structure takes the udata and uitem paths, and then, it loads the movies and reviews into two Python dictionaries that are indexed by movieid and userid, respectively. To instantiate this object, you will execute something as follows: In [7]: data = relative_path(‘../data/ml-100k/u.data’) ...: item = relative_path(‘../data/ml-100k/u.item’) ...: model = MovieLens(data, item) Note that the preceding commands assume that you have your data in a folder called data. We can now load the whole dataset into the memory, indexed on the various IDs specified in the dataset. Did you notice the use of the relative_path function? When dealing with fixtures such as these to build models, the data is often included with the code. When you specify a path in Python, such as data/ml-100k/u.data, it looks it up relative to the current working directory where you ran the script. 
To help ease this trouble, you can specify the paths that are relative to the code itself: importos defrelative_path(path): “““ Returns a path relative from this code file “““ dirname = os.path.dirname(os.path.realpath(‘__file__’)) path = os.path.join(dirname, path) returnos.path.normpath(path) Keep in mind that this holds the entire data structure in memory; in the case of the 100k dataset, this will require 54.1 MB, which isn’t too bad for modern machines. However, we should also keep in mind that we’ll generally build recommenders using far more than just 100,000 reviews. This is why we have configured the data structure the way we have—very similar to a database. To grow the system, you will replace the reviews and movies properties with database access functions or properties, which will yield data types expected by our methods. Finding the highest-scoring movies If you’re looking for a good movie, you’ll often want to see the most popular or best rated movies overall. Initially, we’ll take a naïve approach to compute a movie’s aggregate rating by averaging the user reviews for each movie. This technique will also demonstrate how to access the data in our MovieLens class. Getting ready These recipes are sequential in nature. Thus, you should have completed the previous recipes in the article before starting with this one. How to do it… Follow these steps to output numeric scores for all movies in the dataset and compute a top-10 list: Augment the MovieLens class with a new method to get all reviews for a particular movie: In [8]: class MovieLens(object): ...: ...: ...: defreviews_for_movie(self, movieid): ...: “““ ...: Yields the reviews for a given movie ...: “““ ...: for review in self.reviews.values(): ...: if movieid in review: ...: yield review[movieid] ...: Then, add an additional method to compute the top 10 movies reviewed by users: In [9]: import heapq ...: from operator import itemgetter ...: class MovieLens(object): ...: ...: defaverage_reviews(self): ...: “““ ...: Averages the star rating for all movies. Yields a tuple of movieid, ...: the average rating, and the number of reviews. ...: “““ ...: for movieid in self.movies: ...: reviews = list(r[‘rating’] for r in self.reviews_for_movie(movieid)) ...: average = sum(reviews) / float(len(reviews)) ...: yield (movieid, average, len(reviews)) ...: ...: deftop_rated(self, n=10): ...: “““ ...: Yields the n top rated movies ...: “““ ...: return heapq.nlargest(n, self.bayesian_average(), key=itemgetter(1)) ...: Note that the … notation just below class MovieLens(object): signifies that we will be appending the average_reviews method to the existing MovieLens class. 
Now, let’s print the top-rated results: In [10]: for mid, avg, num in model.top_rated(10): ...: title = model.movies[mid][‘title’] ...: print “[%0.3f average rating (%i reviews)] %s” % (avg, num,title) Executing the preceding commands in your REPL should produce the following output: Out [10]: [5.000 average rating (1 reviews)] Entertaining Angels: The Dorothy Day Story (1996) [5.000 average rating (2 reviews)] Santa with Muscles (1996) [5.000 average rating (1 reviews)] Great Day in Harlem, A (1994) [5.000 average rating (1 reviews)] They Made Me a Criminal (1939) [5.000 average rating (1 reviews)] Aiqingwansui (1994) [5.000 average rating (1 reviews)] Someone Else’s America (1995) [5.000 average rating (2 reviews)] Saint of Fort Washington, The (1993) [5.000 average rating (3 reviews)] Prefontaine (1997) [5.000 average rating (3 reviews)] Star Kid (1997) [5.000 average rating (1 reviews)] Marlene Dietrich: Shadow and Light (1996) How it works… The new reviews_for_movie() method that is added to the MovieLens class iterates through our review dictionary values (which are indexed by the userid parameter), checks whether the movieid value has been reviewed by the user, and then presents that review dictionary. We will need such functionality for the next method. With the average_review() method, we have created another generator function that goes through all of our movies and all of their reviews and presents the movie ID, the average rating, and the number of reviews. The top_rated function uses the heapq module to quickly sort the reviews based on the average. The heapq data structure, also known as the priority queue algorithm, is the Python implementation of an abstract data structure with interesting and useful properties. Heaps are binary trees that are built so that every parent node has a value that is either less than or equal to any of its children nodes. Thus, the smallest element is the root of the tree, which can be accessed in constant time, which is a very desirable property. With heapq, Python developers have an efficient means to insert new values in an ordered data structure and also return sorted values. There’s more… Here, we run into our first problem—some of the top-rated movies only have one review (and conversely, so do the worst-rated movies). How do you compare Casablanca, which has a 4.457 average rating (243 reviews), with Santa with Muscles, which has a 5.000 average rating (2 reviews)? We are sure that those two reviewers really liked Santa with Muscles, but the high rating for Casablanca is probably more meaningful because more people liked it. Most recommenders with star ratings will simply output the average rating along with the number of reviewers, allowing the user to determine their quality; however, as data scientists, we can do better in the next recipe. See also The heapq documentation available at https://docs.python.org/2/library/heapq.html We have thus pointed out that companies such as Amazon track purchases and page views to make recommendations, Goodreads and Yelp use 5 star ratings and text reviews, and sites such as Reddit or Stack Overflow use simple up/down voting. You can see that preference can be expressed in the data in different ways, from Boolean flags to voting to ratings. However, these preferences are expressed by attempting to find groups of similarities in preference expressions in which you are leveraging the core assumption of collaborative filtering. 
More formally, we understand that two people, Bob and Alice, share a preference for a specific item or widget. If Alice too has a preference for a different item, say, sprocket, then Bob has a better than random chance of also sharing a preference for a sprocket. We believe that Bob and Alice’s taste similarities can be expressed in an aggregate via a large number of preferences, and by leveraging the collaborative nature of groups, we can filter the world of products. Summary In the recipes we learned various ways for understanding data and finding highest scoring reviews using IPython.  Resources for Article: Further resources on this subject: The Data Science Venn Diagram [article] Python Data Science Up and Running [article] Data Science with R [article]
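One loose end worth flagging: the top_rated method shown earlier calls self.bayesian_average(), which belongs to the next recipe and is not defined in this excerpt. If you want the code above to run as-is, a minimal damped-mean sketch such as the following can stand in for it; the prior values c and m are illustrative choices (c pseudo-reviews at an assumed mean rating of m), not figures taken from the book.

def bayesian_average(self, c=59, m=3):
    """
    Yields (movieid, damped average rating, review count). Each movie's mean
    is pulled towards m by c pseudo-reviews, so titles with a single 5-star
    review no longer crowd out widely reviewed classics in top_rated().
    """
    for movieid in self.movies:
        reviews = list(r['rating'] for r in self.reviews_for_movie(movieid))
        average = ((c * m) + sum(reviews)) / float(c + len(reviews))
        yield (movieid, average, len(reviews))

Add this method to the MovieLens class alongside average_reviews, and top_rated() will work as written.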

Continuous Delivery

Packt
06 Jul 2017
11 min read
In this article, Jonathan Baier, the author of Getting Started with Kubernetes - Second Edition, will show the reader how to integrate their build pipeline and deployments with a Kubernetes cluster. It will cover the concept of using Gulp.js and Jenkins in conjunction with your Kubernetes cluster. This article will discuss the following topics: Integrating with continuous deployment pipeline Using Gulp.js with Kubernetes Integrating Jenkins with Kubernetes Integrating with continuous delivery pipeline Continuous integration and delivery are key components to modern development shops. Speed to market or mean-time-to-revenue are crucial for any company that is creating their own software. We'll see how Kubernetes can help you. CI/CD (short for Continuous Integration / Continuous Delivery) often requires ephemeral build and test servers to be available whenever changes are pushed to the code repository. Docker and Kubernetes are well suited for this task, as it's easy to create containers in a few seconds and just as easy to remove them after builds are run. In addition, if you already have a large portion of infrastructure available on your cluster, it can make sense to utilize the idle capacity for builds and testing. In this article, we will explore two popular tools used in building and deploying software: Gulp.js: This is a simple task runner used to automate the build process using JavaScript and Node.js Jenkins: This is a fully-fledged continuous integration server Gulp.js Gulp.js gives us the framework to do Build as code. Similar to Infrastructure as code, this allows us to programmatically define our build process. We will walk through a short example to demonstrate how you can create a complete workflow from a Docker image build to the final Kubernetes service. Prerequisites For this section of the article, you will need a NodeJS environment installed and ready including the node package manager (npm). If you do not already have these packages installed, you can find instructions for installing them at https://docs.npmjs.com/getting-started/installing-node. You can check whether NodeJS is installed correctly with a node -v command. You'll also need Docker CE and a DockerHub account to push a new image. You can find instructions to install Docker CE at https://docs.docker.com/installation/. You can easily create a DockerHub account at https://hub.docker.com/. After you have your credentials, you can log in with the CLI using $ docker login command. Gulp build example Let's start by creating a project directory named node-gulp: $ mkdir node-gulp $ cd node-gulp Next, we will install the gulp package and check whether it's ready by running the npm command with the version flag, as follows: $ npm install -g gulp You may need to open a new terminal window to make sure that gulp is on your path. Also, make sure to navigate back to your node-gulp directory: $ gulp -v Next, we will install gulp locally in our project folder as well as the gulp-git and gulp-shell plugins, as follows: $ npm install --save-dev gulp $ npm install gulp-git -save $ npm install --save-dev gulp-shell Finally, we need to create a Kubernetes controller and service definition file, as well as a gulpfile.js file, to run all our tasks. Again, these files are available in the book file bundle, if you wish to copy them instead. 
Refer to the following code: apiVersion: v1 kind: ReplicationController metadata: name: node-gulp labels: name: node-gulp spec: replicas: 1 selector: name: node-gulp template: metadata: labels: name: node-gulp spec: containers: - name: node-gulp image: <your username>/node-gulp:latest imagePullPolicy: Always ports: - containerPort: 80 Listing 7-1: node-gulp-controller.yaml As you can see, we have a basic controller. You will need to replace <your username>/node-gulp:latest with your Docker Hub username: apiVersion: v1 kind: Service metadata: name: node-gulp labels: name: node-gulp spec: type: LoadBalancer ports: - name: http protocol: TCP port: 80 selector: name: node-gulp Listing 7-2: node-gulp-service.yaml Next, we have a simple service that selects the pods from our controller and creates an external load balancer for access, as earlier: var gulp = require('gulp'); var git = require('gulp-git'); var shell = require('gulp-shell'); // Clone a remote repo gulp.task('clone', function(){ return git.clone('https://github.com/jonbaierCTP/getting-started-with-kubernetes-se.git', function (err) { if (err) throw err; }); }); // Update codebase gulp.task('pull', function(){ return git.pull('origin', 'master', {cwd: './getting-started-with-kubernetes-se'}, function (err) { if (err) throw err; }); }); //Build Docker Image gulp.task('docker-build', shell.task([ 'docker build -t <your username>/node-gulp ./getting-started-with-kubernetes-se/docker-image-source/container-info/', 'docker push <your username>/node-gulp' ])); //Run New Pod gulp.task('create-kube-pod', shell.task([ 'kubectl create -f node-gulp-controller.yaml', 'kubectl create -f node-gulp-service.yaml' ])); //Update Pod gulp.task('update-kube-pod', shell.task([ 'kubectl delete -f node-gulp-controller.yaml', 'kubectl create -f node-gulp-controller.yaml' ])); Listing 7-3: gulpfile.js Finally, we have the gulpfile.js file. This is where all our build tasks are defined. Again, fill in your Docker Hub username in both the <your username>/node-gulp sections. Looking through the file, first, the clone task downloads our image source code from GitHub. The pull tasks execute a git pull on the cloned repository. Next, the docker-build command builds an image from the container-info subfolder and pushes it to DockerHub. Finally, we have the create-kube-pod and update-kube-pod commands. As you can guess, the create-kube-pod command creates our controller and service for the first time, whereas the update-kube-pod command simply replaces the controller. Let's go ahead and run these commands and see our end-to-end workflow: $ gulp clone $ gulp docker-build The first time through, you can run the create-kube-pod command, as follows: $ gulp create-kube-pod This is all there is to it. If we run a quick kubectl describe command for the node-gulp service, we can get the external IP for our new service. Browse to that IP and you'll see the familiar container-info application running. Note that the host starts with node-gulp, just as we named it in the previously mentioned pod definition: Service launched by Gulp build On subsequent updates, run the pull and update-kube-pod commands, as shown here: $ gulp pull $ gulp docker-build $ gulp update-kube-pod This is a very simple example, but you can begin to see how easy it is to coordinate your build and deployment end to end with a few simple lines of code. Next, we will look at how to use Kubernetes to actually run builds using Jenkins. 
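Before moving on to Jenkins, it is worth noting that the redeploy sequence above (pull, docker-build, update-kube-pod) can be wired into a single task. The sketch below is one way to do it on top of the gulpfile from Listing 7-3; the task name deploy is just a suggestion, the commands mirror the existing tasks (the git -C invocation is the shell equivalent of the pull task), and gulp-shell runs the listed commands one after another, which matters because the image push must finish before the controller is replaced.

//Rebuild and redeploy in one step
gulp.task('deploy', shell.task([
  'git -C ./getting-started-with-kubernetes-se pull origin master',
  'docker build -t <your username>/node-gulp ./getting-started-with-kubernetes-se/docker-image-source/container-info/',
  'docker push <your username>/node-gulp',
  'kubectl delete -f node-gulp-controller.yaml',
  'kubectl create -f node-gulp-controller.yaml'
]));

With this in place, subsequent updates become a single gulp deploy command.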
Kubernetes plugin for Jenkins One way we can use Kubernetes for our CI/CD pipeline is to run our Jenkins build slaves in a containerized environment. Luckily, there is already a plugin, written by Carlos Sanchez, which allows you to run Jenkins slaves in Kubernetes' pods. Prerequisites You'll need a Jenkins server handy for this next example. If you don't have one you can use, there is a Docker image available at https://hub.docker.com/_/jenkins/. Running it from the Docker CLI is as simple as this: docker run --name myjenkins -p 8080:8080 -v /var/jenkins_home jenkins Installing plugins Log in to your Jenkins server, and from your home dashboard, click on Manage Jenkins. Then, select Manage Plugins from the list. A note for those installing a new Jenkins server: When you first log in to the Jenkins server, it asks you to install plugins. Choose the default ones or no plugins will be installed: Jenkins main dashboard The credentials plugin is required, but should be installed by default. We can check the Installed tab if in doubt, as shown in the following screenshot: Jenkins installed plugins Next, we can click on the Available tab. The Kubernetes plugin should be located under Cluster Management and Distributed Build or Misc (cloud). There are many plugins, so you can alternatively search for Kubernetes on the page. Check the box for Kubernetes Plugin and click on Install without restart. This will install theKubernetes Plugin and the Durable Task Plugin: Plugin installation If you wish to install a nonstandard version or just like to tinker, you can optionally download the plugins. The latest Kubernetes and Durable Task plugins can be found here:        Kubernetes plugin: https://wiki.jenkins-ci.org/display/JENKINS/Kubernetes+Plugin        Durable Task plugin: https://wiki.jenkins-ci.org/display/JENKINS/Durable+Task+PluginNext, we can click on the Advanced tab and scroll down to Upload Plugin. Navigate to the durable-task.hpi file and click on Upload. You should see a screen that shows an installing progress bar. After a minute or two, it will update to Success.Finally, install the main Kubernetes plugin. On the left-hand side, click on Manage Plugins and then the Advanced tab once again. This time, upload the kubernetes.hpi file and click on Upload. After a few minutes, the installation should be complete. Configuring the Kubernetes plugin Click on Back to Dashboard or the Jenkins link in the top-left corner. From the main dashboard page, click on the Credentials link. Choose a domain from the list; in my case, I just used the default Global credentials domain. Click on Add Credentials: Add credentials screen Leave Kind as Username with password and Scope as Global. Add your Kubernetes admin credentials. Remember that you can find these by running the config command: $ kubectl config view You can leave ID blank, give it a sensible description, and click on the OK button. Now that we have our credentials saved, we can add our Kubernetes server. Click on the Jenkins link in the top-left corner and then Manage Jenkins. From there, select Configure System and scroll all the way down to the Cloud section. Select Kubernetes from the Add a new cloud dropdown and a Kubernetes section will appear, as follows: New Kubernetes cloud settings You'll need to specify the URL for your master in the form of https://<Master IP>/. Next, choose the credentials we added from the drop-down list. 
Since Kubernetes use a self-signed certificate by default, you'll also need to check the Disable https certificate check checkbox. Click on Test Connection and if all goes well, you should see Connection successful appearing next to the button. If you are using an older version of the plugin, you may not see the Disable https certificate check checkbox. If this is the case, you will need to install the self-signed certificate directly on the Jenkins Master. Finally, we will add a pod template by choosing Kubernetes Pod Template from the Add Pod Template dropdown next to Images. This will create another new section. Use jenkins-slave for the Name and Labels section. Click on Add next to Containers and again use jenkins-slave for the Name. Use csanchez/jenkins-slave for the Docker Image and leave /home/jenkins for the Working Directory. Labels can be used later on in the build settings to force the build to use the Kubernetes cluster: Kubernetes cluster addition Here is the Pod Template that expands below the cluster addition: Kubernetes pod template Click on Save and you are all set. Now, new builds created in Jenkins can use the slaves in the Kubernetes pod we just created. Here is another note about firewalls. The Jenkins Master will need to be reachable by all the machines in your Kubernetes cluster, as the pod could land anywhere. You can find out your port settings in Jenkins under Manage Jenkins and Configure Global Security. Bonus fun Fabric8 bills itself as an integration platform. It includes a variety of logging, monitoring, and continuous delivery tools. It also has a nice console, an API registry, and a 3D game that lets you shoot at your pods. It's a very cool project, and it actually runs on Kubernetes. Refer to http://fabric8.io/. It's an easy single command to set up on your Kubernetes cluster, so refer to http://fabric8.io/guide/getStarted/gke.html. Summary We looked at two continuous integration tools that can be used with Kubernetes. We did a brief walk-through of deploying the Gulp.js task on our cluster. We also looked at a new plugin used to integrate Jenkins build slaves into your Kubernetes cluster. You should now have a better sense of how Kubernetes can integrate with your own CI/CD pipeline.
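To close the loop on the label note from the plugin configuration: once the Kubernetes cloud and the jenkins-slave pod template are in place, a job only has to request that label to be scheduled on the cluster. The snippet below is a minimal declarative pipeline sketch and assumes the Pipeline plugin is installed; with a freestyle job, you would instead tick Restrict where this project can be run and enter jenkins-slave.

pipeline {
    // Run this build on an executor carrying the jenkins-slave label,
    // that is, inside the Kubernetes pod template configured above.
    agent { label 'jenkins-slave' }

    stages {
        stage('Build') {
            steps {
                sh 'echo building inside a Kubernetes pod'
            }
        }
    }
}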

Exposure to RxJava

Packt
06 Jul 2017
10 min read
In this article by Thomas Nield, the author of the book Learning RxJava, we will cover a quick exposure to RxJava, which is a Java VM implementation of ReactiveX (Reactive Extensions): a library for composing asynchronous and event-based programs by using observable sequences. (For more resources related to this topic, see here.) It is assumed you are fairly comfortable with Java and know how to use classes, interfaces, methods, properties, variables, static/nonstatic scopes, and collections. If you have not done concurrency or multithreading, that is okay. RxJava makes these advanced topics much more accessible. Have your favorite Java development environment ready, whether it is Intellij IDEA, Eclipse, NetBeans, or any other environment of your choosing. Recommended that you have a build automation system as well such as Gradle or Maven, which we will walk through shortly. History of ReactiveX and RxJava As developers, we tend to train ourselves to think in counter-intuitive ways. Modeling our world with code has never been short of challenges. It was not long ago that object-oriented programming was seen as the silver bullet to solve this problem. Making blueprints of what we interact with in real life was a revolutionary idea, and this core concept of classes and objects still impacts how we code today. However, business and user demands continued to grow in complexity. As 2010 approached, it became clear that object-oriented programming only solved part of the problem. Classes and objects do a great job representing an entity with properties and methods, but they become messy when they need to interact with each other in increasingly complex (and often unplanned) ways. Decoupling patterns and paradigms emerged, but this yielded an unwanted side effect of growing amounts of boilerplate code. In response to these problems, functional programming began to make a comeback not to replace object-oriented programming but rather complement it and fill this void. Reactive programming, a functional event-driven programming approach, began to receive special attention.A couple of reactive frameworks emerged ultimately, including Akka and Sodium. But at Microsoft, a computer scientist named Erik Meijer created a reactive programming framework for .NET called Reactive Extensions. In a matter of years, Reactive Extensions (also called ReactiveX or Rx) was ported to several languages and platforms, including JavaScript, Python, C++, Swift, and Java, of course. ReactiveX quickly emerged as a cross-language standard to bring reactive programming into the industry. RxJava, the ReactiveX port for Java, was created in large part by Ben Christensen from Netflix and David Karnok. RxJava 1.0 was released in November 2014, followed by RxJava 2.0 in November 2016. RxJava is the backbone to other ReactiveX JVM ports, such as RxScala, RxKotlin, and RxGroovy. It has become a core technology for Android development and has also found its way into Java backend development. Many RxJava adapter libraries, such as RxAndroid , RxJava-JDBC , RxNetty , and RxJavaFX adapted several Java frameworks to become reactive and work with RxJava out-of-the-box.This all shows that RxJava is more than a library. It is part of a greater ReactiveX ecosystem that represents an entire approach to programming. The fundamental idea of ReactiveX is that events are data and data are events. This is a powerful concept that we will explore, but first, let's step back and look at the world through the reactive lens. 
Thinking reactively Suspend everything you know about Java (and programming in general) for a moment, and let's make some observations about our world. These may sound like obvious statements, but as developers, we can easily overlook them. Bring your attention to the fact that everything is in motion. Traffic, weather, people, conversations, financial transactions, and so on are all moving. Technically, even something stationary as a rock is in motion due to the earth's rotation and orbit. When you consider the possibility that everything can be modeled as in motion, you may find it a bit overwhelming as a developer. Another observation to note is that these different events are happening concurrently. Multiple activities are happening at the same time. Sometimes, they act independently, but other times, they can converge at some point to interact. For instance, a car can drive with no impact on a person jogging. They are two separate streams of events. However, they may converge at some point and the car will stop when it encounters the jogger. If this is how our world works, why do we not model our code this way?. Why do we not model code as multiple concurrent streams of events or data happening at the same time? It is not uncommon for developers to spend more time managing the states of objects and doing it in an imperative and sequential manner. You may structure your code to execute Process 1, Process 2, and then Process 3, which depends on Process 1 and Process 2. Why not kick-off Process 1 and Process 2 simultaneously, and then the completion of these two events immediately kicks-off Process 3? Of course, you can use callbacks and Java concurrency tools, but RxJava makes this much easier and safer to express. Let's make one last observation. A book or music CD is static. A book is an unchanging sequence of words and a CD is a collection of tracks. There is nothing dynamic about them. However, when we read a book, we are reading each word one at a time. Those words are effectively put in motion as a stream being consumed by our eyes. It is no different with a music CD track, where each track is put in motion as sound waves and your ears are consuming each track. Static items can, in fact, be put in motion too. This is an abstract but powerful idea because we made each of these static items a series of events. When we level the playing field between data and events by treating them both the same, we unleash the power of functional programming and unlock abilities you previously might have thought impractical. The fundamental idea behind reactive programming is that events are data and data are events. This may seem abstract, but it really does not take long to grasp when you consider our real-world examples. The runner and car both have properties and states, but they are also in motion. The book and CD are put in motion when they are consumed. Merging the event and data to become one allows the code to feel organic and representative of the world we are modeling. Why should I learn RxJava?  ReactiveX and RxJava paints a broad stroke against many problems programmers face daily, allowing you to express business logic and spend less time engineering code. Have you ever struggled with concurrency, event handling, obsolete data states, and exception recovery? What about making your code more maintainable, reusable, and evolvable so it can keep up with your business? 
It might be presumptuous to call reactive programming a silver bullet to these problems, but it certainly is a progressive leap in addressing them. There is also growing user demand to make applications real time and responsive. Reactive programming allows you to quickly analyze and work with live data sources such as Twitter feeds or stock prices. It can also cancel and redirect work, scale with concurrency, and cope with rapidly emitting data. Composing events and data as streams that can be mixed, merged, filtered, split, and transformed opens up radically effective ways to compose and evolve code. In summary, reactive programming makes many hard tasks easy, enabling you to add value in ways you might have thought impractical earlier. If you have a process written reactively and you discover that you need to run part of it on a different thread, you can implement this change in a matter of seconds. If you find network connectivity issues crashing your application intermittently, you can gracefully use reactive recovery strategies that wait and try again. If you need to inject an operation in the middle of your process, it is as simple as inserting a new operator. Reactive programming is broken up into modular chain links that can be added or removed, which can help overcome all the aforementioned problems quickly. In essence, RxJava allows applications to be tactical and evolvable while maintaining stability in production. A quick exposure to RxJava  Before we dive deep into the reactive world of RxJava, here is a quick exposure to get your feet wet first. In ReactiveX, the core type you will work with is the Observable. We will be learning more about the Observable. But essentially, an Observable pushes things. A given Observable<T>pushes things of type T through a series of operators until it arrives at an Observer that consumes the items. For instance, create a new Launcher.java file in your project and put in the following code: import io.reactivex.Observable; public class Launcher { public static void main(String[] args) { Observable<String> myStrings = Observable.just("Alpha", "Beta", "Gamma", "Delta", "Epsilon"); } } In our main() method,  we have an Observable<String>that will push five string objects. An Observable can push data or events from virtually any source, whether it is a database query or live Twitter feeds. In this case, we are quickly creating an Observable using Observable.just(), which will emit a fixed set of items. However, running this main() method is not going to do anything other than declare Observable<String>. To make this Observable actually push these five strings (which are called emissions), we need an Observer to subscribe to it and receive the items. We can quickly create and connect an Observer by passing a lambda expression that specifies what to do with each string it receives: import io.reactivex.Observable; public class Launcher { public static void main(String[] args) { Observable<String> myStrings = Observable.just("Alpha", "Beta", "Gamma", "Delta", "Epsilon"); myStrings.subscribe(s -> System.out.println(s)); } }  When we run this code, we should get the following output: Alpha Beta Gamma Delta Epsilon What happened here is that our Observable<String> pushed each string object one at a time to our Observer, which we shorthanded using the lambda expression s -> System.out.println(s). We pass each string through the parameter s (which I arbitrarily named) and instructed it to print each one. 
Lambdas are essentially mini functions that allow us to quickly pass instructions on what action to take with each incoming item. Everything to the left of the arrow -> are arguments (which in this case is a string we named s), and everything to the right is the action (which is System.out.println(s)). Summary So in this article, we learned how to look at the world in a reactive way. As a developer, you may have to retrain yourself from a traditional imperative mindset and develop a reactive one. Especially if you have done imperative, object-oriented programming for a long time, this can be challenging. But the return on investment will be significant as your applications will become more maintainable, scalable, and evolvable. You will also have faster turn around and more legible code. We also got a brief introduction to reactive code and how Observable work through push-based iteration. You will hopefully find reactive programming intuitive and easy to reason with. I hope you find that RxJava not only makes you more productive, but also helps you take on tasks you hesitated to do earlier. So let's get started! Resources for Article: Further resources on this subject: Understanding the Basics of RxJava [article] Filtering a sequence [article] An Introduction to Reactive Programming [article]
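To take the quick exposure one small step further, here is a sketch showing how operators sit between the Observable and the Observer. It uses only map() and filter(), which are standard operators on RxJava 2's Observable; the particular transformation (keeping names of five or more characters and printing their lengths) is arbitrary.

import io.reactivex.Observable;

public class LauncherWithOperators {
    public static void main(String[] args) {
        Observable<String> myStrings =
            Observable.just("Alpha", "Beta", "Gamma", "Delta", "Epsilon");

        // Each emission is pushed through map() and filter() on its way
        // to the Observer at the end of the chain.
        myStrings
            .map(s -> s.length())            // String -> Integer
            .filter(length -> length >= 5)   // drops "Beta"
            .subscribe(length -> System.out.println("Received: " + length));
    }
}

Running it prints 5, 5, 5, and 7; Beta never reaches the Observer because filter() stops it.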

Command-Line Tools

Packt
06 Jul 2017
9 min read
In this article by Aaron Torres, author of the book, Go Cookbook, we will cover the following recipes: Using command-line arguments Working with Unix pipes An ANSI coloring application (For more resources related to this topic, see here.) Using command-line arguments This article will expand on other uses for these arguments by constructing a command that supports nested subcommands. This will demonstrate Flagsets and also using positional arguments passed into your application. This recipe requires a main function to run. There are a number of third-party packages for dealing with complex nested arguments and flags, but we'll again investigate doing so using only the standard library. Getting ready You need to perform the following steps for the installation: Download and install Go on your operating system at https://golang.org/doc/install and configure your GOPATH. Open a terminal/console application. Navigate to your GOPATH/src and create a project directory, for example, $GOPATH/src/github.com/yourusername/customrepo. All code will be run and modified from this directory. Optionally, install the latest tested version of the code using the go get github.com/agtorre/go-cookbook/ command. How to do it... From your terminal/console application, create and navigate to the chapter2/cmdargs directory. Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/cmdargs or use this as an exercise to write some of your own. Create a file called cmdargs.go with the following content: package main import ( "flag" "fmt" "os" ) const version = "1.0.0" const usage = `Usage: %s [command] Commands: Greet Version ` const greetUsage = `Usage: %s greet name [flag] Positional Arguments: name the name to greet Flags: ` // MenuConf holds all the levels // for a nested cmd line argument type MenuConf struct { Goodbye bool } // SetupMenu initializes the base flags func (m *MenuConf) SetupMenu() *flag.FlagSet { menu := flag.NewFlagSet("menu", flag.ExitOnError) menu.Usage = func() { fmt.Printf(usage, os.Args[0]) menu.PrintDefaults() } return menu } // GetSubMenu return a flag set for a submenu func (m *MenuConf) GetSubMenu() *flag.FlagSet { submenu := flag.NewFlagSet("submenu", flag.ExitOnError) submenu.BoolVar(&m.Goodbye, "goodbye", false, "Say goodbye instead of hello") submenu.Usage = func() { fmt.Printf(greetUsage, os.Args[0]) submenu.PrintDefaults() } return submenu } // Greet will be invoked by the greet command func (m *MenuConf) Greet(name string) { if m.Goodbye { fmt.Println("Goodbye " + name + "!") } else { fmt.Println("Hello " + name + "!") } } // Version prints the current version that is // stored as a const func (m *MenuConf) Version() { fmt.Println("Version: " + version) } Create a file called main.go with the following content: package main import ( "fmt" "os" "strings" ) func main() { c := MenuConf{} menu := c.SetupMenu() menu.Parse(os.Args[1:]) // we use arguments to switch between commands // flags are also an argument if len(os.Args) > 1 { // we don't care about case switch strings.ToLower(os.Args[1]) { case "version": c.Version() case "greet": f := c.GetSubMenu() if len(os.Args) < 3 { f.Usage() return } if len(os.Args) > 3 { if.Parse(os.Args[3:]) } c.Greet(os.Args[2]) default: fmt.Println("Invalid command") menu.Usage() return } } else { menu.Usage() return } } Run the go build command. 
Run the following commands and try a few other combinations of arguments: $./cmdargs -h Usage: ./cmdargs [command] Commands: Greet Version $./cmdargs version Version: 1.0.0 $./cmdargs greet Usage: ./cmdargs greet name [flag] Positional Arguments: name the name to greet Flags: -goodbye Say goodbye instead of hello $./cmdargs greet reader Hello reader! $./cmdargs greet reader -goodbye Goodbye reader! If you copied or wrote your own tests go up one directory and run go test, and ensure all tests pass. How it works... Flagsets can be used to set up independent lists of expected arguments, usage strings, and more. The developer is required to do validation on a number of arguments, parsing in the right subset of arguments to commands, and defining usage strings. This can be error prone and requires a lot of iteration to get it completely correct. The flag package makes parsing arguments much easier and includes convenience methods to get the number of flags, arguments, and more. This recipe demonstrates basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed. Working with Unix pipes Unix pipes are useful when passing the output of one program to the input of another. Consider the following example: $ echo "test case" | wc -l 1 In a Go application, the left-hand side of the pipe can be read in using os.Stdin and acts like a file descriptor. To demonstrate this, this recipe will take an input on the left-hand side of a pipe and return a list of words and their number of occurrences. These words will be tokenized on white space. Getting ready Refer to the Getting Ready section of the Using command-line arguments recipe. How to do it... From your terminal/console application, create a new directory, chapter2/pipes. Navigate to that directory and copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/pipes or use this as an exercise to write some of your own. Create a file called pipes.go with the following content: package main import ( "bufio" "fmt" "os" ) // WordCount takes a file and returns a map // with each word as a key and it's number of // appearances as a value func WordCount(f *os.File) map[string]int { result := make(map[string]int) // make a scanner to work on the file // io.Reader interface scanner := bufio.NewScanner(f) scanner.Split(bufio.ScanWords) for scanner.Scan() { result[scanner.Text()]++ } if err := scanner.Err(); err != nil { fmt.Fprintln(os.Stderr, "reading input:", err) } return result } func main() { fmt.Printf("string: number_of_occurrencesnn") for key, value := range WordCount(os.Stdin) { fmt.Printf("%s: %dn", key, value) } }   Run echo "some string" | go run pipes.go. You may also run: go build echo "some string" | ./pipes You should see the following output: $ echo "test case" | go run pipes.go string: number_of_occurrences test: 1 case: 1 $ echo "test case test" | go run pipes.go string: number_of_occurrences test: 2 case: 1 If you copied or wrote your own tests, go up one directory and run go test, and ensure that all tests pass. How it works... Working with pipes in go is pretty simple, especially if you're familiar with working with files. This recipe uses a scanner to tokenize the io.Reader interface of the os.Stdin file object. You can see how you must check for errors after completing all of the reads. 
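Because WordCount takes an *os.File rather than an io.Reader, testing it means writing a temporary file first. The test below is a sketch of how that might look (it is not part of the book's test suite); if you find yourself writing this kind of scaffolding often, it is a hint that accepting an io.Reader would make the function easier to test.

package main

import (
    "io/ioutil"
    "os"
    "testing"
)

func TestWordCount(t *testing.T) {
    // Write the input to a temporary file so we have an *os.File to pass in.
    f, err := ioutil.TempFile("", "pipes")
    if err != nil {
        t.Fatal(err)
    }
    defer os.Remove(f.Name())

    if _, err := f.WriteString("test case test"); err != nil {
        t.Fatal(err)
    }
    // Rewind so WordCount reads from the beginning of the file.
    if _, err := f.Seek(0, 0); err != nil {
        t.Fatal(err)
    }

    result := WordCount(f)
    if result["test"] != 2 || result["case"] != 1 {
        t.Errorf("unexpected counts: %v", result)
    }
}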
An ANSI coloring application Coloring an ANSI terminal application is handled by a variety of code before and after a section of text that you want colored. This recipe will explore a basic coloring mechanism to color the text red or keep it plain. For a more complete application, take a look at https://github.com/agtorre/gocolorize, which supports many more colors and text types implements the fmt.Formatter interface for ease of printing. Getting ready Refer to the Getting Ready section of the Using command line arguments recipe. How to do it... From your terminal/console application, create and navigate to the chapter2/ansicolor directory. Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/ansicolor or use this as an exercise to write some of your own. Create a file called color.go with the following content: package ansicolor import "fmt" //Color of text type Color int const ( // ColorNone is default ColorNone = iota // Red colored text Red // Green colored text Green // Yellow colored text Yellow // Blue colored text Blue // Magenta colored text Magenta // Cyan colored text Cyan // White colored text White // Black colored text Black Color = -1 ) // ColorText holds a string and its color type ColorText struct { TextColor Color Text string } func (r *ColorText) String() string { if r.TextColor == ColorNone { return r.Text } value := 30 if r.TextColor != Black { value += int(r.TextColor) } return fmt.Sprintf("33[0;%dm%s33[0m", value, r.Text) } Create a new directory named example. Navigate to example and then create a file named main.go with the following content. Ensure that you modify the ansicolor import to use the path you set up in step 1: package main import ( "fmt" "github.com/agtorre/go-cookbook/chapter2/ansicolor" ) func main() { r := ansicolor.ColorText{ansicolor.Red, "I'm red!"} fmt.Println(r.String()) r.TextColor = ansicolor.Green r.Text = "Now I'm green!" fmt.Println(r.String()) r.TextColor = ansicolor.ColorNone r.Text = "Back to normal..." fmt.Println(r.String()) } Run go run main.go. Alternatively, you may also run the following: go build ./example You should see the following with the text colored if your terminal supports the ANSI coloring format: $ go run main.go I'm red! Now I'm green! Back to normal... If you copied or wrote your own tests, go up one directory and run go test, and ensure that all the tests pass. How it works... This application makes use of a struct keyword to maintain state of the colored text. In this case, it stores the color of the text and the value of the text. The final string is rendered when you call the String() method, which will either return colored text or plain text depending on the values stored in the struct. By default, the text will be plain. Summary In this article, we demonstrated basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed. We saw how to work with Unix pipes and explored a basic coloring mechanism to color text red or keep it plain. Resources for Article: Further resources on this subject: Building a Command-line Tool [article] A Command-line Companion Called Artisan [article] Scaffolding with the command-line tool [article]
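If constructing a ColorText value for every message feels heavy, a small convenience function keeps call sites short. Colorize below is not part of the recipe's package; it is an illustrative addition that builds on the ColorText type already defined in color.go.

package ansicolor

// Colorize wraps text in the requested color and returns the final
// ANSI-escaped string, delegating to ColorText.String().
func Colorize(c Color, text string) string {
    ct := ColorText{TextColor: c, Text: text}
    return ct.String()
}

With it in place, the example's first line becomes fmt.Println(ansicolor.Colorize(ansicolor.Red, "I'm red!")).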

Chart Model and Draggable and Droppable Directives

Packt
06 Jul 2017
9 min read
In this article by Sudheer Jonna and Oleg Varaksin, the author of the book Learning Angular UI Development with PrimeNG, we will see how to work with the chart model and learn about draggable and droppable directives. (For more resources related to this topic, see here.) Working with the chart model The chart component provides a visual representation of data using chart on a web page. PrimeNG chart components are based on charts.js 2.x library (as a dependency), which is a HTML5 open source library. The chart model is based on UIChart class name and it can be represented with element name as p-chart. The chart components will work efficiently by attaching the chart model file (chart.js) to the project root folder entry point. For example, in this case it would be index.html file. It can be configured as either CDN resource, local resource, or CLI configuration. CDN resource configuration: <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/2.5.0/Chart.bu ndle.min.js"></script> Angular CLI configuration: "scripts": [ "../node_modules/chart.js/dist/Chart.js", //..others ] More about the chart configuration and options will be available in the official documentation of the chartJS library (http://www.chartjs.org/). Chart types The chart type is defined through the type property. It supports six types of charts with an options such as pie, bar, line, doughnut, polarArea, and radar. Each type has it's own format of data and it can be supplied through the data property. For example, in doughnut chart, the type should refer to doughnut and the data property should bind to the data options as shown here: <p-chart type="doughnut" [data]="doughnutdata"></p-chart> The component class has to define data the options with labels and datasets as follows: this.doughnutdata = { labels: ['PrimeNG', 'PrimeUI', 'PrimeReact'], datasets: [ { data: [3000, 1000, 2000], backgroundColor: [ "#6544a9", "#51cc00", "#5d4361" ], hoverBackgroundColor: [ "#6544a9", "#51cc00", "#5d4361" ] } ] }; Along with labels and data options, other properties related to skinning can be applied too. The legends are closable by default (that is, if you want to visualize only particular data variants then it is possible by collapsing legends which are not required). The collapsed legend is represented with a strike line. The respective data component will be disappeared after click operation on legend. Customization Each series is customized on a dataset basis but you can customize the general or common options via the options attribute. For example, the line chart which customize the default options would be as follows: <p-chart type="line" [data]="linedata" [options]="options"></p-chart> The component class needs to define chart options with customized title and legend properties as follows: this.options = { title: { display: true, text: 'PrimeNG vs PrimeUI', fontSize: 16 }, legend: { position: 'bottom' } }; As per the preceding example, the title option is customized with a dynamic title, font size, and conditional display of the title. Where as legend attribute is used to place the legend in top, left, bottom, and right positions. The default legend position is top. In this example, the legend position is bottom. 
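The line chart snippets above bind [data]="linedata" without showing how that property is populated. As a rough sketch only, with a component name, labels, and values that are illustrative rather than taken from the book (and assuming PrimeNG's ChartModule is already imported in the application module), the component class could define linedata in the Chart.js 2.x line format next to the customized options:

import { Component, OnInit } from '@angular/core';

@Component({
  selector: 'app-line-chart-demo',
  template: `<p-chart type="line" [data]="linedata" [options]="options"></p-chart>`
})
export class LineChartDemoComponent implements OnInit {
  linedata: any;
  options: any;

  ngOnInit() {
    // Illustrative labels and values in the Chart.js line-chart format
    this.linedata = {
      labels: ['2013', '2014', '2015', '2016', '2017'],
      datasets: [
        { label: 'PrimeNG', data: [10, 25, 36, 48, 65], fill: false, borderColor: '#6544a9' },
        { label: 'PrimeUI', data: [5, 12, 20, 22, 24], fill: false, borderColor: '#51cc00' }
      ]
    };

    // Same shape as the options object described above
    this.options = {
      title: { display: true, text: 'PrimeNG vs PrimeUI', fontSize: 16 },
      legend: { position: 'bottom' }
    };
  }
}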
The line chart with preceding customized options would results as a snapshot shown here: The Chart API also supports the couple of utility methods as shown here: refresh Redraws the graph with new data reinit Destroys the existing graph and then creates it again generateLegend Returns an HTML string of a legend for that chart Events The chart component provides a click event on data sets to process the select data using onDataSelect event callback. Let us take a line chart example with onDataSelect event callback by passing an event object as follows: <p-chart type="line" [data]="linedata" (onDataSelect)="selectData($event)"></p-chart> In the component class, an event callback is used to display selected data information in a message format as shown: selectData(event: any) { this.msgs = []; this.msgs.push({ severity: 'info', summary: 'Data Selected', 'detail': this.linedata.datasets[event.element._datasetIndex] .data[event.element._index] }); } In the preceding event callback (onDataSelect), we used an index of the dataset to display information. There are also many other options from an event object: event.element = Selected element event.dataset = Selected dataset event.element._datasetIndex = Index of the dataset in data event.element._index = Index of the data in dataset Learning Draggable and Droppable directives Drag and drop is an action, which means grabbing an object and dragging it to a different location. The components capable of being dragged and dropped enrich the web and make a solid base for modern UI patterns. The drag and drop utilities in PrimeNG allow us to create draggable and droppable user interfaces efficiently. They make it abstract for the developers to deal with the implementation details at the browser level. In this section, we will learn about pDraggable and pDroppable directives. We will introduce a DataGrid component containing some imaginary documents and make these documents draggable in order to drop them onto a recycle bin. The recycle bin is implemented as DataTable component which shows properties of dropped documents. For the purpose of better understanding the developed code, a picture comes first: This picture shows what happens after dragging and dropping three documents. The complete demo application with instructions is available on GitHub at https://github.com/ova2/angular-development-with-primeng/tree/master/chapter9/dragdrop. Draggable pDraggable is attached to an element to add a drag behavior. The value of the pDraggable attribute is required--it defines the scope to match draggables with droppables. By default, the whole element is draggable. We can restrict the draggable area by applying the dragHandle attribute. The value of dragHandle can be any CSS selector. In the DataGrid with available documents, we only made the panel's header draggable: <p-dataGrid [value]="availableDocs"> <p-header> Available Documents </p-header> <ng-template let-doc pTemplate="item"> <div class="ui-g-12 ui-md-4" pDraggable="docs" dragHandle=".uipanel- titlebar" dragEffect="move" (onDragStart)="dragStart($event, doc)" (onDragEnd)="dragEnd($event)"> <p-panel [header]="doc.title" [style]="{'text-align':'center'}"> <img src="/assets/data/images/docs/{{doc.extension}}.png"> </p-panel> </div> </ng-template> </p-dataGrid> The draggable element can fire three events when dragging process begins, proceeds, and ends. These are onDragStart, onDrag, and onDragEnd respectively. 
In the component class, we buffer the dragged document at the beginning and reset it at the end of the dragging process. This task is done in two callbacks: dragStart and dragEnd. class DragDropComponent { availableDocs: Document[]; deletedDocs: Document[]; draggedDoc: Document; constructor(private docService: DocumentService) { } ngOnInit() { this.deletedDocs = []; this.docService.getDocuments().subscribe((docs: Document[]) => this.availableDocs = docs); } dragStart(event: any, doc: Document) { this.draggedDoc = doc; } dragEnd(event: any) { this.draggedDoc = null; } ... } In the shown code, we used the Document interface with the following properties: interface Document { id: string; title: string; size: number; creator: string; creationDate: Date; extension: string; } In the demo application, we set the cursor to move when the mouse is moved over any panel's header. This trick provides a better visual feedback for draggable area: body .ui-panel .ui-panel-titlebar { cursor: move; } We can also set the dragEffect attribute to specifies the effect that is allowed for a drag operation. Possible values are none, copy, move, link, copyMove, copyLink, linkMove, and all. Refer the official documentation to read more details at https://developer.mozilla.org/en-US/docs/Web/API/DataTransfer/effectAllowed. Droppable pDroppable is attached to an element to add a drop behavior. The value of the pDroppable attribute should have the same scope as pDraggable. Droppable scope can also be an array to accept multiple droppables. The droppable element can fire four events. Event name Description onDragEnter Invoked when a draggable element enters the drop area onDragOver Invoked when a draggable element is being dragged over the drop area onDrop Invoked when a draggable is dropped onto the drop area onDragLeave Invoked when a draggable element leaves the drop area In the demo application, the whole code of the droppable area looks as follows: <div pDroppable="docs" (onDrop)="drop($event)" [ngClass]="{'dragged-doc': draggedDoc}"> <p-dataTable [value]="deletedDocs"> <p-header>Recycle Bin</p-header> <p-column field="title" header="Title"></p-column> <p-column field="size" header="Size (bytes)"></p-column> <p-column field="creator" header="Creator"></p-column> <p-column field="creationDate" header="Creation Date"> <ng-template let-col let-doc="rowData" pTemplate="body"> {{doc[col.field].toLocaleDateString()}} </ng-template> </p-column> </p-dataTable> </div> Whenever a document is being dragged and dropped into the recycle bin, the dropped document is removed from the list of all available documents and added to the list of deleted documents. This happens in the onDrop callback: drop(event: any) { if (this.draggedDoc) { // add draggable element to the deleted documents list this.deletedDocs = [...this.deletedDocs, this.draggedDoc]; // remove draggable element from the available documents list this.availableDocs = this.availableDocs.filter((e: Document) => e.id !== this.draggedDoc.id); this.draggedDoc = null; } } Both available and deleted documents are updated by creating new arrays instead of manipulating existing arrays. This is necessary in data iteration components to force Angular run change detection. Manipulating existing arrays would not run change detection and the UI would not be updated. The Recycle Bin area gets a red border while dragging any panel with document. We have achieved this highlighting by setting ngClass as follows: [ngClass]="{'dragged-doc': draggedDoc}". 
The style class dragged-doc is enabled when the draggedDoc object is set. The style class is defined as follows: .dragged-doc { border: solid 2px red; } Summary Initially we started with chart components. At first we started with chart Model API and then will learn how to create charts programmatically using various chart types such as pie, bar, line, doughnut, polar and radar charts. We also learned features of Draggable and Droppable. Resources for Article: Further resources on this subject: Building Components Using Angular [article] Get Familiar with Angular [article] Writing a Blog Application with Node.js and AngularJS [article]

IIS 10 Fundamentals

Packt
06 Jul 2017
12 min read
In this article by Ashraf Khan, the author of the book Microsoft IIS 10 Cookbook, helps us to understand the following topics: Understanding IIS 10 Basic requirements of IIS 10 Understanding application pools on IIS 10 Installation of lower framework version Configuration of application pool on IIS 10 (For more resources related to this topic, see here.) Understanding IIS 10 In this recipe, we will understand how to work with IIS 10's new features. We will have an overview of the following new features added to IIS 10: HTTP/2 HTTP/2 requests are now faster than ever. This feature is active by default with IIS 10 on Windows Server 2016 and Windows 10. IIS 10 on Nano Server IIS 10 is easy and quick to install on Nano Server. You can manage IIS 10 remotely with PowerShell or the IIS Manager console. Nano Server is much faster and consumes less memory and disk space that the full-fledged Windows Server. Rebooting is also faster so that you can manage time effectively. Wildcard host headers IIS 10 support the subdomain feature for your parent domain name. This will really help you manage more subdomains with the same primary domain name. PowerShell 5 cmdlets IIS 10 adds a new, simplified PowerShell module for quick and easy management. You can use PowerShell to access server-management features remotely. It also supports existing WebAdministration cmdlets. FTP FTP is a simple protocol for transferring files. This system can transfer files inside your company LAN and WAN using the default port, 21. IIS 10 includes an FTP server that is easy to configure and manage. FTPS FTPS is the same as FTP, with the only difference that it is secure. FTPS transfers data with SSL. We are going to use HTTPS port 443. For this, we need to create and install an SSL certificate that encrypts and decrypts data securely. SSL ensures that all data passed between web server and browser remains private and consistent during upload and download over private or public networks. Multi-web hosting IIS 10 allows you to create multiple websites and multiple applications on the same server. You can easily manage and create a new virtual directory located in the default location or a custom location. Virtual directories IIS 10 makes it easy to manage and create the virtual directories you require. Understanding application pools on IIS 10 In this recipe, we are going to understand application pools. We can simply say that application pool is the heart of IIS 10. Application pools are logical groupings of web applications that will execute in a common process, thereby allowing greater granularity of which programs are clustered together in a single process. For example, if you required every Web Application to execute in a separate process, you simply go and create an Application Pool for each application of different frameworks versions.  Let's say that we have more than one version of website, one which supports framework 2.0 and another one supporting framework 4.0 or different application like PHP or WordPress. All these website process are managed through application pool. Getting ready To step through this recipe , you will need a running IIS 10. You should also be having Administrative privilege. No other prerequisites are required. How to do it... Open the Server Manager on Windows Server 2016. Click on Tools menu and open the IIS Manager. Expand the IIS server (WIN2016IIS) this is the localhost server name WIN2016IIS. We get the listed application pools and sites. 
In Application Pools, you will get IIS 10 DefaultsAppPool as shown in above figure, also you get Actions panel in right side of the screen where you may add application pools. Click on DefaultAppPool, then you will get the Actions panel of DefaultAppPool. Here you will get an option for Application Pool Tasks highlighted in right side, where you may Start, Stop, and Recycle the services of IIS 10. In Edit Application Pool section, you can change the settings of application pool as Basic Settings..., Advanced Settings..., Rename the application pool and you may also do the Recycling... How it works... Let's take a look at what we explored in IIS Manager and Application Pools. We understood the basics of application pools and the property where we can get the changes done as per our requirement. IIS 10 default application pool framework is v4.0 which is supported upto v4.6, but we will get some more option for installing different versions of application pool. We can easily customize the application pool, which helps us to fulfill our typical web application requirement. We have several options for application pool in action pane. We can add new application pool, we can start, stop and recycle the application pool task. We can do the editing and automated recycling. Now we are going to learn in the next recipe more about application pools for Installation of lower framework version. Installation of lower framework version In this recipe, we are going to install framework 3.5 on Windows Server 2016. Default IIS 10 has the framework 4.0. We will install the lower version of framework which supports the web application of Version 2.0 to Version 3.5 .NET framework. Let's start now if you have your own web application which you had created a few years back and it was developed in v2.0 .NET framework. You want to run this application on IIS 10.  We are going to cover this topic in this recipe. Getting ready To step through this recipe you need to install framework version3.5, v3.5 framework is based on v2.0 framework. You will need a Windows Server 2016. You should be having a Window Server 2016 Operating System media for framework 3.5 or Internet connected on Window server 2016. You should have Administrative privilege. No other prerequisites are required. How to do it.... Open the Server Manager on Windows Server 2016, click on highlighted Add roles and features option. Click on Next until you get the Select features wizard. You can see the next figure. Click on Features panel and click on the check box .NET Framework 3.5 Features. It will also install the 2.0 supported framework. Move to next wizard as shown in figure. There is a warning coming before the installation: Do you need to specify an alternate source path? One or more installation selection are missing source files on the destination. We have to provide the Installation media sourcesSxS folder path. Click on Specify an alternate source path. See next figure for more details. Here we have Windows Server 2016 media in D:drive. This is the media path in our case which I have downloaded but in your case it can be different path to locate where you already had downloaded. There is a folder which is called sources and sub folder(SxS). Inside framework 3, installation file is available. You may see the next figure. Now you know where the source folder is. Come to confirm Screen and click on Install.  The next figure shows Installation progress on WIN2016IIS. 
Click on close when installation is completed, you have framework 3.5 available on your server. Now you have to check whether framework 3.5 has been installed or not. It should be available in features wizard. Open the Server Manager, click on Add roles and features. Click next and next until you get the Select Features wizard. You will see .NET framework 3.5 check box checked with gray color which is disabled. You can not check and uncheck the checkbox. You can see the next figure.   As shown in the preceding figure, it has been confirmed that .NET framework 3.5 has been installed. This can be installed through PowerShell. We can install directly from Windows Update, you need Internet connectivity and running Windows Update service on Window Server. How it works... In this recipe, IIS administrator is installed in the framework v3.5. The version 3.5 framework on window server 2016 helps us to run built in application .NET framework v2.0 or v3.5. The framework v3.5 processes the application which is built in framework v3.5 or v2.0. We also find out where is the sourcesSxS folder, after installation we verified that this .NET framework v3.5 is available. We are going to create application pool which will support the .NET framework v3.5. Configuration of application pool on IIS 10 In this recipe, we will have an overview of application pool property. We will check the default configuration of Basic Settings, Recycling and Advanced Settings. This is very helpful for developer or system administrator as we can do the configuration of different property of different application pool based upon application requirement. Getting ready For this recipe, we need IIS 1o and .NET framework of any version which is installed on IIS 10. You must have Administrative privilege. No other prerequisites are required. How to do it... Open the Server Manager on Windows Server 2016. Click on Tools menu and open the IIS Manager. Expand the IIS server (WIN2016IIS). We get the listed Application Pools. You may see in the next figure. Now we have already created application pool which is displayed in Application Pools. We created 2and3.5AppPool, Asp.net and DefaultAppPool (Default one). In the Actions panel, we can add many application pools and we can set any one of the created application pool as default application pool. The default application pool helps us when we are creating a website. The default application pool will be selected as an application pool. Select the 2and3.5AppPool from application pools. You will see the Actions pane having a list of available properties in which you can do some changes if needed. The version of 2and3.5AppPool is v2.0, you can see in the next figure. See the Actions panel, Application Pool Tasks and Edit Application Pool which we selected. From the Application Tool Tasks we can Start, Stop and Recycle... the application pool. Now let's come to the basic property of application pool. Click on Basic Settings... from Edit Application Pool, see the next figure which will appear after clicking. Basic Settings... is nothing but a quick settings to change limited number of things. We can change the .NET framework version to framework v4.0 or framework v3.5(version 2.0 is updated version 3.5). We can change the Managed pipeline mode to Integrated or Classic, also we can check or uncheck the start option. Next is the Advanced Settings... which has more options for customization of relevant Application Pool. Click on Advanced Settings..., the next figure will open. 
We have more settings option available in Advanced Settings... window. You may change the .NET framework version, you can select 32 bit application support true or false. Queue Length is 1000 by default. You may reduce or increase as you need. Start Mode should be OnDemand or Always Running. We can also customize utilization of CPU which helps you to manage the load of each application and their performance. Process Model will help you to define task for application pool availability and accessibility. We can see more about application pool in next figure. Rapid-Fail Protection is generally used for fail over. We can setup the fail over server and configuration. Recycling is to refresh the application pool overlapped. We can set a default recycling value. We can do more specific settings through Recycling settings by clicking on Recycling.... You may see your recycling conditions window in the next figure. Recycling is based on conditions like virtual memory usage, private memory usage, specific time, regular time intervals and fixed number of request, also it will generate you a log file which will help you understand which one was executed at what time. Here you will set the fixed intervals based on time and based on number of request, or specific time and based on Memory utilization, virtual and private memory. Click Next. In the Recycling Events to Log window, we generate log on the recycling events. How it works.... In this recipe we have learned three properties of IIS Application - Basic property, Advanced property and Recycling. We can use these properties for web application which we will host in IIS server to process through the application pool. When we are hosting a web application, there is always some requirement which we need to configure in application pool settings. For example, our management decides that we need to limit the queue of 2and3.5apppool application. We can just go to advance settings and change it. In the next section, we are going to host v4.0 .NET framework website and we will make use of application pool v4.0. Summary In this article, we understood application pools in IIS 10 and how to install and configure them. We also understood how to install a lower framework version. Resources for Article: Further resources on this subject: Exploring Microsoft Dynamics NAV – An Introduction [article] The Microsoft Azure Stack Architecture [article] Setting up Microsoft Bot Framework Dev Environment [article]

Ruby Strings

Packt
06 Jul 2017
9 min read
In this article by Jordan Hudgens, the author of the book Comprehensive Ruby Programming, you'll learn about the Ruby String data type and walk through how to integrate string data into a Ruby program. Working with words, sentences, and paragraphs are common requirements in many applications. Additionally you learn how to: Employ string manipulation techniques using core Ruby methods Demonstrate how to work with the string data type in Ruby (For more resources related to this topic, see here.) Using strings in Ruby A string is a data type in Ruby and contains set of characters, typically normal English text (or whatever natural language you're building your program for), that you would write. A key point for the syntax of strings is that they have to be enclosed in single or double quotes if you want to use them in a program. The program will throw an error if they are not wrapped inside quotation marks. Let's walk through three scenarios. Missing quotation marks In this code I tried to simply declare a string without wrapping it in quotation marks. As you can see, this results in an error. This error is because Ruby thinks that the values are classes and methods. Printing strings In this code snippet we're printing out a string that we have properly wrapped in quotation marks. Please note that both single and double quotation marks work properly. It's also important that you do not mix the quotation mark types. For example, if you attempted to run the code: puts "Name an animal' You would get an error, because you need to ensure that every quotation mark is matched with a closing (and matching) quotation mark. If you start a string with double quotation marks, the Ruby parser requires that you end the string with the matching double quotation marks. Storing strings in variables Lastly in this code snippet we're storing a string inside of a variable and then printing the value out to the console. We'll talk more about strings and string interpolation in subsequent sections. String interpolation guide for Ruby In this section, we are going to talk about string interpolation in Ruby. What is string interpolation? So what exactly is string interpolation? Good question. String interpolation is the process of being able to seamlessly integrate dynamic values into a string. Let's assume we want to slip dynamic words into a string. We can get input from the console and store that input into variables. From there we can call the variables inside of a pre-existing string. For example, let's give a sentence the ability to change based on a user's input. puts "Name an animal" animal = gets.chomp puts "Name a noun" noun= gets.chomp p "The quick brown #{animal} jumped over the lazy #{noun} " Note the way I insert variables inside the string? They are enclosed in curly brackets and are preceded by a # sign. If I run this code, this is what my output will look: So, this is how you insert values dynamically in your sentences. If you see sites like Twitter, it sometimes displays personalized messages such as: Good morning Jordan or Good evening Tiffany. This type of behavior is made possible by inserting a dynamic value in a fixed part of a string and leverages string interpolation. Now, let's use single quotes instead of double quotes, to see what happens. As you'll see, the string was printed as it is without inserting the values for animal and noun. This is exactly what happens when you try using single quotes—it prints the entire string as it is without any interpolation. 
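For readers who cannot run the interactive snippets above, here is a minimal non-interactive illustration of the same behavior; the animal and noun values are hard-coded instead of being read with gets.chomp:

animal = "fox"
noun = "dog"

# Double quotes: the variables are interpolated into the string
puts "The quick brown #{animal} jumped over the lazy #{noun}"
# => The quick brown fox jumped over the lazy dog

# Single quotes: the string is printed literally, with no interpolation
puts 'The quick brown #{animal} jumped over the lazy #{noun}'
# => The quick brown #{animal} jumped over the lazy #{noun}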
Therefore it's important to remember the difference. Another interesting aspect is that anything inside the curly brackets can be a Ruby script. So, technically you can type your entire algorithm inside these curly brackets, and Ruby will run it perfectly for you. However, it is not recommended for practical programming purposes. For example, I can insert a math equation, and as you'll see it prints the value out. String manipulation guide In this section we are going to learn about string manipulation along with a number of examples of how to integrate string manipulation methods in a Ruby program. What is string manipulation? So what exactly is string manipulation? It's the process of altering the format or value of a string, usually by leveraging string methods. String manipulation code examples Let's start with an example. Let's say I want my application to always display the word Astros in capital letters. To do that, I simply write: "Astros".upcase Now if I always a string to be in lower case letters I can use the downcase method, like so: "Astros".downcase Those are both methods I use quite often. However there are other string methods available that we also have at our disposal. For the rare times when you want to literally swap the case of the letters you can leverage the swapcase method: "Astros".swapcase And lastly if you want to reverse the order of the letters in the string we can call the reverse method: "Astros".reverse These methods are built into the String data class and we can call them on any string values in Ruby. Method chaining Another neat thing we can do is join different methods together to get custom output. For example, I can run: "Astros".reverse.upcase The preceding code displays the value SORTSA. This practice of combining different methods with a dot is called method chaining. Split, strip, and join guides for strings In this section, we are going to walk through how to use the split and strip methods in Ruby. These methods will help us clean up strings and convert a string to an array so we can access each word as its own value. Using the strip method Let's start off by analyzing the strip method. Imagine that the input you get from the user or from the database is poorly formatted and contains white space before and after the value. To clean the data up we can use the strip method. For example: str = " The quick brown fox jumped over the quick dog " p str.strip When you run this code, the output is just the sentence without the white space before and after the words. Using the split method Now let's walk through the split method. The split method is a powerful tool that allows you to split a sentence into an array of words or characters. For example, when you type the following code: str = "The quick brown fox jumped over the quick dog" p str.split You'll see that it converts the sentence into an array of words. This method can be particularly useful for long paragraphs, especially when you want to know the number of words in the paragraph. Since the split method converts the string into an array, you can use all the array methods like size to see how many words were in the string. We can leverage method chaining to find out how many words are in the string, like so: str = "The quick brown fox jumped over the quick dog" p str.split.size This should return a value of 9, which is the number of words in the sentence. 
To know the number of letters, we can pass an optional argument to the split method and use the format: str = "The quick brown fox jumped over the quick dog" p str.split(//).size And if you want to see all of the individual letters, we can remove the size method call, like this: p str.split(//) And your output should look like this: Notice, that it also included spaces as individual characters which may or may not be what you want a program to return. This method can be quite handy while developing real-world applications. A good practical example of this method is Twitter. Since this social media site restricts users to 140 characters, this method is sure to be a part of the validation code that counts the number of characters in a Tweet. Using the join method We've walked through the split method, which allows you to convert a string into a collection of characters. Thankfully, Ruby also has a method that does the opposite, which is to allow you to convert an array of characters into a single string, and that method is called join. Let's imagine a situation where we're asked to reverse the words in a string. This is a common Ruby coding interview question, so it's an important concept to understand since it tests your knowledge of how string work in Ruby. Let's imagine that we have a string, such as: str = "backwards am I" And we're asked to reverse the words in the string. The pseudocode for the algorithm would be: Split the string into words Reverse the order of the words Merge all of the split words back into a single string We can actually accomplish each of these requirements in a single line of Ruby code. The following code snippet will perform the task: str.split.reverse.join(' ') This code will convert the single string into an array of strings, for the example it will equal ["backwards", "am", "I"]. From there it will reverse the order of the array elements, so the array will equal: ["I", "am", "backwards"]. With the words reversed, now we simply need to merge the words into a single string, which is where the join method comes in. Running the join method will convert all of the words in the array into one string. Summary In this article, we were introduced to the string data type and how it can be utilized in Ruby. We analyzed how to pass strings into Ruby processes by leveraging string interpolation. We also learned the methods of basic string manipulation and how to find and replace string data. We analyzed how to break strings into smaller components, along with how to clean up string based data. We even introduced the Array class in this article. Resources for Article: Further resources on this subject: Ruby and Metasploit Modules [article] Find closest mashup plugin with Ruby on Rails [article] Building tiny Web-applications in Ruby using Sinatra [article]

Spark Streaming

Packt
06 Jul 2017
11 min read
In this article by Romeo Kienzler, the author of the book Mastering Apache Spark 2.x - Second Edition, we will see Apache Streaming module is a stream processing-based module within Apache Spark. It uses the Spark cluster to offer the ability to scale to a high degree. Being based on Spark, it is also highly fault tolerant, having the ability to rerun failed tasks by check-pointing the data stream that is being processed. The following areas will be covered in this article after an initial section, which will provide a practical overview of how Apache Spark processes stream-based data: Error recovery and check-pointing TCP-based stream processing File streams Kafka stream source For each topic, we will provide a worked example in Scala, and will show how the stream-based architecture can be set up and tested. (For more resources related to this topic, see here.) Overview The following diagram shows potential data sources for Apache Streaming, such as Kafka, Flume, and HDFS: These feed into the Spark Streaming module, and are processed as Discrete Streams. The diagram also shows that other Spark module functionality, such as machine learning, can be used to process the stream-based data. The fully processed data can then be an output for HDFS, databases, or dashboards. This diagram is based on the one at the Spark streaming website, but we wanted to extend it for expressing the Spark module functionality:  When discussing Spark Discrete Streams, the previous figure, again taken from the Spark website at http://spark.apache.org/, is the diagram we like to use. The green boxes in the previous figure show the continuous data stream sent to Spark, being broken down into a Discrete Streams (DStream). The size of each element in the stream is then based on a batch time, which might be two seconds. It is also possible to create a window, expressed as the previous red box, over the DStream. For instance, when carrying out trend analysis in real time, it might be necessary to determine the top ten Twitter-based hashtags over a ten minute window. So, given that Spark can be used for stream processing, how is a stream created? The following Scala-based code shows how a Twitter stream can be created. This example is simplified because Twitter authorization has not been included, but you get the idea. The Spark Stream Context (SSC) is created using the Spark Context sc. A batch time is specified when it is created; in this case, 5 seconds. A Twitter-based DStream, called stream, is then created from the Streamingcontext using a window of 60 seconds: val ssc = new StreamingContext(sc, Seconds(5) ) val stream = TwitterUtils.createStream(ssc,None).window( Seconds(60) ) The stream processing can be started with the stream context start method (shown next), and the awaitTermination method indicates that it should process until stopped. So, if this code is embedded in a library-based application, it will run until the session is terminated, perhaps with a Crtl + C: ssc.start() ssc.awaitTermination() This explains what Spark Streaming is, and what it does, but it does not explain error handling, or what to do if your stream-based application fails. The next section will examine Spark Streaming error management and recovery. Errors and recovery Generally, the question that needs to be asked for your application is; is it critical that you receive and process all the data? If not, then on failure you might just be able to restart the application and discard the missing or lost data. 
If this is not the case, then you will need to use check pointing, which will be described in the next section. It is also worth noting that your application's error management should be robust and self-sufficient. What we mean by this is that; if an exception is non-critical, then manage the exception, perhaps log it, and continue processing. For instance, when a task reaches the maximum number of failures (specified by spark.task.maxFailures), it will terminate processing. Checkpointing It is possible to set up an HDFS-based checkpoint directory to store Apache Spark-based streaming information. In this Scala example, data will be stored in HDFS, under /data/spark/checkpoint. The following HDFS file system ls command shows that before starting, the directory does not exist: [hadoop@hc2nn stream]$ hdfs dfs -ls /data/spark/checkpoint ls: `/data/spark/checkpoint': No such file or directory The Twitter-based Scala code sample given next, starts by defining a package name for the application, and by importing Spark Streaming Context, and Twitter-based functionality. It then defines an application object named stream1: package nz.co.semtechsolutions import org.apache.spark._ import org.apache.spark.SparkContext._ import org.apache.spark.streaming._ import org.apache.spark.streaming.twitter._ import org.apache.spark.streaming.StreamingContext._ object stream1 { Next, a method is defined called createContext, which will be used to create both the spark, and streaming contexts. It will also checkpoint the stream to the HDFS-based directory using the streaming context checkpoint method, which takes a directory path as a parameter. The directory path being the value (cpDir) that was passed into the createContext method:   def createContext( cpDir : String ) : StreamingContext = { val appName = "Stream example 1" val conf = new SparkConf() conf.setAppName(appName) val sc = new SparkContext(conf) val ssc = new StreamingContext(sc, Seconds(5) ) ssc.checkpoint( cpDir ) ssc } Now, the main method is defined, as is the HDFS directory, as well as Twitter access authority and parameters. The Spark Streaming context ssc is either retrieved or created using the HDFS checkpoint directory via the StreamingContext method—getOrCreate. If the directory doesn't exist, then the previous method called createContext is called, which will create the context and checkpoint. Obviously, we have truncated our own Twitter auth.keys in this example for security reasons: def main(args: Array[String]) { val hdfsDir = "/data/spark/checkpoint" val consumerKey = "QQpxx" val consumerSecret = "0HFzxx" val accessToken = "323xx" val accessTokenSecret = "IlQxx" System.setProperty("twitter4j.oauth.consumerKey", consumerKey) System.setProperty("twitter4j.oauth.consumerSecret", consumerSecret) System.setProperty("twitter4j.oauth.accessToken", accessToken) System.setProperty("twitter4j.oauth.accessTokenSecret", accessTokenSecret) val ssc = StreamingContext.getOrCreate(hdfsDir, () => { createContext( hdfsDir ) }) val stream = TwitterUtils.createStream(ssc,None).window( Seconds(60) ) // do some processing ssc.start() ssc.awaitTermination() } // end main Having run this code, which has no actual processing, the HDFS checkpoint directory can be checked again. 
This time it is apparent that the checkpoint directory has been created, and the data has been stored: [hadoop@hc2nn stream]$ hdfs dfs -ls /data/spark/checkpoint Found 1 items drwxr-xr-x - hadoop supergroup 0 2015-07-02 13:41 /data/spark/checkpoint/0fc3d94e-6f53-40fb-910d-1eef044b12e9 This example, taken from the Apache Spark website, shows how checkpoint storage can be set up and used. But how often is checkpointing carried out? The metadata is stored during each stream batch. The actual data is stored with a period, which is the maximum of the batch interval, or ten seconds. This might not be ideal for you, so you can reset the value using the method: DStream.checkpoint( newRequiredInterval ) Where newRequiredInterval is the new checkpoint interval value that you require, generally you should aim for a value which is five to ten times your batch interval. Checkpointing saves both the stream batch and metadata (data about the data). If the application fails, then when it restarts, the checkpointed data is used when processing is started. The batch data that was being processed at the time of failure is reprocessed, along with the batched data since the failure. Remember to monitor the HDFS disk space being used for check pointing. In the next section, we will begin to examine the streaming sources, and will provide some examples of each type. Streaming sources We will not be able to cover all the stream types with practical examples in this section, but where this article is too small to include code, we will at least provide a description. In this article, we will cover the TCP and file streams, and the Flume, Kafka, and Twitter streams. We will start with a practical TCP-based example. This article examines stream processing architecture. For instance, what happens in cases where the stream data delivery rate exceeds the potential data processing rate? Systems like Kafka provide the possibility of solving this issue by providing the ability to use multiple data topics and consumers. TCP stream There is a possibility of using the Spark Streaming Context method called socketTextStream to stream data via TCP/IP, by specifying a hostname and a port number. The Scala-based code example in this section will receive data on port 10777 that was supplied using the Netcat Linux command. The code sample starts by defining the package name, and importing Spark, the context, and the streaming classes. The object class named stream2 is defined, as it is the main method with arguments: package nz.co.semtechsolutions import org.apache.spark._ import org.apache.spark.SparkContext._ import org.apache.spark.streaming._ import org.apache.spark.streaming.StreamingContext._ object stream2 { def main(args: Array[String]) { The number of arguments passed to the class is checked to ensure that it is the hostname and the port number. A Spark configuration object is created with an application name defined. The Spark and streaming contexts are then created. Then, a streaming batch time of 10 seconds is set: if ( args.length < 2 ) { System.err.println("Usage: stream2 <host> <port>") System.exit(1) } val hostname = args(0).trim val portnum = args(1).toInt val appName = "Stream example 2" val conf = new SparkConf() conf.setAppName(appName) val sc = new SparkContext(conf) val ssc = new StreamingContext(sc, Seconds(10) ) A DStream called rawDstream is created by calling the socketTextStream method of the streaming context using the host and port name parameters. 
val rawDstream = ssc.socketTextStream( hostname, portnum ) A top-ten word count is created from the raw stream data by splitting words by spacing. Then a (key,value) pair is created as (word,1), which is reduced by the key value, this being the word. So now, there is a list of words and their associated counts. Now, the key and value are swapped, so the list becomes (count and word). Then, a sort is done on the key, which is now the count. Finally, the top 10 items in the RDD, within the DStream, are taken and printed out: val wordCount = rawDstream .flatMap(line => line.split(" ")) .map(word => (word,1)) .reduceByKey(_+_) .map(item => item.swap) .transform(rdd => rdd.sortByKey(false)) .foreachRDD( rdd => { rdd.take(10).foreach(x=>println("List : " + x)) }) The code closes with the Spark Streaming start, and awaitTermination methods being called to start the stream processing and await process termination: ssc.start() ssc.awaitTermination() } // end main } // end stream2 The data for this application is provided, as we stated previously, by the Linux Netcat (nc) command. The Linux Cat command dumps the contents of a log file, which is piped to nc. The lk options force Netcat to listen for connections, and keep on listening if the connection is lost. This example shows that the port being used is 10777: [root@hc2nn log]# pwd /var/log [root@hc2nn log]# cat ./anaconda.storage.log | nc -lk 10777 The output from this TCP-based stream processing is shown here. The actual output is not as important as the method demonstrated. However, the data shows, as expected, a list of 10 log file words in descending count order. Note that the top word is empty because the stream was not filtered for empty words: List : (17104,) List : (2333,=) List : (1656,:) List : (1603,;) List : (1557,DEBUG) List : (564,True) List : (495,False) List : (411,None) List : (356,at) List : (335,object) This is interesting if you want to stream data using Apache Spark Streaming, based upon TCP/IP from a host and port. But what about more exotic methods? What if you wish to stream data from a messaging system, or via memory-based channels? What if you want to use some of the big data tools available today like Flume and Kafka? The next sections will examine these options, but first I will demonstrate how streams can be based upon files. Summary We could have provided streaming examples for systems like Kinesis, as well as queuing systems, but there was not room in this article. This article has provided practical examples of data recovery via checkpointing in Spark Streaming. It has also touched on the performance limitations of checkpointing and shown that that the checkpointing interval should be set at five to ten times the Spark stream batch interval. Resources for Article: Further resources on this subject: Understanding Spark RDD [article] Spark for Beginners [article] Setting up Spark [article]

Planning and Preparation

Packt
05 Jul 2017
9 min read
In this article by Jason Beltrame, authors of the book Penetration Testing Bootcamp, Proper planning and preparation is key to a successful penetration test. It is definitely not as exciting as some of the tasks we will do within the penetration test later, but it will lay the foundation of the penetration test. There are a lot of moving parts to a penetration test, and you need to make sure that you stay on the correct path and know just how far you can and should go. The last thing you want to do in a penetration test is cause a customer outage because you took down their application server with an exploit test (unless, of course, they want us to get to that depth) or scanned the wrong network. Performing any of these actions would cause our penetration-testing career to be a rather short-lived career. In this article, following topics will be covered: Why does penetration testing take place? Building the systems for the penetration test Penetration system software setup (For more resources related to this topic, see here.) Why does penetration testing take place? There are many reasons why penetration tests happen. Sometimes, a company may want to have a stronger understanding of their security footprint. Sometimes, they may have a compliance requirement that they have to meet. Either way, understanding why penetration testing is happening will help you understand the goal of the company. Plus, it will also let you know whether you are performing an internal penetration test or an external penetration test. External penetration tests will follow the flow of an external user and see what they have access to and what they can do with that access. Internal penetration tests are designed to test internal systems, so typically, the penetration box will have full access to that environment, being able to test all software and systems for known vulnerabilities. Since tests have different objectives, we need to treat them differently; therefore, our tools and methodologies will be different. Understanding the engagement One of the first tasks you need to complete prior to starting a penetration test is to have a meeting with the stakeholders and discuss various data points concerning the upcoming penetration test. This meeting could be you as an external entity performing a penetration test for a client or you as an internal security employee doing the test for your own company. The important element here is that the meeting should happen either way, and the same type of information needs to be discussed. During the scoping meeting, the goal is to discuss various items of the penetration test so that you have not only everything you need, but also full management buy-in with clearly defined objectives and deliverables. Full management buy-in is a key component for a successful penetration test. Without it, you may have trouble getting required information from certain teams, scope creep, or general pushback. Building the systems for the penetration test With a clear understanding of expectations, deliverables, and scope, it is now time to start working on getting our penetration systems ready to go. For the hardware, I will be utilizing a decently powered laptop. The laptop specifications are a Macbook Pro with 16 GB of RAM, 256 GB SSD, and a quad-core 2.3 Ghz Intel i7 running VMware Fusion. I will also be using the Raspberry Pi 3. The Raspberry Pi 3 is a 1.2 Ghz ARMv8 64-bit Quad Core, with 1GB of RAM and a 32 GB microSD. 
Obviously, there is quite a power discrepancy between the laptop and the Raspberry Pi. That is okay though, because I will be using both these devices differently. Any task that requires any sort of processing power will be done on the laptop. I love using the Raspberry Pi because of its small form factor and flexibility. It can be placed in just about any location we need, and if needed, it can be easily concealed.

For software, I will be using Kali Linux as my operating system of choice. Kali is a security-oriented Linux distribution that contains a bunch of security tools already installed. Its predecessor, Backtrack, was also a very popular security operating system. One of the benefits of Kali Linux is that it is also available for the Raspberry Pi, which is perfect in our circumstance. This way, we can have a consistent platform between devices we plan to use in our penetration-testing labs. Kali Linux can be downloaded from their site at https://www.kali.org. For the Raspberry Pi, the Kali images are managed by Offensive Security at https://www.offensive-security.com. Even though I am using Kali Linux as my software platform of choice, feel free to use whichever software platform you feel most comfortable with. We will be using a bunch of open source tools for testing. A lot of these tools are available for other distributions and operating systems.

Penetration system software setup

Setting up Kali Linux on both systems is a bit different since they are different platforms. We won't be diving into a lot of details on the install, but we will be hitting all the major points. This is the process you can use to get the software up and running. We will start with the installation on the Raspberry Pi:

Download the images from Offensive Security at https://www.offensive-security.com/kali-linux-arm-images/. Open the Terminal app on OS X. Using the utility xz, you can decompress the Kali image that was downloaded:

xz -d kali-2.1.2-rpi2.img.xz

Next, you insert the USB microSD card reader with the microSD card into the laptop and verify the disks that are installed so that you know the correct disk to put the Kali image on:

diskutil list

Once you know the correct disk, you can unmount the disk to prepare to write to it:

diskutil unmountDisk /dev/disk2

Now that you have the correct disk unmounted, you will want to write the image to it using the dd command. This process can take some time, so if you want to check on the progress, you can run the Ctrl + T command anytime:

sudo dd if=kali-2.1.2-rpi2.img of=/dev/disk2 bs=1m

Since the image is now written to the microSD drive, you can eject it with the following command:

diskutil eject /dev/disk2

You then remove the USB microSD card reader, place the microSD card in the Raspberry Pi, and boot it up for the first time. The default login credentials are as follows:

Username: root
Password: toor

You then change the default password on the Raspberry Pi to make sure no one can get into it with the following command:

passwd <INSERT PASSWORD HERE>

Making sure the software is up to date is important for any system, especially a secure penetration-testing system. You can accomplish this with the following commands:

apt-get update
apt-get upgrade
apt-get dist-upgrade

After a reboot, you are ready to go on the Raspberry Pi. Next, it's onto setting up the Kali Linux install on the Mac. Since you will be installing Kali as a VM within Fusion, the process will vary compared to another hypervisor or installing on a bare metal system.
For me, I like having the flexibility of having OS X running so that I can run commands on there as well:

Similar to the Raspberry Pi setup, you need to download the image. You will do that directly via the Kali website. They offer virtual images for downloads as well. If you go to select these, you will be redirected to the Offensive Security site at https://www.offensive-security.com/kali-linux-vmware-virtualbox-image-download/.

Now that you have the Kali Linux image downloaded, you need to extract the VMDK. We used 7z via CLI to accomplish this task.

Since the VMDK is ready to import now, you will need to go into VMware Fusion and navigate to File | New. A screen similar to the following should be displayed:

Click on Create a custom virtual machine. You can select the OS as Other | Other and click on Continue.

Now, you will need to import the previously decompressed VMDK. Click on the Use an existing virtual disk radio button, and hit Choose virtual disk. Browse the VMDK. Click on Continue. Then, on the last screen, click on the Finish button. The disk should now start to copy. Give it a few minutes to complete.

Once completed, the Kali VM will now boot. Log in with the credentials we used in the Raspberry Pi image:

Username: root
Password: toor

You need to then change the default password that was set to make sure no one can get into it. Open up a terminal within the Kali Linux VM and use the following command:

passwd <INSERT PASSWORD HERE>

Make sure the software is up to date, like you did for the Raspberry Pi. To accomplish this, you can use the following commands:

apt-get update
apt-get upgrade
apt-get dist-upgrade

Once this is complete, the laptop VM is ready to go.

Summary

Now that we have reached the end of this article, we should have everything that we need for the penetration test. Having had the scoping meeting with all the stakeholders, we were able to get answers to all the questions that we required. Once we completed the planning portion, we moved onto the preparation phase. In this case, the preparation phase involved setting up Kali Linux on both the Raspberry Pi as well as setting it up as a VM on the laptop. We went through the steps of installing and updating the software on each platform as well as some basic administrative tasks.

Resources for Article: Further resources on this subject: Introducing Penetration Testing [article] Web app penetration testing in Kali [article] BackTrack 4: Security with Penetration Testing Methodology [article]
DevOps Concepts and Assessment Framework

Packt
05 Jul 2017
21 min read
In this article by Mitesh Soni, the author of the book DevOps Bootcamp, we will build a quick, 10,000-foot understanding of DevOps, with real-world examples of how to prepare for changing a culture. This will allow us to build a foundation for the DevOps concepts by discussing what our goals are, as well as how to get buy-in from organization management. Essentially, we will cover DevOps practices that can make application lifecycle management easy and effective. It is very important to understand that DevOps is not a framework, a tool, or a technology. It is primarily about the culture of an organization: the way people work together, using defined processes and automation tools to make daily work more effective and less manual. To understand the importance of DevOps, we will cover the following topics in this article:

Need for DevOps
How DevOps culture can evolve?
Importance of PPT – People, Process, and Technology
Why DevOps is not all about Tools
DevOps Assessment Questions

(For more resources related to this topic, see here.)

Need for DevOps
There is a famous quote by Harriet Tubman, which you can find at http://harriettubmanbiography.com:

Every great dream begins with a dreamer. Always remember, you have within you the strength, the patience, and the passion to reach for the stars to change the world.

Change is the law of life, and that applies to organizations as well. If an organization or an individual looks only at past or present patterns, culture, or practices, they are certain to miss future best practices. In the dynamic IT world, we need to keep pace with technology evolution. We can relate to George Bernard Shaw's saying:

Progress is impossible without change, and those who cannot change their minds cannot change anything.

Here we are focusing on changing the way we manage the application lifecycle. The important question is whether we really need this change. Do we really need to go through the pain of it? The answer is yes. One may argue that such a change in business or culture must not be forced, and that is true; it has to be driven by the pain points organizations face in application lifecycle management today.

Considering the changing patterns and the competitive business environment, improving application lifecycle management is the need of the hour. Are there factors in modern times that can help us improve it? Yes. Cloud computing has changed the game. It has opened doors to many path-breaking solutions and innovations. Let's understand what cloud computing is, and then we will look at an overview of DevOps and see how the cloud is useful in DevOps.

Overview of Cloud Computing
Cloud computing is a type of computing that provides multi-tenant or dedicated computing resources, such as compute, storage, and network, which are delivered to cloud consumers on demand. It comes in different flavors, covering cloud deployment models and cloud service models. Its most important characteristic is its pricing model: pay as you go.
Cloud deployment models describe the way cloud resources are deployed: behind the firewall and on premise, exclusively for a specific organization (Private Cloud); available to all organizations and individuals (Public Cloud); available to a specific set of organizations that share similar interests or requirements (Community Cloud); or a combination of two or more deployment models (Hybrid Cloud).

Cloud service models describe the way cloud resources are made available to cloud consumers. They can take the form of pure infrastructure, where virtual machines are accessible to and controlled by the cloud consumer or end user (Infrastructure as a Service, or IaaS); a platform, where runtime environments are provided and the installation and configuration of all the software needed to run the application is managed by the cloud service provider (Platform as a Service, or PaaS); or Software as a Service (SaaS), where the whole application is made available by the cloud service provider, which also remains responsible for the infrastructure and platform. Many other service models have emerged over the last few years, but IaaS, PaaS, and SaaS are the ones based on the National Institute of Standards and Technology (NIST) definition.

Cloud computing has a few significant characteristics: multi-tenancy; pay-as-you-use billing, similar to an electricity or gas connection; on-demand self-service; resource pooling for better utilization of compute, storage, and network resources; rapid elasticity for scaling resources up and down automatically based on need; and measured service for billing.

Over the years, the usage of the different cloud deployment models has varied based on use cases. Initially, the public cloud was used for applications considered non-critical, while the private cloud was used for critical applications where security was a major concern. Hybrid and public cloud usage evolved over time with experience and growing confidence in the services provided by cloud service providers. Similarly, the usage of the different cloud service models has varied based on use cases and flexibility. IaaS was the most popular in the early days, but PaaS is catching up in maturity and ease of use, with enterprise capabilities.

Overview of DevOps
DevOps is all about the culture of an organization, its processes, and the technology used to develop communication and collaboration between the development and IT operations teams, in order to manage the application lifecycle more effectively than existing ways of doing it. We often tend to work based on patterns, finding reusable solutions to similar kinds of problems or challenges. Over the years, achievements and failed experiments, best practices, automation scripts, configuration management tools, and methodologies become an integral part of culture. This helps define practices for a way of designing, developing, testing, setting up resources, managing environments, managing configuration, deploying an application, gathering feedback, improving code, and innovating.
DevOps culture can be considered an innovative package to integrate the Dev and Ops teams in an effective manner. It includes components such as continuous build integration, continuous testing, cloud resource provisioning, continuous delivery, continuous deployment, continuous monitoring, continuous feedback, continuous improvement, and continuous innovation, all aimed at making application delivery faster, as demanded by agile methodology. However, it does not involve only the development and operations teams. Testing teams, business analysts, build engineers, automation teams, cloud teams, and many other stakeholders are involved in this exercise of evolving the existing culture. DevOps culture is not much different from organizational culture in general, with its shared values and behavioral aspects. It needs an adjustment in mindsets and processes to align with new technology and tools.

Challenges for the Development and Operations Teams
There are specific challenges that have created this gap, and they explain why DevOps is on an upward trajectory and the talk of the town in IT discussions.

Challenges for the Development Team
Developers are enthusiastic and willing to adopt new technologies and approaches to solve problems. However, they face many challenges, including the following:
The competitive market creates pressure for on-time delivery
They have to take care of production-ready code management as well as new feature implementation
The release cycle is often long, so the development team has to make assumptions before the application deployment finally takes place; in such a scenario, it takes more time to fix issues that occur during deployment in the staging or production environment

Challenges for the Operations Team
The operations team is always careful about changing resources or using new technologies or approaches, because it wants stability. However, it faces many challenges, including the following:
Resource contention: it is difficult to handle increasing resource demands
Redesigning or tweaking: this is needed to run the application in the production environment
Diagnosing and rectifying: the team is supposed to diagnose and rectify issues after application deployment in isolation

Considering all the challenges faced by the development and operations teams, how should we improve existing processes, make use of automation tools to make those processes more effective, and change people's mindsets? Let's see in the next section how to evolve a DevOps culture in the organization and improve efficiency and effectiveness.

How DevOps culture can evolve?
Inefficient estimation, long time to market, and other issues led to a change in the waterfall model, resulting in the agile model. Evolving a culture is not a time-bound or overnight process. It can be a step-by-step, stage-wise process, where each stage can be achieved without depending on the others. We can achieve continuous integration without cloud provisioning. We can achieve cloud provisioning without configuration management. We can achieve continuous testing without any other DevOps practice. The following are the different stages on the way to achieving DevOps practices.

Agile Development
Agile development, or agile-based methodologies, are useful for building an application by empowering individuals and encouraging interactions, giving importance to working software, collaborating with customers (using feedback for improvement in subsequent iterations), and responding to change in an efficient manner.
One of the most attractive benefits of agile development is continuous delivery in short time frames or, in agile terms, sprints. At the same time, the agile approach to application development, improvements in technology, and disruptive innovations and approaches have created a gap between the development and operations teams.

DevOps
DevOps attempts to fill this gap by developing a partnership between the development and operations teams. The DevOps movement emphasizes communication, collaboration, and integration between software developers and IT operations. DevOps promotes collaboration, and collaboration is facilitated by automation and orchestration that improve processes. In other words, DevOps essentially extends the continuous development goals of the agile movement to continuous integration and release. DevOps is a combination of agile practices and processes that leverages the benefits of cloud solutions. Agile development and testing methodologies help us meet the goals of continuously integrating, developing, building, deploying, testing, and releasing applications.

Build Automation
An automated build helps us create an application build using build automation tools such as Gradle, Apache Ant, and Apache Maven. An automated build process includes activities such as:
Compiling source code into class files or binary files
Providing references to third-party library files
Providing the path of configuration files
Packaging class files or binary files into package files
Executing automated test cases
Deploying package files on local or remote machines
Reducing the manual effort involved in creating the package file

Continuous Integration
In simple words, continuous integration, or CI, is a software engineering practice where each check-in made by a developer is verified by either of the following:
Pull mechanism: executing an automated build at a scheduled time
Push mechanism: executing an automated build when changes are saved in the repository
This step is followed by executing unit tests against the latest changes available in the source code repository. Continuous integration is a popular DevOps practice that requires developers to integrate code into a code repository, such as Git or SVN, multiple times a day to verify the integrity of the code. Each check-in is then verified by an automated build, allowing teams to detect problems early.
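To make the pull mechanism concrete, here is a minimal sketch of what a scheduled CI check does behind the scenes, assuming a Maven-based project with a local Git clone; the workspace path and branch name are placeholder assumptions, and in practice a CI server such as Jenkins would run this logic for you rather than a hand-rolled script.

#!/usr/bin/env bash
# Sketch of a pull-based CI check: fetch the latest changes and, if there
# are new commits, run the automated build and unit tests.
# Assumptions: a Maven project, a local clone in $WORKSPACE, and the
# branch name below are illustrative placeholders.

set -euo pipefail

WORKSPACE="$HOME/ci/sample-app"
BRANCH="master"

cd "$WORKSPACE"
git fetch origin "$BRANCH"

LOCAL=$(git rev-parse HEAD)
REMOTE=$(git rev-parse "origin/$BRANCH")

if [ "$LOCAL" != "$REMOTE" ]; then
    echo "New commits detected; running the automated build..."
    git merge --ff-only "origin/$BRANCH"
    # Compile, run unit tests, and package the application
    mvn clean package
else
    echo "No new commits; nothing to build."
fi

A CI server essentially automates this polling (or reacts to a push notification from the repository), records the result of every build, and notifies the team when a check-in breaks the build.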
Cloud Provisioning
Cloud provisioning has opened the door to treating infrastructure as code, and that makes the entire process extremely efficient and effective, because we are automating a process that used to involve a huge amount of manual intervention. The pay-as-you-go billing model has made the required resources affordable not only to large organizations but also to mid-size and small organizations, as well as individuals. It encourages improvement and innovation, whereas earlier, resource constraints blocked organizations from going the extra mile because of cost and maintenance. Once we have agility in infrastructure resources, we can think about automating the installation and configuration of the packages that are required to run the application.

Configuration Management
Configuration management (CM) manages changes in the system or, to be more specific, in the server runtime environment. There are many tools available in the market with which we can achieve configuration management; popular ones are Chef, Puppet, Ansible, Salt, and so on. Let's consider an example where we need to manage multiple servers with the same kind of configuration. For example, we need to install Tomcat on each server. What if we need to change the port on all servers, update some packages, or grant rights to some users? Any kind of modification in this scenario is a manual and, therefore, error-prone process. As the same configuration is being used for all the servers, automation is useful here.

Continuous Delivery
Continuous delivery and continuous deployment are often used interchangeably; however, there is a small difference between them. Continuous delivery is the process of deploying an application to any environment in an automated fashion and providing continuous feedback to improve its quality. The automated approach itself may not change between continuous delivery and continuous deployment; the approval process and a few other minor things can change.

Continuous Testing and Deployment
Continuous testing is a very important phase of the end-to-end application lifecycle management process. It involves functional testing, performance testing, security testing, and so on. Selenium, Appium, Apache JMeter, and many other tools can be utilized for this. Continuous deployment, on the other hand, is all about deploying an application with the latest changes to the production environment.

Continuous Monitoring
Continuous monitoring is the backbone of the end-to-end delivery pipeline, and open source monitoring tools are like the toppings on an ice cream scoop. It is desirable to have monitoring at almost every stage in order to have transparency about all the processes, and it also helps us troubleshoot quickly. Monitoring should be a well thought-out implementation of a plan.

We need to understand that this is a phased approach, and it is not necessary to automate every phase at once. It is more effective to take one DevOps practice at a time, implement it, and realize its benefits before implementing another one. This way, we are safe enough to assess the improvements from changing the culture in the organization and to remove manual effort from application lifecycle management.
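To see how these phases hang together as one continuous flow, here is a minimal sketch of a pipeline driver in shell. The stage scripts named here (provision.sh, configure.sh, deploy.sh, and so on) are hypothetical placeholders rather than part of any specific tool; in practice, an orchestration tool such as Jenkins would model these stages, but the sequence is the same.

#!/usr/bin/env bash
# Sketch of a phased delivery pipeline: each stage is a separate,
# independently automatable step, mirroring the practices described above.
# The helper scripts referenced below are illustrative placeholders.

set -euo pipefail

echo "Stage 1: build and unit tests"
mvn clean package

echo "Stage 2: provision infrastructure (infrastructure as code)"
./provision.sh staging

echo "Stage 3: configuration management (runtime environment, e.g. Tomcat)"
./configure.sh staging

echo "Stage 4: deploy the packaged application"
./deploy.sh staging target/sample-app.war

echo "Stage 5: automated functional and load tests"
./run-tests.sh staging

echo "Stage 6: notify stakeholders and feed results into monitoring"
./notify.sh "Pipeline for staging completed successfully"

Because every stage is just another automated step, they can be adopted one at a time, exactly as the phased approach described above suggests.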
Importance of PPT – People, Process, and Technology
PPT is an important word in any organization. Wait! We are not talking about a PowerPoint presentation. Here, we are focusing on People, Processes, and Tools/Technology. Let's understand why and how they are important in changing the culture of any organization.

People
As per the famous quote from Jack Canfield:

Successful people maintain a positive focus in life no matter what is going on around them. They stay focused on their past successes rather than their past failures, and on the next action steps they need to take to get them closer to the fulfillment of their goals rather than all the other distractions that life presents to them.

A curious question might be: why do people matter? If we try to answer it in one sentence, it would be: because we are trying to change culture. People are an important part of any culture, and only people can drive the change, or change themselves to adapt to new processes, define new processes, and learn new tools or technologies. Let's understand how and why with the "Formula for Change". David Gleicher created the Formula for Change in the early 1960s, as per the references available on Wikipedia, and Kathie Dannemiller refined it in 1980. The formula provides a model to assess the relative strengths affecting the possible success of organizational change initiatives.

Gleicher (original) version: C = (A x B x D) > X, where:
C = change
A = the status quo dissatisfaction
B = a desired clear state
D = practical steps to the desired state
X = the cost of the change

Dannemiller version: D x V x F > R, where D, V, and F must all be present for organizational change to take place:
D = dissatisfaction with how things are now
V = vision of what is possible
F = first, concrete steps that can be taken towards the vision
If the product of these three factors is greater than R (resistance), then change is possible.

Essentially, this implies that there has to be strong dissatisfaction with existing things or processes, a vision of what is possible with new trends, technologies, and innovations with respect to the market scenario, and concrete steps that can be taken towards achieving that vision.

For more details on the Formula for Change, you can visit the Wikipedia page: https://en.wikipedia.org/wiki/Formula_for_change#cite_note-myth-1

From experience, I would say it is very important to train people to adopt the new culture. It is a game of patience. We can't change people's mindsets overnight, and we need to understand them first before changing the culture. I often see job openings asking for DevOps knowledge or DevOps engineers, and I feel such people should not be imported; instead, people should be trained in the existing environment, changing things gradually to manage resistance. We don't need a special DevOps team; we need more communication and collaboration between developers, test teams, automation enablers, and the cloud or infrastructure team. It is essential for everyone to understand each other's pain points. In a number of organizations I have worked in, we used to have a COE (Center of Excellence) in place to manage new technologies, innovations, or culture. As automation enablers and part of the DevOps effort, we should work as facilitators, not as another silo.

Processes
Here is a famous quote from Tom Peters:

Almost all quality improvement comes via simplification of design, manufacturing... layout, processes, and procedures.

Quality is extremely important when we are evolving a culture. We need processes and policies for doing things in the proper way, standardized across projects, so that the sequence of operations, constraints, rules, and so on are well defined as a measure of success. We need to set processes for the following things:
Agile planning
Resource planning and provisioning
Configuration management
Role-based access control to cloud resources and other tools used in automation
Static code analysis: rules for programming languages
Testing methodology and tools
Release management
These processes are also important for measuring success while evolving the DevOps culture.

Technology
Here is a famous quote from Steve Jobs:

Technology is nothing. What's important is that you have a faith in people, that they're basically good and smart, and if you give them tools, they'll do wonderful things with them.

Technology helps people and organizations bring creativity and innovation to changing the culture. Without technology, it is difficult to achieve speed and effectiveness in daily, routine automation operations. Cloud computing, configuration management tools, and build pipelines are a few of the technologies useful for resource provisioning, installing runtime environments, and orchestration. Essentially, technology helps speed up different aspects of application lifecycle management.
Why DevOps is not all about Tools
Tools, by themselves, are not the most important factor in changing the culture of an organization. The reason is simple: no matter which technology we use, we still perform continuous integration, cloud provisioning, configuration management, continuous delivery, continuous deployment, continuous monitoring, and so on. Different tool sets can be used in each category, but they all do similar things; only the way a tool performs an operation differs, while the outcome is the same. The following are some tools, organized by category:

Build Automation: NAnt, MSBuild, Maven, Ant, Gradle
Repository: Git, SVN
Static Code Analysis: Sonar, PMD
Continuous Integration: Jenkins, Atlassian Bamboo, VSTS
Configuration Management: Chef, Puppet, Ansible, Salt
Cloud Platforms: AWS, Microsoft Azure
Cloud Management Tool: RightScale
Application Deployment: shell scripts, plugins
Functional Testing: Selenium, Appium
Load Testing: Apache JMeter
Repositories: Artifactory, Nexus, Fabric

Different tools are useful at different stages for different operations, and the mix may change based on the number of environments or the number of DevOps practices followed in an organization. If we need to categorize tools based on different DevOps best practices, we can also group them into open source and commercial categories. The following are just sample examples:

Components | Open Source | IBM UrbanCode | Electric-Cloud
Build Tools | Ant, Maven, or MSBuild | Ant, Maven, or MSBuild | Ant, Maven, or MSBuild
Code Repositories | Git or Subversion | Git, Atlassian Stash, Subversion, or StarTeam | Git, Subversion, or StarTeam
Code Analysis Tools | Sonar | Sonar | Sonar
Continuous Integration | Jenkins | Jenkins or Atlassian Bamboo | Jenkins or ElectricAccelerator
Continuous Delivery | Chef | Artifactory and IBM UrbanCode Deploy | ElectricFlow

In this book, we will focus on both the open source and commercial categories. We will use Jenkins and Visual Studio Team Services for all the major automation and orchestration related activities.

DevOps Assessment Questions
DevOps is a culture, and we are very much aware of that fact. However, before implementing automation, putting processes in place, and evolving the culture, we need to understand the existing state of the organization's culture and whether we need to introduce new processes or automation tools. We need to be very clear that we want to make the existing culture more efficient rather than import a culture. Accommodating everything in an assessment framework is difficult, but the following questions and hints should make it easier to create one. Create categories for the questions you want to ask and collect responses for a specific application.

A few sample questions:
Do you follow agile principles, Scrum, or Kanban?
Do you use any tool to keep track of Scrum or Kanban?
What is the normal sprint duration (2 weeks or 3 weeks)?
Is there a definitive and explicit definition of done for all phases of work?
Are you using any source code repository?
Which source code repository do you use?
Are you using any build automation tool, such as Ant, Maven, or Gradle?
Are you using any custom scripts for build automation?
Do you have Android and iOS based applications?
Are you using any tools for static code analysis?
Are you using multiple environments for application deployment for different teams, such as dev, test, stage, pre-prod, and prod?
Are you using on-premise infrastructure or cloud-based infrastructure?
Are you using any configuration management tool or script for installing application packages or runtime environments?
Are you using any automated scripts to deploy applications in prod and non-prod environments?
Are you using manual approval before an application release to any specific environment?
Are you using any orchestration tool or script for application lifecycle management?
Are you using automation tools for functional testing, load testing, security testing, and mobile testing?
Are you using any tools for application and infrastructure monitoring?
How are defects logged, triaged, and prioritized for resolution?
Are you using notification services to let stakeholders know about the status of application lifecycle management?

Once the questions are ready, prepare the possible responses and decide on a rating for each response to each question. Make the framework flexible, so that even if a question in a category changes, it is handled automatically. Once ratings are given, capture the responses and calculate overall ratings by introducing different conditions and intelligence into the framework. Create category-wise final ratings and build different kinds of charts from the final ratings to make them easier to read.
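As a starting point for the scoring part of such a framework, here is a minimal sketch in shell. The ratings file format (one category and a 1-to-5 rating per answered question) and the category names are illustrative assumptions, not a prescribed structure.

#!/usr/bin/env bash
# Sketch: compute category-wise average ratings from assessment responses.
# The sample data, the 1-5 scale, and the category names are assumptions
# used only to illustrate the scoring idea.

cat > ratings.txt <<'EOF'
source_control 4
source_control 5
build_automation 3
continuous_integration 2
configuration_management 1
monitoring 2
EOF

# Average the ratings per category and print a simple report
awk '
{
    sum[$1] += $2
    count[$1]++
}
END {
    for (category in sum) {
        printf "%-25s %.1f / 5\n", category, sum[category] / count[category]
    }
}' ratings.txt

From these per-category scores, the lowest-scoring categories usually point to the DevOps practice worth adopting first, which fits the phased approach discussed earlier.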
The important thing to note here is the significance of the organization's expertise in each area of application lifecycle management. It gives the assessment framework a new dimension, adding intelligence and making it more effective.

Summary
In this article, we have set many goals to achieve throughout this book. We have covered continuous integration, resource provisioning in the cloud environment, configuration management, continuous delivery, continuous deployment, and continuous monitoring.

Setting goals is the first step in turning the invisible into the visible.
Tony Robbins

We have seen how cloud computing has changed the way innovation was perceived earlier and how feasible it has become now. We have also covered the need for DevOps and all the different DevOps practices in brief. People, processes, and technology are also important in this whole process of changing the existing culture of an organization, and we touched upon the reasons why. Tools are important but not the showstopper; any toolset can be utilized, and changing a culture does not require a specific set of tools. We also discussed, in brief, a DevOps assessment framework, which will help to get going on the path of changing the culture.

Resources for Article:
Further resources on this subject:
Introduction to DevOps [article]
DevOps Tools and Technologies [article]
Command Line Tools for DevOps [article]