
23 Oct 2009

13 min read

Business Process Modeling


Modeling Business Processes

The transparency of the process flow is crucial, as it gives the process owners, process analysts, and all others involved an insight into what is going on. An understanding of the as-is process flow also ensures that we can judge the efficiency and the quality of the process. The main objective of process modeling is the definition of the as-is process flow. Process modeling needs to answer the following questions:

- What is the outcome of the business process?
- What activities are performed within the business process?
- What is the order of activities?
- Who performs the activities?
- Which business documents are exchanged within the process?
- How foolproof is the process, and how can it be extended in the future?

After answering these and some other questions, we get a good insight into how the process works. We can also identify structural, organizational, and technological weak points and even bottlenecks, and identify potential improvements to the process.

We will model business processes to satisfy the following objectives:

- To specify the exact result of the business process, and to understand the business value of this result.
- To understand the activities of the business process. Knowing the exact tasks and activities that have to be performed is crucial to understanding the details of the process.
- To understand the order of activities. Activities can be performed in sequence or in parallel, which can help reduce the overall time required to fulfill a business process. Activities can be short-running or long-running.
- To understand the responsibilities, that is, to identify (and later supervise) who is responsible for which activities and tasks.
- To understand the utilization of resources consumed in the business process. Knowing who uses which resources can help improve the utilization of resources, as resource requirements can be planned for and optimized.
- To understand the relationship between the people involved in the processes, and their communication. Knowing exactly who communicates with whom is important and can help to organize and optimize communications.
- To understand the document flow. Business processes produce and consume documents (regardless of whether these are paper or electronic documents). Understanding where the documents are going and where they are coming from is important. A good overview of the documents also gives us the opportunity to identify whether all of the documents are really necessary.
- To identify potential bottlenecks and points of improvement, which can be used later in the process optimization phase.
- To introduce quality standards such as ISO 9001 more successfully, and to pass certification more easily.
- To improve the understandability of quality regulations, which can be supplemented with process diagrams.
- To use business process models as work guidelines for new employees, who can familiarize themselves with the business processes faster and more efficiently.
- To understand business processes, which will enable us to understand and describe the company as a whole.

A good understanding of business processes is very important for developing IT support. Applications that provide end-to-end support for business processes can be developed efficiently only if we understand the business processes in detail.

Modeling Method and Notation

Efficient process modeling requires a modeling method that provides a structured and controlled approach to process modeling. Several modeling methods have been developed over the years. Examples include IDS Scheer's ARIS methodology, CSC's Catalyst, Business Genetics, SCOR and the extensions PCOR and VCOR, POEM, and so on. The ARIS methodology has been the most popular methodology, and has been adopted by many software vendors. In the next section, we will describe the basics of the ARIS methodology, which has lately been adapted to conform to SOA.
ARIS

ARIS is both a BPM methodology and an architectural framework for designing enterprise architectures. Enterprise architecture combines business models (process models, organizational models, and so on) with IT models (IT architecture, data model, and so on). ARIS stands for Architecture of Integrated Information Systems and comprises two things: the methodology and framework, and the software that supports both. Here, we will give a brief introduction to the ARIS methodology and framework, which dates back to 1992.

The objective of ARIS is to narrow the gap between business requirements and IT. The ARIS framework is not only about process models (describing business processes), although process models are among the most important parts of ARIS. As enterprise architecture is complex, ARIS reduces the complexity by defining several views that focus on specific aspects such as business, technology, information, and so on. The ARIS framework describes the following:

- Business processes
- Products and services related to the processes
- The structure of the organization
- Business objectives and strategies
- Information flows
- IT architecture and applications
- The data model
- Resources (people and hardware resources)
- Costs
- Skills and knowledge

These views are gathered under the concept of ARIS House, which provides a structured view of all information on business processes. ARIS House offers five views:

- The process view (also called the control view) is the central view that shows the behavior of the processes and how the processes relate to the products and services, organization, functions, and data. The process view includes the process models in the selected notation, and other diagrams such as information flow, material flow, value chains, communication diagrams, and so on.
- The product and service view shows the products and services, their structures, relations, and product/service trees.
- The organizational view shows the organizational structure of the company, including departments, roles, and employees. It shows these in hierarchical organizational charts. The organizational view also shows technical resources and communication networks.
- The function view defines process tasks and describes business objectives, function hierarchies, and application software.
- The data view shows business data and information. This view includes data models, information maps, database models, and knowledge structures.

The ARIS House is illustrated in the following figure:

In ARIS House, the process view is the central view of the dynamic behavior of the business processes and brings together the other four static views: the organizational view, data view, function view, and product/service view. In this book, we will focus primarily on the process view.

Each ARIS view is divided further into phases. The translation of business requirements into IT applications requires that we follow certain phases. Globally, three general phases are likely to be used:

- Requirements phase
- Design specification phase
- Implementation phase

ARIS is particularly strong in the requirements phase, while the other phases may differ depending on the implementation method and the architecture we use. We will talk about these later in this article. Let us now look at the other important aspect, the business process modeling notations.

Modeling Notation

Process modeling also requires a notation. In the past, several notations were used to model processes. Flow diagrams and block diagrams were representatives of the first-generation notations. Then, more sophisticated notations were defined, such as EPC (Event-driven Process Chain) and eEPC (extended Event-driven Process Chain). UML activity diagrams, XPDL, and IDEF 3 were also used, in addition to some other lesser-known notations. A few years ago, a new notation called Business Process Modeling Notation (BPMN) was developed.
BPMN was developed particularly for modeling business processes in accordance with SOA. In this article, we will use BPMN for modeling processes.

BPMN

BPMN is the most comprehensive notation for process modeling so far. It has been developed under the auspices of the OMG (Object Management Group). Let us briefly introduce the most important BPMN elements so that we can read the diagrams presented later in this article.

The most important goals while designing BPMN have been:

- To develop a notation that is understandable at all levels: Business process modeling involves different people, from business users, business analysts, and process owners to technical architects and developers. The management reviews business processes at periodic intervals. Therefore, the goal of BPMN has been to provide a graphical notation that is simple to understand, yet powerful enough to model business processes at the required level of detail.
- To enable automatic transformation into executable code, that is, BPEL, and vice versa: The gap between business process models and information technology (application software) has been quite large in existing technologies. There is no clear definition of how one relates to the other. Therefore, BPMN has been designed specifically to provide such transformations.

To model the diagrams, BPMN defines four categories of elements:

- Flow objects, which are activities, events, and gateways. Activities can be tasks or sub-processes. Events can be triggers or results. Three types of events are supported: start, intermediate, and end. Gateways control the divergence of sequential flows into concurrent flows, and their convergence back to sequential flow.
- Connecting objects are used to connect flow objects together. Connectors are sequence flows, message flows, and associations.
- Swim lanes are used to organize activities into visual categories in order to illustrate different responsibilities or functional capabilities.
Pools and lanes can be used for swim lanes.
- Artifacts are used to add specific context to the business processes that are being modeled. Data objects are used to show how data is produced or required by the process. Groups are used to group together similar activities or other elements. Annotations are used to add text information to the diagram. We can also define custom artifacts.

The following diagrams show the various notations used in BPMN:

Activities are the basic elements of BPMN and are represented by rectangles with rounded corners. A plus sign denotes that the activity can be further decomposed:

Decisions are shown as diamonds. A plus sign inside the diamond denotes a logical AND, while an x denotes an exclusive OR (XOR):

Events are shown as double circles:

Roles are shown as pools and swim-lanes within pools:

A Document is shown as follows:

The order of activities is indicated by an arrow:

The flow of a document or information is shown with a dashed line:

BPMN can be used to model parts of processes or whole processes. Processes can be modeled at different levels of fidelity. BPMN is equally suitable for internal (private) business processes and for public (collaborative) business-to-business processes. Internal business processes focus on the point of view of a single company, and define activities that are internal to the company. Such processes might also define interactions with external partners. Public collaborative processes show the interaction between all involved businesses and organizations. Such process models should be modeled from a general point of view, and should show interactions between the participants.

Process Design

The main activity in process design is the recording of the actual processes. The objective is to develop the as-is process model. To develop the as-is model, it is necessary to gather all knowledge about the process. This knowledge often exists only in the heads of the employees who are involved in the process.
Therefore, it is necessary to perform detailed interviews with all the people involved. Often, process supervisors might think that they know exactly how the process is performed. However, after talking with the employees who really carry out the work, they see that the actual situation differs considerably. It is very important to gather all this information about the process; otherwise, it will not be possible to develop a sound process model that reflects the as-is state of the process.

The first question related to the as-is model is the business result that the process generates. Understanding the business result is crucial, as sometimes it may not be clearly articulated. After the business result is identified, we should understand the process flow. The process flow consists of activities (or tasks) that are performed in a certain order. The process flow is modeled at various levels of abstraction. At the highest level of abstraction, the process flow shows only the most important activities (usually up to ten). Each of the top-level activities is then decomposed into detailed flows. The process complexity and the required level of detail are the criteria that determine how deep we should decompose. To understand the process behavior completely, it makes sense to decompose until atomic activities (that is, activities that cannot be further decomposed) are reached.

When developing the as-is process model, one of the most important things to consider is the level of detail. In order to provide end-to-end support for business processes using SOA, detailed process modeling should be done. The difficulties often hide in the details! In the process design, we should understand the detailed structure of the business process.
Therefore, we should identify at least the following:

- Process activities at various levels of detail
- Roles responsible for carrying out each process activity
- Events that trigger the process execution and events that interrupt the process flow
- Documents exchanged within the process, including input documents and output documents
- Business rules that are part of the process

We should design the usual (also called optimal) process flow and identify possible exception scenarios. Exceptions interrupt the usual process flow. Therefore, we need to specify how the exceptions will be handled. The usual approach to process design includes the following steps:

- Identifying the roles
- Identifying the activities
- Connecting activities to roles
- Defining the order of activities
- Adding events
- Adding documents

We should also understand the efficiency of the business process. This includes resource utilization, the time taken by the employees involved, possible bottlenecks, and inefficiencies. This is the reason why we should also identify metrics that are used to measure the efficiency of the process. While some of these metrics may be KPIs, other metrics relevant to the process should also be identified.

We should identify whether the process is compliant with standards or reference processes. In some industry domains, reference processes have been defined. An example is the telecommunications industry, where the TMF (TeleManagement Forum) has defined NGOSS. Part of NGOSS is eTOM (enhanced Telecom Operations Map), which specifies compliant business processes for telecom companies. Other industries have also started to develop similar reference processes.

We should also identify the business goals to which the process contributes. Business goals are not the same as the process results. A business process should not only have at least one result, but should also contribute to at least one (preferably more than one) business goal.
Here, we can look into the company strategy to identify the business goals. We should also identify the events that can interrupt the process flow. Each process can be interrupted, and we should understand how this happens. If a process is interrupted, we might need to compensate those activities of the process that have already been successfully completed. Therefore, we should also specify the compensation logic related to different interruption events. Finally, we should also understand the current software support for the business process. This is important because existing software may hide the details of process behavior. This information can also be re-used for end-to-end process support. Once we have identified all of these artifacts, we will have gathered a good understanding of the process. Therefore, let us now look at the results of the process modeling.


Packt

06 Feb 2015

22 min read

Multiplying Performance with Parallel Computing


In this article by Aloysius Lim and William Tjhi, authors of the book R High Performance Programming, we will learn how to write and execute parallel R code, where different parts of the code run simultaneously. So far, we have learned various ways to optimize the performance of R programs running serially, that is, in a single process. This does not take full advantage of the computing power of modern CPUs with multiple cores. Parallel computing allows us to tap into all the computational resources available and to speed up the execution of R programs many times over. We will examine the different types of parallelism and how to implement them in R, and we will take a closer look at a few performance considerations when designing the parallel architecture of R programs.

Data parallelism versus task parallelism

Many modern software applications are designed to run computations in parallel in order to take advantage of the multiple CPU cores available on almost any computer today. Many R programs can similarly be written to run in parallel. However, the extent of possible parallelism depends on the computing task involved. On one side of the scale are embarrassingly parallel tasks, where there are no dependencies between the parallel subtasks; such tasks can be made to run in parallel very easily. An example of this is building an ensemble of decision trees in a random forest algorithm: randomized decision trees can be built independently of one another and in parallel across tens or hundreds of CPUs, and can then be combined to form the random forest. On the other end of the scale are tasks that cannot be parallelized, as each step of the task depends on the results of the previous step. One such example is a depth-first search of a tree, where the subtree to search at each step depends on the path taken in previous steps.
Most algorithms fall somewhere in between, with some steps that must run serially and some that can run in parallel. With this in mind, careful thought must be given to designing parallel code that works correctly and efficiently.

Often an R program has some parts that have to be run serially and other parts that can run in parallel. Before making the effort to parallelize any of the R code, it is useful to have an estimate of the potential performance gains that can be achieved. Amdahl's law provides a way to estimate the best attainable performance gain when you convert code from serial to parallel execution. It divides a computing task into its serial and potentially parallel parts and states that the time needed to execute the task in parallel will be no less than:

T(n) = T(1)(P + (1-P)/n)

where:

- T(n) is the time taken to execute the task using n parallel processes
- P is the proportion of the whole task that is strictly serial

The theoretical best possible speedup of the parallel algorithm is thus:

S(n) = T(1) / T(n) = 1 / (P + (1-P)/n)

For example, given a task that takes 10 seconds to execute on one processor, where half of the task can be run in parallel, the best possible time to run it on four processors is T(4) = 10(0.5 + (1-0.5)/4) = 6.25 seconds. The theoretical best possible speedup of the parallel algorithm with four processors is 1 / (0.5 + (1-0.5)/4) = 1.6x.

The following figure shows how the theoretical best possible execution time decreases as more CPU cores are added. Notice that the execution time approaches a limit of five seconds without ever reaching it. This corresponds to the half of the task that must be run serially, where parallelism does not help.

Best possible execution time versus number of CPU cores

In general, Amdahl's law means that the fastest execution time for any parallelized algorithm is limited by the time needed for the serial portions of the algorithm.
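The two formulas above can be checked with a few lines of R. This is a minimal sketch; the function names amdahl.time() and amdahl.speedup() are hypothetical helpers introduced here for illustration and are not part of any package.

```r
# Amdahl's law as stated above:
#   t1 is the single-process execution time T(1),
#   P  is the strictly serial proportion of the task,
#   n  is the number of parallel processes.
amdahl.time <- function(t1, P, n) t1 * (P + (1 - P) / n)
amdahl.speedup <- function(P, n) 1 / (P + (1 - P) / n)

# The worked example: 10 seconds serially, half of the task parallelizable
amdahl.time(10, 0.5, 4)     # 6.25 seconds
amdahl.speedup(0.5, 4)      # 1.6

# Best possible time for a growing number of cores; the values approach,
# but never drop below, the 5 seconds taken by the serial half
amdahl.time(10, 0.5, c(1, 2, 4, 8, 16, 1024))
```

Because both helpers are vectorized over n, the last call gives the data behind a curve like the one in the figure.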
Bear in mind that Amdahl's law provides only a theoretical estimate. It does not account for the overheads of parallel computing (such as starting and coordinating tasks) and assumes that the parallel portions of the algorithm are infinitely scalable. In practice, these factors might significantly limit the performance gains of parallelism, so use Amdahl's law only to get a rough estimate of the maximum speedup possible.

There are two main classes of parallelism: data parallelism and task parallelism. Understanding these concepts helps to determine what types of tasks can be modified to run in parallel.

In data parallelism, a dataset is divided into multiple partitions. Different partitions are distributed to multiple processors, and the same task is executed on each partition of data. Take, for example, the task of finding the maximum value in a vector dataset, say one that has one billion numeric data points. A serial algorithm to do this would look like the following code, which iterates over every element of the data in sequence to search for the largest value. (This code is intentionally verbose to illustrate how the algorithm works; in practice, the max() function in R, though also serial in nature, is much faster.)

    serialmax <- function(data) {
      # Start below every possible value
      max = -Inf
      for (i in data) {
        if (i > max)
          max = i
      }
      return(max)
    }

One way to parallelize this algorithm is to split the data into partitions. If we have a computer with eight CPU cores, we can split the data into eight partitions of 125 million numbers each. Here is the pseudocode for how to perform the same task in parallel:

    # Run this in parallel across 8 CPU cores
    part.results <- run.in.parallel(serialmax(data.part))
    # Compute global max
    global.max <- serialmax(part.results)

This pseudocode runs eight instances of serialmax() in parallel, one for each data partition, to find the local maximum value in each partition.
Once all the partitions have been processed, the algorithm finds the global maximum value by finding the largest value among the local maxima. This parallel algorithm works because the global maximum of a dataset must be the largest of the local maxima from all the partitions. The following figure depicts data parallelism pictorially.

The key behind data parallel algorithms is that each partition of data can be processed independently of the other partitions, and the results from all the partitions can be combined to compute the final results. This is similar to the mechanism of the MapReduce framework from Hadoop. Data parallelism allows algorithms to scale up easily as data volume increases: as more data is added to the dataset, more computing nodes can be added to a cluster to process new partitions of data.

Data parallelism

Other examples of computations and algorithms that can be run in a data parallel way include:

- Element-wise matrix operations such as addition and subtraction: The matrices can be partitioned and the operations applied to each pair of partitions.
- Means: The sums and numbers of elements in each partition can be added to find the global sum and number of elements, from which the mean can be computed.
- K-means clustering: After data partitioning, the K centroids are distributed to all the partitions. Finding the closest centroid is performed in parallel and independently across the partitions. The centroids are updated by first calculating the sums and the counts of their respective members in parallel, and then consolidating them in a single process to get the global means.
- Frequent itemset mining using the Partition algorithm: In the first pass, the frequent itemsets are mined from each partition of data to generate a global set of candidate itemsets; in the second pass, the supports of the candidate itemsets are summed from each partition to filter out the globally infrequent ones.
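The maximum and mean examples can be tried on a small scale with the parallel package that ships with R. The following is a minimal sketch under stated assumptions: the dataset is a hypothetical one-million-element vector rather than one billion points, the worker count of two is arbitrary, and R's built-in max() stands in for the verbose serialmax() function.

```r
library(parallel)

# Hypothetical small dataset standing in for the one-billion-element vector
set.seed(42)
data <- runif(1e6)

ncores <- 2                       # assumed worker count; adjust as needed
cl <- makeCluster(ncores)

# Split the indices into one contiguous partition per worker
parts <- clusterSplit(cl, seq_along(data))
data.parts <- lapply(parts, function(p) data[p])

# Maximum: local maxima in parallel, then the largest of the local maxima
local.max <- parSapply(cl, data.parts, max)
global.max <- max(local.max)

# Mean: per-partition sums and counts in parallel, combined at the end
sums <- parSapply(cl, data.parts, sum)
counts <- parSapply(cl, data.parts, length)
global.mean <- sum(sums) / sum(counts)

stopCluster(cl)

# Both results agree with their serial counterparts
stopifnot(global.max == max(data))
stopifnot(isTRUE(all.equal(global.mean, mean(data))))
```

On data this small, the cost of starting the workers and shipping the partitions will likely outweigh any gain, so the sketch illustrates the structure of a data parallel computation rather than a speedup.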
The other main class of parallelism is task parallelism, where tasks are distributed to and executed on different processors in parallel. The tasks on each processor might be the same or different, and the data that they act on might also be the same or different. The key difference between task parallelism and data parallelism is that the data is not divided into partitions.

An example of a task parallel algorithm performing the same task on the same data is the training of a random forest model. A random forest is a collection of decision trees built independently on the same data. During the training process for a particular tree, a random subset of the data is chosen as the training set, and the variables to consider at each branch of the tree are also selected randomly. Hence, even though the same data is used, the trees are different from one another. In order to train a random forest of, say, 100 decision trees, the workload could be distributed to a computing cluster with 100 processors, with each processor building one tree. All the processors perform the same task on the same data (or exact copies of the data), but the data is not partitioned.

The parallel tasks can also be different. For example, computing a set of summary statistics on the same set of data can be done in a task parallel way. Each process can be assigned to compute a different statistic: the mean, standard deviation, percentiles, and so on. Pseudocode of a task parallel algorithm might look like this:

    # Run 4 tasks in parallel across 4 cores
    for (task in tasks)
      run.in.parallel(task)
    # Collect the results of the 4 tasks
    results <- collect.parallel.output()
    # Continue processing after all 4 tasks are complete

Implementing data parallel algorithms

Several R packages allow code to be executed in parallel. The parallel package that comes with R provides the foundation for most parallel computing capabilities in other packages. Let's see how it works with an example.
This example involves finding documents that match a regular expression. Regular expression matching is a fairly computationally expensive task, depending on the complexity of the regular expression. The corpus, or set of documents, for this example is a sample of the Reuters-21578 dataset for the topic corporate acquisitions (acq) from the tm package. Because this dataset contains only 50 documents, they are replicated 100,000 times to form a corpus of 5 million documents so that parallelizing the code will lead to meaningful savings in execution time.

    library(tm)
    data("acq")
    textdata <- rep(sapply(content(acq), content), 1e5)

The task is to find documents that match the regular expression \d+(,\d+)? mln dlrs, which represents monetary amounts in millions of dollars. In this regular expression, \d+ matches a string of one or more digits, and (,\d+)? optionally matches a comma followed by one or more digits. For example, the strings 12 mln dlrs, 1,234 mln dlrs and 123,456,789 mln dlrs will match the regular expression. First, we will measure the execution time to find these documents serially with grepl(). Note that each backslash must be doubled inside an R string literal:

    pattern <- "\\d+(,\\d+)? mln dlrs"
    system.time(res1 <- grepl(pattern, textdata))
    ##    user  system elapsed
    ##  65.601   0.114  65.721

Next, we will modify the code to run in parallel and measure the execution time on a computer with four CPU cores:

    library(parallel)
    detectCores()
    ## [1] 4
    cl <- makeCluster(detectCores())
    part <- clusterSplit(cl, seq_along(textdata))
    text.partitioned <- lapply(part, function(p) textdata[p])
    system.time(res2 <- unlist(
      parSapply(cl, text.partitioned, grepl, pattern = pattern)))
    ##    user  system elapsed
    ##   3.708   8.007  50.806
    stopCluster(cl)

In this code, the detectCores() function reveals how many CPU cores are available on the machine where this code is executed. Before running any parallel code, makeCluster() is called to create a local cluster of processing nodes with all four CPU cores.
The corpus is then split into four partitions using the clusterSplit() function, which determines the ideal split of the corpus such that each partition has roughly the same number of documents. The actual parallel execution of grepl() on each partition of the corpus is carried out by the parSapply() function. Each processing node in the cluster is given a copy of the partition of data that it is supposed to process, along with the code to be executed and other variables that are needed to run the code (in this case, the pattern argument). When all four processing nodes have completed their tasks, the results are combined in a similar fashion to sapply().

Finally, the cluster is destroyed by calling stopCluster(). It is good practice to ensure that stopCluster() is always called in production code, even if an error occurs during execution. This can be done as follows:

    doSomethingInParallel <- function(...) {
      cl <- makeCluster(...)
      # Runs when the function exits, even after an error
      on.exit(stopCluster(cl))
      # do something
    }

In this example, running the task in parallel on four processors resulted in a 23 percent reduction in the execution time. This is not in proportion to the amount of compute resources used to perform the task; with four times as many CPU cores working on it, a perfectly parallelizable task might experience as much as a 75 percent runtime reduction. However, remember Amdahl's law: the speed of parallel code is limited by the serial parts, which include the overheads of parallelization. In this case, calling makeCluster() with the default arguments creates a socket-based cluster. When such a cluster is created, additional copies of R are run as workers. The workers communicate with the master R process using network sockets, hence the name. The worker R processes are initialized with the relevant packages loaded, and data partitions are serialized and sent to each worker process.
These overheads can be significant, especially in data parallel algorithms where large volumes of data need to be transferred to the worker processes.

Besides parSapply(), parallel also provides the parApply() and parLapply() functions; these functions are analogous to the standard sapply(), apply(), and lapply() functions, respectively. In addition, the parLapplyLB() and parSapplyLB() functions provide load balancing, which is useful when the execution of each parallel task takes a variable amount of time. Finally, parRapply() and parCapply() are parallel row and column apply() functions for matrices.

On non-Windows systems, parallel supports another type of cluster that often incurs less overhead: forked clusters. In these clusters, new worker processes are forked from the parent R process with a copy of the data. However, the data is not actually copied in memory unless it is modified by a child process. This means that, compared to socket-based clusters, initializing child processes is quicker and the memory usage is often lower.

Another advantage of using forked clusters is that parallel provides a convenient and concise way to run tasks on them via the mclapply(), mcmapply(), and mcMap() functions. (These functions start with mc because they were originally a part of the multicore package.) There is no need to explicitly create and destroy the cluster, as these functions do this automatically. We can simply call mclapply() and state the number of worker processes to fork via the mc.cores argument:

    system.time(res3 <- unlist(
      mclapply(text.partitioned, grepl, pattern = pattern,
               mc.cores = detectCores())))
    ##    user  system elapsed
    ## 127.012   0.350  33.264

This shows a 49 percent reduction in execution time compared to the serial version, and a 35 percent reduction compared to parallelizing using a socket-based cluster. For this example, forked clusters provide the best performance.
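The pieces described above (a socket-based cluster, partitioning with clusterSplit(), cleanup with on.exit(), and a load-balanced apply) can be combined in one small function. This is a minimal sketch that runs without the tm package: the four-sentence corpus, the count.matches() helper, and the default of two workers are all assumptions made for illustration, and no timings are claimed for it.

```r
library(parallel)

# Hypothetical stand-in corpus; the article itself uses the acq dataset
docs <- rep(c("profit of 12 mln dlrs", "no amount here",
              "sale for 1,234 mln dlrs", "shares rose"), 2500)
pattern <- "\\d+(,\\d+)? mln dlrs"

# Hypothetical helper: count matching documents on a socket-based cluster
count.matches <- function(docs, pattern, ncores = 2) {
  cl <- makeCluster(ncores)
  on.exit(stopCluster(cl))        # shut the workers down even on error
  parts <- clusterSplit(cl, seq_along(docs))
  partitioned <- lapply(parts, function(p) docs[p])
  # parLapplyLB() is the load-balanced counterpart of parLapply()
  res <- unlist(parLapplyLB(cl, partitioned, grepl, pattern = pattern))
  sum(res)
}

n.socket <- count.matches(docs, pattern)
n.serial <- sum(grepl(pattern, docs))

# The parallel and serial runs find the same number of documents
stopifnot(n.socket == n.serial)
```

Wrapping the cluster in a function keeps its lifetime tied to a single call, which is the situation where on.exit() is most useful.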
Due to differences in system configuration, you might see very different results when you try the examples in your own environment. When you develop parallel code, it is important to test the code in an environment that is similar to the one that it will eventually run in.

Implementing task parallel algorithms

Let's now see how to implement a task parallel algorithm using both socket-based and forked clusters. We will look at how to run the same task and different tasks on workers in a cluster.

Running the same task on workers in a cluster

To demonstrate how to run the same task on a cluster, the task for this example is to generate 500 million Poisson random numbers. We will do this by using L'Ecuyer's combined multiple-recursive generator, which is the only random number generator in base R that supports multiple streams to generate random numbers in parallel. The random number generator is selected by calling the RNGkind() function.

We cannot just use any random number generator in parallel because the randomness of the data depends on the algorithm used to generate random data and the seed value given to each parallel task. Most other algorithms were not designed to produce random numbers in multiple parallel streams, and might produce multiple highly correlated streams of numbers, or worse, multiple identical streams!

First, we will measure the execution time of the serial algorithm:

    RNGkind("L'Ecuyer-CMRG")
    nsamples <- 5e8
    lambda <- 10
    system.time(random1 <- rpois(nsamples, lambda))
    ##   user  system elapsed
    ## 51.905   0.636  52.544

To generate the random numbers on a cluster, we will first distribute the task evenly among the workers. In the following code, the integer vector samples.per.process contains the number of random numbers that each worker needs to generate on a four-core CPU.
The seq() function produces ncores+1 numbers evenly distributed between 0 and nsamples, with the first number being 0 and the next ncores numbers indicating the approximate cumulative number of samples across the worker processes. The round() function rounds off these numbers into integers and diff() computes the difference between them to give the number of random numbers that each worker process should generate.

    ncores <- detectCores()
    cl <- makeCluster(ncores)
    samples.per.process <-
        diff(round(seq(0, nsamples, length.out = ncores+1)))

Before we can generate the random numbers on a cluster, each worker needs a different seed from which it can generate a stream of random numbers. The seeds need to be set on all the workers before running the task, to ensure that all the workers generate different random numbers.

For a socket-based cluster, we can call clusterSetRNGStream() to set the seeds for the workers, then run the random number generation task on the cluster. When the task is completed, we call stopCluster() to shut down the cluster:

    clusterSetRNGStream(cl)
    system.time(random2 <- unlist(
        parLapply(cl, samples.per.process, rpois,
                  lambda = lambda)))
    ##  user  system elapsed
    ## 5.006   3.000  27.436
    stopCluster(cl)

Using four parallel processes in a socket-based cluster reduces the execution time by 48 percent. The performance of this type of cluster for this example is better than that of the data parallel example because there is less data to copy to the worker processes, only an integer that indicates how many random numbers to generate.

Next, we run the same task on a forked cluster (again, this is not supported on Windows). The mclapply() function can set the random number seeds for each worker for us, when the mc.set.seed argument is set to TRUE; we do not need to call clusterSetRNGStream().
Otherwise, the code is similar to that of the socket-based cluster:

    system.time(random3 <- unlist(
        mclapply(samples.per.process, rpois,
                 lambda = lambda,
                 mc.set.seed = TRUE, mc.cores = ncores)))
    ##   user  system elapsed
    ## 76.283   7.272  25.052

On our test machine, the forked cluster is slightly faster than, but close to, the socket-based cluster, indicating that the overheads for this task are similar for both types of clusters.

Running different tasks on workers in a cluster

So far, we have executed the same tasks on each parallel process. The parallel package also allows different tasks to be executed on different workers. For this example, the task is to generate not only Poisson random numbers, but also uniform, normal, and exponential random numbers. As before, we start by measuring the time to perform this task serially:

    RNGkind("L'Ecuyer-CMRG")
    nsamples <- 5e7
    pois.lambda <- 10
    system.time(random1 <- list(pois = rpois(nsamples, pois.lambda),
                                unif = runif(nsamples),
                                norm = rnorm(nsamples),
                                exp = rexp(nsamples)))
    ##   user  system elapsed
    ## 14.180   0.384  14.570

In order to run different tasks on different workers on socket-based clusters, a list of function calls and their associated arguments must be passed to parLapply(). This is a bit cumbersome, but parallel unfortunately does not provide an easier interface to run different tasks on a socket-based cluster. In the following code, the function calls are represented as a list of lists, where the first element of each sublist is the name of the function that runs on a worker, and the second element contains the function arguments. The function do.call() is used to call the given function with the given arguments.
    cores <- detectCores()
    cl <- makeCluster(cores)
    calls <- list(pois = list("rpois", list(n = nsamples, lambda = pois.lambda)),
                  unif = list("runif", list(n = nsamples)),
                  norm = list("rnorm", list(n = nsamples)),
                  exp = list("rexp", list(n = nsamples)))
    clusterSetRNGStream(cl)
    system.time(
        random2 <- parLapply(cl, calls,
                             function(call) {
                                 do.call(call[[1]], call[[2]])
                             }))
    ##  user  system elapsed
    ## 2.185   1.629  10.403
    stopCluster(cl)

On forked clusters on non-Windows machines, the mcparallel() and mccollect() functions offer a more intuitive way to run different tasks on different workers. For each task, mcparallel() sends the given task to an available worker. Once all the workers have been assigned their tasks, mccollect() waits for the workers to complete their tasks and collects the results from all the workers.

    mc.reset.stream()
    system.time({
        jobs <- list()
        jobs[[1]] <- mcparallel(rpois(nsamples, pois.lambda), "pois", mc.set.seed = TRUE)
        jobs[[2]] <- mcparallel(runif(nsamples), "unif", mc.set.seed = TRUE)
        jobs[[3]] <- mcparallel(rnorm(nsamples), "norm", mc.set.seed = TRUE)
        jobs[[4]] <- mcparallel(rexp(nsamples), "exp", mc.set.seed = TRUE)
        random3 <- mccollect(jobs)
    })
    ##   user  system elapsed
    ## 14.535   3.569    7.97

Notice that we also had to call mc.reset.stream() to set the seeds for random number generation in each worker. This was not necessary when we used mclapply(), which calls mc.reset.stream() for us. However, mcparallel() does not, so we need to call it ourselves.

Summary

In this article, we learned about two classes of parallelism: data parallelism and task parallelism. Data parallelism is good for tasks that can be performed in parallel on partitions of a dataset. The dataset to be processed is split into partitions and each partition is processed on a different worker process. Task parallelism, on the other hand, divides a set of similar or different tasks among the worker processes.
In either case, Amdahl's law states that the maximum improvement in speed that can be achieved by parallelizing code is limited by the proportion of that code that can be parallelized.

Resources for Article:

Further resources on this subject:

Using R for Statistics, Research, and Graphics [Article]
Learning Data Analytics with R and Hadoop [Article]
Aspects of Data Manipulation in R [Article]


Packt

30 Sep 2016

15 min read

Functions in Swift


Packt

11 Nov 2013

7 min read

Adding Connectors in Bonita


Bonita connectors

Bonita connectors are used to set variables or other parameters inside Bonita. They can also be used to start a process or execute a step. These connectors let the user work with different parameters of the Bonita workflow. The other kind of connectors are used to integrate with third-party tools.

Most of the Bonita connectors are related to the documents and comments at a particular step. Although these may be useful in some cases, in the majority of cases we will not find much use for them. The most useful ones are getting the users of a step, executing a step, starting a new process, and setting variables.

Click on any step on which you want to define the connector and click on Add.... Here, we will check the start an instance connector of Bonita. Give a name to this connector and click on Next. Here we have to fill in the name of the process that we want to invoke. We also have an option to specify different versions of the process. If we leave this blank, it will pick up the latest version. Next, we can specify the process variables that need to be copied from one pool to the other.

Start an instance connector in Bonita Studio

In the previous example, the process variables that we specify will be copied over to the target pool. We have to make sure that the target pool has the process variables mentioned in this connector. Make sure that you mention the name of the variable in the first column without the curly braces. If you select the names from the drop-down menu, make sure you remove the $ and the {} when filling in the name. The value field can be filled with the actual process variable.

We can also use the set variable connector to set a value to a variable, either a process variable or a step variable. Here, we have two parameters: one is the variable whose value we have to set and the other is the actual value of the variable.
Note that this value may be a Groovy expression, too. Hence, it is similar to writing a Groovy script to assign a value to a variable.

Another type of connector is the one to start or finish a step. In this connector, all we have to do is mention the name of the step we want to start or stop. Similarly, there is another connector to execute a step. Executing will run all the start and end connectors of a particular step and then finish it. These connectors might be useful in cases where some step is waiting for another step, and at the end of the current step we might execute that step or mark it finished.

We also have connectors to get the users from the workflow. There are connectors to find out the initiator of a process and the step submitter. Another useful connector is to get a user based on the username. This returns the User class that Bonita uses to implement the functionality of a user in the workflow.

Select the connector to get a user from a username. Enter the username and click on Next. Here, we get the output of the connector and we can decide to save the output in a particular pool or step variable.

Saving the connector output in a variable in Bonita

The User class has methods to retrieve data, such as the e-mail, first name, last name, metadata, and password from the user.

The e-mail connector

We have a connector in the messaging group to send an e-mail. We might use this connector for a variety of purposes: to send information about the workflow to an external e-mail address, to send a notification to the person performing the task that he/she has some pending items in his/her inbox, and so on. We have to configure the e-mail connector on various parameters.

In our TicketingWorkflow, let us send an e-mail to the person in whose name the tickets are booked. He/she enters his/her e-mail address in the Payment step of the workflow.
Hence, let us send an e-mail at the end of the Payment step to the e-mail address with which the tickets have been booked. For this, let us configure the e-mail connector:

Click on the Payment step of the workflow.

Click on the Connectors tab to add a connector. Select the connector as a medium to send an e-mail. Then name the connector as SendEmail and make sure that this connector is at the finish event of the step.

In the next step, we are required to enter the configuration details of the SMTP server we will use for sending the e-mail. By default, it is set to the Gmail configuration with the host as smtp.gmail.com and the port as 465. Let us stick to the default option and send an e-mail from a Gmail hosted server. Leave the Security option as it is, but enter your credentials in the Authentication section. Here, you should enter your full e-mail address, not just your username. You can also use your own domain e-mail address if it is hosted on a Gmail server.

Next, we define the parameters of the e-mail notification that has to be sent. After entering the From address as the ticketing admin address or some similar address, enter the To address as the variable in which we have saved the e-mail address: email.

In the title field, we have to specify the subject of the e-mail. We have already seen that we can use Java inside the Groovy editor. Here, we will have a look at some simple Java code that is executed inside the editor. Enter the following code in the Groovy editor:

    import java.text.SimpleDateFormat;
    return "Flight ticket from " + from + " to " + to + " on " + new SimpleDateFormat("MM-dd-yyyy").format(departOn);

The overview of the flight details is mentioned in the subject of the e-mail. We know that the departOn variable is a Date object. For printing the date, we have to convert it into a String by using the SimpleDateFormat class.

Next, we have to write the actual e-mail that we will send to the customer.
Below the Title field, make sure that the e-mail body is in HTML and not plain text. We can insert Groovy scripts in between the text, which will be substituted with the actual variable value when the e-mail is sent. Write the following in the body of the e-mail:

    Hi ${passenger1},
    Your ${from} to ${to} flight is confirmed. The flight details are given below:
    Date Departure Arrival Duration Price
    ${import java.text.SimpleDateFormat; return new SimpleDateFormat("MM-dd-yyyy").format(departOn);} ${departure} ${arrival} ${duration} ${price}
    Travelers: ${passenger1} ${passenger2} ${passenger3}
    Payment Details:
    Card Holder - ${cardHolder}
    Card Number - ${cardNumber}
    Thank you for booking with TicketingWorkflow!

Configuring the e-mail connector

Clicking on Next will get you to the advanced options. Generally, it is not necessary to configure these options, and we can make do with the default settings.

Summary

This article looked at the various connector integration options available in Bonita Studio. It showed how connectors can be used to fetch data into the workflow and how to export data, too. We took a close look at the Bonita built-in connectors and the e-mail connector.

Resources for Article:

Further resources on this subject:

Oracle BPM Suite 11gR1: Creating a BPM Application [Article]
Managing Oracle Business Intelligence [Article]
Setting Up Oracle Order Management [Article]
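For readers who want to see the same pieces outside Bonita Studio, the e-mail connector settings described in this article (the Gmail SMTP host on port 465, an HTML body with ${...} placeholders, and a subject that carries the departure date as MM-dd-yyyy) can be mirrored with Python's standard library. This is only a sketch under stated assumptions: it is not how Bonita implements the connector, the function names and the shortened body template are made up for illustration, and send_via_gmail() is defined but never called here.

```python
from datetime import date
from email.message import EmailMessage
from string import Template
import smtplib

# Shortened, hypothetical stand-in for the HTML body from the article.
BODY_TEMPLATE = Template(
    "<p>Hi ${passenger1},</p>"
    "<p>Your ${from} to ${to} flight on ${departOn} is confirmed.</p>"
)


def build_ticket_email(sender, recipient, fields, depart_on):
    """Build the message; fields holds the workflow variables by name."""
    # Same role as SimpleDateFormat("MM-dd-yyyy") in the Groovy editor.
    depart_text = depart_on.strftime("%m-%d-%Y")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Flight ticket from %s to %s on %s" % (
        fields["from"], fields["to"], depart_text)
    values = dict(fields, departOn=depart_text)
    # Send the body as HTML rather than plain text.
    msg.set_content(BODY_TEMPLATE.substitute(values), subtype="html")
    return msg


def send_via_gmail(msg, username, password):
    """Send over implicit TLS on port 465, the default noted in the article."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, password)  # full e-mail address as username
        server.send_message(msg)
```

A message can be built and inspected without any network access, for example with build_ticket_email("admin@example.com", "ann@example.com", {"passenger1": "Ann", "from": "Pune", "to": "Delhi"}, date(2013, 11, 11)); the addresses and values here are placeholders.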


Packt

11 Jul 2013

14 min read

Database, Active Record, and Model Tricks


Packt

06 Mar 2013

17 min read

Asynchrony in Action


Packt

20 May 2011

12 min read

High Availability: Oracle 11g R1 R2 Real Application Clusters (RAC)


Packt

28 Dec 2011

14 min read

Drools Integration Modules: Spring Framework and Apache Camel


Packt

07 Oct 2009

9 min read

Getting a Jump-Start with IronPython


Where do you get it? Before getting started, you’ll need to download IronPython 2.0.1 (though the contents of this article could just as easily be applied to past or even future versions). The official IronPython site is www.codeplex.com/ironpython. Here you’ll find not only the IronPython bits, but also samples, source code, documentation and many other resources. After clicking on the "Downloads" tab at the top you will be presented with three download options: IronPython.msi, IronPython-2.0.1-Bin.zip (the binaries) or IronPython-2.0.1-Src.zip (the source code). If you already have CPython installed—the standard Python implementation—the binaries are probably your best bet. You simply unzip the files to your preferred installation directory and you’re done. If you don’t have CPython installed, I recommend the IronPython.msi file since it comes prepackaged with portions of the CPython Standard Library. Figure 1. IronPython installation directory. There are a few items I would like to highlight in the IronPython installation directory displayed in Figure 1. The first is the FAQ.html file. This covers all of your basic IronPython questions, from licensing questions to implementation details. Periodically reviewing this while you’re learning IronPython will probably save you a lot of frustration. The second item of importance is the two executables, ipy.exe and ipyw.exe. As you probably guessed, these are what you use to launch IronPython; ipy.exe is used for scripts and console applications while ipyw.exe is reserved for other types of applications (Windows Forms, WPF, etc). Lastly, I’d like to draw your attention to the Tutorial folder. Inside the Tutorial folder, you’ll find a Tutorial.html file in addition to a number of other files. The Tutorial.html file is a comprehensive review of what you need to know to get started with IronPython. If you want to be quickly productive, be sure to at least review the tutorial. It will answer many of your questions. 
Visual Studio or a Text Editor?

One thing that neither the ReadMe nor the Tutorial really covers is the tooling story. While Visual Studio 2008 is a viable Python development tool, you may want to consider other options. Personally, I bounce between VS and SciTE, but I’m always watching for new tools that might improve my development experience. There are a number of IDEs and debuggers out there and you owe it to yourself to investigate them. Sometimes, however, Visual Studio IS the right tool for the job. If that’s the case then you’ll need to install the Visual Studio SDK from http://www.microsoft.com/downloads/details.aspx?FamilyID=30402623-93ca-479a-867c-04dc45164f5b&displaylang=en.

Let’s Write Some Code!

To get started, let’s create a simple Python script and execute it with ipy. In a new file called "sample.py" (Python files are indicated by a ".py" extension), type print 'Hello, world'. Open a command window, navigate to the directory where you saved sample.py and then call ipy.exe, passing "sample.py" as an argument. Figure 2 displays what you might expect to see in the console window.

Figure 2. Executing a script using the command line

Not that executing from the command line isn’t effective, but I prefer a more efficient approach. Therefore I’m going to use SciTE, an editor I briefly mentioned earlier, to duplicate the example in Figure 2. Why SciTE? I get syntax highlighting, I can run my code simply by hitting F5 and the stdout is redirected to SciTE’s output window. In short, I never have to leave my coding environment. If you performed the above "hello, world" example in SciTE, the example would look like Figure 3.

Figure 3. Executing a script using SciTE

Congratulations! You’ve written your first bit of Python code! The problem is it doesn’t really touch any .NET namespaces. Fortunately, this is not a difficult thing to do. Figure 4 shows all the code you need to start working with the System namespace.

Figure 4.
You only need a single line of code to gain access to the System namespace

With that simple import statement, we now enjoy access to the entirety of the System namespace. For example, to access the String class we simply would type System.String. That’s great for getting started but what happens when we want to use something like the Regex class? Do we have to type System.Text.RegularExpressions.Regex?

Figure 5. Using .NET regular expressions from IronPython

No! The first line of Figure 5 introduces a new form of the import statement that only imports the specific items you want. In our case, we only want the Regex class. The code in line 3 demonstrates creating a new instance of the Regex class. Note the lack of a "new" keyword. Python considers "new" redundant since you have to include parentheses anyways. Another interesting note is the syntax (or is it the lack of syntax?) for creating a variable. There’s no declaration statement or type required. We simply create a name that we set equal to a new instance of the Regex class. If you’ve ever written any PHP or classic ASP, this should feel pretty familiar to you. Finally, the print statement on line 6 produces the output shown in Figure 6.

Figure 6. Output from the Regex example

The last example was easy because IronPython already holds a reference to System and mscorlib. Let’s push our limits and create a simple Windows form. This requires a bit more work.

Figure 7. Using the clr module to add a reference

A Quick Review of Python Classes

Figure 7 introduces the clr module as a way of adding references to other libraries in the Global Assembly Cache (GAC). Once we have a reference, we can import the Form, TextBox and Button classes so we can start constructing our GUI. Before we do that though, we have a couple of concepts we need to cover.

Figure 8. Introducing classes and methods

Up until this point, we really haven’t needed to create any classes or methods.
But now that we need to create a form we’re going to need both. Figure 8 demonstrates a very simple class and an equally simple method. I think it’s clear that the "class" keyword defines a class and the "def" keyword defines a method. You probably correctly assumed that "(object)" after "MyClass" is Python’s way of expressing inheritance. The "pass" keyword, however, may not be immediately obvious. In Python, classes and methods cannot be empty. Therefore, if you have a class or method that you aren’t quite ready to code yet, you can use the "pass" statement until you are ready. A more subtle characteristic of Figure 8 is the whitespace. In Python we indent the contents of control structures with four spaces. Tabs will also work, but by convention four spaces are used. In our example above, since "my_method" has no preceding spaces, it’s clear that "my_method" is not part of "MyClass". So how would we make "my_method" a class method? Logically, you would think that simply deleting the "pass" statement under "MyClass" and indenting "my_method" would be enough, but that isn’t the case. There’s one more addition we need to make. Figure 9. Creating a class method As Figure 9 demonstrates, we need to pass "self" as a parameter to "my_method". The first—and sometimes the only—parameter in a class method’s parameter list must always be an instance of the containing class. By convention, this parameter should be named "self", though you could call it anything you’d like. Why the extra step? That’s because Python values the explicit over the implicit. Hiding this detail from the developer is at odds with Python’s philosophy. Creating a Windows Form Now that we have an understanding of classes, methods and whitespace, Figure 10 continues our example from Figure 7 by creating a blank form. Figure 10. Creating a blank form The code in Figure 10 should be fairly understandable. We create the "MyForm" class by inheriting from "System.Windows.Forms.Form". 
We create a new instance of "MyForm" and pass the resulting object to the "Application.Run()" method. The only thing that may give you pause is the "__init__()" method. The "__init__()" method is what’s called a magic method. Magic methods are designated with double underscores on either end of the method name and are rarely called directly. For instance, when the code in Line 10 of Figure 10 executes, the "__init__()" method defined in "MyForm" is actually being called behind the scenes. Figure 11. Populating the form with controls and handling an event handler Figure 11 adds a lot of code to our application, most of which isn’t very interesting. The exception here is the Click event of the goButton. In C#, the method would get passed as an argument in the constructor of a new EventHandler. In IronPython, we simply add a function with the proper signature to the Click event. Now that we have a button that will respond to a click, Figure 12 shows a modified version of our regular expression sample code from earlier inserted into the click method. Note the "__str__()" magic method is the equivalent of ToString(). Figure 12. Populating click with our regular expression example When we run the code, you should see the form displayed in Figure 13. You can enter dates into the top textbox, press the button and either True or False will appear in the lower textbox indicating the results of the IsMatch() function. Figure 13. Completed form Conclusion During the course of one brief article, you went from knowing little of IronPython to using it to build Windows Forms. We were able to move so quickly because we leveraged your existing .NET knowledge. We spent most of our time talking about the very intuitive Python syntax. Go through sample or even production code you've written in the past and duplicate it in IronPython. You’ll find working with familiar .NET libraries will speed your learning process, making it more fun. 
Before you know it, Python will become second-nature!
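As a recap of the class concepts covered above (the "pass" placeholder, the explicit "self" parameter, and the "__init__()" and "__str__()" magic methods), here is a small sketch. It is not the code from the article's figures: the names my_function, Greeter, and greet are made up for illustration, and the example deliberately avoids .NET types so that it runs on a stock CPython interpreter as well. It uses print with a single parenthesized argument, which should behave the same under the Python 2 syntax that IronPython 2.0.1 targets.

```python
class MyClass(object):
    # "pass" is a placeholder: a class body cannot be left empty.
    pass


def my_function():
    # Not indented under MyClass, so this is a plain function, not a method.
    pass


class Greeter(object):
    def __init__(self, name):
        # __init__ runs behind the scenes when Greeter(...) is called.
        self.name = name

    def greet(self):
        # "self" is the instance the method was called on.
        return "Hello, " + self.name

    def __str__(self):
        # __str__ plays the role that ToString() plays in .NET.
        return "Greeter(" + self.name + ")"


if __name__ == "__main__":
    g = Greeter("world")  # note: no "new" keyword
    print(g.greet())
    print(str(g))
```

Calling g.greet() passes g as "self" automatically, which is why the method is declared with one parameter but called with none.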


Packt

03 Nov 2010

5 min read

Using the OSGi Bundle Repository in OSGi and Apache Felix 3.0


Packt

05 Apr 2017

16 min read

IT Operations Management


In this article by Ajaykumar Guggilla, the author of the book ServiceNow IT Operations Management, we will learn about the ITOM capabilities within ServiceNow, which include:

Dependency views
Cloud management
Discovery
Credentials

ServiceNow IT Operations Management overview

Every organization and business focuses on key strategies, some of which include:

Time to market
Agility
Customer satisfaction
Return on investment

Information technology is heavily involved in supporting these strategic goals, either directly or indirectly, by providing the underlying IT services with the required IT infrastructure. IT infrastructure includes networks, servers, routers, switches, desktops, laptops, and much more. IT supports these infrastructure components, enabling the business to achieve its goals. IT continuously supports the IT infrastructure and its components with a set of governance, processes, and tools, which is called IT Operations Management.

IT cares for and feeds the business, and the business expects reliability of the services provided by IT to support the underlying business services. The business in turn cares for and feeds its customers, who expect satisfaction from the services offered to them, without service disruption.

Whatever tools are used, it is important to understand the underlying relationship between IT, businesses, and customers. Simply providing the underlying infrastructure and associated components is not enough. To support the business effectively and efficiently, IT needs to understand how the infrastructure components and processes are aligned and associated with the business services, so that it can understand the impact on the business of an incident, problem, event, or change that arises from an IT infrastructure component.
IT needs to have a consolidated and complete view of the dependency between the business and the customers, without compromising on the technology used, the process followed, or the infrastructure components used. There needs to be a connected way for IT to understand the relationships between these technology components, so that it can proactively stop possible outages before they occur and handle changes in the environment.

On the other hand, a business expects service reliability in order to support the business services provided to its customers. There is a huge financial impact when a business cannot provide the agreed service levels to its customers. So there is always pressure on, and dependence on, IT to provide a reliable service, regardless of what technology or processes are used.

Customers, as always, expect satisfaction from the services provided by the business; at times this is adversely affected by service outages caused by the IT infrastructure. Customer satisfaction is also a key strategic goal for the business to be able to sustain itself in a competitive market.

IT is also expected to integrate with the customer infrastructure components to provide a holistic view of the IT infrastructure, so that it can effectively support the business by proactively identifying and fixing issues before they cause outages, reducing outages and increasing the reliability of the IT services delivered.

Most tools do not understand the context of the Service-Oriented Architecture (SOA) that connects the business services to the impacted IT infrastructure components. That context is needed to support the business effectively and for IT to justify the cost and impact of providing an end-to-end service. Most traditional tools perform certain aspects of ITOM functions, some only partially, and some support integration with the IT Service Management (ITSM) tool suite.
The missing integration piece between the traditional tools and a full-blown cloud solution platform is an orientation toward SOA. ServiceNow, a cloud-based solution, applies a true SOA lens: it brings together the ITOM suite, leverages the native data, and can also connect to the customer infrastructure to provide a holistic, end-to-end view of the IT service at a given point in time. With ServiceNow, IT has a complete view of the business service and technical dependencies in real time, leveraging powerful individual capabilities, applications, and plugins within ServiceNow ITOM.

ServiceNow ITOM comprises the following applications and capabilities. Some of the plugins, applications, and technology might have license restrictions that require separate licenses to be purchased:

Management, Instrumentation, and Discovery (MID) Server: Helps to establish communication and data movement between ServiceNow and the external corporate network and applications
Credentials: A platform that stores credentials, including usernames, passwords, or certificates, in an encrypted field on the credentials table that is leveraged by ServiceNow discovery
Service mapping: Discovers and maps the relationships between the IT components that comprise specific business services, even in dynamic, virtualized environments; in other words, it creates relationships between different IT components and business services
Dependency views: Graphically display an infrastructure view with the relationships of configuration items and the underlying business services
Event management: Provides a holistic view of all the events that are triggered from various event monitoring tools
Orchestration: Helps in automating IT and business processes for operations management
Discovery: Works with MID Server to explore the IT infrastructure environment, discover the configuration items, and populate the Configuration Management Database (CMDB)
Cloud management: Helps to easily manage third-party cloud providers, which include AWS, Microsoft Azure, and VMware clouds

Understanding ServiceNow IT Operations Management components

Now that we have covered what ITOM is about, with a focus on ServiceNow ITOM capabilities, let's dive deeper and explore each capability.

Dependency views

Maps like the preceding one are becoming so important in everyday life; imagine a world without GPS devices or electronic maps. Hard copies of maps used to be widely available to help us get to a place, and there were special maps for utilities and other public service agencies so that they could identify the impact of digging a tunnel, a water pipe, or an underground electric cable. These maps help them to identify the impact of making a change to the ground. Maps also help us to understand the relationships between states, countries, cities, and streets, with different sets of information in real time, including real-time traffic information showing accidents, construction, and so on.

Dependency views are similar to real-life navigation maps: they provide a map of the relationships between the IT infrastructure components and the business services that are defined under the scope. Like the real-time traffic updates on navigation maps, dependency views show the active incidents, changes, and problems reported on an individual configuration item or infrastructure component in real time.

Changes happen frequently in the environment. Some of these changes are handled with legacy knowledge of how the individual components are connected to the business services, through the service mapping plugin, down to the individual component level.
Making a change without understanding the relationships between the IT infrastructure components might adversely affect service levels and impact the business service. ServiceNow dependency views provide a snapshot of how the underlying business service is connected to individual Configuration Item (CI) elements. Drilling down to an individual CI shows the associated service operation and service transition data: the incidents logged against the CI, any underlying problems reported against it, and the changes associated with it.

Dependency views are based on D3 and Angular technology, which provides a graphical view of configuration items and their relationships. The dependency views show the CIs and their relationships; to get a perspective from a business standpoint, you will need to enable the service mapping plugin. Having a detailed view of how the individual CI components are connected, from the business service down to the CI components, complements change management by enabling effective impact analysis before any changes are made to the respective CI:

Image source: wiki.servicenow.com

A dependency map starts with a root node, usually termed the root CI, which is grayed out with a gray frame. Relationships build up from there, mapping the upstream and downstream dependencies of the infrastructure components that are in scope for ServiceNow auto discovery. Administrators control the number of levels to display on the dependency maps. The maps are also easy to manage: relationships can be created or modified right from the map, and the respective changes are posted to the CMDB automatically. Each CI on a dependency map has an indicator that shows any active and pending issues against it, including any incidents, problems, changes, and events associated with the respective configuration item.
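To make the idea of a root CI and a configurable number of displayed levels concrete, here is a minimal sketch of a breadth-first walk over CI relationships. It is only an illustration: the relationship data, the CI names, and the dependency_levels function are hypothetical and are not part of ServiceNow.

```python
from collections import deque

# Hypothetical CI relationship data: each key depends on the CIs in its list.
DEPENDS_ON = {
    "Email Service": ["Mail Server", "Web Frontend"],
    "Mail Server": ["Database", "Storage Array"],
    "Web Frontend": ["Load Balancer"],
    "Database": ["Storage Array"],
}


def dependency_levels(root, relationships, max_levels):
    """Return {ci_name: level} for every CI within max_levels of the root CI."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        ci = queue.popleft()
        if levels[ci] == max_levels:
            continue  # do not expand beyond the configured depth
        for child in relationships.get(ci, []):
            if child not in levels:  # keep the shortest distance from the root
                levels[child] = levels[ci] + 1
                queue.append(child)
    return levels


one_level = dependency_levels("Email Service", DEPENDS_ON, 1)
two_levels = dependency_levels("Email Service", DEPENDS_ON, 2)
```

With one level, only the CIs directly related to the root appear; raising the limit to two also brings in the database, storage, and load balancer, which mirrors how a deeper map exposes more of the impact of a change.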
Cloud management

In versions prior to Helsinki, there was no direct way to manage cloud instances; people had to create orchestration scripts to manage them and also create custom roles. Managing and provisioning have become easy with the ServiceNow cloud management application. The cloud management application integrates seamlessly with the ServiceNow service catalog and also provides automation capability through orchestration workflows. It fully integrates the life cycle management of virtual resources into the standard ServiceNow data collection, management, analytics, and reporting capabilities. The ServiceNow cloud management application provides easy and quick options for key cloud providers, which include:

AWS Cloud: Manages Amazon Web Services (AWS) resources
Microsoft Azure Cloud: Integrates with Azure through the service catalog and provides the ability to manage virtual resources easily
VMware Cloud: Integrates with VMware vCenter to manage the virtual resources by integrating with the service catalog

The following figure describes a high-level architecture of the cloud management application:

Key features of the cloud management application include the following:

Single pane of glass to manage the virtual services in public and private cloud environments, including approvals, notifications, security, asset management, and so on
Ability to repurpose configurations through resource templates that help to reuse the capability sets
Seamless integration with the service catalog; with a defined workflow and approvals, integration can be done end to end, right from the user request to the cloud provisioning
Ability to control the leased resources through date controls and role-based security access
Ability to use the ServiceNow discovery application or the standalone capability to discover virtual resources and their relationships in their environments
Ability to determine the best virtualization server for a VM based on the data discovered by the CMDB auto discovery
Ability to control and manage virtual resources effectively with a controlled termination shutdown date
Ability to increase virtual server resources in a controlled fashion, for example, increasing storage or memory; by integrating with the service catalog, and with the appropriate approvals, the required resources can be increased
Ability to perform a price calculation and integrate managed virtual machines with asset management
Ability to auto or manually provision the required cloud environment with zero click options

There are different roles within the cloud management application; here are some of them:

Virtual provisioning cloud administrator: The administrator owns the cloud admin portal and its end-to-end management, including configuration of the cloud providers. Administrators can configure the service catalog items that requesters will use and the approvals required to provision the cloud environment.
Virtual provisioning cloud approver: The approver either approves or rejects requests for virtual resources.
Virtual provisioning cloud operator: The operator fulfills the requests to manage the virtual resources and the respective cloud management providers. Cloud operators are mostly involved when manual intervention is required to manage or provision the virtual resources.
Virtual provisioning cloud user: Users have access to the my virtual assets portal, which helps them to manage the virtual resources they own, have requested, or are responsible for.
How clouds are provisioned

The cloud administrator creates a service catalog item that users can use to request cloud resources
The cloud user requests a virtual machine through the service catalog
The request goes to the approver, who either approves or rejects it
The cloud operator provisions the request manually, or the virtual resources are auto provisioned

Discovery

Imagine how an atlas is mapped and how places are charted using manual surveys, satellites, and street-map collector devices. These devices crawl through all the streets to collect different data points, including information about the streets, houses, and much more. Consumers use this information for various purposes, including GPS navigation, finding and exploring different areas, looking up the address of a location, and learning about incidents, construction, road closures, and so on along the way.

ServiceNow discovery works the same way: it explores the enterprise network, identifying the devices in scope. ServiceNow discovery probes and sensors perform the collection of infrastructure devices connected to a given enterprise network. Discovery uses Shazzam probes to determine which TCP ports are open and whether the device responds to SNMP queries, and uses sensors to explore any given computer or device, starting with basic probes and then using more specific probes as it learns more. Discovery first checks the type of device; for each type of device, it uses different kinds of probes to extract more information about the computer or device and the software that is running on it.

The CMDB is updated, or data is federated, through ServiceNow discovery. Discovered devices are identified by searching the CMDB for a CI that matches the discovered CI on the network.
When a match is found, the action to take is defined by the administrator in the discovery configuration: either the existing CI in the CMDB is updated, or a new CI is created within the CMDB. Discovery can be scheduled to scan at certain intervals; configuration management keeps the status of each CI up to date through discovery. During discovery, the MID Server retrieves the probes to run from the ServiceNow instance, executes them, and returns the results to the ServiceNow instance or the CMDB for processing. No data is retained on the MID Server. The data collected by these probes is processed by sensors.

ServiceNow is hosted in ServiceNow data centers spread across the globe. ServiceNow as an application does not have the ability to communicate directly with a given enterprise network. Traditionally, there are two different types of discovery tools on the market:

Agent: A piece of software is installed on the servers or individual systems and sends all information about the system to the CMDB.
Agentless: Usually doesn't require any individual installations on the systems or components. These tools use a single system or piece of software to probe and sense the network by scanning, and then federate the CMDB.

ServiceNow discovery is agentless: it does not require any individual software to be installed on the discovered systems; it uses MID Server. Discovery is available as a separate subscription from the rest of the ServiceNow platform and requires the discovery plugin.

MID Server is a Java application that runs on any Windows, UNIX, or Linux system residing within the enterprise network that needs to be discovered. MID Server is the bridge and communicator between the ServiceNow instance sitting somewhere in the cloud and the enterprise network, which is secured and controlled. MID Server uses several techniques to probe devices without using agents.
Depending on the type of infrastructure component, MID Server uses the appropriate protocol to gather information from it. For example, to gather information from network devices, MID Server uses Simple Network Management Protocol (SNMP); to connect to UNIX systems, it uses SSH. The following table shows the different ServiceNow discovery probe types:

Device | Probe type
Windows computers and servers | Remote WMI queries, shell commands
UNIX and Linux servers | Shell command (via SSH protocol)
Storage | CIM/WBEM queries
Printers | SNMP queries
Network gear (switches, routers, and so on) | SNMP queries
Web servers | HTTP header examination
Uninterruptible Power Supplies (UPS) | SNMP queries

Credentials

ServiceNow discovery and orchestration features require credentials to be able to access the enterprise network; these credentials vary across networks and devices. Credentials such as usernames, passwords, and certificates need a secure place to be stored. The ServiceNow credentials application stores credentials in an encrypted format in the credentials table. Credential tagging allows workflow creators to assign individual credentials to any activity in an orchestration workflow, or to assign different credentials to each occurrence of the same activity type in an orchestration workflow. Credential tagging also works with credential affinities.

Credentials can be assigned an order value that determines the sequence in which discovery and orchestration try the credentials when orchestration attempts to run a command or discovery tries to query. The credentials table can contain many credentials. Based on the pattern of usage, the credentials application places credentials on a highly used list, which enables discovery and orchestration to work faster after the first successful connection, because the system knows which credential to use for a faster logon to the device next time.
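The interplay between the order value and the remembered, previously successful credential can be sketched as follows. This is a simplified illustration under stated assumptions, not ServiceNow's implementation: the record shape, the try_credentials function, the rule that lower order values are tried first, and the fake_connect stand-in are all hypothetical.

```python
def try_credentials(device_ip, credentials, affinity, connect):
    """Try credentials for a device, preferring one that worked before.

    credentials: list of dicts with 'name' and 'order' keys (hypothetical shape).
    affinity:    dict mapping device IP -> name of the credential that last worked.
    connect:     callable(device_ip, credential) -> bool, supplied by the caller.
    Returns the name of the credential that worked, or None.
    """
    # Assumption: lower order values are tried first.
    ordered = sorted(credentials, key=lambda c: c["order"])
    remembered = affinity.get(device_ip)
    # Stable sort: the remembered credential (if any) moves to the front,
    # and the rest keep their order-value sequence.
    ordered.sort(key=lambda c: c["name"] != remembered)
    for cred in ordered:
        if connect(device_ip, cred):
            affinity[device_ip] = cred["name"]  # remember for next time
            return cred["name"]
    return None


# Demonstration with a fake connection function that only accepts "ssh-admin".
creds = [
    {"name": "ssh-admin", "order": 200},
    {"name": "ssh-readonly", "order": 100},
]
attempts = []


def fake_connect(ip, cred):
    attempts.append(cred["name"])
    return cred["name"] == "ssh-admin"


affinity = {}
first = try_credentials("10.0.0.5", creds, affinity, fake_connect)
first_attempts = list(attempts)
attempts.clear()
second = try_credentials("10.0.0.5", creds, affinity, fake_connect)
second_attempts = list(attempts)
```

On the first run both credentials are tried in order; on the second run the remembered credential is tried first, so only one attempt is needed, which is the speed-up the paragraph above describes.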
Image source: wiki.servicenow.com

Credentials are encrypted automatically with a fixed instance key when they are submitted or updated in the credentials (discovery_credentials) table. When credentials are requested by the MID Server, the platform handles them using the following process:

The credentials are decrypted on the instance with the password2 fixed key.
The credentials are re-encrypted on the instance with the MID Server's public key.
The credentials are encrypted on the load balancer with SSL.
The credentials are decrypted on the MID Server with SSL.
The credentials are decrypted on the MID Server with the MID Server's private key.

The ServiceNow credential application integrates with the CyberArk credential storage. The MID Server integration with the CyberArk vault enables orchestration and discovery to run without storing any credentials on the ServiceNow instance. The instance maintains a unique identifier for each credential, the credential type (such as SSH, SNMP, or Windows), and any credential affinities. The MID Server obtains the credential identifier and IP address from the instance, and then uses the CyberArk vault to resolve these elements into a usable credential. The CyberArk integration requires the external credential storage plugin, which is available by request.

The CyberArk integration supports these ServiceNow credential types:

CIM
JMS
SNMP community
SSH
SSH private key (with key only)
VMware
Windows

Orchestration activities that use these network protocols support the use of credentials stored on a CyberArk vault:

SSH
PowerShell
JMS
SFTP

Summary

In this article, we covered an overview of ITOM and explored the different ServiceNow ITOM components, including the high-level architecture and the functional aspects of discovery, credentials, dependency views, and cloud management.


Packt

21 Sep 2010

12 min read

Linking Your Customers to Your SugarCRM


Surely, the most important goal of any CRM system is to make your customers feel positive about your company and to make them feel that exciting things are happening at your company, such as the following: That the employees they are in contact with are caring and well-informed That new and better information systems are coming into place That your company is responsive to product and service issues, and cares about its customers Limiting CRM system access to only the employees of a business will certainly affect the first of the aforementioned items positively, but not necessarily the other items. To really improve a customer's perception of your organization, one of the biggest improvements you can make is to allow customers to interact almost directly with your CRM system. Some of the activities that make this possible are as follows: Capturing customer leads and requests for information from the public website directly within the CRM system. Efficiently tracking customer service requests and related product/service flaws to help improve your offerings and customer satisfaction. Developing a customer self-service portal in conjunction with the CRM system to allow clients to file their own service cases, check on the latest status of a case, and to update their own customer profile. Most of us in our own lives can forgive or understand if a family member, friend, or supplier lets us down a bit, or makes a mistake, as long as they communicate with us honestly and effectively. In addition, with early detection of any errors, corrective action can always be put in place more quickly. Integrating your CRM system more directly with your customer is no more complicated than this: promoting more effective, more accurate, and timely communications with your customers. The net effect of such actions is that your customers feel informed, valued, and empowered.
Capturing leads from your website

Capturing leads from your company's website directly into your CRM is one of the greatest early initiatives you can implement in terms of streamlining business processes to save time and effort. This section will guide you through the manner in which this can be accomplished with SugarCRM. In the past, setting up a process similar to the one just described would have required the expertise and assistance of a programmer and your webmaster. Coordinating everyone's efforts to accomplish the goal would sometimes become a task in and of itself. Days may have elapsed before your lead capture form finally made it up to your website. Fortunately, those days are behind us. SugarCRM includes a tool that allows you to quickly and easily create a form that you can use to capture leads from your website. Through this tool, you will be able to select the fields corresponding to the data you wish to capture and also create a ready-to-use web form. Let us set up a web lead capture form through SugarCRM's tool. The lead capture tool is specifically designed to import data into the Leads module only. Should you choose not to use the Leads module, or you wish to use a similar technique to capture data within a different module, you should use SugarCRM's SOAP API to accomplish the task. To begin the setup process, hover over the Marketing tab and select Campaigns. On the shortcuts menu on the left-hand side, click on Create Lead Form, as highlighted in the following image: After clicking on it, you will see a screen that permits you to select the fields you wish to capture through your form, as illustrated in the following screenshot: The field selection process is quite simple. On the leftmost column of the three that are presented, you will see a list of all the fields corresponding to the Leads module (including custom fields).
To select a field for your form, simply drag-and-drop it from the field listing on the left onto one of the two rightmost columns. It is best to visualize the layout of the form that will be produced as one similar to the edit or detail view layouts. Fields can appear next to each other, horizontally or vertically, but only within one of two columns. Most organizations prefer the vertical approach, which is the technique we will apply. However, feel free to experiment. Proceed to select the fields to match the preceding image, plus any other fields you may wish to include. Note that required fields are marked with an asterisk, as they are within the Edit view screen. You must make sure to include all your required fields to ensure that the process will work as expected. In addition, you will notice that we have selected the Lead Source field. Doing so will allow website visitors to make the appropriate selection corresponding to what drove them to your site. Click on Next once you are satisfied with your field selection. Now you need to set some final parameters, as illustrated in the following image: You will undoubtedly want to modify the Form Header. This value corresponds to the title of the page that website visitors will see in their browser, so you will want to tailor it to reflect something a bit friendlier than the generic text. The form we are building is no different than any other web form you may have encountered in your day-to-day web browsing. As such, it too will include a button for visitors to click and send the data they typed in. If you prefer the label of the button to read as something other than the default label of Submit, change the Submit Button Label accordingly. The Redirect URL and Related Campaign fields are also quite important. 
The former is used to specify a URL that a visitor will be sent to after clicking on the Submit button on your lead capture form, while the latter is used to associate a particular marketing campaign to the form. Establishing this relationship is critical as it will help you properly measure the effectiveness of your marketing efforts. Lastly, the Assigned To option allows you to define a user to whom the Leads will be assigned upon being entered into SugarCRM. You may want to consider creating a specific user, such as WebCapture, and assigning the Leads to that user. Doing so will permit you to quickly identify records that entered your system through the web lead capture tool versus other means. Click on Generate Form after you have applied your edits and you should see something similar to the following: The default form should now be presented within SugarCRM's HTML editor. This is a handy capability as it allows you to manipulate the look and feel of the form to make it conform to the already existing look and feel of your website. However, you may wish to ignore that, as additional options allow you to more easily integrate it into your website. To access those features and save the form, click on the Save Web To Lead Form button. SugarCRM provides the convenience of a fully formatted, ready-to-use web form which can be downloaded by clicking on the Web To Lead Form link. However, if you prefer, you may copy the code displayed in the box and then embed it into one of your already existing pages. The second approach would save you the hassle of having to modify the cosmetic aspects of the default page to match your site. To start receiving data into your SugarCRM system, simply place the form on your web server, fill out the fields and submit the form. Make sure that the server on which it is placed is able to access your SugarCRM system or it will not function. 
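Because the generated page is an ordinary HTML form, submitting it amounts to an HTTP POST of the selected fields to the form's action URL. The sketch below builds such a request with Python's standard library without sending it. The URL, the field names, and the build_lead_request helper are placeholders for illustration; copy the real action URL and field names from the form that SugarCRM generated for you.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_lead_request(form_action_url, fields):
    """Build (but do not send) a POST request equivalent to submitting the form."""
    body = urlencode(fields).encode("utf-8")
    return Request(
        form_action_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical values: take the real action URL and field names from your generated form.
request = build_lead_request(
    "https://crm.example.com/index.php?entryPoint=WebToLeadCapture",
    {"first_name": "Ada", "last_name": "Lovelace", "lead_source": "Web Site"},
)

# To actually submit the lead, pass the request to urllib.request.urlopen(request).
```

Scripting the submission this way is mainly useful for checking that the web server hosting the form can reach the SugarCRM system, which is the connectivity requirement noted above.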
You can test it by opening the form in your web browser and submitting data, as shown in the following image: Assuming everything is working as expected, the records will automatically appear within the leads module of your SugarCRM system without any intervention on your part or that of other users. In addition, e-mail notifications of new records will automatically be sent to the defined assigned user to inform them of the new entry so they may act upon it. Through the use of add-on modules, like SierraCRM's Process Manager, further actions, like the scheduling of follow up calls, can also be automated. Remember, all of this can happen automatically and herein we begin to see the real benefits of a CRM system. There are few things quite as satisfying as driving along in the car, and receiving an e-mail on your BlackBerry telling you that a new lead has been received. Especially, when you know that it all happened automatically! From a process perspective, the concept of having every new lead automatically entered into the CRM system makes it quick and easy to convert that lead into a contact, enter details of new sales opportunities, or include them in e-mail marketing campaigns—all without any data transcription errors, or lost leads, due to human errors. One note of caution: most lead capture sites capture as much as 50% bad data. Some visitors to your site will enter anything they fancy in the form; potentially polluting your database. This highlights another reason why it is beneficial to enter them by utilizing a username such as WebCapture. Doing so would allow you to easily filter leads to only show those created by WebCapture and in turn allow you to cleanse them, either by deleting them or performing other data integrity checks. Customer self-service portals After automating the lead capture process, the most common step that follows in linking your customers into your CRM system is the self-service portal. 
Just as it sounds, this is a software system that enables your customers to exchange information with your organization in a completely autonomous manner. In this initial implementation, we will show you how to implement a system that allows customers to submit and manage service cases directly within your CRM system. Most of us have had the experience of needing to contact a call center to address a customer service issue or other matters. Usually, that process involves staying on hold for some time. If you are lucky, the time that you stay on hold is not long, but at the same time, spending 30 to 45 minutes on hold or being transferred around is not unheard of. To make matters worse, you usually need to make these calls during normal business hours, meaning you are not able to tend to your normal work while you are burning time on hold. The fundamental capability that the self-service portal provides is empowering customers by allowing them to contact you at a time that is most convenient to them. Customers are no longer bound to specific business hours, nor must they wait in a call queue or navigate a maze of phone options. If they need your company's help to resolve an issue, they simply go to your website and submit their issue. Likewise, customers do not need to contact you directly to check in on their previously submitted cases. They simply visit your website again and they will be able to review their cases. This functionality works hand-in-hand with the Cases module that is built into SugarCRM. Typically, users would leverage this module to track service calls that they receive from customers. Through this functionality, all members of the organization are kept up-to-date on any issues that a customer may be experiencing at any given time. The Bug Tracker module complements the Cases module quite well by providing a central repository where all known product flaws can be tracked.
In turn, all cases resulting from any of these flaws can be related to a given bug, allowing you to measure the impact it is having on your customers. Together, they can be used as very effective tools for not only providing customer service, but also prioritizing product development needs and improving customer satisfaction. However, that process can be inefficient, as it relies on a user to enter the data to produce a case in the first place. Empowering the customer in such a way that allows them to directly interact with the Cases module not only makes it easier for you to get feedback and become aware of problems, but it also gives customers the feeling that you care to hear what they have to say about their problems. That is the goal that the self-service portal hopes to accomplish. Self-service portal configuration Before we get too deep into the specifics of configuring and using the self-service portal, you must first understand some important boundaries. First, although this is a built-in feature of the Enterprise Edition of SugarCRM, it is not a feature of Community Edition. To obtain this functionality, we must use the combination of a SugarCRM add-on available on SugarExchange.com, plus an open source CMS (Content Management System) named Joomla! If you are already using another CMS package or cannot use Joomla! for other reasons, you will not be able to utilize the functionality described in this exercise. The second and last important note is that, at the time of this writing, the add-on did not support versions of SugarCRM Community Edition higher than 5.2. Now that we have a clear understanding of some important limitations, let us begin the process of deploying this feature. Installing Joomla! Assuming you have already installed SugarCRM Community Edition on the target server, you have already established the perfect environment for installing the Joomla! CMS package. Like SugarCRM, it too leverages the LAMP or WAMP system software platforms. 
Just like SugarCRM, it is also an open source application. You can download Joomla! from the project's site, located at http://www.joomla.org. Our exercise will use version 1.5 of Joomla! (Full Package). It is assumed that you have already successfully downloaded and installed it onto your server. If you require help with the process, visit the Joomla! website to review its documentation and obtain further assistance. Assuming Joomla! is operational, proceed to access the administrator page. It should resemble the following: Let us leave it at the admin page for now.


Packt

08 Oct 2009

5 min read

Liferay Mail and SMS Text Messenger Portlet


Working with Mail Portlet

For the purpose of this article, we will use an intranet website called book.com, which is created for a fictitious company named "Palm Tree Publications". In order to let employees manage their emails, we can use the Liferay Mail portlet. As an administrator of "Palm Tree Publications", you need to create a Page called "Mail" under the Page "Community" at the Book Lovers Community Public Pages, and also add the Mail portlet to the Page "Mail".

Experiencing Mail Management

First of all, log in as "Palm Tree". Then, let's do the above as follows:

Add a Page called "Mail" under the Page "Community" at the Book Lovers Community Public Pages, if the Page is not already present.
If the Mail portlet is not already present, add it to the Page "Mail" of the Book Lovers Community, where you want to manage mails.

You will see the Mail portlet as shown in the following figure.

Let's assume that we set up the mail domain as "cignex.com" in the Enterprise Admin portlet for testing purposes, as we already have a mail engine with this mail domain. Of course, you could set up the mail domain as "book.com", or something else, if you had a mail engine with that mail domain available.

As an editor of the editorial department, "Lotti Stein", you may want to manage your mails in the mail domain "cignex.com". You can first choose a user name for your personal company email address, say "admin1234", and register. Let's do it as follows:

Log in as "Lotti Stein".
Go to the Page "Mail" under the Page "Community" at the Book Lovers Community Public Pages.
Locate the Mail portlet.
Input the value for User name as "admin1234".
Click the Register button.

Your new email address is "admin1234@cignex.com". This email address will also serve as your login, as shown in the following figure. You can now check for new messages in your inbox by clicking the Inbox link first.
Then, you can view the Unread messages, and either Check Mail or create a New mail, as shown in the following figure:

You can go to the Mail management Page by clicking on any link of Unread Messages, the Check Mail button, or the New button. Further, you can manage emails through the Mail portlet of your current account. Email management includes the following features (as shown in the following figure):

Create a New email.
Check Mail.
Reply to an email.
Reply All emails.
Forward emails.
Delete emails.
Print emails.
Search.

Note that the first email, with the subject "Users in Staging server", is sent through the SMS Text Messenger portlet. For more details, refer to the forthcoming section.

How to Set up Mail Server?

In order to make the Mail portlet work, we have to set up a mail server with the IMAP, POP, and SMTP protocols. Suppose that the Enterprise "Palm Tree Publications" has a mail server with the domain "exg3.exghost.com", an account "admin@cignex.com/admin1234", and the protocols IMAP, POP, and SMTP. As an administrator, you need to integrate this mail server with the IMAP, POP, and SMTP protocols in Liferay. Let's do it as follows:

Find the file ROOT.xml in $TOMCAT_DIR/conf/Catalina/localhost.
Find the mail configuration first. Then configure it as follows:

<Resource
    name="mail/MailSession"
    auth="Container"
    type="javax.mail.Session"
    mail.imap.host="exg3.exghost.com"
    mail.imap.port="143"
    mail.pop.host="exg3.exghost.com"
    mail.pop.port="110"
    mail.store.protocol="imap"
    mail.transport.protocol="smtp"
    mail.smtp.host="exg3.exghost.com"
    mail.smtp.port="2525"
    mail.smtp.auth="true"
    mail.smtp.starttls.enable="true"
    mail.smtp.user="admin@cignex.com"
    password="admin1234"
    mail.smtp.socketFactory.class="javax.net.ssl.SSLSocketFactory"
/>

In short, the Mail portlet is an AJAX web-mail client. We can configure it to work with any mail server. It reduces page refreshes, since it displays message previews and message lists in a dual pane window.

How to Set up Mail Portlet?
If you have the proper Permissions, you can change the preferences of the Mail portlet. To change the preferences, simply click the Preferences icon at the upper right of the Mail portlet.

With the Recipients tab selected, you can find potential recipients from the Directory (Enabled or Disabled) and the Organization (My Organization or All Available). Click the Save button after making any changes.

Using the Filters tab, you can set the values to filter emails associated with an email address to a Folder. Click the Save button after making any changes. Note that the maximum number of email addresses is ten. This number is also configurable in portal-ext.properties.

Similarly, the Forward Address tab allows all emails to be forwarded to the email address you want. Enter one email address per line. Remove all entries to disable email forwarding. Select Yes to leave, or No to not leave, a copy of the forwarded message. Click the Save button after making any changes.

Further, the Signature tab allows you to set up your signature using the HTML text editor. The signature you have set up will be added to each outgoing message. Click the Save button after making any changes.

The Vacation Message tab allows you to set up vacation messages using the HTML text editor. The vacation message notifies others of your absence (as shown in the following figure). Click the Save button after making any changes.
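Returning to the mail server setup in ROOT.xml described earlier: a typo in that Resource element is easy to make, so a quick sanity check before restarting Tomcat can help. The following is a minimal sketch using Python's standard XML parser. The list of attributes treated as required and the check_mail_resource helper are assumptions made for illustration; they are not rules defined by Liferay or Tomcat.

```python
import xml.etree.ElementTree as ET

# The mail session element from the article, with attributes separated by whitespace.
RESOURCE_XML = """
<Resource name="mail/MailSession"
          auth="Container"
          type="javax.mail.Session"
          mail.imap.host="exg3.exghost.com"
          mail.imap.port="143"
          mail.pop.host="exg3.exghost.com"
          mail.pop.port="110"
          mail.store.protocol="imap"
          mail.transport.protocol="smtp"
          mail.smtp.host="exg3.exghost.com"
          mail.smtp.port="2525"
          mail.smtp.auth="true"
          mail.smtp.starttls.enable="true"
          mail.smtp.user="admin@cignex.com"
          password="admin1234"
          mail.smtp.socketFactory.class="javax.net.ssl.SSLSocketFactory"/>
"""

# Assumption for this sketch: these attributes must be present.
REQUIRED = (
    "mail.store.protocol",
    "mail.transport.protocol",
    "mail.smtp.host",
    "mail.smtp.port",
)


def check_mail_resource(xml_text):
    """Return a list of problems found in a mail <Resource> element."""
    element = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    problems = []
    for key in REQUIRED:
        if key not in element.attrib:
            problems.append("missing attribute: " + key)
    for key, value in element.attrib.items():
        if key.endswith(".port") and not value.isdigit():
            problems.append("non-numeric port: " + key)
    return problems


problems_ok = check_mail_resource(RESOURCE_XML)
problems_bad = check_mail_resource(
    '<Resource name="mail/MailSession" mail.smtp.port="abc"/>'
)
```

Note that attributes run together without whitespace, as in name="mail/MailSession"auth="Container", are not well-formed XML and would fail at the parsing step, which is itself a useful check.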


Packt

23 Aug 2013

18 min read

Networking Performance Design


Device and I/O virtualization involves managing the routing of I/O requests between virtual devices and the shared physical hardware. Software-based I/O virtualization and management, in contrast to a direct pass-through to the hardware, enables a rich set of features and simplified management. With networking, virtual NICs and virtual switches create virtual networks between virtual machines running on the same host, without the network traffic consuming bandwidth on the physical network.

NIC teaming consists of multiple physical NICs and provides failover and load balancing for virtual machines. Virtual machines can be seamlessly relocated to different systems by using VMware vMotion, while keeping their existing MAC addresses and the running state of the VM. The key to effective I/O virtualization is to preserve these virtualization benefits while keeping the added CPU overhead to a minimum.

The hypervisor virtualizes the physical hardware and presents each virtual machine with a standardized set of virtual devices. These virtual devices effectively emulate well-known hardware and translate the virtual machine requests to the system hardware. This standardization on consistent device drivers also helps with virtual machine standardization and portability across platforms, because all virtual machines are configured to run on the same virtual hardware, regardless of the physical hardware in the system.

In this article we will:

Describe various network performance problems
Discuss the causes of network performance problems
Propose solutions to correct network performance problems

Designing a network for load balancing and failover for vSphere Standard Switch

The load balancing and failover policies that are chosen for the infrastructure can have an impact on the overall design. Using NIC teaming, we can group several physical network adapters attached to a vSwitch.
This grouping enables load balancing between the different physical NICs and provides fault tolerance if a card or link failure occurs.

Network adapter teaming offers a number of load balancing and load distribution options. Load balancing here means load distribution based on the number of connections, not on network traffic. In most cases, load is managed only for the outgoing traffic, and balancing is based on three different policies:

Route based on the originating virtual switch port ID (default)
Route based on the source MAC hash
Route based on IP hash

There are also two network failure detection options:

Link status only
Beacon probing

Getting ready

To step through this recipe, you will need one or more running ESXi hosts, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

To change the load balancing policy, select the right one for your environment, and select the appropriate failover policy, follow these steps:

1. Open the VMware vSphere Client.
2. Log in to the vCenter Server.
3. On the left-hand side, choose any ESXi Server, and then choose Configuration from the right-hand pane.
4. Click on the Networking section and select the vSwitch for which you want to change the load balancing and failover settings. You may wish to override these settings at the port group level as well.
5. Click on Properties.
6. Select the vSwitch and click on Edit.
7. Go to the NIC Teaming tab.
8. Select one of the available policies from the Load Balancing drop-down menu.
9. Select one of the available policies from the Network Failover Detection drop-down menu.
10. Click on OK to apply the changes.

How it works...

Route based on the originating virtual switch port ID (default)

In this configuration, load balancing is based on the number of physical network cards and the number of virtual ports used.
With this configuration policy, a virtual network card connected to a vSwitch port will always use the same physical network card. If a physical network card fails, the virtual network card is redirected to another physical network card.

You typically do not see the individual ports on a vSwitch. However, each vNIC that gets connected to a vSwitch implicitly uses a particular port on the vSwitch. (There is no reason to ever configure which port, because that is always done automatically.)

This policy does a reasonable job of balancing your egress uplinks for the traffic leaving an ESXi host, as long as all the virtual machines using these uplinks have similar usage patterns.

It is important to note that port allocation occurs only when a VM is started or when a failover occurs. Balancing is based on a port's occupation rate at the time the VM starts up. This means that the pNIC used by a VM is determined when the VM powers on, based on which ports in the vSwitch are occupied at that time. For example, if you started 20 VMs in a row on a vSwitch with two pNICs, the odd-numbered VMs would use the left pNIC and the even-numbered VMs would use the right pNIC. That assignment would persist even if you shut down all the even-numbered VMs: the left pNIC would have all the running VMs and the right pNIC would have none. It might also happen that two heavily loaded VMs are connected to the same pNIC, in which case the load is not balanced.

This policy is the simplest one, and we generally recommend the simplest option because it keeps operations simple.

When speaking of this policy, it is important to understand that if, for example, a team is created with two 1 Gbps cards and one VM consumes more than one card's capacity, a performance problem will arise, because traffic greater than 1 Gbps will not go through the other card, and there will be an impact on the VMs sharing the same pNIC as the VM consuming all the resources.
Likewise, if two VMs each wish to use 600 Mbps and they happen to go to the first pNIC, the first pNIC cannot meet the 1.2 Gbps demand no matter how idle the second pNIC is.

Route based on source MAC hash

The principle is the same as for the default policy, but it is based on the number of MAC addresses. This policy may put several VM vNICs on the same physical uplink, depending on how the MAC hash is resolved.

For MAC hash, VMware has a different way of assigning ports. The assignment is not based on the dynamically changing port (after a power off and power on, the VM usually gets a different vSwitch port assigned), but on the fixed MAC address. As a result, a VM is always assigned to the same physical NIC unless the configuration is changed. With the port ID policy, the VM could get a different pNIC after a reboot or vMotion.

If you have two ESXi Servers with the same configuration, the VM will stay on the same pNIC number even after a vMotion. But again, one pNIC may be congested while others are idle, so there is no real load balancing.

Route based on IP hash

The limitation of the two previously discussed policies is that a given virtual NIC will always use the same physical network card for all its traffic. IP hash-based load balancing uses the source and destination IP addresses to determine which physical network card to use. Using this algorithm, a VM can communicate through several different physical network cards, depending on the destination.

This option requires the physical switch's ports to be configured as an EtherChannel. Because the physical switch is configured similarly, this option is the only one that also provides inbound load distribution, although the distribution is not necessarily balanced.

There are some limitations, which are also the reasons why this policy is not commonly used:

The route based on IP hash load balancing option involves added complexity and configuration support from upstream switches.
Link Aggregation Control Protocol (LACP) or EtherChannel is required for this algorithm to be used. However, this does not apply to a vSphere Standard Switch.
For IP hash to be an effective load balancing algorithm, there must be many IP sources and destinations. This is not common for IP storage networks, where a single VMkernel port is used to access a single IP address on a storage device.
The same vNIC will always send all its traffic for a given destination (for example, google.com) through the same pNIC, though traffic for another destination (for example, bing.com) might go through another pNIC.

So, in a nutshell, due to the added complexity, the upstream dependency on advanced switch configuration, and the management overhead, this configuration is rarely used in production environments. The main reason is that if you use IP hash, the pSwitch must be configured with LACP or EtherChannel. Also, if you use LACP or EtherChannel, the load balancing algorithm must be IP hash. This is because with LACP, inbound traffic to the VM could come through either of the pNICs, and the vSwitch must be ready to deliver it to the VM. Only IP hash will do that; the other policies will drop inbound traffic to the VM that arrives on a pNIC that the VM doesn't use.

There are only two failover detection options:

Link status only

The link status option enables the detection of failures related to the physical network's cables and switch. However, be aware that configuration issues are not detected. This option also cannot detect link state problems with upstream switches; it works only with the first-hop switch from the host.

Beacon probing

The beacon probing option allows the detection of failures unseen by the link status option, by sending Ethernet broadcast frames through all the network cards.
These network frames allow the vSwitch to detect faulty configurations or upstream switch failures and force a failover if the ports are blocked. When using an inverted U physical network topology in conjunction with a dual-NIC server, it is recommended to enable link state tracking or a similar network feature in order to avoid traffic black holes.

According to VMware's best practices, it is recommended to have at least three cards before activating this functionality. However, if IP hash is going to be used, beacon probing should not be used for network failure detection, in order to avoid an ambiguous state caused by the limitation that a packet cannot hairpin on the port on which it was received. Beacon probing works by sending out and listening to beacon probes from the NICs in a team. If there are two NICs, each NIC will send out a probe and the other NIC will receive that probe. Because an EtherChannel is considered one link, this will not function properly, as the NIC uplinks are not logically separate uplinks. If beacon probing is used in this case, it can result in MAC address flapping errors, and network connectivity may be interrupted.

Designing a network for load balancing and failover for vSphere Distributed Switch

The load balancing and failover policies that are chosen for the infrastructure can have an impact on the overall design. Using NIC teaming, we can group several physical network adapters attached to a vSwitch. This grouping enables load balancing between the different physical NICs and provides fault tolerance if a card failure occurs.

The vSphere Distributed Switch offers a load balancing option that actually takes the network workload into account when choosing the physical uplink. This policy is called route based on physical NIC load, also known as Load Based Teaming (LBT). We recommend this load balancing option over the others when using a distributed vSwitch.
The benefits of using this load balancing policy are as follows:

It is the only load balancing option that actually considers NIC load when choosing uplinks.
It does not require upstream switch configuration, unlike the route based on IP hash algorithm.
When route based on physical NIC load is combined with Network I/O Control, a truly dynamic traffic distribution is achieved.

Getting ready

To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

To change the load balancing policy, select the right one for your environment, and select the appropriate failover policy, follow these steps:

1. Open the VMware vSphere Client.
2. Log in to the vCenter Server.
3. Navigate to Networking on the home screen.
4. Navigate to a Distributed Port group, right-click it, and select Edit Settings.
5. Click on the Teaming and Failover section.
6. From the Load Balancing drop-down menu, select Route Based on physical NIC load as the load balancing policy.
7. Choose the appropriate network failover detection policy from the drop-down menu.
8. Click on OK to apply your settings.

How it works...

Load based teaming, also known as route based on physical NIC load, maps vNICs to pNICs and remaps the vNIC-to-pNIC affiliation if the load exceeds specific thresholds on a pNIC. LBT uses the originating port ID load balancing algorithm for the initial port assignment, which results in the first vNIC being affiliated to the first pNIC, the second vNIC to the second pNIC, and so on. Once the initial placement is done after the VM is powered on, LBT examines both the inbound and outbound traffic on each of the pNICs and then redistributes the load if there is congestion.

LBT will send a congestion alert when the average utilization of a pNIC exceeds 75 percent over a period of 30 seconds.
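The congestion check described above can be sketched in a few lines of JavaScript. This is only an illustrative model, not VMware's implementation: the 75 percent threshold and the 30-second window come from the text, while the function names, the one-sample-per-second assumption, and the data shapes are hypothetical.

```javascript
// Hypothetical model of the LBT congestion check (not VMware code).
// Threshold and window come from the article; everything else is assumed.
const CONGESTION_THRESHOLD = 0.75; // fraction of pNIC capacity
const WINDOW_SECONDS = 30;         // assumes one utilization sample per second

// Mean of the most recent WINDOW_SECONDS samples (each between 0 and 1).
// Expects a non-empty array.
function meanUtilization(samples) {
  const recent = samples.slice(-WINDOW_SECONDS);
  return recent.reduce((sum, u) => sum + u, 0) / recent.length;
}

// A pNIC counts as congested once a full window averages above the threshold.
function isCongested(samples) {
  if (samples.length < WINDOW_SECONDS) return false; // not enough history yet
  return meanUtilization(samples) > CONGESTION_THRESHOLD;
}

// Name of the uplink with the lowest recent mean utilization, that is,
// a candidate target when a vNIC has to be remapped.
// Expects an object with at least one uplink.
function leastLoadedUplink(uplinks) {
  return Object.keys(uplinks).reduce((best, name) =>
    meanUtilization(uplinks[name]) < meanUtilization(uplinks[best]) ? name : best
  );
}
```

For example, with uplinks named vmnic0 and vmnic1 (illustrative names), a vmnic0 history of thirty samples at 0.9 would be flagged as congested, and vmnic1 at 0.1 would be returned as the remapping candidate.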
The 30-second interval is used to avoid MAC flapping issues. However, you should enable PortFast on the upstream switches if you plan to use STP.

VMware recommends LBT over IP hash when you use a vSphere Distributed Switch, as it does not require any special or additional settings in the upstream switch layer. In this way you can reduce unnecessary operational complexity. LBT maps vNICs to pNICs and then distributes the load across all the available uplinks, unlike IP hash, which just maps the vNIC to a pNIC but does not do load distribution. So, with IP hash, it may happen that while a VM with high network I/O is sending traffic through pNIC0, another VM is also mapped to the same pNIC and sends its traffic there.

What to know when offloading checksum

VMware takes advantage of many of the performance features of modern network adapters. In this section we are going to talk about two of them:

TCP checksum offload
TCP segmentation offload

Getting ready

To step through this recipe, you will need a running ESXi Server and an SSH client (PuTTY). No other prerequisites are required.

How to do it...

The list of network adapter features that are enabled on your NIC can be found in the file /etc/vmware/esx.conf on your ESXi Server. Look for the lines that start with /net/vswitch.

However, do not change the default NIC driver settings unless you have a valid reason to do so. A good practice is to follow any configuration recommendations that are specified by the hardware vendor. Carry out the following steps in order to check the settings:

1. Open your SSH client and connect to your ESXi host.
2. Open the file /etc/vmware/esx.conf.
3. Look for the lines that start with /net/vswitch.

Your output should look like the following screenshot:

How it works...

A TCP message must be broken down into Ethernet frames. The size of each frame is limited by the maximum transmission unit (MTU). The default maximum transmission unit is 1500 bytes.
The process of breaking messages into frames is called segmentation.

Modern NIC adapters have the ability to perform checksum calculations natively. TCP checksums are used to determine the validity of transmitted or received network packets based on an error-detecting code. These calculations are traditionally performed by the host's CPU. By offloading these calculations to the network adapters, the CPU is freed up to perform other tasks. As a result, the system as a whole runs better.

TCP segmentation offload (TSO) allows the TCP/IP stack of the guest OS inside the VM to emit large frames (up to 64 KB) even though the MTU of the interface is smaller.

Earlier operating systems used the CPU to perform segmentation. Modern NICs try to optimize TCP segmentation by using a larger segment size as well as offloading work from the CPU to the NIC hardware. ESXi utilizes this concept to provide a virtual NIC with TSO support, without requiring specialized network hardware.

With TSO, instead of processing many small MTU-sized frames during transmission, the system can send fewer, larger virtual MTU frames. TSO improves performance for the TCP network traffic coming from a virtual machine and for network traffic sent out of the server.

TSO is supported at the virtual machine level and in the VMkernel TCP/IP stack. TSO is enabled on the VMkernel interface by default. If TSO becomes disabled for a particular VMkernel interface, the only way to enable TSO is to delete that VMkernel interface and recreate it with TSO enabled.

TSO is used in the guest when the VMXNET 2 (or later) network adapter is installed. To enable TSO at the virtual machine level, you must replace the existing VMXNET or Flexible virtual network adapter with a VMXNET 2 (or later) adapter. This replacement might result in a change in the MAC address of the virtual network adapter.

Selecting the correct virtual network adapter

When you configure a virtual machine, you can add NICs and specify the adapter type.
The types of network adapters that are available depend on the following factors:

The version of the virtual machine, which depends on which host created it or most recently updated it.
Whether or not the virtual machine has been updated to the latest version for the current host.
The guest operating system.

The following virtual NIC types are supported:

Vlance
VMXNET
Flexible
E1000
Enhanced VMXNET (VMXNET 2)
VMXNET 3

If you want to know more about these network adapter types, refer to the following KB article: http://kb.vmware.com/kb/1001805

Getting ready

To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

There are two ways to choose a particular virtual network adapter: while creating a new VM, or while adding a new network adapter to an existing VM.

To choose a network adapter while creating a new VM, follow these steps:

1. Open vSphere Client.
2. Log in to the vCenter Server.
3. Click on the File menu, and navigate to New | Virtual Machine.
4. Go through the wizard until you reach the step where you create network connections. Here you choose how many network adapters you need, which port group each of them connects to, and the adapter type.

To choose an adapter type while adding a new network interface to an existing VM, follow these steps:

1. Open vSphere Client.
2. Log in to the vCenter Server.
3. Navigate to VMs and Templates on your home screen.
4. Select the existing VM to which you want to add a new network adapter, right-click it, and select Edit Settings.
5. Click on the Add button.
6. Select Ethernet Adapter.
7. Select the adapter type and the network to which you want this adapter to connect.
8. Click on Next and then click on Finish.

How it works...

Among all the supported virtual network adapter types, VMXNET is the paravirtualized device driver for virtual networking.
The VMXNET driver implements an idealized network interface that passes network traffic from the virtual machine to the physical cards with minimal overhead. The three versions of VMXNET are VMXNET, VMXNET 2 (Enhanced VMXNET), and VMXNET 3.

The VMXNET driver improves performance through a number of optimizations:

Shares a ring buffer between the virtual machine and the VMkernel, and uses zero copy, which in turn saves CPU cycles. Zero copy improves performance by having the virtual machines and the VMkernel share a buffer, reducing the internal copy operations between buffers to free up CPU cycles.
Takes advantage of transmission packet coalescing to reduce address space switching.
Batches packets and issues a single interrupt, rather than issuing multiple interrupts. This improves efficiency, but in some cases with slow packet-sending rates, it could hurt throughput while waiting to get enough packets to actually send.
Offloads TCP checksum calculation to the network hardware rather than using the CPU resources of the virtual machine monitor.

Use VMXNET 3 if you can, or the most recent model you can. Use VMware Tools where possible. For certain unusual types of network traffic, the generally best model isn't always optimal; if you have poor network performance, experiment with other types of vNICs to see which performs best.
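To recap the three vSphere Standard Switch teaming policies discussed earlier in this article, the following JavaScript sketch shows how each one could pick an uplink. It is a deliberately simplified illustration and should not be read as the actual ESXi algorithms: the modulo and XOR hashing, the function names, and the sample addresses are all assumptions made for the example.

```javascript
// Simplified, hypothetical uplink selection for the three teaming policies.
// Each function returns the index of the pNIC (0 .. uplinkCount - 1).

// Port ID: the vSwitch port number alone decides the uplink, so a vNIC
// stays on one pNIC for as long as it keeps its port.
function uplinkByPortId(portId, uplinkCount) {
  return portId % uplinkCount;
}

// Source MAC hash: the fixed MAC address decides the uplink, so the choice
// survives a power cycle. Here only the last byte of the MAC is hashed.
function uplinkByMacHash(mac, uplinkCount) {
  const lastByte = parseInt(mac.split(":").pop(), 16);
  return lastByte % uplinkCount;
}

// IP hash: source and destination IPv4 addresses together decide the uplink,
// so one vNIC can use different pNICs for different destinations.
function uplinkByIpHash(srcIp, dstIp, uplinkCount) {
  const lastOctet = (ip) => Number(ip.split(".").pop());
  return (lastOctet(srcIp) ^ lastOctet(dstIp)) % uplinkCount;
}
```

With two uplinks, uplinkByIpHash("10.0.0.5", "10.0.0.20", 2) and uplinkByIpHash("10.0.0.5", "10.0.0.21", 2) return different indexes in this model, whereas the port ID and MAC hash functions always return the same index for a given vNIC. This mirrors the distinction drawn in the text: only IP hash lets a single vNIC spread its traffic over several pNICs.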


Packt

14 Oct 2009

8 min read

Developing Wiki Seek Widget Using Javascript


If you’re searching for details of a particular term in Google, you’re most probably going to see a link to relevant articles from wikipedia.org in the top 10 results. Wikipedia is the largest encyclopedia on the Internet, and contains huge collections of articles in many languages. The most significant feature of this encyclopedia is that it is a Wiki, so anybody can contribute to the knowledge base.

A Wiki (a Web 2.0 concept) is a collection of web pages whose content can be created and changed by the visitors of the page using a simplified markup language. Wikis are usually used as knowledge management systems on the web.

Brief Introduction to Wikipedia

Wikipedia has defined itself as:

… a free, multilingual, open content encyclopedia project operated by the United States-based non-profit Wikimedia Foundation.

Wikipedia is built upon an open source wiki package called MediaWiki. MediaWiki uses PHP as its server-side scripting language and MySQL as its database. Wikipedia uses MediaWiki’s wikitext format for editing the text, so users can edit articles easily without any knowledge of HTML and CSS.

The Wikitext language (also called Wiki Markup) is a markup language that gives instructions on how the output text will be displayed. It provides a simplified approach to writing pages in a wiki website. Different types of wiki software employ different styles of Wikitext language. For example, every Wikitext markup language has a way to hyperlink pages within the website, but a number of different syntaxes are available for creating such links.

Wikipedia was launched by Jimmy Wales and Larry Sanger in 2001 as a means of collecting and summarizing human knowledge in every major language. As of April 2008, Wikipedia had over 10 million articles in 253 languages. With so many articles, it is the largest encyclopedia ever assembled. Wikipedia articles are written collaboratively by volunteers, and any visitor can modify the content of an article.
Any modification must be accepted by the editors of Wikipedia; otherwise, the article will be reverted to its previous content. Along with its popularity, Wikipedia is also criticized for systematic bias and inconsistency, since the modifications must be cleared by the editors. Critics also argue that its open nature and the lack of proper sources for many articles make it unreliable.

Searching in Wikipedia

To search for a particular article in Wikipedia, you can use the search box on the home page of wikipedia.org. Wikipedia classifies its articles in different sub-domains according to language; “en.wikipedia.org” contains articles in English, whereas “es.wikipedia.org” contains Spanish articles. Whenever you select “English” in the dropdown box, the related articles will be searched on “en.wikipedia.org”, and so on for the other languages.

You can also search the articles of Wikipedia from a remote server. For this, you have to send the language and search parameters to http://www.wikipedia.org/search-redirect.php via the GET method.

Creating a Wiki Seek Widget

Up till now, we’ve looked at the background concept of Wikipedia. Now, let’s start building the widget. This widget contains a form with three components: a textbox where the visitor enters the search keyword, a dropdown list which contains the language of the article, and finally a submit button to search the articles of Wikipedia. By the time we’re done, you should have a widget that looks like this:

Concept for creating form

Before looking at the JavaScript code, let’s first understand the architecture of the form and the parameters to be sent for searching Wikipedia. The request should be sent to http://www.wikipedia.org/search-redirect.php via the GET method.

    <form action="http://www.wikipedia.org/search-redirect.php"></form>

If you don’t specify the method attribute in the form, the form uses GET, which is the default method.
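To make the GET request concrete, here is a small sketch of the URL that the browser would end up requesting when such a form is submitted. It assumes the two query parameters are named search and language, and the helper name buildWikipediaSearchUrl is hypothetical; it relies only on the standard URLSearchParams API.

```javascript
// Hypothetical helper: builds the URL a GET form submission would request.
// Assumes the parameter names "search" and "language".
function buildWikipediaSearchUrl(search, language) {
  // URLSearchParams takes care of encoding (spaces become "+").
  const params = new URLSearchParams({ search: search, language: language });
  return "http://www.wikipedia.org/search-redirect.php?" + params.toString();
}

// Example: search the English Wikipedia for "JavaScript widget".
const exampleUrl = buildWikipediaSearchUrl("JavaScript widget", "en");
// exampleUrl is
// "http://www.wikipedia.org/search-redirect.php?search=JavaScript+widget&language=en"
```

With a GET form, the browser assembles this query string from the named form fields on its own, so the widget itself never needs to build the URL by hand; the sketch is only meant to show what is sent.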
After creating the form element, we need to add a textbox inside the form, named search, because we have to send the search keyword in the search parameter.

    <input type="text" name="search" size="20" />

After adding the textbox for the search keyword, we need to add the dropdown list which contains the language of the article to search. The name of this dropdown list should be language, as we have to send the language code to the above URL in the language parameter. These language codes are two- or three-letter codes specified by ISO. ISO has assigned three-letter language codes to most of the popular languages of the world, and a few languages are represented by two-letter ISO codes. For example, eng and en are the three- and two-letter language codes for English.

Some of the article languages of Wikipedia don’t have ISO codes, and you have to find the value of the language parameter from Wikipedia. For example, the code for articles in the Alemannisch language is als.

Here is the HTML code for constructing a dropdown list of major languages:

    <select name="language">
      <option value="de">Deutsch</option>
      <option value="en" selected="selected">English</option>
      <option value="es">Español</option>
      <option value="eo">Esperanto</option>
      <option value="fr">Français</option>
      <option value="it">Italiano</option>
      <option value="hu">Magyar</option>
      <option value="nl">Nederlands</option>
    </select>

As you can see in the above dropdown list, English is the default language selected. Now, we just need to add a submit button to complete the form for searching articles in Wikipedia.

    <input type="submit" name="go" value="Search" title="Search in wikipedia" />

Put all the HTML code together to create the form.

JavaScript Code

As we’ve already covered the background concept of the HTML form, we just have to use document.write() to output the HTML to the web browser.
Here is the JavaScript code to create the Wiki Seek Widget:

    document.write('<div>');
    document.write('<form action="http://www.wikipedia.org/search-redirect.php" >');
    document.write('<input type="text" name="search" size="20" />');
    document.write(' <select name="language">');
    document.write('<option value="de" >Deutsch</option>');
    document.write('<option value="en" selected="selected">English</option>');
    document.write('<option value="es" >Español</option>');
    document.write('<option value="eo" >Esperanto</option>');
    document.write('<option value="fr" >Français</option>');
    document.write('<option value="it" >Italiano</option>');
    document.write('<option value="hu" >Magyar</option>');
    document.write('<option value="nl" >Nederlands</option>');
    document.write('</select>');
    document.write(' <input type="submit" name="go" value="Search" title="Search in wikipedia" />');
    document.write('</form>');
    document.write('</div>');

In the above code, I’ve used a division (div) as the container for the HTML form. I’ve also saved the above code in a wiki_seek.js file.

The above JavaScript code displays an unstyled widget. To style the widget, you can use the style attribute on the input elements of the form.

Using Wiki Seek widget

To use this Wiki Seek widget, we have to follow these steps:

1. First of all, we need to upload the above wiki_seek.js to a web server so that it can be used by the client websites. Let’s suppose that it is uploaded and available at the URL http://www.widget-server.com/wiki_seek.js
2. Now, we can use the widget in any web page by placing the following JavaScript code in the page:

    <script type="text/javascript" language="javascript" src="http://www.widget-server.com/wiki_seek.js"></script>

The Wiki Seek widget is displayed wherever in the web page you place the above code.
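As a possible variation on the wiki_seek.js code above, the same markup can be assembled from a language table by a single function, which makes it easier to add or remove languages. This is only a sketch of an alternative, not the article's original code: the names LANGUAGES and buildWikiSeekHtml are hypothetical, and the markup it produces is meant to match the form described above.

```javascript
// Hypothetical alternative: build the widget markup from a language table.
const LANGUAGES = [
  ["de", "Deutsch"], ["en", "English"], ["es", "Español"], ["eo", "Esperanto"],
  ["fr", "Français"], ["it", "Italiano"], ["hu", "Magyar"], ["nl", "Nederlands"],
];

// Returns the widget HTML as a string; defaultCode is the preselected language.
function buildWikiSeekHtml(languages, defaultCode) {
  const options = languages
    .map(([code, label]) => {
      const selected = code === defaultCode ? ' selected="selected"' : "";
      return `<option value="${code}"${selected}>${label}</option>`;
    })
    .join("");
  return (
    '<div><form action="http://www.wikipedia.org/search-redirect.php">' +
    '<input type="text" name="search" size="20" /> ' +
    `<select name="language">${options}</select> ` +
    '<input type="submit" name="go" value="Search" title="Search in wikipedia" />' +
    "</form></div>"
  );
}

// In the browser, wiki_seek.js could then end with a single call:
//   document.write(buildWikiSeekHtml(LANGUAGES, "en"));
```

Keeping the markup in a pure function also means it can be checked outside the browser, since document.write is only needed at the very last step.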
