
How-To Tutorials - Programming

1081 Articles
Packt
23 Oct 2009
13 min read

Business Process Modeling

Modeling Business Processes

The transparency of the process flow is crucial, as it gives the process owners, process analysts, and all others involved an insight into what is going on. An understanding of the as-is process flow also ensures that we can judge the efficiency and the quality of the process. The main objective of process modeling is the definition of the as-is process flow. Process modeling needs to answer the following questions:

  • What is the outcome of the business process?
  • What activities are performed within the business process?
  • What is the order of activities?
  • Who performs the activities?
  • Which business documents are exchanged within the process?
  • How foolproof is the process, and how can it be extended in the future?

After answering these and some other questions, we get a good insight into how the process works. We can also identify structural, organizational, and technological weak points and even bottlenecks, and identify potential improvements to the process.

We will model business processes to satisfy the following objectives:

  • To specify the exact result of the business process, and to understand the business value of this result.
  • To understand the activities of the business process. Knowing the exact tasks and activities that have to be performed is crucial to understanding the details of the process.
  • To understand the order of activities. Activities can be performed in sequence or in parallel, and running activities in parallel can reduce the overall time required to fulfill a business process. Activities can be short-running or long-running.
  • To understand the responsibilities, that is, to identify (and later supervise) who is responsible for which activities and tasks.
  • To understand the utilization of resources consumed in the business process. Knowing who uses which resources can help improve the utilization of resources, as resource requirements can be planned for and optimized.
  • To understand the relationship between people involved in the processes, and their communication. Knowing exactly who communicates with whom is important and can help to organize and optimize communications.
  • To understand the document flow. Business processes produce and consume documents (regardless of whether these are paper or electronic documents). Understanding where the documents are going, and where they are coming from, is important. A good overview of the documents also gives us the opportunity to identify whether all of the documents are really necessary.
  • To identify potential bottlenecks and points of improvement, which can be used later in the process optimization phase.
  • To introduce quality standards such as ISO 9001 more successfully, and to better pass certification.
  • To improve the understandability of quality regulations, which can be supplemented with process diagrams.
  • To use business process models as work guidelines for new employees, who can familiarize themselves with the business processes faster and more efficiently.
  • To understand business processes, which will enable us to understand and describe the company as a whole.

A good understanding of business processes is very important for developing IT support. Applications that provide end-to-end support for business processes can be developed efficiently only if we understand the business processes in detail.

Modeling Method and Notation

Efficient process modeling requires a modeling method that provides a structured and controlled approach to process modeling. Several modeling methods have been developed over the years. Examples include IDS Scheer's ARIS methodology, CSC's Catalyst, Business Genetics, SCOR and the extensions PCOR and VCOR, POEM, and so on. The ARIS methodology has been the most popular methodology, and has been adopted by many software vendors. In the next section, we will describe the basics of the ARIS methodology, which has recently been adapted to conform to SOA.
ARIS

ARIS is both a BPM methodology and an architectural framework for designing enterprise architectures. Enterprise architecture combines business models (process models, organizational models, and so on) with IT models (IT architecture, data model, and so on). ARIS stands for Architecture of Integrated Information Systems and comprises two things: the methodology and framework, and the software that supports both. Here, we will give a brief introduction to the ARIS methodology and framework, which dates back to 1992.

The objective of ARIS is to narrow the gap between business requirements and IT. The ARIS framework is not only about process models (describing business processes), although process models are one of the most important parts of ARIS. As enterprise architecture is complex, ARIS defines several views that focus on specific aspects such as business, technology, information, and so on, to reduce the complexity. The ARIS framework describes the following:

  • Business processes
  • Products and services related to the processes
  • The structure of the organization
  • Business objectives and strategies
  • Information flows
  • IT architecture and applications
  • The data model
  • Resources (people and hardware resources)
  • Costs
  • Skills and knowledge

These views are gathered under the concept of ARIS House, which provides a structured view of all information on business processes. ARIS House offers five views:

  • The process view (also called the control view) is the central view that shows the behavior of the processes, and how the processes relate to the products and services, organization, functions, and data. The process view includes the process models in the selected notation, and other diagrams such as information flow, material flow, value chains, communication diagrams, and so on.
  • The product and service view shows the products and services, their structures, relations, and product/service trees.
  • The organizational view shows the organizational structure of the company, including departments, roles, and employees. It shows these in hierarchical organizational charts. The organizational view also shows technical resources and communication networks.
  • The function view defines process tasks and describes business objectives, function hierarchies, and application software.
  • The data view shows business data and information. This view includes data models, information maps, database models, and knowledge structures.

The ARIS House is illustrated in the following figure:

In ARIS House, the process view is the central view of the dynamic behavior of the business processes and brings together the other four static views: the organizational view, data view, function view, and product/service view. In this book, we will focus primarily on the process view.

Each ARIS view is divided further into phases. The translation of business requirements into IT applications requires that we follow certain phases. Globally, three general phases are likely to be used:

  • Requirements phase
  • Design specification phase
  • Implementation phase

ARIS is particularly strong in the requirements phase, while other phases may differ depending on the implementation method and the architecture we use. We will talk about these later in this article. Let us now look at the other important aspect, the business process modeling notations.

Modeling Notation

Process modeling also requires a notation. In the past, several notations were used to model processes. Flow diagrams and block diagrams were representatives of the first-generation notations. Then, more sophisticated notations were defined, such as EPC (Event-driven Process Chain) and eEPC (extended Event-driven Process Chain). UML activity diagrams, XPDL, and IDEF3 were also used, in addition to some other lesser-known notations. A few years ago a new notation, called Business Process Modeling Notation (BPMN), was developed.
BPMN was developed particularly for modeling business processes in accordance with SOA. In this article, we will use BPMN for modeling processes.

BPMN

BPMN is the most comprehensive notation for process modeling so far. It has been developed under the auspices of the OMG (Object Management Group). Let us take a brief look at the most important BPMN elements so that we can read the diagrams presented later in this article.

The most important goals while designing BPMN have been:

  • To develop a notation that is understandable at all levels: Business process modeling involves different people, from business users, business analysts, and process owners, to the technical architects and developers. The management reviews business processes at periodic intervals. Therefore, the goal of BPMN has been to provide a graphical notation that is simple to understand, yet powerful enough to model business processes at the required level of detail.
  • To enable automatic transformation into executable code, that is, BPEL, and vice versa: The gap between the business process models and the information technology (application software) has been quite large in existing technologies. There is no clear definition of how one relates to the other. Therefore, BPMN has been designed specifically to provide such transformations.

To model the diagrams, BPMN defines four categories of elements:

  • Flow objects, which are activities, events, and gateways. Activities can be tasks or sub-processes. Events can be triggers or results. Three types of events are supported: start, intermediate, and end. Gateways control the divergence of sequential flows into concurrent flows, and their convergence back to sequential flow.
  • Connecting objects are used to connect flow objects together. Connectors are sequence flows, message flows, and associations.
  • Swim lanes are used to organize activities into visual categories in order to illustrate different responsibilities or functional capabilities.
    Pools and lanes can be used for swim lanes.
  • Artifacts are used to add specific context to the business processes that are being modeled. Data objects are used to show how data is produced or required by the process. Groups are used to group together similar activities or other elements. Annotations are used to add text information to the diagram. We can also define custom artifacts.

The following diagrams show the various notations used in BPMN:

Activities are the basic elements of BPMN and are represented by rectangles with rounded corners. A plus sign denotes that the activity can be further decomposed:

Decisions are shown as diamonds. A plus sign inside the diamond denotes a logical AND, while an X denotes an exclusive OR (XOR):

Events are shown as double circles:

Roles are shown as pools and swim lanes within pools:

A document is shown as follows:

The order of activities is indicated by an arrow:

The flow of a document or information is shown with a dashed line:

BPMN can be used to model parts of processes or whole processes. Processes can be modeled at different levels of fidelity. BPMN is equally suitable for internal (private) business processes, and for public (collaborative) business-to-business processes.

Internal business processes focus on the point of view of a single company, and define activities that are internal to the company. Such processes might also define interactions with external partners. Public collaborative processes show the interaction between all involved businesses and organizations. Such process models should be modeled from the general point of view, and should show interactions between the participants.

Process Design

The main activity in process design is the recording of the actual processes. The objective is to develop the as-is process model. To develop the as-is model, it is necessary to gather all knowledge about the process. This knowledge often exists only in the heads of the employees who are involved in the process.
Therefore, it is necessary to perform detailed interviews with all the people involved. Often, process supervisors might think that they know exactly how the process is performed. However, after talking with the employees who really carry out the work, they see that the actual situation differs considerably. It is very important to gather all this information about the process; otherwise, it will not be possible to develop a sound process model that reflects the as-is state of the process.

The first question related to the as-is model is the business result that the process generates. Understanding the business result is crucial, as sometimes it may not be clearly articulated. After the business result is identified, we should understand the process flow. The process flow consists of activities (or tasks) that are performed in a certain order. The process flow is modeled at various levels of abstraction. At the highest level of abstraction, the process flow shows only the most important activities (usually up to ten). Each of the top-level activities is then decomposed into detailed flows. The process complexity and the required level of detail are the criteria that tell us how deep we should decompose. To understand the process behavior completely, it makes sense to decompose until atomic activities (that is, activities that cannot be further decomposed) are reached.

When developing the as-is process model, one of the most important things to consider is the level of detail. In order to provide end-to-end support for business processes using SOA, detailed process modeling should be done. The difficulties often hide in the details! In the process design, we should understand the detailed structure of the business process.
Therefore, we should identify at least the following:

  • Process activities at various levels of detail
  • Roles responsible for carrying out each process activity
  • Events that trigger the process execution and events that interrupt the process flow
  • Documents exchanged within the process. This includes input documents and output documents
  • Business rules that are part of the process

We should design the usual (also called optimal) process flow and identify possible exception scenarios. Exceptions interrupt the usual process flow. Therefore, we need to specify how the exceptions will be handled. The usual approach to the process design includes the following steps:

  • Identifying the roles
  • Identifying the activities
  • Connecting activities to roles
  • Defining the order of activities
  • Adding events
  • Adding documents

We should also understand the efficiency of the business process. This includes resource utilization, the time taken by involved employees, possible bottlenecks, and inefficiencies. This is the reason why we should also identify metrics that are used to measure the efficiency of the process. While some of these metrics may be KPIs, other metrics relevant to the process should also be identified.

We should identify whether the process is compliant with standards or reference processes. In some industry domains, reference processes have been defined. An example is the telecommunications industry, where the TMF (TeleManagement Forum) has defined NGOSS. Part of NGOSS is eTOM (enhanced Telecom Operations Map), which specifies compliant business processes for telecom companies. Other industries have also started to develop similar reference processes.

We should also identify the business goals to which the process contributes. Business goals are not the same as the process results. A business process should not only have at least one result, but should also contribute to at least one (preferably more than one) business goal.
Here, we can look into the company strategy to identify the business goals. We should also identify the events that can interrupt the process flow. Each process can be interrupted, and we should understand how this happens. If a process is interrupted, we might need to compensate those activities of the process that have already been successfully completed. Therefore, we should also specify the compensation logic related to different interruption events. Finally, we should also understand the current software support for the business process. This is important because existing software may hide the details of process behavior. This information can also be re-used for end-to-end process support. Once we have identified all of these artifacts, we will have gathered a good understanding of the process. Therefore, let us now look at the results of the process modeling.

Packt
06 Feb 2015
22 min read

Multiplying Performance with Parallel Computing

In this article, by Aloysius Lim and William Tjhi, authors of the book R High Performance Programming, we will learn how to write and execute parallel R code, where different parts of the code run simultaneously. So far, we have learned various ways to optimize the performance of R programs running serially, that is, in a single process. This does not take full advantage of the computing power of modern CPUs with multiple cores. Parallel computing allows us to tap into all the computational resources available and to speed up the execution of R programs by many times. We will examine the different types of parallelism and how to implement them in R, and we will take a closer look at a few performance considerations when designing the parallel architecture of R programs.

Data parallelism versus task parallelism

Many modern software applications are designed to run computations in parallel in order to take advantage of the multiple CPU cores available on almost any computer today. Many R programs can similarly be written to run in parallel. However, the extent of possible parallelism depends on the computing task involved. On one side of the scale are embarrassingly parallel tasks, where there are no dependencies between the parallel subtasks; such tasks can be made to run in parallel very easily. An example is building an ensemble of decision trees in a random forest algorithm: randomized decision trees can be built independently of one another and in parallel across tens or hundreds of CPUs, and can then be combined to form the random forest. On the other end of the scale are tasks that cannot be parallelized, as each step of the task depends on the results of the previous step. One such example is a depth-first search of a tree, where the subtree to search at each step depends on the path taken in previous steps.
Most algorithms fall somewhere in between, with some steps that must run serially and some that can run in parallel. With this in mind, careful thought must be given to designing parallel code that works correctly and efficiently.

Often an R program has some parts that have to be run serially and other parts that can run in parallel. Before making the effort to parallelize any of the R code, it is useful to have an estimate of the potential performance gains that can be achieved. Amdahl's law provides a way to estimate the best attainable performance gain when you convert code from serial to parallel execution. It divides a computing task into its serial and potentially parallel parts and states that the time needed to execute the task in parallel will be no less than this formula:

    T(n) = T(1)(P + (1-P)/n)

where:

  • T(n) is the time taken to execute the task using n parallel processes
  • P is the proportion of the whole task that is strictly serial

The theoretical best possible speedup of the parallel algorithm is thus:

    S(n) = T(1) / T(n) = 1 / (P + (1-P)/n)

For example, given a task that takes 10 seconds to execute on one processor, where half of the task can be run in parallel, the best possible time to run it on four processors is T(4) = 10(0.5 + (1-0.5)/4) = 6.25 seconds. The theoretical best possible speedup of the parallel algorithm with four processors is 1 / (0.5 + (1-0.5)/4) = 1.6x.

The following figure shows you how the theoretical best possible execution time decreases as more CPU cores are added. Notice that the execution time levels off just above five seconds. This corresponds to the half of the task that must be run serially, where parallelism does not help.

Figure: Best possible execution time versus number of CPU cores

In general, Amdahl's law means that the fastest execution time for any parallelized algorithm is limited by the time needed for the serial portions of the algorithm.
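The two formulas are easy to evaluate directly in R. The following is a minimal sketch, not code from the book; the function names amdahl_time() and amdahl_speedup() are hypothetical helpers introduced here for illustration. It reproduces the worked example (10 seconds, half of the task serial) and shows how the best possible time flattens out as cores are added.

```r
# Amdahl's law: best-case time and speedup with n parallel processes.
# t1: execution time on one processor
# p:  proportion of the task that is strictly serial
# (Both helper names are illustrative, not part of any package.)
amdahl_time <- function(t1, p, n) t1 * (p + (1 - p) / n)
amdahl_speedup <- function(p, n) 1 / (p + (1 - p) / n)

# Tabulate the worked example for a range of core counts
n <- c(1, 2, 4, 8, 16, 1024)
data.frame(
  cores   = n,
  time    = amdahl_time(10, 0.5, n),
  speedup = amdahl_speedup(0.5, n)
)
```

With four cores this gives 6.25 seconds and a 1.6x speedup, matching the figures in the text; even with 1,024 cores the time stays above the five seconds taken by the serial half.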
Bear in mind that Amdahl's law provides only a theoretical estimate. It does not account for the overheads of parallel computing (such as starting and coordinating tasks) and assumes that the parallel portions of the algorithm are infinitely scalable. In practice, these factors might significantly limit the performance gains of parallelism, so use Amdahl's law only to get a rough estimate of the maximum speedup possible.

There are two main classes of parallelism: data parallelism and task parallelism. Understanding these concepts helps to determine what types of tasks can be modified to run in parallel.

In data parallelism, a dataset is divided into multiple partitions. Different partitions are distributed to multiple processors, and the same task is executed on each partition of data. Take, for example, the task of finding the maximum value in a vector dataset, say one that has one billion numeric data points. A serial algorithm to do this would look like the following code, which iterates over every element of the data in sequence to search for the largest value. (This code is intentionally verbose to illustrate how the algorithm works; in practice, the max() function in R, though also serial in nature, is much faster.)

    serialmax <- function(data) {
        max = -Inf
        for (i in data) {
            if (i > max)
                max = i
        }
        return(max)
    }

One way to parallelize this algorithm is to split the data into partitions. If we have a computer with eight CPU cores, we can split the data into eight partitions of 125 million numbers each. Here is the pseudocode for how to perform the same task in parallel:

    # Run this in parallel across 8 CPU cores
    part.results <- run.in.parallel(serialmax(data.part))
    # Compute global max
    global.max <- serialmax(part.results)

This pseudocode runs eight instances of serialmax() in parallel, one for each data partition, to find the local maximum value in each partition.
Once all the partitions have been processed, the algorithm finds the global maximum value by finding the largest value among the local maxima. This parallel algorithm works because the global maximum of a dataset must be the largest of the local maxima from all the partitions.

The following figure depicts data parallelism pictorially. The key behind data parallel algorithms is that each partition of data can be processed independently of the other partitions, and the results from all the partitions can be combined to compute the final results. This is similar to the mechanism of the MapReduce framework from Hadoop. Data parallelism allows algorithms to scale up easily as data volume increases: as more data is added to the dataset, more computing nodes can be added to a cluster to process new partitions of data.

Figure: Data parallelism

Other examples of computations and algorithms that can be run in a data parallel way include:

  • Element-wise matrix operations such as addition and subtraction: The matrices can be partitioned and the operations are applied to each pair of partitions.
  • Means: The sums and number of elements in each partition can be added to find the global sum and number of elements, from which the mean can be computed.
  • K-means clustering: After data partitioning, the K centroids are distributed to all the partitions. Finding the closest centroid is performed in parallel and independently across the partitions. The centroids are updated by first calculating the sums and the counts of their respective members in parallel, and then consolidating them in a single process to get the global means.
  • Frequent itemset mining using the Partition algorithm: In the first pass, the frequent itemsets are mined from each partition of data to generate a global set of candidate itemsets; in the second pass, the supports of the candidate itemsets are summed from each partition to filter out the globally infrequent ones.
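To make the "Means" item concrete, here is a minimal sketch of a data parallel mean built on the parallel package that ships with R. The helper names partial_sum_count() and parallel_mean(), and the default of two workers, are assumptions chosen for illustration rather than anything prescribed by the article. Each worker returns the sum and count of its own partition, and the master process combines them.

```r
library(parallel)

# Worker task (hypothetical helper): partial sum and element count of one partition
partial_sum_count <- function(chunk) c(sum = sum(chunk), n = length(chunk))

# Data parallel mean (hypothetical helper); ncores = 2 is an arbitrary default
parallel_mean <- function(x, ncores = 2) {
  cl <- makeCluster(ncores)
  on.exit(stopCluster(cl))           # always shut the workers down

  # Partition the indices into one contiguous chunk per worker
  parts  <- clusterSplit(cl, seq_along(x))
  chunks <- lapply(parts, function(idx) x[idx])

  # Run the same task on every partition in parallel
  partials <- parLapply(cl, chunks, partial_sum_count)

  # Combine the partial results in the master process
  totals <- Reduce(`+`, partials)
  totals[["sum"]] / totals[["n"]]
}

parallel_mean(as.numeric(1:1000))
```

For a vector this small the cost of starting the workers far outweighs the computation, so the sketch only illustrates the partition, compute, and combine structure; it is not expected to be faster than mean().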
The other main class of parallelism is task parallelism, where tasks are distributed to and executed on different processors in parallel. The tasks on each processor might be the same or different, and the data that they act on might also be the same or different. The key difference between task parallelism and data parallelism is that the data is not divided into partitions. An example of a task parallel algorithm performing the same task on the same data is the training of a random forest model. A random forest is a collection of decision trees built independently on the same data. During the training process for a particular tree, a random subset of the data is chosen as the training set, and the variables to consider at each branch of the tree are also selected randomly. Hence, even though the same data is used, the trees are different from one another. In order to train a random forest of, say, 100 decision trees, the workload could be distributed to a computing cluster with 100 processors, with each processor building one tree. All the processors perform the same task on the same data (or exact copies of the data), but the data is not partitioned.

The parallel tasks can also be different. For example, computing a set of summary statistics on the same set of data can be done in a task parallel way. Each process can be assigned to compute a different statistic: the mean, standard deviation, percentiles, and so on.

Pseudocode of a task parallel algorithm might look like this:

    # Run 4 tasks in parallel across 4 cores
    for (task in tasks)
        run.in.parallel(task)
    # Collect the results of the 4 tasks
    results <- collect.parallel.output()
    # Continue processing after all 4 tasks are complete

Implementing data parallel algorithms

Several R packages allow code to be executed in parallel. The parallel package that comes with R provides the foundation for most parallel computing capabilities in other packages. Let's see how it works with an example.
This example involves finding documents that match a regular expression. Regular expression matching is a fairly computationally expensive task, depending on the complexity of the regular expression. The corpus, or set of documents, for this example is a sample of the Reuters-21578 dataset for the topic corporate acquisitions (acq) from the tm package. Because this dataset contains only 50 documents, they are replicated 100,000 times to form a corpus of 5 million documents so that parallelizing the code will lead to meaningful savings in execution times.

    library(tm)
    data("acq")
    textdata <- rep(sapply(content(acq), content), 1e5)

The task is to find documents that match the regular expression \d+(,\d+)? mln dlrs, which represents monetary amounts in millions of dollars. In this regular expression, \d+ matches a string of one or more digits, and (,\d+)? optionally matches a comma followed by one or more digits. For example, the strings 12 mln dlrs, 1,234 mln dlrs and 123,456,789 mln dlrs will match the regular expression. (In an R string literal, each backslash must be doubled.) First, we will measure the execution time to find these documents serially with grepl():

    pattern <- "\\d+(,\\d+)? mln dlrs"
    system.time(res1 <- grepl(pattern, textdata))
    ##   user  system elapsed
    ## 65.601   0.114  65.721

Next, we will modify the code to run in parallel and measure the execution time on a computer with four CPU cores:

    library(parallel)
    detectCores()
    ## [1] 4
    cl <- makeCluster(detectCores())
    part <- clusterSplit(cl, seq_along(textdata))
    text.partitioned <- lapply(part, function(p) textdata[p])
    system.time(res2 <- unlist(
        parSapply(cl, text.partitioned, grepl, pattern = pattern)
    ))
    ##  user  system elapsed
    ## 3.708   8.007  50.806
    stopCluster(cl)

In this code, the detectCores() function reveals how many CPU cores are available on the machine where this code is executed. Before running any parallel code, makeCluster() is called to create a local cluster of processing nodes with all four CPU cores.
The corpus is then split into four partitions using the clusterSplit() function, which determines the ideal split of the corpus such that each partition has roughly the same number of documents. The actual parallel execution of grepl() on each partition of the corpus is carried out by the parSapply() function. Each processing node in the cluster is given a copy of the partition of data that it is supposed to process, along with the code to be executed and other variables that are needed to run the code (in this case, the pattern argument). When all four processing nodes have completed their tasks, the results are combined in a similar fashion to sapply(). Finally, the cluster is destroyed by calling stopCluster().

It is good practice to ensure that stopCluster() is always called in production code, even if an error occurs during execution. This can be done as follows:

    doSomethingInParallel <- function(...) {
        cl <- makeCluster(...)
        on.exit(stopCluster(cl))
        # do something
    }

In this example, running the task in parallel on four processors resulted in a 23 percent reduction in the execution time. This is not in proportion to the amount of compute resources used to perform the task; with four times as many CPU cores working on it, a perfectly parallelizable task might experience as much as a 75 percent runtime reduction. However, remember Amdahl's law: the speed of parallel code is limited by the serial parts, which include the overheads of parallelization. In this case, calling makeCluster() with the default arguments creates a socket-based cluster. When such a cluster is created, additional copies of R are run as workers. The workers communicate with the master R process using network sockets, hence the name. The worker R processes are initialized with the relevant packages loaded, and data partitions are serialized and sent to each worker process.
These overheads can be significant, especially in data parallel algorithms where large volumes of data need to be transferred to the worker processes.

Besides parSapply(), parallel also provides the parApply() and parLapply() functions; these three functions are analogous to the standard sapply(), apply(), and lapply() functions, respectively. In addition, the parLapplyLB() and parSapplyLB() functions provide load balancing, which is useful when the execution of each parallel task takes variable amounts of time. Finally, parRapply() and parCapply() are parallel row and column apply() functions for matrices.

On non-Windows systems, parallel supports another type of cluster that often incurs less overhead: forked clusters. In these clusters, new worker processes are forked from the parent R process with a copy of the data. However, the data is not actually copied in memory unless it is modified by a child process. This means that, compared to socket-based clusters, initializing child processes is quicker and the memory usage is often lower.

Another advantage of using forked clusters is that parallel provides a convenient and concise way to run tasks on them via the mclapply(), mcmapply(), and mcMap() functions. (These functions start with mc because they were originally a part of the multicore package.) There is no need to explicitly create and destroy the cluster, as these functions do this automatically. We can simply call mclapply() and state the number of worker processes to fork via the mc.cores argument:

    system.time(res3 <- unlist(
        mclapply(text.partitioned, grepl, pattern = pattern,
                 mc.cores = detectCores())
    ))
    ##    user  system elapsed
    ## 127.012   0.350  33.264

This shows a 49 percent reduction in execution time compared to the serial version, and a 35 percent reduction compared to parallelizing using a socket-based cluster. For this example, forked clusters provide the best performance.
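Because forked clusters are not available on Windows, code that has to run on several platforms needs a fallback to a socket-based cluster. The following is one possible sketch of such a wrapper; portable_lapply() is a hypothetical helper introduced here, not a function of the parallel package, and the two-worker default and the tiny sample documents are assumptions for illustration only.

```r
library(parallel)

# Hypothetical helper: fork workers where the platform supports it,
# otherwise fall back to a socket-based cluster.
portable_lapply <- function(X, FUN, ..., ncores = 2) {
  if (.Platform$OS.type == "windows") {
    cl <- makeCluster(ncores)
    on.exit(stopCluster(cl))         # clean up even if FUN fails
    parLapply(cl, X, FUN, ...)
  } else {
    mclapply(X, FUN, ..., mc.cores = ncores)
  }
}

# Two small made-up partitions standing in for the partitioned corpus
docs <- list(
  c("profit of 12 mln dlrs", "no amount mentioned"),
  c("paid 1,234 mln dlrs")
)
unlist(portable_lapply(docs, grepl, pattern = "\\d+(,\\d+)? mln dlrs"))
```

Both branches return a list with one element per partition, so the calling code does not need to know which kind of cluster was used. Whether the fallback is worth its overhead depends on the workload, as the timings above suggest.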
Due to differences in system configuration, you might see very different results when you try the examples in your own environment. When you develop parallel code, it is important to test the code in an environment that is similar to the one that it will eventually run in.

Implementing task parallel algorithms

Let's now see how to implement a task parallel algorithm using both socket-based and forked clusters. We will look at how to run the same task and different tasks on workers in a cluster.

Running the same task on workers in a cluster

To demonstrate how to run the same task on a cluster, the task for this example is to generate 500 million Poisson random numbers. We will do this by using L'Ecuyer's combined multiple-recursive generator, which is the only random number generator in base R that supports multiple streams to generate random numbers in parallel. The random number generator is selected by calling the RNGkind() function.

We cannot just use any random number generator in parallel because the randomness of the data depends on the algorithm used to generate random data and the seed value given to each parallel task. Most other algorithms were not designed to produce random numbers in multiple parallel streams, and might produce multiple highly correlated streams of numbers, or worse, multiple identical streams!

First, we will measure the execution time of the serial algorithm:

    RNGkind("L'Ecuyer-CMRG")
    nsamples <- 5e8
    lambda <- 10
    system.time(random1 <- rpois(nsamples, lambda))
    ##   user  system elapsed
    ## 51.905   0.636  52.544

To generate the random numbers on a cluster, we will first distribute the task evenly among the workers. In the following code, the integer vector samples.per.process contains the number of random numbers that each worker needs to generate on a four-core CPU.
The seq() function produces ncores+1 numbers evenly distributed between 0 and nsamples, with the first number being 0 and the next ncores numbers indicating the approximate cumulative number of samples across the worker processes. The round() function rounds off these numbers into integers and diff() computes the difference between them to give the number of random numbers that each worker process should generate.

    ncores <- detectCores()
    cl <- makeCluster(ncores)
    samples.per.process <-
        diff(round(seq(0, nsamples, length.out = ncores+1)))

Before we can generate the random numbers on a cluster, each worker needs a different seed from which it can generate a stream of random numbers. The seeds need to be set on all the workers before running the task, to ensure that all the workers generate different random numbers.

For a socket-based cluster, we can call clusterSetRNGStream() to set the seeds for the workers, then run the random number generation task on the cluster. When the task is completed, we call stopCluster() to shut down the cluster:

    clusterSetRNGStream(cl)
    system.time(random2 <- unlist(
        parLapply(cl, samples.per.process, rpois,
                  lambda = lambda)))
    ##  user  system elapsed
    ## 5.006   3.000  27.436
    stopCluster(cl)

Using four parallel processes in a socket-based cluster reduces the execution time by 48 percent. The performance of this type of cluster for this example is better than that of the data parallel example because there is less data to copy to the worker processes: only an integer that indicates how many random numbers to generate.

Next, we run the same task on a forked cluster (again, this is not supported on Windows). The mclapply() function can set the random number seeds for each worker for us when the mc.set.seed argument is set to TRUE; we do not need to call clusterSetRNGStream().
Otherwise, the code is similar to that of the socket-based cluster:

    system.time(random3 <- unlist(
        mclapply(samples.per.process, rpois,
                 lambda = lambda,
                 mc.set.seed = TRUE, mc.cores = ncores)))
    ##   user  system elapsed
    ## 76.283   7.272  25.052

On our test machine, the forked cluster is slightly faster than, but close to, the socket-based cluster, indicating that the overheads for this task are similar for both types of clusters.

Running different tasks on workers in a cluster

So far, we have executed the same task on each parallel process. The parallel package also allows different tasks to be executed on different workers. For this example, the task is to generate not only Poisson random numbers, but also uniform, normal, and exponential random numbers. As before, we start by measuring the time to perform this task serially:

    RNGkind("L'Ecuyer-CMRG")
    nsamples <- 5e7
    pois.lambda <- 10
    system.time(random1 <- list(pois = rpois(nsamples, pois.lambda),
                                unif = runif(nsamples),
                                norm = rnorm(nsamples),
                                exp = rexp(nsamples)))
    ##   user  system elapsed
    ## 14.180   0.384  14.570

In order to run different tasks on different workers on socket-based clusters, a list of function calls and their associated arguments must be passed to parLapply(). This is a bit cumbersome, but parallel unfortunately does not provide an easier interface to run different tasks on a socket-based cluster. In the following code, the function calls are represented as a list of lists, where the first element of each sublist is the name of the function that runs on a worker, and the second element contains the function arguments. The function do.call() is used to call the given function with the given arguments.
    cores <- detectCores()
    cl <- makeCluster(cores)
    calls <- list(pois = list("rpois", list(n = nsamples,
                                            lambda = pois.lambda)),
                  unif = list("runif", list(n = nsamples)),
                  norm = list("rnorm", list(n = nsamples)),
                  exp = list("rexp", list(n = nsamples)))
    clusterSetRNGStream(cl)
    system.time(
        random2 <- parLapply(cl, calls,
                             function(call) {
                                 do.call(call[[1]], call[[2]])
                             }))
    ##  user  system elapsed
    ## 2.185   1.629  10.403
    stopCluster(cl)

On forked clusters on non-Windows machines, the mcparallel() and mccollect() functions offer a more intuitive way to run different tasks on different workers. For each task, mcparallel() sends the given task to an available worker. Once all the workers have been assigned their tasks, mccollect() waits for the workers to complete their tasks and collects the results from all the workers.

    mc.reset.stream()
    system.time({
        jobs <- list()
        jobs[[1]] <- mcparallel(rpois(nsamples, pois.lambda),
                                "pois", mc.set.seed = TRUE)
        jobs[[2]] <- mcparallel(runif(nsamples),
                                "unif", mc.set.seed = TRUE)
        jobs[[3]] <- mcparallel(rnorm(nsamples),
                                "norm", mc.set.seed = TRUE)
        jobs[[4]] <- mcparallel(rexp(nsamples),
                                "exp", mc.set.seed = TRUE)
        random3 <- mccollect(jobs)
    })
    ##   user  system elapsed
    ## 14.535   3.569   7.97

Notice that we also had to call mc.reset.stream() to set the seeds for random number generation in each worker. This was not necessary when we used mclapply(), which calls mc.reset.stream() for us. However, mcparallel() does not, so we need to call it ourselves.

Summary

In this article, we learned about two classes of parallelism: data parallelism and task parallelism. Data parallelism is good for tasks that can be performed in parallel on partitions of a dataset.
The dataset to be processed is split into partitions and each partition is processed on a different worker process. Task parallelism, on the other hand, divides a set of similar or different tasks among the worker processes. In either case, Amdahl's law states that the maximum improvement in speed that can be achieved by parallelizing code is limited by the proportion of that code that can be parallelized.

Resources for Article:

Further resources on this subject:
Using R for Statistics, Research, and Graphics [Article]
Learning Data Analytics with R and Hadoop [Article]
Aspects of Data Manipulation in R [Article]

Packt
30 Sep 2016
15 min read

Functions in Swift

In this article by Dr. Fatih Nayebi, the author of the book Swift 3 Functional Programming, we look at functions, the fundamental building blocks of functional programming, and explain their definition and usage in functional Swift. This article will cover the following topics with coding examples:

The general syntax of functions
Defining and using function parameters
Setting internal and external parameters
Setting default parameter values
Defining and using variadic functions
Returning values from functions
Defining and using nested functions

What is a function?

Object-oriented programming (OOP) looks very natural to most developers as it simulates a real-life situation of classes (in other words, blueprints) and their instances, but it brought a lot of complexities and problems such as instance and memory management, complex multithreading, and concurrency programming.

Before OOP became mainstream, we were used to developing in procedural languages. In the C programming language, we did not have objects and classes; we would use structs and function pointers. So now we are talking about functional programming, which relies mostly on functions just as procedural languages relied on procedures. We are able to develop very powerful programs in C without classes; in fact, most operating systems are developed in C. There are other multipurpose programming languages, such as Go by Google, that are not object-oriented and are getting very popular because of their performance and simplicity.

So, are we going to be able to write very complex applications without classes in Swift? We might wonder why we should do this. Generally, we should not, but attempting it will introduce us to the capabilities of functional programming.
A function is a block of code that executes a specific task, can be stored, can persist data, and can be passed around. We define functions in standalone Swift files as global functions, or inside other building blocks such as classes, structs, enums, and protocols as methods. They are called methods if they are defined in classes but, in terms of definition, there is no difference between a function and a method in Swift.

Defining them in other building blocks enables methods to use the scope of the parent or to change it. They can access the scope of their parent and they have their own scope. Any variable that is defined inside a function is not accessible outside of it. The variables defined inside a function and the corresponding allocated memory go away when the function terminates.

Functions are very powerful in Swift. We can compose a program with only functions, as functions can receive and return functions, capture variables that exist in the context they were declared in, and persist data inside themselves. To understand the functional programming paradigm, we need to understand the capabilities of functions in detail. We need to consider whether we can avoid classes and use only functions, so we will cover all the details related to functions in the upcoming sections of this article.

The general syntax of functions and methods

We can define functions or methods as follows:

    accessControl func functionName(parameter: ParameterType) throws -> ReturnType { }

As we know already, when functions are defined in objects, they become methods. The first step in defining a method is to tell the compiler from where it can be accessed. This concept is called access control in Swift and there are three levels of access control. We are going to explain them for methods as follows:

Public access: Any entity can access a method that is defined as public if it is in the same module.
If an entity is not in the same module, we will need to import the module to be able to call the method. We need to mark our methods and objects as public when we develop frameworks in order to enable other modules to use them.

Internal access: Any method that is defined as internal can be accessed from other entities in a module but cannot be accessed from other modules.

Private access: Any method that is defined as private can be accessed only from the same source file.

By default, if we do not provide the access modifier, a variable or function becomes internal. Using these access modifiers, we can structure our code properly; for instance, we can hide details from other modules if we define an entity as internal. We can even hide the details of a method from other files if we define it as private. Before Swift 2.0, we had to define everything as public or add all source files to the testing target. Swift 2.0 introduced the @testable import syntax, which lets a testing module access the internal (but not the private) entities of the module it imports.

Methods can generally be in three forms:

Instance methods: We need to obtain an instance of an object (in this article, we will refer to classes, structs, and enums as objects) in order to be able to call the method defined in it, and then we will be able to access the scope and data of the object.

Static methods: Swift also calls them type methods. They do not need any instances of objects and they cannot access the instance data. They are called by putting a dot after the name of the object type (for example, Person.sayHi()). Static methods cannot be overridden by the subclasses of the object that they reside in.

Class methods: Class methods are like static methods but they can be overridden by subclasses.

We have covered the keywords that are required for method definitions; now we will concentrate on the syntax that is shared among functions and methods.
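The three method forms described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Person and Student types and the greet, sayHi, and describe names are hypothetical, and the code assumes Swift 3 or later.

```swift
// Hypothetical types for illustration only; assumes Swift 3 or later.
class Person {
    let name: String

    init(name: String) {
        self.name = name
    }

    // Instance method: needs an instance and can read its data.
    func greet() -> String {
        return "Hi, I am \(name)"
    }

    // Static (type) method: called on the type, cannot be overridden.
    static func sayHi() -> String {
        return "Hi"
    }

    // Class method: called on the type, but subclasses may override it.
    class func describe() -> String {
        return "Person"
    }
}

class Student: Person {
    override class func describe() -> String {
        return "Student"
    }
}

let steve = Person(name: "Steve")
print(steve.greet())        // instance method, called on an instance
print(Person.sayHi())       // static method, called on the type
print(Student.describe())   // class method, overridden by the subclass
```

Trying to add override static func sayHi() to Student would be rejected by the compiler, which is the practical difference between static and class methods.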
There are other concepts related to methods that are out of the scope of this article, as we will concentrate on functional programming in Swift.

Continuing with the function definition, next comes the func keyword, which is mandatory and tells the compiler that it is going to deal with a function.

Then comes the function name, which is mandatory and is recommended to be camel-cased with the first letter in lowercase. The function name should state what the function does and is recommended to be in the form of a verb when we define our methods in objects. Basically, our classes will be named as nouns and methods will be verbs that are in the form of orders to the class. In pure functional programming, as functions do not reside in other objects, they can be named by their functionalities.

Parameters follow the function name. They are defined in parentheses to pass arguments to the function. Parentheses are mandatory even if we do not have any parameters. We will cover all aspects of parameters in an upcoming section of this article.

Then comes throws, which is not mandatory. A function or method that is marked with the throws keyword may or may not throw errors. At this point, it is enough to know what it is when we see it in a function or method signature.

The next entity in a function type declaration is the return type. If a function is not void, the return type will come after the -> sign. The return type indicates the type of entity that is going to be returned from a function. We will cover return types in detail in an upcoming section of this article, so now we can move on to the last piece of a function, which is present in most programming languages: our beloved { }. We defined functions as blocks of functionality, and {} defines the borders of the block so that the function body is declared and execution happens in there. We will write the functionality inside {}.
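To tie the pieces of the signature together, here is a minimal sketch of one function that uses every part discussed above: an access modifier, func, a name, a parameter, throws, a return type, and a body. The parseAge function and the ParseError type are hypothetical names, and the code assumes Swift 3 or later (where error types conform to Error).

```swift
// Hypothetical example for illustration only; assumes Swift 3 or later.
enum ParseError: Error {
    case notANumber(String)
}

// accessControl func functionName(parameter: ParameterType) throws -> ReturnType { }
internal func parseAge(text: String) throws -> Int {
    guard let age = Int(text) else {
        // The function is marked throws, so it is allowed to throw here.
        throw ParseError.notANumber(text)
    }
    return age
}

// A throwing function must be called with try, here inside do/catch.
do {
    let age = try parseAge(text: "42")
    print("Parsed age: \(age)")
    _ = try parseAge(text: "forty-two")
} catch ParseError.notANumber(let text) {
    print("Could not parse: \(text)")
} catch {
    print("Unexpected error: \(error)")
}
```

Note that a function marked throws may still complete without throwing, as the first call shows.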
Best practices in function definition

There are proven best practices for function and method definition provided by amazing software engineering resources, such as Clean Code, Code Complete, and Coding Horror, that we can summarize as follows:

Try not to exceed 8-10 lines of code in each function, as shorter functions or methods are easier to read, understand, and maintain.
Keep the number of parameters minimal because the more parameters a function has, the more complex it is.
Functions should have at least one parameter and one return value.
Avoid using type names in function names as it is going to be redundant.
Aim for one and only one functionality in a function.
Name a function or method in a way that it describes its functionality properly and is easy to understand.
Name functions and methods consistently. If we have a connect function, we can have a disconnect one.
Write functions to solve the current problem and generalize them when needed. Try to avoid what-if scenarios, as you probably aren't going to need it (YAGNI).

Calling functions

We have covered the general syntax for defining a function, or a method if it resides in an object. Now it is time to talk about how we call our defined functions and methods. To call a function, we will use its name and provide its required parameters. There are complexities with providing parameters that we will cover in the upcoming section. For now, we are going to cover the most basic way of providing parameters, as follows:

    funcName(paramName, secondParam: secondParamName)

This type of function calling should be familiar to Objective-C developers, as the first parameter is not named and the rest are named. To call a method, we need to use the dot notation provided by Swift.
The following examples are for class instance methods and static class methods:

    let someClassInstance = SomeClass()
    someClassInstance.funcName(paramName, secondParam: secondParamName)
    StaticClass.funcName(paramName, secondParam: secondParamName)

Defining and using function parameters

In a function definition, parameters follow the function name and they are constants by default, so we will not be able to alter them inside the function body if we do not mark them with var. In functional programming, we avoid mutability; therefore, we would never use mutable parameters in functions.

Parameters should be inside parentheses. If we do not have any parameters, we simply put open and close parentheses without any characters between them:

    func functionName() { }

In functional programming, it is important to have functions that have at least one parameter. We will explain why it is important in upcoming sections.

We can have multiple parameters separated by commas. In Swift, parameters are named, so we need to provide the parameter name and, after a colon, its type, as shown in the following example:

    func functionName(parameter: ParameterType, secondParameter: ParameterType) { }
    // To call:
    functionName(parameter, secondParameter: secondParam)

ParameterType can also be an optional type, so the function becomes the following if our parameters need to be optionals:

    func functionName(parameter: ParameterType?, secondParameter: ParameterType?) { }

Swift enables us to provide external parameter names that will be used when functions are called. The following example presents the syntax:

    func functionName(externalParamName localParamName: ParameterType) { }
    // To call:
    functionName(externalParamName: parameter)

Only the local parameter name is usable in the function body.
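The difference between external and local parameter names, and the use of optional parameters, can be seen in a small runnable sketch. The sendMessage and describe functions are hypothetical and exist only for illustration. The code assumes Swift 3 or later, where the first parameter also gets an argument label at the call site, unlike the Swift 2 style calls shown above.

```swift
// Hypothetical functions for illustration only; assumes Swift 3 or later.

// "to" and "from" are external names used by callers;
// "recipient" and "sender" are local names used inside the body.
func sendMessage(to recipient: String, from sender: String) -> String {
    return "\(sender) -> \(recipient)"
}

// An optional parameter uses ParameterType? and must be unwrapped
// (here with the nil-coalescing operator) before it is used.
func describe(nickname: String?) -> String {
    return nickname ?? "no nickname"
}

print(sendMessage(to: "Craig", from: "Steve"))
print(describe(nickname: nil))
print(describe(nickname: "Stevie"))
```

Choosing external names such as to and from lets the call site read like a sentence, while the body keeps more descriptive local names.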
It is possible to omit the parameter names with the _ syntax; for instance, if we do not want to provide any parameter name when the function is called, we can use _ as externalParamName for the second or subsequent parameters. If we want to have a parameter name for the first parameter in function calls, we can simply provide the local parameter name as the external one too. In this article, we are going to use the default function parameter definition.

Parameters can have default values as follows:

    func functionName(parameter: Int = 3) {
        print("\(parameter) is provided.")
    }
    functionName(5) // prints "5 is provided."
    functionName() // prints "3 is provided."

Parameters can be defined as inout to let function callers observe the changes that the function body makes to them. As we can use tuples for function returns, it is not recommended to use inout parameters unless we really need them.

We can define function parameters as tuples. For instance, the following example function accepts a tuple of the (Int, Int) type:

    func functionWithTupleParam(tupleParam: (Int, Int)) {}

Because Swift 2 treats a function's argument list much like a tuple, the arguments to a function can also be supplied as a single tuple. For instance, let's have a simple convert function that takes an array of Int and a multiplier and converts it to a different structure. Let's not worry about the implementation of this function for now:

    let numbers = [3, 5, 9, 10]
    func convert(numbers: [Int], multiplier: Int) -> [String] {
        let convertedValues = numbers.enumerate().map { (index, element) in
            return "\(index): \(element * multiplier)"
        }
        return convertedValues
    }

If we use this function as convert(numbers, multiplier: 3), the result is going to be ["0: 9", "1: 15", "2: 27", "3: 30"]. We can also call our function with a tuple. Let's create a tuple and pass it to our function:

    let parameters = (numbers, multiplier: 3)
    convert(parameters)

The result is identical to our previous function call.
However, passing tuples in function calls is deprecated and will be removed in Swift 3.0, so it is not recommended to use them.

We can define higher-order functions that can receive functions as parameters. In the following example, we define funcParam as a function type of (Int, Int) -> Int:

    func functionWithFunctionParam(funcParam: (Int, Int) -> Int) { }

In Swift, parameters can be of a generic type. The following example presents a function that has two generic parameters. In this syntax, any type (for example, T or V) that we put inside <> should be used in the parameter definition:

    func functionWithGenerics<T, V>(firstParam: T, secondParam: V) { }

Defining and using variadic functions

Swift enables us to define functions with variadic parameters. A variadic parameter accepts zero or more values of a specified type. Variadic parameters are similar to array parameters, but they are more readable and can only be used as the last parameter in multiparameter functions. As a variadic parameter can accept zero values, we will need to check whether it is empty. The following example presents a function with a variadic parameter of the String type:

    func greet(names: String...) {
        for name in names {
            print("Greetings, \(name)")
        }
    }
    // To call this function
    greet("Steve", "Craig") // prints twice
    greet("Steve", "Craig", "Johny") // prints three times

Returning values from functions

If we need our function to return a value, tuple, or another function, we can specify it by providing ReturnType after ->. For instance, the following example returns String:

    func functionName() -> String { }

Any function that has ReturnType in its definition should have a return keyword with the matching type in its body. Return types can be optionals in Swift, so the function becomes as follows if the return needs to be optional:

    func functionName() -> String? { }

Tuples can be used to provide multiple return values.
For instance, the following function returns a tuple of the (Int, String) type:

    func functionName() -> (code: Int, status: String) { }

As we are using parentheses for tuples, we should avoid using parentheses for single return value functions. Tuple return types can be optional too, so the syntax becomes as follows:

    func functionName() -> (code: Int, status: String)? { }

This syntax makes the entire tuple optional; if we want to make only status optional, we can define the function as follows:

    func functionName() -> (code: Int, status: String?) { }

In Swift, functions can return functions. The following example presents a function with the return type of a function that takes two Int values and returns Int:

    func funcName() -> (Int, Int) -> Int { }

If we do not expect a function to return any value, tuple, or function, we simply do not provide ReturnType:

    func functionName() { }

We could also explicitly declare it with the Void keyword:

    func functionName() -> Void { }

In functional programming, it is important to have return types in functions. In other words, it is a good practice to avoid functions that have Void as their return type. A function with the Void return type typically is a function that changes another entity in the code; otherwise, why would we need to have a function? OK, we might have wanted to log an expression to the console or a log file, or write data to a database or a file to a filesystem. In these cases, it is also preferable to have a return value or feedback related to the success of the operation. As we try to avoid mutability and stateful programming in functional programming, we can assume that our functions will have returns in different forms.

This requirement is in line with the mathematical basis of functional programming. In mathematics, a simple function is defined as follows:

    y = f(x) or f(x) -> y

Here, f is a function that takes x and returns y. Therefore, a function receives at least one parameter and returns at least one value.
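The y = f(x) shape, tuple returns, and functions that return functions can be combined in one short sketch. Everything here is hypothetical and only illustrative: the double, increment, compose, and parse names are not from the original text, and the code assumes Swift 3 or later (which is why @escaping and the _ labels appear).

```swift
// Hypothetical functions for illustration only; assumes Swift 3 or later.

// y = f(x): one input, one output, no side effects.
func double(_ x: Int) -> Int {
    return x * 2
}

func increment(_ x: Int) -> Int {
    return x + 1
}

// Returns a function built from two others: compose(f, g)(x) == g(f(x)).
func compose(_ f: @escaping (Int) -> Int,
             _ g: @escaping (Int) -> Int) -> (Int) -> Int {
    return { x in g(f(x)) }
}

// Returns a named tuple instead of reporting the outcome through a side effect.
func parse(_ text: String) -> (code: Int, status: String?) {
    if let value = Int(text) {
        return (code: value, status: nil)
    }
    return (code: -1, status: "not a number")
}

let doubleThenIncrement = compose(double, increment)
print(doubleThenIncrement(5))   // double(5) = 10, then increment(10) = 11
print(parse("7").code)
print(parse("x").status ?? "ok")
```

Because each function takes a value and returns a value, the output of one can feed the input of the next, which is what makes compose possible.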
In functional programming, following the same paradigm makes reasoning easier, function composition possible, and code more readable.

Summary

This article explained function definition and usage in detail by giving examples for parameter and return types. You can also refer to the following books on similar topics:

Protocol-Oriented Programming with Swift: https://www.packtpub.com/application-development/protocol-oriented-programming-swift
OpenStack Object Storage (Swift) Essentials: https://www.packtpub.com/virtualization-and-cloud/openstack-object-storage-swift-essentials
Implementing Cloud Storage with OpenStack Swift: https://www.packtpub.com/virtualization-and-cloud/implementing-cloud-storage-openstack-swift

Resources for Article:

Further resources on this subject:
Introducing the Swift Programming Language [article]
Swift for Open Source Developers [article]
Your First Swift App [article]

Packt
11 Nov 2013
7 min read

Adding Connectors in Bonita

Bonita connectors

Bonita connectors are used to set variables or some other parameters inside Bonita. They can also be used to start a process or execute a step. These connectors let the user work with different parameters of the Bonita workflow. The other kind of connectors are used to integrate with third-party tools.

Most of the Bonita connectors are related to the documents and comments at a particular step. Although these may be useful in some cases, in the majority of cases we will not find much use for them. The most useful ones are getting the users of a step, executing a step, starting a new process, and setting variables.

Click on any step on which you want to define the connector and click on Add.... Here, we will look at Bonita's start an instance connector. Give a name to this connector and click on Next. Here we have to fill in the name of the process that we want to invoke. We also have an option to specify different versions of the process. If we leave this blank, it will pick up the latest version. Next, we can specify the process variables that need to be copied from one pool to the other.

Start an instance connector in Bonita Studio

In the previous example, the process variables that we specify will be copied over to the target pool. We have to make sure that the target pool has the process variables mentioned in this connector. Make sure that you mention the name of the variable in the first column without the curly braces. If you select the names from the drop-down menu, make sure you remove the $ and the {} when filling in the name. The value field can be filled with the actual process variable.

We can also use the set variable connector to set a value to a variable, either a process variable or a step variable. Here, we have two parameters: one is the variable whose value we have to set and the other is the actual value of the variable.
Note that this value may be a Groovy expression, too. Hence, it is similar to writing a Groovy script to assign a value to a variable.

Another type of connector is the one to start or finish a step. In this connector, all we have to do is mention the name of the step we want to start or stop. Similarly, there is another connector to execute a step. Executing will run all the start and end connectors of a particular step and then finish it. These connectors might be useful in cases where some step is waiting for another step, and at the end of the current step we might execute that step or mark it finished.

We also have connectors to get the users from the workflow. There are connectors to find out the initiator of a process and the step submitter. Another useful connector is the one to get a user based on the username. This returns the User class that Bonita uses to implement the functionality of a user in the workflow.

Select the connector to get a user from a username. Enter the username and click on Next. Here, we get the output of the connector and we can decide to save the output in a particular pool or step variable.

Saving the connector output in a variable in Bonita

The User class has methods to retrieve data such as the e-mail, first name, last name, metadata, and password of the user.

The e-mail connector

We have a connector in the messaging group to send an e-mail. We might use this connector for a variety of purposes: to send information about the workflow to an external e-mail address, to notify the person performing a task that he/she has some pending items in his/her inbox, and so on. We have to configure various parameters of the e-mail connector.

In our TicketingWorkflow, let us send an e-mail to the person in whose name the tickets are booked. He/she enters his/her e-mail address in the Payment step of the workflow.
Hence, let us send an e-mail at the end of the Payment step to the e-mail address with which the tickets have been booked. For this, let us configure the e-mail connector:

Click on the Payment step of the workflow. Click on the Connectors tab to add a connector.

Select the connector as a medium to send an e-mail. Then name the connector SendEmail and make sure that this connector is at the finish event of the step.

In the next step, we are required to enter the configuration details of the SMTP server we will use for sending the e-mail. By default, it is set to the Gmail configuration with the host as smtp.gmail.com and the port as 465. Let us stick to the default option and send an e-mail from a Gmail hosted server.

Leave the Security option as it is, but enter your credentials in the Authentication section. Here, you should enter your full e-mail address, not just your username. You can also use your own domain e-mail address if it is hosted on a Gmail server.

Next, we define the parameters of the e-mail notification that has to be sent. After entering the From address as the ticketing admin address or some similar address, enter the To address as the variable in which we have saved the e-mail address: email.

In the Title field, we have to specify the subject of the e-mail. We have already seen that we can use Java inside the Groovy editor. Here, we will have a look at a simple piece of Java code that is executed inside the editor. Enter the following code in the Groovy editor:

    import java.text.SimpleDateFormat;
    return "Flight ticket from " + from + " to " + to + " on " + new SimpleDateFormat("MM-dd-yyyy").format(departOn);

An overview of the flight details is given in the subject of the e-mail. We know that the departOn variable is a Date object. For printing the date, we have to convert it into a String by using the SimpleDateFormat class.

Next, we have to write the actual e-mail that we will send to the customer.
Below the Title field, make sure that the e-mail body is in HTML and not plain text. We can insert Groovy scripts in between the text, which will be substituted with the actual variable value when the e-mail is sent. Write the following in the body of the e-mail:

    Hi ${passenger1},
    Your ${from} to ${to} flight is confirmed. The flight details are given below:
    Date Departure Arrival Duration Price
    ${import java.text.SimpleDateFormat; return new SimpleDateFormat("MM-dd-yyyy").format(departOn);} ${departure} ${arrival} ${duration} ${price}
    Travelers: ${passenger1} ${passenger2} ${passenger3}
    Payment Details: Card Holder - ${cardHolder} Card Number - ${cardNumber}
    Thank you for booking with TicketingWorkflow!

Configuring the e-mail connector

Clicking on Next will get you to the advanced options. Generally, these options do not need to be configured, and we can make do with the default settings.

Summary

This article looked at the various connector integration options available in Bonita Studio. It showed how connectors can be used to fetch data into the workflow and how to export data, too. We had a close look at the Bonita inbuilt connectors and e-mail connectors.

Resources for Article: Further resources on this subject: Oracle BPM Suite 11gR1: Creating a BPM Application [Article] Managing Oracle Business Intelligence [Article] Setting Up Oracle Order Management [Article]
Packt
11 Jul 2013
14 min read

Database, Active Record, and Model Tricks

Getting data from a database

Most applications today use databases. Be it a small website or a social network, at least some parts are powered by databases. Yii introduces three ways that allow you to work with databases:

Active Record
Query builder
SQL via DAO

We will use all these methods to get data from the film, film_actor, and actor tables and show it in a list. We will measure the execution time and memory usage to determine when to use these methods.

Getting ready

Create a new application by using yiic webapp as described in the official guide at the following URL: http://www.yiiframework.com/doc/guide/en/quickstart.first-app
Download the Sakila database from the following URL: http://dev.mysql.com/doc/index-other.html
Execute the downloaded SQLs; first schema then data.
Configure the DB connection in protected/config/main.php to use the Sakila database.
Use Gii to create models for the actor and film tables.

How to do it...

We will create protected/controllers/DbController.php as follows:

    <?php
    class DbController extends Controller
    {
        // Prints execution time and peak memory after every action
        protected function afterAction($action)
        {
            $time = sprintf('%0.5f', Yii::getLogger()->getExecutionTime());
            $memory = round(memory_get_peak_usage()/(1024*1024),2)."MB";
            echo "Time: $time, memory: $memory";
            parent::afterAction($action);
        }

        public function actionAr()
        {
            $actors = Actor::model()->findAll(array(
                'with' => 'films',
                'order' => 't.first_name, t.last_name, films.title',
            ));
            echo '<ol>';
            foreach($actors as $actor)
            {
                echo '<li>';
                echo $actor->first_name.' '.$actor->last_name;
                echo '<ol>';
                foreach($actor->films as $film)
                {
                    echo '<li>';
                    echo $film->title;
                    echo '</li>';
                }
                echo '</ol>';
                echo '</li>';
            }
            echo '</ol>';
        }

        public function actionQueryBuilder()
        {
            $rows = Yii::app()->db->createCommand()
                ->from('actor')
                ->join('film_actor', 'actor.actor_id=film_actor.actor_id')
                ->leftJoin('film', 'film.film_id=film_actor.film_id')
                ->order('actor.first_name, actor.last_name, film.title')
                ->queryAll();
            $this->renderRows($rows);
        }

        public function actionSql()
        {
            $sql = "SELECT * FROM actor a
                JOIN film_actor fa ON fa.actor_id = a.actor_id
                JOIN film f ON fa.film_id = f.film_id
                ORDER BY a.first_name, a.last_name, f.title";
            $rows = Yii::app()->db->createCommand($sql)->queryAll();
            $this->renderRows($rows);
        }

        public function renderRows($rows)
        {
            $lastActorName = null;
            echo '<ol>';
            foreach($rows as $row)
            {
                $actorName = $row['first_name'].' '.$row['last_name'];
                if($actorName!=$lastActorName){
                    // A new actor starts: close the previous actor's film list
                    if($lastActorName!==null){
                        echo '</ol>';
                        echo '</li>';
                    }
                    $lastActorName = $actorName;
                    echo '<li>';
                    echo $actorName;
                    echo '<ol>';
                }
                echo '<li>';
                echo $row['title'];
                echo '</li>';
            }
            // Close the last actor's film list, if any rows were rendered
            if($lastActorName!==null){
                echo '</ol>';
                echo '</li>';
            }
            echo '</ol>';
        }
    }

Here, we have three actions corresponding to three different methods of getting data from a database. After running the preceding db/ar, db/queryBuilder and db/sql actions, you should get a tree showing 200 actors and 1,000 films they have acted in, as shown in the following screenshot:

At the bottom there are statistics that give information about the memory usage and execution time. Absolute numbers can be different if you run this code, but the difference between the methods used should be about the same:

    Method           Memory usage (megabytes)   Execution time (seconds)
    Active Record    19.74                      1.14109
    Query builder    17.98                      0.35732
    SQL (DAO)        17.74                      0.35038

How it works...

Let's review the preceding code. The actionAr action method gets model instances by using the Active Record approach.
We start with the Actor model generated with Gii to get all the actors and specify 'with' => 'films' to get the corresponding films in a single query, that is, by eager loading through the films relation, which Gii builds for us from InnoDB table foreign keys. We then simply iterate over all the actors and, for each actor, over each film, printing the name of each item.

The actionQueryBuilder function uses query builder. First, we create a query command for the current DB connection with Yii::app()->db->createCommand(). We then add query parts one by one with from, join, and leftJoin. These methods escape values, tables, and field names automatically. The queryAll function returns an array of raw database rows. Each row is also an array indexed with result field names. We pass the result to renderRows, which renders it.

With actionSql, we do the same, except we pass SQL directly instead of adding its parts one by one. It's worth mentioning that we should escape parameter values manually with Yii::app()->db->quoteValue before using them in the query string.

The renderRows function renders the raw rows returned by both the query builder and DAO. Rendering raw rows requires more checks and generally feels less natural than rendering an Active Record result.

As we can see, all these methods give the same result in the end, but they all have different performance, syntax, and extra features. We will now do a comparison and figure out when to use each method:

Syntax
    Active Record: This will do SQL for you. Gii will generate models and relations for you. Works with models, completely OO-style, and very clean API. Produces array of properly nested models as the result.
    Query builder: Clean API, suitable for building query on the fly. Produces raw data arrays as the result.
    SQL (DAO): Good for complex SQL. Manual values and keywords quoting. Not very suitable for building query on the fly. Produces raw data arrays as results.

Performance
    Active Record: Higher memory usage and execution time compared to SQL and query builder.
    Query builder: Okay.
    SQL (DAO): Okay.

Extra features
    Active Record: Quotes values and names automatically. Behaviors. Before/after hooks. Validation.
    Query builder: Quotes values and names automatically.
    SQL (DAO): None.

Best for
    Active Record: Prototyping selects. Update, delete, and create actions for single models (model gives a huge benefit when using with forms).
    Query builder: Working with large amounts of data, building queries on the fly.
    SQL (DAO): Complex queries you want to do with pure SQL and have maximum possible performance.

There's more...

In order to learn more about working with databases in Yii, refer to the following resources:

http://www.yiiframework.com/doc/guide/en/database.dao
http://www.yiiframework.com/doc/guide/en/database.query-builder
http://www.yiiframework.com/doc/guide/en/database.ar

See also

The Using CDbCriteria recipe

Defining and using multiple DB connections

Multiple database connections are not used very often for new standalone web applications. However, when you are building an add-on application for an existing system, you will most probably need another database connection. From this recipe you will learn how to define multiple DB connections and use them with DAO, query builder, and Active Record models.

Getting ready

Create a new application by using yiic webapp as described in the official guide at the following URL: http://www.yiiframework.com/doc/guide/en/quickstart.first-app
Create two MySQL databases named db1 and db2.
Create a table named post in db1 as follows:

    DROP TABLE IF EXISTS `post`;
    CREATE TABLE IF NOT EXISTS `post` (
        `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
        `title` VARCHAR(255) NOT NULL,
        `text` TEXT NOT NULL,
        PRIMARY KEY (`id`)
    );

Create a table named comment in db2 as follows:

    DROP TABLE IF EXISTS `comment`;
    CREATE TABLE IF NOT EXISTS `comment` (
        `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
        `text` TEXT NOT NULL,
        `postId` INT(10) UNSIGNED NOT NULL,
        PRIMARY KEY (`id`)
    );

How to do it...
We will start with configuring the DB connections. Open protected/config/main.php and define a primary connection as described in the official guide:

    'db'=>array(
        'connectionString' => 'mysql:host=localhost;dbname=db1',
        'emulatePrepare' => true,
        'username' => 'root',
        'password' => '',
        'charset' => 'utf8',
    ),

Copy it, rename the db component to db2, and change the connection string accordingly. Also, you need to add the class name as follows:

    'db2'=>array(
        'class'=>'CDbConnection',
        'connectionString' => 'mysql:host=localhost;dbname=db2',
        'emulatePrepare' => true,
        'username' => 'root',
        'password' => '',
        'charset' => 'utf8',
    ),

That is it. Now you have two database connections and you can use them with DAO and query builder as follows:

    $db1Rows = Yii::app()->db->createCommand($sql)->queryAll();
    $db2Rows = Yii::app()->db2->createCommand($sql)->queryAll();

Now, if we need to use Active Record models, we first need to create Post and Comment models with Gii. Starting from Yii version 1.1.11, you can just select an appropriate connection for each model. Now you can use the Comment model as usual. Create protected/controllers/DbtestController.php as follows:

    <?php
    class DbtestController extends CController
    {
        public function actionIndex()
        {
            $post = new Post();
            $post->title = "Post #".rand(1, 1000);
            $post->text = "text";
            $post->save();

            echo '<h1>Posts</h1>';
            $posts = Post::model()->findAll();
            foreach($posts as $post)
            {
                echo $post->title."<br />";
            }

            $comment = new Comment();
            $comment->postId = $post->id;
            $comment->text = "comment #".rand(1, 1000);
            $comment->save();

            echo '<h1>Comments</h1>';
            $comments = Comment::model()->findAll();
            foreach($comments as $comment)
            {
                echo $comment->text."<br />";
            }
        }
    }

Run dbtest/index multiple times and you should see records added to both databases, as shown in the following screenshot:

How it works...

In Yii you can add and configure your own components through the configuration file.
For non-standard components, such as db2, you have to specify the component class. Similarly, you can add db3, db4, or any other component, for example, facebookApi. The remaining array key/value pairs are assigned to the component's public properties respectively. There's more... Depending on the RDBMS used, there are additional things we can do to make it easier to use multiple databases. Cross-database relations If you are using MySQL, it is possible to create cross-database relations for your models. In order to do this, you should prefix the Comment model's table name with the database name as follows: class Comment extends CActiveRecord { //… public function tableName() { return 'db2.comment'; } //… } Now, if you have a comments relation defined in the Post model relations method, you can use the following code: $posts = Post::model()->with('comments')->findAll(); Further reading For further information, refer to the following URL: http://www.yiiframework.com/doc/api/CActiveRecord See also The Getting data from a database recipe Using scopes to get models for different languages Internationalizing your application is not an easy task. You need to translate interfaces, translate messages, format dates properly, and so on. Yii helps you to do this by giving you access to the Common Locale Data Repository ( CLDR ) data of Unicode and providing translation and formatting tools. When it comes to applications with data in multiple languages, you have to find your own way. From this recipe, you will learn a possible way to get a handy model function that will help to get blog posts for different languages. 
Getting ready Create a new application by using yiic webapp as described in the official guide at the following URL:http://www.yiiframework.com/doc/guide/en/quickstart.first-app Set up the database connection and create a table named post as follows: DROP TABLE IF EXISTS `post`; CREATE TABLE IF NOT EXISTS `post` ( `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `lang` VARCHAR(5) NOT NULL DEFAULT 'en', `title` VARCHAR(255) NOT NULL, `text` TEXT NOT NULL, PRIMARY KEY (`id`) ); INSERT INTO `post`(`id`,`lang`,`title`,`text`) VALUES (1,'en_us','Yii news','Text in English'), (2,'de','Yii Nachrichten','Text in Deutsch'); Generate a Post model using Gii. How to do it... Add the following methods to protected/models/Post.php as follows: class Post extends CActiveRecord { public function defaultScope() { return array( 'condition' => "lang=:lang", 'params' => array( ':lang' => Yii::app()->language, ), ); } public function lang($lang){ $this->getDbCriteria()->mergeWith(array( 'condition' => "lang=:lang", 'params' => array( ':lang' => $lang, ), )); return $this; } } That is it. Now, we can use our model. Create protected/controllers/ DbtestController.php as follows: <?php class DbtestController extends CController { public function actionIndex() { // Get posts written in default application language $posts = Post::model()->findAll(); echo '<h1>Default language</h1>'; foreach($posts as $post) { echo '<h2>'.$post->title.'</h2>'; echo $post->text; } // Get posts written in German $posts = Post::model()->lang('de')->findAll(); echo '<h1>German</h1>'; foreach($posts as $post) { echo '<h2>'.$post->title.'</h2>'; echo $post->text; } } } Now, run dbtest/index and you should get an output similar to the one shown in the following screenshot: How it works... We have used Yii's Active Record scopes in the preceding code. The defaultScope function returns the default condition or criteria that will be applied to all the Post model query methods. 
As we need to specify the language explicitly, we create a scope named lang, which accepts the language name. With $this->getDbCriteria(), we get the model's criteria in its current state and then merge it with the new condition. As the condition is exactly the same as in defaultScope, except for the parameter value, it overrides the default scope. In order to support chained calls, lang returns the model instance itself.

There's more...

For further information, refer to the following URLs:

http://www.yiiframework.com/doc/guide/en/database.ar
http://www.yiiframework.com/doc/api/CDbCriteria/

See also

The Getting data from a database recipe
The Using CDbCriteria recipe

Processing model fields with AR event-like methods

Active Record implementation in Yii is very powerful and has many features. One of these features is event-like methods, which you can use to preprocess model fields before putting them into the database or getting them from a database, as well as deleting data related to the model, and so on. In this recipe, we will linkify all URLs in the post text and we will list all existing Active Record event-like methods.

Getting ready

Create a new application by using yiic webapp as described in the official guide at the following URL: http://www.yiiframework.com/doc/guide/en/quickstart.first-app
Set up a database connection and create a table named post as follows:

    DROP TABLE IF EXISTS `post`;
    CREATE TABLE IF NOT EXISTS `post` (
        `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
        `title` VARCHAR(255) NOT NULL,
        `text` TEXT NOT NULL,
        PRIMARY KEY (`id`)
    );

Generate the Post model using Gii.

How to do it...

Add the following method to protected/models/Post.php:

    protected function beforeSave()
    {
        // \1 is the matched URL, \2 is the trailing space (or end of string)
        $this->text = preg_replace('~((?:https?|ftps?)://.*?)( |$)~iu',
            '<a href="\1">\1</a>\2', $this->text);
        return parent::beforeSave();
    }

That is it. Now, try saving a post containing a link.
Create protected/controllers/TestController.php as follows:

    <?php
    class TestController extends CController
    {
        function actionIndex()
        {
            $post=new Post();
            $post->title='links test';
            $post->text='test http://www.yiiframework.com/ test';
            $post->save();
            print_r($post->text);
        }
    }

Run test/index. The post text should be printed with the URL turned into a link.

How it works...

The beforeSave method is implemented in the CActiveRecord class and executed just before saving a model. By using a regular expression, we replace everything that looks like a URL with a link that uses this URL and call the parent implementation, so that real events are raised properly. In order to prevent saving, you can return false.

There's more...

There are more event-like methods available, as shown in the following table:

    Method name                     Description
    afterConstruct                  Called after a model instance is created by the new operator
    beforeDelete/afterDelete        Called before/after deleting a record
    beforeFind/afterFind            Method is invoked before/after each record is instantiated by a find method
    beforeSave/afterSave            Method is invoked before/after saving a record successfully
    beforeValidate/afterValidate    Method is invoked before/after validation ends

Further reading

In order to learn more about using event-like methods in Yii, you can refer to the following URLs:

http://www.yiiframework.com/doc/api/CActiveRecord/
http://www.yiiframework.com/doc/api/CModel

See also

The Using Yii events recipe
The Highlighting code with Yii recipe
The Automating timestamps recipe
The Setting up an author automatically recipe
Packt
06 Mar 2013
17 min read

Asynchrony in Action

Asynchrony

When we talk about C# 5.0, the primary topic of conversation is the new asynchronous programming features. What does asynchrony mean? Well, it can mean a few different things, but in our context, it is simply the opposite of synchronous. When you break up execution of a program into asynchronous blocks, you gain the ability to execute them side-by-side, in parallel. As you can see in the following diagram, executing multiple actions concurrently can bring various positive qualities to your programs:

Parallel execution can bring performance improvements to the execution of a program.

The best way to put this into context is by way of an example, an example that has been experienced all too often in the world of desktop software. Let's say you have an application that you are developing, and this software should fulfill the following requirements:

When the user clicks on a button, initiate a call to a web service.
Upon completion of the web service call, store the results into a database.
Finally, bind the results and display them to the user.

There are a number of problems with the naïve way of implementing this solution. The first is that many developers write code in such a way that the user interface will be completely unresponsive while we are waiting to receive the results of these web service calls. Then, once the results finally arrive, we continue to make the user wait while we store the results in a database, an operation that the user does not care about in this case.

The primary vehicle for mitigating these kinds of problems in the past has been writing multithreaded code. This is of course nothing new, as multithreaded hardware has been around for many years, along with software capabilities to take advantage of this hardware.
Most programming languages did not provide a very good abstraction layer on top of this hardware, often letting (or requiring) you to program directly against the hardware threads. Thankfully, Microsoft introduced a new library to simplify the task of writing highly concurrent programs, which is explained in the next section.

Task Parallel Library

The Task Parallel Library (TPL) was introduced in .NET 4.0 (along with C# 4.0). It is a huge topic that cannot be examined fully in such a small space, but it is highly relevant to the new asynchrony features in C# 5.0, so much so that it is the literal foundation upon which the new features are built. So, in this section, we will cover the basics of the TPL, along with some of the background information about how and why it works.

TPL introduces a new type, the Task type, which abstracts away the concept of something that must be done into an object. At first glance, you might think that this abstraction already exists in the form of the Thread class. While there are some similarities between Task and Thread, the implementations have quite different implications.

With a Thread class, you can program directly against the lowest level of parallelism supported by the operating system, as shown in the following code:

    Thread thread = new Thread(new ThreadStart(() =>
    {
        Thread.Sleep(1000);
        Console.WriteLine("Hello, from the Thread");
    }));
    thread.Start();
    Console.WriteLine("Hello, from the main thread");
    thread.Join();

In the previous example, we create a new Thread class, which when started will sleep for a second and then write out the text Hello, from the Thread. After we call thread.Start(), the code on the main thread immediately continues and writes Hello, from the main thread. After a second, we see the text from the background thread printed to the screen.
In one sense, this example of using the Thread class shows how easy it is to branch off the execution to a background thread, while allowing execution of the main thread to continue, unimpeded. However, the problem with using the Thread class as your "concurrency primitive" is that the class itself is an indication of the implementation, which is to say, an operating system thread will be created. As far as abstractions go, it is not really an abstraction at all; your code must manage the lifecycle of the thread while at the same time dealing with the task the thread is executing.

If you have multiple tasks to execute, spawning multiple threads can be disastrous, because the operating system can only spawn a finite number of them. For performance intensive applications, a thread should be considered a heavyweight resource, which means you should avoid using too many of them, and keep them alive for as long as possible. As you might imagine, the designers of the .NET Framework did not simply leave you to program against this without any help. The early versions of the framework had a mechanism to deal with this in the form of the ThreadPool, which lets you queue up a unit of work, and have the thread pool manage the lifecycle of a pool of threads. When a thread becomes available, your work item is then executed. The following is a simple example of using the thread pool:

    int[] numbers = { 1, 2, 3, 4 };
    foreach (var number in numbers)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(o =>
        {
            Thread.Sleep(500);
            // Indent the output by one tab per number
            string tabs = new String('\t', (int)o);
            Console.WriteLine("{0}processing #{1}", tabs, o);
        }), number);
    }

This sample simulates multiple tasks, which should be executed in parallel. We start with an array of numbers, and for each number we want to queue a work item that will sleep for half a second, and then write to the console.
This works much better than trying to manage multiple threads yourself because the pool will take care of spawning more threads if there is more work. When the configured limit of concurrent threads is reached, it will hold work items until a thread becomes available to process them. This is all work that you would have done yourself if you were using threads directly.

However, the thread pool is not without its complications. First, it offers no way of synchronizing on completion of the work item. If you want to be notified when a job is completed, you have to code the notification yourself, whether by raising an event, or using a thread synchronization primitive, such as ManualResetEvent. You also have to be careful not to queue too many work items, or you may run into system limitations with the size of the thread pool.

With the TPL, we now have a concurrency primitive called Task. Consider the following code:

    Task task = Task.Factory.StartNew(() =>
    {
        Thread.Sleep(1000);
        Console.WriteLine("Hello, from the Task");
    });
    Console.WriteLine("Hello, from the main thread");
    task.Wait();

Upon first glance, the code looks very similar to the sample using Thread, but they are very different. One big difference is that with Task, you are not committing to an implementation. The TPL uses some very interesting algorithms behind the scenes to manage the workload and system resources, and in fact, allows you to customize those algorithms through the use of custom schedulers and synchronization contexts. This gives you a high degree of control over the parallel execution of your programs.

Dealing with multiple tasks, as we did with the thread pool, is also easier because each task has synchronization features built-in.
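To make the thread pool limitation mentioned above concrete, the following is a minimal sketch of the hand-written completion notification it requires. The class and method names (ThreadPoolCompletionSketch, RunAll) are hypothetical and exist only for this illustration; the sketch assumes a single ManualResetEvent plus an interlocked counter, which is one of several possible approaches.

```csharp
using System;
using System.Threading;

public static class ThreadPoolCompletionSketch
{
    // Queues one work item per number and blocks until all of them have run.
    // Returns the sum computed by the work items so the caller can check it.
    public static int RunAll(int[] numbers)
    {
        int pending = numbers.Length;
        int sum = 0;
        if (pending == 0)
        {
            return 0;
        }

        using (var allDone = new ManualResetEvent(false))
        {
            foreach (int number in numbers)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Interlocked.Add(ref sum, (int)state);
                    // The last work item to finish signals the waiting thread.
                    if (Interlocked.Decrement(ref pending) == 0)
                    {
                        allDone.Set();
                    }
                }, number);
            }

            // The pool has no built-in "wait for these items" call,
            // so the caller blocks on the event it created itself.
            allDone.WaitOne();
        }
        return sum;
    }

    public static void Main()
    {
        int total = RunAll(new[] { 1, 2, 3, 4 });
        Console.WriteLine("All work items finished, sum = {0}", total);
    }
}
```

All of the counting and signalling here is bookkeeping that the caller has to write and get right by hand, which is the work that Task.Wait and Task.WaitAll take over.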
To demonstrate how simple it is to quickly parallelize an arbitrary number of tasks, we start with the same array of integers, as shown in the previous thread pool example:

    int[] numbers = { 1, 2, 3, 4 };

Because Task can be thought of as a primitive type that represents an asynchronous task, we can think of it as data. This means that we can use things such as LINQ to project the numbers array to a list of tasks as follows:

    var tasks = numbers.Select(number =>
        Task.Factory.StartNew(() =>
        {
            Thread.Sleep(500);
            // Indent the output by one tab per number
            string tabs = new String('\t', number);
            Console.WriteLine("{0}processing #{1}", tabs, number);
        }));

And finally, if we wanted to wait until all of the tasks were done before continuing on, we could easily do that by calling the following method:

    Task.WaitAll(tasks.ToArray());

Once the code reaches this method, it will wait until every task in the array completes before continuing on. This level of control is very convenient, especially when you consider that, in the past, you would have had to depend on a number of different synchronization techniques to achieve the very same result that was accomplished in just a few lines of TPL code.

With the usage patterns that we have discussed so far, there is still a big disconnect between the code that spawns a task and the task itself. It is very easy to pass values into a background task, but the tricky part comes when you want to retrieve a value and then do something with it. Consider the following requirements:

Make a network call to retrieve some data.
Query the database for some configuration data.
Process the results of the network data, along with the configuration data.

The following diagram shows the logic:

Both the network call and query to the database can be done in parallel. With what we have learned so far about tasks, this is not a problem.
However, acting on the results of those tasks would be slightly more complex, if it were not for the fact that the TPL provides support for exactly that scenario.

There is an additional kind of Task, which is especially useful in cases like this, called Task<T>. This generic version of a task expects the running task to ultimately return a value, whenever it is finished. Clients of the task can access the value through the .Result property of the task. When you call that property, it will return immediately if the task is completed and the result is available. If the task is not done, however, it will block execution in the current thread until it is.

Using this kind of task, which promises you a result, you can write your programs such that you can plan for and initiate the parallelism that is required, and handle the response in a very logical manner. Look at the following code:

    var webTask = Task.Factory.StartNew(() =>
    {
        WebClient client = new WebClient();
        return client.DownloadString("http://bing.com");
    });
    var dbTask = Task.Factory.StartNew(() =>
    {
        // do a lengthy database query
        return new { WriteToConsole = true };
    });
    if (dbTask.Result.WriteToConsole)
    {
        Console.WriteLine(webTask.Result);
    }
    else
    {
        ProcessWebResult(webTask.Result);
    }

In the previous example, we have two tasks, the webTask, and dbTask, which will execute at the same time. The webTask is simply downloading the HTML from http://bing.com. Accessing things over the Internet can be notoriously flaky due to the dynamic nature of accessing the network, so you never know how long that is going to take. With the dbTask task, we are simulating accessing a database to return some stored settings. Although in this simple example we are just returning a static anonymous type, database access will usually access a different server over the network; again, this is an I/O bound task just like downloading something over the Internet.
Rather than waiting for both of them to execute like we did with Task.WaitAll, we can simply access the .Result property of the task. If the task is done, the result will be returned and execution can continue, and if not, the program will simply wait until it is.

This ability to write your code without having to manually deal with task synchronization is great because the fewer concepts a programmer has to keep in his/her head, the more resources he/she can devote to the program.

If you are curious about where this concept of a task that returns a value comes from, you can look for resources pertaining to "Futures" and "Promises" at: http://en.wikipedia.org/wiki/Promise_%28programming%29

At the simplest level, this is a construct that "promises" to give you a result in the "future", which is exactly what Task<T> does.

Task composability

Having a proper abstraction for asynchronous tasks makes it easier to coordinate multiple asynchronous activities. Once the first task has been initiated, the TPL allows you to compose a number of tasks together into a cohesive whole using what are called continuations. Look at the following code:

    Task<string> task = Task.Factory.StartNew(() =>
    {
        WebClient client = new WebClient();
        return client.DownloadString("http://bing.com");
    });
    task.ContinueWith(webTask =>
    {
        Console.WriteLine(webTask.Result);
    });

Every task object has the .ContinueWith method, which lets you chain another task to it. This continuation task will begin execution once the first task is done. Unlike the previous example, where we relied on the .Result property to wait until the task was done (thus potentially holding up the main thread while it completed), the continuation will run asynchronously. This is a better approach for composing tasks because you can write tasks that will not block the UI thread, which results in very responsive applications.
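Continuations can also return values of their own, which is what makes them composable. The following is a minimal sketch of that idea, assuming a purely in-memory first step instead of the network call so that it runs anywhere; the names ContinuationSketch and BuildPipeline are hypothetical and used only for this illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Builds a two-step pipeline: produce a string, then measure its length.
    public static Task<int> BuildPipeline(string text)
    {
        Task<string> produce = Task.Factory.StartNew(() => text.ToUpperInvariant());

        // The continuation receives the finished antecedent task; reading
        // Result here does not block because the antecedent has completed.
        Task<int> measure = produce.ContinueWith(antecedent => antecedent.Result.Length);
        return measure;
    }

    public static void Main()
    {
        Task<int> pipeline = BuildPipeline("hello");

        // Blocking on Result is only for this console demo, so that the
        // process does not exit before the continuation has run.
        Console.WriteLine("Length: {0}", pipeline.Result);
    }
}
```

Because ContinueWith returns a new task, the caller can keep chaining further steps or hand the resulting Task<int> to other code without knowing how many stages sit behind it.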
Task composability does not stop at providing continuations, though. The TPL also provides for scenarios where a task must launch a number of subtasks. You have the ability to control how completion of those child tasks affects the parent task. In the following example, we will start a task, which will in turn launch a number of subtasks:

    int[] numbers = { 1, 2, 3, 4, 5, 6 };
    var mainTask = Task.Factory.StartNew(() =>
    {
        // create a new child task
        foreach (int num in numbers)
        {
            int n = num;
            Task.Factory.StartNew(() =>
            {
                Thread.SpinWait(1000);
                int multiplied = n * 2;
                Console.WriteLine("Child Task #{0}, result {1}", n, multiplied);
            });
        }
    });
    mainTask.Wait();
    Console.WriteLine("done");

Each child task will write to the console, so that you can see how the child tasks behave along with the parent task. When you execute the previous program, it results in the following output:

    Child Task #1, result 2
    Child Task #2, result 4
    done
    Child Task #3, result 6
    Child Task #6, result 12
    Child Task #5, result 10
    Child Task #4, result 8

Notice how, even though you have called the .Wait() method on the outer task before writing done, the execution of the child tasks continues a bit longer after the outer task has concluded. This is because, by default, child tasks are detached, which means their execution is not tied to the task that launched them.

An unrelated but important detail in the previous example code is that we assigned the loop variable to an intermediary variable before using it in the task:

    int n = num;
    Task.Factory.StartNew(() =>
    {
        int multiplied = n * 2;

This is related to the way closures work, and is a common pitfall when trying to "pass in" values in a loop. Because the closure captures the variable itself rather than copying its value, a lambda that uses the loop variable directly sees whatever value the variable holds when the lambda finally runs, and you will not get the behavior you expect.
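The capture behavior can be shown without any tasks at all. The following sketch, with hypothetical names (ClosureCaptureSketch and its two methods), uses a for loop and invokes the lambdas only after the loop has finished, which makes the result deterministic. A for loop is used deliberately: its variable is shared across iterations in every C# version, whereas from C# 5.0 onward foreach gives each iteration its own variable, so the defensive copy matters most for for loops and older compilers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureCaptureSketch
{
    // Every lambda captures the same loop variable i, so all of them
    // observe its final value once the loop has finished.
    public static int[] CaptureLoopVariable(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            readers.Add(() => i);
        }
        return readers.Select(read => read()).ToArray();
    }

    // Copying to a local declared inside the loop body gives each
    // lambda its own variable to capture.
    public static int[] CaptureLocalCopy(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int n = i;
            readers.Add(() => n);
        }
        return readers.Select(read => read()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureLoopVariable(3))); // 3, 3, 3
        Console.WriteLine(string.Join(", ", CaptureLocalCopy(3)));    // 0, 1, 2
    }
}
```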
As you can see, an easy way to mitigate this is to assign the value to a local variable before passing it into the lambda expression. That way, the lambda does not refer to an integer that changes before it is used.

You do, however, have the option to mark a child task as attached, as follows:

    Task.Factory.StartNew(
        () => DoSomething(),
        TaskCreationOptions.AttachedToParent);

The TaskCreationOptions enumeration has a number of different options. In this case, attaching a task to its parent means that the parent task will not complete until all of its child tasks are complete. Other options in TaskCreationOptions let you give hints and instructions to the task scheduler. From the documentation, the following are the descriptions of these options:

None: This specifies that the default behavior should be used.
PreferFairness: This is a hint to a TaskScheduler class to schedule a task in as fair a manner as possible, meaning that tasks scheduled sooner will be more likely to be run sooner, and tasks scheduled later will be more likely to be run later.
LongRunning: This specifies that a task will be a long-running, coarse-grained operation. It provides a hint to the TaskScheduler class that oversubscription may be warranted.
AttachedToParent: This specifies that a task is attached to a parent in the task hierarchy.
DenyChildAttach: This specifies that an exception of the type InvalidOperationException will be thrown if an attempt is made to attach a child task to the created task.
HideScheduler: This prevents the ambient scheduler from being seen as the current scheduler in the created task. This means that operations such as StartNew or ContinueWith that are performed in the created task will see Default as the current scheduler.

The best part about these options, and the way the TPL works, is that most of them are merely hints.
So you can suggest that a task you are starting is long running, or that you would prefer tasks scheduled sooner to run first, but that does not guarantee this will be the case. The framework takes responsibility for completing the tasks in the most efficient manner, so if you prefer fairness but a task is taking too long, it will start executing other tasks to make sure it keeps using the available resources optimally.

Error handling with tasks

Error handling in the world of tasks needs special consideration. In summary, when an exception is thrown, the CLR unwinds the stack frames looking for an appropriate try/catch handler that wants to handle the error. If the exception reaches the top of the stack, the application crashes. With asynchronous programs, though, there is not a single linear stack of execution. So when your code launches a task, it is not immediately obvious what will happen to an exception that is thrown inside the task. For example, look at the following code:

    Task t = Task.Factory.StartNew(() =>
    {
        throw new Exception("fail");
    });

This exception will not bubble up as an unhandled exception, and your application will not crash if you leave it unhandled in your code. It was in fact handled, but by the task machinery. However, if you call the .Wait() method, the exception will bubble up to the calling thread at that point. This is shown in the following example:

    try
    {
        t.Wait();
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }

When you execute that, it prints the somewhat unhelpful message One or more errors occurred, rather than fail, which is the actual message contained in the exception. This is because unhandled exceptions that occur in tasks are wrapped in an AggregateException, which you can handle specifically when dealing with task exceptions.
Look at the following code:

    catch (AggregateException ex)
    {
        foreach (var inner in ex.InnerExceptions)
        {
            Console.WriteLine(inner.Message);
        }
    }

If you think about it, this makes sense: because tasks are composable with continuations and child tasks, a single task can fail for several reasons at once, and an AggregateException is a natural way to represent all of the errors raised by that task. If you would rather handle exceptions on a more granular level, you can also pass a special TaskContinuationOptions parameter as follows:

    Task.Factory.StartNew(() =>
    {
        throw new Exception("Fail");
    }).ContinueWith(t =>
    {
        // log the exception
        Console.WriteLine(t.Exception.ToString());
    }, TaskContinuationOptions.OnlyOnFaulted);

This continuation task will only run if the task that it was attached to is faulted (for example, if there was an unhandled exception). Error handling is, of course, something that is often overlooked when developers write code, so it is important to be familiar with the various methods of handling exceptions in an asynchronous world.
Packt
20 May 2011
12 min read

High Availability: Oracle 11g R1 R2 Real Application Clusters (RAC)

High availability is a discipline within database technology that provides a solution to protect against data loss and against downtime, which is costly to mission-critical database systems. As such, we will provide details on what constitutes high availability and what does not. By having the proper framework, you will understand how to leverage Oracle RAC and auxiliary technologies including Oracle Data Guard to maximize the Return On Investment (ROI) for your data center environment.

High availability concepts

High availability provides data center environments that run mission-critical database applications with the resiliency to withstand failures that may occur due to natural, human, or environmental conditions. For example, if a hurricane wipes out the production data center that hosts a financial application's production database, high availability would provide the much-needed protection to avoid data loss, minimize downtime, and maximize the availability of the firm's resources and database applications. Let's now move to the high availability concepts.

Planned versus unplanned downtime

A distinction needs to be made between planned downtime and unplanned downtime. In most cases, planned downtime is the result of maintenance that is disruptive to system operations and cannot be avoided with current system designs for a data center. An example of planned downtime would be a DBA maintenance activity such as patching an Oracle database, which requires an outage to take the system offline for a period of time. From the database administrator's perspective, planned downtime situations usually are the result of management-initiated events. On the other hand, unplanned downtime frequently occurs due to a physical event caused by a hardware, software, or environmental failure, or by human error. A few examples of unplanned downtime events include failures of server hardware components such as the CPU or disks, and power outages.
Most data centers exclude planned downtime from the high availability factor when calculating the current total availability percentage. Even so, both planned and unplanned maintenance windows affect high availability. For instance, database upgrades require a few hours of downtime. Another example would be a SAN replacement. Such items make comprehensive four-nines (99.99%) solutions nearly impossible to implement without additional considerations. The fact is that implementing true 100% availability is nearly impossible without exorbitant costs. Complete high availability for all components within the data center requires an architecture for all systems and databases that eliminates any Single Point of Failure (SPOF) and allows for total online availability for all server hardware, network, operating systems, applications, and database systems.

Service Level Agreements for high availability

High availability ratios are often expressed as the percentage of uptime in a given year. The following table shows the approximate downtime that is allowed for a specific percentage of high availability, assuming that the system is required to operate continuously. Service Level Agreements (SLAs) usually refer to monthly downtime or availability in order to calculate service levels to match monthly financial cycles. The following table from the International Organization for Standardization (ISO) illustrates the correlation between a given availability percentage and the relevant amount of time a system would be unavailable per year, month, or week:

For monthly calculations, a 30-day month is used. It should be noted that availability and uptime are not the same thing. For instance, a database system may be online but not available, as in the case of application outages such as when a user's SQL script cannot be executed.
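The relationship between an availability percentage and the downtime it permits is simple arithmetic, so it can be sketched in a few lines of Java. This is only an illustrative calculation and not a reproduction of any published table: the class and method names are made up for this example, the availability levels are arbitrary, and the periods assume a 365-day year, the 30-day month mentioned above, and a 7-day week.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of any Oracle tooling.
final class DowntimeBudget {

    private static final double MINUTES_PER_DAY = 24.0 * 60.0;

    /** Minutes of downtime permitted over periodDays days at the given availability percentage. */
    static double allowedDowntimeMinutes(double availabilityPercent, int periodDays) {
        if (availabilityPercent < 0.0 || availabilityPercent > 100.0 || periodDays <= 0) {
            throw new IllegalArgumentException("availability must be 0..100 and the period positive");
        }
        double unavailableFraction = (100.0 - availabilityPercent) / 100.0;
        return periodDays * MINUTES_PER_DAY * unavailableFraction;
    }

    public static void main(String[] args) {
        // Example levels only: two, three, four, and five nines.
        double[] levels = {99.0, 99.9, 99.99, 99.999};
        System.out.println("Availability   Year (min)  Month (min)   Week (min)");
        for (double level : levels) {
            System.out.println(String.format(Locale.ROOT, "%10.3f%%  %11.2f  %11.2f  %11.2f",
                    level,
                    allowedDowntimeMinutes(level, 365),  // non-leap year
                    allowedDowntimeMinutes(level, 30),   // 30-day month
                    allowedDowntimeMinutes(level, 7)));  // one week
        }
    }
}
```

Under these assumptions, for example, 99.9% availability leaves roughly 525.6 minutes (a little under nine hours) of downtime per year, which shows how quickly a single database upgrade can consume the annual budget.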
In most cases, the number of nines is not used by the database or system professional when measuring high availability for data center environments, because it is difficult to extrapolate such hard numbers without a large test environment. For practical purposes, availability is calculated more as a probability or as average downtime on an annual basis.

High availability interpretations

When it comes to discussing how availability is measured, there is a debate on the correct method of interpretation for high availability ratios. For instance, an Oracle database server that has been online for 365 days in a given non-leap year might have been eclipsed by an application failure that lasted for nine hours during a peak usage period. As a consequence, the users will see the complete system as unavailable, whereas the Oracle database administrator will claim 100% "uptime." However, given the true definition of availability, the system was approximately 99.897% available (8751 hours of available time out of 8760 hours per non-leap year). Furthermore, Oracle database systems experiencing performance problems are often deemed partially or entirely unavailable by users, while in the eyes of the database administrator the system is fine and available.

Another situation that presents a challenge in terms of what constitutes availability is the scenario in which a mission-critical application goes offline yet is not viewed as unavailable by the Oracle DBA, as the database instance is still online and thus available. However, the application in question is offline to the end user, and so it is unavailable from the end user's perspective. This illustrates the key point that a true availability measure must be taken from a holistic perspective and not strictly from the database's point of view.
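The gap between the two viewpoints in the nine-hour outage example can be checked with a short calculation. The following is a minimal sketch in Java; the class and method names are hypothetical, and the figures are the ones used in the text (a 365-day year and a nine-hour application outage).

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
final class MeasuredAvailability {

    /** Availability as a percentage of the period during which the service was usable. */
    static double availabilityPercent(double periodHours, double outageHours) {
        if (periodHours <= 0 || outageHours < 0 || outageHours > periodHours) {
            throw new IllegalArgumentException("outage must lie within a positive period");
        }
        return 100.0 * (periodHours - outageHours) / periodHours;
    }

    public static void main(String[] args) {
        double hoursInYear = 365 * 24;   // 8760 hours in a non-leap year

        // The DBA's view: the database instance never went down.
        double databaseView = availabilityPercent(hoursInYear, 0);
        // The users' view: the application was unusable for nine hours.
        double userView = availabilityPercent(hoursInYear, 9);

        System.out.println(String.format(Locale.ROOT, "Database view: %.3f%%", databaseView)); // 100.000%
        System.out.println(String.format(Locale.ROOT, "User view:     %.3f%%", userView));     // 99.897%
    }
}
```

Measuring from the end user's side rather than from the instance's side is the only change between the two calls, which is the sense in which availability has to be measured holistically.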
Availability should be measured with comprehensive monitoring tools that are themselves highly available and present the proper instrumentation. Where dedicated instrumentation is lacking, systems that process a high volume of transactions around the clock, such as credit-card-processing database servers, are often inherently better monitored than systems that experience periodic lulls in demand. Currently, custom scripts can be developed in conjunction with third-party tools to provide a measure of availability. One tool that we recommend for monitoring database, server, and application availability is Oracle Grid Control, which also includes Oracle Enterprise Manager. Oracle Grid Control provides instrumentation via agents and plugin modules to measure availability and performance on a system-wide enterprise level, thereby greatly aiding the Oracle database professional in measuring, tracking, and reporting to management and users on the availability of all mission-critical applications and system components. However, the current version of Oracle Enterprise Manager will not provide a true picture of availability until 11g Grid Control is released in the future.

Recovery time and high availability

Recovery time is closely related to the concept of high availability. Recovery time varies based on the system design and the failure experienced, and a full recovery may well be impossible if the system design prevents such recovery options. For example, if the data center is not designed correctly, with the required system and database backups and a standby disaster recovery site in place, then a major catastrophe such as a fire or earthquake will almost always result in complete unavailability until a complete Maximum Availability Architecture (MAA) solution is implemented. In this case, only a partial recovery may be possible.
This drives home the point that for all major data center operations, you should always have a backup plan with an offsite secondary disaster-recovery data center to protect against losing all critical systems and data.

In terms of database administration for Oracle data centers, the concept of data availability is essential when dealing with recovery time and planning for highly available options. Data availability refers to the degree to which databases such as Oracle record and report transactions. Data management professionals often focus just on data availability in order to judge what constitutes an acceptable data loss with different types of failure events. While application service interruptions are inconvenient and sometimes permitted, data loss is not to be tolerated. As one Chief Information Officer (CIO) told us while we were working for a large financial brokerage, "You can have the system down to perform maintenance, but never ever lose my data!"

The next item related to high availability and recovery standards is that of Service Level Agreements (SLAs) for data center operations. The purpose of the Service Level Agreement is to turn the availability objectives and requirements for a data center environment, as set by the business, into a standard corporate information technology (IT) policy.

System design for high availability

Ironically, by adding further components to the overall system and database architecture design, you may actually undermine your efforts to achieve true high availability for your Oracle data center environment. The reason is that, by their very nature, complex systems have more potential failure points and are thus more difficult to implement properly.
The most highly available systems for Oracle adhere to a simple design pattern that makes use of a single, high quality, multipurpose physical system with comprehensive internal redundancy running all interdependent functions, paired with a second like system at a separate physical location. An example would be to have a primary Oracle RAC clustered site with a second Disaster Recovery site at another location with Oracle Data Guard and perhaps dual Oracle RAC clusters at both sites connected by stretch clusters. The best possible way to implement an active standby site with Oracle would be to have Oracle Streams and Oracle Data Guard. Large commercial banking and insurance institutions would benefit from this model for Oracle data center design to maximize system availability.

Business Continuity and high availability

Business Continuity Planning (BCP) refers to the creation and validation of a rehearsed operations plan for the IT organization that explains the procedures of how the data center and business unit will recover and restore, partially or completely, interrupted business functions within a predetermined time after a major disaster. In its simplest terms, BCP is the foundation for the IT data center operations team to maintain critical systems in the event of disaster. Major incidents could include events such as fires, earthquakes, or national acts of terrorism.

BCP may also encompass corporate training efforts to help reduce operational risk factors associated with the lack of information technology (IT) management controls. These BCP processes may also be integrated with IT standards and practices to improve security and corporate risk management practices. An example would be to implement BCP controls as part of Sarbanes-Oxley (SOX) compliance requirements for publicly traded corporations.
The origins of BCP standards lie with the British Standards Institution (BSI), which in 2006 released a new independent standard for business continuity named BS 25999-1. Prior to the introduction of this standard for BCP, IT professionals had to rely on the previous BSI information security standard, BS 7799, which provided only limited standards for business continuity compliance procedures. One of the key benefits of these new standards was to extend additional practices for business continuity to a wider variety of organizations, to cover needs for public sector, government, non-profit, and private corporations.

Disaster Recovery

Disaster Recovery (DR) is the process, policies, and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after either a natural or human-caused disaster. Disaster Recovery Planning (DRP) is a subset of larger processes such as Business Continuity and should include planning for resumption of applications, databases, hardware, networking, and other IT infrastructure components. A Business Continuity Plan includes planning for non-IT-related aspects, such as staff member activities, during a major disaster as well as site facility operations, and it should reference the Disaster Recovery Plan for IT-related infrastructure recovery and business continuity procedures and guidelines.

Business Continuity and Disaster Recovery guidelines

The following recommendations will provide you with a blueprint to formulate your requirements and implementation for a robust Business Continuity and Disaster Recovery plan:

Identifying the scope and boundaries of your Business Continuity Plan: The first step enables you to define the scope of your new Business Continuity Plan. It provides you with an idea of the limitations and boundaries of the Business Continuity Plan. It also includes important audit and risk analysis reports for corporate assets.
Conducting a Business Impact Analysis session: Business Impact Analysis (BIA) is the assessment of the financial losses to institutions that usually result from destructive events such as the loss or unavailability of mission-critical business services.

Obtaining support for your business continuity plans and goals from the executive management team: You will need to convince senior management to approve your business continuity plan, so that you can flawlessly execute your disaster recovery planning. Assign stakeholders as representatives on the project planning committee team once approval is obtained from the corporate executive team.

Making sure each department understands its specific role: In the event of a major disaster, each of your departments must be prepared to take immediate action. In order to successfully recover your mission-critical database systems with minimal loss, each team must understand the BCP and DRP plans, as well as follow them correctly. Furthermore, it is also important to maintain your DRP and BCP plans and to train your IT staff members periodically, so that they respond quickly and correctly in an emergency. Such "smoke tests" to train and keep your IT staff members up to date on the correct procedures and communications will pay major dividends in the event of an unforeseen disaster.

One useful tool for creating and managing BCP plans is available from the National Institute of Standards and Technology (NIST). The NIST documentation can be used to generate templates that serve as an excellent starting point for your Business Continuity and Disaster Recovery planning. We highly recommend that you download and review the following NIST publication for creating and evaluating BCP plans, Contingency Planning Guide for Information Technology Systems, which is available online at http://csrc.nist.gov/publications/nistpubs/800-34/sp800-34.pdf.
Additional NIST documents may also provide insight into how best to manage new or current BCP or DRP plans. A complete listing of NIST publications is available online at http://csrc.nist.gov/publications/PubsSPs.html.
Packt
28 Dec 2011
14 min read

Drools Integration Modules: Spring Framework and Apache Camel

Setting up Drools using Spring Framework

In this recipe, you will see how to configure the Drools business rules engine with the Spring Framework, using the integration module created specifically to configure the Drools beans with XML.

How to do it...

Carry out the following steps in order to configure a Drools project using the Spring Framework integration:

Add the following dependency to your Maven project by adding this XML snippet to the pom.xml file:

    <dependency>
      <groupId>org.drools</groupId>
      <artifactId>drools-spring</artifactId>
      <version>5.2.0.Final</version>
    </dependency>

Once the drools-spring module and the Spring Framework dependencies are added to your project, it's time to write the rules that are going to be included in the knowledge base:

    package drools.cookbook.chapter07

    import drools.cookbook.chapter07.model.Server
    import drools.cookbook.chapter07.model.Virtualization

    rule "check minimum server configuration"
    dialect "mvel"
    when
        $server : Server(processors < 2 || memory<=1024 || diskSpace<=250)
    then
        System.out.println("Server \"" + $server.name + "\" was rejected by don't apply the minimum configuration.");
        retract($server);
    end

    rule "check available server for a new virtualization"
    dialect "mvel"
    when
        $virtualization : Virtualization($virtMemory : memory, $virtDiskSpace : diskSpace)
        $server : Server($memory : memory, $diskSpace : diskSpace, virtualizations !=null)
        Number((intValue + $virtMemory) < $memory) from accumulate (Virtualization($vmemory : memory) from $server.virtualizations, sum($vmemory))
        Number((intValue + $virtDiskSpace) < $diskSpace) from accumulate(Virtualization($vdiskSpace : diskSpace) from $server.virtualizations, sum($vdiskSpace))
    then
        $server.addVirtualization($virtualization);
        retract($virtualization);
    end

Then a Spring application context XML file has to be created to configure the Drools beans, with the following code:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:drools="http://drools.org/schema/drools-spring"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
             http://drools.org/schema/drools-spring org/drools/container/spring/drools-spring-1.2.0.xsd
             http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

      <drools:grid-node id="node1" />

      <drools:resource id="resource1" type="DRL" source="classpath:drools/cookbook/chapter07/rules.drl" />

      <drools:kbase id="kbase1" node="node1">
        <drools:resources>
          <drools:resource ref="resource1" />
        </drools:resources>
      </drools:kbase>

      <drools:ksession id="ksession1" type="stateful" kbase="kbase1" node="node1" />
      <drools:ksession id="ksession2" type="stateless" kbase="kbase1" node="node1" />
    </beans>

After these three steps, you are ready to load the XML file using the Spring Framework API and obtain the instantiated beans to interact with the knowledge sessions:

    ClassPathXmlApplicationContext applicationContext =
        new ClassPathXmlApplicationContext("applicationContext.xml");
    applicationContext.start();

    StatefulKnowledgeSession ksession1 =
        (StatefulKnowledgeSession) applicationContext.getBean("ksession1");

    Server debianServer = new Server("debian-1", 4, 2048, 250, 0);
    ksession1.insert(debianServer);
    ksession1.fireAllRules();

    applicationContext.stop();

In your Maven project, add the dependencies required to use JPA persistence with the Spring Framework to the pom.xml file.

Implement the java.io.Serializable interface in the objects of your domain model that will be persisted.
Create a persistence.xml file inside the resources/META-INF folder to configure the persistence unit. In this recipe, we will use an embedded H2 database for testing purposes, but you can configure it for any relational database engine:

    <?xml version="1.0" encoding="UTF-8"?>
    <persistence version="1.0"
                 xmlns="http://java.sun.com/xml/ns/persistence"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="
                   http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd
                   http://java.sun.com/xml/ns/persistence/orm http://java.sun.com/xml/ns/persistence/orm_1_0.xsd">
      <persistence-unit name="drools.cookbook.spring.jpa" transaction-type="RESOURCE_LOCAL">
        <provider>org.hibernate.ejb.HibernatePersistence</provider>
        <class>org.drools.persistence.info.SessionInfo</class>
        <properties>
          <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect" />
          <property name="hibernate.max_fetch_depth" value="3" />
          <property name="hibernate.hbm2ddl.auto" value="create" />
          <property name="hibernate.show_sql" value="false" />
        </properties>
      </persistence-unit>
    </persistence>

Now, we have to create an XML file named applicationContext.xml in the resources folder, in which we are going to define the beans needed to configure the JPA persistence and the Drools beans:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:drools="http://drools.org/schema/drools-spring"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
             http://drools.org/schema/drools-spring org/drools/container/spring/drools-spring-1.2.0.xsd
             http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

      <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="org.h2.Driver" />
        <property name="url" value="jdbc:h2:tcp://localhost/Drools" />
        <property name="username" value="sa" />
        <property name="password" value="" />
      </bean>

      <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="dataSource" ref="dataSource" />
        <property name="persistenceUnitName" value="drools.cookbook.spring.jpa" />
      </bean>

      <bean id="txManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
      </bean>

      <drools:grid-node id="node1" />

      <drools:kstore id="kstore1" />

      <drools:resource id="resource1" type="DRL" source="classpath:drools/cookbook/chapter07/rules.drl" />

      <drools:kbase id="kbase1" node="node1">
        <drools:resources>
          <drools:resource ref="resource1" />
        </drools:resources>
      </drools:kbase>

      <drools:ksession id="ksession1" type="stateful" kbase="kbase1" node="node1">
        <drools:configuration>
          <drools:jpa-persistence>
            <drools:transaction-manager ref="txManager" />
            <drools:entity-manager-factory ref="entityManagerFactory" />
          </drools:jpa-persistence>
        </drools:configuration>
      </drools:ksession>
    </beans>

Finally, we have to write the following code in a new Java class file, or in an existing one, in order to interact with the stateful knowledge session and persist its state into the H2 database without further actions:

    public void startApplicationContext() {
        ClassPathXmlApplicationContext applicationContext =
            new ClassPathXmlApplicationContext("/applicationContext.xml");
        applicationContext.start();

        StatefulKnowledgeSession ksession1 =
            (StatefulKnowledgeSession) applicationContext.getBean("ksession1");
        // remember the session ID so the persisted session can be reloaded later
        int sessionId = ksession1.getId();

        Server debianServer = new Server("debianServer", 4, 2048, 1222, 0);
        ksession1.insert(debianServer);
        ksession1.fireAllRules();
        ksession1.dispose();

        Environment env = KnowledgeBaseFactory.newEnvironment();
        env.set(EnvironmentName.ENTITY_MANAGER_FACTORY,
            applicationContext.getBean("entityManagerFactory"));
        env.set(EnvironmentName.TRANSACTION_MANAGER,
            applicationContext.getBean("txManager"));

        Virtualization virtualization = new Virtualization("dev", "debian", 512, 30);

        KnowledgeStoreService kstore =
            (KnowledgeStoreService) applicationContext.getBean("kstore1");
        KnowledgeBase kbase1 = (KnowledgeBase) applicationContext.getBean("kbase1");

        // reload the persisted session by its ID and keep working with it
        ksession1 = kstore.loadStatefulKnowledgeSession(sessionId, kbase1, null, env);
        ksession1.insert(virtualization);
        ksession1.fireAllRules();

        applicationContext.stop();
    }

How it works...

In order to use the Spring Framework integration in your project, you first have to add the drools-spring module to it. In a Maven project, you can do this by adding the following snippet to your pom.xml file:

    <dependency>
      <groupId>org.drools</groupId>
      <artifactId>drools-spring</artifactId>
      <version>5.2.0.Final</version>
    </dependency>

This dependency will transitively include the required Spring Framework libraries in the Maven dependencies. Currently, the integration is done using the 2.5.6 version, but it should work with the newest version as well.

Now, we are going to skip the rule authoring step, because it's a very common task that you should know how to do at this point, and move forward to the beans configuration. As you know, the Spring Framework is configured through an XML file in which the beans are defined and injected into one another, and to make Drools declarations easy the integration module provides a schema and custom parsers. Before starting the bean configuration, the schema must be added to the XML namespace declarations; otherwise, the Spring XML Bean Definition Reader will not recognize the Drools tags and exceptions will be thrown.
In the following code lines, you can see the namespace declarations that are needed before you start writing the bean definitions:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:drools="http://drools.org/schema/drools-spring"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
             http://drools.org/schema/drools-spring org/drools/container/spring/drools-spring-1.2.0.xsd
             http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

      <!-- define your beans here -->

    </beans>

After this, the Drools beans can be added inside the XML configuration file using the friendly <drools:/> tags:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:drools="http://drools.org/schema/drools-spring"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
             http://drools.org/schema/drools-spring org/drools/container/spring/drools-spring-1.2.0.xsd
             http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

      <drools:grid-node id="node1" />

      <drools:resource id="resource1" type="DRL" source="classpath:drools/cookbook/chapter07/rules.drl" />

      <drools:kbase id="kbase1" node="node1">
        <drools:resources>
          <drools:resource ref="resource1" />
        </drools:resources>
      </drools:kbase>

      <drools:ksession id="ksession1" type="stateful" kbase="kbase1" node="node1" />
    </beans>

As you can see, there is only one stateful knowledge session bean, configured by using the <drools:ksession> tag with the ksession1 ID. This ksession1 bean is injected with a knowledge base and a grid node so that the Drools Spring bean factories, which are provided by the integration module, can instantiate it.
Once the Drools beans are configured, it's time to instantiate them using the Spring Framework API, as you usually do:

    public static void main(String[] args) {
        ClassPathXmlApplicationContext applicationContext =
            new ClassPathXmlApplicationContext("applicationContext.xml");
        applicationContext.start();

        StatefulKnowledgeSession ksession1 =
            (StatefulKnowledgeSession) applicationContext.getBean("ksession1");

        Server debianServer = new Server("debian-1", 4, 2048, 250, 0);
        ksession1.insert(debianServer);
        ksession1.fireAllRules();

        applicationContext.stop();
    }

In the Java main method, a ClassPathXmlApplicationContext object instance is used to load the bean definitions, and once the beans are successfully instantiated they can be obtained using the getBean(beanId) method. At this point, the Drools beans are instantiated and you can start interacting with them as usual by just obtaining their references.

As you saw in this recipe, the Spring Framework integration provided by Drools is pretty straightforward and allows the creation of a complete integration, thanks to its custom tags and simple configuration.

See also

For more information about the Drools bean definitions, read the Spring Integration section in the official documentation available at http://www.jboss.org/drools/documentation.

Configuring JPA to persist our knowledge with Spring Framework

How to do it...

Carry out the following steps in order to configure the Drools JPA persistence using the Spring module integration:

How it works...

Before we start declaring the beans that are needed to persist the knowledge using JPA, we have to add some dependencies to our project configuration, especially the ones used by the Spring Framework. These dependencies were already described in the first step of the previous section, so we can safely continue with the remaining steps.
Once the dependencies are added into the project, we have to implement the java.io.Serializable interface in the classes of our domain model that will be persisted. After this, we have to create a persistence unit configuration by using the default persistence.xml file located in the resources/META-INF directory of our project. This persistence unit is named drools.cookbook.spring.jpa and uses the Hibernate JPA implementation. It is configured to use an H2 Java database, but in your real environment you should supply the appropriate configuration. Next, you will see the persistence unit example, with the annotated SessionInfo entity that will be used to store the session data, which is ready to be used with Drools:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="1.0"
             xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:orm="http://java.sun.com/xml/ns/persistence/orm"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="
               http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd
               http://java.sun.com/xml/ns/persistence/orm http://java.sun.com/xml/ns/persistence/orm_1_0.xsd">
  <persistence-unit name="drools.cookbook.spring.jpa" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <class>org.drools.persistence.info.SessionInfo</class>
    <properties>
      <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect" />
      <property name="hibernate.max_fetch_depth" value="3" />
      <property name="hibernate.hbm2ddl.auto" value="create" />
      <property name="hibernate.show_sql" value="false" />
    </properties>
  </persistence-unit>
</persistence>

Now, we are ready to declare the beans that are needed to enable the JPA persistence with an XML file. The most important section is the declaration of the Spring DriverManagerDataSource and LocalContainerEntityManagerFactoryBean beans, which are self-explanatory and can be configured with the parameters of your database engine.
Also, one of the most important declarations is the KnowledgeStoreService bean, declared using the <drools:kstore> tag, which will be primarily used to load the persisted knowledge session:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:drools="http://drools.org/schema/drools-spring"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
         http://drools.org/schema/drools-spring org/drools/container/spring/drools-spring-1.2.0.xsd
         http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">
  <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.h2.Driver" />
    <property name="url" value="jdbc:h2:tcp://localhost/Drools" />
    <property name="username" value="sa" />
    <property name="password" value="" />
  </bean>
  <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="persistenceUnitName" value="drools.cookbook.spring.jpa" />
  </bean>
  <bean id="txManager" class="org.springframework.orm.jpa.JpaTransactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
  </bean>
  <drools:grid-node id="node1" />
  <drools:kstore id="kstore1" />
  <drools:resource id="resource1" type="DRL" source="classpath:drools/cookbook/chapter07/rules.drl" />
  <drools:kbase id="kbase1" node="node1">
    <drools:resources>
      <drools:resource ref="resource1" />
    </drools:resources>
  </drools:kbase>
  <drools:ksession id="ksession1" type="stateful" kbase="kbase1" node="node1">
    <drools:configuration>
      <drools:jpa-persistence>
        <drools:transaction-manager ref="txManager" />
        <drools:entity-manager-factory ref="entityManagerFactory" />
      </drools:jpa-persistence>
    </drools:configuration>
  </drools:ksession>
</beans>

After the bean definitions, we can start writing the Java code needed to initialize the Spring Framework application context and interact with the defined beans.
After loading the application context by using a ClassPathXmlApplicationContext object, we have to obtain the stateful knowledge session to insert the facts into the working memory, and also obtain the ID of the knowledge session to recover it later:

ClassPathXmlApplicationContext applicationContext = new ClassPathXmlApplicationContext("/applicationContext.xml");
applicationContext.start();
StatefulKnowledgeSession ksession1 = (StatefulKnowledgeSession) applicationContext.getBean("ksession1");
// Keep the session ID so that the session can be reloaded later
int sessionId = ksession1.getId();
Server debianServer = new Server("debianServer", 4, 2048, 1222, 0);
ksession1.insert(debianServer);
ksession1.fireAllRules();
ksession1.dispose();

Once we are done interacting with the knowledge session (inserting facts, firing the rules, and so on), it can be disposed. It can be restored later using the KnowledgeStoreService bean, but before trying to load the persisted knowledge session we have to create a new org.drools.runtime.Environment object to set the EntityManagerFactory and TransactionManager used in the persistence process. The org.drools.runtime.Environment object can be created as follows:

Environment env = KnowledgeBaseFactory.newEnvironment();
env.set(EnvironmentName.ENTITY_MANAGER_FACTORY, applicationContext.getBean("entityManagerFactory"));
env.set(EnvironmentName.TRANSACTION_MANAGER, applicationContext.getBean("txManager"));
Virtualization virtualization = new Virtualization("dev", "debian", 512, 30);

Finally, with the Environment object created, we can obtain the KnowledgeStoreService bean together with the KnowledgeBase bean, and use them with the StatefulKnowledgeSession ID to load the stored state and start interacting with it as we usually do:

KnowledgeStoreService kstore = (KnowledgeStoreService) applicationContext.getBean("kstore1");
KnowledgeBase kbase1 = (KnowledgeBase) applicationContext.getBean("kbase1");
// Reload the persisted session by its ID
ksession1 = kstore.loadStatefulKnowledgeSession(sessionId, kbase1, null, env);
ksession1.insert(virtualization);
ksession1.fireAllRules();
applicationContext.stop();

As you saw in this recipe, the knowledge session persistence is totally transparent to the user and automatic, without any extra steps to save the state. By following these steps you can easily integrate JPA persistence using Hibernate, or any other vendor's JPA implementation, in order to save the current state of the knowledge session using the Spring Framework integration.

Packt
07 Oct 2009
9 min read

Getting a Jump-Start with IronPython

Where do you get it? Before getting started, you’ll need to download IronPython 2.0.1 (though the contents of this article could just as easily be applied to past or even future versions). The official IronPython site is www.codeplex.com/ironpython. Here you’ll find not only the IronPython bits, but also samples, source code, documentation and many other resources. After clicking on the "Downloads" tab at the top you will be presented with three download options: IronPython.msi, IronPython-2.0.1-Bin.zip (the binaries) or IronPython-2.0.1-Src.zip (the source code). If you already have CPython installed—the standard Python implementation—the binaries are probably your best bet. You simply unzip the files to your preferred installation directory and you’re done. If you don’t have CPython installed, I recommend the IronPython.msi file since it comes prepackaged with portions of the CPython Standard Library. Figure 1. IronPython installation directory. There are a few items I would like to highlight in the IronPython installation directory displayed in Figure 1. The first is the FAQ.html file. This covers all of your basic IronPython questions, from licensing questions to implementation details. Periodically reviewing this while you’re learning IronPython will probably save you a lot of frustration. The second item of importance is the two executables, ipy.exe and ipyw.exe. As you probably guessed, these are what you use to launch IronPython; ipy.exe is used for scripts and console applications while ipyw.exe is reserved for other types of applications (Windows Forms, WPF, etc). Lastly, I’d like to draw your attention to the Tutorial folder. Inside the Tutorial folder, you’ll find a Tutorial.html file in addition to a number of other files. The Tutorial.html file is a comprehensive review of what you need to know to get started with IronPython. If you want to be quickly productive, be sure to at least review the tutorial. It will answer many of your questions. 
Visual Studio or a Text Editor?

One thing that neither the ReadMe nor the Tutorial really covers is the tooling story. While Visual Studio 2008 is a viable Python development tool, you may want to consider other options. Personally, I bounce between VS and SciTE, but I’m always watching for new tools that might improve my development experience. There are a number of IDEs and debuggers out there and you owe it to yourself to investigate them. Sometimes, however, Visual Studio IS the right tool for the job. If that’s the case then you’ll need to install the Visual Studio SDK from http://www.microsoft.com/downloads/details.aspx?FamilyID=30402623-93ca-479a-867c-04dc45164f5b&displaylang=en.

Let’s Write Some Code!

To get started, let’s create a simple Python script and execute it with ipy. In a new file called "sample.py" (Python files are indicated by a ".py" extension), type "print 'Hello, world'". Open a command window, navigate to the directory where you saved sample.py and then call ipy.exe, passing "sample.py" as an argument. Figure 2 displays what you might expect to see in the console window.

Figure 2. Executing a script using the command line

Executing from the command line works, but I prefer a more efficient approach. Therefore I’m going to use SciTE, an editor I briefly mentioned earlier, to duplicate the example in Figure 2. Why SciTE? I get syntax highlighting, I can run my code simply by hitting F5, and the stdout is redirected to SciTE’s output window. In short, I never have to leave my coding environment. If you performed the above "hello, world" example in SciTE, the result would look like Figure 3.

Figure 3. Executing a script using SciTE

Congratulations! You’ve written your first bit of Python code! The problem is it doesn’t really touch any .NET namespaces. Fortunately, this is not a difficult thing to do. Figure 4 shows all the code you need to start working with the System namespace.

Figure 4. You only need a single line of code to gain access to the System namespace

With that simple import statement, we now enjoy access to the entirety of the System namespace. For example, to access the String class we would simply type System.String. That’s great for getting started, but what happens when we want to use something like the Regex class? Do we have to type System.Text.RegularExpressions.Regex?

Figure 5. Using .NET regular expressions from IronPython

No! The first line of Figure 5 introduces a new form of the import statement that only imports the specific items you want. In our case, we only want the Regex class. The code in line 3 demonstrates creating a new instance of the Regex class. Note the lack of a "new" keyword. Python considers "new" redundant since you have to include parentheses anyway. Another interesting note is the syntax (or is it the lack of syntax?) for creating a variable. There’s no declaration statement or type required. We simply create a name that we set equal to a new instance of the Regex class. If you’ve ever written any PHP or classic ASP, this should feel pretty familiar to you. Finally, the print statement on line 6 produces the output shown in Figure 6.

Figure 6. Output from the Regex example

The last example was easy because IronPython already holds a reference to System and mscorlib. Let’s push our limits and create a simple Windows form. This requires a bit more work.

Figure 7. Using the clr module to add a reference

A Quick Review of Python Classes

Figure 7 introduces the clr module as a way of adding references to other libraries in the Global Assembly Cache (GAC). Once we have a reference, we can import the Form, TextBox and Button classes so we can start constructing our GUI. Before we do that though, we have a couple of concepts we need to cover.

Figure 8. Introducing classes and methods

Up until this point, we really haven’t needed to create any classes or methods.
But now that we need to create a form we’re going to need both. Figure 8 demonstrates a very simple class and an equally simple method. I think it’s clear that the "class" keyword defines a class and the "def" keyword defines a method. You probably correctly assumed that "(object)" after "MyClass" is Python’s way of expressing inheritance. The "pass" keyword, however, may not be immediately obvious. In Python, classes and methods cannot be empty. Therefore, if you have a class or method that you aren’t quite ready to code yet, you can use the "pass" statement until you are ready. A more subtle characteristic of Figure 8 is the whitespace. In Python we indent the contents of control structures with four spaces. Tabs will also work, but by convention four spaces are used. In our example above, since "my_method" has no preceding spaces, it’s clear that "my_method" is not part of "MyClass". So how would we make "my_method" a class method? Logically, you would think that simply deleting the "pass" statement under "MyClass" and indenting "my_method" would be enough, but that isn’t the case. There’s one more addition we need to make. Figure 9. Creating a class method As Figure 9 demonstrates, we need to pass "self" as a parameter to "my_method". The first—and sometimes the only—parameter in a class method’s parameter list must always be an instance of the containing class. By convention, this parameter should be named "self", though you could call it anything you’d like. Why the extra step? That’s because Python values the explicit over the implicit. Hiding this detail from the developer is at odds with Python’s philosophy. Creating a Windows Form Now that we have an understanding of classes, methods and whitespace, Figure 10 continues our example from Figure 7 by creating a blank form. Figure 10. Creating a blank form The code in Figure 10 should be fairly understandable. We create the "MyForm" class by inheriting from "System.Windows.Forms.Form". 
We create a new instance of "MyForm" and pass the resulting object to the "Application.Run()" method. The only thing that may give you pause is the "__init__()" method. The "__init__()" method is what’s called a magic method. Magic methods are designated with double underscores on either end of the method name and are rarely called directly. For instance, when the code in Line 10 of Figure 10 executes, the "__init__()" method defined in "MyForm" is actually being called behind the scenes.

Figure 11. Populating the form with controls and handling an event

Figure 11 adds a lot of code to our application, most of which isn’t very interesting. The exception here is the Click event of the goButton. In C#, the method would get passed as an argument in the constructor of a new EventHandler. In IronPython, we simply add a function with the proper signature to the Click event. Now that we have a button that will respond to a click, Figure 12 shows a modified version of our regular expression sample code from earlier inserted into the click method. Note that the "__str__()" magic method is the equivalent of ToString().

Figure 12. Populating click with our regular expression example

When you run the code, you should see the form displayed in Figure 13. You can enter dates into the top textbox, press the button, and either True or False will appear in the lower textbox, indicating the result of the IsMatch() function.

Figure 13. Completed form

Conclusion

During the course of one brief article, you went from knowing little of IronPython to using it to build Windows Forms. We were able to move so quickly because we leveraged your existing .NET knowledge. We spent most of our time talking about the very intuitive Python syntax. Go through sample or even production code you've written in the past and duplicate it in IronPython. You’ll find working with familiar .NET libraries will speed your learning process, making it more fun.
Before you know it, Python will become second-nature!
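The class concepts reviewed in this article (the "pass" placeholder, the explicit "self" parameter, and the "__init__()" and "__str__()" magic methods) can be tried out without any .NET references. The following is only a minimal sketch: Placeholder, DateEntry, is_blank and my_function are hypothetical names chosen for illustration and are not taken from the article's figures.

```python
class Placeholder(object):
    # "pass" keeps an otherwise empty class body syntactically valid.
    pass


class DateEntry(object):
    def __init__(self, text):
        # __init__ is called behind the scenes when DateEntry(...) is evaluated.
        self.text = text

    def is_blank(self):
        # "self" is the instance the method was called on; it must be listed
        # explicitly as the first parameter.
        return len(self.text.strip()) == 0

    def __str__(self):
        # __str__ plays the role that ToString() plays in .NET.
        return "DateEntry(%s)" % self.text


def my_function():
    # No leading indentation: this function is not part of any class.
    return "not part of any class"


# Instances are created by calling the class; there is no "new" keyword.
entry = DateEntry("2009-10-07")
summary = str(entry)
```

Calling str(entry) is what triggers "__str__()", in the same way that calling DateEntry(...) triggers "__init__()". The sketch sticks to syntax that is valid in both Python 2 and Python 3, so it should behave the same under ipy.exe and CPython.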

Packt
03 Nov 2010
5 min read

Using the OSGi Bundle Repository in OSGi and Apache Felix 3.0

OSGi and Apache Felix 3.0 Beginner's Guide

Introduction

The OSGi Bundle Repository (OBR) is a draft specification from the OSGi Alliance for a service that allows access to a set of remote bundle repositories. Each remote repository, potentially a front for a federation of repositories, provides a list of bundles available for download, along with some additional information related to them. The OBR repository can be accessed either through a defined API to a remote service or as a direct connection to an XML repository file. The bundles declared in an OBR repository can then be downloaded and installed to an OSGi framework like Felix. We will go through this install process a bit later. The OSGi specification for OBRs is currently in the draft state, which means that it may change before it is released.

The following diagram shows the elements related to the OBR, in the context of the OSGi framework:

The OBR bundle exposes a service that is registered with the framework. This interface can be used by other components on the framework to inspect repositories, download bundles, and install them. The Gogo command bundle also registers commands that interact with the OBR service to achieve the same purpose. Later in this article, we will cover those commands. API-based interaction with the service is not covered, as it is beyond the scope of this article. The OBR service currently implements remote XML repositories only. However, the Repository interface defined by the OBR service can be implemented for other potential types of repositories as well as for a direct API integration.

There are a few OSGi repositories out there. Here are some examples:

Apache Felix: http://felix.apache.org/obr/releases.xml
Apache Sling: http://sling.apache.org/obr/sling.xml
Paremus: http://sigil.codecauldron.org/spring-external.obr and http://sigil.codecauldron.org/spring-release.obr

Those may be of use later, as a source for the dependencies of your project.
The repository XML Descriptor

We already have an OBR repository available to us, our releases repository. You'll rarely need to look into the repository XML file. However, doing so is a good validation step when investigating issues with the deploy/install process. Let's inspect some of its contents:

<repository lastmodified='20100905070524.031'>

Not included above in the automatically created repository file is the optional repository name attribute. The repository contains a list of resources that it makes available for download. Here, we're inspecting the entry for the bundle com.packtpub.felix.bookshelf-inventory-api:

<resource id='com.packtpub.felix.bookshelf-inventory-api/1.4.0'
    symbolicname='com.packtpub.felix.bookshelf-inventory-api'
    presentationname='Bookshelf Inventory API'
    uri='file:/C:/projects/felixbook/releases/com/packtpub/felix/com.packtpub.felix.bookshelf-inventory-api/1.4.0/com.packtpub.felix.bookshelf-inventory-api-1.4.0.jar'
    version='1.4.0'>
  <description>Defines the API for the Bookshelf inventory.</description>
  <size>7781</size>
  <category id='sample'/>
  <capability name='bundle'>
    <p n='symbolicname' v='com.packtpub.felix.bookshelf-inventory-api'/>
    <p n='presentationname' v='Bookshelf Inventory API'/>
    <p n='version' t='version' v='1.4.0'/>
    <p n='manifestversion' v='2'/>
  </capability>
  <capability name='package'>
    <p n='package' v='com.packtpub.felix.bookshelf.inventory.api'/>
    <p n='version' t='version' v='0.0.0'/>
  </capability>
  <require name='package' filter='(&amp;(package=com.packtpub.felix.bookshelf.inventory.api))' extend='false' multiple='false' optional='false'>
    Import package com.packtpub.felix.bookshelf.inventory.api
  </require>
</resource>

Notice the bundle location (attribute uri), which points to where the bundle can be downloaded, relative to the base repository location. The presentationname is used when listing the bundles and the uri is used to get the bundle when a request to install it is issued.
Inside the main resource entry tag are further bundle characteristics: a description of its capabilities, its requirements, and so on. Although the same information is included in the bundle manifest, it is also included in the repository XML for quick access during validation of the environment, before the actual bundle is downloaded. For example, the package capability elements describe the packages that this bundle exports:

<capability name="package">
  <p n="package" v="com.packtpub.felix.bookshelf.inventory.api"/>
  <p n="version" t="version" v="0.0.0"/>
</capability>

The require elements describe the bundle requirements from the target platform:

<require extend="false" filter="(&amp;(package=com.packtpub.felix.bookshelf.inventory.api)(version&gt;=0.0.0))" multiple="false" name="package" optional="false">
  Import package com.packtpub.felix.bookshelf.inventory.api
</require>
</resource>
<!-- ... -->
</repository>

The preceding excerpts correspond to the Export-Package and Import-Package manifest headers, respectively. Each bundle may have more than one entry in the repository XML: an entry for every deployed version.

Updating the OBR repository

The Felix Maven Bundle Plugin attaches to the deploy phase to automate the bundle deployment and the update of the repository.xml file.

Using the OBR scope commands

The Gogo Command bundle registers a set of commands for interacting with the OBR service. Those commands allow registering repositories, listing their bundles, and requesting their download and installation. Let's look at those commands in detail.
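When validating a deployment, it can help to list what a repository file declares without reading the XML by eye. The following is a minimal sketch that does this with Python's standard xml.etree.ElementTree module; it is not the OBR service API. The embedded document is a trimmed version of the resource entry shown earlier, the shortened uri value is a placeholder, and list_resources is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the resource entry discussed above (uri shortened, placeholder).
REPOSITORY_XML = """
<repository lastmodified='20100905070524.031'>
  <resource id='com.packtpub.felix.bookshelf-inventory-api/1.4.0'
            symbolicname='com.packtpub.felix.bookshelf-inventory-api'
            presentationname='Bookshelf Inventory API'
            uri='com.packtpub.felix.bookshelf-inventory-api-1.4.0.jar'
            version='1.4.0'>
    <capability name='bundle'>
      <p n='symbolicname' v='com.packtpub.felix.bookshelf-inventory-api'/>
      <p n='version' t='version' v='1.4.0'/>
    </capability>
    <capability name='package'>
      <p n='package' v='com.packtpub.felix.bookshelf.inventory.api'/>
      <p n='version' t='version' v='0.0.0'/>
    </capability>
  </resource>
</repository>
"""


def list_resources(xml_text):
    """Return one dict per <resource>, including its exported packages."""
    root = ET.fromstring(xml_text.strip())
    resources = []
    for resource in root.findall("resource"):
        exported = []
        for capability in resource.findall("capability"):
            # Only 'package' capabilities correspond to Export-Package entries.
            if capability.get("name") != "package":
                continue
            props = dict((p.get("n"), p.get("v")) for p in capability.findall("p"))
            exported.append((props.get("package"), props.get("version")))
        resources.append({
            "symbolicname": resource.get("symbolicname"),
            "presentationname": resource.get("presentationname"),
            "version": resource.get("version"),
            "uri": resource.get("uri"),
            "exports": exported,
        })
    return resources


resources = list_resources(REPOSITORY_XML)
```

Each returned dictionary carries the attributes the OBR uses when listing a bundle (presentationname) and when fetching it (uri), plus the exported packages read from the package capability elements.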
Packt
05 Apr 2017
16 min read

IT Operations Management

In this article by Ajaykumar Guggilla, the author of the book ServiceNow IT Operations Management, we will learn about the ITOM capabilities within ServiceNow, which include:

Dependency views
Cloud management
Discovery
Credentials

ServiceNow IT Operations Management overview

Every organization and business focuses on key strategies, some of which include:

Time to market
Agility
Customer satisfaction
Return on investment

Information technology is heavily involved in supporting these strategic goals, either directly or indirectly, by providing the underlying IT services with the required IT infrastructure. IT infrastructure includes networks, servers, routers, switches, desktops, laptops, and much more. IT supports these infrastructure components, enabling the business to achieve its goals. IT continuously supports the IT infrastructure and its components with a set of governance, processes, and tools, which is called IT Operations Management.

IT cares for and feeds the business, and the business expects reliability from the services provided by IT to support the underlying business services. The business in turn cares for and feeds its customers, who expect satisfaction from the services offered to them, without service disruption.

Regardless of the tools used, it is important to understand the underlying relationship between IT, businesses, and customers. Providing the underlying infrastructure and associated components is not enough. To support the business effectively and efficiently, IT needs to understand how the infrastructure components and processes are aligned and associated with the business services, so that it can understand the business impact of an incident, problem, event, or change arising from an IT infrastructure component.
IT needs a consolidated and complete view of the dependencies between the business and the customers, covering the technology used, the processes followed, and the infrastructure components involved. IT needs a connected way to understand how these technology components relate to one another, so that it can proactively stop possible outages before they occur and handle change in the environment.

The business, on the other hand, expects service reliability so that it can deliver its business services to customers. There is a huge financial impact when a business is unable to provide the agreed service levels to its customers. So there is always pressure from the business on IT to provide a reliable service, whatever technology or processes are used.

Customers, as always, expect satisfaction from the services provided by the business, and at times they are adversely affected by service outages caused by the IT infrastructure. Customer satisfaction is also a key strategic goal if the business is to sustain itself in a competitive market.

IT is also expected to integrate with the customer's infrastructure components to provide a holistic view of the IT infrastructure, so that it can support the business effectively by identifying and fixing problems before they become outages, thereby increasing the reliability of the IT services delivered.

Most tools do not understand the context of the Service-Oriented Architecture (SOA) that connects the business services to the impacted IT infrastructure components, which is what allows IT to support the business effectively and to justify the cost and impact of providing an end-to-end service. Most traditional tools perform only certain aspects of the ITOM functions; some do so partially, and some support integration with the IT Service Management (ITSM) tool suite.
The missing integration piece between the traditional tools and a full-blown cloud solution platform is the SOA. ServiceNow, a cloud-based solution, applies a true SOA focus that brings together the ITOM suite, leveraging the native data, and that is also able to connect to the customer infrastructure to provide a holistic, end-to-end view of the IT service at a given point in time. With ServiceNow, IT has a complete view of the business service and technical dependencies in real time, leveraging powerful individual capabilities, applications, and plugins within ServiceNow ITOM.

ServiceNow ITOM comprises the following applications and capabilities. Some of the plugins, applications, and technology might require separate licensing to be purchased:

Management, Instrumentation, and Discovery (MID) Server: Helps to establish communication and data movement between ServiceNow and the external corporate network and applications
Credentials: Stores credentials, including usernames, passwords, or certificates, in an encrypted field on the credentials table that is leveraged by ServiceNow discovery
Service mapping: Discovers and maps the relationships between IT components that comprise specific business services, even in dynamic, virtualized environments
Dependency views: Graphically displays an infrastructure view with the relationships of configuration items and the underlying business services
Event management: Provides a holistic view of all the events that are triggered from various event monitoring tools
Orchestration: Helps in automating IT and business processes for operations management
Discovery: Works with MID Server and explores the IT infrastructure environment to discover the configuration items and populate the Configuration Management Database (CMDB)
Cloud management: Helps to easily manage third-party cloud providers, which include AWS, Microsoft Azure, and VMware clouds

Understanding ServiceNow IT Operations Management components

Now that we have covered what ITOM is about, focusing on the ServiceNow ITOM capabilities, let's dive deeper and explore each capability.

Dependency views

Maps have become so important in everyday life; imagine a world without GPS devices or electronic maps. Previously, hard copies of maps were widely available to help us get to a place, and there were also special maps for utilities and other public service agencies to identify the impact of digging a tunnel, a water pipe, or an underground electric cable. These maps help them to identify the impact of making a change to the ground. Maps also help us to understand the relationships between states, countries, cities, and streets, with different sets of information in real time, including real-time traffic information that shows accidents, construction work, and so on.

Dependency views are similar to real-life navigation maps. They provide a map of the relationships between the IT infrastructure components and the business services that are defined under the scope. Like real-time traffic updates on a map, the dependency views show the active incidents, changes, and problems reported in real time on an individual configuration item or infrastructure component.

Changes happen frequently in the environment. Some of these changes are handled with legacy knowledge of how the individual components are connected to the business services, through the service mapping plugin, down to the individual component level.
Making a change without understanding the relationships between each IT infrastructure component might adversely affect the service levels and impact the business service. ServiceNow dependency views provide a snapshot of how the underlying business service is connected to individual Configuration Item (CI) elements. Drilling down to the individual CI elements provides a view of associated service operations and service transition data that includes incidents logged against on a given CI, any underlying problem reported against the given CI, and also changes associated with the given CI. Dependency views are based on D3 and Angular technology that provides a graphical view of configuration items and their relationships. The dependency views provide a view of the CI and their relationships, in order to get a perspective from a business stand point you will need to enable the service mapping plugin. Having a detailed view of how the individual CI components are connected from the Business service to the CI components compliments the change management to perform effective impact analysis before any changes are made to the respective CI: Image source: wiki.servicenow.com A dependency map starts with a root node, which is usually termed as a root CI that is grayed out with a gray frame. Relationships start building up and they map from the upstream and downstream dependencies of the infrastructure components that are scoped to discover by the ServiceNow auto discovery. Administrators have the control of the number of levels to display on the dependency maps. It is also easy to manage the maps that allow creating or modifying existing relationships right from the map that posts the respective changes to the CMDB automatically. Each of the CI component of the dependency maps have an indicator that shows any active and pending issues against a CI that includes any incidents, problems, changes, and any events associated with the respective configuration item. 
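To make the idea of a root CI, upstream and downstream relationships, and a configurable number of displayed levels more concrete, here is a small sketch of a depth-limited walk over a relationship graph. It is purely illustrative and assumes nothing about ServiceNow's internal data model or APIs: the DEPENDS_ON data, the CI names, and the dependency_levels and impacted_by functions are all hypothetical.

```python
from collections import deque

# Hypothetical relationship data: each CI maps to the CIs it depends on (downstream).
DEPENDS_ON = {
    "Online Store (business service)": ["web-server-01", "web-server-02"],
    "web-server-01": ["db-server-01"],
    "web-server-02": ["db-server-01"],
    "db-server-01": ["storage-array-01"],
    "storage-array-01": [],
}


def dependency_levels(root_ci, relationships, max_levels):
    """Breadth-first walk from root_ci; returns {ci: level} up to max_levels hops."""
    levels = {root_ci: 0}
    queue = deque([root_ci])
    while queue:
        ci = queue.popleft()
        if levels[ci] == max_levels:
            continue  # do not expand beyond the configured number of levels
        for related in relationships.get(ci, []):
            if related not in levels:
                levels[related] = levels[ci] + 1
                queue.append(related)
    return levels


def impacted_by(changed_ci, relationships):
    """CIs that directly or indirectly depend on changed_ci (upstream impact)."""
    upstream = {}
    for parent, children in relationships.items():
        for child in children:
            upstream.setdefault(child, []).append(parent)
    reachable = dependency_levels(changed_ci, upstream, len(relationships))
    return sorted(ci for ci in reachable if ci != changed_ci)


# Map limited to two levels below the root CI.
two_levels = dependency_levels("Online Store (business service)", DEPENDS_ON, 2)
# Everything a change to the database server could affect.
impact = impacted_by("db-server-01", DEPENDS_ON)
```

Limiting the walk to two levels leaves the storage array off the map, while the upstream walk from the database server reaches both web servers and the business service, which is the kind of impact analysis a dependency map is meant to support before a change is approved.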
Cloud management

In versions prior to Helsinki, there was no direct way to manage cloud instances; people had to create orchestration scripts to manage the cloud instances and also create custom roles. Managing and provisioning has become easy with the ServiceNow cloud management application. The cloud management application integrates seamlessly with the ServiceNow service catalog and also provides automation capability with orchestration workflows. It fully integrates the life cycle management of virtual resources into standard ServiceNow data collection, management, analytics, and reporting capabilities. The ServiceNow cloud management application provides easy and quick options for key cloud providers, which include:

AWS Cloud: Manages Amazon Web Services (AWS) resources
Microsoft Azure Cloud: Integrates with Azure through the service catalog and provides the ability to manage virtual resources easily
VMware Cloud: Integrates with VMware vCenter to manage the virtual resources by integrating with the service catalog

The following figure describes a high-level architecture of the cloud management application:

Key features of the cloud management application include the following:

A single pane of glass to manage the virtual services in public and private cloud environments, including approvals, notifications, security, asset management, and so on
Ability to repurpose configurations through resource templates that help to reuse capability sets
Seamless integration with the service catalog; with a defined workflow and approvals, integration can be done end to end, from the user request to the cloud provisioning
Ability to control leased resources through date controls and role-based security access
Ability to use the ServiceNow discovery application or the standalone capability to discover virtual resources and their relationships in their environments
Ability to determine the best virtualization server for a VM based on the data discovered by the CMDB auto discovery
Ability to control and manage virtual resources effectively with a controlled termination or shutdown date
Ability to increase virtual server resources, for example storage or memory, in a controlled fashion by integrating with the service catalog and obtaining the appropriate approvals
Ability to perform a price calculation and to integrate managed virtual machines with asset management
Ability to provision the required cloud environment automatically or manually, with zero-click options

There are different roles within the cloud management application. Here are some of them:

Virtual provisioning cloud administrator: Owns the cloud admin portal and its end-to-end management, including configuration of the cloud providers. Administrators can configure the service catalog items that will be used by the requesters and the approvals required to provision the cloud environment.
Virtual provisioning cloud approver: Either approves or rejects requests for virtual resources.
Virtual provisioning cloud operator: Fulfills the requests to manage the virtual resources and the respective cloud management providers. Cloud operators are mostly involved when manual intervention is required to manage or provision the virtual resources.
Virtual provisioning cloud user: Has access to the My Virtual Assets portal, which helps users to manage the virtual resources they own, have requested, or are responsible for.

How clouds are provisioned

1. The cloud administrator creates a service catalog item that users can use to request cloud resources.
2. The cloud user requests a virtual machine through the service catalog.
3. The request goes to the approver, who either approves or rejects it.
4. The cloud operator provisions the request manually, or the virtual resources are auto provisioned.

Discovery

Imagine how an atlas is compiled: places are mapped using satellites, survey maps, manual exploration, and street-mapping collector devices. These devices crawl through all the streets to collect data points about the streets, houses, and much more. Consumers then use this information for various purposes, including GPS navigation, exploring different areas, finding the address of a location, and checking a route for incidents, construction, road closures, and so on. ServiceNow discovery works the same way: it explores the enterprise network, identifying the devices in scope. Discovery probes and sensors collect information about the infrastructure devices connected to a given enterprise network. Discovery uses Shazzam probes to determine which TCP ports are open and whether the device responds to SNMP queries. It then uses probes and sensors to explore any given computer or device, starting with basic probes and then using more specific probes as it learns more. Discovery first determines the type of device; for each type of device, it uses different kinds of probes to extract more information about the computer or device and the software that is running on it. The CMDB is updated, or data is federated, through ServiceNow discovery. Discovered devices are identified by searching the CMDB for a CI that matches the CI discovered on the network.
What happens next is defined by the administrator in the discovery configuration: when a matching CI is found, the existing CI is updated in the CMDB; otherwise, a new CI is created in the CMDB. Discovery can be scheduled to scan at set intervals, so configuration management keeps the status of each CI up to date through discovery. During discovery, the MID Server asks the ServiceNow instance which probes to run, executes them, and returns the results to the ServiceNow instance or the CMDB for processing. No data is retained on the MID Server. The data collected by these probes is processed by sensors. ServiceNow is hosted in ServiceNow data centers spread across the globe. On its own, the ServiceNow application cannot communicate with a given enterprise network. Traditionally, there are two different types of discovery tools on the market:

Agent: A piece of software is installed on the servers or individual systems and sends all information about the system to the CMDB.
Agentless: Usually doesn't require any installation on individual systems or components. A single system or piece of software probes and senses the network, scanning it and federating the CMDB.

ServiceNow discovery is agentless: it does not require software to be installed on individual systems and uses the MID Server instead. Discovery is available as a separate subscription from the rest of the ServiceNow platform and requires the discovery plugin. MID Server is a Java application that runs on any Windows, UNIX, or Linux system that resides within the enterprise network that needs to be discovered. MID Server is the bridge and communicator between the ServiceNow instance, which sits somewhere in the cloud, and the enterprise network, which is secured and controlled. MID Server uses several techniques to probe devices without using agents.
Depending on the type of infrastructure component, MID Server uses the appropriate protocol to gather information from it. For example, MID Server uses Simple Network Management Protocol (SNMP) to gather information from network devices and SSH to connect to UNIX systems. The following table shows the different ServiceNow discovery probe types:

Device | Probe type
Windows computers and servers | Remote WMI queries, shell commands
UNIX and Linux servers | Shell command (via SSH protocol)
Storage | CIM/WBEM queries
Printers | SNMP queries
Network gear (switches, routers, and so on) | SNMP queries
Web servers | HTTP header examination
Uninterruptible Power Supplies (UPS) | SNMP queries

Credentials

ServiceNow discovery and orchestration features require credentials to access the enterprise network; these credentials vary by network and device. Credentials such as usernames, passwords, and certificates need a secure place to be stored. The ServiceNow credentials application stores them in an encrypted format in a dedicated credentials table. Credential tagging allows workflow creators to assign individual credentials to any activity in an orchestration workflow, or to assign different credentials to each occurrence of the same activity type in an orchestration workflow. Credential tagging also works with credential affinities. Credentials can be assigned an order value, which determines the sequence in which discovery and orchestration try them when orchestration attempts to run a command or discovery tries to query a device. The credentials table can contain many credentials. Based on usage patterns, the credentials application keeps track of which credential worked: after the first successful connection, the system knows which credential to use, so discovery and orchestration can log on to the device faster the next time.
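The interplay of order values and credential affinity can be illustrated with a short Python sketch. This is only a model of the behaviour described above, not the platform's implementation: the credential records, the `affinity` dictionary, and the `try_login` callback are all hypothetical stand-ins.

```python
def pick_credential(ip, credentials, affinity, try_login):
    """Return the name of the first credential that works for ip, or None."""
    # Start from the configured order values (lowest first).
    ordered = sorted(credentials, key=lambda cred: cred["order"])
    remembered = affinity.get(ip)
    # Stable sort: the remembered credential (key False) moves to the
    # front, the rest keep their order-value sequence.
    ordered.sort(key=lambda cred: cred["name"] != remembered)
    for cred in ordered:
        if try_login(ip, cred):
            affinity[ip] = cred["name"]  # remember what worked for this device
            return cred["name"]
    return None


def make_fake_login(working_name, attempts):
    """Build a stand-in for a real connection attempt that records each try."""
    def try_login(ip, cred):
        attempts.append((ip, cred["name"]))
        return cred["name"] == working_name
    return try_login


# Hypothetical credential records; only "order" and "name" matter here.
CREDENTIALS = [
    {"name": "linux-readonly", "type": "ssh", "order": 200},
    {"name": "linux-admin", "type": "ssh", "order": 100},
]

if __name__ == "__main__":
    attempts, affinity = [], {}
    login = make_fake_login("linux-readonly", attempts)
    pick_credential("10.0.0.5", CREDENTIALS, affinity, login)
    pick_credential("10.0.0.5", CREDENTIALS, affinity, login)
    print(attempts)
```

On the first run both credentials are tried in order; on the second run only the remembered one is tried, which is the speed-up that affinity is meant to provide.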
Image source: wiki.servicenow.com

Credentials are encrypted automatically with a fixed instance key when they are submitted or updated in the credentials (discovery_credentials) table. When credentials are requested by the MID Server, the platform decrypts the credentials using the following process:

1. The credentials are decrypted on the instance with the password2 fixed key.
2. The credentials are re-encrypted on the instance with the MID Server's public key.
3. The credentials are encrypted on the load balancer with SSL.
4. The credentials are decrypted on the MID Server with SSL.
5. The credentials are decrypted on the MID Server with the MID Server's private key.

The ServiceNow credential application integrates with the CyberArk credential storage. The MID Server integration with the CyberArk vault enables orchestration and discovery to run without storing any credentials on the ServiceNow instance. The instance maintains a unique identifier for each credential, the credential type (such as SSH, SNMP, or Windows), and any credential affinities. The MID Server obtains the credential identifier and IP address from the instance, and then uses the CyberArk vault to resolve these elements into a usable credential. The CyberArk integration requires the external credential storage plugin, which is available by request. The CyberArk integration supports these ServiceNow credential types:

CIM
JMS
SNMP community
SSH
SSH private key (with key only)
VMware
Windows

Orchestration activities that use these network protocols support the use of credentials stored on a CyberArk vault:

SSH
PowerShell
JMS
SFTP

Summary

In this article, we covered an overview of ITOM and explored different ServiceNow ITOM components, including the high-level architecture and functional aspects of discovery, credentials, dependency views, and cloud management.

Packt
21 Sep 2010
12 min read

Linking Your Customers to Your SugarCRM

Surely, the most important goal of any CRM system is to make your customers feel positive about your company and to make them feel that exciting things are happening at your company, such as the following:

That the employees they are in contact with are caring and well-informed
That new and better information systems are coming into place
That your company is responsive to product and service issues, and cares about its customers

Limiting CRM system access to only the employees of a business will certainly affect the first of the aforementioned items positively, but not necessarily the other items. To really improve a customer's perception of your organization, one of the biggest improvements you can make is to allow customers to interact almost directly with your CRM system. Some of the activities that make this possible are as follows:

Capturing customer leads and requests for information from the public website directly within the CRM system.
Efficiently tracking customer service requests and related product/service flaws to help improve your offerings and customer satisfaction.
Developing a customer self-service portal in conjunction with the CRM system to allow clients to file their own service cases, check on the latest status of a case, and update their own customer profile.

Most of us in our own lives can forgive or understand if a family member, friend, or supplier lets us down a bit, or makes a mistake, as long as they communicate with us honestly and effectively. In addition, with early detection of any errors, corrective action can always be put in place more quickly. Integrating your CRM system more directly with your customer is no more complicated than this: promoting more effective, more accurate, and timely communications with your customers. The net effect of such actions is that your customers feel informed, valued, and empowered.
Capturing leads from your website

Capturing leads from your company's website directly into your CRM is one of the greatest early initiatives you can implement in terms of streamlining business processes to save time and effort. This section will guide you through the manner in which this can be accomplished with SugarCRM. In the past, setting up a process similar to the one just described would have required the expertise and assistance of a programmer and your webmaster. Coordinating everyone's efforts to accomplish the goal would sometimes become a task in and of itself. Days may have elapsed before your lead capture form finally made it up to your website. Fortunately, those days are behind us. SugarCRM includes a tool that allows you to quickly and easily create a form that you can use to capture leads from your website. Through this tool, you will be able to select the fields corresponding to the data you wish to capture and also create a ready-to-use web form. Let us set up a web lead capture form through SugarCRM's tool.

The lead capture tool is specifically designed to import data into the Leads module only. Should you choose not to use the Leads module, or you wish to use a similar technique to capture data within a different module, you should use SugarCRM's SOAP API to accomplish the task.

To begin the setup process, hover over the Marketing tab and select Campaigns. On the shortcuts menu on the left-hand side, click on Create Lead Form, as highlighted in the following image: After clicking on it, you will see a screen that permits you to select the fields you wish to capture through your form, as illustrated in the following screenshot: The field selection process is quite simple. On the leftmost column of the three that are presented, you will see a list of all the fields corresponding to the Leads module (including custom fields).
To select a field for your form, simply drag-and-drop it from the field listing on the left onto one of the two rightmost columns. It is best to visualize the layout of the form that will be produced as one similar to the edit or detail view layouts. Fields can appear next to each other, horizontally or vertically, but only within one of two columns. Most organizations prefer the vertical approach, which is the technique we will apply. However, feel free to experiment. Proceed to select the fields to match the preceding image, plus any other fields you may wish to include. Note that required fields are marked with an asterisk, as they are within the Edit view screen. You must make sure to include all your required fields to ensure that the process will work as expected. In addition, you will notice that we have selected the Lead Source field. Doing so will allow website visitors to make the appropriate selection corresponding to what drove them to your site. Click on Next once you are satisfied with your field selection. Now you need to set some final parameters, as illustrated in the following image: You will undoubtedly want to modify the Form Header. This value corresponds to the title of the page that website visitors will see in their browser, so you will want to tailor it to reflect something a bit friendlier than the generic text. The form we are building is no different than any other web form you may have encountered in your day-to-day web browsing. As such, it too will include a button for visitors to click and send the data they typed in. If you prefer the label of the button to read as something other than the default label of Submit, change the Submit Button Label accordingly. The Redirect URL and Related Campaign fields are also quite important. 
The former is used to specify a URL that a visitor will be sent to after clicking on the Submit button on your lead capture form, while the latter is used to associate a particular marketing campaign to the form. Establishing this relationship is critical as it will help you properly measure the effectiveness of your marketing efforts. Lastly, the Assigned To option allows you to define a user to whom the Leads will be assigned upon being entered into SugarCRM. You may want to consider creating a specific user, such as WebCapture, and assigning the Leads to that user. Doing so will permit you to quickly identify records that entered your system through the web lead capture tool versus other means. Click on Generate Form after you have applied your edits and you should see something similar to the following: The default form should now be presented within SugarCRM's HTML editor. This is a handy capability as it allows you to manipulate the look and feel of the form to make it conform to the already existing look and feel of your website. However, you may wish to ignore that, as additional options allow you to more easily integrate it into your website. To access those features and save the form, click on the Save Web To Lead Form button. SugarCRM provides the convenience of a fully formatted, ready-to-use web form which can be downloaded by clicking on the Web To Lead Form link. However, if you prefer, you may copy the code displayed in the box and then embed it into one of your already existing pages. The second approach would save you the hassle of having to modify the cosmetic aspects of the default page to match your site. To start receiving data into your SugarCRM system, simply place the form on your web server, fill out the fields and submit the form. Make sure that the server on which it is placed is able to access your SugarCRM system or it will not function. 
You can test it by opening the form in your web browser and submitting data, as shown in the following image: Assuming everything is working as expected, the records will automatically appear within the leads module of your SugarCRM system without any intervention on your part or that of other users. In addition, e-mail notifications of new records will automatically be sent to the defined assigned user to inform them of the new entry so they may act upon it. Through the use of add-on modules, like SierraCRM's Process Manager, further actions, like the scheduling of follow up calls, can also be automated. Remember, all of this can happen automatically and herein we begin to see the real benefits of a CRM system. There are few things quite as satisfying as driving along in the car, and receiving an e-mail on your BlackBerry telling you that a new lead has been received. Especially, when you know that it all happened automatically! From a process perspective, the concept of having every new lead automatically entered into the CRM system makes it quick and easy to convert that lead into a contact, enter details of new sales opportunities, or include them in e-mail marketing campaigns—all without any data transcription errors, or lost leads, due to human errors. One note of caution: most lead capture sites capture as much as 50% bad data. Some visitors to your site will enter anything they fancy in the form; potentially polluting your database. This highlights another reason why it is beneficial to enter them by utilizing a username such as WebCapture. Doing so would allow you to easily filter leads to only show those created by WebCapture and in turn allow you to cleanse them, either by deleting them or performing other data integrity checks. Customer self-service portals After automating the lead capture process, the most common step that follows in linking your customers into your CRM system is the self-service portal. 
Just as it sounds, this is a software system that enables your customers to exchange information with your organization in a completely autonomous manner. In this initial implementation, we will show you how to implement a system that allows customers to submit and manage service cases directly within your CRM system. Most of us have had the experience of needing to contact a call center to address a customer service issue or other matters. Usually, that process involves staying on hold for some time. If you are lucky, the time that you stay on hold is not long, but at the same time, spending 30 to 45 minutes on hold or being transferred around is not unheard of. To make matters worse, you usually need to make these calls during normal business hours, meaning you are not able to tend to your normal work while you are burning time on hold. The fundamental capability that the self-service portal provides is empowering customers by allowing them to contact you at a time that is most convenient to them. Customers are no longer bound to specific business hours, nor must they wait in a call queue or navigate a maze of phone options. If they need your company's help to resolve an issue, they simply go to your website and submit their issue. Likewise, customers do not need to contact you directly to check in on their previously submitted cases. They simply visit your website again and they will be able to review their cases. This functionality works hand-in-hand with the Cases module that is built into SugarCRM. Typically, users would leverage this module to track service calls that they receive from customers. Through this functionality, all members of the organization are kept up-to-date on any issues that a customer may be experiencing at any given time. The Bug Tracker module complements the Cases module quite well by providing a central repository where all known product flaws can be tracked.
In turn, all cases resulting from any of these flaws can be related to a given bug, allowing you to measure the impact it is having on your customers. Together, they can be used as very effective tools for not only providing customer service, but also prioritizing product development needs and improving customer satisfaction. However, that process can be inefficient, as it relies on a user to enter the data to produce a case in the first place. Empowering the customer in such a way that allows them to directly interact with the Cases module not only makes it easier for you to get feedback and become aware of problems, but it also gives customers the feeling that you care to hear what they have to say about their problems. That is the goal that the self-service portal hopes to accomplish. Self-service portal configuration Before we get too deep into the specifics of configuring and using the self-service portal, you must first understand some important boundaries. First, although this is a built-in feature of the Enterprise Edition of SugarCRM, it is not a feature of Community Edition. To obtain this functionality, we must use the combination of a SugarCRM add-on available on SugarExchange.com, plus an open source CMS (Content Management System) named Joomla! If you are already using another CMS package or cannot use Joomla! for other reasons, you will not be able to utilize the functionality described in this exercise. The second and last important note is that, at the time of this writing, the add-on did not support versions of SugarCRM Community Edition higher than 5.2. Now that we have a clear understanding of some important limitations, let us begin the process of deploying this feature. Installing Joomla! Assuming you have already installed SugarCRM Community Edition on the target server, you have already established the perfect environment for installing the Joomla! CMS package. Like SugarCRM, it too leverages the LAMP or WAMP system software platforms. 
Just like SugarCRM, it is also an open source application. You can download Joomla! from the project's site, located at http://www.joomla.org. Our exercise will use version 1.5 of Joomla! (Full Package). It is assumed that you have already successfully downloaded and installed it onto your server. If you require help with the process, visit the Joomla! website to review its documentation and obtain further assistance. Assuming Joomla! is operational, proceed to access the administrator page. It should resemble the following: Let us leave it at the admin page for now.

Packt
08 Oct 2009
5 min read

Liferay Mail and SMS Text Messenger Portlet

Working with Mail Portlet

For the purpose of this article, we will use an intranet website called book.com, which is created for a fictitious company named "Palm Tree Publications". In order to let employees manage their emails, we can use the Liferay Mail portlet. As an administrator of "Palm Tree Publications", you need to create a Page called "Mail" under the Page "Community" at the Book Lovers Community Public Pages, and also add the Mail portlet to the Page "Mail".

Experiencing Mail Management

First of all, log in as "Palm Tree". Then, let's do the above as follows:

1. Add a Page called "Mail" under the Page "Community" at the Book Lovers Community Public Pages, if the Page is not already present.
2. If the Mail portlet is not already present, add it to the Page "Mail" of the Book Lovers Community, where you want to manage mails.

You will see the Mail portlet as shown in the following figure. Let's assume that we set up the mail domain as "cignex.com" in the Enterprise Admin portlet for testing purposes, as we have a mail engine with this mail domain already. Of course, you could set up the mail domain as "book.com", or something else, if you had a mail engine with this mail domain at hand.

As an editor of the editorial department, "Lotti Stein", you may want to manage your mails in the mail domain "cignex.com". You can first choose a user name for your personal company email address, say "admin1234", and register. Let's do it as follows:

1. Log in as "Lotti Stein".
2. Go to the Page "Mail" under the Page "Community" at the Book Lovers Community Public Pages.
3. Locate the Mail portlet.
4. Input the value for User name as "admin1234".
5. Click the Register button.

Your new email address is "admin1234@cignex.com". This email address will also serve as your login, as shown in the following figure. You can now check for new messages in your inbox, by clicking the Inbox link first.
Then, you can view the Unread messages, and either Check Mail or create a New mail, as shown in the following figure: You can go to the Mail management Page by clicking on any of the Unread Messages links, the Check Mail button, or the New button. Further, you can manage emails through the Mail portlet of your current account. Email management includes the following features (as shown in the following figure):

Create a New email.
Check Mail.
Reply to an email.
Reply to all.
Forward emails.
Delete emails.
Print emails.
Search.

Note that the first email, with the subject "Users in Staging server", is sent through the SMS Text Messenger portlet. For more details, refer to the forthcoming section.

How to Set up Mail Server?

In order to make the Mail portlet work, we have to set up a mail server with the IMAP, POP, and SMTP protocols. Suppose that the Enterprise "Palm Tree Publications" has a mail server with the domain "exg3.exghost.com", an account "admin@cignex.com/admin1234", and the protocols IMAP, POP, and SMTP. As an administrator, you need to integrate this mail server with Liferay. Let's do it as follows:

1. Find the file ROOT.xml in $TOMCAT_DIR/conf/Catalina/localhost.
2. Find the mail configuration first. Then configure it as follows:

    <!-- Mail -->
    <Resource
        name="mail/MailSession"
        auth="Container"
        type="javax.mail.Session"
        mail.imap.host="exg3.exghost.com"
        mail.imap.port="143"
        mail.pop.host="exg3.exghost.com"
        mail.pop.port="110"
        mail.store.protocol="imap"
        mail.transport.protocol="smtp"
        mail.smtp.host="exg3.exghost.com"
        mail.smtp.port="2525"
        mail.smtp.auth="true"
        mail.smtp.starttls.enable="true"
        mail.smtp.user="admin@cignex.com"
        password="admin1234"
        mail.smtp.socketFactory.class="javax.net.ssl.SSLSocketFactory"
    />

In short, the Mail portlet is an AJAX web-mail client. We can configure it to work with any mail server. It reduces page refreshes, since it displays message previews and message lists in a dual pane window.

How to Set up Mail Portlet?
If you have proper Permissions, you can change the preferences of the Mail portlet. To change the preferences, simply click the Preferences icon at the upper right of the Mail portlet. With the Recipients tab selected, you can find potential recipients from the Directory (Enabled or Disabled) and the Organization (My Organization or All Available). Click the Save button after making any changes. Using the Filters tab, you can set the values to filter emails associated with an email address to a Folder. Click the Save button after making any changes. Note that the maximum number of email addresses is ten. This number is also configurable in the portal-ext.properties file. Similarly, the Forward Address tab allows all emails to be forwarded to the email address you want. Enter one email address per line. Remove all entries to disable email forwarding. Select Yes to leave, or No to not leave, a copy of the forwarded message. Click the Save button after making any changes. Further, the Signature tab allows you to set up your signature using the HTML text editor. The signature you have set up will be added to each outgoing message. Click the Save button after making any changes. The Vacation Message tab allows you to set up vacation messages using the HTML text editor. The vacation message notifies others of your absence (as shown in the following figure). Click the Save button after making any changes.
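Returning briefly to the mail server setup: a mistyped or run-together Resource element in ROOT.xml is an easy mistake to make, so it can help to check the fragment before restarting Tomcat. The following Python sketch is one way to do that. It is not part of Liferay or Tomcat; the function name and the list of attributes treated as required are assumptions made for illustration, and the sample values are the ones used in this article.

```python
import xml.etree.ElementTree as ET

# Assumption: the minimum set of attributes this sketch treats as required.
REQUIRED = (
    "mail.imap.host",
    "mail.smtp.host",
    "mail.smtp.port",
    "mail.store.protocol",
    "mail.transport.protocol",
)


def missing_mail_settings(resource_xml):
    """Return required attribute names that are absent from a <Resource> element."""
    # ET.fromstring raises ET.ParseError if the fragment is not well-formed,
    # for example when the spaces between attributes have been lost.
    element = ET.fromstring(resource_xml)
    if element.tag != "Resource":
        raise ValueError("expected a <Resource> element")
    return [name for name in REQUIRED if name not in element.attrib]


SAMPLE = """
<Resource
    name="mail/MailSession"
    auth="Container"
    type="javax.mail.Session"
    mail.imap.host="exg3.exghost.com"
    mail.imap.port="143"
    mail.store.protocol="imap"
    mail.transport.protocol="smtp"
    mail.smtp.host="exg3.exghost.com"
    mail.smtp.port="2525"
/>
"""

if __name__ == "__main__":
    print(missing_mail_settings(SAMPLE))
```

An empty list means that every attribute the sketch looks for is present; it says nothing about whether the host names, ports, or credentials are actually correct for your mail server.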
Packt
23 Aug 2013
18 min read

Networking Performance Design

Device and I/O virtualization involves managing the routing of I/O requests between virtual devices and the shared physical hardware. Software-based I/O virtualization and management, in contrast to a direct pass through to the hardware, enables a rich set of features and simplified management. With networking, virtual NICs and virtual switches create virtual networks between virtual machines that are running on the same host, without the network traffic consuming bandwidth on the physical network. NIC teaming consists of multiple physical NICs and provides failover and load balancing for virtual machines. Virtual machines can be seamlessly relocated to different systems by using VMware vMotion, while keeping their existing MAC addresses and the running state of the VM. The key to effective I/O virtualization is to preserve these virtualization benefits while keeping the added CPU overhead to a minimum. The hypervisor virtualizes the physical hardware and presents each virtual machine with a standardized set of virtual devices. These virtual devices effectively emulate well-known hardware and translate the virtual machine requests to the system hardware. This standardization on consistent device drivers also helps with virtual machine standardization and portability across platforms, because all virtual machines are configured to run on the same virtual hardware, regardless of the physical hardware in the system. In this article we will do the following:

Describe various network performance problems
Discuss the causes of network performance problems
Propose solutions to correct network performance problems

Designing a network for load balancing and failover for vSphere Standard Switch

The load balancing and failover policies that are chosen for the infrastructure can have an impact on the overall design. Using NIC teaming we can group several physical network adapters attached to a vSwitch.
This grouping enables load balancing between the different physical NICs and provides fault tolerance if a card or link failure occurs. Network adapter teaming offers a number of load balancing and load distribution options. Load balancing is load distribution based on the number of connections, not on network traffic. In most cases, load is managed only for the outgoing traffic, and balancing is based on three different policies:

Route based on the originating virtual switch port ID (default)
Route based on the source MAC hash
Route based on IP hash

Also, we have two network failure detection options:

Link status only
Beacon probing

Getting ready

To step through this recipe, you will need one or more running ESXi hosts, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

To select the load balancing policy that is right for your environment, along with the appropriate failover policy, follow these steps:

1. Open up your VMware vSphere Client.
2. Log in to the vCenter Server.
3. On the left-hand side, choose any ESXi Server and select the Configuration tab in the right-hand pane.
4. Click on the Networking section and select the vSwitch for which you want to change the load balancing and failover settings. You may wish to override this at the port group level as well.
5. Click on Properties.
6. Select the vSwitch and click on Edit.
7. Go to the NIC Teaming tab.
8. Select one of the available policies from the Load Balancing drop-down menu.
9. Select one of the available policies from the Network Failover Detection drop-down menu.
10. Click on OK to apply the changes.

How it works...

Route based on the originating virtual switch port ID (default)

In this configuration, load balancing is based on the number of physical network cards and the number of virtual ports used.
With this configuration policy, a virtual network card connected to a vSwitch port will always use the same physical network card. If a physical network card fails, the virtual network card is redirected to another physical network card.

You typically do not see the individual ports on a vSwitch. However, each vNIC that gets connected to a vSwitch implicitly uses a particular port on the vSwitch. (There is never a reason to configure which port, because that is always done automatically.)

This policy does a reasonable job of balancing your egress uplinks for the traffic leaving an ESXi host, as long as all the virtual machines using these uplinks have similar usage patterns.

It is important to note that port allocation occurs only when a VM is started or when a failover occurs. Balancing is done based on a port's occupation rate at the time the VM starts up. This means that the pNIC used by a VM is determined when the VM powers on, based on which ports in the vSwitch are occupied at that time. For example, if you started 20 VMs in a row on a vSwitch with two pNICs, the odd-numbered VMs would use the left pNIC and the even-numbered VMs would use the right pNIC, and that would persist even if you shut down all the even-numbered VMs; the left pNIC would then have all the VMs and the right pNIC would have none. It might also happen that two heavily loaded VMs are connected to the same pNIC, so the load is not balanced.

This policy is the easiest one to operate, and we always favor the simplest policy because it keeps operations simple.

When using this policy, it is important to understand that if, for example, a team is created with two 1 Gbps cards and one VM needs more than one card's capacity, a performance problem will arise, because traffic greater than 1 Gbps will not go through the other card, and there will be an impact on the VMs sharing the same physical port as the VM consuming all the resources.
Likewise, if two VMs each wish to use 600 Mbps and they happen to go to the first pNIC, the first pNIC cannot meet the 1.2 Gbps demand no matter how idle the second pNIC is.

Route based on source MAC hash

The principle is the same as for the default policy, but the distribution is based on MAC addresses. This policy may put those VM vNICs on the same physical uplink, depending on how the MAC hash is resolved. For MAC hash, VMware has a different way of assigning uplinks. It is not based on the dynamically changing port (after a power off and power on, the VM usually gets a different vSwitch port assigned), but on the fixed MAC address. As a result, a VM is always assigned to the same physical NIC unless the configuration is changed. With the port ID policy, the VM could get a different pNIC after a reboot or vMotion. If you have two ESXi Servers with the same configuration, the VM will stay on the same pNIC number even after a vMotion. But again, one pNIC may be congested while others sit idle, so there is no real load balancing.

Route based on IP hash

The limitation of the two previously discussed policies is that a given virtual NIC will always use the same physical network card for all its traffic. IP hash-based load balancing uses the source and destination IP addresses to determine which physical network card to use. Using this algorithm, a VM can communicate through several different physical network cards, depending on its destination. This option requires the physical switch's ports to be configured as an EtherChannel. Because the physical switch is configured similarly, this option is the only one that also provides inbound load distribution, although the distribution is not necessarily balanced.

There are some limitations, and reasons why this policy is not commonly used:

  • The route based on IP hash load balancing option involves added complexity and configuration support from upstream switches.
  • Link Aggregation Control Protocol (LACP) or EtherChannel is required for this algorithm to be used. Note, however, that LACP is not available on a vSphere Standard Switch, which supports only static EtherChannel.
  • For IP hash to be an effective algorithm for load balancing, there must be many IP sources and destinations. This is not usually the case for IP storage networks, where a single VMkernel port is used to access a single IP address on a storage device.
  • The same vNIC will always send all its traffic to the same destination (for example, google.com) through the same pNIC, though traffic to another destination (for example, bing.com) might go through another pNIC.

So, in a nutshell, due to the added complexity, the upstream dependency on the advanced switch configuration, and the management overhead, this configuration is rarely used in production environments. The main reason is that if you use IP hash, the pSwitch must be configured with LACP or EtherChannel. Conversely, if you use LACP or EtherChannel, the load balancing algorithm must be IP hash. This is because with LACP, inbound traffic to the VM could come through either of the pNICs, and the vSwitch must be ready to deliver that to the VM; only IP hash will do that (the other policies will drop the inbound traffic to this VM that comes in on a pNIC that the VM doesn't use).

There are only two network failover detection options:

Link status only

The link status option enables the detection of failures related to the physical network's cables and switch. However, be aware that configuration issues are not detected. This option also cannot detect link state problems with upstream switches; it works only with the first-hop switch from the host.

Beacon probing

The beacon probing option allows the detection of failures unseen by the link status option, by sending Ethernet broadcast frames through all the network cards.
These network frames allow the vSwitch to detect faulty configurations or upstream switch failures and force a failover if the ports are blocked. When using an inverted U physical network topology in conjunction with a dual-NIC server, it is recommended to enable link state tracking or a similar network feature in order to avoid traffic black holes. According to VMware's best practices, it is recommended to have at least three cards before activating beacon probing.

However, if IP hash is going to be used, beacon probing should not be used for network failure detection, in order to avoid an ambiguous state due to the limitation that a packet cannot hairpin on the port on which it was received. Beacon probing works by sending out and listening to beacon probes from the NICs in a team. If there are two NICs, each NIC sends out a probe and the other NIC receives it. Because an EtherChannel is considered one link, this will not function properly, as the NIC uplinks are not logically separate uplinks. If beacon probing is used, this can result in MAC address flapping errors, and network connectivity may be interrupted.

Designing a network for load balancing and failover for vSphere Distributed Switch

The load balancing and failover policies that are chosen for the infrastructure can have an impact on the overall design. Using NIC teaming, we can group several physical network adapters attached to a vSwitch. This grouping enables load balancing between the different physical NICs, and provides fault tolerance if a card failure occurs.

The vSphere distributed vSwitch offers a load balancing option that actually takes the network workload into account when choosing the physical uplink. This is route based on physical NIC load, also called Load Based Teaming (LBT). We recommend this load balancing option over the others when using a distributed vSwitch.
Benefits of using this load balancing policy are as follows:

  • It is the only load balancing option that actually considers NIC load when choosing uplinks.
  • It does not require upstream switch configuration, as the route based on IP hash algorithm does.
  • When route based on physical NIC load is combined with Network I/O Control, a truly dynamic traffic distribution is achieved.

Getting ready

To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

To change the load balancing policy, select the right one for your environment, and select the appropriate failover policy, follow these steps:

1. Open up your VMware vSphere Client.
2. Log in to the vCenter Server.
3. Navigate to Networking on the home screen.
4. Navigate to a Distributed Port group, right-click, and select Edit Settings.
5. Click on the Teaming and Failover section.
6. From the Load Balancing drop-down menu, select Route based on physical NIC load as the load balancing policy.
7. Choose the appropriate network failover detection policy from the drop-down menu.
8. Click on OK and your settings will be effective.

How it works...

Load based teaming, also known as route based on physical NIC load, maps vNICs to pNICs and remaps the vNIC to pNIC affiliation if the load exceeds specific thresholds on a pNIC. LBT uses the originating port ID load balancing algorithm for the initial port assignment, which results in the first vNIC being affiliated to the first pNIC, the second vNIC to the second pNIC, and so on. Once the initial placement is done after the VM is powered on, LBT examines both the inbound and outbound traffic on each of the pNICs and then redistributes the load if there is congestion. LBT will send a congestion alert when the average utilization of a pNIC exceeds 75 percent over a period of 30 seconds.
The 30-second interval is used to avoid MAC flapping issues. However, you should enable PortFast on the upstream switches if you plan to use STP. VMware recommends LBT over IP hash when you use vSphere Distributed Switch, as it does not require any special or additional settings in the upstream switch layer. In this way you can reduce unnecessary operational complexity. LBT maps vNICs to pNICs and then distributes the load across all the available uplinks, unlike IP hash, which just maps the vNIC to pNIC but does not do load distribution. So it may happen that while a VM with high network I/O is sending traffic through pNIC0, another VM is mapped to the same pNIC and sends its traffic through it as well.

What to know when offloading checksum

VMware takes advantage of many of the performance features of modern network adapters. In this section we are going to talk about two of them:

  • TCP checksum offload
  • TCP segmentation offload

Getting ready

To step through this recipe, you will need a running ESXi Server and an SSH client (PuTTY). No other prerequisites are required.

How to do it...

The list of network adapter features that are enabled on your NIC can be found in the file /etc/vmware/esx.conf on your ESXi Server. Look for the lines that start with /net/vswitch. However, do not change the default NIC driver settings unless you have a valid reason to do so. A good practice is to follow any configuration recommendations that are specified by the hardware vendor. Carry out the following steps to check the settings:

1. Open up your SSH client and connect to your ESXi host.
2. Open the file /etc/vmware/esx.conf.
3. Look for the lines that start with /net/vswitch.

[Screenshot: the /net/vswitch lines in /etc/vmware/esx.conf]

How it works...

A TCP message must be broken down into Ethernet frames. The maximum payload size of each frame is set by the maximum transmission unit (MTU). The default MTU is 1500 bytes.
The process of breaking messages into frames is called segmentation.

Modern NIC adapters have the ability to perform checksum calculations natively. TCP checksums are used to determine the validity of transmitted or received network packets, based on an error-detecting code. These calculations are traditionally performed by the host's CPU. By offloading these calculations to the network adapters, the CPU is freed up to perform other tasks. As a result, the system as a whole runs better.

TCP segmentation offload (TSO) allows the TCP/IP stack of the guest OS inside the VM to emit large frames (up to 64 KB) even though the MTU of the interface is smaller. Earlier operating systems used the CPU to perform segmentation. Modern NICs optimize TCP segmentation by using a larger segment size as well as offloading work from the CPU to the NIC hardware. ESXi utilizes this concept to provide a virtual NIC with TSO support, without requiring specialized network hardware.

With TSO, instead of processing many small MTU frames during transmission, the system can send fewer, larger virtual MTU frames. TSO improves performance for the TCP network traffic coming from a virtual machine and for network traffic sent out of the server.

TSO is supported at the virtual machine level and in the VMkernel TCP/IP stack. TSO is enabled on the VMkernel interface by default. If TSO becomes disabled for a particular VMkernel interface, the only way to enable TSO is to delete that VMkernel interface and recreate it with TSO enabled.

TSO is used in the guest when the VMXNET 2 (or later) network adapter is installed. To enable TSO at the virtual machine level, you must replace the existing VMXNET or flexible virtual network adapter with a VMXNET 2 (or later) adapter. This replacement might result in a change in the MAC address of the virtual network adapter.

Selecting the correct virtual network adapter

When you configure a virtual machine, you can add NICs and specify the adapter type.
The types of network adapters that are available depend on the following factors:

  • The version of the virtual machine, which depends on which host created it or most recently updated it.
  • Whether or not the virtual machine has been updated to the latest version for the current host.
  • The guest operating system.

The following virtual NIC types are supported:

  • Vlance
  • VMXNET
  • Flexible
  • E1000
  • Enhanced VMXNET (VMXNET 2)
  • VMXNET 3

If you want to know more about these network adapter types, refer to the following KB article: http://kb.vmware.com/kb/1001805

Getting ready

To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

How to do it...

There are two ways to choose a particular virtual network adapter: while creating a new VM, or while adding a new network adapter to an existing VM.

To choose a network adapter while creating a new VM:

1. Open vSphere Client.
2. Log in to the vCenter Server.
3. Click on the File menu, and navigate to New | Virtual Machine.
4. Go through the steps until you reach the step where you create network connections. Here you need to choose how many network adapters you need, which port group you want them to connect to, and an adapter type.

To choose an adapter type while adding a new network interface to an existing VM:

1. Open vSphere Client.
2. Log in to the vCenter Server.
3. Navigate to VMs and Templates on your home screen.
4. Select an existing VM where you want to add a new network adapter, right-click, and select Edit Settings.
5. Click on the Add button.
6. Select Ethernet Adapter.
7. Select the adapter type and select the network where you want this adapter to connect.
8. Click on Next and then click on Finish.

How it works...

Among all the supported virtual network adapter types, VMXNET is the paravirtualized device driver for virtual networking.
The VMXNET driver implements an idealized network interface that passes through the network traffic from the virtual machine to the physical cards with minimal overhead. The three versions of VMXNET are VMXNET, VMXNET 2 (Enhanced VMXNET), and VMXNET 3. The VMXNET driver improves performance through a number of optimizations:

  • Shares a ring buffer between the virtual machine and the VMkernel, and uses zero copy, which in turn saves CPU cycles. Zero copy improves performance by having the virtual machines and the VMkernel share a buffer, reducing the internal copy operations between buffers to free up CPU cycles.
  • Takes advantage of transmission packet coalescing to reduce address space switching.
  • Batches packets and issues a single interrupt, rather than issuing multiple interrupts. This improves efficiency, but in some cases with slow packet-sending rates, it could hurt throughput while waiting to get enough packets to actually send.
  • Offloads TCP checksum calculation to the network hardware rather than using the CPU resources of the virtual machine monitor.

Use VMXNET 3 if you can, or the most recent model you can. Use VMware Tools where possible. For certain unusual types of network traffic, the generally best model isn't always optimal; if you have poor network performance, experiment with other types of vNICs to see which performs best.
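The trade-off mentioned above for batching packets behind a single interrupt can be illustrated with a toy model. This is only a sketch and not how VMXNET is implemented: it assumes packets arrive at a fixed interval and that an interrupt is raised only when a batch is full, and the function name simulateBatching and its parameters are hypothetical.

```javascript
// Toy model of interrupt batching (an illustration, not VMXNET's real algorithm).
// Assumption: packets arrive every `arrivalIntervalUs` microseconds and an
// interrupt fires only once `batchSize` packets have been collected.
function simulateBatching(packetCount, arrivalIntervalUs, batchSize) {
  // One interrupt per full batch, plus one for a final partial batch.
  const interrupts = Math.ceil(packetCount / batchSize);
  // The first packet of a batch waits until the rest of the batch has arrived.
  const packetsPerBatch = Math.min(batchSize, packetCount);
  const worstCaseWaitUs = (packetsPerBatch - 1) * arrivalIntervalUs;
  return { interrupts, worstCaseWaitUs };
}

// Fast sender: a packet every 10 microseconds.
console.log(simulateBatching(1000, 10, 8));
// Slow sender: a packet every 5000 microseconds.
console.log(simulateBatching(1000, 5000, 8));
```

In this model both senders need 125 interrupts instead of 1000, but the slow sender's first packet in each batch waits 35000 microseconds instead of 70, which is the kind of delay the text warns about for slow packet-sending rates.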

Packt
14 Oct 2009
8 min read

Developing Wiki Seek Widget Using Javascript

If you’re searching for details of a particular term in Google, you’re most probably going to see a link to a relevant article from wikipedia.org in the top 10 results. Wikipedia is the largest encyclopedia on the Internet and contains huge collections of articles in many languages. The most significant feature of this encyclopedia is that it is a wiki, so anybody can contribute to the knowledge base. A wiki (a Web 2.0 concept) is a collection of web pages whose content can be created and changed by visitors using a simplified markup language. Wikis are usually used as knowledge management systems on the web.

Brief Introduction to Wikipedia

Wikipedia has defined itself as:

… a free, multilingual, open content encyclopedia project operated by the United States-based non-profit Wikimedia Foundation.

Wikipedia is built upon an open source wiki package called MediaWiki. MediaWiki uses PHP as a server-side scripting language and MySQL as the database. Wikipedia uses MediaWiki’s wikitext format for editing text, so users can edit pages easily without any knowledge of HTML and CSS.

The wikitext language (also called wiki markup) is a markup language that gives instructions on how the output text will be displayed. It provides a simplified approach to writing pages in a wiki website. Different types of wiki software employ different styles of wikitext. For example, every wikitext markup language has a way to hyperlink pages within the website, but a number of different syntaxes are available for creating such links.

Wikipedia was launched by Jimmy Wales and Larry Sanger in 2001 as a means of collecting and summarizing human knowledge in every major language. As of April 2008, Wikipedia had over 10 million articles in 253 languages. With so many articles, it is the largest encyclopedia ever assembled. Wikipedia articles are written collaboratively by volunteers, and any visitor can modify the content of an article.
Any modification must be accepted by the editors of Wikipedia; otherwise the article will be reverted to its previous content. Along with its popularity, Wikipedia is also criticized for systematic bias and inconsistency, since modifications must be cleared by the editors. Critics also argue that its open nature and the lack of proper sources for many articles make it unreliable.

Searching in Wikipedia

To search for a particular article in Wikipedia, you can use the search box on the home page of wikipedia.org. Wikipedia classifies its articles in different sub-domains according to language; “en.wikipedia.org” contains articles in English, whereas “es.wikipedia.org” contains Spanish articles. Whenever you select “English” in the dropdown box, the related articles will be searched on “en.wikipedia.org”, and so on for the other languages.

You can also search the articles of Wikipedia from a remote server. For this, you have to send the language and search parameters to http://www.wikipedia.org/search-redirect.php via the GET method.

Creating a Wiki Seek Widget

Up till now, we’ve looked at the background concept of Wikipedia. Now, let’s start building the widget. This widget contains a form with three components: a textbox where the visitor enters the search keyword, a dropdown list which contains the language of the article, and finally a submit button to search the articles of Wikipedia. By the time we’re done, you should have a widget that looks like this:

[Image: the finished Wiki Seek widget]

Concept for creating form

Before looking at the JavaScript code, first let’s understand the architecture of the form and the parameters to be sent for searching Wikipedia. The request should be sent to http://www.wikipedia.org/search-redirect.php via the GET method.

<form action="http://www.wikipedia.org/search-redirect.php" ></form>

If you don’t specify the method attribute in the form, the form uses GET, which is the default method.
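To make the GET request concrete, here is a minimal sketch of the URL a browser would request when such a form is submitted with a search and a language parameter. The helper name buildWikiSearchUrl is hypothetical; URL and URLSearchParams are standard APIs in browsers and Node.js, used here only to show the encoding a form submission performs.

```javascript
// Sketch: reproduce the query string a GET form submission would generate.
// buildWikiSearchUrl is an illustrative helper, not part of the widget itself.
function buildWikiSearchUrl(search, language) {
  const url = new URL("http://www.wikipedia.org/search-redirect.php");
  // Each form field becomes a name=value pair in the query string.
  url.searchParams.set("search", search);
  url.searchParams.set("language", language);
  return url.toString();
}

console.log(buildWikiSearchUrl("open source", "en"));
// http://www.wikipedia.org/search-redirect.php?search=open+source&language=en
```

Note that the space in the keyword is encoded as a plus sign, which is the usual form encoding for GET submissions.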
After creating the form element, we need to add a textbox named search inside the form, because the search keyword is sent in the search parameter.

<input type="text" name="search" size="20" />

After adding the textbox for the search keyword, we need to add the dropdown list which contains the language of the article to search. The name of this dropdown list should be language, as we have to send the language code to the above URL in the language parameter. These language codes are two- or three-letter codes specified by ISO. ISO has assigned three-letter codes to most of the world’s languages, and two-letter codes to a smaller set of major languages. For example, eng and en are the three- and two-letter language codes for English. Some of the article languages of Wikipedia don’t have ISO codes, and you have to find the value of the language parameter from Wikipedia. For example, the code for articles in the Alemannisch language is als.

Here is the HTML code for constructing a dropdown list of major languages:

<select name="language">
<option value="de" >Deutsch</option>
<option value="en" selected="selected">English</option>
<option value="es" >Español</option>
<option value="eo" >Esperanto</option>
<option value="fr" >Français</option>
<option value="it" >Italiano</option>
<option value="hu" >Magyar</option>
<option value="nl" >Nederlands</option>
</select>

As you can see in the above dropdown list, English is the default language selected. Now, we just need to add a submit button to complete the form for searching articles in Wikipedia.

<input type="submit" name="go" value="Search" title="Search in wikipedia" />

Put all the HTML code together to create the form.

JavaScript Code

As we’ve already covered the background concept of the HTML form, we just have to use document.write() to output the HTML to the web browser.
Here is the JavaScript code to create the Wiki Seek Widget:

document.write('<div>');
document.write('<form action="http://www.wikipedia.org/search-redirect.php" >');
document.write('<input type="text" name="search" size="20" />');
document.write('&nbsp;<select name="language">');
document.write('<option value="de" >Deutsch</option>');
document.write('<option value="en" selected="selected">English</option>');
document.write('<option value="es" >Español</option>');
document.write('<option value="eo" >Esperanto</option>');
document.write('<option value="fr" >Français</option>');
document.write('<option value="it" >Italiano</option>');
document.write('<option value="hu" >Magyar</option>');
document.write('<option value="nl" >Nederlands</option>');
document.write('</select>');
document.write('&nbsp;<input type="submit" name="go" value="Search" title="Search in wikipedia" />');
document.write('</form>');
document.write('</div>');

In the above code, I’ve used a division (div) as the container for the HTML form. I’ve also saved the above code in a wiki_seek.js file. The above JavaScript code displays an unstyled widget. To style the widget, you can use the style attribute on the input elements of the form.

Using Wiki Seek widget

To use this Wiki Seek widget, we have to follow these steps:

1. First of all, we need to upload the above wiki_seek.js to a web server so that it can be used by client websites. Let’s suppose that it is uploaded to the URL http://www.widget-server.com/wiki_seek.js
2. Now, we can use the widget in any web page by placing the following JavaScript code in the page:

<script type="text/javascript" language="javascript" src="http://www.widget-server.com/wiki_seek.js"></script>

The Wiki Seek widget is displayed wherever in the web page you place the above code.
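As a variation on the widget script, the markup could be generated from a list of languages instead of one document.write() call per line, which makes it easier to add or remove languages. The following is only a sketch under that assumption; the names LANGUAGES and buildWikiSeekHtml are hypothetical and not part of the original wiki_seek.js.

```javascript
// Sketch: build the widget markup from data, then write it once.
// LANGUAGES and buildWikiSeekHtml are illustrative names (assumptions).
const LANGUAGES = [
  ["de", "Deutsch"],
  ["en", "English"],
  ["es", "Español"],
  ["eo", "Esperanto"],
  ["fr", "Français"],
  ["it", "Italiano"],
  ["hu", "Magyar"],
  ["nl", "Nederlands"],
];

function buildWikiSeekHtml(languages, defaultCode) {
  // One <option> per language; the default language is pre-selected.
  const options = languages
    .map(([code, label]) => {
      const selected = code === defaultCode ? ' selected="selected"' : "";
      return `<option value="${code}"${selected}>${label}</option>`;
    })
    .join("");
  return (
    "<div>" +
    '<form action="http://www.wikipedia.org/search-redirect.php">' +
    '<input type="text" name="search" size="20" />' +
    `&nbsp;<select name="language">${options}</select>` +
    '&nbsp;<input type="submit" name="go" value="Search" title="Search in wikipedia" />' +
    "</form>" +
    "</div>"
  );
}

// Only write to the page when the script runs in a browser.
if (typeof document !== "undefined") {
  document.write(buildWikiSeekHtml(LANGUAGES, "en"));
}
```

The resulting form has the same fields and parameter names as the one above, so it is embedded in a page in exactly the same way.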