Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-setting-scans
Packt
04 Sep 2013
10 min read
Save for later

Setting up scans

Packt
04 Sep 2013
10 min read
(For more resources related to this topic, see here.) Setting up a scan in Spiceworks The first thing Spiceworks tries to do to scan a network is contact Active Directory(AD); it also uses AD to populate the People portion of your Inventory. Let's set up AD first, as everything else we will be configuring is on the same page. We are all about saving your time and not going back and forth between pages. If you do not have AD in your environment, you can just skip to the Configuring IP range scans section. Scanning and Active Directory There is a wealth of information within AD that Spiceworks uses. We are going to need to configure Spiceworks to log into AD and get that information. OK, we need to get to the Active Directory Configuration screen in Spiceworks in order to do that. As with most things within the app, it is just a couple of clicks. From anywhere in the app, mouse over the Inventory link at the top of the page; a menu will open up. Click on Settings. This will take us to the Settings screen. You will be spending a lot of time here so you can either get very used to these clicks or just have a separate tab open with these settings already set up. The top section is called Getting Started and the first link is Active Directory Configuration. That is our destination for this section so click away. It will take you to the Active Directory Configuration page: There are three sections that are highlighted. Let's go over each and what they do: The area highlighted as 1 is where you are going to enter the credentials that allow Spiceworks to log into your AD and get information. You specify the Active Directory Server (Domain Controller), username and password. Usernames must be in either domain/username or username@domain.com. If you have SSL enabled for AD inquiries, check the Use SSL box. The area highlighted as 2 shows the frequency at which Spiceworks retrieves information from your AD environment. When Spiceworks queries AD, it does not cause a huge amount of traffic or load. Shortening these times should not cause undue stress on your AD servers. This is useful because when you add a user in AD, it will automatically get loaded into Spiceworks at the next scan. If you want any changes you make to users in Spiceworks to be uploaded into your AD environment, the section highlighted as 3 is for you. Just click on the box and any modifications you make in Spiceworks will automatically be synchronized with your AD. There is one more section that is not in the screenshot. This deals with your user portal and help desk. Setting up AD in your Spiceworks really makes a lot of difference with scans and filling in information. It is recommended that if you are running AD, hook this up. If you are wary about Spiceworks writing data into your AD environment, just set up the user that Spiceworks uses to connect as read-only and don't check the box that writes changes back to AD. Easy enough. Since you are convinced that you should connect your AD to Spiceworks, just fill in the ActiveDirectory server, User, and Password fields and click on Save. Spiceworks will automatically test the credentials and let you know immediately if it can connect. If you have some challenges with Spiceworks connecting to your Domain Controller with just the server name, another method is to put the IP address directly into that field. Let's move on to setting up an IP range scan and get some devices into your Spiceworks install. Configuring IP range scans Remember the Settings page that we have been to a couple of times? We are going back! In case you have forgotten, just mouse over to either Inventory or Help Desk and click on the Settings link at the bottom of the left column. Once on the Settings page, we are going to click on the Network Scan link. It is in the first section of links titled Getting Started. This takes us to the main Network Scan page. The first section is where we are going to set up our IP ranges. Since you will not have any ranges in here as you just installed Spiceworks, let's get one configured so you can get some information into the app. To do this, just click on the Add IP Range button and this window will pop up. There is a lot of flexibility that Spiceworks gives you regarding how it scans IP ranges. You can put a fill range (192.168.1.1-254) with or without exclusions, or just a single IP if you so wish. The next box is for exclusions, if you so choose. If you decide you want to scan a range that has both servers and desktops, you can exclude server IP addresses. This is handy. The last options are for scheduling this IP range scan. If you choose the Daily at… option as we have seen in the screenshot, you can also select the time of the day to run this scan. Other options in this drop-down list are every 4, 6, 8, or 12 hours. If you do decide that you want to scan on an hourly basis, the time of the day magically disappears. The bottom of the window lets you select what days of the week you want to run the scan. When Spiceworks runs an initial scan, it can take a bit of time as there is a ton of data that it is collecting. Spiceworks tries a multitude of credentials and reads all information from devices, which it then writes to the database. Once Spiceworks has scanned and written the data to the database, any subsequent scans just write delta data into it. Enter what range you want to scan, any exclusions you choose, and the scan frequency, and click on the Add button. Congratulations! You have just added an IP range scan! Scanning credentials As we have covered, Spiceworks uses a multitude of credentials to try and figure out what is on your network and put those devices into the inventory. This has been completely overhauled in Spiceworks. In this easy-to-use interface, you can enter all the credentials that you are going to need to have a successful scan. Here you can configure multiple usernames/passwords for the following protocols: WMI SSH SNMP Enable ESX/vSphere HTTP iLo SNMP v2c/v3 Telnet Intel vPro As you can see, if you need to put device-specific usernames and passwords into Spiceworks, you can do so using the format, Domainusername. So if you have a server that uses a unique username/password combination, it is easy to set all that up through this interface. The preceding screenshot shows an example of this. Something new in Spiceworks is the section where it shows devices that the credentials were successfully used on. This is really helpful for troubleshooting any scan errors! To add your own username/password combinations, just use these easy-to-follow directions: Click on the protocol you want to add credentials to on the left column (WMI, SNMP, and so on). Click on +Add Account in the middle column labeled Existing Accounts. Enter all the pertinent information on the left pane labeled Edit Account.For usernames that have passwords, there is a Show Password button as well, so you can make sure that you didn't fat finger it! That's it. Just fill in any credentials that will let Spiceworks access your devices on your network, and as far as permissions are concerned you should be good to go! Best practices and kicking off your first Spiceworks scan You have everything you need to start your first Spiceworks scan. It might be best to read the following best practices before you kick it off, though. They will guide you through some potential pitfalls. Scanning best practices For initial scans, be aware of the number of IP addresses you are scanning and the amount of information that Spiceworks is going to pull out of those devices. If you put in a full IP range on your first scan, do not expect Spiceworks to be completed in 10-15 minutes. The initial scan is the most network traffic intensive and will take the longest duration of time. Do full initial scans during nonbusiness hours. Though running an initial full scan shouldn't flood your network, depending on your network configuration, it is always best to run full initial scans during nonbusiness hours just in case. If you are running a 24 x 7 business, break up your IP ranges into smaller chunks and scan that way. Expect some unknown devices. Unless you are a super administrator with a team of hundreds behind you to make sure that every aspect of your network is 100 percent buttoned down, there will most likely be a few devices that Spiceworks cannot connect to. One of the biggest culprits is that WMI has been disabled, or that there is a firewall of some sort blocking Spiceworks from connecting to the machine. Don't get down on yourself if the scan doesn't work 100 percent the first time. If you are really worried about traffic that Spiceworks might cause, what information it collects, or how it will affect workstation performance, just set up a test environment and run a scan there. Whether it be 5 machines or 500, Spiceworks does the same to each one; so test away. Spiceworks is not designed to scan 10,000 devices at one time without a performance hit. If you have a very large network, break it up into smaller chunks for best performance. Spiceworks could get through a 10,000 device scan, but it would hurt performance until the scan is complete. If you have multiple sites linked either by WAN or VPN connections, drop a remote collector at these to run local scans and then send the data back to your main Spiceworks installation. You can find more information at http://community.spiceworks.com/help/Remote_Collectors OK, now that you have read the required best practices, you can set up your IP range on the Network Scan settings page, check the box associated with that range and click on Start Scan. Away you go! Depending on the IP range you set and the time of the day, your scan could take just a few minutes or several hours. If you are having some serious issues trying to get a successful scan, open a browser and hit this site: http://community.spiceworks.com/support. There are in-depth articles and even real-live support folks that can dive into the specifics of your environment, and they won't give up until you are successful. Let's assume that even if you did have an issue, it is resolved and you have got your first scan under your belt. Summary We were provided with details on how to set up a scan in Spiceworks. Also, we got to know how to run the scan we set up and the best practices. Resources for Article : Further resources on this subject: Using SpriteFonts in a Board-based Game with XNA 4.0 [Article] Why CoffeeScript?HTML5 Games Development: Using Local Storage to Store Game Data [Article] Making Money with Your Game [Article]
Read more
  • 0
  • 0
  • 11583

article-image-integrating-storm-and-hadoop
Packt
04 Sep 2013
17 min read
Save for later

Integrating Storm and Hadoop

Packt
04 Sep 2013
17 min read
(For more resources related to this topic, see here.) In this article, we will implement the Batch and Service layers to complete the architecture. There are some key concepts underlying this big data architecture: Immutable state Abstraction and composition Constrain complexity Immutable state is the key, in that it provides true fault-tolerance for the architecture. If a failure is experienced at any level, we can always rebuild the data from the original immutable data. This is in contrast to many existing data systems, where the paradigm is to act on mutable data. This approach may seem simple and logical; however, it exposes the system to a particular kind of risk in which the state is lost or corrupted. It also constrains the system, in that you can only work with the current view of the data; it isn't possible to derive new views of the data. When the architecture is based on a fundamentally immutable state, it becomes both flexible and fault-tolerant. Abstractions allow us to remove complexity in some cases, and in others they can introduce complexity. It is important to achieve an appropriate set of abstractions that increase our productivity and remove complexity, but at an appropriate cost. It must be noted that all abstractions leak, meaning that when failures occur at a lower abstraction, they will affect the higher-level abstractions. It is therefore often important to be able to make changes within the various layers and understand more than one layer of abstraction. The designs we choose to implement our abstractions must therefore not prevent us from reasoning about or working at the lower levels of abstraction when required. Open source projects are often good at this, because of the obvious access to the code of the lower level abstractions, but even with source code available, it is easy to convolute the abstraction to the extent that it becomes a risk. In a big data solution, we have to work at higher levels of abstraction in order to be productive and deal with the massive complexity, so we need to choose our abstractions carefully. In the case of Storm, Trident represents an appropriate abstraction for dealing with the data-processing complexity, but the lower level Storm API on which Trident is based isn't hidden from us. We are therefore able to easily reason about Trident based on an understanding of lower-level abstractions within Storm. Another key issue to consider when dealing with complexity and productivity is composition. Composition within a given layer of abstraction allows us to quickly build out a solution that is well tested and easy to reason about. Composition is fundamentally decoupled, while abstraction contains some inherent coupling to the lower-level abstractions—something that we need to be aware of. Finally, a big data solution needs to constrain complexity. Complexity always equates to risk and cost in the long run, both from a development perspective and from an operational perspective. Real-time solutions will always be more complex than batch-based systems; they also lack some of the qualities we require in terms of performance. Nathan Marz's Lambda architecture attempts to address this by combining the qualities of each type of system to constrain complexity and deliver a truly fault-tolerant architecture. We divided this flow into preprocessing and "at time" phases, using streams and DRPC streams respectively. We also introduced time windows that allowed us to segment the preprocessed data. In this article, we complete the entire architecture by implementing the Batch and Service layers. The Service layer is simply a store of a view of the data. In this case, we will store this view in Cassandra, as it is a convenient place to access the state alongside Trident's state. The preprocessed view is identical to the preprocessed view created by Trident, counted elements of the TF-IDF formula (D, DF, and TF), but in the batch case, the dataset is much larger, as it includes the entire history. The Batch layer is implemented in Hadoop using MapReduce to calculate the preprocessed view of the data. MapReduce is extremely powerful, but like the lower-level Storm API, is potentially too low-level for the problem at hand for the following reasons: We need to describe the problem as a data pipeline; MapReduce isn't congruent with such a way of thinking Productivity We would like to think of a data pipeline in terms of streams of data, tuples within the stream and predicates acting on those tuples. This allows us to easily describe a solution to a data processing problem, but it also promotes composability, in that predicates are fundamentally composable, but pipelines themselves can also be composed to form larger, more complex pipelines. Cascading provides such an abstraction for MapReduce in the same way as Trident does for Storm. With these tools, approaches, and considerations in place, we can now complete our real-time big data architecture. There are a number of elements, that we will update, and a number of elements that we will add. The following figure illustrates the final architecture, where the elements in light grey will be updated from the existing recipe, and the elements in dark grey will be added in this article: Implementing TF-IDF in Hadoop TF-IDF is a well-known problem in the MapReduce communities; it is well-documented and implemented, and it is interesting in that it is sufficiently complex to be useful and instructive at the same time. Cascading has a series of tutorials on TF-IDF at http://www.cascading.org/2012/07/31/cascading-for-the-impatient-part-5/, which documents this implementation well. For this recipe, we shall use a Clojure Domain Specific Language (DSL) called Cascalog that is implemented on top of Cascading. Cascalog has been chosen because it provides a set of abstractions that are very semantically similar to the Trident API and are very terse while still remaining very readable and easy to understand. Getting ready Before you begin, please ensure that you have installed Hadoop by following the instructions at http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/. How to do it… Start by creating the project using the lein command: lein new tfidf-cascalog Next, you need to edit the project.clj file to include the dependencies: (defproject tfidf-cascalog "0.1.0-SNAPSHOT" :dependencies [[org.clojure/clojure "1.4.0"] [cascalog "1.10.1"] [org.apache.cassandra/cassandra-all "1.1.5"] [clojurewerkz/cassaforte "1.0.0-beta11-SNAPSHOT"] [quintona/cascading-cassandra "0.0.7-SNAPSHOT"] [clj-time "0.5.0"] [cascading.avro/avro-scheme "2.2-SNAPSHOT"] [cascalog-more-taps "0.3.0"] [org.apache.httpcomponents/httpclient "4.2.3"]] :profiles{:dev{:dependencies[[org.apache.hadoop/hadoop-core "0.20.2-dev"] [lein-midje "3.0.1"] [cascalog/midje-cascalog "1.10.1"]]}}) It is always a good idea to validate your dependencies; to do this, execute lein deps and review any errors. In this particular case, cascading-cassandra has not been deployed to clojars, and so you will receive an error message. Simply download the source from https://github.com/quintona/cascading-cassandra and install it into your local repository using Maven. It is also good practice to understand your dependency tree. This is important to not only prevent duplicate classpath issues, but also to understand what licenses you are subject to. To do this, simply run lein pom, followed by mvn dependency:tree. You can then review the tree for conflicts. In this particular case, you will notice that there are two conflicting versions of Avro. You can fix this by adding the appropriate exclusions: [org.apache.cassandra/cassandra-all "1.1.5" :exclusions [org.apache.cassandra.deps/avro]] We then need to create the Clojure-based Cascade queries that will process the document data. We first need to create the query that will create the "D" view of the data; that is, the D portion of the TF-IDF function. This is achieved by defining a Cascalog function that will output a key and a value, which is composed of a set of predicates: (defn D [src] (let [src (select-fields src ["?doc-id"])] (<- [?key ?d-str] (src ?doc-id) (c/distinct-count ?doc-id :> ?n-docs) (str "twitter" :> ?key) (str ?n-docs :> ?d-str)))) You can define this and any of the following functions in the REPL, or add them to core.clj in your project. If you want to use the REPL, simply use lein repl from within the project folder. The required namespace (the use statement), require, and import definitions can be found in the source code bundle. We then need to add similar functions to calculate the TF and DF values: (defn DF [src] (<- [?key ?df-count-str] (src ?doc-id ?time ?df-word) (c/distinct-count ?doc-id ?df-word :> ?df-count) (str ?df-word :> ?key) (str ?df-count :> ?df-count-str))) (defn TF [src] (<- [?key ?tf-count-str] (src ?doc-id ?time ?tf-word) (c/count ?tf-count) (str ?doc-id ?tf-word :> ?key) (str ?tf-count :> ?tf-count-str))) This Batch layer is only interested in calculating views for all the data leading up to, but not including, the current hour. This is because the data for the current hour will be provided by Trident when it merges this batch view with the view it has calculated. In order to achieve this, we need to filter out all the records that are within the current hour. The following function makes that possible: (deffilterop timing-correct? [doc-time] (let [now (local-now) interval (in-minutes (interval (from-long doc-time) now))] (if (< interval 60) false true)) Each of the preceding query definitions require a clean stream of words. The text contained in the source documents isn't clean. It still contains stop words. In order to filter these and emit a clean set of words for these queries, we can compose a function that splits the text into words and filters them based on a list of stop words and the time function defined previously: (defn etl-docs-gen [rain stop] (<- [?doc-id ?time ?word] (rain ?doc-id ?time ?line) (split ?line :> ?word-dirty) ((c/comp s/trim s/lower-case) ?word-dirty :> ?word) (stop ?word :> false) (timing-correct? ?time))) We will be storing the outputs from our queries to Cassandra, which requires us to define a set of taps for these views: (defn create-tap [rowkey cassandra-ip] (let [keyspace storm_keyspace column-family "tfidfbatch" scheme (CassandraScheme. cassandra-ip "9160" keyspace column-family rowkey {"cassandra.inputPartitioner""org.apache.cassandra.dht.RandomPartitioner" "cassandra.outputPartitioner" "org.apache.cassandra.dht.RandomPartitioner"}) tap (CassandraTap. scheme)] tap)) (defn create-d-tap [cassandra-ip] (create-tap "d"cassandra-ip)) (defn create-df-tap [cassandra-ip] (create-tap "df" cassandra-ip)) (defn create-tf-tap [cassandra-ip] (create-tap "tf" cassandra-ip)) The way this schema is created means that it will use a static row key and persist name-value pairs from the tuples as column:value within that row. This is congruent with the approach used by the Trident Cassandra adaptor. This is a convenient approach, as it will make our lives easier later. We can complete the implementation by a providing a function that ties everything together and executes the queries: (defn execute [in stop cassandra-ip] (cc/connect! cassandra-ip) (sch/set-keyspace storm_keyspace) (let [input (tap/hfs-tap (AvroScheme. (load-schema)) in) stop (hfs-delimited stop :skip-header? true) src (etl-docs-gen input stop)] (?- (create-d-tap cassandra-ip) (D src)) (?- (create-df-tap cassandra-ip) (DF src)) (?- (create-tf-tap cassandra-ip) (TF src)))) Next, we need to get some data to test with. I have created some test data, which is available at https://bitbucket.org/qanderson/tfidf-cascalog. Simply download the project and copy the contents of src/data to the data folder in your project structure. We can now test this entire implementation. To do this, we need to insert the data into Hadoop: hadoop fs -copyFromLocal ./data/document.avro data/document.avro hadoop fs -copyFromLocal ./data/en.stop data/en.stop Then launch the execution from the REPL: => (execute "data/document" "data/en.stop" "127.0.0.1") How it works… There are many excellent guides on the Cascalog wiki (https://github.com/nathanmarz/cascalog/wiki), but for completeness's sake, the nature of a Cascalog query will be explained here. Before that, however, a revision of Cascading pipelines is required. The following is quoted from the Cascading documentation (http://docs.cascading.org/cascading/2.1/userguide/htmlsingle/): Pipe assemblies define what work should be done against tuple streams, which are read from tap sources and written to tap sinks. The work performed on the data stream may include actions such as filtering, transforming, organizing, and calculating. Pipe assemblies may use multiple sources and multiple sinks, and may define splits, merges, and joins to manipulate the tuple streams. This concept is embodied in Cascalog through the definition of queries. A query takes a set of inputs and applies a list of predicates across the fields in each tuple of the input stream. Queries are composed through the application of many predicates. Queries can also be composed to form larger, more complex queries. In either event, these queries are reduced down into a Cascading pipeline. Cascalog therefore provides an extremely terse and powerful abstraction on top of Cascading; moreover, it enables an excellent development workflow through the REPL. Queries can be easily composed and executed against smaller representative datasets within the REPL, providing the idiomatic API and development workflow that makes Clojure beautiful. If we unpack the query we defined for TF, we will find the following code: (defn DF [src] (<- [?key ?df-count-str] (src ?doc-id ?time ?df-word) (c/distinct-count ?doc-id ?df-word :> ?df-count) (str ?df-word :> ?key) (str ?df-count :> ?df-count-str))) The <- macro defines a query, but does not execute it. The initial vector, [?key ?df-count-str], defines the output fields, which is followed by a list of predicate functions. Each predicate can be one of the following three types: Generators: A source of data where the underlying source is either a tap or another query. Operations: Implicit relations that take in input variables defined elsewhere and either act as a function that binds new variables or a filter. Operations typically act within the scope of a single tuple. Aggregators: Functions that act across tuples to create aggregate representations of data. For example, count and sum. The :> keyword is used to separate input variables from output variables. If no :> keyword is specified, the variables are considered as input variables for operations and output variables for generators and aggregators. The (src ?doc-id ?time ?df-word) predicate function names the first three values within the input tuple, whose names are applicable within the query scope. Therefore, if the tuple ("doc1" 123324 "This") arrives in this query, the variables would effectively bind as follows: ?doc-id: "doc1" ?time: 123324 ?df-word: "This" Each predicate within the scope of the query can use any bound value or add new bound variables to the scope of the query. The final set of bound values that are emitted is defined by the output vector. We defined three queries, each calculating a portion of the value required for the TF-IDF algorithm. These are fed from two single taps, which are files stored in the Hadoop filesystem. The document file is stored using Apache Avro, which provides a high-performance and dynamic serialization layer. Avro takes a record definition and enables serialization/deserialization based on it. The record structure, in this case, is for a document and is defined as follows: {"namespace": "storm.cookbook", "type": "record", "name": "Document", "fields": [ {"name": "docid", "type": "string"}, {"name": "time", "type": "long"}, {"name": "line", "type": "string"} ] } Both the stop words and documents are fed through an ETL function that emits a clean set of words that have been filtered. The words are derived by splitting the line field using a regular expression: (defmapcatop split [line] (s/split line #"[[](),.)s]+")) The ETL function is also a query, which serves as a source for our downstream queries, and defines the [?doc-id ?time ?word] output fields. The output tap, or sink, is based on the Cassandra scheme. A query defines predicate logic, not the source and destination of data. The sink ensures that the outputs of our queries are sent to Cassandra. The ?- macro executes a query, and it is only at execution time that a query is bound to its source and destination, again allowing for extreme levels of composition. The following, therefore, executes the TF query and outputs to Cassandra: (?- (create-tf-tap cassandra-ip) (TF src)) There's more… The Avro test data was created using the test data from the Cascading tutorial at http://www.cascading.org/2012/07/31/cascading-for-the-impatient-part-5/. Within this tutorial is the rain.txt tab-separated data file. A new column was created called time that holds the Unix epoc time in milliseconds. The updated text file was then processed using some basic Java code that leverages Avro: Schema schema = Schema.parse(SandboxMain.class.getResourceAsStream("/document.avsc")); File file = new File("document.avro"); DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<GenericRecord>(schema); DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<GenericRecord>(datumWriter); dataFileWriter.create(schema, file); BufferedReader reader = new BufferedReader(new InputStreamReader(SandboxMain.class.getResourceAsStream("/rain.txt"))); String line = null; try { while ((line = reader.readLine()) != null) { String[] tokens = line.split("t"); GenericRecord docEntry = new GenericData.Record(schema); docEntry.put("docid", tokens[0]); docEntry.put("time", Long.parseLong(tokens[1])); docEntry.put("line", tokens[2]); dataFileWriter.append(docEntry); } } catch (IOException e) { e.printStackTrace(); } dataFileWriter.close(); Persisting documents from Storm In the previous recipe, we looked at deriving precomputed views of our data taking some immutable data as the source. In that recipe, we used statically created data. In an operational system, we need Storm to store the immutable data into Hadoop so that it can be used in any preprocessing that is required. How to do it… As each tuple is processed in Storm, we must generate an Avro record based on the document record definition and append it to the data file within the Hadoop filesystem. We must create a Trident function that takes each document tuple and stores the associated Avro record. Within the tfidf-topology project created in, inside the storm.cookbook.tfidf.function package, create a new class named PersistDocumentFunction that extends BaseFunction. Within the prepare function, initialize the Avro schema and document writer: public void prepare(Map conf, TridentOperationContext context) { try { String path = (String) conf.get("DOCUMENT_PATH"); schema = Schema.parse(PersistDocumentFunction.class .getResourceAsStream("/document.avsc")); File file = new File(path); DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<GenericRecord>(schema); dataFileWriter = new DataFileWriter<GenericRecord>(datumWriter); if(file.exists()) dataFileWriter.appendTo(file); else dataFileWriter.create(schema, file); } catch (IOException e) { throw new RuntimeException(e); } } As each tuple is received, coerce it into an Avro record and add it to the file: public void execute(TridentTuple tuple, TridentCollector collector) { GenericRecord docEntry = new GenericData.Record(schema); docEntry.put("docid", tuple.getStringByField("documentId")); docEntry.put("time", Time.currentTimeMillis()); docEntry.put("line", tuple.getStringByField("document")); try { dataFileWriter.append(docEntry); dataFileWriter.flush(); } catch (IOException e) { LOG.error("Error writing to document record: " + e); throw new RuntimeException(e); } } Next, edit the TermTopology.build topology and add the function to the document stream: documentStream.each(new Fields("documentId","document"), new PersistDocumentFunction(), new Fields()); Finally, include the document path into the topology configuration: conf.put("DOCUMENT_PATH", "document.avro"); How it works… There are various logical streams within the topology, and certainly the input for the topology is not in the appropriate state for the recipes in this article containing only URLs. We therefore need to select the correct stream from which to consume tuples, coerce these into Avro records, and serialize them into a file. The previous recipe will then periodically consume this file. Within the context of the topology definition, include the following code: Stream documentStream = getUrlStream(topology, spout) .each(new Fields("url"), new DocumentFetchFunction(mimeTypes), new Fields("document", "documentId", "source")); documentStream.each(new Fields("documentId","document"), new PersistDocumentFunction(), new Fields()); The function should consume tuples from the document stream whose tuples are populated with already fetched documents.
Read more
  • 0
  • 0
  • 2862

article-image-using-virtual-destinations-advanced
Packt
04 Sep 2013
5 min read
Save for later

Using Virtual Destinations (Advanced)

Packt
04 Sep 2013
5 min read
(For more resources related to this topic, see here.) Getting ready For this article we will use the sample application virtual-destinations to demonstrate the use of Virtual Destinations. How to do it... To run the sample for this article, open a terminal, change the path to point to the directory where virtual-destinations is located, and run it by typing mvn compile exec:java. In the terminal where you started the example, you will see output similar to the following snippet, indicating that the application is running: Starting Virtual Destination example now... Queue A Consumer 1 processed 500 Messages Queue A Consumer 2 processed 500 Messages Queue B Consumer processed 1000 Messages Finished running the Virtual Destination example. How it works... Our example takes advantage of ActiveMQ's Virtual Destination feature to allow a JMS Producer to send messages to a topic and have those messages be received by a number of queue consumers. Let's take a look at the sent code first: private void sendMessages() throws Exception { Connection connection = connectionFactory.createConnection(); Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE); Destination destination = session.createTopic("VirtualTopic.Foo"); MessageProducer producer = session.createProducer(destination); for (int i = 0; i < 1000; ++i) { producer.send(session.createMessage()); } connection.close(); } As we can see, there's not much new here; we are just using standard JMS API code to send our messages. The important part is in the name of the topic destination VirtualTopic.Foo, which informs ActiveMQ Broker we want this topic to be one of those special Virtual Destinations. Now let's take a look at the consumer code, and then we'll try and make sense of how this works: Queue queueA = receiverSession.createQueue("Consumer.A. VirtualTopic.Foo"); VirtualMessageListener listenerA1 = new VirtualMessageListener(done); MessageConsumer consumerA1 = receiverSession.createConsumer(queueA); consumerA1.setMessageListener(listenerA1); VirtualMessageListener listenerA2 = new VirtualMessageListener(done); MessageConsumer consumerA2 = receiverSession.createConsumer(queueA); consumerA2.setMessageListener(listenerA2);  Queue queueB = receiverSession.createQueue("Consumer.B. VirtualTopic.Foo"); VirtualMessageListener listenerB = new VirtualMessageListener(done); MessageConsumer consumerB = receiverSession.createConsumer(queueB); consumerB.setMessageListener(listenerB); As we can see, the consumer code is also not very mysterious; we create two consumers on the Consumer.A.VirtualTopic.Foo queue and one on the Consumer.B.VirtualTopic.Foo queue, and yet when we run the example, the two consumers on queue A split 500 of the topic message and the consumer on queue B gets its own copy of the 1,000 messages. So what's happening on the broker then to make this happen? By default, whenever a topic is created with a name matching the pattern VirtualTopic.<TopicName>, we gain access to the feature known as Virtual Destinations. These Virtual Destinations allow us to send a message to a topic but create consumers that have all the benefits of queue consumers, namely, those of load balancing and message persistence. Our queue consumers just need to use their own naming convention to access the Virtual Destination functionality. Each queue that we want to create must be named using the pattern Consumer.<Name>.VirtualTopic.<TopicName>. In the following figure we can see a visual representation of the message flow in our example: As messages are sent to our example's topic, they are distributed to both the queues we created. Two of our consumers split the work of processing messages from queue A, while the other consumer gets all the messages from queue B. In our example we didn't attach a listener to the topic itself, but we could have done so and it would also have received all the messages we sent. It's easy to imagine using the topic as a way to monitor the messages that are being distributed to the queue consumers in this scenario; this pattern is often referred to as a wiretap. If it isn't already apparent, Virtual Destinations are a very powerful feature for your messaging applications. Topics are great for broadcasting events but when we need to be able to consume messages that were sent while a client was offline, the only recourse is to use a durable topic subscription. Unfortunately, durable subscriptions have a number of limitations, the least of which is that only one JMS connection can be actively subscribed to a logical topic subscription. This means that we can't load balance messages and we can't have fast failover if a subscriber goes down. Virtual Destinations solve these problems since all of our consumers subscribe to a queue so their messages are persistent, and the work of processing the messages on the queue can be shared by more than one active subscription. There's more... The naming pattern for Virtual Destinations is not set in stone. By default, ActiveMQ is configured to use the pattern we used in our sample application, but you can easily change this. In the following XML snippet, we configured our broker to all topics on our broker into virtual topics. We use the wildcard syntax here, which matches every topic name a client sends messages to: <broker> <destinationInterceptors> <virtualDestinationInterceptor> <virtualDestinations><virtualTopic name=">" prefix="VirtualTopic Consumers.*."/> </virtualDestinations></virtualDestinationInterceptor> </destinationInterceptors> </broker> More information on Virtual Destinations can be found on the ActiveMQ website, http://activemq.apache.org/virtual-destinations.html. Summary This article thus explained what Virtual Destinations are and how to use them to avoid the limitations of JMS's durable topic subscriptions. Resources for Article: Further resources on this subject: Routing to an external ActiveMQ broker [Article] So, what is Apache Wicket? [Article] Apache Geronimo Logging [Article]
Read more
  • 0
  • 0
  • 5999

article-image-important-features-mockito
Packt
04 Sep 2013
4 min read
Save for later

Important features of Mockito

Packt
04 Sep 2013
4 min read
Reducing boilerplate code with annotations Mockito allows the use of annotations to reduce the lines of test code in order to increase the readability of tests. Let's take into consideration some of the tests that we have seen in previous examples. Removing boilerplate code by using the MockitoJUnitRunner The shouldCalculateTotalWaitingTimeAndAssertTheArgumentsOnMockUsingArgumentCaptor from Verifying behavior (including argument capturing, verifying call order and working with asynchronous code) section, can be rewritten as follows, using Mockito annotations, and the @RunWith(MockitoJUnitRunner.class) JUnit runner: @RunWith(MockitoJUnitRunner.class) public class _07ReduceBoilerplateCodeWithAnnotationsWithRunner { @Mock KitchenService kitchenServiceMock; @Captor ArgumentCaptor mealArgumentCaptor; @InjectMocks WaiterImpl objectUnderTest; @Test public void shouldCalculateTotalWaitingTimeAndAssert TheArgumentsOnMockUsingArgumentCaptor() throws Exception { //given final int mealPreparationTime = 10; when(kitchenServiceMock.calculate PreparationTime(any(Meal.class))).thenReturn(mealPreparationTime); //when int waitingTime = objectUnderTest.calculate TotalWaitingTime(createSampleMeals ContainingVegetarianFirstCourse()); //then assertThat(waitingTime, is(mealPreparationTime)); verify(kitchenServiceMock).calculatePreparation Time(mealArgumentCaptor.capture()); assertThat(mealArgumentCaptor.getValue(), is (VegetarianFirstCourse.class)); assertThat(mealArgumentCaptor.getAllValues().size(), is(1)); } private List createSampleMeals ContainingVegetarianFirstCourse() { List meals = new ArrayList(); meals.add(new VegetarianFirstCourse()); return meals; } } What happened here is that: All of the boilerplate code can be removed due to the fact that you are using the @RunWith(MockitoJUnitRunner.class) JUnit runner Mockito.mock(…) has been replaced with @Mock annotation You can provide additional parameters to the annotation, such as name, answer or extraInterfaces. The fieldname related to the annotated mock is referred to in any verification so it's easier to identify the mock ArgumentCaptor.forClass(…) is replaced with @Captor annotation. When using the @Captor annotation you avoid warnings related to complex generic types Thanks to the @InjectMocks annotation your object under test is initialized, proper constructor/setters are found and Mockito injects the appropriate mocks for you There is no explicit creation of the object under test You don't need to provide the mocks as arguments of the constructor Mockito @InjectMocksuses either constructor injection, property injection or setter injection Taking advantage of advanced mocks configuration Mockito gives you a possibility of providing different answers for your mocks. Let's focus more on that. Getting more information on NullPointerException Remember the Waiter's askTheCleaningServiceToCleanTheRestaurantMethod(): @Override public boolean askTheCleaningServiceToCleanTheRestaurant (TypeOfCleaningService typeOfCleaningService) { CleaningService cleaningService = cleaningServiceFactory.getMe ACleaningService(typeOfCleaningService); try{ cleaningService.cleanTheTables(); cleaningService.sendInformationAfterCleaning(); return SUCCESSFULLY_CLEANED_THE_RESTAURANT; }catch(CommunicationException communicationException){ System.err.println("An exception took place while trying to send info about cleaning the restaurant"); return FAILED_TO_CLEAN_THE_RESTAURANT; } } Let's assume that we want to test this function. We inject the CleaningServiceFactory as a mock but we forgot to stub the getMeACleaningService(…) method. Normally we would get a NullPointerException since, if the method is not stubbed, it will return null. But what will happen, if as an answer we would provide a RETURNS_SMART_NULLS answer? Let's take a look at the body of the following test: @Mock(answer = Answers.RETURNS_SMART_NULLS) CleaningServiceFactory cleaningServiceFactory; @InjectMocks WaiterImpl objectUnderTest; @Test public void shouldThrowSmartNullPointerExceptionWhenUsingUnstubbedMethod() { //given // Oops forgotten to stub the CleaningServiceFactory.getMeACle aningService(TypeOfCleaningService) method try { //when objectUnderTest.askTheCleaningServiceToCleanTheRestaurant( TypeOfCleaningService.VERY_FAST); fail(); } catch (SmartNullPointerException smartNullPointerException) { System.err.println("A SmartNullPointerException will be thrown here with a very precise information about the error [" + smartNullPointerException + "]"); } } What happened in the test is that: We create a mock with an answer RETURNS_SMART_NULLS of the CleaningServiceFactory The mock is injected to the WaiterImpl We do not stub the getMeACleaningService(…) of the CleaningServiceFactory The SmartNullPointerException will be thrown at the line containing the cleaningService.cleanTheTables() It will contain very detailed information about the reason for the exception to happen and where it happened In order to have the RETURNS_SMART_NULLS as the default answer (you wouldn't have to explicitly define the answer for your mock), you would have to create the class named MockitoConfiguration in a package org.mockito.configuration that either extends the DefaultMockitoConfiguration or implements the IMockitoConfiguration interface: package org.mockito.configuration; import org.mockito.internal.stubbing.defaultanswers.ReturnsSmartNulls; import org.mockito.stubbing.Answer; public class MockitoConfiguration extends DefaultMockitoConfiguration { public Answer<Object> getDefaultAnswer() { return new ReturnsSmartNulls(); } } Summary In this article we learned in detail about reducing the boilerplate code with annotations, and taking advantage of advanced mocks configuration, along with their code implementation. Resources for Article : Further resources on this subject: Testing your App [Article] Drools JBoss Rules 5.0 Flow (Part 2) [Article] Easily Writing SQL Queries with Spring Python [Article]
Read more
  • 0
  • 0
  • 11388

article-image-learning-musescore
Packt
04 Sep 2013
15 min read
Save for later

Learning MuseScore

Packt
04 Sep 2013
15 min read
(For more resources related to this topic, see here.) Entering notes In order to enter notes into our score, we need to enter Note Entry mode. MuseScore has various modes that we can use to accomplish special tasks. You can enter Note Entry mode by clicking on the N button in the toolbar. You can tell whether you are in Note Entry mode at any given time by checking whether the N button is depressed. You may also enter/exit Note Entry mode by pressing the N key. After you enter Note Entry mode, the quarter note should be selected by default. If you hover over the staff, you should see a light blue outline of a note appear. Clicking here will cause a quarter note of that pitch to be inserted. In the toolbar, you will see several notes of different lengths, such as half notes, eighth notes, and whole notes. This area is called the Note Entry toolbar, and indicates which note will be inserted when you click on the staff. Right now, the quarter note should be selected. Click on the half note, and then click an area of the staff on top of the rest that is immediately after the quarter note we just inserted. A half note of the pitch you chose will be added. In MuseScore, whenever we add notes, we must overwrite other notes. First, we overwrote a whole rest with a quarter note, which caused three beats of rest to be added after the quarter note. Then, we overwrote a quarter rest with a half note. Since the half note was longer than the quarter rest, it also overwrote one beat from the half rest following it, and changed the rest to a quarter rest to accommodate the size of the half note. To add an accidental, simply insert the note without the accidental, and then press the appropriate accidental button in the toolbar. For example, let's insert an F eighth note. We click on the eighth note button, then on the F line of the staff, and finally on the sharp button in the toolbar. We can insert dotted notes in a similar fashion by using the dot button on the Note Entry toolbar. In the next measure, let's add a G dotted quarter note by clicking on the quarter note in the Note Entry toolbar, then clicking on the dot button, and then clicking on a G in the staff. The dot will stay selected after you insert the note. If you would like to deselect the dot, you can click on it again. It is also automatically deselected when you change the note duration. Thus, you should always select the dot after you select the value of the note you would like to be dotted. It is possible to notate more quickly using keyboard shortcuts. The number keys 1 through 9 will select different durations, and the letters A through G will insert the designated note. The 0 key inserts a rest. Inserting notes this way will always insert the closest note with the desired pitch. If you hold Ctrl (or on Mac) while pressing the up or down arrow keys, MuseScore will move the last note you inserted up or down an octave. So, inserting a C half note and moving it up an octave can be accomplished by pressing the sequence 6 C Ctrl + ↑. Notes can be adjusted by a half step by pressing the up or down arrows without holding the Ctrl key. Hitting the up arrow will always create sharps, and the down arrow creates flats. This allows us to insert an F eighth note with the keystroke sequence 4 f ↑. While at first the keyboard shortcuts may seem complicated, as you get the hang of MuseScore, it is worthwhile to learn them. They will allow you to notate music extremely quickly and make your overall experience with MuseScore much more pleasurable. Making chords is also very straightforward. We just click on top of our previously inserted note after selecting a note of the same value. Be careful! If a different note length is selected, it will overwrite the previous note. Chords can also be inserted rapidly with keyboard shortcuts. Just start by inserting the first note of the chord normally. If you would like to insert a note of the chord above the previous note, hold Alt and press the interval above the previous note you would like to insert. To insert it below, hold Shift and do the same. Notes are always inserted in the present key signature. So to insert a C first inversion chord, press the sequence E Alt + 3 Alt + 4, or to insert a C second inversion chord, press the sequence G Alt + 4 Alt + 3. Alternatively, after inserting the first note, you can hold Shift and type the letter names of the notes to add to the chord. So pressing the sequence G Shift+C Shift+E would insert the same C second inversion chord. If you ever make a mistake, you can always undo your latest changes by going to the Edit menu and selecting Undo. You can also use the keyboard shortcut Ctrl + Z (or + Z on Mac). Let's put some notes and chords in some measures for both the trombone and piano parts so that we have something to work with. Inserting triplets To insert a triplet, first enter Note Entry mode. Then, from the Note Entry toolbar, choose the total duration that you would like all three triplets to sum to. Next, insert the first note of the triplet in the position you would like the triplet to occupy. After this, exit Note Entry mode, and from the Notes menu, under the Tuplets submenu, click on the Triplet option. A triplet will be created with the selected note as the first note. MuseScore will automatically enter Note Entry mode for you again, and select the correct duration of note needed to complete the triplet. From here, you can replace the two rests with notes by inserting the correct notes on top of them, as we did when we entered notes previously. Also, there is a keyboard shortcut to make this process easier. While in Note Entry mode, select the proper duration you would like the entire triplet to be, as before, but then hit Ctrl + 3 (or + 3 on Mac). The triplet will be inserted, and the proper note duration to fill in the triplet will be selected. You can now enter the notes of the triplet as you would enter normal notes. For instance, to insert a triplet arpeggio of an F major triad totaling one beat, we would press the sequence 5 Ctrl + 3 F A C. For a B major triad totaling two beats, we would similarly press 6 Ctrl + 3 B D ↑ F ↑. Inserting ties Ties are very easy to create in MuseScore. The simplest way to insert a tie is to insert both of the notes that you want to be tied together, exit Note Entry mode, click on the first note, and then click on the tie button in the toolbar, or press the + key. Make sure the two notes you are trying to tie together have the same pitch, or no tie will be inserted. This method works for individual notes, and also for chords. In order to have flexibility when tying chords, you must tie each note of the chord individually if you want the full chord to be tied. An easy way to do this is to ensure that you are not in Note Entry mode, hold Shift, click on the first note of the first chord so that the whole chord is selected, and press the + key. Again, for this to work, you must have two chords with identical pitches next to each other. If you are working with keyboard shortcuts, then there is also a faster way to enter ties that does not require the use of the mouse. After you enter a note in Note Entry mode, the note you just entered will be selected, and the cursor will be located on the right-hand side of this note, as shown in the following screenshot: Then, using the appropriate keyboard shortcut, select the duration of note you would like this note to be tied to. Finally, press the + key. MuseScore will insert a note of the selected duration tied to the previous note. So, pressing the sequence 5 C 4 + will insert a quarter note C tied to an eighth note. While this method is extremely convenient for single notes, it does not work for chords. Often, it is necessary to flip the tie for visual appeal, especially when tying chords. This can be accomplished by ensuring that you are not in Note Entry mode, clicking on a tie, and then pressing the X key. Even though ties look very similar to slurs in many situations, they are created differently. Slurs will be discussed later. Copying and pasting Suppose that we would like to repeat a measure in the bass line, or that the next measure in the melody is very similar to the previous measure. As in a word processor, we can copy and paste measures and fragments of music. First, let's copy and paste a measure. Exit Note Entry mode by ensuring the N button in the toolbar is not selected. Then, click on a portion of the measure where no notes are present. The measure should be selected, as indicated by the blue box around it. Now, either go to the Edit menu and click on Copy, or press Ctrl + C ( + C on Mac). The measure will be copied to the clipboard. Now, click on a portion of the target measure without any notes, and either click on Paste from the Edit menu, or press Ctrl + V ( + V on Mac). The notes will be inserted, and the target measure will be overwritten. It is also possible to copy any portion of your score, even if it spans partial measures or multiple staves. First, click on the note at the top-left of the region you want to copy. In the following example, this would be the E♭ in the right hand. Then, press and hold the Shift key, and click on the note at the bottom right corner of the region you would like to copy. Here, that would be the D in the left hand. MuseScore will select all of the notes in between. Once you have selected the region, you can copy it in the same way you copied the measure before. To paste the region, click on the first note or rest in the uppermost stave where you would like to paste it, and paste as we did with a single measure using either Ctrl + V or Paste from the Edit menu. If your selection has different measure breaks or is in a different meter than the destination, the selection will be reflowed to fit the destination, and ties will be added as necessary. Inserting and deleting measures Often, it is helpful to insert or delete a measure in your score. Luckily, MuseScore makes this extremely easy. To insert a measure, select the measure (as we did when we copied a measure) immediately after the location where you would like to insert the measure. Then, go to the Create menu, and under the Measures submenu, select Insert Measure. A measure will be inserted. To insert multiple measures, select Insert Measures. A dialog box will prompt you for how many measures to insert. If you would like to add measures to the end of the score, you can select Append Measures from under the Measures submenu within the Create menu. There is no need to select any measures to perform this operation. To delete measures, simply select the measure by clicking any blank area within the measure, and then go to the Edit menu, and click on Delete Selected Measures. Doing so will delete this measure position within all staves, not just the selected staff. You can also select multiple measures (as we did earlier when we were copying by selecting one measure, holding the Shift key, and selecting additional measures), and use the same menu button to delete all of the measures that you have selected. Chord symbols In jazz and popular music, it is very common to give musicians chord symbols to read from. To create a chord symbol, make sure you are not in Note Entry mode, and click on a note that you would like to add a chord symbol to. Then, either go to the Create menu, go to the Text submenu, and select Chord Name, or press Ctrl + K ( + K on Mac). A text box should appear that looks exactly like the ones we saw before. Now, you can type the name of the chord in the same way you would write it on paper. (For example, D minor would be Dm, and a G7 chord would just be G7.) All lowercase b characters will be converted into flat signs, and all # characters will be converted into sharps. To move to the next location in the measure, press the space bar. If you press the space bar repeatedly, you will move forward without inserting any chords. Now that our chords are inserted, we can optionally make them look stylized. To do this, go to the Style menu and click on Edit General Style. Then, click on the Chordnames option on the left-hand side. You should see a textbox appear on the right-hand side containing the text stdchords.xml. Change this to jazzchords.xml, and then press OK. The chords you entered should be appropriately stylized. Many styles of notation, especially within jazz music, use chord symbols and slashes to indicate improvisation. To create these slashes in MuseScore, insert four quarter notes on the middle line of the staff. Then, after exiting Note Entry mode, right-click on each note and select Note Properties. Check the box that says Stemless. Also, find the option labeled velocity type and choose user, and then change the value of the box velocity (0-127) to 0. Now press OK. Then, locate the section of the palette labeled Note Heads, and drag the parallelogram slash shape on top of each note. This will create the slash notation. Beaming The proper beaming of notes is a key feature of quality engraved scores that often goes unappreciated. It is extremely easy to change the beaming patterns to enhance the readability of your score. There are several utilities in the palette that allow for this. To start, go to the section of the palette labeled Beam Properties. Hovering over each icon will tell you what it does. These properties can be applied to different notes. The Start beam option is for notes in the middle of an existing beam. It breaks the existing beam at the specified note, and starts a new beam on that note. The Middle of the beam option will ensure that the selected note is beamed to the notes on both sides of it, and the No beam option will break any beams going to the selected note. Let's learn how to use these with a simple use case scenario. Suppose you enter three eighth notes followed by an eighth rest. MuseScore will automatically choose the following beaming: However, to a musician who is sight-reading, it may be easy to confuse this with a triplet. To correct this, simply drag the No beam icon on top of the third eighth note in the passage. The note should highlight red as you hover over it, before you drop it. Once you let go of the mouse button, MuseScore will automatically adjust the beam according to what you specified. Similarly, choosing the beaming wisely can make difficult passages easier to read. Let's consider the case of two sixteenth notes followed by two eighth notes and two more sixteenth notes. Especially with the sharps and flats in this example, it would not be easy to sight-read such a passage. However, dragging the Start beam option on top of the B♮ makes this passage much cleaner and easier to read. To undo any of these changes, ensure that you are not in Note Entry mode, and click on the note that you have changed. Then, in the Beam Properties section of the palette, double-click the A icon to reset it back to default. Though MuseScore uses standard conventions for whether to put the beam above or below the notes, if you would like to change this, simply ensure that you are not in Note Entry mode, click on the beam, and press the X key. The beam will flip to the other side of the staff. Summary In this article, we learned the basics of creating notes including ties and triplets, copying and pasting measures, creating chord symbols, and also changing the beaming patterns to enhance the readability of our score. Resources for Article: Further resources on this subject: Importing and Adding Background Music with Audacity 1.3 [Article] New iPad Features in iOS 6 [Article] Quick start – media files and XBMC [Article]
Read more
  • 0
  • 0
  • 4553

article-image-article-rapid-development
Packt
04 Sep 2013
7 min read
Save for later

Rapid Development

Packt
04 Sep 2013
7 min read
(For more resources related to this topic, see here.) Concept of reusability The concept of reusability has its roots in the production process. Typically, most of us go about creating e-learning using a process similar to what is shown in the following screenshot. It works well for large teams and the one man band, except in the latter case, you become a specialist for all the stages of production. That's a heavy load. It's hard to be good at all things and it demands that you constantly stretch and improve your skills, and find ways to increase the efficiency of what you do. Reusability in Storyline is about leveraging the formatting, look and feel and interactions you create so that you can re-purpose your work and speed-up production. Not every project will be an original one-off, in fact most won't, so the concept is to approach development with a plan to repurpose 80 percent of the media, quizzes, interactions, and designs you create. As you do this, you begin to establish processes, templates, and libraries that can be used to rapidly assemble base courses. With a little tweaking and some minor customization, you'll have a new, original course in no time. Your client doesn't need to know that 80 percent was made from reusable elements with just 20 percent created as original, unique components, but you'll know the difference in terms of time and effort. Leveraging existing assets So how can you leverage existing assets with Storyline? The first things you'll want to look at are the courses you've built with other authoring programs, such as PowerPoint, QuizMaker Engage, Captivate, Flash, and Camtasia. If there are design themes, elements, or interactions within these courses that you might want to use for future Storyline courses, you should focus your efforts on importing what you can, and further adjusting within Storyline to create a new version of the asset that can be reused for future Storyline courses. If re-working the asset is too complex or if you don't expect to reuse it in multiple courses, then using Storyline's web object feature to embed the interaction without re-working it in any way may be the better approach. In both cases, you'll save time by reusing content you've already put a lot of time in developing. Importing external content Here are the steps to bring external content into Storyline: From the Articulate Startup screen or by choosing the Insert tab, and then New Slide within a project, select the Import option. There are options to import PowerPoint, Quizmaker, and Storyline. All of these will display the slides within the file to be imported. You can pick and choose which slides to import into a new or the current scene in Storyline. The Engage option displays the entire interaction that can be imported into a single slide in the current or a new scene. Click on Import to complete the process. Considerations when importing Keep the following points in mind when importing: PowerPoint and Quizmaker files can be imported directly into Storyline. Once imported, you can edit the content like you would any other Storyline slide. Master slides come along with the import making it simple to reuse previous designs. Note that 64-bit PowerPoint is not supported and you must have an installed, activated version of Quizmaker for the import to work. The PowerPoint to Storyline conversion is not one-to-one. You can expect some alignment issues with slide objects due to the fact that PowerPoint uses points and Storyline uses pixels. There are 2.66 pixels for each point which is why you'll need to tweak the imported slides just a bit. Same with Quizmaker though the reason why is slightly different; Quizmaker is 686 x 424 in size, whereas Storyline is 720 x 540 by default. Engage files can be imported into Storyline and they are completely functional, but cannot be edited within Storyline. Though the option to import Engage appears on the Import screen, what Storyline is really doing is creating a web object to contain the Engage interaction. Once imported into a new scene, clicking on the Engage interaction will display an Options menu where you can make minor adjustments to the behavior of the interaction as well as Preview and Edit in it Engage. You can also resize and position the interaction just as you would any web object. Remember that though web objects work in iPad and HTML5 outputs, Engage content is Flash, so it will not playback on an iPad or in an HTML5 browser. Like Quizmaker, you'll need an installed, activated version of Engage for the import to work. Flash, Captivate, and Camtasia files cannot be imported in Storyline and cannot be edited within Storyline. You can however, use web objects to embed these projects into Storyline or the Insert Flash option. In both cases, the imported elements appear seamless to the learner while retaining full functionality.   Build once, and reuse many times Quizzing is at the heart of many e-learning courses where often the quiz questions need to be randomized or even reused in different sections of a single course (that is, the same questions for a pre and post-test). The concept of building once and reusing many times works well with several aspects of Storyline. We'll start with quizzing and a feature called Question Banks as follows: Question Banks Question Bank offers a way to pool, reuse, and randomize questions within a project. Slides in a question bank are housed within the project file but are not visible until placed into the story. Question Banks can include groups of quiz slides and regular slides (that is, you might include a regular slide if you need to provide instructions for the quiz or would like to include a post-quiz summary). When you want to include questions from a Question Bank, you just need to insert a new Quizzing slide, and then choose Draw from Bank . You can then select one or more questions to include and randomize them if desired. Follow along… In this exercise we will be removing three questions from a scene and moving them into a question bank. This will allow you to draw one or more of those questions at any point in the project where the quiz questions are needed, as follows: From the Home tab, choose Question Banks , and then Create Question bank . Title this Identity Theft Questions . Notice that a new tab has opened in Normal View . The Question Bank appears in this tab. Click on the Import link and navigate to question slides 2, 3, and 4. From the Import drop-down menu at the top, select move questions into question bank . Click on the Story View tab and notice the three slides containing the quiz questions are no longer in the story. Click back on the Identity Theft tab and notice that they are located here. The questions will not become a part of the story until the next step, when you draw them from the bank. In Story View, click once on slide 1 to select it, and then from the Home tab, choose Question Banks and New Draw from Question Bank . From the Question Bank drop-down menu, select Identity Theft Questions . All questions will be selected by default and will be randomized after being placed into the story. This means that the learner will need to answer three questions before continuing onto the next slide in the story. Click on Insert . The Question Bank draw has been inserted as slide 2. To see how this works, Preview the scene. Save as Exercise 11 – Identity Theft Quiz.   There are multiple ways to get back to the questions that are in a question bank. You can do this by selecting the tab the questions are located in (in this case, Identity Theft ), you can view the question bank slide in Normal View or choose Question Banks from the Home tab and navigate to the name of the question bank you'd like to edit.
Read more
  • 0
  • 0
  • 1452
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-so-what-openelec
Packt
04 Sep 2013
3 min read
Save for later

So, what is OpenELEC?

Packt
04 Sep 2013
3 min read
(For more resources related to this topic, see here.) Open Embedded Linux Entertainment Center (OpenELEC) is an open source embedded operating system developed specifically for the purpose of running complete and easy-to-use media center solutions on a wide variety of hardware. Built around the great open source media player and organizer, XBMC, OpenELEC is optimized to deliver a smooth, intuitive, and efficient user experience. It is developed with the goal of making the task of setting up and maintaining a Home Theater PC (HTPC) easy and straightforward for all users, regardless of their technical skills. This makes OpenELEC the obvious choice for anyone looking to enhance their home media experience with a fully functional media center. The OpenELEC project is open source, so anyone can download and use the operating system completely free of charge. This is a great enticement to get started with setting up a media center of your own, capable of providing many hours of digital entertainment for yourself and your friends and family to enjoy. Media center features Once installed and configured, OpenELEC incorporates XBMC to provide a wide variety of advanced features in a streamlined and straightforward interface. The following are the media types that can be indexed and/or browsed: Movies TV shows Music Pictures The experience of browsing indexed media is enhanced by automatic inclusion of media information and relevant images available from online databases. This provides easy access to ratings, resumes, artwork, and trailers, because everything is incorporated directly in the interface. Hardware requirements Because OpenELEC is very versatile in hardware compatibility, you will most likely be able to turn "that old PC, which has just been lying around" into a fully functional media center, just by installing OpenELEC on it. This approach is great for a fun hobby project, giving you the opportunity to experiment with OpenELEC at no cost. Alternatively, you can get brand new PCs small enough to fit behind a TV, making them the obvious choice for HTPC use. With a very small footprint and compatibility with small atom- or fusion-based platforms, this is where OpenELEC shows its full potential as a lightweight embedded operating system. Summary So, what is OpenELEC? finds out what OpenELEC actually is, what you can do with it, and why it’s so great. Resources for Article : Further resources on this subject: Creating a file server (Samba) [Article] Webcam and Video Wizardry [Article] Our First Project – A Basic Thermometer [Article]
Read more
  • 0
  • 0
  • 2931

article-image-understanding-point-time-recovery
Packt
04 Sep 2013
28 min read
Save for later

Understanding Point-In-Time-Recovery

Packt
04 Sep 2013
28 min read
(For more resources related to this topic, see here.) Understanding the purpose of PITR PostgreSQL offers a tool called pg_dump to backup a database. Basically, pg_dump will connect to the database, read all the data in transaction isolation level "serializable" and return the data as text. As we are using "serializable", the dump is always consistent. So, if your pg_dump starts at midnight and finishes at 6 A.M, you will have created a backup, which contains all the data as of midnight but no further data. This kind of snapshot creation is highly convenient and perfectly feasible for small to medium amounts of data. A dump is always consistent. This means that all foreign keys are intact; new data added after starting the dump will be missing. It is most likely the most common way to perform standard backups. But, what if your data is so valuable and maybe so large in size that you want to backup it incrementally? Taking a snapshot from time to time might be enough for some applications; for highly critical data, it is clearly not. In addition to that, replaying 20 TB of data in textual form is not efficient either. Point-In-Time-Recovery has been designed to address this problem. How does it work? Based on a snapshot of the database, the XLOG will be replayed later on. This can happen indefinitely or up to a point chosen by you. This way, you can reach any point in time. This method opens the door to many different approaches and features: Restoring a database instance up to a given point in time Creating a standby database, which holds a copy of the original data Creating a history of all changes In this article, we will specifically feature on the incremental backup functionality and describe how you can make your data more secure by incrementally archiving changes to a medium of choice. Moving to the bigger picture The following picture provides an overview of the general architecture in use for Point-In-Time-Recovery: PostgreSQL produces 16 MB segments of transaction log. Every time one of those segments is filled up and ready, PostgreSQL will call the so called archive_command. The goal of archive_command is to transport the XLOG file from the database instance to an archive. In our image, the archive is represented as the pot on the bottom-right side of the image. The beauty of the design is that you can basically use an arbitrary shell script to archive the transaction log. Here are some ideas: Use some simple copy to transport data to an NFS share Run rsync to move a file Use a custom made script to checksum the XLOG file and move it to an FTP server Copy the XLOG file to a tape The possible options to manage XLOG are only limited by imagination. The restore_command is the exact counterpart of the archive_command. Its purpose is to fetch data from the archive and provide it to the instance, which is supposed to replay it (in our image, this is labeled as Restored Backup). As you have seen, replay might be used for replication or simply to restore a database to a given point in time as outlined in this article. Again, the restore_command is simply a shell script doing whatever you wish, file by file. It is important to mention that you, the almighty administrator, are in charge of the archive. You have to decide how much XLOG to keep and when to delete it. The importance of this task cannot be underestimated. Keep in mind, when then archive_command fails for some reason, PostgreSQL will keep the XLOG file and retry after a couple of seconds. If archiving fails constantly from a certain point on, it might happen that the master fills up. The sequence of XLOG files must not be interrupted; if a single file is missing, you cannot continue to replay XLOG. All XLOG files must be present because PostgreSQL needs an uninterrupted sequence of XLOG files; if a single file is missing, the recovery process will stop there at the very latest. Archiving the transaction log After taking a look at the big picture, we can take a look and see how things can be put to work. The first thing you have to do when it comes to Point-In-Time-Recovery is to archive the XLOG. PostgreSQL offers all the configuration options related to archiving through postgresql.conf. Let us see step by step what has to be done in postgresql.conf to start archiving: First of all, you should turn archive_mode on. In the second step, you should configure your archive command. The archive command is a simple shell command taking two parameters: %p: This is a placeholder representing the XLOG file that should be archived, including its full path (source). %f: This variable holds the name of the XLOG without the path pointing to it. Let us set up archiving now. To do so, we should create a place to put the XLOG. Ideally, the XLOG is not stored on the same hardware as the database instance you want to archive. For the sake of this example, we assume that we want to apply an archive to /archive. The following changes have to be made to postgresql.conf: wal_level = archive # minimal, archive, or hot_standby # (change requires restart) archive_mode = on # allows archiving to be done # (change requires restart) archive_command = 'cp %p /archive/%f' # command to use to archive a logfile segment # placeholders: %p = path of file to archive # %f = file name only Once those changes have been made, archiving is ready for action and you can simply restart the database to activate things. Before we restart the database instance, we want to focus your attention on wal_level. Currently three different wal_level settings are available: minimal archive hot_standby The amount of transaction log produced in the case of just a single node is by far not enough to synchronize an entire second instance. There are some optimizations in PostgreSQL, which allow XLOG-writing to be skipped in the case of single-node mode. The following instructions can benefit from wal_level being set to minimal: CREATE TABLE AS, CREATE INDEX, CLUSTER, and COPY (if the table was created or truncated within the same transaction). To replay the transaction log, at least archive is needed. The difference between archive and hot_standby is that archive does not have to know about currently running transactions. For streaming replication, however, this information is vital. Restarting can either be done through pg_ctl –D /data_directory –m fast restart directly or through a standard init script. The easiest way to check if our archiving works is to create some useless data inside the database. The following code snippets shows a million rows can be made easily: test=# CREATE TABLE t_test AS SELECT * FROM generate_series(1,1000000);SELECT 1000000test=# SELECT * FROM t_test LIMIT 3;generate_series----------------- 1 2 3(3 rows) We have simply created a list of numbers. The important thing is that 1 million rows will trigger a fair amount of XLOG traffic. You will see that a handful of files have made it to the archive: iMac:archivehs$ ls -ltotal 131072-rw------- 1 hs wheel 16777216 Mar 5 22:31000000010000000000000001-rw------- 1 hs wheel 16777216 Mar 5 22:31000000010000000000000002-rw------- 1 hs wheel 16777216 Mar 5 22:31000000010000000000000003-rw------- 1 hs wheel 16777216 Mar 5 22:31000000010000000000000004 Those files can be easily used for future replay operations. If you want to save storage, you can also compress those XLOG files. Just add gzip to your archive_command. Taking base backups In the previous section, you have seen that enabling archiving takes just a handful of lines and offers a great deal of flexibility. In this section, we will see how to create a so called base backup, which can be used to apply XLOG later on. A base backup is an initial copy of the data. Keep in mind that the XLOG itself is more or less worthless. It is only useful in combination with the initial base backup. In PostgreSQL, there are two main options to create an initial base backup: Using pg_basebackup Traditional copy/rsync based methods The following two sections will explain in detail how a base backup can be created: Using pg_basebackup The first and most common method to create a backup of an existing server is to run a command called pg_basebackup, which was introduced in PostgreSQL 9.1.0. Basically pg_basebackup is able to fetch a database base backup directly over a database connection. When executed on the slave, pg_basebackup will connect to the database server of your choice and copy all the data files in the data directory over to your machine. There is no need to log into the box anymore, and all it takes is one line of code to run it; pg_basebackup will do all the rest for you. In this example, we will assume that we want to take a base backup of a host called sample.postgresql-support.de. The following steps must be performed: Modify pg_hba.conf to allow replication Signal the master to take pg_hba.conf changes into account Call pg_basebackup Modifying pg_hba.conf To allow remote boxes to log into a PostgreSQL server and to stream XLOG, you have to explicitly allow replication. In PostgreSQL, there is a file called pg_hba.conf, which tells the server which boxes are allowed to connect using which type of credentials. Entire IP ranges can be allowed or simply discarded through pg_hba.conf. To enable replication, we have to add one line for each IP range we want to allow. The following listing contains an example of a valid configuration: # TYPE DATABASE USER ADDRESS METHODhost replication all 192.168.0.34/32 md5 In this case we allow replication connections from 192.168.0.34. The IP range is identified by 32 (which simply represents a single server in our case). We have decided to use MD5 as our authentication method. It means that the pg_basebackup has to supply a password to the server. If you are doing this in a non-security critical environment, using trust as authentication method might also be an option. What happens if you actually have a database called replication in your system? Basically, setting the database to replication will just configure your streaming behavior, if you want to put in rules dealing with the database called replication, you have to quote the database name as follows: "replication". However, we strongly advise not to do this kind of trickery to avoid confusion. Signaling the master server Once pg_hba.conf has been changed, we can tell PostgreSQL to reload the configuration. There is no need to restart the database completely. We have three options to make PostgreSQL reload pg_hba.conf: By running an SQL command: SELECT pg_reload_conf(); By sending a signal to the master: kill –HUP 4711 (with 4711 being the process ID of the master) By calling pg_ctl: pg_ctl –D $PGDATA reload (with $PGDATA being the home directory of your database instance) Once we have told the server acting as data source to accept streaming connections, we can move forward and run pg_basebackup. pg_basebackup – basic features pg_basebackup is a very simple-to-use command-line tool for PostgreSQL. It has to be called on the target system and will provide you with a ready-to-use base backup, which is ready to consume the transaction log for Point-In-Time-Recovery. The syntax of pg_basebackup is as follows: iMac:dbhs$ pg_basebackup --help pg_basebackup takes a base backup of a running PostgreSQL server. Usage: pg_basebackup [OPTION]... Options controlling the output: -D, --pgdata=DIRECTORY receive base backup into directory -F, --format=p|t output format (plain (default), tar) -x, --xlog include required WAL files in backup (fetch mode) -X, --xlog-method=fetch|stream include required WAL files with specified method -z, --gzip compress tar output -Z, --compress=0-9 compress tar output with given compression level General options: -c, --checkpoint=fast|spread set fast or spread checkpointing -l, --label=LABEL set backup label -P, --progress show progress information -v, --verbose output verbose messages -V, --version output version information, then exit -?, --help show this help, then exit Connection options: -h, --host=HOSTNAME database server host or socket directory -p, --port=PORT database server port number -s, --status-interval=INTERVAL time between status packets sent to server (in seconds) -U, --username=NAME connect as specified database user -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) A basic call to pg_basebackup would look like that: iMac:dbhs$ pg_basebackup -D /target_directory -h sample.postgresql-support.de In this example, we will fetch the base backup from sample.postgresql-support.de and put it into our local directory called /target_directory. It just takes this simple line to copy an entire database instance to the target system. When we create a base backup as shown in this section, pg_basebackup will connect to the server and wait for a checkpoint to happen before the actual copy process is started. In this mode, this is necessary because the replay process will start exactly at this point in the XLOG. The problem is that it might take a while until a checkpoint kicks in; pg_basebackup does not enforce a checkpoint on the source server straight away to make sure that normal operations are not disturbed. If you don't want to wait on a checkpoint, consider using --checkpoint=fast. It will enforce an instant checkpoint and pg_basebackup will start copying instantly. By default, a plain base backup will be created. It will consist of all the files in directories found on the source server. If the base backup should be stored on tape, we suggest to give –-format=t a try. It will automatically create a TAR archive (maybe on a tape). If you want to move data to a tape, you can save an intermediate step easily this way. When using TAR, it is usually quite beneficial to use it in combination with --gzip to reduce the size of the base backup on disk. There is also a way to see a progress bar while doing the base backup but we don't recommend to use this option (--progress) because it requires pg_basebackup to determine the size of the source instance first, which can be costly. pg_basebackup – self-sufficient backups Usually a base backup without XLOG is pretty worthless. This is because the base backup is taken while the master is fully operational. While the backup is taken, those storage files in the database instance might have been modified heavily. The purpose of the XLOG is to fix those potential issues in the data files reliably. But, what if we want to create a base backup, which can live without (explicitly archived) XLOG? In this case, we can use the --xlog-method=stream option. If this option has been chosen, pg_basebackup will not just copy the data as it is but it will also stream the XLOG being created during the base backup to our destination server. This will provide us with just enough XLOG to allow us to start a base backup made that way directly. It is self-sufficient and does not need additional XLOG files. This is not Point-In-Time-Recovery but it can come in handy in case of trouble. Having a base backup, which can be started right away, is usually a good thing and it comes at fairly low cost. Please note that --xlog-method=stream will require two database connections to the source server, not just one. You have to keep that in mind when adjusting max_wal_senders on the source server. If you are planning to use Point-In-Time-Recovery and if there is absolutely no need to start the backup as it is, you can safely skip the XLOG and save some space this way (default mode). Making use of traditional methods to create base backups These days pg_basebackup is the most common way to get an initial copy of a database server. This has not always been the case. Traditionally, a different method has been used which works as follows: Call SELECT pg_start_backup('some label'); Copy all data files to the remote box through rsync or any other means. Run SELECT pg_stop_backup(); The main advantage of this old method is that there is no need to open a database connection and no need to configure XLOG-streaming infrastructure on the source server. A main advantage is also that you can make use of features such as ZFS-snapshots or similar means, which can help to dramatically reduce the amount of I/O needed to create an initial backup. Once you have started pg_start_backup, there is no need to hurry. It is not necessary and not even especially desirable to leave the backup mode soon. Nothing will happen if you are in backup mode for days. PostgreSQL will archive the transaction log as usual and the user won't face any kind of downside. Of course, it is bad habit not to close backups soon and properly. However, the way PostgreSQL works internally does not change when a base backup is running. There is nothing filling up, no disk I/O delayed, or anything of this sort. Tablespace issues If you happen to use more than one tablespace, pg_basebackup will handle this just fine if the filesystem layout on the target box is identical to the filesystem layout on the master. However, if your target system does not use the same filesystem layout there is a bit more work to do. Using the traditional way of doing the base backup might be beneficial in this case. In case you are using --format=t (for TAR), you will be provided with one TAR file per tablespace. Keeping an eye on network bandwidth Let us imagine a simple scenario involving two servers. Each server might have just one disk (no SSDs). Our two boxes might be interconnected through a 1 Gbit link. What will happen to your applications if the second server starts to run a pg_basebackup? The second box will connect, start to stream data at full speed and easily kill your hard drive by using the full bandwidth of your network. An application running on the master might instantly face disk wait and offer bad response times. Therefore it is highly recommended to control the bandwidth used up by rsync to make sure that your business applications have enough spare capacity (mainly disk, CPU is usually not an issue). If you want to limit rsync to, say, 20 MB/sec, you can simply use rsync --bwlimit=20000. This will definitely make the creation of the base backup take longer but it will make sure that your client apps will not face problems. In general we recommend a dedicated network interconnect between master and slave to make sure that a base backup does not affect normal operations. Limiting bandwidth cannot be done with pg_basebackup onboard functionality.Of course, you can use any other tool to copy data and achieve similar results. If you are using gzip compression with –-gzip, it can work as an implicit speed brake. However, this is mainly a workaround. Replaying the transaction log Once we have created ourselves a shiny initial base backup, we can collect the XLOG files created by the database. When the time has come, we can take all those XLOG files and perform our desired recovery process. This works as described in this section. Performing a basic recovery In PostgreSQL, the entire recovery process is governed by a file named recovery.conf, which has to reside in the main directory of the base backup. It is read during startup and tells the database server where to find the XLOG archive, when to end replay, and so forth. To get you started, we have decided to include a simple recovery.conf sample file for performing a basic recovery process: restore_command = 'cp /archive/%f %p'recovery_target_time = '2013-10-10 13:43:12' The restore_command is essentially the exact counterpart of the archive_command you have seen before. While the archive_command is supposed to put data into the archive, the restore_command is supposed to provide the recovering instance with the data file by file. Again, it is a simple shell command or a simple shell script providing one chunk of XLOG after the other. The options you have here are only limited by imagination; all PostgreSQL does is to check for the return code of the code you have written, and replay the data provided by your script. Just like in postgresql.conf, we have used %p and %f as placeholders; the meaning of those two placeholders is exactly the same. To tell the system when to stop recovery, we can set the recovery_target_time. The variable is actually optional. If it has not been specified, PostgreSQL will recover until it runs out of XLOG. In many cases, simply consuming the entire XLOG is a highly desirable process; if something crashes, you want to restore as much data as possible. But, it is not always so. If you want to make PostgreSQL stop recovery at a specific point in time, you simply have to put the proper date in. The crucial part here is actually to know how far you want to replay XLOG; in a real work scenario this has proven to be the trickiest question to answer. If you happen to a recovery_target_time, which is in the future, don't worry, PostgreSQL will start at the very last transaction available in your XLOG and simply stop recovery. The database instance will still be consistent and ready for action. You cannot break PostgreSQL, but, you might break your applications in case data is lost because of missing XLOG. Before starting PostgreSQL, you have to run chmod 700 on the directory containing the base backup, otherwise, PostgreSQL will error out: iMac:target_directoryhs$ pg_ctl -D /target_directorystartserver startingFATAL: data directory "/target_directory" has group or world accessDETAIL: Permissions should be u=rwx (0700). This additional security check is supposed to make sure that your data directory cannot be read by some user accidentally. Therefore an explicit permission change is definitely an advantage from a security point of view (better safe than sorry). Now that we have all the pieces in place, we can start the replay process by starting PostgreSQL: iMac:target_directoryhs$ pg_ctl –D /target_directory startserver startingLOG: database system was interrupted; last known up at 2013-03-1018:04:29 CETLOG: creating missing WAL directory "pg_xlog/archive_status"LOG: starting point-in-time recovery to 2013-10-10 13:43:12+02LOG: restored log file "000000010000000000000006" from archiveLOG: redo starts at 0/6000020LOG: consistent recovery state reached at 0/60000B8LOG: restored log file "000000010000000000000007" from archiveLOG: restored log file "000000010000000000000008" from archiveLOG: restored log file "000000010000000000000009" from archiveLOG: restored log file "00000001000000000000000A" from archivecp: /tmp/archive/00000001000000000000000B: No such file ordirectoryLOG: could not open file "pg_xlog/00000001000000000000000B" (logfile 0, segment 11): No such file or directoryLOG: redo done at 0/AD5CE40LOG: last completed transaction was at log time 2013-03-1018:05:33.852992+01LOG: restored log file "00000001000000000000000A" from archivecp: /tmp/archive/00000002.history: No such file or directoryLOG: selected new timeline ID: 2cp: /tmp/archive/00000001.history: No such file or directoryLOG: archive recovery completeLOG: database system is ready to accept connectionsLOG: autovacuum launcher started The amount of log produced by the database tells us everything we need to know about the restore process and it is definitely worth investigating this information in detail. The first line indicates that PostgreSQL has found out that it has been interrupted and that it has to start recovery. From the database instance point of view, a base backup looks more or less like a crash needing some instant care by replaying XLOG; this is precisely what we want. The next couple of lines (restored log file ...) indicate that we are replaying one XLOG file after the other that have been created since the base backup. It is worth mentioning that the replay process starts at the sixth file. The base backup knows where to start, so PostgreSQL will automatically look for the right XLOG file. The message displayed after PostgreSQL reaches the sixth file (consistent recovery state reached at 0/60000B8) is of importance. PostgreSQL states that it has reached a consistent state. This is important. The reason is that the data files inside a base backup are actually broken by definition, but, the data files are not broken beyond repair. As long as we have enough XLOG to recover, we are very well off. If you cannot reach a consistent state, your database instance will not be usable and your recovery cannot work without providing additional XLOG. Practically speaking, not being able to reach a consistent state usually indicates a problem somewhere in your archiving process and your system setup. If everything up to now has been working properly, there is no reason not to reach a consistent state. Once we have reached a consistent state, one file after the other will be replayed successfully until the system finally looks for the 00000001000000000000000B file. The problem is that this file has not been created by the source database instance. Logically, an error pops up. Not finding the last file is absolutely normal; this type of error is expected if the recovery_target_time does not ask PostgreSQL to stop recovery before it reaches the end of the XLOG stream. Don't worry, your system is actually fine. You have successfully replayed everything to the file showing up exactly before the error message. As soon as all the XLOG has been consumed and the error message discussed earlier has been issued, PostgreSQL reports the last transaction it was able or supposed to replay, and starts up. You have a fully recovered database instance now and you can connect to the database instantly. As soon as the recovery has ended, recovery.conf will be renamed by PostgreSQL to recovery.done to make sure that it does not do any harm when the new instance is restarted later on at some point. More sophisticated positioning in the XLOG Up to now, we have recovered a database up to the very latest moment available in our 16 MB chunks of transaction log. We have also seen that you can define the desired recovery timestamp. But the question now is: How do you know which point in time to perform the recovery to? Just imagine somebody has deleted a table during the day. What if you cannot easily determine the recovery timestamp right away? What if you want to recover to a certain transaction? recovery.conf has all you need. If you want to replay until a certain transaction, you can refer to recovery_target_xid. Just specify the transaction you need and configure recovery_target_inclusive to include this very specific transaction or not. Using this setting is technically easy but as mentioned before, it is not easy by far to find the right position to replay to. In a typical setup, the best way to find a reasonable point to stop recovery is to use pause_at_recovery_target. If this is set to true, PostgreSQL will not automatically turn into a productive instance if the recovery point has been reached. Instead, it will wait for further instructions from the database administrator. This is especially useful if you don't know exactly how far to replay. You can replay, log in, see how far the database is, change to the next target time, and continue replaying in small steps. You have to set hot_standby = on in postgresql.conf to allow reading during recovery. Resuming recovery after PostgreSQL has paused can be done by calling a simple SQL statement: SELECT pg_xlog_replay_resume(). It will make the instance move to the next position you have set in recovery.conf. Once you have found the right place, you can set the pause_at_recovery_target back to false and call pg_xlog_replay_resume. Alternatively, you can simply utilize pg_ctl –D ... promote to stop recovery and make the instance operational. Was this explanation too complicated? Let us boil it down to a simple list: Add a restore_command to the recovery.conf file. Add recovery_target_time to the recovery.conf file. Set pause_at_recovery_target to true in the recovery.conf file. Set hot_standby to on in postgresql.conf file. Start the instance to be recovered. Connect to the instance once it has reached a consistent state and as soon as it stops recovering. Check if you are already where you want to be. If you are not: Change recovery_target_time. Run SELECT pg_xlog_replay_resume(). Check again and repeat this section if it is necessary. Keep in mind that once recovery has finished and once PostgreSQL has started up as a normal database instance, there is (as of 9.2) no way to replay XLOG later on. Instead of going through this process, you can of course always use filesystem snapshots. A filesystem snapshot will always work with PostgreSQL because when you restart a snapshotted database instance, it will simply believe that it had crashed before and recover normally. Cleaning up the XLOG on the way Once you have configured archiving, you have to store the XLOG being created by the source server. Logically, this cannot happen forever. At some point, you really have to get rid of this XLOG; it is essential to have a sane and sustainable cleanup policy for your files. Keep in mind, however, that you must keep enough XLOG so that you can always perform recovery from the latest base backup. But if you are certain that a specific base backup is not needed anymore, you can safely clean out all the XLOG that is older than the base backup you want to keep. How can an administrator figure out what to delete? The best method is to simply take a look at your archive directory: 000000010000000000000005000000010000000000000006000000010000000000000006.00000020.backup000000010000000000000007000000010000000000000008 Check out the filename in the middle of the listing. The .backup file has been created by the base backup. It contains some information about the way the base backup has been made and tells the system where to continue replaying the XLOG. If the backup file belongs to the oldest base backup you need to keep around, you can safely erase all the XLOG lower than file number 6; in this case, file number 5 could be safely deleted. In our case, 000000010000000000000006.00000020.backup contains the following information: START WAL LOCATION: 0/6000020 (file 000000010000000000000006)STOP WAL LOCATION: 0/60000E0 (file 000000010000000000000006)CHECKPOINT LOCATION: 0/6000058BACKUP METHOD: streamedBACKUP FROM: masterSTART TIME: 2013-03-10 18:04:29 CETLABEL: pg_basebackup base backupSTOP TIME: 2013-03-10 18:04:30 CET The .backup file will also provide you with relevant information such as the time the base backup has been made. It is plain there and so it should be easy for ordinary users to read this information. As an alternative to deleting all the XLOG files at one point, it is also possible to clean them up during replay. One way is to hide an rm command inside your restore_command. While this is technically possible, it is not necessarily wise to do so (what if you want to recover again?). Also, you can add the recovery_end_command command to your recovery.conf file. The goal of recovery_end_command is to allow you to automatically trigger some action as soon as the recovery ends. Again, PostgreSQL will call a script doing precisely what you want. You can easily abuse this setting to clean up the old XLOG when the database declares itself active. Switching the XLOG files If you are going for an XLOG file-based recovery, you have seen that one XLOG will be archived every 16 MB. What would happen if you never manage to create 16 MB of changes? What if you are a small supermarket, which just makes 50 sales a day? Your system will never manage to fill up 16 MB in time. However, if your system crashes, the potential data loss can be seen as the amount of data in your last unfinished XLOG file. Maybe this is not good enough for you. A postgresql.conf setting on the source database might help. The archive_timeout tells PostgreSQL to create a new XLOG file at least every x seconds. So, if you are this little supermarket, you can ask the database to create a new XLOG file every day shortly before you are heading for home. In this case, you can be sure that the data of the day will safely be on your backup device already. It is also possible to make PostgreSQL switch to the next XLOG file by hand. A procedure named pg_switch_xlog() is provided by the server to do the job: test=# SELECT pg_switch_xlog();pg_switch_xlog----------------0/17C0EF8(1 row) You might want to call this procedure when some important patch job has finished or if you want to make sure that a certain chunk of data is safely in your XLOG archive. Summary In this article, you have learned about Point-In-Time-Recovery, which is a safe and easy way to restore your PostgreSQL database to any desired point in time. PITR will help you to implement better backup policies and make your setups more robust. Resources for Article: Further resources on this subject: Introduction to PostgreSQL 9 [Article] PostgreSQL: Tips and Tricks [Article] PostgreSQL 9: Reliable Controller and Disk Setup [Article]
Read more
  • 0
  • 0
  • 5261

article-image-so-what-nodejs
Packt
04 Sep 2013
2 min read
Save for later

So, what is Node.js?

Packt
04 Sep 2013
2 min read
(For more resources related to this topic, see here.) Node.js is an open source platform that allows you to build fast and scalable network applications using JavaScript. Node.js is built on top of V8, a modern JavaScript virtual machine that powers Google's Chrome web browser. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.Node.js can handle multiple concurrent network connections with little overhead, making it ideal for data-intensive, real-time applications. With Node.js, you can build many kinds of networked applications. For instance, you can use it to build a web application service, an HTTP proxy, a DNS server, an SMTP server, an IRC server, and basically any kind of process that is network intensive. You program Node.js using JavaScript, which is the language that powers the Web. JavaScript is a powerful language that, when mastered, makes writing networked, event-driven applications fun and easy. Node.js recognizes streams that are resistant to precarious network conditions and misbehaving clients. For instance, mobile clients are notoriously famous for having large latency network connections, which can put a big burden on servers by keeping around lots of connections and outstanding requests. By using streaming to handle data, you can use Node.js to control incoming and outgoing streams and enable your service to survive. Also, Node.js makes it easy for you to use third-party open source modules. By using Node Package Manager (NPM), you can easily install, manage, and use any of the several modules contained in a big and growing repository. NPM also allows you to manage the modules your application depends on in an isolated way, allowing different applications installed in the same machine to depend on different versions of the same module without originating a conflict, for instance. Given the way it's designed, NPM even allows different versions of the same module to coexist in the same application. Summary In this article, we learned that Node.js uses an event-driven, non-blocking I/O mode and can handle multiple concurrent network connections with little overhead. Resources for Article : Further resources on this subject: So, what is KineticJS? [Article] Cross-browser-distributed testing [Article] Accessing and using the RDF data in Stanbol [Article]
Read more
  • 0
  • 0
  • 1950

article-image-using-gerrit-github
Packt
04 Sep 2013
14 min read
Save for later

Using Gerrit with GitHub

Packt
04 Sep 2013
14 min read
In this article by Luca Milanesio, author of the book Learning Gerrit Code review, we will learn about Gerrit Code revew. GitHub is the world's largest platform for the free hosting of Git Projects, with over 4.5 million registered developers. We will now provide a step-by-step example of how to connect Gerrit to an external GitHub server so as to share the same set of repositories. Additionally, we will provide guidance on how to use the Gerrit Code Review workflow and GitHub concurrently. By the end of this article we will have our Gerrit installation fully integrated and ready to be used for both open source public projects and private projects on GitHub. (For more resources related to this topic, see here.) GitHub workflow GitHub has become the most popular website for open source projects, thanks to the migration of some major projects to Git (for example, Eclipse) and new projects adopting it, along with the introduction of the social aspect of software projects that piggybacks on the Facebook hype. The following diagram shows the GitHub collaboration model: The key aspects of the GitHub workflow are as follows: Each developer pushes to their own repository and pulls from others Developers who want to make a change to another repository, create a fork on GitHub and work on their own clone When forked repositories are ready to be merged, pull requests are sent to the original repository maintainer The pull requests include all of the proposed changes and their associated discussion threads Whenever a pull request is accepted, the change is merged by the maintainer and pushed to their repository on GitHub   GitHub controversy The preceding workflow works very effectively for most open source projects; however, when the projects gets bigger and more complex, the tools provided by GitHub are too unstructured, and a more defined review process with proper tools, additional security, and governance is needed. In May 2012 Linus Torvalds , the inventor of Git version control, openly criticized GitHub as a commit editing tool directly on the pull request discussion thread: " I consider GitHub useless for these kinds of things. It's fine for hosting, but the pull requests and the online commit editing, are just pure garbage " and additionally, " the way you can clone a (code repository), make changes on the web, and write total crap commit messages, without GitHub in any way making sure that the end result looks good. " See https://github.com/torvalds/linux/pull/17#issuecomment-5654674. Gerrit provides the additional value that Linus Torvalds claimed was missing in the GitHub workflow: Gerrit and GitHub together allows the open source development community to reuse the extended hosting reach and social integration of GitHub with the power of governance of the Gerrit review engine. GitHub authentication The list of authentication backends supported by Gerrit does not include GitHub and it cannot be used out of the box, as it does not support OpenID authentication. However, a GitHub plugin for Gerrit has been recently released in order to fill the gaps and allow a seamless integration. GitHub implements OAuth 2.0 for allowing external applications, such as Gerrit, to integrate using a three-step browser-based authentication. Using this scheme, a user can leverage their existing GitHub account without the need to provision and manage a separate one in Gerrit. Additionally, the Gerrit instance will be able to self-provision the SSH public keys needed for pushing changes for review. In order for us to use GitHub OAuth authentication with Gerrit, we need to do the following: Build the Gerrit GitHub plugin Install the GitHub OAuth filter into the Gerrit libraries (/lib under the Gerrit site directory) Reconfigure Gerrit to use the HTTP authentication type   Building the GitHub plugin The Gerrit GitHub plugin can be found under the Gerrit plugins/github repository on https://gerrit-review.googlesource.com/#/admin/projects/plugins/github. It is open source under the Apache 2.0 license and can be cloned and built using the Java 6 JDK and Maven. Refer to the following example: $ git clone https://gerrit.googlesource.com/plugins/github $ cd github $ mvn install […] [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------- [INFO] Total time: 9.591s [INFO] Finished at: Wed Jun 19 18:38:44 BST 2013 [INFO] Final Memory: 12M/145M [INFO] ------------------------------------------------------- The Maven build should generate the following artifacts: github-oauth/target/github-oauth*.jar, the GitHub OAuth library for authenticating Gerrit users github-plugin/target/github-plugin*.jar, the Gerrit plugin for integrating with GitHub repositories and pull requests Installing GitHub OAuth library The GitHub OAuth JAR file needs to copied to the Gerrit /lib directory; this is required to allow Gerrit to use it for filtering all HTTP requests and enforcing the GitHub three-step authentication process: $ cp github-oauth/target/github-oauth-*.jar /opt/gerrit/lib/ Installing GitHub plugin The GitHub plugin includes the additional support for the overall configuration, the advanced GitHub repositories replication, and the integration of pull requests into the Code Review process. We now need to install the plugin before running the Gerrit init again so that we can benefit from the simplified automatic configuration steps: $ cp github-plugin/target/github-plugin-*.jar /opt/gerrit/plugins/github.jar Register Gerrit as a GitHub OAuth application Before going through the Gerrit init, we need to tell GitHub to trust Gerrit as a partner application. This is done through the generation of a ClientId/ClientSecret pair associated to the exact Gerrit URLs that will be used for initiating the 3-step OAuth authentication. We can register a new application in GitHub through the URL https://github.com/settings/applications/new, where the following three fields are requested: Application name : It is the logical name of the application authorized to access GitHub, for example, Gerrit. Main URL : The Gerrit canonical web URL used for redirecting to GitHub OAuth authentication, for example, https://myhost.mydomain:8443. Callback URL : The URL that GitHub should redirect to when the OAuth authentication is successfully completed, for example, https://myhost.mydomain:8443/oauth. GitHub will automatically generate a unique pair ClientId/ClientSecret that has to be provided to Gerrit identifying them as a trusted authentication partner. ClientId/ClientSecret are not GitHub credentials and cannot be used by an interactive user to access any GitHub data or information. They are only used for authorizing the integration between a Gerrit instance and GitHub. Running Gerrit init to configure GitHub OAuth We now need to stop Gerrit and go through the init steps again in order to reconfigure the Gerrit authentication. We need to enable HTTP authentication by choosing an HTTP header to be used to verify the user's credentials, and to go through the GitHub settings wizard to configure the OAuth authentication. $ /opt/gerrit/bin/gerrit.sh stop Stopping Gerrit Code Review: OK $ cd /opt/gerrit $ java -jar gerrit.war init [...] *** User Authentication *** Authentication method []: HTTP RETURN Get username from custom HTTP header [Y/n]? Y RETURN Username HTTP header []: GITHUB_USER RETURN SSO logout URL : /oauth/reset RETURN *** GitHub Integration *** GitHub URL [https://github.com]: RETURN Use GitHub for Gerrit login ? [Y/n]? Y RETURN ClientId []: 384cbe2e8d98192f9799 RETURN ClientSecret []: f82c3f9b3802666f2adcc4 RETURN Initialized /opt/gerrit $ /opt/gerrit/bin/gerrit.sh start Starting Gerrit Code Review: OK   Using GitHub login for Gerrit Gerrit is now fully configured to register and authenticate users through GitHub OAuth. When opening the browser to access any Gerrit web pages, we are automatically redirected to the GitHub for login. If we have already visited and authenticated with GitHub previously, the browser cookie will be automatically recognized and used for the authentication, instead of presenting the GitHub login page. Alternatively, if we do not yet have a GitHub account, we create a new GitHub profile by clicking on the SignUp button. Once the authentication process is successfully completed, GitHub requests the user's authorization to grant access to their public profile information. The following screenshot shows GitHub OAuth authorization for Gerrit: The authorization status is then stored under the user's GitHub applications preferences on https://github.com/settings/applications. Finally, GitHub redirects back to Gerrit propagating the user's profile securely using a one-time code which is used to retrieve the full data profile including username, full name, e-mail, and associated SSH public keys. Replication to GitHub The next steps in the Gerrit to GitHub integration is to share the same Git repositories and then keep them up-to-date; this can easily be achieved by using the Gerrit replication plugin. The standard Gerrit replication is a master-slave, where Gerrit always plays the role of the master node and pushes to remote slaves. We will refer to this scheme as push replication because the actual control of the action is given to Gerrit through a git push operation of new commits and branches. Configure Gerrit replication plugin In order to configure push replication we need to enable the Gerrit replication plugin through Gerrit init: $ /opt/gerrit/bin/gerrit.sh stop Stopping Gerrit Code Review: OK $ cd /opt/gerrit $ java -jar gerrit.war init [...] *** Plugins *** Prompt to install core plugins [y/N]? y RETURN Install plugin reviewnotes version 2.7-rc4 [y/N]? RETURN Install plugin commit-message-length-validator version 2.7-rc4 [y/N]? RETURN Install plugin replication version 2.6-rc3 [y/N]? y RETURN Initialized /opt/gerrit $ /opt/gerrit/bin/gerrit.sh start Starting Gerrit Code Review: OK The Gerrit replication plugin relies on the replication.config file under the /opt/gerrit/etc directory to identify the list of target Git repositories to push to. The configuration syntax is a standard .ini format where each group section represents a target replica slave. See the following simplest replication.config script for replicating to GitHub: [remote "github"] url = git@github.com:myorganisation/${name}.git The preceding configuration enables all of the repositories in Gerrit to be replicated to GitHub under the myorganisa tion GitHub Team account. Authorizing Gerrit to push to GitHub Now, that Gerrit knows where to push, we need GitHub to authorize the write operations to its repositories. To do so, we need to upload the SSH public key of the underlying OS user where Gerrit is running to one of the accounts in the GitHub myorganisation team, with the permissions to push to any of the GitHub repositories. Assuming that Gerrit runs under the OS user gerrit, we can copy and paste the SSH public key values from the ~gerrit/.ssh/id_rsa.pub (or ~gerrit/.ssh/id_dsa.pub) to the Add an SSH Key section of the GitHub account under target URL to be set to: https://github.com/settings/ssh Start working with Gerrit replication Everything is now ready to start playing with Gerrit to GitHub replication. Whenever a change to a repository is made on Gerrit, it will be automatically replicated to the corresponding GitHub repository. In reality there is one additional operation that is needed on the GitHub side: the actual creation of the empty repositories using https://github.com/new associated to the ones created in Gerrit. We need to make sure that we select the organization name and repository name, consistent with the ones defined in Gerrit and in the replication.config file. Never initialize the repository from GitHub with an empty commit or readme file; otherwise the first replication attempt from Gerrit will result in a conflict and will then fail. Now GitHub and Gerrit are fully connected and whenever a repository in GitHub matches one of the repositories in Gerrit, it will be linked and synchronized with the latest set of commits pushed in Gerrit. Thanks to the Gerrit-GitHub authentication previously configured, Gerrit and GitHub share the same set of users and the commits authors will be automatically recognized and formatted by GitHub. The following screenshot shows Gerrit commits replicated to GitHub: Reviewing and merging to GitHub branches The final goal of the Code Review process is to agree and merge changes to their branches. The merging strategies need to be aligned with real-life scenarios that may arise when using Gerrit and GitHub concurrently. During the Code Review process the alignment between Gerrit and GitHub was at the change level, not influenced by the evolution of their target branches. Gerrit changes and GitHub pull requests are isolated branches managed by their review lifecycle. When a change is merged, it needs to align with the latest status of its target branch using a fast-forward, merge, rebase, or cherry-pick strategy. Using the standard Gerrit merge functionality, we can apply the configured project merge strategy to the current status of the target branch on Gerrit. The situation on GitHub may have changed as well, so even if the Gerrit merge has succeeded there is no guarantee that the actual subsequent synchronization to GitHub will do the same! The GitHub plugin mitigates this risk by implementing a two-phase submit + merge operation for merging opened changes as follows: Phase-1 : The change target branch is checked against its remote peer on GitHub and fast forwarded if needed. If two branches diverge, the submit + merge is aborted and manual merge intervention is requested. Phase-2 : The change is merged on its target branch in Gerrit and an additional ad hoc replication is triggered. If the merge succeeds then the GitHub pull request is marked as completed. At the end of Phase-2 the Gerrit and GitHub statuses will be completely aligned. The pull request author will then receive the notification that his/her commit has been merged. Using Gerrit and GitHub on http://gerrithub.io When using Gerrit and GitHub on the web with public or private repositories, all of the commits are replicated from Gerrit to GitHub, and each one of them has a complete copy of the data. If we are using a Git and collaboration server on GitHub over the Internet, why can't we do the same for its Gerrit counterpart? Can we avoid installing a standalone instance of Gerrit just for the purpose of going through a formal Code Review? One hassle-free solution is to use the GerritHub service (http://gerrithub.io), which offers a free Gerrit instance on the cloud already configured and connected with GitHub through the github-plugin and github-oauth authentication library. All of the flows that we have covered in this article are completely automated, including the replication and automatic pull request to change automation. As accounts are shared with GitHub, we do not need to register or create another account to use GerritHub; we can just visit http://gerrithub.io and start using Gerrit Code Review with our existing GitHub projects without having to teach our existing community about a new tool. GerritHub also includes an initial setup Wizard for the configuration and automation of the Gerrit projects and the option to configure the Gerrit groups using the existing GitHub. Once Gerrit is configured, the Code Review and GitHub can be used seamlessly for achieving maximum control and social reach within your developer community. Summary We have now integrated our Gerrit installation with GitHub authentication for a seamless Single-Sign-On experience. Using an existing GitHub account we started using Gerrit replication to automatically mirror all the commits to GitHub repositories, allowing our projects to have an extended reach to external users, free to fork our repositories, and to contribute changes as pull requests. Finally, we have completed our Code Review in Gerrit and managed the merge to GitHub with a two-phase change submit + merge process to ensure that the target branches on both Gerrit and GitHub have been merged and aligned accordingly. Similarly to GitHub, this Gerrit setup can be leveraged for free on the web without having to manage a separate private instance, thanks to the free set target URL to http://gerrithub.io service available on the cloud. Resources for Article : Further resources on this subject: Getting Dynamics NAV 2013 on Your Computer – For (Almost) Free [Article] Building Your First Zend Framework Application [Article] Quick start - your first Sinatra application [Article]
Read more
  • 0
  • 1
  • 51232
article-image-managing-adobe-connect-meeting-room
Packt
04 Sep 2013
6 min read
Save for later

Managing Adobe Connect Meeting Room

Packt
04 Sep 2013
6 min read
(For more resources related to this topic, see here.) The Meeting Information page In order to get to the Meeting Information page, you will first need to navigate to the Meeting List page by following these steps: Log in to the Connect application. Click on the Meetings tab in the Home Page main menu. When you access the Meetings page, the My Meetings link is opened by default and a view is set on the Meeting List tab. You will find the meeting that is listed on this page as shown in the following screenshot: By clicking on the Cookbook Meeting option in the Name column (marked with a red outline), you will be presented with the Meeting Information page. In the section titled Meeting Information, you can examine various pieces of information about the selected meeting. On this page, you can review Name, Summary, Start Time, Duration, Number of users in room(that are currently present in the meeting room), URL, Language(selected), and the Access rights of the meeting. The two most important fields are marked with a red outline in the previous screenshot. The first one is the link to the meeting URL and the second is the Enter Meeting Room button. You can join the selected meeting room by clicking on any of these two options. In the upper portion of this page, you will notice the navigation bar with the following links: Meeting Information Edit Information Edit Participants Invitations Uploaded Content Recordings Reports By selecting any of these links, you will open pages associated with them. Our main focus of this article will be on the functionalities of these pages. Since we have explained the Meeting Information page, we can proceed to the Edit Information page. The Edit Information page The Edit Information page is very similar to the Enter Meeting Information page. We will briefly inform you about the meeting settings, which you can edit on this page. These settings are: Name Summary Start time Duration Language Access Audio conference settings Any changes made on this page are preserved by clicking on the Save button that you will find at very bottom of this page. Changes will not affect participants who are already logged in to the room, except changes to the Audio Conference settings. Next to the Save button, you will find the Cancel button. Any changes made on the Edit Information page, which are not already saved will be reverted by clicking on the Cancel button. The Edit Participants page After the Edit Information page, it's time for us to access the next page by clicking on the Edit Participants link in the navigation bar. This link will take you to the Select Participants page. In addition to the already described features, we will introduce you to a couple more functionalities that will help you to add participants, change their roles, or remove them from the meeting. Example 1 – changing roles In this example, we will change the role of the administrators group from participant to presenter by using the Search button. This feature is of great help when there are a large number of Connect users that are already added as meeting participants. In order to do so, you will need to follow the steps listed: In the Current Participants For Cookbook Meeting table on the right-hand side, click on the Search button located in the lower-left corner of the table. When you click on the Search button, a text field for instant search will be displayed. In the text field, enter the name of the Administrators group or part of the group name (the auto-complete function should recognize the name of the present group). When the group is present in the table, select it. Click on the Set User Role button. Select new role for this group in the menu. For the purpose of this example, we will select the Presenter role. By completing this action, you will grant Presenter privileges in the Cookbook Meeting table to all the administrators as shown in the following screenshot: Example 2 – removing a user In this example, we will show you how to remove a specific user from the selected meeting. For the purpose of this exercise, we will remove the Administrators group from the Participants list. In order to complete this action, please follow the given steps: Select Administrators in the Current Participants For Cookbook Meeting table. Click on the Remove button. Now, all the members of this group will be excluded from the meeting, and Administrators should not be present in the list. Example 3 – adding a specific user This example will demonstrate how to add a specific user from any group. For example, we will add a user from the Authors group to the Current Participants list. In the Available users and Groups table, double-click on the Authors group. This action will change the user interface of this table and list all the users that belong to the Authors group. Please note that table header is now changed to Authors. Select a specific user and click on the Add button. This will add the selected user from the Authors group to the Current Participants For Cookbook Meeting table. One thing that we would like to mention here is the ability to perform multiple selections in both the Available Users and Groups and Current Participants For Cookbook Meeting tables. To enable multiple selection functionality, select a specific user and group by clicking and selecting Ctrl and Shift on the keyboard at the same time. By demonstrating these examples, we reviewed the Edit Participant link functionalities. Summary In this article, we learned how to master all functionalities on how to edit different settings for already existing meetings. We covered the following topics: The Meeting information page The Managing Edit information page The Managing Edit participants page Resources for Article: Further resources on this subject: Top features you'll want to know about [Article] Remotely Preview and test mobile web pages on actual devices with Adobe Edge Inspect [Article] Exporting SAP BusinessObjects Dashboards into Different Environments [Article]
Read more
  • 0
  • 0
  • 1761

article-image-configuring-payment-models-intermediate
Packt
03 Sep 2013
4 min read
Save for later

Configuring payment models (Intermediate)

Packt
03 Sep 2013
4 min read
(For more resources related to this topic, see here.) How to do it... Let's learn how to integrate PayPal Website Payments Standard into our store in test mode. Integrating PayPal Website Payments Standard into our store (test mode) We will start by activating PayPal Payments Standard using the Payments section under the Extensions menu, and then we will edit the settings for it. The next step is to fill in the needed information for testing the PayPal system. For test purposes, we will choose the option for Sandbox Mode as Yes. We will now open a developer account to create test accounts on PayPal. Let's browse to http://developer.paypal.com and sign up for a developer account. Following is a screenshot of the screen that is displayed after we sign up and log in to the account. Click on the Create a preconfigured account link. The next screen will propose an account name with which your account will be created. Now we only need to add funds to the Account Balance field to create the account. Remember that it is a test account, so we can give any virtual amount of funds we want. Now we have a test PayPal account that can be used for our test purchases: Let's go to our shop's user interface, add a product to the shopping cart, and proceed to the Checkout page: Let's log in with the test account we have just created: The following screenshot shows a successful test order: Integrating PayPal Website Payments Pro into our store (live mode) We need to get the API information first. Let's log in to our PayPal account. Click on the User Profile link, and then click on the Update link next to API access in the My selling tools section. The next step is to click on the Request API Credentials link. Choose the option that says Request API signature. This will help us get the API information. The next step is to activate Payment Pro on the OpenCart administration interface using the Payments section under the Extensions menu. We need to edit the details and enter the API information. Let's not forget to select No for Test Mode, which means that this will be a live system. Choose Enabled for the Status field. How it works... Now let's learn how the PayPal Standard and Pro models work and how they differ from each other. PayPal Website Payments Standard PayPal Standard is the easiest payment model to integrate into our store. All we need is a PayPal account, and a bank account to withdraw the money from. PayPal Standard has no monthly costs or setup fee. However, the company charges a small percentage from each transaction. Please go to https://www.paypal.com/webapps/mpp/merchant for merchant service details. The activation of the Standard method is very straightforward. We only need to provide our e-mail address and then set Transaction Method to Sale, Sandbox Mode to No, and Status to Enabled on the administration panel. There is a difference in the test payments that we have made. Customers can also pay with their credit cards instantly, even without a PayPal account. This makes PayPal a very powerful and popular solution. If you are afraid of charging your real PayPal account, there is a good way to test your real payment environment. Create a dummy product with the price of $0.01 and complete the purchase with this tiny amount. PayPal Website Payments Pro This service can be used to charge credit cards using PayPal services in the background. The customers will not need to leave the store at all; the transaction will be completed at the shop itself. Many big e-commerce websites operate this way. Currently, Pro service is only available for merchant accounts located in the US, UK, and Canada. Summary This article explains about implementing different PayPal integrations. The article discusses about how to integrate the PayPal Website Payments Standard and Pro methods into a simple store. Resources for Article: Further resources on this subject: Upgrading OpenCart [Article] Setting Payment Model in OpenCart [Article] OpenCart: Layout Structure [Article]
Read more
  • 0
  • 0
  • 1519

article-image-resource-manager
Packt
03 Sep 2013
11 min read
Save for later

Resource Manager

Packt
03 Sep 2013
11 min read
(For more resources related to this topic, see here.) Resource definitions In order to be able to define resources, we need to create a module that will be in charge of handling this. The main idea is that before calling a certain asset through ResourceManager, it has to be defined inResourceDefinitions. In this way, ResourceManager will always have access to some metadata it needs to create the asset (filenames, sizes, volumes, and so on). In order to identify the asset types (sounds, images, tiles, and fonts), we will define some constants (note that the number values of these constants are arbitrary; you could use whatever you want here). Let’s call them RESOURCE_TYPE_[type] (feel free to use another convention if you want to). To make things easier, just follow this convention for now since it’s the one we’ll use in the rest of the book. You should enter them in main.lua as follows: RESOURCE_TYPE_IMAGE = 0RESOURCE_TYPE_TILED_IMAGE = 1RESOURCE_TYPE_FONT = 2RESOURCE_TYPE_SOUND = 3 If you want to understand the actual reason behind these resource type constants, take a look at the load function of our ResourceManager entity in the next section. We need to create a file named resource_definitions.lua and add some simple methods that will handle it. Add the following line to it: module ( “ResourceDefinitions”, package.seeall ) The preceding line indicates that all of the code in the file should be treated as a module function, being accessed through ResourceDefinitions in the code. This is one of many Lua patterns used to create modules. If you’re not used to the Lua’s module function, you can read about it in the modules tutorial at http://lua-users.org/wiki/ModulesTutorial. Next, we will create a table that contains these definitions: local definitions = {} This will be used internally and is not accessible through the module API, so we create it using the keyword local. Now, we need to create the setter, getter, and unload methods for the definitions. The setter method (called set) stores the definition parameter (a table) in the definitions table, using the name parameter (a string) as the key, as follows: function ResourceDefinitions:set(name, definition) definitions[name] = definitionend The getter method (called get, duh!) retrieves the definition that was previously stored (by use of ResourceDefinitions:set ()) using the name parameter as the key of the definitions table, as follows: function ResourceDefinitions:get(name) return definitions[name]end The final method that we’re creating is remove. We use it to clear the memory space used by the definition. In order to achieve this we assign nil to an entry in the definitions table indexed by the name parameter as follows: function ResourceDefinitions:remove (name) definitions[name] = nilend In this way, we remove the reference to the object, allowing the memory to be released by the garbage collector. This may seem useless here, but it’s a good example of how you should manage your objects to be removed from memory by the garbage collector. And besides this, we don’t know information comes in a resource definition; it may be huge, we just don’t know. This is all we need for the resource definitions. We’re making use of the dynamism that Lua provides. See how easy it was to create a repository for definitions that is abstracted from the content of each definition. We’ll define different fields for each asset type, and we don’t need to define them beforehand as we probably would have needed to do in C++. Resource manager We will now create our resource manager. This module will be in charge of creating and storing our decks and assets in general. We’ll retrieve the assets with one single command, and they’ll come from the cache or get created using the definition. We need to create a file named resource_manager.lua and add the following line to it: module ( “ResourceManager”, package.seeall ) This is the same as in the resource definitions; we’re creating a module that will be accessed using ResourceManager. ASSETS_PATH = ‘assets/’ We now create the ASSETS_PATH constant. This is the path where we will store our assets. You could have many paths for different kinds of assets, but in order to keep things simple, we’ll keep all of them in one single directory in this example. Using this constant will allow us to use just the filename instead of having to write the whole path when creating the actual resource definitions, saving us some phalanx injuries! local cache = {} Again, we’re creating a cache table as a local variable. This will be the variable that will store our initialized assets. Now we should take care of implementing the important functionality. In order to make this more readable, I’ll be using methods that we define in the following pages. So, I recommend that you read the whole section before trying to run what we code now. The full source code can be downloaded from the book’s website, featuring inline comments. In the book, we removed the comments for brevity’s sake. Getter The first thing we will implement is our getter method since it’s simple enough: function ResourceManager:get ( name ) if (not self:loaded ( name )) then self:load ( name ) end return cache[name] end This method receives a name parameter that is the identifier of the resource we’re working with. On the first line, we call loaded (a method that we will define soon) to see if the resource identified by name was already loaded. If it was, we just need to return the cached value, but if it was not we need to load it, and that’s what we do in the if statement. We use the internalload method (which we will define later as well) to take care of the loading. We will make this load method store the loaded object in the cache table. So after loading it, the only thing we have to do is return the object contained in the cache table indexed by name. One of the auxiliary functions that we use here is loaded. Let’s implement it since it’s really easy to do so: function ResourceManager:loaded ( name ) return cache[name] ~= nil end What we do here is check whether the cache table indexed by the name parameter is not equal to nil. If cache has an object under that key, this will return true, and that’s what we were looking for to decide whether the object represented by the name parameter was already loaded. Loader load and its auxiliary functions are the most important methods of this module. They’ll be slightly more complex than what we’ve done so far since they make the magic happen. Pay special attention to this section. It’s not particularly hard, but it might get confusing. Like the previous methods, this one receives just the name parameter that represents the asset we’re loading as follows: function ResourceManager:load ( name ) First of all, we retrieve the definition for the resource associated to name. We make a call to the get method from ResourceDefinitions, which we defined earlier as follows: local resourceDefinition = ResourceDefinitions:get( name ) If the resource definition does not exist (because we forgot to define it before), we print an error to the screen, as follows: if not resourceDefinition then print(“ERROR: Missing resource definition for “ .. name ) If the resource definition was retrieved successfully, we create a variable that will hold the resource and (pay attention) we call the correct load auxiliary function, depending on the asset type. else local resource Remember the RESOURCE_TYPE_[type] constants that we created in the ResourceDefinitions module? This is the reason for their existence. Thanks to the creation of the RESOURCE_TYPE_[type] constants, we now know how to load the resources correctly. When we define a resource, we must include a type key with one of the resource types. We’ll insist on this soon. What we do now is call the correct load method for images, tiled images, fonts, and sounds, using the value stored in resourceDefinition.type as follows: if (resourceDefinition.type == RESOURCE_TYPE_IMAGE) then resource = self:loadImage ( resourceDefinition ) elseif (resourceDefinition.type == RESOURCE_TYPE_TILED_IMAGE) then resource = self:loadTiledImage ( resourceDefinition ) elseif (resourceDefinition.type == RESOURCE_TYPE_FONT) then resource = self:loadFont ( resourceDefinition ) elseif (resourceDefinition.type == RESOURCE_TYPE_SOUND) then resource = self:loadSound ( resourceDefinition ) end After loading the current resource, we store it in the cache table, in an entry specified by the name parameter, as follows: -- store the resource under the name on cache cache[name] = resource endend Now, let’s take a look at all of the different load methods. The expected definitions are explained before the actual functions so you have a reference when reading them. Images Loading images is something that we’ve already done, so this is going to look somewhat familiar. In this book, we’ll have two ways of defining images. Let’s take a look at them: {type = RESOURCE_TYPE_IMAGEfileName = “tile_back.png”,width = 62,height = 62,} As you may have guessed, the type key is the one used in the load function. In this case, we need to make it of type RESOURCE_TYPE_IMAGE. Here we are defining an image that has specific width and height values, and that is located at assets/title_back.png. Remember that we will use ASSET_PATH in order to avoid writing assets/ a zillion times. That’s why we’re not writing it on the definition. Another useful definition is: {type = RESOURCE_TYPE_IMAGEfileName = “tile_back.png”,coords = { -10, -10, 10, 10 }} This is handy when you want a specific rectangle inside a bigger image. You can use the cords attribute to define this rectangle. For example, we get a square with 20 pixel long sides centered in the image by specifying coords = { -10, -10, 10, 10 }. Now, let’s take a look at the actual loadImage method to see how this all falls into place: function ResourceManager:loadImage ( definition ) local image First of all, we use the same technique of defining an empty variable that will hold our image: local filePath = ASSETS_PATH .. definition.fileName We create the actual full path by appending the value of fileName in the definition to the value of the ASSETS_PATH constant. if checks whether the coords attribute is defined: if definition.coords thenimage = self:loadGfxQuad2D ( filePath, definition.coords ) Then, we use another auxiliary function called loadGfxQuad2D. This will be in charge of creating the actual image. The reason why we’re using another auxiliary function is that the code used to create the image is the same for both definition styles, but the data in the definition needs to be processed differently. In this case, we just pass the coordinates of the rectangle. else local halfWidth = definition.width / 2 local halfHeight = definition.height / 2 image = self:loadGfxQuad2D(filePath, {-halfWidth, -halfHeight, halfWidth, halfHeight} ) If there were no coords attribute, we’d assume the image is defined using width and height. So what we do is to define a rectangle that covers the whole width and height for the image. We do this by calculating halfWidth and halfHeight and then passing these values to theloadGfxQuad2D method. Remember the discussion about the texture coordinates in Moai SDK; this is the reason why we need to divide the dimensions by 2 and pass them as negative and positive parameters for the rectangle. This allows it to be centered on (0, 0). After loading the image, we return it so it can be stored in the cache by the load method: end return imageend Now the last method we need to write is loadGfxQuad2D. This method is basically to display an image as follows: function ResourceManager:loadGfxQuad2D ( filePath, coords ) local image = MOAIGfxQuad2D.new () image:setTexture ( filePath ) image:setRect ( unpack(cords) ) return imageend Lua’s unpack method is a nice tool that allows you to pass a table as separate parameters. You can use it to split a table into multiple variables as well: x, y = unpack ( position_table )   What we do here is instantiate the MOAIGfxQuad2D class, set the texture we defined in the previous function, and use the coordinates we constructed to set the rectangle this image will use from the original texture. Then we return it so loadImage can use it. Well! That was it for images. It may look complicated at first, but it’s not that complex. The rest of the assets will be simpler than this, so if you understood this one, the rest will be a piece of cake.
Read more
  • 0
  • 0
  • 2046
Packt
03 Sep 2013
11 min read
Save for later

Quick start – Using Burp Proxy

Packt
03 Sep 2013
11 min read
(For more resources related to this topic, see here.) At the top of Burp Proxy, you will notice the following three tabs: intercept: HTTP requests and responses that are in transit can be inspected and modified from this window options: Proxy configurations and advanced preferences can be tuned from this window history: All intercepted traffic can be quickly analyzed from this window If you are not familiar with the HTTP protocol or you want to refresh your knowledge, HTTP Made Really Easy, A Practical Guide to Writing Clients and Servers, found at http://www.jmarshall.com/easy/http/, represents a compact reference. Step 1 – Intercepting web requests After firing up Burp and configuring the browser, let's intercept our first HTTP request. During this exercise, we will intercept a simple request to the publisher's website: In the intercept tab, make sure that Burp Proxy is properly stopping all requests in transit by checking the intercept button. This should be marked as intercept is on. In the browser, type http://www.packtpub.com/ in the URL bar and press Enter. Back in Burp Proxy, you should be able to see the HTTP request made by the browser. At this stage, the request is temporarily stopped in Burp Proxy waiting for the user to either forward or stop it. For instance, press forward and return to the browser. You should see the home page of Packt Publishing as you would normally interact with the website. Again, type http://www.packtpub.com/ in the URL bar and press Enter. Let's press drop this time. Back in the browser, the page will contain the warning Burp proxy error: message was dropped by user. We have dropped the request, thus Burp Proxy did not forward the request to the server. As a result, the browser received a temporary HTML page with the warning message generated by Burp, instead of the original HTML content. Let's try one more time. Type http://www.packtpub.com/ in the URL bar of the browser and press Enter. Once the request is properly captured by Burp Proxy, the action button becomes active. Click on it to display the contextual menu. This is an important functionality as it allows you to import the current web request in any of the other Burp tools. You can already imagine the potentialities of having a set of integrated tools that allow you to manipulate and analyze web requests so easily. For example, if we want to decode the request, we can simply click on send to decoder. Burp Proxy In Burp Proxy, we can also decide to automatically forward all requests without waiting for the user to either forward or drop the communication. By clicking on the intercept button, it is possible to switch from intercept is on to intercept is off. Nevertheless, the proxy will record all requests in transit. Also, Burp Proxy allows you to automatically intercept all responses matching specific characteristics. Take a look at the numerous options available in the intercept server response section from within the Burp Proxy options tab. For example, it is possible to intercept the server's response only if the client's request was intercepted. This is extremely helpful while testing input validation vulnerabilities as we are generally interested in evaluating the server's responses for all tampered requests. Or else, you may only want to intercept and inspect responses having a specific return code (for example, 200 OK). Step 2 – Inspecting web requests Once a request is properly intercepted, it is possible to inspect the entire content, headers, and parameters, using one of the four Burp Proxy message analysis tabs: raw: This view allows you to display the web request in raw format within a simple text editor. This is a very handy visualization as it enables maximum flexibility for further changing the content. params: In this view, the focus is on user-supplied parameters (GET/POST parameters, cookies). This is particularly important in case of complex requests as it allows to consider all entry points for potential vulnerabilities. Whenever applicable, Burp Proxy will also automatically perform URL decoding. In addition, Burp Proxy will attempt to parse commonly used formats, including JSON. headers: Similarly, this view displays the HTTP header names and values in tabular form. hex: In case of binary content, it is useful to inspect the hexadecimal representation of the resource. This view allows to display a request as in a traditional hex editor. The history tab enables you to analyze all web requests transited through the proxy: Click on the history tab. At the top, Burp Proxy shows all the requests in the bundle. At the bottom, it displays the content of the request and response corresponding to the specific selection. If you have previously modified the request, Burp Proxy history will also display the modified version. Displaying HTTP requests and responses intercepted by Burp Proxy By double-clicking on one of the requests, Burp will automatically open a new window with the specific content. From this window, it is possible to browse all the captured communication using the previous and next buttons Back in the history tab, Burp Proxy displays several details for each item including the request method, URL, response's code, and length. Each request is uniquely identified by a number, visible in the left-hand side column. Click on the request identifier. Burp Proxy allows you to set a color for that specific item. This is extremely helpful to highlight important requests or responses. For example, during the initial application enumeration, you may notice an interesting request; you can mark it and get back later for further testing. Burp Proxy history is also useful when you have to evaluate a sequence of requests in order to reproduce a specific application behavior. Click on the display filter, at the top of the history list to hide irrelevant content. If you want to analyze all HTTP requests containing at least one parameter, select the show only parameterised checkbox. If you want to display requests having a specific response, just select the appropriate response code in the filter by status code selection. At this point, you may have already understood the potentialities of the tool to filter and reveal interesting traffic. In addition, when using Burp Suite Professional, you can also use the filter by search term option. This feature is particularly important when you need to analyze hundreds of requests or responses as you can filter relevant traffic only by using regular expressions or simply matching particular strings. Using this feature, you may also be able to discover sensitive information (for example, credentials) embedded in the intercepted pages. Step 3 – Tampering web requests As part of a typical security assessment, you will need to modify HTTP requests and analyze the web application responses. For example, to identify SQL injection vulnerabilities, it is important to inject common attack vectors (for example, a single quote) in all user-supplied input, including HTTP headers, cookies, and GET/POST parameters. If you want to refresh your knowledge on common web application vulnerabilities, the OWASP Top Ten Project article at https://www. owasp.org/index.php/Category:OWASP_Top_Ten_Project is a good starting point. Tampering web requests with Burp is as easy as editing strings in a text editor: Intercept a request containing at least one HTTP parameter. For example, you can point your browser to http://www.packtpub.com/books/all?keys=ASP. Go to Burp Proxy | Intercept. At this point, you should see the corresponding HTTP request. From the raw view, you can simply edit any aspect of the web request in transit. For example, you can change the value of the the GET parameter's keys value from ASP to PHP. Edit the request to look like the following: GET /books/all?keys=PHP HTTP/1.1Host: www.packtpub.comUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0)Gecko/20100101 Firefox/15.0.1Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8Accept-Language: en-us,en;q=0.5Accept-Encoding: gzip, deflateProxy-Connection: keep-alive Click on forward and get back to the browser. This should result in a search query performed with the string PHP. You can verify it by simply checking the results in the HTML page. Although we have used the raw view to change the previous HTTP request, it is actually possible to use any of the Burp Proxy view. For example, in the params view, it is possible to add a new parameter by following these steps: Clicking on new (right side), from the Burp Proxy params view. Selecting the proper parameter type (URL, body, or cookie). URL should be used for GET parameters, whereas body denotes POST parameters. Typing the name and the value of the newly created parameter. Advanced features After practicing with the basic features provided by Burp Proxy, you are almost ready to experiment with more advanced configurations. Match and replace Let's imagine that you are testing an application designed for mobile devices using a standard browser from your computer. In most cases, the web server examines the user-agent provided by the browser to identify the specific platform and respond with customized resources that better fit mobile phones and tablets. Under these circumstances, you will particularly find the match and replace function, provided by Burp Proxy, very useful. Let's configure Burp Proxy in order to tamper the user-agent HTTP header field: In the options tab of Burp Proxy, scroll down to the match and replace section. Under the match and replace table, a drop-down list and two text fields allow to create a customized rule. Select request header from the drop-down list since we want to create a match condition pertaining to HTTP requests. Type ^User-Agent.*$ in the first text field. This field represents the match within the HTTP request. Burp Proxy's match and replace feature allows you to use simple strings as well as complex regular expressions. If you are not familiar with regular expressions, have a look at http://www.regular-expressions.info/quickstart. html. In the second text field, type Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/4h20+ (KHTML, like Gecko) Version/3.0 Mobile/1C25 Safari/419.3 or any other fake user-agent that you want to impersonate. Click add and verify that the new match has been added to the list; this button is shown here: Burp Proxy match and replace list Intercept a request, leave it to pass through the proxy, and verify that it has been automatically modified by the tool. Automatically modified HTTP header in Burp Proxy HTML modification Another interesting feature of Burp Proxy is the automatic HTML modification, that can be activated and configured in the appropriate section within Burp Proxy | options. By using this function, you can automatically remove JavaScript or modify HTML forms of all received HTTP responses. Some applications deploy client-side validation in the form of disabled HTML form fields or JavaScript code. If you want to verify the presence of server-side controls that enforce specific data formats, you would need to tamper the request with invalid data. In these situations, you can either manually tamper the request in the proxy or enable HTML modification to remove any client-side validation and use the browser in order to submit invalid data. This function can be also used to display hidden form fields. Let's see in practice how you can activate this feature: In Burp Proxy, go to options, scroll down to the HTML modification section. Numerous options are available in this section: unhide hidden form fields to display hidden HTML form fields, enable disabled form fields to submit all input forms present inside the HTML page, remove input field length limits to allow extra-long strings in the text fields, remove JavaScript form validation to make Burp Proxy all onsubmit handler JavaScript functions from HTML forms, remove all JavaScript to completely remove all JS scripts and remove object tags to remove embedded objects within the HTML document. Select the desired checkboxes to activate automatic HTML modification. Summary Using this feature, you will be able to understand whether the web application enforces server- side validation. For instance, some insecure applications use client-side validation only (for example, via JavaScript functions). You can activate the automatic HTML modification feature by selecting the remove JavaScript form validation checkbox in order to perform input validation testing directly from your browser. Resources for Article : Further resources on this subject: Visual Studio 2010 Test Types [Article] Ordered and Generic Tests in Visual Studio 2010 [Article] Manual, Generic, and Ordered Tests using Visual Studio 2008 [Article]  
Read more
  • 0
  • 0
  • 7779

article-image-introduction-drools
Packt
03 Sep 2013
8 min read
Save for later

Introduction to Drools

Packt
03 Sep 2013
8 min read
(For more resources related to this topic, see here.) So, what is Drools? The techie answer guaranteed to get that glazed over look from anyone hounding you for details on project design is that Drools, part of the JBoss Enterprise BRMS product since federating in 2005, is a Business Rule Management System (BRMS) and rules engine written in Java which implements and extends the Rete pattern-matching algorithm within a rules engine capable of both forward and backward chaining inference. Now, how about an answer fit for someone new to rules engines? After all, you're here to learn the basics, right? Drools is a collection of tools which allow us to separate and reason over logic and data found within business processes. Ok, but what does that mean? Digging deeper, the keywords in that statement we need to consider are "logic" and "data". Logic, or rules in our case, are pieces of knowledge often expressed as, "When some conditions occur, then do some tasks". Simple enough, no? These pieces of knowledge could be about any process in your organization, such as how you go about approving TPS reports, calculate interest on a loan, or how you divide workload among employees. While these processes sound complex, in reality, they're made up of a collection of simple business rules. Let's consider a daily ritual process for many workers: the morning coffee. The whole process is second nature to coffee drinkers. As they prepare for their work day, they probably don't consider the steps involved—they simply react to situations at hand. However, we can capture the process as a series of simple rules: When your mug is dirty, then go clean it When your mug is clean, then go check for coffee When the pot is full, then pour yourself a cup and return to your desk When the pot is empty, then mumble about co-workers and make some coffee Alright, so that's logic, but what's data? Facts (our word for data) are the objects that drive the decision process for us. Given the rules from our coffee example, some facts used to drive our decisions would be the mug and the coffee pot. While we know from reading our rules what to do when the mug or pot are in a particular state, we need facts that reflect an actual state on a particular day to reason over. In seeing how a BRMS allows us to define the business rules of a business process, we can now state some of the features of a rules engine. As stated before, we've separated logic from data—always a good thing! In our example, notice how we didn't see any detail about how to clean our mug or how to make a new batch of coffee, meaning we've also separated what to do from how to do it , thus allowing us to change procedure without altering logic. Lastly, by gathering all of our rules in one place, we've centralized our business process knowledge. This gives us an excellent facility when we need to explain a business process or transfer knowledge. It also helps to prevent tribal knowledge, or the ownership and understanding of an undocumented procedure by just one or a few users. So when is a BRMS the right choice? Consider a rules engine when a problem is too complex for traditional coding approaches. Rules can abstract away the complexity and prevent usage of fragile implementations. Rules engines are also beneficial when a problem isn't fully known. More often than not, you'll find yourself iterating business methodology in order to fully understand small details involved that are second nature to users. Rules are flexible and allow us to easily change what we know about a procedure to accommodate this iterative design. This same flexibility comes in handy if you find that your logic changes often over time. Lastly, in providing a straightforward approach in documenting business rules, rules engines are an excellent choice if you find domain knowledge readily available, but via non-technical people who may be incapable of contributing to code. Sounds great, so let's get started, right? Well, I promised I'd also help you decide when a rules engine is not the right choice for you. In using a rules engine, someone must translate processes into actual rules, which can be a blessing in taking business logic away from developers, but also a curse in required training. Secondly, if your logic doesn't change very often, then rules might be overkill. Likewise, If your project is small in nature and likely to be used once and forgotten, then rules probably aren't for you. However, beware of the small system that will grow in complexity going forward! So if rules are right for you, why should you choose Drools? First and foremost, Drools has the flexibility of an open source license with the support of JBoss available. Drools also boasts five modules (to be discussed in more detail later), making their system quite extensible with domain-specific languages, graphical editing tools, web-based tools, and more. If you're partial to Eclipse, you'll also likely come to appreciate their plugin. Still not convinced? Read on and give it a shot—after all, that's why you're here, right? Installation In just five easy steps, you can integrate Drools into a new or existing project. Step 1 – what do I need? For starters, you will need to check that you have all of the required elements, listed as follows (all versions are as of time of writing): Java 1.5 (or higher) SE JDK. Apache Maven 3.0.4. Eclipse 4.2 (Juno) and the Drools plugin. Memory—512 MB (minimum), 1 GB or higher recommended. This will depend largely on the scale of your JVM and rule sessions, but the more the better! Step 2 – installing Java Java is the core language on which Drools is built, and is the language in which we'll be writing, so we'll definitely be needing that. The easiest way to get Java going is to download from and follow the installation instructions found at: www.oracle.com/technetwork/java/javase/downloads/index.html Step 3 – installing Maven Maven is a build automation tool from Apache that lets us describe a configuration of the project we're building and leave dependency management (amongst other things) up to it to work out. Again, the easiest way to get Maven up and running is to download and follow the documentation provided with the tool, found at: maven.apache.org/download.cgi Step 4 – installing Eclipse If you happen to have some other IDE of choice, or maybe you're just the old school type, then it's perfectly acceptable to author and execute your Drools-integrated code in your usual fashion. However, if you're an Eclipse fan, or you'd like to take advantage of auto-complete, syntax highlighting, and debugging features, then I recommend you go ahead and install Eclipse and the Drools plugin. The version of Eclipse that we're after is Eclipse IDE for Java Developers, which you can download and find installation instructions for on their site: http://www.eclipse.org/downloads/ Step 5 – installing the Drools Eclipse plugin In order to add the IDE plugin to Eclipse, the easiest method is to use Eclipse's built-in update manager. First, you'll need to add something the plugin depends on—the Graphical Editing Framework (GEF). In the Eclipse menu, click on Help, then on Install New Software..., enter the following URL in the Work with: field, and hit Add. download.eclipse.org/tools/gef/updates/releases/ Give your repository a nifty name in the pop-up window, such as GEF, and continue on with the install as prompted. You'll be asked to verify what you're installing and accept the license. Now we can add the Drools plugin itself—you can find the URL you'll need by visiting: http://www.jboss.org/drools/downloads.html Then, search for the text Eclipse update site and you'll see the link you need. Copy the address of the link to your clipboard, head back into Eclipse, and follow the same process you did for installing GEF. Note that you'll be asked to confirm the install of unsigned content, and that this is expected. Summary By this point, you know what Drools is, you should also be ready to integrate Drools into your applications. If you find yourself stuck, one of the good parts about an open source community is that there's nearly always someone who has faced your problem before and likely has a solution to recommend. Resources for Article : Further resources on this subject: Drools Integration Modules: Spring Framework and Apache Camel [Article] Human-readable Rules with Drools JBoss Rules 5.0(Part 2) [Article] Drools JBoss Rules 5.0 Flow (Part 2) [Article]
Read more
  • 0
  • 0
  • 1696
Modal Close icon
Modal Close icon