
How-To Tutorials

Packt
14 Aug 2013
8 min read

Calculus

Derivatives

To compute the derivative of a function, create the corresponding expression and use diff(). Its first argument is the expression and the second is the variable with respect to which you want to differentiate. The result is the expression for the derivative:

    >>> diff(exp(x**2), x)
    2*x*exp(x**2)
    >>> diff(x**2 * y**2, y)
    2*x**2*y

Higher-order derivatives can also be computed with a single call to diff():

    >>> diff(x**3, x, x)
    6*x
    >>> diff(x**3, x, 2)
    6*x
    >>> diff(x**2 * y**2, x, 2, y, 2)
    4

Due to SymPy's focus on expressions rather than functions, the derivatives of symbolic functions can seem a little surprising, but LaTeX rendering in the notebook should make their meaning clear:

    >>> f = Function('f')
    >>> diff(f(x**2), x)
    2*x*Subs(Derivative(f(_xi_1), _xi_1), (_xi_1,), (x**2,))

Limits

Limits are obtained through limit(). The syntax for the limit of expr when x goes to some value x0 is limit(expr, x, x0). To specify a limit towards infinity, use SymPy's infinity object, named oo. This object is also returned for infinite limits:

    >>> limit(exp(-x), x, oo)
    0
    >>> limit(1/x**2, x, 0)
    oo

There is also a fourth, optional parameter that specifies the direction from which the limit target is approached: "+" (the default) gives the limit from above, and "-" gives the limit from below. This parameter is ignored when the limit target is infinite:

    >>> limit(1/x, x, 0, "-")
    -oo
    >>> limit(1/x, x, 0, "+")
    oo

Integrals

SymPy has powerful algorithms for integration. In particular, it can find most integrals of logarithmic and exponential functions expressible with special functions, and many more besides, thanks to Meijer G-functions. The main function for integration is integrate(). It can compute both antiderivatives (indefinite integrals) and definite integrals. Note that an antiderivative is only defined up to an arbitrary constant, and the result does not include that constant:

    >>> integrate(sin(x), x)
    -cos(x)
    >>> integrate(sin(x), (x, 0, pi))
    2

Unevaluated symbolic integrals and antiderivatives are represented by the Integral class. integrate() may return these objects if it cannot compute the integral. It is also possible to create Integral objects directly, using the same syntax as integrate(). To evaluate them, call their .doit() method:

    >>> integral = Integral(sin(x), (x, 0, pi))
    >>> integral
    Integral(sin(x), (x, 0, pi))
    >>> integral.doit()
    2

Taylor series

A Taylor series approximation is an approximation of a function obtained by truncating its Taylor series. To compute it, use series(expr, x, x0, n), where x is the relevant variable, x0 is the point around which the expansion is done (defaults to 0), and n is the order of the expansion (defaults to 6):

    >>> series(cos(x), x)
    1 - x**2/2 + x**4/24 + O(x**6)
    >>> series(cos(x), x, n=10)
    1 - x**2/2 + x**4/24 - x**6/720 + x**8/40320 + O(x**10)

The O(x**6) part of the result is a "big-O" object. Intuitively, it represents all the terms of order 6 or higher. This object automatically absorbs or combines with powers of the variable, which makes simple arithmetic operations on expansions convenient:

    >>> O(x**2) + 2*x**3
    O(x**2)
    >>> O(x**2) * 2*x**3
    O(x**5)
    >>> expand(series(sin(x), x, n=6) * series(cos(x), x, n=4))
    x - 2*x**3/3 + O(x**5)
    >>> series(sin(x)*cos(x), x, n=5)
    x - 2*x**3/3 + O(x**5)

If you want to use the expansion as an approximation of the function, the O() term prevents it from behaving like an ordinary expression, so you need to remove it. You can do so with the aptly named .removeO() method:

    >>> series(cos(x), x).removeO()
    x**4/24 - x**2/2 + 1

Taylor series look better when rendered in the notebook.

Solving equations

This section shows how to solve the different types of equations that SymPy handles. The main function for solving equations is solve(). Its interface is somewhat complicated, as it accepts many different kinds of input and can return results in various forms depending on that input.

In the simplest case, univariate equations, use the syntax solve(expr, x) to solve the equation expr = 0 for the variable x. To solve an equation of the form A = B, rewrite it in the preceding form and call solve(A - B, x). This can solve algebraic and transcendental equations involving rational fractions, square roots, absolute values, exponentials, logarithms, trigonometric functions, and so on. The result is a list of the values of the variable that satisfy the equation. The following commands show a few examples of equations that can be solved:

    >>> solve(x**2 - 1, x)
    [-1, 1]
    >>> solve(x*exp(x) - 1, x)
    [LambertW(1)]
    >>> solve(abs(x**2-4) - 3, x)
    [-1, 1, -sqrt(7), sqrt(7)]

Note that the form of the result means that solve() can only return a finite set of solutions. When the true solution set is infinite, the result can therefore be misleading. When the solution is an interval, solve() typically returns an empty list. For periodic functions, usually only one solution is returned:

    >>> solve(0, x)  # all x are solutions
    []
    >>> solve(x - abs(x), x)  # all non-negative x are solutions
    []
    >>> solve(sin(x), x)  # all k*pi with k integer are solutions
    [0]

The domain over which the equation is solved depends on the assumptions on the variable. Hence, if the variable is a real Symbol object, only real solutions are returned, but if it is complex, then all solutions in the complex plane are returned (subject to the aforementioned restriction on returning infinite solution sets). This difference is readily apparent when solving polynomials, as the following example demonstrates (here x is assumed to have been created as a real symbol and z as a complex one):

    >>> solve(x**2 + 1, x)
    []
    >>> solve(z**2 + 1, z)
    [-I, I]

There is no restriction on the number of variables appearing in the expression. Solving a multivariate expression for any one of its variables expresses that variable as a function of the others, which allows it to be eliminated from other expressions. The following example shows different ways of solving the same multivariate expression:

    >>> solve(x**2 - exp(a), x)
    [-exp(a/2), exp(a/2)]
    >>> solve(x**2 - exp(a), a)
    [log(x**2)]
    >>> solve(x**2 - exp(a), x, a)
    [{x: -exp(a/2)}, {x: exp(a/2)}]
    >>> solve(x**2 - exp(a), x, b)
    [{x: -exp(a/2)}, {x: exp(a/2)}]

To solve a system of equations, pass a list of expressions to solve(). Each one is interpreted, as in the univariate case, as an equation of the form expr = 0. The result can be returned in one of two forms, depending on the mathematical structure of the input: either a list of tuples, where each tuple contains the values of the variables in the order given to solve(), or a single dictionary, suitable for use in subs(), mapping variables to their values. As the following example shows, it can be hard to predict which form the result will take:

    >>> solve([exp(x**2) - y, y - 3], x, y)
    [(-sqrt(log(3)), 3), (sqrt(log(3)), 3)]
    >>> solve([x**2 - y, y - 3], x, y)
    [(-sqrt(3), 3), (sqrt(3), 3)]
    >>> solve([x - y, y - 3], x, y)
    {y: 3, x: 3}

This variability in return types is fine for interactive use, but library code needs more predictability. In that case, use the dict=True option. The output will then always be a list of mappings of variables to values. Compare the following example to the previous one:

    >>> solve([x**2 - y, y - 3], x, y, dict=True)
    [{y: 3, x: -sqrt(3)}, {y: 3, x: sqrt(3)}]
    >>> solve([x - y, y - 3], x, y, dict=True)
    [{y: 3, x: 3}]

Summary

In this article, we used SymPy to compute derivatives, limits, integrals, and Taylor series, and to solve equations and systems of equations.
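As a numeric sanity check on the Taylor series section of this article, the truncated expansion of cos(x) can be compared with the standard library's math.cos. This is only a sketch that does not use SymPy: the helper name cos_taylor is hypothetical, and its n parameter mirrors the order argument of series(), so n=6 keeps the terms 1 - x**2/2 + x**4/24 that removeO() returns.

```python
import math


def cos_taylor(x: float, n: int = 6) -> float:
    """Truncated Maclaurin series of cos(x): all terms of order below n.

    Hypothetical helper for illustration; not part of SymPy.
    """
    total = 0.0
    # cos(x) only has even powers: x**0, x**2, x**4, ...
    for k in range(0, n, 2):
        total += (-1) ** (k // 2) * x ** k / math.factorial(k)
    return total


for x in (0.1, 0.5, 1.0):
    approx = cos_taylor(x)  # 1 - x**2/2 + x**4/24
    error = abs(approx - math.cos(x))
    print(f"x={x}: approx={approx:.6f}, error={error:.2e}")
```

The error grows as x moves away from the expansion point 0, which is the behaviour the O(x**6) term summarises. Raising n, as in series(cos(x), x, n=10), adds terms and shrinks the error.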

Packt
14 Aug 2013
15 min read

Quick start - your first Sinatra application

Step 1 – creating the application

The first thing to do is set up Sinatra itself, which means creating a Gemfile. Open up a Terminal window and navigate to the directory where you're going to keep your Sinatra applications.

Create a directory called address-book using the following command:

    mkdir address-book

Move into the new directory:

    cd address-book

Create a file called Gemfile:

    source 'https://rubygems.org'
    gem 'sinatra'

Install the gems via Bundler:

    bundle install

You will notice that Bundler installs not just the sinatra gem but also its dependencies. The most important dependency is Rack (http://rack.github.com/), which is a common handler layer for web servers. Rack will be receiving requests for web pages, digesting them, and then handing them off to your Sinatra application.

If you set up your Bundler configuration as indicated in the previous section, you will now have the following files:

  • .bundle: This is a directory containing the local configuration for Bundler
  • Gemfile: As created previously
  • Gemfile.lock: This is a list of the actual versions of gems that are installed
  • vendor/bundle: This directory contains the gems

You'll need to understand the Gemfile.lock file. It records exactly which versions of your application's dependencies (gems) get installed. When you run bundle install, if Bundler finds a file called Gemfile.lock, it installs exactly the gems and versions listed there. This means that when you deploy your application on the Internet, you can be sure of which versions are being used and that they are the same as the ones on your development machine. This makes debugging a lot more reliable. Without Gemfile.lock, you might spend hours trying to reproduce behavior that you're seeing on your deployed app, only to discover that it was caused by a glitch in a gem version that you haven't got on your machine.

Now we can create the files that make up the first version of our application.

Create address-book.rb:

    require 'sinatra/base'

    class AddressBook < Sinatra::Base
      get '/' do
        'Hello World!'
      end
    end

This is the skeleton of the first part of our application. Line 1 loads Sinatra, line 3 creates our application, and line 4 says we handle requests to '/', the root path. So if our application is running on myapp.example.com, this handler will respond to requests for http://myapp.example.com/. Line 5 returns the string Hello World!. Remember that a Ruby block or method without an explicit return keyword returns the result of its last line of code.

Create config.ru:

    $: << File.dirname(__FILE__)
    require 'address-book'

    run AddressBook.new

This file gets loaded by rackup, which is part of the Rack gem. Rackup is a tool that runs Rack-based applications. It reads the configuration from config.ru and runs our application. Line 1 adds the current directory to the list of paths where Ruby looks for files to load, line 2 loads the file we just created, and line 4 runs the application.

Let's see if it works. In a Terminal, run the following command:

    bundle exec rackup -p 3000

Here rackup reads config.ru, loads our application, and runs it. We use the bundle exec command to ensure that only our application's gems (the ones in vendor/bundle) get used. Bundler prepares the environment so that the application only loads the gems that were installed via our Gemfile. The -p 3000 option means we want to run a web server on port 3000 while we're developing. Open up a browser and go to http://0.0.0.0:3000; you should see the Hello World! text.

(Illustration 1: The Hello World! output from the application)

Logging

Have a look at the output in the Terminal window where you started the application. I got the following (line numbers are added for reference):

    1 [2013-03-03 12:30:02] INFO WEBrick 1.3.1
    2 [2013-03-03 12:30:02] INFO ruby 1.9.3 (2013-01-15) [x86_64-linux]
    3 [2013-03-03 12:30:02] INFO WEBrick::HTTPServer#start: pid=28551 port=3000
    4 127.0.0.1 - - [03/Mar/2013 12:30:06] "GET / HTTP/1.1" 200 12 0.0142
    5 127.0.0.1 - - [03/Mar/2013 12:30:06] "GET /favicon.ico HTTP/1.1" 404 445 0.0018

Like it or not, you'll be seeing a lot of logs such as this while doing web development, so it's a good idea to get used to noticing the information they contain.

Line 1 says that we are running the WEBrick web server. This is a minimal server included with Ruby. It's slow and not very powerful, so it shouldn't be used for production applications, but it will do for now for application development.

Line 2 indicates that we are running the application on version 1.9.3 of Ruby. Make sure you don't develop with older versions, especially the 1.8 series, as they're being phased out and are missing features that we will be using in this book.

Line 3 tells us that the server started and that it is awaiting requests on port 3000, as we instructed.

Line 4 is the request itself: GET /. The number 200 is the HTTP status code for a successful request (OK).

Line 5 is a second request created by our web browser. It's asking if the site has a favicon, an icon representing the site. We don't have one, so Sinatra responded with 404 (not found).

When you want to stop the web server, hit Ctrl + C in the Terminal window where you launched it.

Step 2 – putting the application under version control with Git

When developing software, it is very important to manage the source code with a version control system such as Git or Mercurial. Version control systems let you look back over the development of your project, work on it in parallel with others, and try out ideas on branches without messing up the stable application.

Create a Git repository in this directory:

    git init

Now add the files to the repository:

    git add Gemfile Gemfile.lock address-book.rb config.ru

Then commit them:

    git commit -m "Hello World"

I assume you created a GitHub account earlier. Let's push the code up to www.github.com for safe keeping.

Go to https://github.com/new and create a repo called sinatra-address-book.

Set up your local repo to send code to your GitHub account:

    git remote add origin git@github.com:YOUR_ACCOUNT/sinatra-address-book.git

Push the code, naming the remote and branch on this first push so that later calls to git push know where to send it:

    git push -u origin master

You may need to sort out authentication if this is your first time pushing code. If you get an error such as the following, you'll need to set up authentication on GitHub:

    Permission denied (publickey)

Go to https://github.com/settings/ssh and add the public key that you generated in the previous section. Now you can refresh your browser, and GitHub will show you your code.

Note that the code in my GitHub repository is marked with tags. If you want to follow the changes by looking at the repository, clone my repo from //github.com/joeyates/sinatra-address-book.git into a different directory and then "check out" the correct tag (indicated by a footnote) at each stage. To see the code at this stage, type in the following command:

    git checkout 01_hello_world

If you type in the following command, Git will tell you that you have "untracked files", for example, .bundle:

    git status

To get rid of the warning, create a file called .gitignore inside the project and add the following content:

    /.bundle/
    /vendor/bundle/

Git will no longer complain about those directories. Remember to add .gitignore to the Git repository and commit it.

Let's add a README file, as the GitHub page is requesting.

Create the README.md file and insert the following text:

    sinatra-address-book
    ====================

    An example program of various Sinatra functionality.

Add the new file to the repo:

    git add README.md

Commit the changes:

    git commit -m "Add a README explaining the application"

Send the update to GitHub:

    git push

Now that we have a README file, GitHub will stop complaining. What's more, other people may see our application and decide to build on it. The README file will give them some information about what the application does.

Step 3 – deploying the application

We've used GitHub to host our project, but now we're going to publish it online as a working site. In the introduction, I asked you to create a Heroku account. We're now going to use that to deploy our code. Heroku uses Git to receive code, so we'll be setting up our repository to push code to Heroku as well.

Now let's create a Heroku app:

    heroku create
    Creating limitless-basin-9090... done, stack is cedar
    http://limitless-basin-9090.herokuapp.com/ | git@heroku.com:limitless-basin-9090.git
    Git remote heroku added

My Heroku app is called limitless-basin-9090. This name was randomly generated by Heroku when I created the app. When you create an app, you will get a different, randomly generated name. My app will be available on the Web at http://limitless-basin-9090.herokuapp.com/. When you deploy your app, it will be available at an address based on the name that Heroku has generated for it.

Note that, on the last line, Git has been configured too. To see what has happened, use the following command:

    git remote show heroku
    * remote heroku
      Fetch URL: git@heroku.com:limitless-basin-9090.git
      Push URL: git@heroku.com:limitless-basin-9090.git
      HEAD branch: (unknown)

Now let's deploy the application to the Internet:

    git push heroku master

Now the application is online for all to see.

(Illustration: The initial version of the application, running on Heroku)

Step 4 – page layout with Slim

The page looks a bit sad. Let's set up a standard page structure and use a templating language to lay out our pages. A templating language allows us to create the HTML for our web pages in a clearer and more concise way. There are many HTML templating systems available to the Sinatra developer: erb, haml, and slim are three popular choices. We'll be using Slim (http://slim-lang.com/).

Let's add the gem. Update our Gemfile:

    gem 'slim'

Install the gem:

    bundle

We will be keeping our page templates as .slim files. Sinatra looks for these in the views directory. Let's create the directory, our new home page, and the standard layout for all the pages in the application.

Create the views directory:

    mkdir views

Create views/home.slim:

    p address book – a Sinatra application

When run via Sinatra, this will create the following HTML markup:

    <p>address book – a Sinatra application</p>

Create views/layout.slim:

    doctype html
    html
      head
        title Sinatra Address Book
      body
        == yield

Note how Slim uses indenting to indicate the structure of the web page. The most important line here is as follows:

    == yield

This is the point in the layout where our home page's HTML markup will get inserted. The yield instruction is where our Sinatra handler gets called. The result it returns (that is, the web page) is inserted here by Slim.

Finally, we need to alter address-book.rb. Add the following line at the top of the file:

    require 'slim'

Replace the get '/' handler with the following:

    get '/' do
      slim :home
    end

Start the local web server as we did before:

    bundle exec rackup -p 3000

(Illustration: The new home page, using the Slim templating engine)

Have a look at the source for the page. Note how the results of home.slim are inserted into layout.slim.

Let's get that deployed. Add the two new files to Git:

    git add views/*.slim

Also add the changes made to the other files:

    git add address-book.rb Gemfile Gemfile.lock

Commit the changes with a comment:

    git commit -m "Generate HTML using Slim"

Deploy to Heroku:

    git push heroku master

Check online that everything's as expected.

Step 5 – styling

To give a slightly nicer look to our pages, we can use Bootstrap (http://twitter.github.io/bootstrap/); it's a CSS framework made by Twitter.

Let's modify views/layout.slim. After the line that says title Sinatra Address Book, add the following code:

    link href="//netdna.bootstrapcdn.com/twitter-bootstrap/2.3.1/css/bootstrap-combined.min.css" rel="stylesheet"

There are a few things to note about this line. Firstly, we will be using a file hosted on a Content Distribution Network (CDN). Clearly, we need to check that the file we're including is actually what we think it is. The advantage of a CDN is that we don't need to keep a copy of the file ourselves, and if our users visit other sites using the same CDN, they'll only need to download the file once.

Note also the use of // at the beginning of the link address; this is called a "protocol agnostic URL". Referencing the document this way will allow us later on to switch our application to run securely under HTTPS without having to readjust all our links to the content.

Now let's change views/home.slim to the following:

    div class="container"
      h1 address book
      h2 a Sinatra application

We're not using Bootstrap to anywhere near its full potential here. Later on we can improve the look of the app using Bootstrap as a starting point.

Remember to commit your changes and to deploy to Heroku.

Step 6 – development setup

As things stand, during local development we have to manually restart our local web server every time we want to see a change. Now we are going to set things up so that the application reloads after each change.

Add the following block to the Gemfile:

    group :development do
      gem 'unicorn'
      gem 'guard'
      gem 'listen'
      gem 'rb-inotify', :require => false
      gem 'rb-fsevent', :require => false
      gem 'guard-unicorn'
    end

The group around these gems means they will only be installed and used in development mode and not when we deploy our application to the Web.

Unicorn is a web server that is used in real production environments, and it's better than WEBrick. WEBrick's slowness can even become noticeable during development, while Unicorn is very fast. rb-inotify and rb-fsevent are the Linux and Mac OS X components that watch your hard disk for changes. If any of your application's files change, guard restarts the whole application so that it picks up the changes.

Finally, update your gems:

    bundle

Now add Guardfile:

    guard :unicorn, :daemonize => true do
      `git ls-files`.each_line { |s| s.chomp!; watch s }
    end

Add a directory for the unicorn configuration file:

    mkdir config

In config/unicorn.rb, add the following:

    listen 3000

Run the web server:

    guard

Now if you make any changes, the web server will restart and you will get a notification via a desktop message. To see this, type in the following command:

    touch address-book.rb

You should get a desktop notification saying that guard has restarted the application.

Note that to shut guard down, you need to press Ctrl + D. Also, remember to add the new files to Git.

Step 7 – testing the application

We want our application to be robust. Whenever we make changes and deploy, we want to be sure that it's going to keep working. What's more, if something does not work properly, we want to be able to fix the bug and know that it won't come back. This is where testing comes in. Tests check that our application works properly and also act as detailed documentation for it; they tell us what the application is intended to do.

Our tests will actually be called "specs", a term that is supposed to indicate that you write tests as specifications for what your code should do. We will be using a library called RSpec. Let's get it installed.

Add the gems to the Gemfile:

    group :test do
      gem 'rack-test'
      gem 'rspec'
    end

Update the gems so RSpec gets installed:

    bundle

Create a directory for our specs:

    mkdir spec

Create the spec/spec_helper.rb file:

    $: << File.expand_path('../..', __FILE__)
    require 'address-book'
    require 'rack/test'

    def app
      AddressBook.new
    end

    RSpec.configure do |config|
      config.include Rack::Test::Methods
    end

Create a directory for the integration specs:

    mkdir spec/integration

Create a spec/integration/home_spec.rb file for testing the home page:

    require 'spec_helper'

    describe "Sinatra App" do
      it "should respond to GET" do
        get '/'
        expect(last_response).to be_ok
        expect(last_response.body).to match(/address book/)
      end
    end

What we do here is call the application, asking for its home page. We check that the application answers with an HTTP status code of 200 (be_ok). Then we check for some expected content in the resulting page, namely the text address book.

Run the spec:

    bundle exec rspec
    Finished in 0.0295 seconds
    1 example, 0 failures

Our spec ran without any errors.

There you have it. We've created a micro application, written tests for it, and deployed it to the Internet.

Summary

This article discussed how to perform the core tasks of Sinatra: handling a GET request and rendering a web page.
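Returning to the Logging section, the fields of the two request lines can be made explicit by parsing them. The sketch below is written in Python rather than Ruby and is only an illustration: the function name parse_access_line and the field names are hypothetical, and the pattern assumes lines shaped like the two request lines shown earlier (host, two placeholder fields, timestamp, request, status, response size in bytes, and time taken in seconds).

```python
import re

# Pattern for access-log lines such as:
# 127.0.0.1 - - [03/Mar/2013 12:30:06] "GET / HTTP/1.1" 200 12 0.0142
# Field names are illustrative assumptions, not taken from Rack itself.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) (?P<duration>[\d.]+)'
)


def parse_access_line(line: str) -> dict:
    """Split one access-log line into named fields (hypothetical helper)."""
    match = LOG_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    fields = match.groupdict()
    # Convert the numeric fields from strings.
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    fields["duration"] = float(fields["duration"])
    return fields


sample = '127.0.0.1 - - [03/Mar/2013 12:30:06] "GET / HTTP/1.1" 200 12 0.0142'
entry = parse_access_line(sample)
print(entry["method"], entry["path"], entry["status"], entry["size"])
```

The size of 12 bytes in the first request matches the length of the string Hello World! that the handler returns.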

Packt
14 Aug 2013
6 min read

Analytics – Drawing a Frequency Distribution with MapReduce (Intermediate)

Often, we use Hadoop to calculate analytics, which are basic statistics about data. In such cases, we walk through the data using Hadoop and calculate interesting statistics about it. Some common analytics are as follows:

  • Calculating statistical properties such as the minimum, maximum, mean, median, and standard deviation of a dataset. A dataset generally has multiple dimensions (for example, when processing HTTP access logs, the name of the web page, the size of the web page, and the access time are a few of the dimensions). We can measure the previously mentioned properties using one or more dimensions. For example, we can group the data into multiple groups and calculate the mean value of each group.
  • Frequency distributions: a histogram counts the number of occurrences of each item in the dataset, sorts these frequencies, and plots the items on the X axis and their frequency on the Y axis.
  • Finding a correlation between two dimensions (for example, the correlation between access count and file size for web accesses).
  • Hypothesis testing: verifying or disproving a hypothesis using a given dataset.

However, Hadoop will only generate numbers. Although the numbers contain all the information, we humans are very bad at figuring out overall trends by just looking at numbers. On the other hand, the human eye is remarkably good at detecting patterns, and plotting the data often gives us a deeper understanding of it. Therefore, we often plot the results of Hadoop jobs using a plotting program.

Getting ready

This article assumes that you have access to a computer that has Java installed and the JAVA_HOME variable configured.

Download a Hadoop 1.1.x distribution from the http://hadoop.apache.org/releases.html page.

Unzip the distribution; we will call this directory HADOOP_HOME.

Download the sample code for the article and copy the data files.

How to do it...

If you have not already done so, upload the Amazon dataset to the HDFS filesystem using the following commands:

    $ bin/hadoop dfs -mkdir /data/
    $ bin/hadoop dfs -mkdir /data/amazon-dataset
    $ bin/hadoop dfs -put <SAMPLE_DIR>/amazon-meta.txt /data/amazon-dataset/
    $ bin/hadoop dfs -ls /data/amazon-dataset

Copy the hadoop-microbook.jar file from SAMPLE_DIR to HADOOP_HOME.

Run the first MapReduce job to calculate the buying frequency. To do that, run the following command from HADOOP_HOME:

    $ bin/hadoop jar hadoop-microbook.jar microbook.frequency.BuyingFrequencyAnalyzer /data/amazon-dataset /data/frequency-output1

Use the following command to run the second MapReduce job, which sorts the results of the first MapReduce job:

    $ bin/hadoop jar hadoop-microbook.jar microbook.frequency.SimpleResultSorter /data/frequency-output1 /data/frequency-output2

You can find the results in the output directory. Copy the results to HADOOP_HOME using the following command:

    $ bin/hadoop dfs -get /data/frequency-output2/part-r-00000 1.data

Copy all the *.plot files from SAMPLE_DIR to HADOOP_HOME.

Generate the plot by running the following command from HADOOP_HOME:

    $ gnuplot buyfreq.plot

It will generate a file called buyfreq.png.

As the plot shows, a few buyers have bought a very large number of items. The distribution is much steeper than a normal distribution, and often follows what we call a Power Law distribution. This is an example of how analytics and plotting the results can give us insight into underlying patterns in the dataset.

How it works...

You can find the mapper and reducer code at src/microbook/frequency/BuyingFrequencyAnalyzer.java.

The analysis runs as two MapReduce jobs in sequence. The following code listing shows the map function and the reduce function of the first job:

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        List<BuyerRecord> records = BuyerRecord.parseAItemLine(value.toString());
        for (BuyerRecord record : records) {
            // Emit the customer ID with the number of items in this record.
            context.write(new Text(record.customerID),
                    new IntWritable(record.itemsBrought.size()));
        }
    }

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all item counts emitted for this customer ID.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }

Hadoop reads the input file from the input folder and reads records using the custom formatter we introduced in the Writing a formatter (Intermediate) article. It invokes the mapper once per record, passing the record as input.

The mapper extracts the customer ID and the number of items the customer has bought, and emits the customer ID as the key and the number of items as the value.

Then, Hadoop sorts the key-value pairs by key and invokes a reducer once for each key, passing all the values for that key as input to the reducer. Each reducer sums up all the item counts for its customer ID and emits the customer ID as the key and the total count as the value.

The second job then sorts the results. It reads the output of the first job as its input and passes each line as the argument to the map function. The map function extracts the customer ID and the number of items from the line and emits the number of items as the key and the customer ID as the value. Hadoop sorts the key-value pairs by key, thus sorting them by the number of items, and invokes the reducer once per key in that order. The reducer therefore prints them out in the same order, which effectively sorts the dataset.

Now that we have generated the results, let us look at the plotting.

You can find the source for the gnuplot file in buyfreq.plot. The source for the plot looks like the following:

    set terminal png
    set output "buyfreq.png"
    set title "Frequency Distribution of Items brought by Buyer";
    set ylabel "Number of Items Brought";
    set xlabel "Buyers Sorted by Items count";
    set key left top
    set log y
    set log x
    plot "1.data" using 2 title "Frequency" with linespoints

Here the first two lines define the output format. This example uses png, but gnuplot supports many other terminals such as screen, pdf, and eps.

The next four lines define the title, the axis labels, and the position of the legend. The two lines after that define the scale of each axis; this plot uses a log scale for both.

The last line defines the plot. It asks gnuplot to read the data from the 1.data file, to use the data in the second column of the file (using 2), and to plot it with lines and points. Columns must be separated by whitespace.

If you want to plot one column against another, for example data from column 1 against column 2, write using 1:2 instead of using 2.

There's more...

We can use a similar method to calculate most types of analytics and plot the results. Refer to the freely available article from Hadoop MapReduce Cookbook, Srinath Perera and Thilina Gunarathne, Packt Publishing, at http://www.packtpub.com/article/advanced-hadoop-mapreduce-administration for more information.

Summary

In this article, we have learned how to process Amazon data with MapReduce, generate data for a histogram, and plot it using gnuplot.
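The two-job flow described under How it works can be simulated in memory to see why swapping key and value in the second job sorts the output. The following is a minimal Python sketch, not the sample code from the article: run_job, the mapper and reducer names, and the purchase records are all hypothetical, and a plain sorted() call stands in for Hadoop's shuffle-and-sort phase.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Simulate one MapReduce job: map, group by key, sort keys, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    # Stand-in for Hadoop's shuffle: reducers see keys in sorted order.
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: total number of items bought per customer.
def count_mapper(record):
    customer_id, items = record
    yield customer_id, len(items)


def sum_reducer(customer_id, counts):
    yield customer_id, sum(counts)


# Job 2: swap key and value so that sorting by key orders by item count.
def swap_mapper(pair):
    customer_id, total = pair
    yield total, customer_id


def identity_reducer(total, customer_ids):
    for customer_id in customer_ids:
        yield total, customer_id


# Hypothetical sample input: (customer ID, items in one purchase record).
purchases = [
    ("alice", ["book", "dvd"]),
    ("bob", ["book"]),
    ("alice", ["cd", "game", "toy"]),
    ("carol", ["book", "cd", "dvd"]),
]

totals = run_job(purchases, count_mapper, sum_reducer)
ranked = run_job(totals, swap_mapper, identity_reducer)
print(ranked)
```

The second job does no real computation. It relies entirely on the framework sorting by key between the map and reduce phases, which is the same trick SimpleResultSorter is described as using.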
Packt
13 Aug 2013
14 min read

Quick start - creating your first application

By now you should have Meteor installed and ready to create your first app, but jumping in blindly would be more confusing than helpful. So let's take a moment to discuss the anatomy of a Meteor application.

We have already talked about how Meteor moves all the workload from the server to the browser, and we have seen firsthand the folder of plugins, which we can incorporate into our apps, so what have we missed? MVVM, of course. MVVM stands for Model, View, and View-Model. These are the three components that make up a Meteor application.

If you've ever studied programming academically, then you'll know there's a concept called separation of concerns. What this means is that you separate code with different intentions into different components. This allows you to keep things neat, but more importantly, if done right, it allows for better testing and customization down the line. A proper separation is one that allows you to remove a piece of code and replace it with another without disrupting the rest of your app.

An example of this could be a simple function. If you print out debug messages to a file throughout your app, it would be a terrible practice to manually write this code out each time. A much better solution would be to "separate" this code out into its own function, and only reference it throughout your app. This way, if you later decide you want debug messages to be e-mailed instead of written to a file, you only need to change the one function and your app will continue to work without even knowing about the change.

So we know separation is important, but I haven't clarified what MVVM is yet. To get a better idea, let's take a look at what kind of code should go in each component.

  • Model: The Model is the section of your code that has to do with the backend. This usually refers to your database, but it's not exclusive to just that. In Meteor, you can generally consider the database to be your application's model.
  • View: The View is exactly what it sounds like: it's your application's view. It's the HTML that you send to the browser. You want to keep these files as logic-less as possible, as this allows for better separation. It ensures that all your logic code is in one place, and it helps with testing and code re-use.
  • View-Model: The View-Model is where all the magic happens. The View-Model has two jobs: one is to interface the Model to the View, and the second is to handle all the events. Basically, all your logic code will be going here.

This is just a brief explanation of the MVVM pattern, but like most things, I think an example will illustrate it better. Let's pretend we have a site where people can share pictures, as on a typical social network. On the Model side, you will have a database which contains all the users' pictures. This is very nice, but it's private info and no user should be able to access it directly. That's where the View-Model comes in. The View-Model accesses the main Model and creates a custom version for the View. So, for instance, it creates a new dataset that only contains pictures from the user's friends. That is the View-Model's first job: to create datasets for the View with info from the Model.

Next, the View accesses the View-Model and gets the information it needs to display the page; in our example this could be an array of pictures. Now the page is built and both the Model and View are done with their jobs. The last step is to handle page events, for example, when the user clicks a button. If you remember, the views are logic-less, so when someone clicks a button, the event is sent back to the View-Model to be processed.

If you're still a bit fuzzy on the concept, it should become clearer when we create our first application. Now that we have gone through the concepts, we are ready to build our first application.
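The photo-sharing flow described above can be sketched in a few lines. This is only an illustrative, framework-free sketch, written in Python rather than Meteor's JavaScript; every name in it (PICTURES, FRIENDS, pictures_for, on_click, render_page) is hypothetical and not part of Meteor.

```python
# Model: the raw data store (an in-memory list standing in for a database).
PICTURES = [
    {"owner": "alice", "url": "a1.png"},
    {"owner": "bob", "url": "b1.png"},
    {"owner": "carol", "url": "c1.png"},
]
FRIENDS = {"alice": {"bob"}, "bob": {"alice"}, "carol": set()}


# View-Model, job 1: build a dataset for the View from the Model.
def pictures_for(user):
    """Return only the picture URLs owned by the user's friends."""
    friends = FRIENDS.get(user, set())
    return [p["url"] for p in PICTURES if p["owner"] in friends]


# View-Model, job 2: handle events raised by the View.
def on_click(state):
    """Count a button click; the View itself contains no logic."""
    state["clicks"] = state.get("clicks", 0) + 1
    return state


# View: logic-less rendering of whatever the View-Model hands over.
def render_page(user):
    return "\n".join("<img src='%s'>" % url for url in pictures_for(user))
```

The point of the split is that render_page never touches PICTURES directly, so the filtering rule can change without the View knowing about it.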
To get started, open a terminal window and create a new folder for your Meteor applications:

    mkdir ~/meteorApps

This creates a new directory called meteorApps in our home folder, which is represented by the tilde (~) symbol. Next, let's enter this folder by typing:

    cd ~/meteorApps

The cd (change directory) command will move the terminal to the location specified, which in our case is the meteorApps folder. The last step is to actually create a Meteor application, and this is done by typing:

    meteor create firstApp

You should be greeted with a message telling you how to run your app, but we are going to hold off on that. For now, just enter the directory by typing:

    cd firstApp
    ls

You should already be familiar with the cd command, and the ls command just lists the files in the current directory. If you didn't play around with the skel folder from the last section, then you should have three files in your app's folder: an HTML file, a JavaScript file, and a CSS file.

The HTML and CSS files are the View in the MVVM pattern, while the JavaScript file is the View-Model. It's a little difficult to begin explaining everything because we have a sort of chicken-and-egg problem, where we can't explain one without the other. But let's begin with the View as it's the simpler of the two, and then we will move backwards to the View-Model.

The View

If you open the HTML file, you should see a couple of lines, mostly standard HTML, but there are a few commands from Meteor's default templating language, Handlebars. This is not Meteor specific, as Handlebars is a templating language based on the popular mustache library, so you may already be familiar with it, even without knowing Meteor. But just in case, I'll quickly run through the file:

    <head>
      <title>firstApp</title>
    </head>

This first part is completely standard HTML; it's just a pair of head tags, with the page's title being set inside.
Next we have the body tag:

    <body>
      {{> hello}}
    </body>

The outer body tags are standard HTML, but inside there is a Handlebars function. Handlebars allows you to define template partials, which are basically pieces of HTML that are given a name. That way you are able to add the piece wherever you want, even multiple times on the same page. In this example, Meteor has made a call to Handlebars to insert the template called hello inside the body tags. It's a fairly easy syntax to learn: you open two curly braces, then you put a greater-than sign followed by the name of the template, finally closing it off with a pair of closing braces.

The rest of the file is the definition of the hello template partial:

    <template name="hello">
      <h1>Hello World!</h1>
      {{greeting}}
      <input type="button" value="Click" />
    </template>

Again it's mostly standard HTML, just an H1 title and a button. The only special part is the greeting line in the middle, which is another Handlebars function to insert data. This is how the MVVM pattern works: I said earlier that you want to keep the View as simple as possible, so if you have to calculate anything, you do it in the View-Model and then load the results into the View. You do this by leaving a reference; in our code the reference is to greeting, which means that whatever greeting equals is placed here. It's a placeholder for a variable, and if you guessed that the variable greeting will be in the View-Model, then you are 100 percent correct.

Another thing to notice is that we do have a button on the page, but you won't find any event handlers here. That's because, as I mentioned earlier, the events are handled in the View-Model as well. So it seems like we are done here, and the next logical step is to take a peek at the View-Model. If you remember, the View-Model is the .js file, so close this out and open the firstApp.js file.
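To make the two kinds of Handlebars tags described above more concrete, here is a rough sketch of the substitution idea. It is written in Python for illustration and is not how Handlebars or Meteor is implemented; TEMPLATES, DATA, and render are hypothetical names, and the greeting text is the string the generated app returns according to this article.

```python
import re

# Hypothetical stand-ins for the named template and its View-Model data.
TEMPLATES = {
    "hello": "<h1>Hello World!</h1> {{greeting}}",
}
DATA = {"hello": {"greeting": "Welcome to firstApp"}}


def render(source, current=None):
    """Expand {{> name}} partials and {{name}} placeholders in source."""

    def expand_partial(match):
        name = match.group(1)
        # A partial is rendered with the data that belongs to its template.
        return render(TEMPLATES[name], current=name)

    def fill_placeholder(match):
        values = DATA.get(current, {})
        return str(values.get(match.group(1), ""))

    # Expand partials first, so their placeholders see the right data.
    source = re.sub(r"\{\{>\s*(\w+)\s*\}\}", expand_partial, source)
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", fill_placeholder, source)
```

Calling render("<body> {{> hello}}</body>") pulls in the hello partial and fills its greeting placeholder, which mirrors what the View asks the View-Model to do.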
The JS file

There is slightly more code here, but if you're comfortable with JavaScript, then everything should feel right at home. At first glance you can see that the page is split up into two if statements, Meteor.isClient and Meteor.isServer. This is because the JS file is parsed on both the server and the user's browser. These statements are used to write code for one and not the other. For now we aren't going to be dealing with the server, so you don't have to worry about the bottom section. The top section, on the other hand, has our HTML file's data.

While we were in the View, we saw a call to a template partial named hello, and then inside it we referenced a placeholder called greeting. The way to set these placeholders is by referencing the global Template variable, and to set the value by following this pattern:

    Template.template_name.placeholder_name

So in our example it would be:

    Template.hello.greeting

And if you take a look at the first thing inside the isClient variable's if statement, you will find exactly this. Here, it is set to a function, which returns a simple string. You can set it directly to a string, but then it's not dynamic. Usually the only reason you are defining a View-Model variable is because it's something that has to be computed via a function, which is why it was done this way. But there are cases where you may just want to reference a simple string, and that's fine.

To recap, so far in the View we have a reference to a piece of data named greeting inside a template partial called hello, which we are setting in the View-Model to the string Welcome to firstApp.

The last part of the JS file is the part that handles events on the page; it does this by passing an event-map to a template's events function.
This follows the same notation as the previous one, so you type:

    Template.template_name.events( events_map );

I'll paste the example's code here, for further illustration:

    Template.hello.events({
      'click input' : function () {
        // template data, if any, is available in 'this'
        if (typeof console !== 'undefined')
          console.log("You pressed the button");
      }
    });

Inside each events object, you place the action and target as the key, and you set a function as the value. The actions are standard JavaScript actions, so you have things such as click, dblclick, keydown, and so on. Targets use standard CSS notation, which is periods for classes, hash symbols for IDs, and just the tag name for HTML tags. Whenever the event happens (for example, the input is clicked) the attached function will be called. For the full list of event types, see http://docs.meteor.com/#template_events

The example function would be a lot shorter without the comment and the if statement that makes sure the console is defined. Basically, the function just outputs the words You pressed the button to the console every time you press the button. Pretty intuitive!

So we went through the files; all that's left to do is actually test them. To do this, go back to the terminal, and make sure you're in the firstApp folder. You can check by using ls again to make sure the three files are there, and by using cd ~/meteorApps/firstApp if you are not in the right folder. Next, just type meteor and hit Enter, which will cause Meteor to compile everything together and run the built-in web server. If this is done right, you should see a message saying something like:

    Running on: http://localhost:3000/

Navigate your browser to the location specified (http://localhost:3000), and you should see the app that we just created. If your browser has a console, you can open it up and click the button.
Doing so will display the message You pressed the button, the same string we saw in the JS file. I hope it all makes sense now, but to drive the point home, we will make a few adjustments of our own. In the terminal window, press Ctrl + C to close the Meteor server, then open up the HTML file.

A quick revision

After the call to the hello template inside the body tags, add a call to another template named quickStart. Here is the new body section along with the completed quickStart template:

    <body>
      {{> hello}}
      {{> quickStart}}
    </body>

    <template name="quickStart">
      <h3>Click Counter</h3>
      The Button has been pressed {{numClick}} time(s)
      <input type="button" id="counter" value="CLICK ME!!!" />
    </template>

I wanted to keep it as similar to the other template as possible, so as not to throw too much at you all at once. It simply contains a title enclosed in h3 tags, followed by a string of text with a placeholder named numClick and a button with an id value of counter. There's nothing radically different from the other template, so you should be fairly comfortable with it. Now save this and open the JS file.

What we are adding to the page is a counter that will display the number of times the button was pressed. We do this by telling Meteor that the placeholder relies on a specific piece of data; Meteor will then track this data, and every time it changes, the page will be automatically updated. The easiest way to set this up is by using Meteor's Session object. Session is a key-value store object, which allows you to store and retrieve data inside Meteor. You set data using the set method, passing in a name (key) and value; you can then retrieve that stored info by calling the get method, passing in the same key. Besides the Session object bit, everything else is the same.
So just add the following part right after the hello template's events call, and make sure it's inside the isClient variable's if statement:

    Template.quickStart.numClick = function () {
      var pcount = Session.get("pressed_count");
      return (pcount) ? pcount : 0;
    };

This function gets the current number of clicks, stored with a key of pressed_count, and returns it, defaulting to zero if the value was never set. Since we are using the pressed_count property inside the placeholder's function, Meteor will automatically update this part of the HTML whenever pressed_count changes.

Last but not least, we have to add the event-map; put the following code snippet right after the previous code:

    Template.quickStart.events({
      'click #counter' : function () {
        var pcount = Session.get("pressed_count");
        pcount = (pcount) ? pcount + 1 : 1;
        Session.set("pressed_count", pcount);
      }
    });

Here we have a click event for our button with the counter ID, and the attached function just gets the current count, increments it by one, and stores it back in the Session. To try it out, just save this file, and in the terminal window, while still in the project's directory, type meteor to restart the web server. Try clicking the button a few times, and if all went well the text should be updated with an incrementing value.

Resources for Article:

Further resources on this subject:
  • Meteor.js JavaScript Framework: Why Meteor Rocks! [Article]
  • Applying Special Effects in 3D Game Development with Microsoft Silverlight 3: Part 2 [Article]
  • YUI Test [Article]
Packt
13 Aug 2013
15 min read

Using Unrestricted Languages

Are untrusted languages inferior to trusted ones?

No, on the contrary, these languages are untrusted in the same way that a sharp knife is untrusted and should not be entrusted to very small children, at least not without adult supervision. They have extra powers that ordinary SQL or even the trusted languages (such as PL/pgSQL) and trusted variants of the same language (PL/Perl versus PL/Perlu) don't have.

You can use the untrusted languages to directly read and write on the server's disks, and you can use them to open sockets and make Internet queries to the outside world. You can even send arbitrary signals to any process running on the database host. Generally, you can do anything the native language of the PL can do.

However, you probably should not trust arbitrary database users to have the right to define functions in these languages. Always think twice before giving all privileges on an untrusted (*u) language to a user or group.

Can you use the untrusted languages for important functions?

Absolutely. Sometimes, it may be the only way to accomplish some tasks from inside the server. Performing simple queries and computations should do nothing harmful to your database, and neither should connecting to the external world for sending e-mails, fetching web pages, or doing SOAP requests. They may cause delays and even queries that get stuck, but these can usually be dealt with by setting an upper limit on how long a query can run, using an appropriate statement time-out value. Setting a reasonable statement time-out value by default is a good practice anyway.

So, if you don't deliberately do risky things, the probability of harming the database is no bigger than when using a "trusted" (also known as "restricted") variant of the language.
However, if you give the language to someone who starts changing bytes on the production database "to see what happens", you will probably get what you asked for.

Will untrusted languages corrupt the database?

The power to corrupt the database is definitely there, since the functions run as the system user of the database server with full access to the filesystem. So, if you blindly start writing into the data files and deleting important logs, it is very likely that your database will be corrupted.

Additional types of denial-of-service attacks are also possible, such as using up all memory or opening all IP ports; but there are ways to overload the database using plain SQL as well, so that part is not much different from trusted database access with the ability to just run arbitrary queries.

So yes, you can corrupt the database, but please don't do it on a production server. If you do, you will be sorry.

Why untrusted?

PostgreSQL's ability to use an untrusted language is a powerful way to perform some nontraditional things from database functions. Creating these functions in a PL is an order of magnitude smaller task than writing an extension function in C. For example, a function to look up the IP address for a hostname is only a few lines in PL/Pythonu:

    CREATE FUNCTION gethostbyname(hostname text)
      RETURNS inet
    AS $$
        import socket
        return socket.gethostbyname(hostname)
    $$ LANGUAGE plpythonu SECURITY DEFINER;

You can test it immediately after creating the function by using psql:

    hannu=# select gethostbyname('www.postgresql.org');
     gethostbyname
    ----------------
     98.129.198.126
    (1 row)

Creating the same function in the most untrusted language, C, involves writing tens of lines of boilerplate code, worrying about memory leaks, and all the other problems that come from writing code in a low-level language. I recommend prototyping in some PL if possible, and in an untrusted language if the function needs something that the restricted languages do not offer.
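One practical consequence of the body being ordinary Python is that it can be tried outside the database first. The following is a minimal sketch of that idea, assuming a plain Python interpreter (it also runs on Python 3, unlike plpythonu, which is Python 2); looks_like_inet is a hypothetical helper added here only to check that a result has the shape of an IP address, and is not part of the original function.

```python
import ipaddress
import socket


def gethostbyname(hostname):
    """Same body as the PL/Python function: resolve a name to an IPv4 address string."""
    return socket.gethostbyname(hostname)


def looks_like_inet(value):
    """Hypothetical check that a string parses as an IP address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False
```

Once the body behaves as expected, it can be pasted between the $$ markers of the CREATE FUNCTION statement shown above.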
Why PL/Python?

All of these tasks could be done equally well using PL/Perlu or PL/Tclu; I chose PL/Pythonu mainly because Python is the language I am most comfortable with. This also translates to having written some PL/Python code, which I plan to discuss and share with you in this article.

Quick introduction to PL/Python

PL/pgSQL is a language unique to PostgreSQL and was designed to add blocks of computation and SQL inside the database. While it has grown in its breadth of functionality, it still lacks the completeness of syntax of a full programming language. PL/Python allows your database functions to be written in Python, with all the depth and maturity of writing Python code outside the database.

A minimal PL/Python function

Let's start from the very beginning (again):

    CREATE FUNCTION hello(name text)
      RETURNS text
    AS $$
        return 'hello %s !' % name
    $$ LANGUAGE plpythonu;

Here, we see that creating the function starts by defining it as any other PostgreSQL function, with a RETURNS definition of a text field:

    CREATE FUNCTION hello(name text)
      RETURNS text

The difference from what we have seen before is that the language part specifies plpythonu (the language ID for the PL/Pythonu language):

    $$ LANGUAGE plpythonu;

The function body is very much a normal Python function, returning a value obtained by formatting the name passed as an argument into the string 'hello %s !' using the standard Python formatting operator %:

    return 'hello %s !' % name

Finally, let's test how this works:

    hannu=# select hello('world');
         hello
    ---------------
     hello world !
    (1 row)

And yes, it returns exactly what is expected!

Data type conversions

The first and last things happening when a PL function is called by PostgreSQL are converting argument values between the PostgreSQL and PL types. The PostgreSQL types need to be converted to the PL types on entering the function, and then the return value needs to be converted back into the PostgreSQL type.
Except for PL/pgSQL, which uses PostgreSQL's own native types in computations, the PLs are based on existing languages with their own understanding of what types (integer, string, date, …) are, how they should behave, and how they are represented internally. They are mostly similar to PostgreSQL's understanding but quite often are not exactly the same. PL/Python converts data from PostgreSQL types to Python types as shown in the following table:

    PostgreSQL                     | Python 2 | Python 3 | Comments
    -------------------------------+----------+----------+---------------------------------------------
    int2, int4                     | int      | int      |
    int8                           | long     | int      |
    real, double, numeric          | float    | float    | This may lose precision for numeric values.
    bytea                          | str      | bytes    | No encoding conversion is done, nor should
                                   |          |          | any encoding be assumed.
    text, char(), varchar(), and   | str      | str      | On Python 2, the string will be in server
    other text types               |          |          | encoding. On Python 3, it is a Unicode string.
    All other types                | str      | str      | PostgreSQL's type output function is used to
                                   |          |          | convert to this string.

Inside the function, all computation is done using Python types, and the return value is converted back to PostgreSQL using the following rules (this is a direct quote from the official PL/Python documentation at http://www.postgresql.org/docs/current/static/plpython-data.html):

  • When the PostgreSQL return type is Boolean, the return value will be evaluated for truth according to the Python rules. That is, 0 and empty string are false, but notably 'f' is true.
  • When the PostgreSQL return type is bytea, the return value will be converted to a string (Python 2) or bytes (Python 3) using the respective Python built-ins, with the result being converted to bytea.
  • For all other PostgreSQL return types, the returned Python value is converted to a string using Python's built-in str, and the result is passed to the input function of the PostgreSQL data type.

Strings in Python 2 are required to be in the PostgreSQL server encoding when they are passed to PostgreSQL. Strings that are not valid in the current server encoding will raise an error; but not all encoding mismatches can be detected, so garbage data can still result when this is not done correctly. Unicode strings are converted to the correct encoding automatically, so it can be safer and more convenient to use those. In Python 3, all strings are Unicode strings.

In other words, 0, False, and any empty sequence, including the empty string '' or an empty dictionary, become PostgreSQL false; any other value becomes true. One notable exception to this is that the check for None is done before any other conversions, so even for Booleans, None is always converted to NULL and not to the Boolean value false.

For the bytea type (the PostgreSQL byte array), the conversion from Python's string representation is an exact copy, with no encoding or other conversions applied.

Writing simple functions in PL/Python

Writing functions in PL/Python is not much different in principle from writing functions in PL/pgSQL. You still have the exact same syntax around the function body in $$, and the argument names, types, and returns all mean the same thing regardless of the exact PL used.

A simple function

So a simple add_one() function in PL/Python looks like this:

    CREATE FUNCTION add_one(i int)
      RETURNS int
    AS $$
        return i + 1
    $$ LANGUAGE plpythonu;

It can't get much simpler than that, can it? What you see here is that the PL/Python arguments are passed to the Python code after converting them to appropriate types, and the result is passed back and converted to the appropriate PostgreSQL type for the return value.

Functions returning a record

To return a record from a Python function, you can use:

  • A sequence or list of values in the same order as the fields in the return record
  • A dictionary with keys matching the fields in the return record
  • A class or type instance with attributes matching the fields in the return record

Here are samples of the three ways to return a record.
First, using an instance:

    CREATE OR REPLACE FUNCTION userinfo(
            INOUT username name,
            OUT user_id oid,
            OUT is_superuser boolean)
    AS $$
        class PGUser:
            def __init__(self, username, user_id, is_superuser):
                self.username = username
                self.user_id = user_id
                self.is_superuser = is_superuser
        u = plpy.execute("""
                select usename, usesysid, usesuper
                  from pg_user
                 where usename = '%s'""" % username)[0]
        user = PGUser(u['usename'], u['usesysid'], u['usesuper'])
        return user
    $$ LANGUAGE plpythonu;

Then, a slightly simpler one using a dictionary:

    CREATE OR REPLACE FUNCTION userinfo(
            INOUT username name,
            OUT user_id oid,
            OUT is_superuser boolean)
    AS $$
        u = plpy.execute("""
                select usename, usesysid, usesuper
                  from pg_user
                 where usename = '%s'""" % username)[0]
        return {'username': u['usename'],
                'user_id': u['usesysid'],
                'is_superuser': u['usesuper']}
    $$ LANGUAGE plpythonu;

Finally, using a tuple:

    CREATE OR REPLACE FUNCTION userinfo(
            INOUT username name,
            OUT user_id oid,
            OUT is_superuser boolean)
    AS $$
        u = plpy.execute("""
                select usename, usesysid, usesuper
                  from pg_user
                 where usename = '%s'""" % username)[0]
        return (u['usename'], u['usesysid'], u['usesuper'])
    $$ LANGUAGE plpythonu;

Notice [0] at the end of u = plpy.execute(...)[0] in all the examples. It is there to extract the first row of the result, as even for one-row results plpy.execute still returns a list of results.

Danger of SQL injection!

These examples merge the username directly into the query text. We have neither used plpy.prepare() followed by plpy.execute() with arguments, nor used plpy.quote_literal() to safely quote the username before merging it into the query (both techniques are discussed later), so we are open to a security flaw known as SQL injection. Make sure that you only let trusted users call this function or supply the username argument.
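To see why the warning above matters, it helps to look at the query text that the % operator actually produces. The sketch below runs in plain Python, outside PostgreSQL, and only builds strings; quote_literal_sketch is a simplified, hypothetical stand-in that doubles single quotes and is not the real plpy.quote_literal(), which should be used (or a prepared plan) inside an actual PL/Python function.

```python
UNSAFE_TEMPLATE = "select usename from pg_user where usename = '%s'"
QUOTED_TEMPLATE = "select usename from pg_user where usename = %s"


def build_query_unsafe(username):
    """Interpolate the argument directly, as the userinfo() examples do."""
    return UNSAFE_TEMPLATE % username


def quote_literal_sketch(value):
    """Simplified stand-in for literal quoting: wrap in quotes, double embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_query_quoted(username):
    """Quote the argument before merging it into the query text."""
    return QUOTED_TEMPLATE % quote_literal_sketch(username)


# A username chosen by an attacker to change the meaning of the WHERE clause.
malicious = "x' or '1'='1"
```

With the unsafe builder, the malicious value closes the string literal early and appends its own condition; with the quoted builder, the whole value stays inside a single literal.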
Calling the function defined via any of these three CREATE commands will look exactly the same:

    hannu=# select * from userinfo('postgres');
     username | user_id | is_superuser
    ----------+---------+--------------
     postgres |      10 | t
    (1 row)

It usually does not make sense to declare a class inside a function just to return a record value. This possibility is included mostly for cases where you already have a suitable class with a set of attributes matching the ones the function returns.

Table functions

When returning a set from a PL/Python function, you have three options:

  • Return a list or any other sequence of the return type
  • Return an iterator or generator
  • yield the return values from a loop

Here, we have three ways to generate all even numbers up to the argument value using these different styles. First, returning a list of integers:

    CREATE FUNCTION even_numbers_from_list(up_to int)
      RETURNS SETOF int
    AS $$
        return range(0, up_to, 2)
    $$ LANGUAGE plpythonu;

The list here is returned by a built-in Python function called range, which returns all even numbers from 0 up to, but not including, the argument. This gets returned from the PostgreSQL function as a table of integers, one integer per row. If the RETURNS clause of the function definition said int[] instead of SETOF int, the same function would return a single PostgreSQL array of even integers.

The next function returns a similar result using a generator, returning both the even number and the odd one following it. Also, notice the different PostgreSQL syntax RETURNS TABLE(...) used this time for defining the return set:

    CREATE FUNCTION even_numbers_from_generator(up_to int)
      RETURNS TABLE (even int, odd int)
    AS $$
        return ((i, i+1) for i in xrange(0, up_to, 2))
    $$ LANGUAGE plpythonu;

The generator is constructed using a generator expression (x for x in <seq>).
Finally, the function is defined as a generator using explicit yield syntax, and yet another PostgreSQL syntax is used for returning SETOF RECORD, with the record structure defined this time by OUT parameters:

    CREATE FUNCTION even_numbers_with_yield(up_to int,
                                            OUT even int,
                                            OUT odd int)
      RETURNS SETOF RECORD
    AS $$
        for i in xrange(0, up_to, 2):
            yield i, i+1
    $$ LANGUAGE plpythonu;

The important part here is that you can use any of the preceding ways to define a PL/Python set-returning function, and they all work the same. Also, you are free to return a mixture of different types for each row of the set:

    CREATE FUNCTION birthdates(OUT name text, OUT birthdate date)
      RETURNS SETOF RECORD
    AS $$
        return (
            {'name': 'bob', 'birthdate': '1980-10-10'},
            {'name': 'mary', 'birthdate': '1983-02-17'},
            ['jill', '2010-01-15'],
        )
    $$ LANGUAGE plpythonu;

This yields the following result:

    hannu=# select * from birthdates();
     name | birthdate
    ------+------------
     bob  | 1980-10-10
     mary | 1983-02-17
     jill | 2010-01-15
    (3 rows)

As you see, the data-returning part of PL/Pythonu is much more flexible than returning data from a function written in PL/pgSQL.

Running queries in the database

If you have ever accessed a database in Python, you know that most database adapters conform to a somewhat loose standard called Python Database API Specification v2.0, or DBAPI 2 for short. The first thing you need to know about database access in PL/Python is that in-database queries do not follow this API.

Running simple queries

Instead of using the standard API, all database access is done with just two functions: plpy.execute() for running a query, and plpy.prepare() for turning query text into a query plan or a prepared query. plpy.execute() accepts either query text or a prepared plan.

The simplest way to do a query is with:

    res = plpy.execute(<query text>, [<row count>])

This takes a textual query and an optional row count, and returns a result object, which emulates a list of dictionaries, one dictionary per row.
As an example, if you want to access a field 'name' of the third row of the result, you use: res[2]['name'] The index is 2 and not 3 because Python lists are indexed starting from 0, so the first row is res[0], the second row res[1], and so on. Using prepared queries In an ideal world this would be all that is needed, but plpy.execute(query, cnt) has two shortcomings: It does not support parameters The plan for the query is not saved, requiring the query text to be parsed and run through the optimizer at each invocation We will show a way to properly construct a query string later, but for most uses simple case parameter passing is enough. So, the execute(query, [maxrows]) call becomes a set of two statements: plan = plpy.prepare(<query text>, <list of argument types>) res = plpy.execute(plan, <list of values>, [<row count>])For example, to query if a user 'postgres' is a superuser, use the following: plan = plpy.prepare("select usesuper from pg_user where usename = $1", ["text"]) res = plpy.execute(plan, ["postgres"]) print res[0]["usesuper"] The first statement prepares the query, which parses the query string into a query tree, optimizes this tree to produce the best query plan available, and returns the prepared_query object. The second row uses the prepared plan to query for a specific user's superuser status. The prepared plan can be used multiple times, so that you could continue to see if user bob is superuser. res = plpy.execute(plan, ["bob"]) print res[0]["usesuper"] Caching prepared queries Preparing the query can be quite an expensive step, especially for more complex queries where the optimizer has to choose from a rather large set of possible plans; so, it makes sense to re-use results of this step if possible. The current implementation of PL/Python does not automatically cache query plans (prepared queries), but you can do it easily yourself. try: plan = SD['is_super_qplan'] except: SD['is_super_qplan'] = plpy.prepare(".... 
plan = SD['is_super_qplan'] <the rest of the function> The values in SD[] and GD[] only live inside a single database session, so it only makes sense to do the caching in case you have long-lived connections.
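The caching pattern can be tried outside the database with a small stand-in. The sketch below is only an illustration: FakePlpy is a hypothetical stub (not part of PL/Python) that counts calls to prepare(), and SD is an ordinary dictionary standing in for the per-function dictionary that PL/Python provides. Inside a real PL/Python function, both plpy and SD already exist and the stub is not needed.

```python
class FakePlpy:
    """Hypothetical stand-in for the plpy module; it only counts prepare() calls."""

    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query, argtypes):
        self.prepare_calls += 1
        return ("plan", query, tuple(argtypes))

    def execute(self, plan, args):
        # Pretend that only 'postgres' is a superuser. The result is a list of
        # dictionaries, one per row, like the result object described above.
        # (A real query would return zero rows for an unknown user.)
        return [{"usesuper": args[0] == "postgres"}]


plpy = FakePlpy()   # provided by PostgreSQL in a real PL/Python function
SD = {}             # provided by PostgreSQL in a real PL/Python function


def is_superuser(username):
    # Prepare the plan on first use only, then keep it in SD for later calls.
    if "is_super_qplan" not in SD:
        SD["is_super_qplan"] = plpy.prepare(
            "select usesuper from pg_user where usename = $1", ["text"])
    plan = SD["is_super_qplan"]
    res = plpy.execute(plan, [username])
    return res[0]["usesuper"]


first = is_superuser("postgres")
second = is_superuser("bob")
print(first, second, plpy.prepare_calls)
```

The second call reuses the plan stored in SD, so prepare() runs only once in the session, which is the saving the caching is meant to achieve.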
Packt
13 Aug 2013
6 min read

Advanced JIRA 5.2 Features

GreenHopper So far, you have seen and used JIRA as a traditional issue-tracking system, where users can log issues and transition them through workflows. With the recent increased adoption of agile development methodologies, it is clear that JIRA by itself is not enough, and this is where GreenHopper comes in. GreenHopper adds the power of agile methodologies to JIRA, by providing a new user interface to help you and your team plan and visualize the tasks you have at hand. GreenHopper is a separate product and does not come with JIRA. So the first step for us is to install it via the Marketplace. Getting GreenHopper GreenHopper is a commercial add-on provided by Atlassian. We can discover and install add-ons directly from JIRA through the Universal Plugin Manager. Perform the following steps to install GreenHopper via the UPM: Browse to Universal Plugin Manager. Select the Find New Add-ons tab. Search for GreenHopper in the search box. This will locate the add-on GreenHopper - Agile project management for JIRA. Click on the Free Trial button if you want to evaluate GreenHopper before purchasing, or click on the Buy Now button to purchase directly. This will prompt the UPM to start downloading and installing the add-on. Click on the Get License button when prompted, and follow the steps to either generate a trial license or purchase a full license: After you have successfully installed GreenHopper, there will be a new item added to JIRA's top menu bar called Agile, as shown in the following screenshot: Starting with GreenHopper Before we start using GreenHopper, the first thing you need to understand is that GreenHopper adds a new user interface to JIRA, allowing you to better visualize the data you already have in JIRA. For example, an issue in GreenHopper is the same as an issue in JIRA, and you can go back and forth between the two user interfaces.
Now that the relationship between GreenHopper and JIRA is clear, we need to familiarize ourselves with a number of new terms that we will be using. Scrum Scrum is an agile software development methodology, where the development team plans and works on the project iteratively and incrementally to complete the project. You can read more on Scrum at http://en.wikipedia.org/wiki/Scrum_(development). Kanban Kanban is a methodology where the focus is to visualize and limit the amount of work that is in progress. Kanban allows the project team to focus on delivering customer value. You can read more on Kanban at http://en.wikipedia.org/wiki/Kanban_(development). Board A board is what GreenHopper uses to display and visualize issues in JIRA. You can think of it as a traditional white board, where you will have sticky notes representing the tasks to be completed. Card Following the preceding white board analogy, a card is the sticky note that represents the task to be done. With GreenHopper, a card is an issue, visualized differently: Story Stories or user stories represent requirements or features that are to be implemented. They are usually written in a non-technical language and describe, in a few short sentences, what needs to be done and whom the requirement is designed for (e.g. the end user, the administrator). In GreenHopper, a story is represented as an issue of type User Story. Sprint Sprints, also known as iterations, are used in iterative agile development methodologies, such as Scrum. A sprint has a specific duration (that is, a start and end date), usually between one and four weeks, in which the team works to deliver a portion or an improvement of the whole product or project. Epic An epic is a large user story that has not yet been broken down into smaller, more manageable stories, usually a group of related stories. Epics should be broken down into their component stories during the planning session, before becoming part of a sprint.
In GreenHopper, an epic is represented as an issue of type Epic. Backlog The backlog contains all the issues that have not yet been included in a sprint. Working with boards To start working with GreenHopper, you need to get familiar with boards. You can view and access boards from the Manage Boards page, by pulling down the Agile menu and selecting Manage Boards. From the Manage Boards page, you will see all the boards that are shared with you. The following screenshot shows three boards: two are shared with Sample Project, and one is not shared at all, making it a private board: GreenHopper has two types of boards, Scrum and Kanban. The Scrum board is designed to support the Scrum methodology, where teams plan and work in sprints. Scrum boards have access to all three of the modes described below. The Kanban board is designed to support the Kanban methodology, where teams focus on managing and constraining their work in progress. Since Kanban does not have a planning session like Scrum, its boards do not have the Plan mode. There are three modes for GreenHopper boards, namely Plan, Work, and Report: Plan: This is where you plan your sprints. This mode is only available to Scrum boards. Work: This is where cards (issues) are progressed (workflow transition) from one column (issue status) to another. Report: This contains a number of built-in reports and charts such as the Burndown chart (Scrum) and Control chart. The following screenshot shows an example of a Scrum board in the Plan mode: Creating a new board There are two ways to create a new board: from the presets, or with a filter. With the presets, you can create either a new Scrum or Kanban board. Perform the following steps to create a new board from presets: Bring down the Agile menu and select Manage Boards. Click on the Tools option at the top-right and select Create Board. Choose to create either a Scrum or Kanban board. Provide a name for the new board. Select the project the new board is for. Click on the Create button.
When creating a new board based on the presets, GreenHopper will automatically generate the necessary JQL queries based on the selected project. For a Scrum board, it will include all the issues in the project, while for a Kanban board, it will include all the issues that do not belong to a released version. Creating a new board based on the presets is simple and fast, but each board is linked to a single project. You can also create a new board with a filter; this way, you control which issues are added to the board. One thing to keep in mind is that you can only create Kanban boards this way. You cannot create a Scrum board with a filter. Perform the following steps to create a new Kanban board with a filter: Bring down the Agile menu and select Manage Boards. Click on the Tools option at the top-right and select Create Board. Select the Advanced option. Provide a name for the new board. Select a filter you want to use. Click on the Create button.
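To make the difference between the two preset queries concrete, here is a rough sketch of what such filters might look like when written by hand. The exact JQL that GreenHopper generates is not shown in this article, so the query text below is an approximation; the function name preset_board_jql and the project key SAMPLE are made up for illustration, and the Rank ordering assumes GreenHopper's ranking field is available.

```python
def preset_board_jql(project_key, board_type):
    """Return an approximate JQL filter for a preset Scrum or Kanban board.

    This is an illustrative guess at the kind of query described above,
    not the exact text GreenHopper produces.
    """
    base = 'project = "{0}"'.format(project_key)
    if board_type == "scrum":
        # Scrum preset: every issue in the project.
        return base + " ORDER BY Rank ASC"
    if board_type == "kanban":
        # Kanban preset: leave out issues that belong to a released version.
        return (base
                + " AND (fixVersion is EMPTY"
                + " OR fixVersion in unreleasedVersions())"
                + " ORDER BY Rank ASC")
    raise ValueError("board_type must be 'scrum' or 'kanban'")


scrum_jql = preset_board_jql("SAMPLE", "scrum")
kanban_jql = preset_board_jql("SAMPLE", "kanban")
print(scrum_jql)
print(kanban_jql)
```

A query written along these lines could be saved as a JIRA filter and then used with the Advanced option described above, which is how you would build a board that spans more than the issues of a single preset.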
Packt
12 Aug 2013
37 min read

Working with Bazaar in Centralized Mode

The centralized mode In the centralized mode, multiple users have write access to one or more branches on a central server. In addition, this mode requires that all commit operations be applied to the central branches directly. This is in contrast with the default behavior of Bazaar, where all commits are local only, and thus private by default. In order to prevent multiple users from overwriting each other's changes, commits must be synchronized and performed in lock-step: if two collaborators try to commit at the same time, only the first commit will succeed. The second collaborator has to synchronize first with the central server, merging in the changes done by others, and try to commit again. In short, a commit operation can only succeed if the server and the user are on the same revision right before the commit. First, we will learn about the core operations, advantages, and disadvantages of the centralized mode in a general context. In the next section, we will learn in detail how the centralized mode works in Bazaar. Core operations The core operations in centralized mode are checkout, update, and commit: Checkout: This operation creates a working tree by downloading the project's files from a central server. This is similar to the branch operation in Bazaar. Update: This operation updates the working tree to synchronize with the central server, downloading any changes committed to the server by others since the last update. This is similar to the pull operation in Bazaar. Commit: This operation records the pending changes in the working tree as a new revision on the central server. This is different from the default commit operation in Bazaar, because in the centralized mode, the commit must be performed on the central server.
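The lock-step rule can be illustrated with a toy model. This is a sketch only, not how Bazaar is implemented: the class names CentralBranch, WorkingTree, and OutOfDateError are invented for the example, and revisions are reduced to a list of commit messages.

```python
class OutOfDateError(Exception):
    """Raised when a working tree tries to commit while behind the server."""


class CentralBranch:
    """Toy central branch: just an ordered list of commit messages."""

    def __init__(self):
        self.revisions = []

    @property
    def revno(self):
        return len(self.revisions)


class WorkingTree:
    """Toy checkout that remembers which central revision it is based on."""

    def __init__(self, central):
        # checkout: start from the server's current revision
        self.central = central
        self.base_revno = central.revno

    def update(self):
        # update: catch up with whatever others have committed
        self.base_revno = self.central.revno

    def commit(self, message):
        # commit: only allowed when based on the server's latest revision
        if self.base_revno != self.central.revno:
            raise OutOfDateError("working tree is out of date, update first")
        self.central.revisions.append(message)
        self.base_revno = self.central.revno


central = CentralBranch()
alice = WorkingTree(central)
bob = WorkingTree(central)

alice.commit("alice: first change")       # succeeds; server is at revision 1
try:
    bob.commit("bob: second change")      # bob is still based on revision 0
    bob_first_attempt = "committed"
except OutOfDateError:
    bob_first_attempt = "rejected"

bob.update()                              # bob catches up to revision 1
bob.commit("bob: second change")          # now succeeds
print(bob_first_attempt, central.revno)
```

The model leaves out merging of pending changes during update, which is where conflicts arise in practice; it only shows why the second committer must update before the commit can go through.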
Bazaar supports all these core operations, and it provides additional operations to switch between centralized and decentralized modes, such as bind, unbind, and the notion of local commits, which we will explain later. The centralized workflow Since the centralized mode requires that all the commits be performed on the central server, it naturally enforces a centralized workflow. After getting the project's files using the checkout operation, the workflow is essentially a cycle of update and commit operations: Do a "checkout" to get the project's files. Work on the files and make some changes. Before committing, update the project to get the changes committed by others in the meantime. Commit the changes and return to step 2. Checkout from the central branch Given the central repository with its branches, the first step for a collaborator is to get the latest version of the project. Typically, you only need to do this once in the lifetime of the project. Later on, you can use the update operation to get the changes that were committed by the other collaborators on the server: As a result of the checkout, collaborators have their own private copy of the project to work on. Making changes Collaborators make changes independently in their own working trees, possibly working on copies of the same files simultaneously. Their environments are independent of each other and of the server too. Their changes are local and typically private until they commit them to the repository: Committing changes Commit operations are atomic—they cannot be interrupted or performed simultaneously in parallel. Therefore, collaborators can only commit new revisions one by one, not at the same time: If two collaborators try to commit at the same time as in this example, only the first one will succeed. The second one will fail because his copy of the project will be out of date as compared to the server, where another revision has been added by the other collaborator. 
At this point, the second collaborator will have to update his working tree to bring it to the latest revision, downloading the revision added by the other user who succeeded to commit first. Updating from the server The update operation brings the working tree up-to-date by copying any revisions that have been added on the server since the last update or checkout. If there are uncommitted changes in the working tree, they will be merged on top of the incoming changes: After the update, the local branch will be on the same revision as the server, and now the user may commit the pending changes: Handling conflicts during update When there are pending changes in the working tree, the update operation will try to rebase those changes on top of the incoming revisions. That is, the working tree is first synchronized with the server to be on the same revision, and after that the pending changes are applied on top of the updated working tree. Similar to a merge operation, if the pending changes conflict with the incoming changes, the conflicts must be resolved manually. Since there is no systematic way to return to the same original pending state, the update operation can be dangerous in this situation. The more pending changes and the more time has elapsed since the last update or checkout, the greater the risk of conflicts. Advantages The centralized mode has several useful properties that are worth considering. Easy to understand The concept of a central server, where all the changes are integrated and the work of all collaborators is kept synchronized, is simple and easy to understand. In projects using the centralized mode, the central server is an explicit and unambiguous reference point. Easy to synchronize efforts Since all the commits of the collaborators are performed on the central server in lock-step, the independent local working trees cannot diverge too far from each other; it's as if they are always at most one revision away from the central branch. 
In this way, the centralized mode helps the collaborators to stay synchronized. Widely used The centralized mode has a long-standing history. It is widely used today in many projects, and it is often preferred in corporate environments. Disadvantages The centralized mode has several drawbacks that are important to keep in mind. Single point of failure Any central server is, by definition, a potential single point of failure. Since in the centralized mode all commits must go through the central server, if it crashes or becomes unavailable, it can slow down, hinder, or in the worst case completely block further collaboration. Administrative overhead of access control When multiple users have write access to a branch, it raises questions and issues about access control, server configuration, and maintenance: Who should have write access? An access control policy must be defined and maintained. How to implement write access of multiple users on the central branches? The central server must be configured appropriately to enforce the access control policy. Whenever a collaborator joins or leaves the project, the server configuration must be updated to accommodate changes in the team. Whenever the access policy changes, the server configuration must be updated accordingly. The update operation is not safe The centralized mode heavily relies on an inherently unsafe operation—updating the working tree from the server while it has pending changes. Since the pending changes are, by definition, not recorded anywhere, there is no systematic way to return to the original state after performing the update operation. Unrelated changes interleaved in the revision history When collaborators work on different topics in parallel, if they continuously commit their changes, then unrelated changes will be interleaved in the revision history. 
As a result, the revision history can become difficult to read, and if a feature needs to be rolled back later, the revisions that were a part of the feature can be difficult to find. Using Bazaar in centralized mode Bazaar fully supports the core operations of the centralized mode by using so-called bound branches. The checkout and update operations are implemented using dedicated commands in the context of bound branches. The commit operation works differently when used with bound branches, in order to enforce the requirements of the centralized mode. In addition to the classic core operations of the centralized mode, Bazaar provides additional operations to easily turn the centralized mode on or off, which opens interesting new ways of combining centralized and decentralized elements in a workflow. Bound branches Bound branches are internally the same as regular branches; they differ only in a few configuration values—the bound flag is set to true, and bound_location is set to the URL of another branch. We will refer to the bound location as the master branch . In most respects, a bound branch behaves just like any regular branch. However, operations that add revisions to a bound branch behave differently—all the revisions are first added in the master branch, and only if that succeeds, the operation is applied to the bound branch. For example, the commit operation succeeds only if it can be applied to the master branch. Similarly, the push and pull operations on a bound branch will attempt to push and pull the missing revisions in the master branch first. Since being bound to another branch is simply a matter of configuration, branches can be reconfigured at any time to be bound or unbound. Creating a checkout The checkout operation creates a bound branch with a working tree. This configuration is called a checkout in Bazaar. This is essentially the same as creating a regular branch and then binding it to the source branch it was created from. 
The term checkout is also used as a verb to indicate the act of creating a checkout from another branch. Using the command line Let's first create a shared repository to store our sample branches: $ mkdir -p /sandbox $ bzr init-repository /sandbox/central Shared repository with trees (format: 2a) Location: shared repository: /sandbox/central $ cd /sandbox/central You can check out from another branch by using the bzr checkout command and by specifying the URL of the source branch. Optionally, you can specify the target directory where you want to create the new checkout. For example: $ bzr checkout http://bazaar.launchpad.net/~bzrbook/bzrbook-examples/hello-start trunk You can confirm that the branch configuration is a checkout by using the bzr info command: $ bzr info trunk Repository checkout (format: 2a) Location: repository checkout root: trunk checkout of branch: http://bazaar.launchpad.net/~bzrbook/bzrbook-examples/hello-start/ shared repository: . The first line of the output is the branch configuration, in this case a "Repository checkout", because we created the checkout inside a shared repository. Outside a shared repository, the configuration is called simply "Checkout". For example: $ bzr checkout trunk /tmp/checkout-tmp $ cd /tmp/checkout-tmp/ $ bzr info Checkout (format: 2a) Location: checkout root: . checkout of branch: /sandbox/central/trunk In both cases, the checkout of branch line indicates the master branch that this one is bound to. Using Bazaar Explorer Performing a checkout using Bazaar Explorer can be a bit confusing, because the buttons and menu options labeled Checkout... use a special mode of the checkout operation called "lightweight checkouts". Lightweight checkouts are very different from branches. Use the Branch view to check out from a branch: From the toolbar, click on the large Start button and select Branch..., or from the menu, select Bazaar | Start | Initialize. In the From: textbox, enter the URL of the source branch.
In the To: textbox, you can either type the path to the directory where you want to create the checkout, or click on the Browse button and navigate to it. Make sure to select the Bind new branch to parent location box, in order to make the new branch bound to the source branch: After you click on OK , the Status box will show the bzr command that was executed and its output. For example: Run command: bzr branch https://code.launchpad.net/~bzrbook/bzrbook-examples/hello-start /sandbox/central/trunk2 --bind --use-existing-dir Branched 6 revisions. New branch bound to https://code.launchpad.net/~bzrbook/bzrbook-examples/hello-start Click on Close to return to the status view, which shows the content of the working tree exactly in the same way as in the case of regular branches. The Status view does not indicate whether the branch of the current working tree is bound or not. On the other hand, the repository view uses different icons to distinguish these configurations: Bound branches are shown with a computer icon, and unbound branches are shown with a folder icon. Updating a checkout The purpose of the update operation is to bring a bound branch up-to-date with its master branch. If there are pending changes in the working tree, they will be reapplied after the branch is updated. If the incoming changes conflict with the pending changes in the working tree, the operation may result in conflicts. As collaborators work independently in parallel, it is very common and normal that a bound branch is out of date due to the commits done by other collaborators. In such a state, the commit operation would fail, and the bound branch must be updated first before retrying to commit. Similar to a pull operation, the update operation copies the missing revision data to the repository and updates the branch data to be the same as the master branch. 
If there are pending changes in the working tree at the time of performing the update, they are first set aside and reapplied at the end. During this step conflicts may happen, the same way as during a merge operation. Using the command line You can bring a bound branch up-to-date with its master branch by using the bzr update command. To demonstrate this, let's first create another checkout based upon an older revision: $ cd /sandbox/central $ bzr checkout trunk -rlast:3 last-3 $ cd last-3 $ bzr missing --line ../trunk You are missing 2 revisions: 6: Janos Gyerik 2013-03-03 updated readme 5: Janos Gyerik 2013-03-03 added python and bash impl That is, our new checkout is two revisions behind the trunk. Let's bring it up to date: $ bzr update +N hello.py +N hello.sh M README.md All changes applied successfully. Updated to revision 6 of branch /sandbox/central/trunk The missing revisions are added to the branch, and the necessary changes are applied to the working tree, resulting in identical branches: $ bzr missing ../trunk Branches are up to date. Using Bazaar Explorer To bring a checkout up-to-date with its master, you can either click on the large Update button in the toolbar, or navigate to Bazaar | Collaborate | Update Working Tree.... in the menu. The user interface does not take any parameters; the operation is applied immediately and its result is shown similar to the command-line interface. Visiting an older revision An interesting alternative use of the update operation is to reset the working tree to a past state, by specifying a revision by using the -r or --revision options. For example: $ cd /sandbox/central/trunk $ bzr update -r3 -D .bzrignore M README.md -D hello.py -D hello.sh All changes applied successfully. Updated to revision 3 of branch http://bazaar.launchpad.net/~bzrbook/bzrbook-examples/hello-start This may seem similar to using bzr revert, but in fact it is very different. 
The changes applied to the working tree will not be considered pending changes. Instead, the working tree is marked as out of date with its master, effectively preventing commit operations in this state: $ bzr status working tree is out of date, run 'bzr update' Another difference from the revert command is that we cannot specify a subset of files; the update command is applied to the entire working tree. This operation works on unbound branches too. Since an unbound branch can be thought of as being its own master, the update command without a revision parameter simply restores it to its latest revision. Committing a new revision The commit operation works in the same way as it does with unbound branches, however, in keeping with the main principles of the centralized mode, Bazaar must ensure that the commit is performed in two branches—first in the master branch, followed by the bound branch. The commit operation in the master branch succeeds only if it is at the same revision as the bound branch. Otherwise, the operation fails, and the bound branch must first be synchronized with its master branch using the update operation. In Bazaar Explorer, the Commit view shows an additional explanation when committing in a bound branch, as a kind reminder that the operation will be performed on the master branch first, keeping the local and master branches in sync: Practical tips when working in centralized mode The centralized mode is simple and easy to work with in general, except for the update operation. The update operation can be problematic when there are too many pending changes in the working tree, and the central branch has evolved too far since the last time the bound branch was synchronized. Fortunately, a few simple practices can greatly reduce or mitigate the potential conflicts that may arise during update operations: Always perform an update before starting to work on something new. 
That is, make sure to start a new development based on the latest version of the central branch. Break down bigger changes into smaller steps and commit them little by little. Don't let too many pending changes accumulate locally; try to commit your work as soon as possible. In case of large scale changes and whenever it makes sense, use dedicated feature branches. You can work on feature branches locally or share them with others by pushing to the central server. Working with bound branches Bazaar provides additional operations using bound branches that go beyond the core principles of the centralized mode, such as unbinding from the master branch, binding to a branch, and local commits. Essentially, these operations provide different ways to switch in and out of the centralized mode, which is extremely useful when a central branch becomes temporarily unavailable, or if you want to rearrange the branches in your workflow. Unbinding from the master branch Sometimes, you may want to commit changes even if the master branch is not accessible. For example, when the server hosting the master branch is experiencing network problems, or if you are in an environment with no network access, such as in a coffee shop or on a train. You can unbind from the master branch by using the bzr unbind command. To unbind a branch using Bazaar Explorer, you can either click on the large Work icon in the toolbar and select Unbind Branch, or use the menu Bazaar | Work | Unbind Branch. Internally, this operation simply sets the bound configuration value to false. Since the branch is no longer considered bound, subsequent commit operations will be performed only locally, and the branch will behave as any other regular branch. You can confirm that a branch was unbound from its master by using the bzr info command. For example: $ cd /sandbox/central/ $ bzr checkout trunk mycheckout $ cd mycheckout/ $ bzr info Repository checkout (format: 2a) Location: repository checkout root: .
checkout of branch: /sandbox/central/trunk shared repository: /sandbox/central $ bzr unbind $ bzr info Repository tree (format: 2a) Location: shared repository: /sandbox/central repository branch: . That is, the configuration has changed from Repository checkout to Repository tree and the checkout of branch line disappeared from the output. Binding to a branch Sometimes, you may want to bind a regular independent branch to another branch, for example to switch to using the centralized mode, or if you previously unbound from a branch and want to bind to it again. You can bind to a branch by using the bzr bind command and specifying the URL of the branch. To bind a branch using Bazaar Explorer, you can either click on the large Work icon in the toolbar and select Bind Branch... , or use the menu Bazaar | Work | Bind Branch... . If you have previously used unbind in this branch, then you can omit the URL parameter on the command line, and in Bazaar Explorer the previous location is selected by default. Internally, this operation simply updates the branch configuration—sets or updates the value of bound_location and sets the value of bound to True. Since the branch is now considered bound, all commit operations will be first applied to the master branch, but the working tree is left unchanged at this point. Although you can bind any branch to any other branch, it only makes sense to bind to a related branch, typically a branch that is some revisions ahead of the current branch, so that a normal pull operation would bring the local branch up-to-date with its master branch. After binding to a branch, you should bring the local branch up-to-date with its master branch by using bzr update. Ideally, if the local branch is related to its new master and is just some revisions behind, then the update operation will simply bring it up-to-date by copying the revision data and the branch data of the master, leaving the working tree in a clean state, ready to work in the branch. 
However, if the two branches have diverged from each other, then the update operation will perform a merge—first the working tree is updated to match the latest revision in the master branch, after that the revisions that do not exist in the master branch are merged in the same way as in a regular merge operation. This is an unusual use case, but nonetheless a valid operation. After all the changes are applied, you must sort out all conflicts, if any, and you may commit the merge. Since the branch is now a bound branch, the merge commit will be first applied in the master branch, and after that in the bound branch. Using local commits If you want to break out of the centralized mode only temporarily, an alternative to unbinding and rebinding later is using so-called local commits. When using local commits, you basically stay in centralized mode, but instead of trying to commit in the master branch, the commit operation is applied only in the local branch. This can be very useful when the master branch is temporarily unavailable but expected to be restored soon. You can perform a local commit by using the bzr commit command with the --local flag, or in Bazaar Explorer by selecting the Local commit box in the Commit view: You can continue to perform as many local commits as needed until the master branch becomes available again. As a result of local commits, the bound branch and the master branch go out of sync. If you try to perform a regular commit in such a state, Bazaar will raise an error and tell you to either continue committing locally, or perform an update and then commit. $ bzr commit -m 'removed readme' bzr: ERROR: Bound branch BzrBranch7(file:///sandbox/central/on-the-train/) is out of date with master branch BzrBranch7(file:///sandbox/central/trunk/). To commit to master branch, run update and then commit. You can also pass --local to commit to continue working disconnected. 
It may seem strange at first that we have to do an update even though in this case our local branch is clearly ahead of its master. However, the behavior is consistent with the rule – if a bound branch is not in sync with its master branch, you must always use the update operation to synchronize it. As usual, the update operation will first restore the working tree to the same state as the latest revision in the master branch. After that, it will perform a merge from the tip of the local branch, applying the changes in the revisions that were committed locally. Finally, it will apply the pending changes that existed at the moment the update operation started. As a result, the working tree will be in a pending merge state, as you can confirm by using the log and status commands. For example: After sorting out all conflicts, if any, you may commit the merge. The local commits will appear as if they had been on a branch and the branch has been merged. This makes perfect sense, as indeed this is exactly what happened: If no new revisions were added in the master branch during your local commits, then a simple way to bring the master up-to-date is to do a bzr push operation instead of bzr update. It works because in this case the two branches have not diverged; the local branch is simply a few revisions ahead of its master. The push operation appends the missing revisions to the master branch, and the two branches become synchronized again, and you can continue to work and commit normally. Working with multiple branches Branch operations work consistently, regardless of whether you use the centralized mode or not. Although the centralized mode permits multiple collaborators committing unrelated changes continuously in the central branch, it is better to work on new improvements in dedicated feature branches and merge them into the central branch only when they are ready. 
In this way, the revision history remains easy to read, and if a feature causes problems, all the revisions involved in it can be reverted in one step. Even in a centralized workflow, you are free to use as many local private branches as needed. You can slice and dice your local branches, and when a feature is ready, you can merge it into the central branch; all the intermediate revisions will be preserved in the history.

Team members can work on a feature branch together by sharing the branch on the central server. One of the team members can start working on the feature, and at some point push the branch to the server so that others can check it out and start contributing their work. After pushing the branch to the server, the original contributor can switch to the centralized mode using the bind command.

When working on a bound branch, keep in mind that in addition to the commit operation, the push and pull operations will also (at least try to) affect its master branch.

Setting up a central server

In order to use Bazaar in the centralized mode, collaborators need to have write access to the branches on a central server. Here, we explain a few ways of configuring such servers.

Using an SSH server

An easy and secure way to provide write access to branches at a central location is by using an SSH server. In this setup, users authenticate via the SSH service running on the server, and their read and write access permissions to the branches are subject to regular filesystem permissions. There are several ways of accessing Bazaar branches over SSH:

  • Users access the server with their own SSH account
  • Users access the branches with a shared restricted SSH account
  • Users access the server with their own SSH account over SFTP

Using the smart server over SSH

If Bazaar is installed on the server, remote clients can benefit from the built-in smart server when accessing branches by using the bzr+ssh:// protocol.
In this mode, the bzr serve command is invoked on the server side to handle incoming Bazaar commands. This mode is called smart server, because remote clients receive assistance from the server, significantly speeding up Bazaar operations.

In addition to Bazaar being installed on the server, the bzr command must be in a directory included in the user's PATH variable. Otherwise, the absolute path of bzr must be specified on the client side, either in the BZR_REMOTE_PATH environment variable or in Bazaar's user configuration. For example, if bzr is installed in /usr/local/bin/bzr, then you can execute Bazaar commands on the remote location as follows:

$ export BZR_REMOTE_PATH=/usr/local/bin/bzr
$ bzr info bzr+ssh://user@example.com/repos/projectx

Alternatively, the remote path can be specified in the locations.conf file in your Bazaar configuration directory as follows:

[bzr+ssh://example.com/repos/projectx]
bzr_remote_path = /usr/local/bin/bzr

See bzr help configuration for more details. Use the bzr version command to find the location of the Bazaar configuration directory.

Using individual SSH accounts

This is the easiest way to access Bazaar repositories on a remote computer. Users with shell access to a computer can access Bazaar branches by using the bzr+ssh:// protocol. For example:

$ bzr info bzr+ssh://user@example.com/repos/projectx

The path component in the URL must be the absolute path of the branch on the server; in this example, the branch is in /repos/projectx. If the branch is in the user's home directory, then the home directory part can be replaced with ~; for example, instead of /home/jack/repos/projectx, you can use the simpler form ~/repos/projectx:

$ bzr info bzr+ssh://user@example.com/~/repos/projectx

To refer to a Bazaar branch in another user's home directory, you can use the ~username shortcut.
For example:

$ bzr log bzr+ssh://user@example.com/~mike/repos/projectx

In order to let multiple users commit to the same branches, their user accounts must have write permission to the branch and repository files used by Bazaar. One way to do that is by adding the users to a dedicated group, and setting the ownership and access permissions appropriately. Let's call this group bzrgroup, and let's set up a shared repository at /srv/repos/projectx for members of the group, as follows:

$ bzr init-repository /srv/repos/projectx --no-trees
Shared repository (format: 2a)
Location:
  shared repository: /srv/repos/projectx
$ chgrp -R bzrgroup /srv/repos/projectx
$ chmod g+s /srv/repos/projectx

With this setup, the members of bzrgroup can create branches and commit to them. With appropriate permissions, other users can be restricted to read-only access.

Using a shared restricted SSH account

Instead of creating individual SSH accounts for each collaborator, an interesting alternative is to use a shared SSH account with command restrictions. This setup requires that collaborators use SSH public key authentication when connecting to the server, and that appropriate access permissions to the branches be configured in the ~/.ssh/authorized_keys file of the shared SSH account.
Let's suppose that:

  • There is a shared repository on the server in /srv/bzr/projectx
  • You want to give Jack and Mike write access to the shared repository
  • The shared repository is owned by the user bzruser

To make this work, add the following two lines to the ~/.ssh/authorized_keys file of bzruser (each entry must be on a single line, with no spaces between the comma-separated options):

command="bzr serve --inet --allow-writes --directory=/srv/bzr/projectx",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding PUBKEY_OF_JACK
command="bzr serve --inet --allow-writes --directory=/srv/bzr/projectx",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding PUBKEY_OF_MIKE

Replace PUBKEY_OF_JACK and PUBKEY_OF_MIKE with the SSH public keys of Jack and Mike, respectively. For example, an SSH public key looks similar to the following:

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAo6a+TOzByRt9EVUjpMBs5kRft9SSPamI3cRlvaX4DuMbRqjtfkRTO4tik+MAWaFeIHyO5EsdFBGp+XVH9BMqehXdjAQga4Wa2oGX/w7bn+O+gdIoJE2wzMlGV2eXcaW2PKdDIqQpUn0n+xX68vjRaCiZmqGXWhVej3cVi9dtIwIQMrcIF4T+4wONic09UjPXZKbjL2GmkzsR6SMQJBomr4TUcRgyaR5ija9R8AzvsSdNeDKkVwf83lva3jruwEMute3aZFulM5JqvjFIFqooAlSjWjdniF8ZdweeN1c2Q2QH+eCl48hY2drUsdZ+oQH+xp8x6llkZiDWFE/RZLa3Glw== Joe

The command parameter restricts the login shell to the bzr serve command. In this way, the users will not be able to do anything else on the server except run Bazaar commands. The --directory parameter further restricts Bazaar operations to the specified directory. To grant read-only access, simply drop the --allow-writes flag. The other options on the line after command make the SSH sessions as restricted as possible, as a good security measure.

When accessing branches in this setup, the path component in the branch URL must be relative to the directory specified in the authorization line. For example, the trunk in /srv/bzr/projectx/trunk can be accessed as follows:

$ bzr info bzr+ssh://bzruser@example.com/trunk

The drawback of this setup is that you can only have one configuration line per SSH key.
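Because every collaborator needs an almost identical entry, the lines can be generated rather than typed by hand. The following is a minimal Python sketch of that idea, not part of Bazaar itself; the function name authorized_keys_entry and its parameters are our own invention for illustration. It only assembles the options described above and joins them without spaces.

```python
# Options from the article that restrict the SSH session as far as possible.
RESTRICTIONS = (
    "no-agent-forwarding",
    "no-port-forwarding",
    "no-pty",
    "no-user-rc",
    "no-X11-forwarding",
)


def authorized_keys_entry(public_key, directory, allow_writes=False):
    """Build one authorized_keys line that confines a key to `bzr serve`.

    `public_key` is the full public key text, e.g. "ssh-rsa AAAA... jack".
    This helper is hypothetical; it is not provided by Bazaar or OpenSSH.
    """
    command = ["bzr", "serve", "--inet"]
    if allow_writes:
        command.append("--allow-writes")
    command.append("--directory=" + directory)
    command_text = " ".join(command)
    if '"' in command_text:
        raise ValueError("double quotes in the directory are not handled here")
    # Options are comma-separated with no spaces outside the quoted command.
    options = ",".join(('command="%s"' % command_text,) + RESTRICTIONS)
    return options + " " + public_key.strip()


if __name__ == "__main__":
    print(authorized_keys_entry("PUBKEY_OF_JACK", "/srv/bzr/projectx", allow_writes=True))
```

Leaving allow_writes at its default of False produces the read-only variant, which matches the advice to drop the --allow-writes flag.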
Using SFTP If SFTP is enabled on the SSH server, you can access branches without installing Bazaar on the server by using the sftp:// URL prefix instead of bzr+ssh://. For example: $ bzr info sftp://user@example.com/home/mike/repos/projectx This type of access is called "dumb server" mode, because in this case Bazaar is not used on the server side, and thus it cannot provide assistance to the client. In this setup, operations will be much less efficient compared to using the smart server. Using bzr serve directly You can use the Bazaar smart server directly to listen to incoming connections and serve the branch data. Use the bzr serve command to start the smart server. By default, it listens on port 4155, and serves branch data from the current working directory in read-only mode. It has several command-line parameters and flags to change the default behavior. For example: --directory DIR: This specifies the base directory to serve the branch data from, instead of the current working directory --port PORT: This specifies the port number to listen on, instead of the default 4155 port --allow-writes: This allows write operations instead of strictly read-only Use the -h or --help flags to see the list of supported command-line parameters. Branches served in this way can be accessed by URLs in the following format: bzr://host/[path] Here, host is the hostname of the server, and path is the relative path from the base directory of the server process. For example, if the server is example.com, the smart server is running in the directory /srv/bzr/repo, and there is a Bazaar branch at the path /srv/bzr/repo/projectx/feature-123, then the branch can be accessed as follows: $ bzr info bzr://example.com/projectx/feature-123 The advantage of this setup is that the smart server provides good performance. On the other hand, it completely lacks authentication. 
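The mapping between a branch's location on disk and its bzr:// URL is easy to get wrong, so here is a small Python sketch of the rule described above: the URL path is the branch path relative to the directory being served. The function name smart_server_url is hypothetical, and the sketch assumes the server listens on the default port.

```python
import posixpath


def smart_server_url(host, base_dir, branch_path):
    """Return the bzr:// URL of a branch served from `base_dir` by bzr serve.

    Hypothetical helper for illustration; assumes the default port.
    """
    relative = posixpath.relpath(branch_path, base_dir)
    if relative == ".." or relative.startswith("../"):
        raise ValueError(
            "%s is outside the served directory %s" % (branch_path, base_dir)
        )
    # A branch located exactly at the base directory has an empty path.
    path = "" if relative == "." else relative
    return "bzr://%s/%s" % (host, path)


if __name__ == "__main__":
    print(smart_server_url("example.com", "/srv/bzr/repo",
                           "/srv/bzr/repo/projectx/feature-123"))
```

Rejecting paths outside the base directory mirrors the fact that the server only exposes what lies below the directory it was started with.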
Using bzr serve over inetd

On GNU/Linux and UNIX systems, you can configure inetd to start the bzr serve command automatically as needed, by adding a line in the inetd.conf file as follows:

4155 stream TCP nowait bzruser /usr/bin/bzr /usr/bin/bzr serve --inet --directory=/srv/bzr/repo

Here:

  • 4155 is the port number where the Bazaar server should listen for incoming connections.
  • bzruser is the user account the bzr serve process will run as.
  • /usr/bin/bzr is the absolute path of the bzr command.
  • /usr/bin/bzr serve --inet --directory=/srv/bzr/repo is the complete command to execute when starting the server. The --directory parameter is used to specify the base directory of Bazaar branches.

Once configured, this setup works in exactly the same way as using bzr serve directly, with the same advantages and disadvantages.

Creating branches on the central server

Creating branches on a server works in much the same way as creating branches locally. Here, we emphasize some good practices for optimal performance.

As with local branches, it is a good idea to create a shared repository per project to host multiple Bazaar branches. Even if you don't intend to use multiple branches at first, you might want to do that later, and it is easier to have a shared repository right from the start than to migrate an existing branch later.

Another important point is to configure the shared repository to not create working trees by default. Working trees are unnecessary on the server, because collaborators work in their local checkouts, and Bazaar may give warnings during branch operations if the central branch contains a working tree. In order to avoid confusion, it is better to completely omit working trees on the server.

Creating a shared repository without working trees

As with local branches, using a shared repository on the server is a good way to save disk space.
In addition, when pushing a new branch to the server that shares revisions with an existing branch, the shared revisions don't need to be copied, thus the push operation will be faster. When creating the shared repository, make sure to use the --no-trees flag, so that new branches will be created without trees by default. Although, most probably, you will create new branches using push operations, and most protocols don't support creating a working tree when used with push, nonetheless it is a good precaution to set up a shared repository in this way right from the start. Reconfiguring a shared repository to not use working trees You can use the bzr info command to check whether a shared repository is configured with or without working trees. For example: $ bzr info bzr+ssh://user@example.com/tmp/repo/ Shared repository with trees (format: unnamed) Location: shared repository: bzr+ssh://user@example.com/tmp/repo/ If the first line of the output says Shared repository with trees instead of simply Shared repository, then you should log in to the server and reconfigure it by using the bzr reconfigure command with the --with-no-trees flag. For example: $ cd /tmp/repo $ bzr reconfigure --with-no-trees $ bzr info Shared repository (format: 2a) Location: shared repository: . Removing an existing working tree If you already have branches on the central server with a working tree, then it is a good idea to remove them. First, check the status of the working tree by using the bzr status command. If there are any pending changes, then commit or revert them. To remove the working tree, use the bzr reconfigure command with the --branch flag. Creating branches on the server without a working tree Although you can use the bzr init and bzr branch commands directly on the server in the same way as you would do it locally, it would defeat the purpose of the centralized setup, and invite mistakes such as creating working trees by accident. 
A common way to create new branches on the server is by using a push operation from your local branch. For example: $ bzr push bzr+ssh://user@example.com/tmp/repo/branch1 Created new branch. After pushing a branch, if you would like to work on it in the centralized mode, then you can bind to the remote branch by using the :push location alias: $ bzr bind :push Practical use cases The key feature of the centralized mode is that it automatically keeps bound branches synchronized with their master branch. This opens interesting possibilities that can be useful in many situations, regardless of the workflow or the size of a team. To give you some idea here, we briefly introduce a few example use cases. Working on branches using multiple computers If you use multiple computers to work on a project, for example, a desktop and a laptop, or computers at different locations, then you probably need a way to synchronize your work done at physically different locations. Although you can synchronize branches between the two locations by using mirror operations such as bzr push and bzr pull, they are not automatic, and thus you may easily find yourself in a situation that you cannot access some changes you did on another computer, because you forgot to run bzr push before you switched off the machine, for example. Using the centralized mode can help here, because the synchronization between two branches is automatic, as it takes place at the time of each commit. You can start using the centralized mode by converting the branch you used to push to into a master branch, and binding to it with your other branches. Let's say you have two computers, computerA and computerB, they both can access a branch at some location branchX, and you work on the branch sometimes by using computerA, and at other times by using computerB. (Whether branchX is hosted on computerA or computerB or a third computer doesn't matter, the example will still hold true.) 
You can keep your work environments synchronized by using the bzr push and bzr pull operations, by adopting the following workflow on both the computers when working on branches you want to share: Pull from branchX. Work, make changes, and commit. Push to branchX. This can be tedious and error-prone; for example, if you forget to push your changes on one computer, then you might not be able to access those changes after switching to the other computer, as it may have been powered down, or be inaccessible directly over the network. Using the centralized mode would simplify the workflow to only two steps: Update from branchX. Work, make changes, and commit. Not only there is one less step to do, but since in this case branchX is automatically updated at every commit, the possibility of forgetting to run bzr push is completely eliminated. You can convert your existing setup to using centralized mode simply by binding to branchX on both the computers, and then using the update command to synchronize. Assuming that both branches have no pending changes and both have been pushed to branchX as their last operation, you can convert them by using the following commands: On computerA: $ bzr pull $ bzr bind :push On computerB: $ bzr bind :push $ bzr update After this, you can start using branchX in the centralized mode, as a cycle of the bzr update and bzr commit operations. Synchronizing backup branches An easy way to back up a branch is by pushing it to another location. For example: $ bzr push BACKUP_URL BACKUP_URL can be a path on an external disk, a path on a network share or network filesystem, or any remote URL. However, the push operation is not automatic; it must be executed manually every time you want to update the backup. Another way is to bind the branch to the backup location, effectively using it in the centralized mode. 
In this case, all commits in the bound branch will be automatically applied to its master branch too, keeping the backup up-to-date at all times. You can convert the branch to this setup, simply by binding to the push location: $ bzr bind :push Since this practically means switching to the centralized mode, it is important to have fast access to BACKUP_URL, otherwise the delay at every commit might be annoying. If you need to break out of the centralized mode, for example when the BACKUP_URL is temporarily unavailable for some reason, then simply run bzr unbind. And after BACKUP_URL becomes available again, you can bring the remote branch up-to-date with bzr push, and re-bind to it by using bzr bind without additional parameters to return to the centralized mode. Summary In this article, we explained the core principles of the centralized mode with its advantages and disadvantages. Bazaar fully supports the centralized mode by using bound branches, and we have demonstrated, with examples, how you can switch in and out of this mode at any time. We have covered a few simple ways of setting up a central server, where team members can have shared write access to branches, and a few practical use cases. The centralized mode in Bazaar is very flexible. It can be used for more than just to imitate the workflow of centralized version control systems. Essentially, it provides automatic synchronization of two branches, which can be practical in many situations, even as a part of more sophisticated distributed workflows. Resources for Article :   Further resources on this subject: Configuration and Handy Tweaks for UDK [Article] Parallel Dimensions – Branching with Git [Article] Installing and customizing Redmine [Article]

Packt
12 Aug 2013
6 min read

Motion Detection

(For more resources related to this topic, see here.) Obtaining the frame difference To begin with, we create a patch with name Frame001.pd. Put in all those elements for displaying the live webcam image in a rectangle. We use a dimen 800 600 message for the gemwin object to show the GEM window in 800 x 600 pixels. We plan to display the video image in the full size of the window. The aspect ratio of the current GEM window is now 4:3. We use a rectangle of size 5.33 x 4 (4:3 aspect ratio) to cover the whole GEM window: Now we have one single frame of the video image. To make a comparison with another frame, we have to store that frame in memory. In the following patch, you can click on the bang box to store a copy of the current video frame in the buffer. The latest video frame will compare against the stored copy, as shown in the following screenshot: The object to compare two frames is pix_diff. It is similar to the Difference layer option in Photoshop. Those pixels that are the same in both frames are black. The color areas are those with changes across the two frames. Here is what you would expect in the GEM window: To further simplify the image, we can get rid of the color and use only black and white to indicate the changes: The pix_grey object converts a color image into grey scale. The pix_threshold object will zero out the pixels (black) with color information lower than a threshold value supplied by the horizontal slider that has value between 0 and 1. Refer to the following screenshot: Note that a default slider has a value between 0 and 127. You have to change the range to 0 and 1 using the Properties window of the slider. In this case, we can obtain the information about those pixels that are different from the stored image. Detecting presence Based on the knowledge about those pixels that have changed between the stored image and the current video image, we can detect the presence of a foreground subject in front of a static background. 
Point your webcam in front of a relatively static background; click on the bang box, which is next to the Store comment, to store the background image in the pix_buffer object. Anything that appears in front of the background will be shown in the GEM window. Now we can ask the question: how can we know if there is anything present in front of the background? The answer will be in the pix_blob object: The pix_blob object calculates the centroid of an image. The centroid (http://en.wikipedia.org/wiki/Centroid) of an image is its center of mass. Imagine that you cut out the shape of the image in a cardboard. The centroid is the center of mass of that piece of cardboard. You can balance the cardboard by using one finger to hold it as the center of mass. In our example, the image is mostly a black-grey scale image. The pix_blob object finds out the center of the nonblack pixels and returns its position in the first and second outlets. The third outlet indicates the size of the nonblack pixel group. To detect the presence of a foreground subject in front of the background, the first and second number boxes connected to the corresponding pix_blob outlets will return roughly the center of the foreground subject. The third number box will tell how big that foreground subject is. If you pay attention to the changes in the three number boxes, you can guess how we will implement the way to detect presence. When you click on the store image bang button, the third number box (size) will turn zero immediately. Once you enter into the frame, in front of the background, the number increases. The bigger the portion you occupy of the frame, the larger the number is. To complete the logic, we can check whether the third number box value is greater than a predefined number. If it is, we conclude that something is present in front of the background. If it is not, there is nothing in front of the background. 
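For readers who want to see the same logic outside of a Pd patch, the chain of frame difference, greyscale conversion, thresholding, and blob measurement can be sketched in plain Python. This is only an illustration of the idea, not how GEM implements pix_diff, pix_grey, pix_threshold, or pix_blob; in particular, we assume that the blob size is the average grey value of the thresholded image and that the centroid is normalized to the range 0 to 1. Frames are represented as lists of rows of (r, g, b) tuples with values between 0 and 1.

```python
def grey(pixel):
    """Average the R, G, B channels (floats in 0..1) into one grey value."""
    return sum(pixel) / 3.0


def difference_mask(stored, current, threshold):
    """Per-pixel absolute difference of two frames, greyscaled and thresholded.

    Pixels whose grey difference is lower than `threshold` become 0 (black).
    """
    mask = []
    for row_a, row_b in zip(stored, current):
        out = []
        for a, b in zip(row_a, row_b):
            diff = grey(tuple(abs(x - y) for x, y in zip(a, b)))
            out.append(diff if diff >= threshold else 0.0)
        mask.append(out)
    return mask


def blob(mask):
    """Return (x, y, size) for the nonblack pixels of the mask.

    Assumption: x and y are the brightness-weighted centroid normalized to
    0..1, and size is the mean grey value over the whole frame.
    """
    height, width = len(mask), len(mask[0])
    total = sum(sum(row) for row in mask)
    if total == 0:
        return 0.0, 0.0, 0.0
    cx = sum(x * v for row in mask for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(mask) for v in row) / total
    return cx / width, cy / height, total / (width * height)


def presence(stored, current, threshold=0.1, min_size=0.002):
    """True when the changed area is larger than the predefined size."""
    return blob(difference_mask(stored, current, threshold))[2] > min_size
```

The default values of threshold and min_size are placeholders; as in the patch, they need to be tuned for the camera and lighting at hand.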
The following patch Frame002.pd will try to display a warning message when something is present: A comparison object > 0.002 detects the size of the grey area (blob). If it is true, it sends a value 1 to the gemhead object for the warning text to display. If it is false, it sends a value 0. We'll use a new technique to turn on/off the text. Each gemhead object can accept a toggle input to turn it on or off. A value 1 enables the rendering of that gemhead path. A value 0 disables the rendering. When you first click on the store image bang button, the third number box value drops to 0. Minor changes in the background will not trigger the text message: If there is significant change in front of the background, the size number box will have a value larger than 0.002. It thus enables the rendering of the text2d message to display the WARNING message. After you click on the Store bang box, you can drag the horizontal slider attached to the pix_threshold object. Drag it towards the right-hand side until the image in the GEM window turns completely black. It will roughly be the threshold value. Note also that we use a number in each gemhead object. It is the rendering order. The default one is 50. The larger number will be rendered after the lower number. In this case, the gemhead object for the pix_video object will render first. The gemhead object for the text2d object will render afterwards. In this case, we can guarantee that the text will always be on top of the video: Actually, you can replace the previous version with a single pix_background object. A reset message will replace the bang button to store the background image. In the following patch, it will show either the clear or warning message on the screen, depending on the presence of a subject in front of the background image: The GEM window at this moment shows only a black screen when there isn't anything in front of the background. 
For most applications, it would be better to have the live video image on screen. In the following patch, we split the video signal into two – one to the pix_background object for detection and one to the pix_texture object for display: The patch requires two pix_separator objects to separate the two video streams from pix_video, in order not to let one affect the other. Here is the background image after clicking on the reset message: The warning message shows up after the subject entered the frame, and is triggered by the comparison object > 0.005 in the patch: We have been using the pix_blob object to detect presence in front of a static background image. The pix_blob object will also return the position of the subject (blob) in front of the webcam. We are going to look into this in the next section.

Packt
12 Aug 2013
14 min read

Quick start – Creating your first Java application

Cassandra's storage architecture is designed to manage large data volumes and revolves around some important factors:

  • Decentralized systems
  • Data replication and transparency
  • Data partitioning

Decentralized systems are systems that provide maximum throughput from each node. Cassandra offers decentralization by keeping each node with an identical configuration. There is no master-slave relationship between nodes. Data is spread across nodes and each node is capable of serving read/write requests with the same efficiency.

A data center is a physical space where critical application data resides. Logically, a data center is made up of multiple racks, and each rack may contain multiple nodes.

Cassandra replication strategies

Cassandra replicates data across the nodes based on the configured replication factor. If the replication factor is 1, it means that one copy of the dataset will be available on one node only. If the replication factor is 2, it means two copies of each dataset will be made available on different nodes in the cluster. Cassandra still ensures data transparency: to an end user, data is served from one logical cluster. Cassandra offers two types of replication strategies.

Simple strategy

Simple strategy is best suited for clusters involving a single data center, where data is replicated across different nodes based on the replication factor in a clockwise direction. With a replication factor of 3, two more copies of each row will be placed on the next nodes in a clockwise direction.

Network topology strategy

Network topology strategy (NTS) is preferred when a cluster is made up of nodes spread across multiple data centers. With NTS, we can configure the number of replicas to be placed within each data center. Data colocation and no single point of failure are two important factors that we need to prioritize while configuring the replication factor and consistency level.
NTS identifies the first node based on the selected schema partitioning and then looks up for nodes in a different rack (in the same data center). In case there is no such node, data replicas will be passed on to different nodes within the same rack. In this way, data colocation can be guaranteed by keeping the replica of a dataset in the same data center (to serve read requests locally). This also minimizes the risk of network latency at the same time. NTS depends on snitch configuration for proper data replica placement across different data centers. A snitch relies upon the node IP address for grouping nodes within the network topology. Cassandra depends upon this information for routing data requests internally between nodes. The preferred snitch configurations for NTS are RackInferringSnitch and PropertyFileSnitch . We can configure snitch in cassandra.yaml (the configuration file). Data partitioning Data partitioning strategy is required for node selection of a given data read/request. Cassandra offers two types of partitioning strategies. Random partitioning Random partitioning is the recommended partitioning scheme for Cassandra. Each node is assigned a 128-bit token value ( initial_token for a node is defined in cassandra.yaml) generated by a one way hashing (MD5) algorithm. Each node is assigned an initial token value (to determine the position in a ring) and a data range is assigned to the node. If a read/write request with the token value (generated for a row key value) lies within the assigned range of nodes, then that particular node is responsible for serving that request. The following diagram is a common graphical representation of the numbers of nodes placed in a circular representation or a ring, and the data range is evenly distributed between these nodes: Ordered partitioning Ordered partitioning is useful when an application requires key distribution in a sorted manner. Here, the token value is the actual row key value. 
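The difference between the two partitioning schemes can be made concrete with a short Python sketch. It is a simplified model rather than Cassandra's actual partitioner code: we assume that a node owns the range ending at its own token, that tokens above the highest node token wrap around to the first node, and that tokens span the full 128-bit MD5 range. All function names are our own.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 digests are 128 bits wide


def random_token(row_key):
    """Token under random partitioning: the MD5 digest read as an integer."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


def ordered_token(row_key):
    """Token under ordered partitioning: the row key itself."""
    return row_key


def even_tokens(node_count):
    """Initial tokens that split the ring into equally sized ranges."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def owner(token, node_tokens):
    """Index of the node responsible for `token`.

    `node_tokens` must be sorted. Assumption: each node owns the range that
    ends at its own token, and the ring wraps around after the last node.
    """
    index = bisect.bisect_left(node_tokens, token)
    return index % len(node_tokens)


if __name__ == "__main__":
    ring = even_tokens(4)
    for key in ("alice", "bob", "carol"):
        print(key, "-> node", owner(random_token(key), ring))
```

With random_token, neighbouring row keys land on unrelated positions of the ring, which is what spreads the load evenly. With ordered_token, sorted row keys stay adjacent, which is what makes range scans possible.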
Ordered partitioning also allows you to perform range scans over row keys. However, with ordered partitioning, key distribution might be uneven and may require load balancing administration. It is certainly possible that the data for multiple column families may get unevenly distributed and the token range may vary from one node to another. Hence, it is strongly recommended not to opt for ordered partitioning unless it is really required.

Cassandra write path

Here, we will discuss how Cassandra processes a write request and stores it on disk.

As we have mentioned earlier, all nodes in Cassandra are peers and there is no master-slave configuration. Hence, when issuing a write request, a client can select any node to serve as the coordinator. The coordinator node is responsible for delegating the write request to the eligible nodes based on the cluster's partitioning strategy and replication factor. On such a node, the write is first appended to a commit log and then applied to the corresponding memtable. A memtable is an in-memory table, which serves subsequent read requests without any lookup on disk. For each column family, there is one memtable. Once a memtable is full, its data is flushed to disk asynchronously in the form of SSTables. Once all the data in a commit log segment has been flushed to disk, the segment is recycled. Periodically, Cassandra performs compaction over SSTables (sorted by row keys) and reclaims unused segments. If a node restarts (for example, after an unexpected failure), the commit log is replayed to recover any writes that had not yet been flushed.
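The order of operations on a single node (commit log first, then memtable, then a flush to an SSTable) can be sketched with a toy Python class. This is a deliberately simplified model and not Cassandra's implementation: the flush here is synchronous, the whole commit log is cleared at once instead of being recycled segment by segment, there is no compaction, and the class and attribute names are our own.

```python
class Node:
    """Toy model of one node's write path for a single column family."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # writes that have not been flushed yet
        self.memtable = {}     # in-memory table: row key -> columns
        self.sstables = []     # immutable "on-disk" tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, row_key, columns):
        # 1. Append to the commit log so the write survives a restart.
        self.commit_log.append((row_key, dict(columns)))
        # 2. Apply the write to the memtable.
        self.memtable.setdefault(row_key, {}).update(columns)
        # 3. Flush once the memtable is full (synchronous in this sketch).
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable stores its rows sorted by row key.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []   # flushed writes no longer need replaying

    def restart(self):
        """Simulate a restart: memory is lost and the commit log is replayed."""
        self.memtable = {}
        for row_key, columns in self.commit_log:
            self.memtable.setdefault(row_key, {}).update(columns)

    def read(self, row_key):
        result = {}
        for table in self.sstables:          # oldest to newest
            result.update(dict(table).get(row_key, {}))
        result.update(self.memtable.get(row_key, {}))
        return result
```

Calling restart() before a flush shows why the commit log matters: the memtable is rebuilt from it, so a write that was acknowledged but not yet flushed is not lost.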
Hands on with the Cassandra command-line interface Cassandra provides a default command-line interface that is located at: CASSANDRA_HOME/bin/cassandra-cli.sh using Linux CASSANDRA_HOME/bin/cassandra-cli.bat using Windows Before we proceed with the sample exercise, let's have a look at the Cassandra schema: Keyspace: A keyspace may contain multiple column families; similarly, a cluster (made up of multiple nodes) can contain multiple keyspaces. Column family: A column family is a collection of rows with defined column metadata. Cassandra offers different ways to define two types of column families, namely, static and dynamic column families. Static column family: A static column family contains a predefined set of columns with metadata. Please note that a predefined set of columns may exist, but the number of columns can vary across multiple rows within the column family. Dynamic column family: A dynamic column family generally defines a comparator type and validation class for all columns instead of individual column metadata. The client application is responsible for providing columns for a particular row key, which means the column names and values may differ across multiple row keys: Column: A column can be attributed as a cell, which contains a name, value, and timestamp. Super column: A super column is similar to a column and contains a name, value, and timestamp, except that a super column value may contain a collection of columns. Super columns cannot be sorted; however, subcolumns within super columns can be sorted by defining a sub comparator. Super columns do have some limitations, such as that secondary indexes over super columns are not possible. Also, it is not possible to read a particular super column without deserialization of the wrapped subcolumns. Because of such limitations, usage of super columns is highly discouraged within the Cassandra community. Using composite columns we can achieve such functionalities. 
In the next articles, we will cover composite columns in detail.

Counter column family: From version 0.8 onwards, Cassandra has supported counter columns. Counter columns are useful for applications that perform the following:

Maintain the page count for the website
Do aggregation based on a column value from another column family

A counter column is a 64-bit signed integer. To create a counter column family, we simply need to define default_validation_class as CounterColumnType. Counter columns do have some application and technical limitations:

In case of events such as disk failure, it is not possible to replay a column family containing counters without reinitializing and removing all the data
Secondary indexes over counter columns are not supported in Cassandra
Frequent insert/delete operations over the counter column in a short period of time may result in inconsistent counter values

There are still some unresolved issues (https://issues.apache.org/jira/browse/CASSANDRA-4775), and considering the preceding limitations before opting for counter columns is recommended.

You can start a Cassandra server simply by running $CASSANDRA_HOME/bin/cassandra. If started in local mode, there is only one node. Once successfully started, you should see logs on your console, as follows:

Cassandra-cli: The Cassandra distribution, by default, provides a command-line utility (cassandra-cli), which can be used for basic DDL/DML operations; you can connect to a local/remote Cassandra server instance by specifying the host and port options, as follows:

$CASSANDRA_HOME/bin/cassandra-cli -host localhost -port 9160

Performing DDL/DML operations on the column family

First, we need to create a keyspace using the create keyspace command, as follows:

The create keyspace command: This operation will create a keyspace cassandraSample with node placement strategy as SimpleStrategy and replication factor one.
By default, if you don't specify placement_strategy and strategy_options, it will opt for NetworkTopologyStrategy (NTS), where replication will be on one data center:

create keyspace cassandraSample with placement_strategy='org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1};

We can look for available keyspaces by running the following command:

show keyspaces;

This will result in the following output:

We can always update the keyspace for configurations, such as replication factor. To update the keyspace, do the following:

Modify the replication factor: You can update a keyspace to change the replication factor as well as the placement strategy. For example, to change the replication factor to 2 for cassandraSample, you simply need to execute the following command:

update keyspace cassandraSample with placement_strategy='org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:2};

Modify the placement strategy: You can change the placement strategy to NTS by executing the following command:

update keyspace cassandraSample with placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options = {datacenter1:1};

Strategy options are in the format {datacentername:number of replicas}, and there can be multiple datacenters.

After successfully creating a keyspace, and before proceeding with other DDL operations (for example, column family creation), we need to authorize a keyspace.
We will authorize to a keyspace using the following command:

use cassandraSample;

Create a column family/super column family as follows:

Use the following command to create column family users within the cassandraSample keyspace:

create column family users with key_validation_class = 'UTF8Type' and comparator = 'UTF8Type' and default_validation_class = 'UTF8Type';

To create a super column family suysers, you need to run the following command:

create column family suysers with key_validation_class = 'UTF8Type' and comparator = 'UTF8Type' and subcomparator='UTF8Type' and default_validation_class = 'UTF8Type' and column_type='Super' and column_metadata=[{column_name: name, validation_class: UTF8Type}];

key_validation_class: It defines the datatype for the row key
comparator: It defines the datatype for the column name
default_validation_class: It defines the datatype for the column value
subcomparator: It defines the datatype for subcolumns

You can create/update a column by using the set method as follows:

// create a column named "username", with a value of "user1" for row key 1
set users[1][username] = user1;
// create a column named "password", with a value of "password1" for row key 1
set users[1][password] = password1;
// create a column named "username", with a value of "user2" for row key 2
set users[2][username] = user2;
// create a column named "password", with a value of "password2" for row key 2
set users[2][password] = password2;

To fetch all the rows and columns from a column family, execute the following command:

// to list down all persisted rows within a column family.
list users;
// to fetch a row from users column family having row key value "1".
get users[1];

You can delete a column as follows:

// to delete a column "username" for row key 1;
del users[1][username];

To update the column family, do the following:

If you want to change key_validation_class from UTF8Type to BytesType and validation_class for the password column from UTF8Type to BytesType,
then type the following command:

update column family users with key_validation_class=BytesType and comparator=UTF8Type and column_metadata = [{column_name:password, validation_class:BytesType}]

To drop/truncate the column family, follow the ensuing steps:

Delete all the data rows from a column family users, as follows:

truncate users;

Drop a column family by issuing the following command:

drop column family users;

These are some basic operations that should give you a brief idea about how to create/manage the Cassandra schema.

Cassandra Query Language

Cassandra is schemaless, but CQL is useful when we need data modeling with the traditional RDBMS flavor. Two variants of CQL (2.0 and 3.0) are provided by Cassandra. We will use CQL 3.0 for a quick exercise, repeating steps similar to those we followed with the cassandra-cli interface.

The command to connect with cqlsh is as follows:

$CASSANDRA_HOME/bin/cqlsh host port cqlversion

You can connect to localhost on port 9160 by executing the following command:

$CASSANDRA_HOME/bin/cqlsh localhost 9160 -3

After successfully connecting to the command-line CQL client, you can create the keyspace as follows:

create keyspace cassandrasample with strategy_class='SimpleStrategy' and strategy_options:replication_factor=1;

Update the keyspace as follows:

alter keyspace cassandrasample with strategy_class='NetworkTopologyStrategy' and strategy_options:datacenter=1;

Before creating any column family and storing data, we need to authorize such DDL/DML operations to a keyspace (for example, cassandraSample). We can authorize to a keyspace as follows:

use cassandrasample;

We can always run the describe keyspace command to look into the column families it contains and its configuration settings. We can describe a keyspace as follows:

describe keyspace cassandrasample;

We will create a users column family with user_id as row key and username and password as columns.
To create a column family, such as users, use the following command:

create columnfamily users(user_id varchar PRIMARY KEY, username varchar, password varchar);

To store a row in the users column family for row key value 1, we will run the following CQL query (the key is quoted because user_id is declared as varchar):

insert into users(user_id,username,password) values('1','user1','password1');

To select all the data from the users column family, we need to execute the following CQL query:

select * from users;

We can delete a row as well as specific columns using the delete operation. The following commands delete a complete row and the age column from the users column family, respectively:

// delete complete row for user_id=1
delete from users where user_id='1';
// delete age column from users for row key 1.
delete age from users where user_id='1';

You can update a column family to add columns and to update or drop column metadata. Here are a few examples:

// add a new column
alter columnfamily users add age int;
// update column metadata
alter columnfamily users alter password type blob;

Truncating a column family will delete all the data belonging to the corresponding column family, whereas dropping a column family will also remove the column family definition along with the data it contains. We can truncate/drop the column family as follows:

truncate users;
drop columnfamily users;

Dropping a keyspace means instantly removing all the column families and data available within that keyspace. We can drop a keyspace using the following command:

drop keyspace cassandrasample;

By default, the CQL shell converts the column family and keyspace names to lowercase. You can ensure case sensitivity by wrapping these identifiers within " ".

Summary

This article showed how to create a Java application using Cassandra.

Packt
08 Aug 2013
15 min read

Overview of SQL Server Reporting Services 2012 Architecture, Features, and Tools

Structural design of SQL servers and SharePoint environment

Depending on the business and the resources available, the various servers may be located in distributed locations, the Web applications may be run from Web servers in a farm, and the same can be true for SharePoint servers. In this article, by the word architecture we mean the way by which the preceding elements are put together to work on a single computer. However, it is important to know that this is just one topology (an arrangement of constituent elements) and in general it can be a lot more complicated, spanning networks and reaching across boundaries.

The Report Server is the centerpiece of the Reporting Services installation. This installation can be deployed in two modes, namely, Native mode or SharePoint Integrated mode. Each mode has a separate engine and an extensible architecture. It consists of a collection of special-purpose extensions that handle authentication, data processing, rendering, and delivery operations. Once deployed in one mode, it cannot be changed to the other. It is possible to have two servers, each installed in a different mode. We have installed all the necessary elements to explore the RS 2012 features, including Power View and Data Alerts. The next diagram briefly shows the structural design of the environment used in working with the article:

Primarily, SQL Server 2012 Enterprise Edition is used, for both Native mode as well as SharePoint Integrated mode. As we see in the previous diagram, Report Server Native mode is on a named instance HI (in some places another named instance Kailua is also used). This server has the Reporting Services databases ReportServer$HI and ReportServer$HITempDB. The associated Report Server handles Jobs, Security, and Shared Schedules. The Native mode architecture described in the next section is taken from the Microsoft documentation.
The tools (SSDT, Report Builder, Report Server Configuration, and so on) connect to the Report Server. The associated SQL Server Agent takes care of the jobs, such as subscriptions, related to Native mode.

SharePoint Server 2010 is a required element, with which the Reporting Services add-in helps to create a Reporting Services Service. With the creation of the RS Service in SharePoint, three SQL Server 2012 databases (shown alongside in the diagram) are created in an instance with its Reporting Services installed in SharePoint Integrated mode. The SQL Server 2012 instance NJ is installed in this fashion. These databases are repositories for report content, including those related to Power Views and Data Alerts. The data sources (extension .rsds) used in creating Power View reports (extension .rdlx) are stored in the ReportingService_b67933dba1f14282bdf434479cbc8f8f database, and the alerting-related information is stored in the ReportingService_b67933dba1f14282bdf434479cbc8f8f_Alerting database. Not shown is an Express database that is used by the SharePoint Server for its content, administration, and so on.

RS_ADD-IN allows you to create the service. You will use the PowerShell tool to create and manage the service. In order to create Power View reports, the new feature in SSRS 2012, you start off by creating a data source in a SharePoint library. Because of the RS Service, you can enable Reporting Services features such as Report Builder, and associate BISM file extensions to support connecting to tabular models created in SSDT and deployed to an Analysis Services Server.

When Reporting Services is installed in SharePoint Integrated mode, SharePoint Web parts will be available to users that allow them to connect to RS Native mode servers and work with reports on those servers from within a SharePoint site.
Native mode

The following schematic taken from Microsoft documentation (http://msdn.microsoft.com/en-us/library/ms157231.aspx) shows the major components of a Native mode installation:

The image shows clearly the several processors that are called into play before a report is displayed. The following are the elements of this processing:

Processing extensions (data, rendering, report processing, and authentication)
Designing tools (Report Builder, Report Designer)
Display devices (browsers)
Windows components that do the scheduling and delivery through extensions (Report Server databases, a SQL Server 2012 database, which store everything connected with reports)

For the Reporting Services 2012 enabled in Native mode for this article, the following image shows the ReportServer databases and the Reporting Services Server. A similar server HI was also installed after a malware attack.

The Report Server is implemented as a Microsoft Windows service called Report Server Service.

SharePoint Integrated mode

In SharePoint mode, a Report Server must run within a SharePoint Server (even in a standalone implementation). The Report Server processing, rendering, and management are all handled by the SharePoint application server running the Reporting Services SharePoint shared service. For this to happen, SharePoint Integrated mode has to be chosen at SQL Server installation time. The access to reports and related operations in this case is from a SharePoint frontend.

The following elements are required for SharePoint mode:

SharePoint Foundation 2010 or SharePoint Server 2010
An appropriate version of the Reporting Services add-in for SharePoint products
A SharePoint application server with a Reporting Services shared service instance and at least one Reporting Services service application

The following diagram taken from Microsoft documentation illustrates the various parts of a SharePoint Integrated environment of Reporting Services.
Note that the alerting Web service and Power View need SharePoint Integration. The numbered items and their description shown next are also from the same Microsoft document. Follow the link at the beginning of this section. The architectural details presented previously were taken from Microsoft documentation.

Item number in the diagram, followed by its description:

1: Web servers or Web Frontends (WFE). The Reporting Services add-in must be installed on each Web server from which you want to utilize the Web application feature such as viewing reports or a Reporting Services management page for tasks such as managing data sources and subscriptions.
2: The add-in installs URL and SOAP endpoints for clients to communicate with application servers through the Reporting Services Proxy.
3: Application servers running a shared service. Scale-out of report processing is managed as part of the SharePoint farm and by adding the service to additional application servers.
4: You can create more than one Reporting Services service application with different configurations, including permissions, e-mail, proxy, and subscriptions.
5: Reports, data sources, and other items are stored in SharePoint content databases.
6: Reporting Services service applications create three databases for the Report Server, temp, and data alerting features. Configuration settings that apply to all SSRS service applications are stored in the RSReportserver.config file.

When you install Reporting Services in SharePoint Integrated mode, several features that you are used to in Native mode will not be available. Some of them are summarized here from the MSDN site:

URL access will work, but you will have to access the SharePoint URL and not the Native mode URL. The Native mode folder hierarchy will not work.
Custom Security extensions can be used, but you need to use the special-purpose security extension meant to be used for SharePoint Integration.
You cannot use the Reporting Services Configuration Manager (of the Native mode installation). You should use the SharePoint Central Administration shown in this section (for Reporting Services 2008 and 2008 R2).
Report Manager is not the frontend; in this case, you should use SharePoint Application pages.
You cannot use Linked Reports, My Reports, and My Subscriptions in SharePoint mode.

In SharePoint Integrated mode, you can work with Data Alerts, which is not possible in a Native mode installation. Power View is another feature that is available with SharePoint but not in Native mode. To access Power View, the browser needs Silverlight installed.

While reports with the RDL extension are supported in both modes, reports with the RDLX extension are only supported in SharePoint mode. SharePoint user token credentials, AAM Zones for internet-facing deployments, SharePoint backup and recovery, and ULS log support are only available for SharePoint mode.

For the purposes of discussion and exercises in this article, a standalone server deployment is used, as shown in the next diagram. It must be remembered that there are various other deployment topologies possible using more than one computer. For a detailed description, please follow the link http://msdn.microsoft.com/en-us/library/bb510781(v=sql.105).aspx.

The standalone deployment is the simplest, in that all the components are installed on a single computer, representative of the installation used for this article. The following diagram taken from the preceding link illustrates the elements of the standalone deployment:

Reporting Services configuration

For both modes of installation, information for Reporting Services components is stored in configuration files and the registry.
During setup the configuration files are copied to the following locations:

Native mode: C:\Program Files\Microsoft SQL Server\MSRS11.MSSQLSERVER
SharePoint Integrated mode: C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\WebServices\Reporting

Follow the link http://msdn.microsoft.com/en-us/library/ms155866.aspx for details.

Native mode

The Report Server Windows Service is an orchestrated set of applications that run in a single process using a single account with access to a single Report Server database with a set of configuration files listed here:

Stored in: RSReportServer.config
Description: Stores configuration settings for feature areas of the Report Server Service: Report Manager, the Report Server Web Service, and background processing.
Location: <Installation directory>\Reporting Services\ReportServer

Stored in: RSSrvPolicy.config
Description: Stores the code access security policies for the server extensions.
Location: <Installation directory>\Reporting Services\ReportServer

Stored in: RSMgrPolicy.config
Description: Stores the code access security policies for Report Manager.
Location: <Installation directory>\Reporting Services\ReportManager

Stored in: Web.config for the Report Server Web Service
Description: Includes only those settings that are required for ASP.NET.
Location: <Installation directory>\Reporting Services\ReportServer

Stored in: Web.config for Report Manager
Description: Includes only those settings that are required for ASP.NET.
Location: <Installation directory>\Reporting Services\ReportManager

Stored in: ReportingServicesService.exe.config
Description: Stores configuration settings that specify the trace levels and logging options for the Report Server Service.
Location: <Installation directory>\Reporting Services\ReportServer\Bin

Stored in: Registry settings
Description: Stores configuration state and other settings used to uninstall Reporting Services. If you are troubleshooting an installation or configuration problem, you can view these settings to get information about how the Report Server is configured. Do not modify these settings directly as this can invalidate your installation.
Location: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceID>\Setup and HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\Services\ReportServer

Stored in: RSReportDesigner.config
Description: Stores configuration settings for Report Designer. For more information follow the link http://msdn.microsoft.com/en-us/library/ms160346.aspx
Location: <drive>:\Program Files\Microsoft Visual Studio 10\Common7\IDE\PrivateAssemblies

Stored in: RSPreviewPolicy.config
Description: Stores the code access security policies for the server extensions used during report preview.
Location: C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE\PrivateAssembliesr

First is the RSReportServer configuration file, which can be found in the installation directory under Reporting Services. The entries in this file control the feature areas of the three components in the previous image, namely, Report Server Web Service, Report Server Service, Report Manager, and background processing. The ReportServer configuration file has several sections with which you can modify the following features:

General configuration settings
URL reservations
Authentication
Service
UI
Extensions
MapTileServerConfiguration (Microsoft Bing Maps SOAP Services that provides a tile background for map report items in the report)
Default configuration file for a Native mode Report Server
Default configuration file for a SharePoint mode Report Server

The three areas previously mentioned (Report Server Web Service, Report Server Service, and Report Manager) all run in separate application domains, and you can turn on/off elements that you may or may not need so as to improve security by reducing the surface area for attacks. Some functionality, such as memory management and process health, works for all three components.

For example, in the reporting server Kailua in this article, the service name is ReportServer$KAILUA. This service has no other dependencies.
In fact, you can access the help file for this service when you look at Windows Services in the Control Panel, as shown. In three of the tabbed pages of this window you can access contextual help.

SharePoint Integrated mode

The following table taken from Microsoft documentation describes the configuration files used in the SharePoint mode Report Server. Configuration settings are stored in SharePoint Service application databases.

Stored in: RSReportServer.config
Description: Stores configuration settings for feature areas of the Report Server Service: Report Manager, the Report Server Web Service, and background processing.
Location: <Installation directory>\Reporting Services\ReportServer

Stored in: RSSrvPolicy.config
Description: Stores the code access security policies for the server extensions.
Location: <Installation directory>\Reporting Services\ReportServer

Stored in: Web.config for the Report Server Web Service

Stored in: Registry settings
Description: Stores configuration state and other settings used to uninstall Reporting Services. Also stores information about each Reporting Services service application. Do not modify these settings directly as this can invalidate your installation.
Location: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceID>\Setup (for example, instance ID: MSSQL11.MSSQLSERVER) and HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\Reporting Services\Service Applications

Stored in: RSReportDesigner.config
Description: Stores configuration settings for Report Designer.
Location: <drive>:\Program Files\Microsoft Visual Studio 10\Common7\IDE\PrivateAssemblies

Hands-on exercise 3.1 – modifying the configuration file in Native mode

We can make changes to the rsreportserver.config file if changes are required or some tuning has to be done. For example, you may need to accommodate a different e-mail setting, change authentication, and so on. This is an XML file that can be edited in Notepad.exe (you can also use an XML Editor or Visual Studio).
You need to start Notepad with administrator privileges.

Turn on/off the Report Server Web Service

In this exercise, we will modify the configuration file to turn on/off the Report Server Web Service. Perform the following steps:

Start Notepad using Run as Administrator.
Open the file (you may use Start | Search for rsreportserver.config), which is located at C:\Program Files\Microsoft SQL Server\MSRS11.KAILUA\Reporting Services\ReportServer\rsreportserver.config.
In Edit | Find, type in IsWebServiceEnabled. There are two possible values, True and False. If you want to turn the service off, change True to False. The default is True. Here is a section of the file reproduced:

    <Service>
      <IsSchedulingService>True</IsSchedulingService>
      <IsNotificationService>True</IsNotificationService>
      <IsEventService>True</IsEventService>
      <PollingInterval>10</PollingInterval>
      <WindowsServiceUseFileShareStorage>False</WindowsServiceUseFileShareStorage>
      <MemorySafetyMargin>80</MemorySafetyMargin>
      <MemoryThreshold>90</MemoryThreshold>
      <RecycleTime>720</RecycleTime>
      <MaxAppDomainUnloadTime>30</MaxAppDomainUnloadTime>
      <MaxQueueThreads>0</MaxQueueThreads>
      <UrlRoot></UrlRoot>
      <UnattendedExecutionAccount>
        <UserName></UserName>
        <Password></Password>
        <Domain></Domain>
      </UnattendedExecutionAccount>
      <PolicyLevel>rssrvpolicy.config</PolicyLevel>
      <IsWebServiceEnabled>True</IsWebServiceEnabled>
      <IsReportManagerEnabled>True</IsReportManagerEnabled>
      <FileShareStorageLocation>
        <Path></Path>
      </FileShareStorageLocation>
    </Service>

Save the file to apply changes.

Turn on/off the scheduled events and delivery

This changes the report processing and delivery. Make changes in the rsreportserver.config file in the following section of <Service/>:

    <IsSchedulingService>True</IsSchedulingService>
    <IsNotificationService>True</IsNotificationService>
    <IsEventService>True</IsEventService>

The default value for all three is True. You can set each to False and save the file to apply changes.
In principle, this can also be carried out by modifying the Facet in SQL Server Management Studio (SSMS), but presently this is not available.

Turn on/off the Report Manager

Report Manager can be turned off or on by making changes to the configuration file. Make a change to the following section in <Service/>:

    <IsReportManagerEnabled>True</IsReportManagerEnabled>

Again, this change can be made using the Facet of the Reporting Services Server. To change this, make sure you launch SQL Server Management Studio as Administrator. The use of SSMS via Facets is described in the following section.

Hands-on exercise 3.2 – turn the Reporting Service on/off in SSMS

The following are the steps to turn the Reporting Service on/off in SSMS:

Connect to Reporting Services_KAILUA in SQL Server Management Studio as the Administrator.
Choose HODENTEKWIN7\KAILUA under Reporting Services. Click on OK.
Right-click on HODENTEKWIN7\KAILUA (Report Server 11.0.22180 – HodentekWin7\mysorian).
Click on Facets to open the following properties page.
Click on the handle, set it to True or False, and click on OK. The default is True.

It should be possible to turn Windows Integrated security on or off by using SQL Server Management Studio. However, the Reporting Services Server properties are disabled.
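The manual edits from exercise 3.1 can also be scripted instead of being made by hand in Notepad. The following is a minimal sketch, assuming the flags sit as direct children of a <Service> element as in the excerpt shown in that exercise. The function name set_service_flag and the trimmed SAMPLE document are hypothetical, and this is not a Reporting Services tool. Note that Python's ElementTree discards XML comments when parsing, so back up the real file before trying anything like this against it.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative stand-in for rsreportserver.config (assumed layout).
SAMPLE = (
    "<Configuration><Service>"
    "<IsSchedulingService>True</IsSchedulingService>"
    "<IsWebServiceEnabled>True</IsWebServiceEnabled>"
    "<IsReportManagerEnabled>True</IsReportManagerEnabled>"
    "</Service></Configuration>"
)


def set_service_flag(config_xml, flag, enabled):
    """Return config_xml with <flag> under <Service> set to True or False."""
    root = ET.fromstring(config_xml)
    # Accept either a full document or a bare <Service> fragment.
    service = root if root.tag == "Service" else root.find(".//Service")
    if service is None:
        raise ValueError("no <Service> section found")
    element = service.find(flag)
    if element is None:
        raise KeyError(flag)
    element.text = "True" if enabled else "False"
    return ET.tostring(root, encoding="unicode")


# Example: turn off the Report Server Web Service in the sample text.
updated = set_service_flag(SAMPLE, "IsWebServiceEnabled", False)
```

The same call with "IsReportManagerEnabled", "IsSchedulingService", "IsNotificationService", or "IsEventService" covers the other switches discussed above; writing the result back to disk still requires administrator privileges.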
Packt
08 Aug 2013
23 min read

Interacting with the User

Creating actions, commands, and handlers

The first few releases of the Eclipse framework provided Action as a means of contributing to menu items. These were defined declaratively via actionSets in the plugin.xml file, and many tutorials still reference those today. At the programming level, when creating views, Actions are still used to provide context menus programmatically.

They were replaced with commands in Eclipse 3, as a more abstract way of decoupling the operation of a command from its representation in the menu. To connect these two together, a handler is used.

E4: Eclipse 4.x uses the command model, and decouples it further using the @Execute annotation on the handler class. Commands and views are hooked up with entries on the application's model.

Time for action – adding context menus

A context menu can be added to the TimeZoneTableView class and populated dynamically in the view's creation. The typical pattern for Eclipse 3 applications is to create a hookContextMenu() method, which is used to wire up the context menu operation with displaying the menu. A default implementation can be seen by creating an example view, or one can be created from first principles.

Eclipse menus are managed by a MenuManager. This is a specialized subclass of a more general ContributionManager, which looks after a dynamic set of contributions that can be made from other sources. When the menu manager is connected to a control, it responds in the standard ways for the platform for showing the menu (typically a context-sensitive click or shortcut key). Menus can also be displayed in other locations, such as a view's or the workspace's coolbar (toolbar). The same MenuManager approach works in these different locations.

Open the TimeZoneTableView class and go to the createPartControl() method. At the bottom of the method, add a new MenuManager with the ID #PopupMenu and associate it with the viewer's control.
// MenuManager is in org.eclipse.jface.action; Menu is in org.eclipse.swt.widgets
MenuManager manager = new MenuManager("#PopupMenu");
Menu menu = manager.createContextMenu(tableViewer.getControl());
tableViewer.getControl().setMenu(menu);

If the Menu is empty, the MenuManager won't show any content, so this currently has no effect. To demonstrate this, an Action will be added to the Menu. An Action has text (for rendering in the pop-up menu, or the menu at the top of the screen), as well as a state (enabled/disabled, selected) and a behavior. These are typically created as subclasses with (although the Action doesn't strictly require it) an implementation of the run() method. Add this to the bottom of the createPartControl() method.

// Action is in org.eclipse.jface.action; MessageDialog is in org.eclipse.jface.dialogs
Action deprecated = new Action() {
  public void run() {
    MessageDialog.openInformation(null, "Hello", "World");
  }
};
deprecated.setText("Hello");
manager.add(deprecated);

Run the Eclipse instance, open the Time Zone Table View, and right-click on the table. The Hello menu can be seen, and when selected, an informational dialog is shown.

What just happened?

The MenuManager (with the ID #PopupMenu) was bound to the control, which means when that particular control's context-sensitive menu is invoked, the manager will be asked to display a menu. The manager is associated with a single Menu object (which is also stamped on the underlying control itself) and is responsible for updating the status of the menu.

Actions are deprecated. They are included here since examples on the Internet may still refer to them, but it's important to note that while they still work, the preferred way of building user interfaces is with commands and handlers, shown in the next section.

When the menu is shown, the actions that the menu contains are rendered in the order in which they are added. Actions are usually subclasses that implement a run() method, which performs a certain operation, and have text which is displayed. Action instances also have other metadata, such as whether they are enabled or disabled.
Although it is tempting to override the accessor methods, this doesn't work: the setters cause an event to be sent out to registered listeners, which has side effects such as updating any displayed controls.

Time for action – creating commands and handlers

Since the Action class is deprecated, the supported mechanism is to create a command, a handler, and a menu to display the command in the menu bar.

Open the plug-in manifest for the project, or double-click on the plugin.xml file.

Edit the source on the plugin.xml tab, and add a definition of a Hello command as follows:

    <extension point="org.eclipse.ui.commands">
      <command name="Hello"
        description="Says Hello World"
        id="com.packtpub.e4.clock.ui.command.hello"/>
    </extension>

This creates a command, which is just an identifier and a name. To specify what it does, it must be connected to a handler, which is done by adding the following extension:

    <extension point="org.eclipse.ui.handlers">
      <handler class="com.packtpub.e4.clock.ui.handlers.HelloHandler"
        commandId="com.packtpub.e4.clock.ui.command.hello"/>
    </extension>

The handler joins the processing of the command to a class that implements IHandler, typically by extending AbstractHandler. Create a class HelloHandler in a new com.packtpub.e4.clock.ui.handlers package, which extends AbstractHandler (from the org.eclipse.core.commands package).

    public class HelloHandler extends AbstractHandler {
      public Object execute(ExecutionEvent event) {
        MessageDialog.openInformation(null, "Hello", "World");
        return null;
      }
    }

The command's ID com.packtpub.e4.clock.ui.command.hello is used to refer to it from menus or other locations. To place the contribution in an existing menu structure, it needs to be specified by its locationURI, which is a URL that begins with menu:, such as menu:window?after=additions or menu:file?after=additions. To place it in the Help menu, add this to the plugin.xml file.
    <extension point="org.eclipse.ui.menus">
      <menuContribution allPopups="false"
        locationURI="menu:help?after=additions">
        <command commandId="com.packtpub.e4.clock.ui.command.hello"
          label="Hello"
          style="push">
        </command>
      </menuContribution>
    </extension>

Run the Eclipse instance, and there will be a Hello menu item under the Help menu. When selected, it will pop up the Hello World message. If the Hello menu is disabled, verify that the handler extension point is defined, which connects the command to the handler class.

What just happened?

The main issue with the actions framework was that it tightly coupled the state of the command with the user interface. Although an action could be used uniformly between different menu locations, the Action superclass still lives in the JFace package, which has dependencies on both SWT and other UI components. As a result, Action cannot be used in a headless environment.

Eclipse 3.x introduced the concept of commands and handlers, as a means of separating their interface from their implementation. This allows a generic command (such as Copy) to be overridden by specific views. Unlike the traditional command design pattern, which provides implementation as subclasses, the command in Eclipse 3.x uses a final class and then a retargetable IHandler to perform the actual execution.

E4: In Eclipse 4.x, the concepts of commands and handlers are used extensively to provide the components of the user interface. The key difference is in their definition; for Eclipse 3.x, this typically occurs in the plugin.xml file, whereas in E4 it is part of the application model.

In the example, a specific handler was defined for the command, which is valid in all contexts. The handler's class is the implementation; the command ID is the reference.

The org.eclipse.ui.menus extension point allows menuContributions to be added anywhere in the user interface.
To address where the menu can be contributed to, the locationURI defines where the menu item is created. The syntax for the URI is as follows:

menu: Menus begin with the menu: protocol (this can also be toolbar: or popup:)
identifier: This can be a known short name (such as file, window, and help), the global menu (org.eclipse.ui.main.menu), the global toolbar (org.eclipse.ui.main.toolbar), a view identifier (org.eclipse.ui.views.ContentOutline), or an ID explicitly defined in a pop-up menu's registerContextMenu() call.
?after (or before)=key: This is the placement instruction to put this after or before other items; typically additions is used as an extensible location for others to contribute to.

The locationURI allows plug-ins to contribute to other menus, regardless of where they are ultimately located.

Note that if the handler implements the IHandler interface directly instead of subclassing AbstractHandler, the isEnabled() method will need to be implemented to return true, as otherwise the command won't be enabled and the menu won't have any effect.

Time for action – binding commands to keys

To hook up the command to a keystroke, a binding is used. This allows a key (or series of keys) to be used to invoke the command, instead of only via the menu. Bindings are set up via the org.eclipse.ui.bindings extension point, and connect a sequence of keystrokes to a command ID.

Open the plugin.xml file in the clock.ui project.

In the plugin.xml tab, add the following:

    <extension point="org.eclipse.ui.bindings">
      <key commandId="com.packtpub.e4.clock.ui.command.hello"
        sequence="M1+9"
        contextId="org.eclipse.ui.contexts.window"
        schemeId="org.eclipse.ui.defaultAcceleratorConfiguration"/>
    </extension>

Run the Eclipse instance, and press Cmd+9 (for OS X) or Ctrl+9 (for Windows/Linux). The same Hello dialog should be displayed, as if it had been invoked from the menu. The same keystroke should be displayed in the Help menu.

What just happened?
The M1 key is the primary meta key, which is Cmd on OS X and Ctrl on Windows/Linux. This is typically used for the main operations; for example, M1+C is copy and M1+V is paste on all systems. The sequence notation M1+9 is used to indicate pressing both keys at the same time.

The command that gets invoked is referenced by its commandId. This may be defined in the same plug-in, but does not have to be; it is possible for one application to provide a set of commands and another plug-in to provide keystrokes that bind them.

It is also possible to set up a sequence of key presses; for example, M1+9 8 7 would require pressing Cmd+9 or Ctrl+9 followed by 8 and then 7 before the command is executed. This allows a set of keystrokes to be used to invoke a command; for example, it's possible to emulate an Emacs quit operation by binding Ctrl+X Ctrl+C to the quit command.

Other modifier keys include M2 (Shift), M3 (Alt/Option), and M4 (Ctrl on OS X). It is possible to use CTRL, SHIFT, or ALT as long names, but the meta names are preferred, since M1 tends to be bound to different keys on different operating systems.

The non-modifier keys themselves can either be single characters (A to Z), numbers (0 to 9), or one of a set of longer key-code names, such as F12, ARROW_UP, TAB, and PAGE_UP. Certain common variations are allowed; for example, ESC/ESCAPE, ENTER/RETURN, and so on.

Finally, bindings are associated with a scheme, which in the default case should be org.eclipse.ui.defaultAcceleratorConfiguration. Schemes exist to allow the user to switch in and out of keybindings and replace them with others, which is how tools like "vrapper" (a vi emulator) and the Emacs bindings that come with Eclipse by default can be used. (The scheme can be changed via the Window | Preferences | Keys menu in Eclipse.)

Time for action – changing contexts

The context is the location in which this binding is valid.
For commands that are visible everywhere, typically the kind of options in the default menu, the binding can be associated with the org.eclipse.ui.contexts.window context. If the command should also be invocable from dialogs, then the org.eclipse.ui.contexts.dialogAndWindow context would be used instead.

Open the plugin.xml file of the clock.ui project.

To enable the command only for Java editors, go to the plugin.xml tab, and change the contextId from org.eclipse.ui.contexts.window to org.eclipse.jdt.ui.javaEditorScope as follows:

    <extension point="org.eclipse.ui.bindings">
      <key commandId="com.packtpub.e4.clock.ui.command.hello"
        sequence="M1+9"
        contextId="org.eclipse.jdt.ui.javaEditorScope"
        schemeId="org.eclipse.ui.defaultAcceleratorConfiguration"/>
    </extension>

Run the Eclipse instance, and create a Java project, a test Java class, and an empty text file.

Open both of these in editors. When the focus is on the Java editor, the Cmd+9 or Ctrl+9 operation will run the command, but when the focus is on the text editor, the keybinding will have no effect.

Unfortunately, this also highlights the fact that disabling the keybinding outside the Java editor scope doesn't disable the underlying command, which can still be invoked from the menu.

If there is no change in behavior, try cleaning the workspace of the test instance at launch, by going to the Run | Run... menu, and choosing Clear on the workspace. This is sometimes necessary when making changes to the plugin.xml file, as some extensions are cached and may lead to strange behavior.

What just happened?

Context scopes allow bindings to be valid for certain situations, such as when a Java editor is open. This allows the same keybinding to be used for different situations, such as a Format operation, which may have a different effect in a Java editor than in an XML editor, for instance.

Since scopes are hierarchical, they can be specifically targeted for the contexts in which they may be used.
The Java editor context is a subcontext of the general text editor, which in turn is a subcontext of the window context, which in turn is a subcontext of the dialogAndWindow context.

The available contexts can be seen by editing the plugin.xml file in the plug-in editor; in the Extensions tab, the binding shows an editor window with a form. Clicking on the Browse… button next to the contextId brings up a dialog, which presents the available contexts.

It's also possible to find out all the contexts programmatically or via the running OSGi instance, by navigating to Window | Show View | Console, then using New Host OSGi Console in the drop-down menu, and then running the following command:

    osgi> pt -v org.eclipse.ui.contexts
    Extension point: org.eclipse.ui.contexts [from org.eclipse.ui]
    Extension(s):
    -------------------
    null [from org.eclipse.ant.ui]
      <context>
        name = Editing Ant Buildfiles
        description = Editing Ant Buildfiles Context
        parentId = org.eclipse.ui.textEditorScope
        id = org.eclipse.ant.ui.AntEditorScope
      </context>
    null [from org.eclipse.compare]
      <context>
        name = Comparing in an Editor
        description = Comparing in an Editor
        parentId = org.eclipse.ui.contexts.window
        id = org.eclipse.compare.compareEditorScope
      </context>

Time for action – enabling and disabling the menu's items

The previous section showed how to hide or show a specific keybinding depending on the open editor type. However, it doesn't stop the command from being called via the menu, or from showing up in the menu itself. Instead of just hiding the keybinding, the menu can be hidden as well by adding a visibleWhen block to the command.

The expressions framework provides a number of variables, including activeContexts, which contains a list of the active contexts at the time. Since many contexts can be active simultaneously, the active contexts variable is a list (for example, [dialogAndWindows, windows, textEditor, javaEditor]).
So, to find an entry (in effect, a contains operation), an iterate operator with the equals expression is used.

Open up the plugin.xml file, and update the Hello command by adding a visibleWhen expression.

    <extension point="org.eclipse.ui.menus">
      <menuContribution allPopups="false"
        locationURI="menu:help?after=additions">
        <command commandId="com.packtpub.e4.clock.ui.command.hello"
          label="Hello"
          style="push">
          <visibleWhen>
            <with variable="activeContexts">
              <iterate operator="or">
                <equals value="org.eclipse.jdt.ui.javaEditorScope"/>
              </iterate>
            </with>
          </visibleWhen>
        </command>
      </menuContribution>
    </extension>

Run the Eclipse instance, and verify that the menu is hidden until a Java editor is opened. If this behavior is not seen, run the Eclipse application with the clean argument to clear the workspace. After clearing, it will be necessary to create a new Java project with a Java class, as well as an empty text file, to verify that the menu's visibility is correct.

What just happened?

Menus have a visibleWhen guard that is evaluated when the menu is shown. If it is false, the menu is hidden.

The expressions syntax is based on nested XML elements with certain conditions. For example, an <and> block is true if all of its children are true, whereas an <or> block is true if at least one of its children is true. Variables can also be used with a property test using a combination of a <with> block (which binds the specified variable to the stack) and an <equals> block or other comparison.

In the case of variables that hold lists, an <iterate> can be used to step through the elements using either operator="or" or operator="and" to dynamically calculate enablement.

To find out if a list contains an element, a combination of <iterate> and <equals> operators is the standard pattern.
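The contains-style test described here can be pictured in plain Java. The following is a minimal sketch rather than the Eclipse expressions implementation: the class IterateSketch and its two methods are hypothetical names introduced only for illustration. It mirrors how an <iterate operator="or"> with a nested <equals> is true as soon as one element matches, while operator="and" requires every element to match. The results for an empty list (false for "or", true for "and") follow the documented default of the iterate element's ifEmpty attribute.

```java
import java.util.List;

// Hypothetical stand-in for:
//   <with variable="activeContexts"><iterate operator="..."><equals value="..."/></iterate></with>
final class IterateSketch {

    // operator="or": true if at least one element equals the expected value.
    // An empty list yields false.
    static boolean iterateOr(List<String> values, String expected) {
        return values.stream().anyMatch(expected::equals);
    }

    // operator="and": true only if every element equals the expected value.
    // An empty list yields true.
    static boolean iterateAnd(List<String> values, String expected) {
        return values.stream().allMatch(expected::equals);
    }

    public static void main(String[] args) {
        List<String> activeContexts = List.of(
            "org.eclipse.ui.contexts.window",
            "org.eclipse.jdt.ui.javaEditorScope");
        String javaScope = "org.eclipse.jdt.ui.javaEditorScope";

        System.out.println(iterateOr(activeContexts, javaScope));  // prints true
        System.out.println(iterateAnd(activeContexts, javaScope)); // prints false
    }
}
```

Using "or" is what makes the pattern behave like a contains check; with "and", the menu would only be visible if the Java editor scope were the sole active context.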
There are a number of variables that can be used in tests; they include the following:

activeContexts: List of context IDs that are active at the time
activeShell: The active shell (dialog or window)
activeWorkbenchWindow: The active window
activeEditor: The current or last active editor
activePart: The active part (editor or view)
selection: The current selection
org.eclipse.core.runtime.Platform: The Platform object

The Platform object is useful for performing dynamic tests using test, such as the following:

    <test value="ACTIVE"
      property="org.eclipse.core.runtime.bundleState"
      args="org.eclipse.core.expressions"/>
    <test
      property="org.eclipse.core.runtime.isBundleInstalled"
      args="org.eclipse.core.expressions"/>

Knowing whether a bundle is installed is often useful, but it's better to enable functionality only if a bundle is started (or, in OSGi terminology, ACTIVE). As a result, uses of isBundleInstalled have been replaced by bundleState=ACTIVE tests.

Time for action – reusing expressions

Although it's possible to copy and paste expressions between the places where they are used, it is preferable to reuse an identical expression.

Open the plugin.xml file of the clock.ui project and declare an expression using the expression definitions extension point.

    <extension point="org.eclipse.core.expressions.definitions">
      <definition id="when.hello.is.active">
        <with variable="activeContexts">
          <iterate operator="or">
            <equals value="org.eclipse.jdt.ui.javaEditorScope"/>
          </iterate>
        </with>
      </definition>
    </extension>

If defined via the extension wizard, it will prompt to add a dependency on the org.eclipse.core.expressions bundle. This isn't strictly necessary for this example to work.

To use the definition, the enablement expression needs to use the reference.
    <extension point="org.eclipse.ui.menus">
      <menuContribution allPopups="false"
        locationURI="menu:help?after=additions">
        <command commandId="com.packtpub.e4.clock.ui.command.hello"
          label="Hello"
          style="push">
          <visibleWhen>
            <reference definitionId="when.hello.is.active"/>
          </visibleWhen>
        </command>
      </menuContribution>
    </extension>

Now that the reference has been defined, it can be used to modify the handler as well, so that the handler and menu become active and visible together. Add the following to the Hello handler in the plugin.xml file:

    <extension point="org.eclipse.ui.handlers">
      <handler class="com.packtpub.e4.clock.ui.handlers.HelloHandler"
        commandId="com.packtpub.e4.clock.ui.command.hello">
        <enabledWhen>
          <reference definitionId="when.hello.is.active"/>
        </enabledWhen>
      </handler>
    </extension>

Run the Eclipse application and exactly the same behavior will occur; but should the enablement condition need to change, it can be changed in one place.

What just happened?

The org.eclipse.core.expressions.definitions extension point defined a virtual condition that is evaluated when the user's context changes, so both the menu and the handler can be made visible and enabled at the same time. The reference was bound in the enabledWhen condition for the handler, and the visibleWhen condition for the menu.

Since references can be used anywhere, expressions can also be defined in terms of other expressions. As long as the expressions aren't recursive, they can be built up in any manner.

Time for action – contributing commands to pop-up menus

It's useful to be able to add contributions to pop-up menus so that they can be used in different places. Fortunately, this can be done fairly easily with the menuContribution element and a combination of enablement tests.
This allows the Action introduced in the first part of this article to be replaced with a more generic command and handler pairing.

There is a deprecated extension point (which still works in Eclipse 4.2 today) called objectContribution, which is a single specialized hook for contributing a pop-up menu to an object. This has been deprecated for some time, but older tutorials or examples may still refer to it.

Open the TimeZoneTableView class and add the hookContextMenu() method as follows:

    private void hookContextMenu(Viewer viewer) {
      MenuManager manager = new MenuManager("#PopupMenu");
      Menu menu = manager.createContextMenu(viewer.getControl());
      viewer.getControl().setMenu(menu);
      // Register the menu so that other plug-ins can contribute to it
      getSite().registerContextMenu(manager, viewer);
    }

Add the same hookContextMenu() method to the TimeZoneTreeView class.

In the TimeZoneTreeView class, at the end of the createPartControl() method, call hookContextMenu() with that view's viewer.

In the TimeZoneTableView class, at the end of the createPartControl() method, replace the code that creates the MenuManager and the Hello action with a call to hookContextMenu() instead:

    hookContextMenu(tableViewer);

Running the Eclipse instance now and showing the menu results in nothing being displayed, because no menu items have been added to it yet.

Create a command and a handler named Show the Time.
    <extension point="org.eclipse.ui.commands">
      <command name="Show the Time"
        description="Shows the Time"
        id="com.packtpub.e4.clock.ui.command.showTheTime"/>
    </extension>
    <extension point="org.eclipse.ui.handlers">
      <handler class="com.packtpub.e4.clock.ui.handlers.ShowTheTime"
        commandId="com.packtpub.e4.clock.ui.command.showTheTime"/>
    </extension>

Create a class ShowTheTime, in the com.packtpub.e4.clock.ui.handlers package, which extends org.eclipse.core.commands.AbstractHandler, to show the time in a specific time zone.

    package com.packtpub.e4.clock.ui.handlers;

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.TimeZone;
    import org.eclipse.core.commands.AbstractHandler;
    import org.eclipse.core.commands.ExecutionEvent;
    import org.eclipse.jface.dialogs.MessageDialog;
    import org.eclipse.jface.viewers.ISelection;
    import org.eclipse.jface.viewers.IStructuredSelection;
    import org.eclipse.ui.handlers.HandlerUtil;

    public class ShowTheTime extends AbstractHandler {
      public Object execute(ExecutionEvent event) {
        // Look up the current selection in the active workbench window
        ISelection sel = HandlerUtil.getActiveWorkbenchWindow(event)
          .getSelectionService().getSelection();
        if (sel instanceof IStructuredSelection && !sel.isEmpty()) {
          Object value = ((IStructuredSelection) sel).getFirstElement();
          if (value instanceof TimeZone) {
            // Format the current time in the selected time zone
            SimpleDateFormat sdf = new SimpleDateFormat();
            sdf.setTimeZone((TimeZone) value);
            MessageDialog.openInformation(null, "The time is",
              sdf.format(new Date()));
          }
        }
        return null;
      }
    }

Finally, to hook it up, a menu needs to be added to the special locationURI popup:org.eclipse.ui.popup.any.

    <extension point="org.eclipse.ui.menus">
      <menuContribution allPopups="false"
        locationURI="popup:org.eclipse.ui.popup.any">
        <command label="Show the Time" style="push"
          commandId="com.packtpub.e4.clock.ui.command.showTheTime">
          <visibleWhen checkEnabled="false">
            <with variable="selection">
              <iterate ifEmpty="false">
                <adapt type="java.util.TimeZone"/>
              </iterate>
            </with>
          </visibleWhen>
        </command>
      </menuContribution>
    </extension>

Run the Eclipse instance, and open the Time Zone Table view or the Time Zone Tree view. Right-click on a TimeZone (that is, one of the rows of the table or one of the leaves of the tree), and the Show the Time command will be displayed. Select the command and a dialog should show the time.

What just happened?
The views and the knowledge of how to wire up commands in this article provided a unified means of adding commands, based on the selected object type. This approach of registering commands is powerful, because any time a time zone is exposed as a selection in the future, it will have a Show the Time menu added to it automatically.

The commands define a generic operation, and handlers bind those commands to implementations. The context-sensitive menu is provided by the pop-up menu extension point using the locationURI popup:org.eclipse.ui.popup.any. This allows the menu to be added to any pop-up menu that uses a MenuManager, whenever the selection contains a TimeZone. The MenuManager is responsible for listening to the mouse gestures to show a menu, and for filling it with details when it is shown.

In the example, the command was made visible when the object was an instance of a TimeZone, and also if it could be adapted to a TimeZone. This would allow another object type (say, a contact card) to have an adapter to convert it to a TimeZone, and thus show the time in that contact's location.

Have a go hero – using view menus and toolbars

The way to add a view menu is similar to adding a pop-up menu; the locationURI used is the view's ID rather than the menu item itself. Add a Show the Time menu to the TimeZone view as a view menu.

Another way of adding the menu is to add it to a toolbar, where it appears as an icon in the main Eclipse window. Add the Show the Time icon by adding it to the global toolbar instead.

To facilitate testing of views, add a menu item that allows you to show the TimeZone views with PlatformUI.getWorkbench().getActiveWorkbenchWindow().getActivePage().showView(id).

Jobs and progress

Since the user interface is single threaded, if a command takes a long time it will block the user interface from being redrawn or processed. As a result, it is necessary to run long-running operations in a background thread to prevent the UI from hanging.
Although the core Java library contains java.util.Timer, the Eclipse Jobs API provides a mechanism to both run jobs and report progress. It also allows jobs to be grouped together and paused or joined as a whole.
Packt
08 Aug 2013
26 min read

Form customizations

Forms are probably the most important visual element of the Dynamics CRM 2011 interface. To see the underlying data of any entity record, the user has to open its form. Dynamics CRM 2011 supports two types of forms:

The main form: Dynamics CRM 2011 uses this form to allow the user to enter and view data within the Dynamics CRM 2011 web user interface as well as the Dynamics CRM 2011 within Microsoft Outlook interface. One main form per entity exists by default. However, multiple main forms can be created for an entity. Dynamics CRM 2011 supports role-based forms, which means separate forms can be visible depending on the security roles of the current user. Usually, multiple main forms are created when role-based forms have to be supported.

The mobile form: Dynamics CRM 2011 uses this form when a user is accessing CRM from a mobile device that is compatible with HTML 4.0, using a URL such as <CRM_server>/m, where <CRM_server> is the path of Microsoft Dynamics CRM 2011 Server. A separate form for mobile devices is useful considering the limited space usually available on a mobile screen. A mobile form does not store data on a mobile device.

If users try to access Dynamics CRM 2011 from an unsupported browser, they will be redirected to the mobile form. The following table outlines the browsers supported by Microsoft Dynamics CRM 2011:

Browser | Version / other requirements
Internet Explorer | IE7 (only for the on-premises version), IE8, IE9, IE10 (desktop mode only)
Mozilla Firefox | Latest publicly released version running on Windows 8, Windows 7, Windows Vista, or Windows XP
Google Chrome | Latest publicly released version running on Windows 8, Windows 7, Windows Vista, or Windows XP
Apple Safari | Latest publicly released version running on Mac OS X 10.7 (Lion) or 10.8 (Mountain Lion)

Detailed information about supported browsers can be found at http://technet.microsoft.com/en-us/library/hh699710.aspx.
Dynamics CRM 2011 also supports special variants of the main form, as follows:

The read-optimized form: Introduced in Update Rollup 7, this form is designed for the fast display of a record by disabling the ribbon and form scripts. It displays the record in read-only mode. Read-optimized forms are disabled by default and can be enabled by going to System | Administration | System Settings | Customization | Form Mode. Update Rollup 12 has introduced the following changes in read-optimized forms:

The navigation pane for read-optimized forms is now enabled, and it can be expanded or collapsed.
Support for web resources has been added. A new setting in the web resource properties, called Show this Web Resources in Read Optimized form, has been added. This setting must be enabled for the web resources to display in the read-optimized form. If the web resource depends on form resources that are not available in a read-optimized form, we should not display it.

Read-optimized forms honor all field-level security and role-based form definitions. If an entity has more than one form enabled, the read-optimized form uses the form that the user last used.

The process-driven form: The December 2012 Service Update (Polaris update) of Dynamics CRM 2011 has introduced an enhanced read-optimized form, commonly known as the process-driven form, for the Account, Contact, Lead, Opportunity, and Case entities. This new type of form is very useful, especially for touch devices, as it is designed to contain everything in one form; there is no need to open multiple pop ups. However, this new form type cannot be used for any entity other than those listed above. For the Account, Contact, Lead, Opportunity, and Case entities, in addition to the information form, there will be a new form with the same name as that of the entity.
The <entity name> form will always display using the updated presentation, regardless of the settings for read-optimized forms. However, if read-optimized forms are enabled for the organization, the information form will also display using the updated presentation. These new forms are not available in an on-premises deployment of Microsoft Dynamics CRM 2011.

Form editor

We need to use a form editor to customize a form within Dynamics CRM 2011. The form layout definition is actually stored as an XML file called Form Xml in the SystemForm entity. The customization.xml file exported with an unmanaged solution contains the definition of the entity forms.

Creating and customizing an entity main form

Almost all the business entities have a customizable main form. The Activity entity does not have any form, and some entity forms, such as the Case Resolution entity form, are not customizable. When a custom entity is created, one main and one mobile form are added automatically. In this recipe, we will focus our discussion on how to customize a main form.

Getting ready

Dynamics CRM 2011 introduced a flexible layout for form design. The major visible components of a standard main form are as follows:

Ribbon: This is the top area of the form. We cannot customize this using the form editor.
Entity icon: This displays the entity's Icon for Entity Form image. It is a 32 x 32 pixel image and can be updated for an entity.
Header and footer: The header and footer are two read-only areas of the form layout. These two sections remain static when a user scrolls through the form data displayed by the various tabs and sections. So any data that must remain available to the user irrespective of scrolling can be included in these sections.
Form selector: When an entity has multiple forms and the current user's security role has access to more than one form, the form selector is displayed. The user can use the form selector to choose a form from the multiple forms available to them.
Navigation: This section allows users to navigate to related records of the current record. We can add, modify, delete, or reorganize the links to the related entity records using the form editor. We can also include links to URLs or web resources by adding navigation links using the form editor.
Form assistant: It helps when we set values for lookup fields. Dynamics CRM 2011 has introduced improved capabilities to filter data returned in the lookup dialog. Hence, the form assistant is no longer as useful, and it has been turned off for all except the following three entity forms: Case, Product, and Service activity.
Tabs and sections: Tabs and sections allow grouping and laying out of controls in a form. A tab can contain multiple sections. Each form can have a maximum of 100 tabs. Tabs have a vertical collapse/expand feature.

We will now take a look at the various form-body elements that can be added or associated with an entity form:

Field: Each field represents an attribute of the entity. A field can be added to a form using the form editor, and the form editor allows us to add the same field multiple times in a form. Each instance of a field in a form is known as a control. The appearance and behavior of a control is driven by the type and formatting options of the attribute as well as the display and formatting properties set on the control, using the form editor.
Tab and section: As previously discussed, tabs and sections are used for grouping the controls in the form. A tab can contain multiple sections within it. Each tab or section can be assigned a name. We can choose to display the name of the tab or section on the form or include a separator line at the top of the tab or section, underneath the name.
A tab can have one column or two columns; when two columns are specified, the width of each column is a percentage of the width of the tab. A section, on the other hand, may have up to four columns, and we can control the width available for control labels to be displayed in the section as well as how labels for controls in the section should be aligned.
Spacer: The Spacer element provides extra space between fields and controls in the form. This is used to improve the control layout in a section.
Sub-Grid: Sub-Grid allows us to display a list of records, charts, or both. The first four subgrids can be populated with data in a form when it loads. If more than four subgrids exist on a form, the remaining subgrids require some user or form script action to retrieve data. This is for performance optimization.
IFRAME: This control provides the HTML iFrame element in the form. Using the control, we can host another web page within the Dynamics CRM 2011 entity form. The form editor provides the ability to set regular iFrame properties along with properties specific to Dynamics CRM 2011.
Web Resource: This control displays a form-enabled web resource on the page. A form-enabled web resource includes a web page (HTML), image (JPG, PNG, GIF, ICO), or Silverlight (XAP) resource. The web resource contents are hosted within Dynamics CRM 2011.
Notes: If the entity uses notes and attachments, we can add the Notes control into the form. This control can only be added if the entity has Notes enabled in the entity definition.
Navigation Link: This control is available only within the Navigation section of the form. This control allows us to add a link to an external URL or web resource.

How to do it…

In this recipe, we will first discuss how to create a new main form and then discuss the form-customization options. The customization steps can be carried out on any main form.
The entity main form can be customized by carrying out the following tasks: Editing tabs Editing sections Editing fields Editing header and footer Adding subgrids Adding iFrames Adding web resources Editing the Navigation area Editing form properties Making the form non-customizable In this recipe, we will discuss all the previously stated tasks one after the other. Please follow these steps to customize the main form for an entity: Log in to the Dynamics CRM 2011 system as a system administrator or with a relevant security role. Navigate to Settings | Customizations | Solutions and change the view to Unmanaged Solutions , if not already selected. Then double-click on the unmanaged solution to open it. On the expanded Solution page, navigate to Components | Entities | <Entity> | Forms . The next step is to create a new main form; this can be done in two ways. We will discuss both of these here: Creating an entirely new main form : Go to New | Main Form in the actions toolbar. This will create a new form by copying the existing main form. When the new form pops up, click on the save button to save the form. Creating a new form from an existing form : Open the existing form by double-clicking on it. When the form launches, click on Save As in the top ribbon. When the Save As -- Webpage Dialog window pops up, provide data for the Name and Description fields of the new form. Finally, click on the OK button to save the new form as shown in the following screenshot: Any newly created main form will be assigned only to the system administrator and system customizer security roles by default. To customize a main form, open the form by double-clicking on it in the forms list. The next step is to discuss the editing of tabs in the form. Tabs are collapsible controls that can contain section controls. 
The following two points will demonstrate adding a new tab and editing tab properties: Adding a new tab in the form : Click on Body in the form ribbon and then click on the Insert tab in the form. In the Insert tab, under the Tab group, select One Column to create a one-column tab, or Two Columns to create a two-column tab: If we add a tab, Dynamics CRM 2011 will automatically add a section for each column. To remove any control in an entity form, use the Delete key on the keyboard. Alternatively, the Remove button in the ribbon can also be used. Editing tab properties : Select the tab control and then click on the Change Properties button in the form ribbon. The Tab Properties page will open with the following properties being modifiable: Tab property Description Under the Display tab Name The unique name of the tab. Label The display label for this tab. This text will appear on the form. Show the label of this tab on the Form This determines whether the label defined for this tab will be displayed on the form. Select this option to enable the display of the tab's label on the form. Expand this tab by default If selected, the tab control will be displayed in expanded mode by default. Visible by default If selected, the tab control will be visible by default in the form. Under the Formatting tab Select tab layout Choose between One Column and Two Columns  to define the layout of the tab. Column 1 width If the Two Columns option is selected in the tab layout, we can specify the width of column 1 as a percentage. Column 2 width If the Two Columns option is selected in the tab layout, we can specify the width of column 2 as a percentage. The Events properties   Scripts libraries can be linked to the tab. The scripts functions will be called on the TabStateChange event. Next we will see the editing of a section in a tab. A section contains fields in the form. 
The following two points demonstrate adding a section to a form and editing the section's properties:

Adding a section in the form: Select the tab control where the new section is to be added and then click on the Insert tab in the form ribbon. Then click on One Column, Two Columns, Three Columns, or Four Columns under the Section group, depending on how many columns the new section should have.

Editing section properties: Select the section control and then click on the Change Properties button in the form ribbon. The Section Properties page will open and the following properties will be modifiable:

Section property: Description

Under the Display tab
Name: The unique name of the section.
Label: The display label for this section. This text will appear on the form.
Show the label of this section on the Form: This determines whether the label defined for this section will be displayed on the form. Select this option to enable the display of the section's label on the form.
Show a line at top of the section: If selected, a divider line will be displayed underneath the name of the section.
Width: Specify the width of the label area of the fields in this section. The width must be set between 50 and 250 pixels.
Visible by default: If selected, the section control will be visible by default on the form.
Lock the section of the Form: If selected, the section will be locked on the form.

Under the Formatting tab
Layout: Choose from among One Column, Two Columns, Three Columns, and Four Columns to define the layout of the section control.
Field label alignment: Select between the Left and Right alignments for the field labels in the section control.

Next we will take a look at editing a field in the section:

Adding a field in a section: Select the section where the field has to be added. Then find the field in the Field Explorer pane on the right-hand side. By default, the Field Explorer pane displays only the fields that are not yet used in the form.
If we want to add a field that is already used in the form, uncheck the Only show unused fields checkbox as shown in the following screenshot:

After selecting the field in Field Explorer, drag it with the left mouse button held down to the intended column of the section. A red line on top of the column indicates that the column has been selected. Now drop the field on the selected column.

Editing field properties: To edit the form-level properties of the field, select the field and then click on the Change Properties button in the form ribbon. The Field Properties pop up will open and the following properties can be modified:

Field property: Description

Under the Display tab
Label: Here you can edit the display name of the field on the form. By default, the display name from the field definition is shown; it can be edited to provide a new display name for the field on this form.
Display Label on the form: This determines whether the display name of the field is to be displayed in the form.
Field is read-only: This determines whether the field is to be read-only for the users in the form.
Lock the field on the form: This determines whether the field is to be locked on the form.
Visible by default: This determines the default visibility of the control in the form.

Under the Formatting tab
Layout: This determines the width of this field on the form. The width of a field depends on the layout settings of the section it is in.

The Details properties: This tab displays the details of the field definition. Click on the Edit button to modify those properties of the field definition that can be modified.

The Event properties: Script libraries can be linked to the field. The scripts' functions will be called on the OnChange event.

If the field is of type Lookup (N:1 relationship with another entity), then there exists an additional set of properties in the Field Properties list.
These properties can be set to save the user's time, find the appropriate parent record, or to restrict the user to select among a subset of records in the parent entity. The following form-level properties of the lookup field can be edited: Property name Description Turn off automatic resolutions in the field If this setting is disabled (not selected) and if a user enters a partial value for the lookup field and tabs away, Dynamics CRM 2011 will try to autopopulate the lookup field. Disable most recently used items for this field If this setting is disabled (not selected), Dynamics CRM 2011 will automatically provide a list of recently selected values for the user to choose from. This property is not supported for process-driven forms of Microsoft Dynamics CRM 2011 Online. Related Record Filtering This setting provides a way to limit the list of records that the user can choose from. The list under the Only show records where heading displays all the potential relationships that can be used to filter this lookup. Once a record is selected, the list under the Contains  heading will display all relationships that connect the related entity (selected in the first list) to the target entity. Select the Allow users to turn off filter checkbox to provide users with the option to turn off the filter defined here. This makes it possible for them to view a wider range of records. Additional properties This setting controls how much search flexibility the user will have in terms of changing among various views and searching the record with a search box. Select the Display Search Box in lookup dialog checkbox if you want a search box to be available in the lookup. In the Default View list, select the default view for which results will be displayed in the lookup. Finally, choose the views we want users to have access to in the lookup, using the View Selector list. 
Adding a new entity field and then adding it to the form: A new field can also be created and then added to the entity from the form. To create a new field, click on the New Field button at the bottom of the Field Explorer pane. This will launch the new field pop up.

Next we will delve into editing headers and footers. To edit the header or footer of the form, click on the Header or Footer button in the form ribbon and the section will be focused automatically. Then click on Change Properties in the ribbon. The Header Properties or Footer Properties page will pop up and we can edit the following settings:

Header/footer property: Description

Under the Display tab
Width: Specify the width of the field label area here. The width must be set between 50 and 250 pixels.
Lock the section of the Form: This setting determines whether the section is locked in the form. It is selected by default and cannot be modified.

Under the Formatting tab
Layout: Here you can choose from among One Column, Two Columns, Three Columns, and Four Columns to define the layout of the header/footer control.
Field Label Alignment: Select from the Left (default), Right, or Center alignment for the field labels in the header/footer control.
Field Label Position: Select between Side (default) and Top to specify whether the field label in this section will be on the left-hand side or above the field.

Fields can be added to the header or footer controls in the same way they are added in any section control in the form.

Next we will look at how to add subgrids. The Sub-Grid control displays related entity records in the form body. To add one, use the following steps: Select the section control where the subgrid is to be added in the form. Then click on the Sub-Grid button under the Insert tab in the form ribbon.
This will bring up the List or Chart Properties page, where we can specify the following properties of a subgrid: Subgrid property Description Under the Display tab Name The unique name of the subgrid control. Label The display text of the subgrid. This text will be displayed on the form. Display label on the Form Select to confirm that the Label text will be displayed on the form. Data Source This specifies the primary data source of the subgrid. The Records list allows us to select between Only Related Records (to set only entities having a relationship to the current entity) and All Record Types (to set all available entities). We can choose the related entity from the Entity list. This list content will vary based on the earlier list's selection. The Default View list allows us to choose which view is to be displayed in the subgrid. Display Search Box Select this setting to display the search box in the subgrid. Display Index Select this setting to display the alphabetic index record selector in the subgrid. This property is not supported for process-driven forms of Microsoft Dynamics CRM 2011 Online. View Selector Select this setting to display the view selector in the subgrid. This property is not supported for process-driven forms of Microsoft Dynamics CRM 2011 Online. Chart Options Select whether to display a chart selector along with a default chart or show only a specified chart in place of the subgrid. This property is not supported for process-driven forms of Microsoft Dynamics CRM 2011 Online. Under the Formatting tab Layout Choose from among One Column, Two Columns, Three Columns, and Four Columns to define the layout of the subgrid control. Number of Rows Select the maximum number of rows to be displayed in the subgrid control. The number of rows has to be between 2 and 250. Automatically expand to use available space Select this setting to enable automatic expansion of the subgrid to use available space in the form. 
iFrames or Inline Frames are HTML documents embedded inside the Dynamics CRM entity form. The following steps will guide you through adding an iFrame in the form: Select the section control where the iFrame is to be added in the form. Then click on the IFRAME button under the Insert tab in the form ribbon. This will bring up the Add an IFRAME page, where we can specify the following properties of an iFrame: iFrame property Description Under the General tab Name The unique name of the iFrame control. URL The URL of the HTML document to be displayed in the iFrame control. Pass record object-type code and unique identifier as parameters Select this option to pass contextual information entity object-type code and the record's unique identifier to the iFrame. Read more about this in the How it works... section of this recipe. Label Here, specify the display text for the iFrame. Display label on the Form Select this setting to display the label on the form. Restrict cross-frame scripting, where supported This checkbox is selected by default. We can remove this restriction only if we are certain that the HTML document/site we are using as the target of the iFrame can be trusted. Visible by default Select this setting to make the iFrame visible by default on the form. Under the Formatting tab Layout Choose from among One Column, Two Columns, Three Columns, and Four Columns to define the layout of the iFrame control. Number of Rows Select the maximum number of rows the iFrame control occupies on the form. The number of rows has to be between 1 and 40. Automatically expand to use available space Select this setting to enable automatic expansion of the iFrame control to use the available space in the form. Scrolling Select the scrolling option for the iFrame content display. Display Border Specify whether a border for the iFrame control is to be displayed. 
Web resources represent files that can be used to extend the Microsoft Dynamics CRM 2011 web application, such as HTML files, Image files, JScript library, and Silverlight applications. The following steps can be used to add a web resource in the form: Select the section control where the web resource is to be added in the form. Then click on the Web Resource button under the Insert tab in the form ribbon. This will bring up the Add Web Resource page, where we can specify the following properties of a web resource: Web resource property Description Under the General tab Web Resource Lookup to find a form-enabled web resource. Name The unique name for the web resource. Label Specify the display text for the web resource here. Display label on the Form Select this setting to display the label on the form. Visibility by default Select this setting to make the web resource visible by default on the form. Show this web resource in Read-Optimized Form Select this setting if the web resource is to be displayed in the read-optimized form. Under the Formatting tab Layout Choose from among One Column, Two Columns, Three Columns, and Four Columns to define the layout of the web resource control. Number of Rows Select the maximum number of rows the web resource control occupies on the form. The number of rows has to be between 1 and 40. Automatically expand to use available space Select this setting to enable automatic expansion of the web resource control to use the available space in the form. Scrolling Select the scrolling option for the web resource content display. Display Border Specify here whether a border for the web resource control is to be displayed. The Dependencies properties   Select the fields from the Available fields list that are required by the web resource, and then click on the (add selected records) button to move the selected fields to the Dependent fields list. The navigation area displays entities that are related to the current entity. 
Each relationship has a Label property and in this navigation section this Label property is displayed by default. However, the display name for the related entity can be changed. This display name does not update the Label property of the relationship. In order to edit the navigation area, perform the following steps: Select the Navigation button in the form ribbon. The navigation section will be enabled. Then click on any relationship label and select Change Properties to edit the display text. This will bring up the Relationship Properties page. Modify the Label field here. Next we will edit the form properties; in order to do this, click on the Form Properties button in the form ribbon and the Form Properties page will pop up. The following properties can be edited there: Form property Description The Event properties   Add or remove the JScript libraries that will be available for the form or field events. Under the Display tab Form Name The display name for the form. Modify this to rename the form. Description Specify a description for this form here. Show navigation items Select this setting to display the page navigation in the form. The Parameters properties   Add query string parameters to be passed to the form. Click on the green plus sign to add a query string. We have to provide a Name value and select a Type value of the query string parameter. The Non- Event Dependencies properties   Select the fields from the Available fields list that are required by any external, non-event scripts, and then click on the (add selected records) button to move the selected fields to the Dependent fields list. These fields will not be removable from the form. Lastly, making a form non-customizable restricts any future customization of the form. Therefore, to make a form non-customizable, perform the following steps: Select the Managed Properties button in the form ribbon. The Managed Properties of System Form: Form web page dialog will pop up. 
In this page, mark Customizable as False.

After making any changes to an entity form, the form has to be saved and published. Use the Publish button in the form ribbon to publish the changes.

How it works…

Web resources and iFrames are not displayed in the Microsoft Dynamics CRM 2011 for Outlook reading pane, but iFrames are displayed in read-optimized forms. When the Pass record object-type code and unique identifier as parameters setting is enabled, the form passes the following contextual parameters to the page hosted in the iFrame:

Parameter name: Description
typename: The name of the entity.
type: The entity type code, an integer value that uniquely identifies an entity in a specific organization.
id: A GUID that represents a record.
orgname: The organization's name.
userlcid: The user's language code.
orglcid: The organization's language code.

The list of entity type codes can be found at http://msdn.microsoft.com/en-us/library/gg328086.aspx. The key points about entity type codes are as follows:

- Type codes below 10,000 are reserved for out-of-the-box entities.
- Custom entities will have a type code greater than or equal to 10,000.
- Custom entities' type codes might change during solution import. Hence the type code of a custom entity might be different in the development and test environments.
- The entity codes are stored in the Dynamics CRM database and can be retrieved from EntityView in the <OrganizationName>_MSCRM database.
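To make the contextual parameters concrete, here is a minimal sketch of how a page receiving them could read the query string. It is written in Ruby only for illustration; the URL, host, and all values are made up, the braces around the GUID and the lower-case `id` name are assumptions, and only the parameter names come from the list in How it works.

```ruby
require 'uri'

# Hypothetical URL as the iFrame target might receive it. The host, path and
# every value are invented; only the parameter names come from the recipe.
url = "https://example.com/frame.html?typename=account&type=1" \
      "&id=%7B11111111-2222-3333-4444-555555555555%7D" \
      "&orgname=contoso&userlcid=1033&orglcid=1033"

# Decode the query string into a plain hash of parameter name => value.
params = Hash[URI.decode_www_form(URI.parse(url).query)]

type_code     = Integer(params["type"])     # entity type code
record_id     = params["id"].delete("{}")   # GUID without surrounding braces (assumed format)
custom_entity = type_code >= 10_000         # custom entities start at 10,000

puts "entity:  #{params['typename']} (type code #{type_code})"
puts "record:  #{record_id}"
puts "custom?: #{custom_entity}"
```

Because a custom entity's type code can differ between environments, a receiving page would generally do better to branch on `typename` than on the numeric `type` value.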
Packt
08 Aug 2013
10 min read

Map Reduce

Map-reduce is a technique for taking large quantities of data and farming them out for processing. A somewhat trivial example might be: given 1 TB of HTTP log data, count the number of hits that come from each country, and report those numbers. For example, if you have the log entries:

204.12.226.2 - - [09/Jun/2013:09:12:24 -0700] "GET /who-we-are HTTP/1.0" 404 471 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
174.129.187.73 - - [09/Jun/2013:10:58:22 -0700] "GET /robots.txt HTTP/1.1" 404 452 "-" "CybEye.com/2.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/4.0; GTB6.4)"
157.55.35.37 - - [02/Jun/2013:23:31:01 -0700] "GET / HTTP/1.1" 200 483 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
206.183.1.74 - - [02/Jun/2013:18:24:35 -0700] "GET / HTTP/1.1" 200 482 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
1.202.218.21 - - [02/Jun/2013:17:38:20 -0700] "GET /robots.txt HTTP/1.1" 404 471 "-" "Mozilla/5.0 (compatible; JikeSpider; +http://shoulu.jike.com/spider.html)"

Then the answer to the question would be as follows:

US: 4
China: 1

Clearly this example dataset does not warrant distributing the data processing among multiple machines, but imagine if instead of five rows of log data we had twenty-five billion rows. If your program took a single computer half a second to process five records, it would take a little short of eighty years to process twenty-five billion records. To solve this, we could break up the data into smaller chunks and then process those smaller chunks, rejoining them when we were finished.

To apply this to a slightly larger dataset, imagine you extrapolated these five records to one hundred records and then split those one hundred records into five groups, each containing twenty records.
From those five groups we might compute the following results:

Group 1: US 5, Greece 4, Ireland 8, Canada 3
Group 2: Mexico 2, Scotland 6, Canada 9, Ireland 3
Group 3: US 15, China 2, Finland 3
Group 4: Italy 1, Greece 4, Scotland 10, US 5
Group 5: Finland 5, China 5, US 10

If we were to combine these data points by using the country name as a key and store them in a map, adding the value to any existing value, we would get the count per country across all one hundred records.

Using Ruby, we can write a simple program to do this, first without using Gearman, and then with it. To demonstrate this, we will write the following:

- A simple library that we can use in our non-distributed program and in our Gearman-enabled programs
- An example program that demonstrates using the library
- A client that uses the library to split up our data and submit jobs to our manager
- A worker that uses the library to process the job requests and return the results

The shared library

First we will develop a library that we can reuse. This shows that existing logic can be reused to take advantage of Gearman quickly, and it ensures the following things:

- The program, client, and worker are much simpler, so we can see what's going on in them
- The behavior between our program, client, and worker is guaranteed to be consistent

The shared library will have two methods, map_data and reduce_data. The map_data method will be responsible for splitting up the data into chunks to be processed, and the reduce_data method will process those chunks of data and return something that can be merged together into an accurate answer.
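As a quick check of the merge step described earlier in this section (country name as key, counts added to any existing value), here is a minimal Ruby sketch that combines the five per-group tallies. The variable names are illustrative and are not part of the library developed next.

```ruby
# Per-group counts copied from the five groups listed earlier (country => hits).
groups = [
  { "US" => 5, "Greece" => 4, "Ireland" => 8, "Canada" => 3 },
  { "Mexico" => 2, "Scotland" => 6, "Canada" => 9, "Ireland" => 3 },
  { "US" => 15, "China" => 2, "Finland" => 3 },
  { "Italy" => 1, "Greece" => 4, "Scotland" => 10, "US" => 5 },
  { "Finland" => 5, "China" => 5, "US" => 10 }
]

# Merge: the country name is the key, and each group's count is added
# to whatever total is already stored for that country.
totals = Hash.new(0)
groups.each do |group|
  group.each { |country, count| totals[country] += count }
end

record_count = totals.values.inject(0) { |sum, n| sum + n }

puts totals.inspect
puts "records counted: #{record_count}"
```

The grand total comes back to one hundred, which is a cheap sanity check that no record was lost or double-counted when the groups were merged.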
Take the following example, and save it to a file named functions.rb for later use:

    #!/usr/bin/env ruby

    # Generate sub-lists of the data
    # each sub-list has size = blocksize
    def map_data(lines, blocksize)
      blocks = []
      counter = 0
      block = []
      lines.each do |line|
        if (counter >= blocksize)
          blocks << block
          block = []
          counter = 0
        end
        block << line
        counter += 1
      end
      blocks << block if block.size > 0
      blocks
    end

    # Extract the number of times we see a unique line
    # Result is a hash with key = line, value = count
    def reduce_data(lines)
      results = {}
      lines.each do |line|
        results[line] ||= 0
        results[line] += 1
      end
      results
    end

A simple program

To use this library, we can write a very simple program that demonstrates the functionality:

    require './functions.rb'

    countries = ["china", "us", "greece", "italy"]
    lines = []
    results = {}
    (1..100).each { |i| lines << countries[i % 4] }
    blocks = map_data(lines, 20)
    blocks.each do |block|
      reduce_data(block).each do |k,v|
        results[k] ||= 0
        results[k] += v
      end
    end
    puts results.inspect

Put the contents of this example into a Ruby source file named mapreduce.rb, in the same directory as your functions.rb file, and execute it with the following:

    [user@host:$] ruby ./mapreduce.rb

This script will generate a list with one hundred elements in it. Since there are four distinct elements, each will appear 25 times, as the following output shows:

    {"us"=>25, "greece"=>25, "italy"=>25, "china"=>25}

Following in this vein, we can add in Gearman to extend our example to operate using a client that submits jobs and a single worker that will process the jobs serially to generate the same results. The reason we wrote these methods in a separate module from the driver application was to make them reusable in this fashion.

The client

The client in this example is responsible for the mapping phase: it splits the data apart and submits jobs for the blocks of data it needs processed.
In this example worker/client setup, we are using JSON as a simple way to serialize/deserialize data being sent back and forth:

    require 'rubygems'
    require 'gearman'
    require 'json'
    require './functions.rb'

    client = Gearman::Client.new('localhost:4730')
    taskset = Gearman::TaskSet.new(client)
    countries = ["china", "us", "greece", "italy"]
    jobcount = 1
    lines = []
    results = {}
    (1..100).each { |i| lines << countries[i % 4] }
    blocks = map_data(lines, 20)
    blocks.each do |block|
      # Generate a task with a unique id
      uniq = rand(36**8).to_s(36)
      task = Gearman::Task.new('count_countries', JSON.dump(block), :uniq => uniq)
      # When the task is complete, add its results into ours
      task.on_complete do |d|
        # We are passing data back and forth as JSON, so
        # decode it to a hash and then iterate over the
        # k=>v pairs
        JSON.parse(d).each do |k,v|
          results[k] ||= 0
          results[k] += v
        end
      end
      taskset.add_task(task)
      puts "Submitted job #{jobcount}"
      jobcount += 1
    end
    puts "Submitted all jobs, waiting for results."
    start_time = Time.now
    taskset.wait(100)
    time_diff = (Time.now - start_time).to_i
    puts "Took #{time_diff} seconds: #{results.inspect}"

This client uses two concepts that were not used in the introductory examples: task sets and unique identifiers. In the Ruby client, a task set is a group of tasks that are submitted together and can be waited upon collectively. To generate a task set, you construct it by giving it the client that you want to submit the task set with:

    taskset = Gearman::TaskSet.new(client)

Then you can create and add tasks to the task set:

    task = Gearman::Task.new('count_countries', JSON.dump(block), :uniq => uniq)
    taskset.add_task(task)

Finally, you tell the task set how long you want to wait for the results:

    taskset.wait(100)

This will block the program until either the timeout passes or all the tasks in the task set complete (again, complete does not necessarily mean that the worker succeeded at the task, only that it saw it through to completion).
In this example, it will wait 100 seconds for all the tasks to complete before giving up on them. This doesn't mean that the jobs won't complete if the client disconnects, just that the client won't see the end results (which may or may not be acceptable).

The worker

To complete the distributed MapReduce example, we need to implement the worker that is responsible for performing the actual data processing. The worker will perform the following tasks:

- Receive a list of countries serialized as JSON from the manager
- Decode that JSON data into a Ruby structure
- Perform the reduce operation on the data, converting the list of countries into a corresponding hash of counts
- Serialize the hash of counts as a JSON string
- Return the JSON string to the manager (to be passed on to the client)

    require 'rubygems'
    require 'gearman'
    require 'json'
    require './functions.rb'

    Gearman::Util.logger.level = Logger::DEBUG
    @servers = ['localhost:4730']
    w = Gearman::Worker.new(@servers)
    w.add_ability('count_countries') do |json_data,job|
      puts "Received: #{json_data}"
      data = JSON.parse(json_data)
      result = reduce_data(data)
      puts "Result: #{result.inspect}"
      returndata = JSON.dump(result)
      puts "Returning #{returndata}"
      sleep 4
      returndata
    end
    loop { w.work }

Notice that we have introduced a slight delay in returning the results by instructing our worker to sleep for four seconds before returning the data. This is here in order to simulate a job that takes a while to process.

To run this example, we will repeat the exercise from the first section. Save the contents of the client to a file called mapreduce_client.rb, and the contents of the worker to a file named mapreduce_worker.rb, in the same directory as the functions.rb file.
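Before involving Gearman at all, the JSON round trip between client and worker can be exercised locally. The following is a minimal sketch under stated assumptions: `handle_job` is a hypothetical helper that mirrors the body of the worker's ability block without the logging and the sleep, and `reduce_data` is repeated from functions.rb only so that the snippet runs on its own.

```ruby
require 'json'

# Same logic as reduce_data in functions.rb, repeated so this sketch is self-contained.
def reduce_data(lines)
  results = {}
  lines.each do |line|
    results[line] ||= 0
    results[line] += 1
  end
  results
end

# Hypothetical helper: decode the JSON payload, reduce it, re-encode the result.
# This is what the worker's ability block does, minus Gearman and the delay.
def handle_job(json_data)
  data = JSON.parse(json_data)
  JSON.dump(reduce_data(data))
end

block   = ["us", "greece", "italy", "china", "us"]
payload = JSON.dump(block)      # what the client would submit
reply   = handle_job(payload)   # what the worker would return
counts  = JSON.parse(reply)     # what the on_complete callback would decode

puts counts.inspect
```

If this round trip produces the expected counts, any later problem is more likely to lie in the Gearman setup than in the shared logic.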
Then, start the worker first by running the following:

    ruby mapreduce_worker.rb

And then start the client by running the following:

    ruby mapreduce_client.rb

When you run these scripts, the worker will be waiting to pick up jobs, and then the client will generate five jobs, each with a block containing a list of countries to be counted, and submit them to the manager. These jobs will be picked up by the worker and then processed, one at a time, until they are all complete. As a result, there will be a difference of about twenty seconds (five jobs at four seconds each) between when the jobs are submitted and when they are completed.

Parallelizing the pipeline

Implementing the solution this way clearly doesn't gain us much performance over the original example. In fact, it is going to be slower (even ignoring the four second sleep inside each job execution) than the original because there is time involved in serializing and deserializing the data, transmitting the data between the actors, and transmitting the results between the actors. The goal of this exercise is to demonstrate building a system that can increase the number of workers and parallelize the processing of data, which we will see in the following exercise.

To demonstrate the power of parallel processing, we can now run two copies of the worker. Simply open a new shell and execute the worker via ruby mapreduce_worker.rb; this will spin up a second copy of the worker that is ready to process jobs. Now, run the client a second time and observe the behavior. You will see that the client has completed in twelve seconds instead of twenty. Why not ten? Remember that we submitted five jobs, and each will take four seconds.
Five jobs do not get divided evenly between two workers, so one worker will acquire three jobs instead of two, which will take it an additional four seconds to complete:

    [user@host]% ruby mapreduce_client.rb
    Submitted job 1
    Submitted job 2
    Submitted job 3
    Submitted job 4
    Submitted job 5
    Submitted all jobs, waiting for results.
    Took 12 seconds: {"us"=>25, "greece"=>25, "italy"=>25, "china"=>25}

Feel free to experiment with the various parameters of the system, such as running more workers, increasing the number of records that are being processed, or adjusting the amount of time that the worker sleeps during a job. While this example does not involve processing enormous quantities of data, hopefully you can see how this can be expanded for future growth.

Summary

In this article, we have discussed the MapReduce technique. We hope it gives you a glimpse of how the book flows.
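For the experiments suggested above, a rough estimate of the client's wait time can be worked out before starting any workers. This is a sketch under simplifying assumptions: every job takes the same time, jobs are spread as evenly as possible, and serialization and network overhead are ignored. The method name `estimated_seconds` is hypothetical.

```ruby
# Rough estimate: with equal-length jobs, the busiest worker runs
# ceil(jobs / workers) jobs back to back, and the client waits for it.
def estimated_seconds(jobs, workers, seconds_per_job)
  rounds = (jobs.to_f / workers).ceil
  rounds * seconds_per_job
end

# Five jobs of four seconds each, as in the example, with varying worker counts.
[1, 2, 3, 5].each do |workers|
  puts "#{workers} worker(s): about #{estimated_seconds(5, workers, 4)} seconds"
end
```

With one worker this gives the twenty seconds seen earlier, and with two workers the twelve seconds shown in the client output. Adding a third worker only helps because it lowers the busiest worker's load from three jobs to two.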
Packt
08 Aug 2013
4 min read

Ext.NET – Understanding Direct Methods and Direct Events

(For more resources related to this topic, see here.)

How to do it...

The steps to handle events raised by different controls are as follows:

1. Open the Pack.Ext2.Examples solution.
2. Press F5 or click on the Start button to run the solution.
3. Click on the Direct Methods & Events hyperlink. This will run the example code for this recipe.
4. Familiarize yourself with the code behind and the client-side markup.

How it works...

Applying the [DirectMethod(Namespace="ExtNetExample")] attribute to the server-side method GetDateTime(int timeDiff) exposes this method to our client-side code under the namespace ExtNetExample, which we prepend to the method name when calling it on the client side. As we can see in the example code, we call this server method in the markup using the Ext.NET button btnDateTime and the code ExtNetExample.GetDateTime(3). When the call hits the server, we update the Text property of the Ext.NET control lblDateTime, which in turn updates the control on the page.

Adding Namespace="ExtNetExample" allows us to neatly group server-side methods and the JavaScript calls in our code. A good notation is CompanyName.ProjectName.BusinessDomain.MethodName. Without the Namespace property, we would access our server-side method using the default namespace of App.direct. So, to call the GetDateTime method without the Namespace property, we would use App.direct.GetDateTime(3).

We can also see how to return a response from a Direct Method to the client-side JavaScript. If a Direct Method returns a value, it is sent back to the success function defined in a configuration object. This configuration object contains a number of functions, properties, and objects. We have dealt with the two most common functions in our example, the success and failure responses. The server-side method GetCar() returns a custom object called Car.
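The shape of that configuration object can be sketched in plain JavaScript. This is not Ext.NET code: makeDirectMethodStub is a hypothetical stand-in for the proxy that Ext.NET generates, the two simulated server methods are made up, and the property names on the returned object are illustrative. It only shows how a returned value reaches the success function and an error message reaches the failure function.

```javascript
// Hypothetical stand-in for a generated Direct Method proxy.
// A real page would call the proxy that Ext.NET generates instead.
function makeDirectMethodStub(serverMethod) {
  return function (config) {
    try {
      // A returned value is handed to the success function...
      config.success(serverMethod());
    } catch (err) {
      // ...and an error is reported to the failure function as a string.
      config.failure(String(err.message));
    }
  };
}

const messages = [];

// Simulated server-side methods (illustrative only).
const getCar = makeDirectMethodStub(() => ({
  Make: "Example",
  FirstRegistered: "2013-08-08",
}));
const getBrokenCar = makeDirectMethodStub(() => {
  throw new Error("Car not found");
});

// The configuration object: one function per outcome.
const config = {
  success: (car) => messages.push("Registered: " + car.FirstRegistered),
  failure: (errorMessage) => messages.push("Failed: " + errorMessage),
};

getCar(config);
getBrokenCar(config);
console.log(messages);
```

Keeping both callbacks in one object means the caller states the success path and the failure path side by side at the call site.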
If the btnReturnResponse button is clicked and GetCar() successfully returns a response, we can access the value when Ext.NET calls CarResponseSuccess, the JavaScript function named in the success property of the configuration object. This JavaScript function accepts the response parameter from the method, and we can process it accordingly. The response parameter is serialized into JSON, so object values can be accessed using the JavaScript object notation of object.propertyValue. Note that we alert the FirstRegistered property of the returned Car object. Likewise, if a failure response is received, we call the client-side method CarResponseFailure, which alerts the response, a string value.

There are a number of other properties that form part of the configuration object and can be used as part of the callback, for example, to handle a failure to return a response. Please refer to the Direct Methods Overview on the Ext.NET examples website (http://examples.ext.net/#/Events/DirectMethods/Overview/).

To demonstrate DirectEvent in action, we've declared a button called btnFireEvent and a checkbox called chkFireEvent. Note that each control points to the same DirectEvent method, called WhoFiredMe. You'll notice that in the markup we declare the WhoFiredMe method using the OnEvent property of the controls. This means that when the Click event is fired on the btnFireEvent button or the Change event is fired on the chkFireEvent checkbox, a request is made to the server, where we call the WhoFiredMe method. From this, we can get the control that invoked the request via the object sender parameter, and the arguments of the event via the DirectEventArgs e parameter.

Note that we don't have to decorate the DirectEvent method, WhoFiredMe, with any attributes. Ext.NET takes care of all the plumbing. We just need to specify the method that needs to be called on the server.

There's more...
Raising DirectMethods is far more flexible in terms of being able to specify the parameters you want to send to the server. You also have the ability to send control objects to the server or to client-side functions using the #{controlId} notation. It is generally not a good idea, though, to send the whole control to the server from a Direct Method, as Ext.NET controls can contain references to themselves. When Ext.NET encodes such a control, it can end up in an infinite loop, and you will end up breaking your code.

With a DirectEvent method, you can send extra parameters to the server using the ExtraParams property inside the control's event element. These can then be accessed using the e parameter on the server.

Summary

In this article, we discussed how to connect client-side and server-side code.

Resources for Article:

Further resources on this subject:
Working with Microsoft Dynamics AX and .NET: Part 1 [Article]
Working with Microsoft Dynamics AX and .NET: Part 2 [Article]
Dynamically enable a control (Become an expert) [Article]
Packt
07 Aug 2013
10 min read

Setting up Node

(For more resources related to this topic, see here.)

System requirements

Node runs on POSIX-like operating systems, the various UNIX derivatives (Solaris, and so on), or workalikes (Linux, Mac OS X, and so on), as well as on Microsoft Windows, thanks to the extensive assistance from Microsoft. Indeed, many of the Node built-in functions are direct corollaries to POSIX system calls. It can run on machines both large and small, including tiny ARM devices such as the Raspberry Pi microscale embeddable computer for DIY software/hardware projects.

Node is now available via package management systems, limiting the need to compile and install from source.

Installing from source requires a C compiler (such as GCC) and Python 2.7 (or later). If you plan to use encryption in your networking code, you will also need the OpenSSL cryptographic library. The modern UNIX derivatives almost certainly come with these, and Node's configure script (see later, when we download and configure the source) will detect their presence. If you have to install them, Python is available at http://python.org and OpenSSL is available at http://openssl.org.

Installing Node using package managers

The preferred method for installing Node is now to use the versions available in package managers such as apt-get or MacPorts. Package managers simplify your life by helping to maintain the current version of the software on your computer and updating dependent packages as necessary, all by typing a simple command such as apt-get update. Let's go over this first.

Installing on Mac OS X with MacPorts

The MacPorts project (http://www.macports.org/) has for years been packaging a long list of open source software packages for Mac OS X, and they have packaged Node.
After you have installed MacPorts using the installer on their website, installing Node is pretty much this simple:

$ sudo port search nodejs
nodejs @0.10.6 (devel, net)
    Evented I/O for V8 JavaScript
nodejs-devel @0.11.2 (devel, net)
    Evented I/O for V8 JavaScript
Found 2 ports.
--
npm @1.2.21 (devel)
    node package manager
$ sudo port install nodejs npm
.. long log of downloading and installing prerequisites and Node

Installing on Mac OS X with Homebrew

Homebrew is another open source software package manager for Mac OS X, which some say is the perfect replacement for MacPorts. It is available through their home page at http://mxcl.github.com/homebrew/. After installing Homebrew using the instructions on their website, using it to install Node is as simple as this:

$ brew search node
leafnode node
$ brew install node
==> Downloading http://nodejs.org/dist/v0.10.7/node-v0.10.7.tar.gz
######################################################################## 100.0%
==> ./configure --prefix=/usr/local/Cellar/node/0.10.7
==> make install
==> Caveats
Homebrew installed npm.
We recommend prepending the following path to your PATH environment
variable to have npm-installed binaries picked up:
  /usr/local/share/npm/bin
==> Summary
/usr/local/Cellar/node/0.10.7: 870 files, 16M, built in 21.9 minutes

Installing on Linux from package management systems

While it's still premature for Linux distributions or other operating systems to prepackage Node with their OS, that doesn't mean you cannot install it using the package managers. Instructions on the Node wiki currently list packaged versions of Node for Debian, Ubuntu, OpenSUSE, and Arch Linux. See:

https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager

For example, on Debian sid (unstable):

# apt-get update
# apt-get install nodejs # Documentation is great.
And on Ubuntu:

# sudo apt-get install python-software-properties
# sudo add-apt-repository ppa:chris-lea/node.js
# sudo apt-get update
# sudo apt-get install nodejs npm

We can expect that, in due course, the Linux distros and other operating systems will routinely bundle Node into the OS as they do with other languages today.

Installing the Node distribution from nodejs.org

The nodejs.org website offers prebuilt binaries for Windows, Mac OS X, Linux, and Solaris. You simply go to the website, click on the Install button, and run the installer. For systems with package managers, such as the ones we've just discussed, it's preferable to use the package manager, because you'll find it easier to stay up-to-date with the latest version. However, on Windows this method may be preferred.

For Mac OS X, the installer is a PKG file giving the typical installation process. For Windows, the installer simply takes you through the typical install wizard process.

Once finished with the installer, you have a command-line tool with which to run Node programs.

The prepackaged installers are the simplest way to install Node on the systems for which they're available.

Installing Node on Windows using Chocolatey Gallery

Chocolatey Gallery is a package management system built on top of NuGet. Using it requires a Windows machine modern enough to support PowerShell and the .NET Framework 4.0. Once you have Chocolatey Gallery (http://chocolatey.org/), installing Node is as simple as this:

C:\> cinst install nodejs

Installing the StrongLoop Node distribution

StrongLoop (http://strongloop.com) has put together a supported version of Node that is prepackaged with several useful tools. This is a Node distribution in the same sense in which Fedora or Ubuntu are Linux distributions. StrongLoop brings together several useful packages, some of which were written by StrongLoop. StrongLoop tests the packages together and distributes installable bundles through their website.
The packages in the distribution include Express, Passport, Mongoose, Socket.IO, Engine.IO, Async, and Request. We will use all of those modules in this book.

To install, navigate to the company home page and click on the Products link. They offer downloads of precompiled packages for both RPM and Debian Linux systems, as well as Mac OS X and Windows. Simply download the appropriate bundle for your system.

For the RPM bundle, type the following:

$ sudo rpm -i bundle-file-name

For the Debian bundle, type the following:

$ sudo dpkg -i bundle-file-name

The Windows and Mac bundles are the usual sort of installable packages for each system. Simply double-click on the installer bundle and follow the instructions in the install wizard.

Once StrongLoop Node is installed, it provides not only the node and npm commands (we'll go over these in a few pages), but also the slnode command. That command offers a superset of the npm commands, such as boilerplate code for modules, web applications, or command-line applications.

Installing from source on POSIX-like systems

Installing the prepackaged Node distributions is currently the preferred installation method. However, installing Node from source is desirable in a few situations:

It could let you optimize the compiler settings as desired
It could let you cross-compile, say for an embedded ARM system
You might need to keep multiple Node builds for testing
You might be working on Node itself

Now that you have the high-level view, let's get our hands dirty mucking around in some build scripts. The general process follows the usual configure, make, and make install routine that you may already have performed with other open source software packages. If not, don't worry, we'll guide you through the process.

The official installation instructions are in the Node wiki at https://github.com/joyent/node/wiki/Installation.
Installing prerequisites

As noted earlier, there are three prerequisites: a C compiler, Python, and the OpenSSL libraries. The Node installation process checks for their presence and will fail if the C compiler or Python is not present. The specific method of installing these depends on your operating system.

These commands will check for their presence:

$ cc --version
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ python
Python 2.6.6 (r266:84292, Feb 15 2011, 01:35:25)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

Installing developer tools on Mac OS X

The developer tools (such as GCC) are an optional installation on Mac OS X. There are two ways to get those tools, both of which are free. On the OS X installation DVD is a directory labeled Optional Installs, in which there is a package installer for, among other things, the developer tools, including Xcode.

The other method is to download the latest copy of Xcode (for free) from http://developer.apple.com/xcode/.

Most other POSIX-like systems, such as Linux, include a C compiler with the base system.

Installing from source for all POSIX-like systems

First, download the source from http://nodejs.org/download. One way to do this is with your browser, and another way is as follows:

$ mkdir src
$ cd src
$ wget http://nodejs.org/dist/v0.10.7/node-v0.10.7.tar.gz
$ tar xvfz node-v0.10.7.tar.gz
$ cd node-v0.10.7

The next step is to configure the source so that it can be built. This is done with the typical sort of configure script, and you can see its long list of options by running the following:

$ ./configure --help
To cause the installation to land in your home directory, run it this way:

$ ./configure --prefix=$HOME/node/0.10.7
..output from configure

If you want to install Node in a system-wide directory, simply leave off the --prefix option, and it will default to installing in /usr/local.

After a moment it will stop, most likely having configured the source tree for installation in your chosen directory. If this doesn't succeed, it will print a message about something that needs to be fixed. Once the configure script is satisfied, you can go on to the next step.

With the configure script satisfied, compile the software:

$ make
.. a long log of compiler output is printed
$ make install

If you are installing into a system-wide directory, do the last step this way instead:

$ make
$ sudo make install

Once installed, you should make sure to add the installation directory to your PATH variable as follows:

$ echo 'export PATH=$HOME/node/0.10.7/bin:${PATH}' >>~/.bashrc
$ . ~/.bashrc

For csh users, use this syntax to make an exported environment variable:

$ echo 'setenv PATH $HOME/node/0.10.7/bin:${PATH}' >>~/.cshrc
$ source ~/.cshrc

This should result in some directories like this:

$ ls ~/node/0.10.7/
bin include lib share
$ ls ~/node/0.10.7/bin
node node-waf npm

Maintaining multiple Node installs simultaneously

Normally you won't have multiple versions of Node installed, and doing so adds complexity to your system. But if you are hacking on Node itself, or are testing against different Node releases, or are in any of several similar situations, you may want to have multiple Node installations. The method to do so is a simple variation on what we've already discussed.

As you may have noticed in the instructions discussed earlier, the --prefix option was used in a way that directly supports installing several Node versions side-by-side in the same directory:

$ ./configure --prefix=$HOME/node/0.10.7

And:

$ ./configure --prefix=/usr/local/node/0.10.7

This initial step determines the install directory.
Clearly, when Version 0.10.7, Version 0.12.15, or whichever version is released, you can change the install prefix to have the new version installed side-by-side with the previous versions.

Switching between Node versions is simply a matter of changing the PATH variable (on POSIX systems), as follows:

$ export PATH=/usr/local/node/0.10.7/bin:${PATH}

It starts to be a little tedious to maintain this after a while. For each release, you have to set up Node, npm, and any third-party modules you desire in your Node install; also, the command shown to change your PATH is not quite optimal. Inventive programmers have created several version managers to make this easier by automatically setting up not only Node but npm also, and providing commands to change your PATH the smart way:

Node version manager: https://github.com/visionmedia/n
Nodefront, aids in rapid frontend development: http://karthikv.github.io/nodefront/
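To illustrate why repeatedly prepending to PATH is not quite optimal, here is a minimal sketch of a cleaner switch, written as a small Node script. It is not how any of the version managers above work; the function names switchNodeVersion and isVersionBin are made up for this example, and it assumes the <root>/<version>/bin layout used in the configure commands earlier. The idea is to drop any previously selected version from PATH before prepending the new one, so stale entries do not pile up.

```javascript
// True if `dir` looks like <prefix><version>/bin (assumed install layout).
function isVersionBin(dir, prefix) {
  if (!dir.startsWith(prefix)) return false;
  const parts = dir.slice(prefix.length).split("/");
  return parts.length === 2 && parts[0] !== "" && parts[1] === "bin";
}

// Rebuild a POSIX PATH so exactly one Node version under `root` is active.
function switchNodeVersion(pathValue, root, version) {
  const prefix = root.replace(/\/+$/, "") + "/";
  const kept = pathValue
    .split(":")
    .filter((dir) => dir !== "")
    // Remove entries for any previously selected version.
    .filter((dir) => !isVersionBin(dir, prefix));
  return [prefix + version + "/bin", ...kept].join(":");
}

// Example with a made-up starting PATH that already selects an older version.
const before = "/usr/local/node/0.10.6/bin:/usr/bin:/bin";
const after = switchNodeVersion(before, "/usr/local/node", "0.10.7");
console.log(after);
```

A shell wrapper could assign the printed value back to PATH. Running the function a second time with the same version leaves the value unchanged, which the plain export command shown earlier does not guarantee.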