Leveraging Python in the World of Big Data

Packt
07 Sep 2015
26 min read
We are generating more and more data every day. We have generated more data in this century than in all of the previous century, and we are only 15 years into it. Big data is the new buzzword, and everyone is talking about it. It brings new possibilities: Google Translate can translate any language thanks to big data, we have been able to decode the human genome because of it, and we can predict the failure of a turbine and perform the required maintenance on it because of big data.

There are three Vs of big data, and they are defined as follows:

Volume: This defines the size of the data. Facebook has petabytes of data on its users.
Velocity: This is the rate at which data is generated.
Variety: Data is not only in a tabular form. We can get data from text, images, and sound, and data also comes in forms such as JSON and XML.

In this article by Samir Madhavan, author of Mastering Python for Data Science, we'll learn how to use Python in the world of big data by doing the following:

Understanding Hadoop
Writing a MapReduce program in Python
Using a Hadoop library

What is Hadoop?

According to the Apache Hadoop website, Hadoop stores data in a distributed manner and helps in computing it. It has been designed to scale easily to any number of machines, pooling their computing power and storage. Hadoop was created by Doug Cutting and Mike Cafarella in 2005, and was named after Doug Cutting's son's toy elephant.

The programming model

Hadoop is a programming paradigm that expresses a large distributed computation as a sequence of distributed operations on large datasets of key-value pairs. The MapReduce framework makes use of a cluster of machines and executes MapReduce jobs across them. There are two phases in MapReduce: a map phase and a reduce phase. The input data to MapReduce is a set of key-value pairs.

During the map phase, Hadoop splits the data into smaller pieces, which are fed to the mappers. These mappers are distributed across the machines within the cluster. Each mapper takes the input key-value pairs and generates intermediate key-value pairs by invoking a user-defined function on them. After the map phase, Hadoop sorts the intermediate dataset by key and generates a set of key-value tuples so that all the values belonging to a particular key are together.

During the reduce phase, the reducer takes in the intermediate key-value pairs and invokes a user-defined function, which then generates the output key-value pairs. Hadoop distributes the reducers across the machines and assigns a set of key-value pairs to each of them.

(Figure: Data processing through MapReduce)

The MapReduce architecture

MapReduce has a master-slave architecture, where the master is the JobTracker and the TaskTrackers are the slaves. When a MapReduce program is submitted to Hadoop, the JobTracker assigns the map/reduce tasks to the TaskTrackers, which take care of executing the program.

The Hadoop DFS

Hadoop's distributed filesystem (HDFS) has been designed to store very large datasets in a distributed manner. It was inspired by the Google File System, a proprietary distributed filesystem designed by Google. The data in HDFS is stored in a sequence of blocks, and all blocks are the same size except for the last one. The block size is configurable in Hadoop.
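Before moving on to the HDFS architecture, here is a minimal, purely illustrative in-memory sketch of the map, sort/shuffle, and reduce flow described above. It is not from the original article and does not touch Hadoop; the sample lines and function names are our own.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Emit an intermediate (word, 1) pair for every word in the line
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    # Combine all values that share the same key
    return (word, sum(counts))

lines = ["dolly dolly max", "max jack tim max"]

# Map phase: every line produces intermediate key-value pairs
intermediate = [pair for line in lines for pair in mapper(line)]

# Shuffle/sort phase: group the pairs by key, as Hadoop does between phases
intermediate.sort(key=itemgetter(0))

# Reduce phase: one reducer call per key
results = [reducer(word, (c for _, c in pairs))
           for word, pairs in groupby(intermediate, key=itemgetter(0))]
print(results)  # [('dolly', 2), ('jack', 1), ('max', 3), ('tim', 1)]
```

Hadoop streaming works the same way, except that the mapper and reducer are separate programs that exchange these key-value pairs as tab-separated lines on standard input and output.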
Hadoop's DFS architecture

HDFS also has a master/slave architecture, where the NameNode is the master machine and the DataNodes are the slave machines. The actual data is stored in the DataNodes. The NameNode keeps tabs on where particular pieces of data are stored and whether they have the required replication. It also helps in managing the filesystem by creating, deleting, and moving directories and files.

Python MapReduce

Hadoop can be downloaded and installed from https://hadoop.apache.org/. We'll be using the Hadoop streaming API to execute our Python MapReduce programs in Hadoop. The Hadoop streaming API lets any program that reads from standard input and writes to standard output be used as a MapReduce program. We'll be writing three MapReduce programs in Python; they are as follows:

A basic word count
Getting the sentiment score of each review
Getting the overall sentiment score from all the reviews

The basic word count

We'll start with the word count MapReduce. Save the following code in a word_mapper.py file:

    import sys

    for l in sys.stdin:
        # Trailing and leading white space is removed
        l = l.strip()
        # The line is split into words
        word_tokens = l.split()
        # A key-value pair is output for each word
        for w in word_tokens:
            print '%s\t%s' % (w, 1)

In the preceding mapper code, each line of the file is stripped of leading and trailing white space. The line is then divided into word tokens, and each token is output as a key-value pair with a count of 1.

Save the following code in a word_reducer.py file:

    from operator import itemgetter
    import sys

    current_word_token = None
    counter = 0
    word = None

    # STDIN input
    for l in sys.stdin:
        # Trailing and leading white space is removed
        l = l.strip()
        # The input from the mapper is parsed
        word_token, counter = l.split('\t', 1)
        # The count is converted to int
        try:
            counter = int(counter)
        except ValueError:
            # If the count is not a number, ignore the line
            continue
        # Since Hadoop sorts the mapper output by key, the following
        # if/else statement works
        if current_word_token == word_token:
            current_counter += counter
        else:
            if current_word_token:
                print '%s\t%s' % (current_word_token, current_counter)
            current_counter = counter
            current_word_token = word_token

    # The last word is output
    if current_word_token == word_token:
        print '%s\t%s' % (current_word_token, current_counter)

In the preceding code, we use the current_word_token variable to keep track of the word that is currently being counted. In the for loop, we use the word_token variable and a counter to get the value out of each key-value pair, and we then convert the counter to an int. In the if/else statement, if the word_token value is the same as the previous one (current_word_token), we keep counting; otherwise, we output the previous word and its count and start counting the new word. The last if statement outputs the final word.
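As an aside, because Hadoop guarantees that the reducer sees its input sorted by key, the same reducer logic can also be written with itertools.groupby. The following is an alternative sketch, not the version from the article; it assumes well-formed "word<TAB>count" lines.

```python
import sys
from itertools import groupby

def parse(stream):
    # Parse the "word\tcount" lines emitted by word_mapper.py
    for line in stream:
        word, count = line.strip().split('\t', 1)
        yield word, int(count)

# Hadoop streaming sorts the mapper output by key before the reduce phase,
# so consecutive lines that share a word can be grouped and summed directly.
for word, pairs in groupby(parse(sys.stdin), key=lambda kv: kv[0]):
    print('%s\t%d' % (word, sum(count for _, count in pairs)))
```

Either version can be dropped into the same sort pipeline shown next.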
We can check whether the mapper is working correctly by using the following command:

    $ echo 'dolly dolly max max jack tim max' | ./BigData/word_mapper.py

The output of the preceding command is shown as follows:

    dolly   1
    dolly   1
    max     1
    max     1
    jack    1
    tim     1
    max     1

Now, we can check whether the reducer is also working correctly by piping the sorted mapper output into the reducer:

    $ echo "dolly dolly max max jack tim max" | ./BigData/word_mapper.py | sort -k1,1 | ./BigData/word_reducer.py

The output of the preceding command is shown as follows:

    dolly   2
    jack    1
    max     3
    tim     1

Now, let's apply the same code to a local file containing a summary of Moby Dick:

    $ cat ./Data/mobydick_summary.txt | ./BigData/word_mapper.py | sort -k1,1 | ./BigData/word_reducer.py

The output of the preceding command is shown as follows:

    a           28
    A           2
    abilities   1
    aboard      3
    about       2

A sentiment score for each review

We'll extend this by writing a MapReduce program to determine the sentiment score of each review. Write the following code in the senti_mapper.py file:

    import sys
    import re

    positive_words = open('positive-words.txt').read().split('\n')
    negative_words = open('negative-words.txt').read().split('\n')

    def sentiment_score(text, pos_list, neg_list):
        positive_score = 0
        negative_score = 0
        for w in text.split():
            if w in pos_list:
                positive_score += 1
            if w in neg_list:
                negative_score += 1
        return positive_score - negative_score

    for l in sys.stdin:
        # Trailing and leading white space is removed
        l = l.strip()
        # Convert to lower case
        l = l.lower()
        # Get the sentiment score
        score = sentiment_score(l, positive_words, negative_words)
        # A key-value pair is output
        print '%s\t%s' % (l, score)

In the preceding code, the sentiment_score function is designed to return the sentiment score of a piece of text. For each line, we strip the leading and trailing white space, get the sentiment score of the review, and finally output the sentence and its score. For this program, we don't require a reducer, as we can calculate the sentiment in the mapper itself and simply output the score.

Let's test locally whether the mapper is working correctly with a file containing the reviews of Jurassic World:

    $ cat ./Data/jurassic_world_review.txt | ./BigData/senti_mapper.py
    there is plenty here to divert, but little to leave you enraptored. such is the fate of the sequel: bigger. louder. fewer teeth.  0
    if you limit your expectations for jurassic world to "more teeth," it will deliver on that promise. if you dare to hope for anything more-relatable characters, narrative coherence-you'll only set yourself up for disappointment.  -1
    there's a problem when the most complex character in a film is the dinosaur  -2
    not so much another bloated sequel as it is the fruition of dreams deferred in the previous films. too bad the genre dictates that those dreams are once again destined for disaster.  -2

We can see that our program is able to calculate the sentiment scores correctly.
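One optional tweak, not in the original code: the positive and negative word lists are plain Python lists, so every membership test scans the whole list. Converting them to sets keeps each lookup constant-time, which helps when scoring many long reviews. A hedged sketch, where the toy word sets and the load_words helper are our own:

```python
def load_words(path):
    # One word per line, blank lines dropped; the file layout is an assumption
    with open(path) as handle:
        return set(line.strip() for line in handle if line.strip())

def sentiment_score(text, pos_set, neg_set):
    words = text.lower().split()
    # Set membership is O(1), so large lexicons stay cheap to query
    return sum(w in pos_set for w in words) - sum(w in neg_set for w in words)

print(sentiment_score("a perfectly fine movie with a tragic problem",
                      {"fine", "perfectly"}, {"tragic", "problem"}))  # prints 0
```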
The overall sentiment score

To calculate the overall sentiment score, we require a reducer, and we'll use the same mapper with slight modifications. Here is the mapper code that we'll use, stored in the overall_senti_mapper.py file:

    import sys
    import hashlib

    positive_words = open('./Data/positive-words.txt').read().split('\n')
    negative_words = open('./Data/negative-words.txt').read().split('\n')

    def sentiment_score(text, pos_list, neg_list):
        positive_score = 0
        negative_score = 0
        for w in text.split():
            if w in pos_list:
                positive_score += 1
            if w in neg_list:
                negative_score += 1
        return positive_score - negative_score

    for l in sys.stdin:
        # Trailing and leading white space is removed
        l = l.strip()
        # Convert to lower case
        l = l.lower()
        # Get the sentiment score
        score = sentiment_score(l, positive_words, negative_words)
        # Hash the review to use it as the key
        hash_object = hashlib.md5(l)
        # A key-value pair is output
        print '%s\t%s' % (hash_object.hexdigest(), score)

This mapper code is similar to the previous mapper, but here we use the MD5 hash of the review as the output key.

Here is the reducer code that is used to determine the overall sentiment score of the movie. Store the following code in the overall_senti_reducer.py file:

    from operator import itemgetter
    import sys

    total_score = 0

    # STDIN input
    for l in sys.stdin:
        # The input from the mapper is parsed
        key, score = l.split('\t', 1)
        # The score is converted to int
        try:
            score = int(score)
        except ValueError:
            # If the score is not a number, ignore the line
            continue
        # Update the total score
        total_score += score

    print '%s' % (total_score,)

In the preceding code, we strip out the value containing the score and keep adding it to the total_score variable. Finally, we output the total_score variable, which shows the overall sentiment of the movie.

Let's locally test the overall sentiment on Jurassic World, which was well received, and then on Unfinished Business, which was critically deemed poor:

    $ cat ./Data/jurassic_world_review.txt | ./BigData/overall_senti_mapper.py | sort -k1,1 | ./BigData/overall_senti_reducer.py
    19

    $ cat ./Data/unfinished_business_review.txt | ./BigData/overall_senti_mapper.py | sort -k1,1 | ./BigData/overall_senti_reducer.py
    -8

We can see that our code is working well: Jurassic World gets a positive score, which means that people liked it a lot, while Unfinished Business gets a negative value, which shows that people didn't like it much.

Deploying the MapReduce code on Hadoop

We'll create directories for the Moby Dick, Jurassic World, and Unfinished Business data in the HDFS tmp folder:

    $ hadoop fs -mkdir /tmp/moby_dick
    $ hadoop fs -mkdir /tmp/jurassic_world
    $ hadoop fs -mkdir /tmp/unfinished_business

Let's check whether the folders have been created:

    $ hadoop fs -ls /tmp/
    Found 6 items
    drwxrwxrwx   - mapred hadoop   0 2014-11-14 15:42 /tmp/hadoop-mapred
    drwxr-xr-x   - samzer hadoop   0 2015-06-18 18:31 /tmp/jurassic_world
    drwxrwxrwx   - hdfs   hadoop   0 2014-11-14 15:41 /tmp/mapred
    drwxr-xr-x   - samzer hadoop   0 2015-06-18 18:31 /tmp/moby_dick
    drwxr-xr-x   - samzer hadoop   0 2015-06-16 18:17 /tmp/temp635459726
    drwxr-xr-x   - samzer hadoop   0 2015-06-18 18:31 /tmp/unfinished_business

Once the folders are created, let's copy the data files to the respective folders.
    $ hadoop fs -copyFromLocal ./Data/mobydick_summary.txt /tmp/moby_dick
    $ hadoop fs -copyFromLocal ./Data/jurassic_world_review.txt /tmp/jurassic_world
    $ hadoop fs -copyFromLocal ./Data/unfinished_business_review.txt /tmp/unfinished_business

Let's verify that the files have been copied:

    $ hadoop fs -ls /tmp/moby_dick
    $ hadoop fs -ls /tmp/jurassic_world
    $ hadoop fs -ls /tmp/unfinished_business
    Found 1 items
    -rw-r--r-- 3 samzer hadoop 5973 2015-06-18 18:34 /tmp/moby_dick/mobydick_summary.txt
    Found 1 items
    -rw-r--r-- 3 samzer hadoop 3185 2015-06-18 18:34 /tmp/jurassic_world/jurassic_world_review.txt
    Found 1 items
    -rw-r--r-- 3 samzer hadoop 2294 2015-06-18 18:34 /tmp/unfinished_business/unfinished_business_review.txt

We can see that the files have been copied successfully. With the following command, we'll execute our mapper and reducer scripts in Hadoop. In this command, we define the mapper, reducer, input, and output file locations, and then use Hadoop streaming to execute our scripts. Let's execute the word count program first:

    $ hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-*streaming*.jar -file ./BigData/word_mapper.py -mapper word_mapper.py -file ./BigData/word_reducer.py -reducer word_reducer.py -input /tmp/moby_dick/* -output /tmp/moby_output

Let's verify that the word count MapReduce program is working successfully:

    $ hadoop fs -cat /tmp/moby_output/*

The output of the preceding command is shown as follows:

    (Queequeg   1
    A           2
    Africa      1
    Africa,     1
    After       1
    Ahab        13
    Ahab,       1
    Ahab's      6
    All         1
    American    1
    As          1
    At          1
    Bedford,    1
    Bildad      1
    Bildad,     1
    Boomer,     2
    Captain     1
    Christmas   1
    Day         1
    Delight,    1
    Dick        6
    Dick,       2

The program is working as intended. Now, we'll deploy the program that calculates the sentiment score of each review. Note that we can add the positive and negative word dictionary files to the Hadoop streaming job with additional -file options:

    $ hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-*streaming*.jar -file ./BigData/senti_mapper.py -mapper senti_mapper.py -file ./Data/positive-words.txt -file ./Data/negative-words.txt -input /tmp/jurassic_world/* -output /tmp/jurassic_output

In the preceding command, we use the hadoop command with the Hadoop streaming JAR file, then define the mapper and its supporting files, and finally the input and output directories in Hadoop. Let's check the sentiment scores of the movie reviews:

    $ hadoop fs -cat /tmp/jurassic_output/*

The output of the preceding command is shown as follows:

    "jurassic world," like its predecessors, fills up the screen with roaring, slathering, earth-shaking dinosaurs, then fills in mere humans around the edges. it's a formula that works as well in 2015 as it did in 1993.  3
    a perfectly fine movie and entertaining enough to keep you watching until the closing credits.  4
    an angry movie with a tragic moral ... meta-adoration and criticism ends with a genetically modified dinosaur fighting off waves of dinosaurs.  -3
    if you limit your expectations for jurassic world to "more teeth," it will deliver on that promise. if you dare to hope for anything more-relatable characters, narrative coherence-you'll only set yourself up for disappointment.  -1

This program is also working as intended. Now, we'll try out the overall sentiment of a movie:

    $ hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-*streaming*.jar -file ./BigData/overall_senti_mapper.py -mapper overall_senti_mapper.py -file ./BigData/overall_senti_reducer.py -reducer overall_senti_reducer.py -input /tmp/unfinished_business/* -output /tmp/unfinished_business_output

Let's verify the result:

    $ hadoop fs -cat /tmp/unfinished_business_output/*

The output of the preceding command is shown as follows:

    -8

We can see that the overall sentiment score comes out correctly from MapReduce.
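If you find yourself retyping these long streaming commands, a thin Python wrapper around subprocess can build them for you. The sketch below is ours, not part of the article; run_streaming_job is a hypothetical helper name, and the JAR path is an assumption that must match your installation (the shell glob used above is replaced here with a concrete filename, since subprocess does not expand globs).

```python
import subprocess

def run_streaming_job(mapper, reducer, hdfs_input, hdfs_output, extra_files=(),
                      jar='/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar'):
    # Assemble the same kind of command line used manually above
    cmd = ['hadoop', 'jar', jar,
           '-file', mapper, '-mapper', mapper.split('/')[-1],
           '-file', reducer, '-reducer', reducer.split('/')[-1],
           '-input', hdfs_input, '-output', hdfs_output]
    for path in extra_files:  # e.g. the sentiment word lists
        cmd += ['-file', path]
    subprocess.check_call(cmd)

# Hypothetical usage mirroring the word count run above
run_streaming_job('./BigData/word_mapper.py', './BigData/word_reducer.py',
                  '/tmp/moby_dick/*', '/tmp/moby_output')
```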
Here is a screenshot of the JobTracker status page:

(Figure: the JobTracker status page)

The preceding page is a portal where the jobs submitted to the JobTracker and their statuses can be viewed. It can be accessed on port 50030 of the master system. From it, we can see that a job is running, and the status at the top of the page shows that the job completed successfully.

File handling with Hadoopy

Hadoopy is a Python library that provides an API to interact with Hadoop, manage files, and perform MapReduce on it. Hadoopy can be downloaded from http://www.hadoopy.com/en/latest/tutorial.html#installing-hadoopy.

Let's put a few files into HDFS through Hadoopy, in a directory within HDFS called data:

    $ hadoop fs -mkdir data

Here is the code that puts the data into HDFS:

    import hadoopy
    import os

    hdfs_path = ''

    def read_local_dir(local_path):
        for fn in os.listdir(local_path):
            path = os.path.join(local_path, fn)
            if os.path.isfile(path):
                yield path

    def main():
        local_path = './BigData/dummy_data'
        for file in read_local_dir(local_path):
            hadoopy.put(file, 'data')
            print "The file %s has been put into hdfs" % (file,)

    if __name__ == '__main__':
        main()

The output of the preceding script is shown as follows:

    The file ./BigData/dummy_data/test9 has been put into hdfs
    The file ./BigData/dummy_data/test7 has been put into hdfs
    The file ./BigData/dummy_data/test1 has been put into hdfs
    The file ./BigData/dummy_data/test8 has been put into hdfs
    The file ./BigData/dummy_data/test6 has been put into hdfs
    The file ./BigData/dummy_data/test5 has been put into hdfs
    The file ./BigData/dummy_data/test3 has been put into hdfs
    The file ./BigData/dummy_data/test4 has been put into hdfs
    The file ./BigData/dummy_data/test2 has been put into hdfs

In the preceding code, we list all the files in a directory and then put each of them into Hadoop using the put() method of Hadoopy. Let's check whether all the files have been put into HDFS:

    $ hadoop fs -ls data

The output of the preceding command is shown as follows:

    Found 9 items
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test1
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test2
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test3
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test4
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test5
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test6
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test7
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test8
    -rw-r--r-- 3 samzer hadoop 0 2015-06-23 00:19 data/test9

So, we have successfully put files into HDFS.

Pig

Pig is a platform with a very expressive language for performing data transformations and querying. Code written in Pig is in a scripting style, and it gets compiled into MapReduce programs that execute on Hadoop. Pig Latin is the textual language; it can be learned from http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html. Pig helps in reducing the complexity of raw MapReduce programs and enables the user to perform fast transformations.

We'll cover how to find the top 10 most frequently occurring words with Pig, and then we'll see how to create a function in Python that can be used in Pig. Let's start with the word count.
Here is the Pig Latin code, which you can save in the pig_wordcount.pig file:

    data = load '/tmp/moby_dick/';
    word_token = foreach data generate flatten(TOKENIZE((chararray)$0)) as word;
    group_word_token = group word_token by word;
    count_word_token = foreach group_word_token generate COUNT(word_token) as cnt, group;
    sort_word_token = ORDER count_word_token by cnt DESC;
    top10_word_count = LIMIT sort_word_token 10;
    DUMP top10_word_count;

In the preceding code, we load the summary of Moby Dick, which is then tokenized line by line, essentially splitting each line into individual elements. The flatten function converts the collection of word tokens in a line into a row-by-row form. We then group by word and take a count of each word. Finally, we sort the counts in descending order and limit the result to the first 10 rows to get the top 10 most frequently occurring words.

Let's execute the preceding Pig script:

    $ pig ./BigData/pig_wordcount.pig

The output of the preceding command is shown as follows:

    (83,the)
    (36,and)
    (28,a)
    (25,of)
    (24,to)
    (15,his)
    (14,Ahab)
    (14,Moby)
    (14,is)
    (14,in)

We are able to get our top 10 words. Let's now create user-defined functions (UDFs) with Python, which will be used in Pig. We'll define two UDFs to score the positive and negative sentiments of a sentence.

The following code is the UDF used to score the positive sentiment; it's available in the positive_sentiment.py file:

    positive_words = [ 'a+', 'abound', 'abounds', 'abundance', 'abundant', 'accessable', 'accessible', 'acclaim', 'acclaimed', 'acclamation', 'acco$
    ]

    @outputSchema("pnum:int")
    def sentiment_score(text):
        positive_score = 0
        for w in text.split():
            if w in positive_words:
                positive_score += 1
        return positive_score

In the preceding code, we define the positive word list, which is used by the sentiment_score() function. The function checks for positive words in a sentence and outputs their total count. The outputSchema() decorator tells Pig what type of data is being output, which in our case is int.

Here is the code to score the negative sentiment; it's available in the negative_sentiment.py file and is almost identical to the positive version:

    negative_words = ['2-faced', '2-faces', 'abnormal', 'abolish', 'abominable', 'abominably', 'abominate', 'abomination', 'abort', 'aborted', 'ab$....]

    @outputSchema("nnum:int")
    def sentiment_score(text):
        negative_score = 0
        for w in text.split():
            if w in negative_words:
                negative_score -= 1
        return negative_score

The following code is used by Pig to score the sentiments of the Jurassic World reviews; it's available in the pig_sentiment.pig file:

    register 'positive_sentiment.py' using org.apache.pig.scripting.jython.JythonScriptEngine as positive;
    register 'negative_sentiment.py' using org.apache.pig.scripting.jython.JythonScriptEngine as negative;

    data = load '/tmp/jurassic_world/*';

    feedback_sentiments = foreach data generate LOWER((chararray)$0) as feedback,
        positive.sentiment_score(LOWER((chararray)$0)) as psenti,
        negative.sentiment_score(LOWER((chararray)$0)) as nsenti;

    average_sentiments = foreach feedback_sentiments generate feedback, psenti + nsenti;

    dump average_sentiments;

In the preceding Pig script, we first register the Python UDF scripts using the register command and give them appropriate names. We then load our Jurassic World reviews, convert them to lowercase, and score the positive and negative sentiments of each review. Finally, we add the two scores to get the overall sentiment of each review.
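Because the UDF files are ordinary Python modules, the scoring logic can be sanity-checked outside Pig before the script is run. The sketch below is ours: it stubs out the outputSchema decorator (normally supplied by Pig's Jython engine) and exercises the function with a toy word list, so the word list and sample sentence are assumptions.

```python
def outputSchema(schema):
    # Stand-in for the decorator that Pig's Jython engine provides
    def wrap(func):
        return func
    return wrap

positive_words = ['acclaimed', 'abundant', 'accessible']

@outputSchema("pnum:int")
def sentiment_score(text):
    positive_score = 0
    for w in text.split():
        if w in positive_words:
            positive_score += 1
    return positive_score

print(sentiment_score("an acclaimed and abundant film"))  # prints 2
```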
Let's execute the Pig script and see the results:

    $ pig ./BigData/pig_sentiment.pig

The output of the preceding command is shown as follows:

    (there is plenty here to divert, but little to leave you enraptored. such is the fate of the sequel: bigger. louder. fewer teeth.,0)
    (if you limit your expectations for jurassic world to "more teeth," it will deliver on that promise. if you dare to hope for anything more-relatable characters, narrative coherence-you'll only set yourself up for disappointment.,-1)
    (there's a problem when the most complex character in a film is the dinosaur,-2)
    (not so much another bloated sequel as it is the fruition of dreams deferred in the previous films. too bad the genre dictates that those dreams are once again destined for disaster.,-2)
    (a perfectly fine movie and entertaining enough to keep you watching until the closing credits.,4)
    (this fourth installment of the jurassic park film series shows some wear and tear, but there is still some gas left in the tank. time is spent to set up the next film in the series. they will keep making more of these until we stop watching.,0)

We have successfully scored the sentiments of the Jurassic World reviews using a Python UDF in Pig.

Python with Apache Spark

Apache Spark is a computing framework that works on top of HDFS and provides an alternative way of computing that is similar to MapReduce. It was developed by the AMPLab at UC Berkeley. Spark does most of its computation in memory, which makes it much faster than MapReduce, and it is well suited for machine learning since it handles iterative workloads really well. Spark uses the programming abstraction of RDDs (Resilient Distributed Datasets), in which data is logically distributed into partitions and transformations can be performed on top of it.

Python is one of the languages used to interact with Apache Spark, and we'll create programs to perform the sentiment scoring of each Jurassic World review as well as the overall sentiment. You can install Apache Spark by following the instructions at https://spark.apache.org/docs/1.0.1/spark-standalone.html.

Scoring the sentiment

Here is the Python code to score the sentiment:

    from __future__ import print_function
    import sys
    from operator import add
    from pyspark import SparkContext

    positive_words = open('positive-words.txt').read().split('\n')
    negative_words = open('negative-words.txt').read().split('\n')

    def sentiment_score(text, pos_list, neg_list):
        positive_score = 0
        negative_score = 0
        for w in text.split():
            if w in pos_list:
                positive_score += 1
            if w in neg_list:
                negative_score += 1
        return positive_score - negative_score

    if __name__ == "__main__":
        if len(sys.argv) != 2:
            print("Usage: sentiment <file>", file=sys.stderr)
            exit(-1)
        sc = SparkContext(appName="PythonSentiment")
        lines = sc.textFile(sys.argv[1], 1)
        scores = lines.map(lambda x: (x, sentiment_score(x.lower(), positive_words, negative_words)))
        output = scores.collect()
        for (key, score) in output:
            print("%s: %i" % (key, score))
        sc.stop()

In the preceding code, we define our standard sentiment_score() function, which we'll be reusing. The if statement checks whether an input text file has been given as an argument to the Python script. The sc variable is a SparkContext object with PythonSentiment as the app name. The filename given as the argument is passed into Spark through the textFile() method of the sc variable.
In Spark's map() function, we define a lambda function to which each line of the text file is passed, and we obtain the line and its respective sentiment score. The output variable holds the result, and finally, we print the result on the screen.

Let's score the sentiment of each of the reviews of Jurassic World. Replace <hostname> with your hostname; this should suffice:

    $ ~/spark-1.3.0-bin-cdh4/bin/spark-submit --master spark://<hostname>:7077 ./BigData/spark_sentiment.py hdfs://localhost:8020/tmp/jurassic_world/*

We'll get the following output for the preceding command:

    There is plenty here to divert but little to leave you enraptured. Such is the fate of the sequel: Bigger, Louder, Fewer teeth: 0
    If you limit your expectations for Jurassic World to more teeth, it will deliver on this promise. If you dare to hope for anything more—relatable characters or narrative coherence—you'll only set yourself up for disappointment: -1

We can see that our Spark program was able to score the sentiment of each review. The number at the end of each line is the sentiment score: the higher the score, the more positive the review, and the more negative the score, the more negative the review.

We use the spark-submit command with the following parameters:

The master node of the Spark system
The Python script containing the transformation commands
An argument to the Python script

The overall sentiment

Here is a Spark program to score the overall sentiment of all the reviews:

    from __future__ import print_function
    import sys
    from operator import add
    from pyspark import SparkContext

    positive_words = open('positive-words.txt').read().split('\n')
    negative_words = open('negative-words.txt').read().split('\n')

    def sentiment_score(text, pos_list, neg_list):
        positive_score = 0
        negative_score = 0
        for w in text.split():
            if w in pos_list:
                positive_score += 1
            if w in neg_list:
                negative_score += 1
        return positive_score - negative_score

    if __name__ == "__main__":
        if len(sys.argv) != 2:
            print("Usage: Overall Sentiment <file>", file=sys.stderr)
            exit(-1)
        sc = SparkContext(appName="PythonOverallSentiment")
        lines = sc.textFile(sys.argv[1], 1)
        scores = lines.map(lambda x: ("Total", sentiment_score(x.lower(), positive_words, negative_words))) \
                      .reduceByKey(add)
        output = scores.collect()
        for (key, score) in output:
            print("%s: %i" % (key, score))
        sc.stop()

In the preceding code, we have added a reduceByKey() method, which reduces the values by adding them together, and we have defined the key as "Total" so that all the scores are reduced under a single key.

Let's try out the preceding code to get the overall sentiment of Jurassic World. Replace <hostname> with your hostname; this should suffice:

    $ ~/spark-1.3.0-bin-cdh4/bin/spark-submit --master spark://<hostname>:7077 ./BigData/spark_overall_sentiment.py hdfs://localhost:8020/tmp/jurassic_world/*

The output of the preceding command is shown as follows:

    Total: 19

We can see that Spark has given an overall sentiment score of 19.

The applications that get executed on Spark can be viewed in the browser on port 8080 of the Spark master. Here is a screenshot of it:

(Figure: the Spark master status page)

It shows the number of Spark worker nodes, the applications that are currently executing, and the applications that have been executed.
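If a standalone Spark cluster is not available, the same logic can be smoke-tested with a local SparkContext. The sketch below is ours and uses tiny inline word sets in place of the positive-words.txt and negative-words.txt files, so the words and sample reviews are assumptions.

```python
from operator import add
from pyspark import SparkContext

# Toy stand-ins for the word list files used in the real scripts
positive_words = {'fine', 'perfectly', 'entertaining'}
negative_words = {'bloated', 'disappointment'}

def sentiment_score(text):
    words = text.lower().split()
    return sum(w in positive_words for w in words) - sum(w in negative_words for w in words)

sc = SparkContext('local', 'SentimentSmokeTest')
reviews = sc.parallelize(['a perfectly fine movie', 'a bloated disappointment'])
total = reviews.map(lambda r: ('Total', sentiment_score(r))).reduceByKey(add)
print(total.collect())  # [('Total', 0)] with these toy inputs
sc.stop()
```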
Summary

In this article, you were introduced to big data, learned how the Hadoop software works, and saw the architecture associated with it. You then learned how to create a mapper and a reducer for a MapReduce program, how to test them locally, and how to deploy them on Hadoop. You were then introduced to the Hadoopy library and, using this library, you were able to put files into Hadoop. You also learned about Pig and how to create a user-defined function with it. Finally, you learned about Apache Spark, an alternative to MapReduce, and how to use it to perform distributed computing.

With this article, we have come to the end of our journey, and you should now be able to perform data science tasks with Python. From here, you can participate in Kaggle competitions at https://www.kaggle.com/ to improve your data science skills with real-world problems. This will fine-tune your skills and help you understand how to solve analytical problems. You can also sign up for the Andrew Ng course on machine learning at https://www.coursera.org/learn/machine-learning to understand the nuances behind machine learning algorithms.

Modules and Templates

Packt
07 Sep 2015
20 min read
In this article by Thomas Uphill, author of the book Troubleshooting Puppet, we will look at how the various parts of a module may cause issues. As a Puppet developer or a system administrator, modules are how you deliver your code to the nodes. Modules are great for organizing your code into manageable chunks, but they are also where you'll see most of your problems when troubleshooting. Most modules contain classes in a manifests directory, but modules can also include custom facts, functions, types, providers, as well as files and templates. Each of these components can be a source of error. We will address each of them in the following sections, starting with classes.

In Puppet, the namespace of classes is referred to as the scope. Classes can have multiple nested levels of subclasses, and each class and subclass defines a scope. Each scope is separate. To refer to variables in a different scope, you must use the fully scoped variable name. For instance, in the following example, we have a class and two subclasses with similarly named classes defined within each of them:

    class leader {
      notify {'Leader-1': }
    }
    class autobots {
      include leader
    }
    class autobots::leader {
      notify {'Optimus Prime': }
    }
    class decepticons {
      include leader
    }
    class decepticons::leader {
      notify {'Megatron': }
    }

We then include the leader, autobots, and decepticons classes in our node, as follows:

    include leader
    include autobots
    include decepticons

When we run Puppet, we see the following output:

    t@mylaptop ~ $ puppet apply leaders.pp
    Notice: Compiled catalog for mylaptop.example.net in environment production in 0.03 seconds
    Notice: Optimus Prime
    Notice: /Stage[main]/Autobots::Leader/Notify[Optimus Prime]/message: defined 'message' as 'Optimus Prime'
    Notice: Leader-1
    Notice: /Stage[main]/Leader/Notify[Leader-1]/message: defined 'message' as 'Leader-1'
    Notice: Megatron
    Notice: /Stage[main]/Decepticons::Leader/Notify[Megatron]/message: defined 'message' as 'Megatron'
    Notice: Finished catalog run in 0.06 seconds

If this is the output that you expected, you can safely move on. If you are a little surprised, then read on. The problem here is the scope. Although we have a top-scope class named leader, when we include leader from within the autobots and decepticons classes, the local scope is searched first. In both cases, a local match is found first and used. Instead of three 'Leader-1' notifications, we see only one 'Leader-1', one 'Megatron', and one 'Optimus Prime'. If your normal procedure is to have the leader class defined locally and you forgot to do so, then you can end up slightly confused. Consider the following modified example:

    class leader {
      notify {'Leader-1': }
    }
    class autobots {
      include leader
    }
    include autobots

Now, when we apply this manifest, we see the following output:

    t@mylaptop ~ $ puppet apply leaders2.pp
    Notice: Compiled catalog for mylaptop.example.net in environment production in 0.02 seconds
    Notice: Leader-1
    Notice: /Stage[main]/Leader/Notify[Leader-1]/message: defined 'message' as 'Leader-1'
    Notice: Finished catalog run in 0.04 seconds

Since no leader class was available in the scope of the autobots class, the top-scope leader class was used. Knowing how Puppet evaluates scope can save you time when your issues turn out to be namespace-related. This example is contrived; the usual situation where people run into this problem is when they have multiple modules organized in the same way.
The problem manifests itself when you have many different modules with subclasses of the same name. For example, two modules named myappa and myappb may each have a config subclass, myappa::config and myappb::config. The problem occurs when the developer forgets to write the myappc::config subclass and there is a top-scope config module available.

Metaparameters

Metaparameters are parameters that are used by Puppet to compile the catalog but are not used when modifying the target system. Some metaparameters, such as tag, are used to specify or mark resources. Other metaparameters, such as before, require, notify, and subscribe, are used to specify the order in which resources should be applied to a node. When the catalog is compiled, the resources are evaluated based on their dependencies, as opposed to how they are defined in the manifests. The order in which resources are evaluated can be a little confusing for a person who is new to Puppet.

A common paradigm when creating files is to create the containing directory before creating the file. Consider the following code:

    class apps {
      file {'/apps':
        ensure => 'directory',
        mode   => '0755',
      }
    }
    class myapp {
      file {'/apps/myapp/config':
        content => 'on = true',
        mode    => '0644',
      }
      file {'/apps/myapp':
        ensure => 'directory',
        mode   => '0755',
      }
    }
    include myapp
    include apps

When we apply this manifest, even though the order of the resources is not correct in the manifest, the catalog applies correctly, as follows:

    [root@trouble ~]# puppet apply order.pp
    Notice: Compiled catalog for trouble.example.com in environment production in 0.13 seconds
    Notice: /Stage[main]/Apps/File[/apps]/ensure: created
    Notice: /Stage[main]/Myapp/File[/apps/myapp]/ensure: created
    Notice: /Stage[main]/Myapp/File[/apps/myapp/config]/ensure: defined content as '{md5}1090eb22d3caa1a3efae39cdfbce5155'
    Notice: Finished catalog run in 0.05 seconds

Recent versions of Puppet will automatically use the require metaparameter for certain resources. In the case of the preceding code, the '/apps/myapp' file has an implied require on the '/apps' file because directories autorequire their parents. We can safely rely on this autorequire mechanism, but, when debugging, it is useful to know how to specify the resource order precisely. To ensure that the /apps directory exists before we try to create the /apps/myapp directory, we can use the require metaparameter to have the myapp directory require the /apps directory, as follows:

    class myapp {
      file {'/apps/myapp/config':
        content => 'on = true',
        mode    => '0644',
        require => File['/apps/myapp'],
      }
      file {'/apps/myapp':
        ensure  => 'directory',
        mode    => '0755',
        require => File['/apps'],
      }
    }

The preceding require lines specify that each of the file resources requires its parent directory.

Autorequires

Certain resource relationships are ubiquitous. Where the relationship is implied, a mechanism called autorequire was developed to reduce resource-ordering errors. A list of autorequire relationships is given in the type reference documentation at https://docs.puppetlabs.com/references/latest/type.html. When troubleshooting, you should know that the following autorequire relationships exist:

A cron resource will autorequire the specified user.
An exec resource will autorequire both the working directory of the exec (as a file resource) and the user as which the exec runs.
A file resource will autorequire its owner and group.
A mount will autorequire the mounts that it depends on (a mount resource of /apps/myapp will autorequire a mount resource of /apps).
A user resource will autorequire its primary group.

Autorequire relationships only work when the resources within the relationship are specified within the catalog. If your catalog does not specify the required resources, then your catalog will fail if those resources are not found on the node. For instance, if you have a mount resource of /apps/myapp but the /apps directory or mount does not exist, then the mount resource will fail. If the /apps mount is specified, then the autorequire mechanism will ensure that the /apps mount is mounted before the /apps/myapp mount.

Explicit ordering

When you are trying to track down an error in the evaluation of your class, it can be helpful to use the chaining arrow syntax to force your resources to evaluate in the order that you specified. For instance, if you have an exec resource that is failing, you can create another exec resource that outputs the information used within your failing exec. For example, we have the following resources:

    file {'arrow':
      path   => '/tmp/arrow',
      ensure => 'directory',
    }
    exec {'arrow_debug_before':
      command => 'echo debug_before',
      path    => '/usr/bin:/bin',
    }
    exec {'arrow_example':
      command => 'echo arrow',
      path    => '/usr/bin:/bin',
      require => File['arrow'],
    }
    exec {'arrow_debug_after':
      command => 'echo debug_after',
      path    => '/usr/bin:/bin',
    }

Now, when you apply this catalog, you will see that the exec resources are not applied in the order in which we declared them:

    [root@trouble ~]# puppet agent -t
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Info: Caching catalog for trouble.example.com
    Info: Applying configuration version '1431872398'
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_debug_before]/returns: executed successfully
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_debug_after]/returns: executed successfully
    Notice: /Stage[main]/Main/Node[default]/File[arrow]/ensure: created
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_example]/returns: executed successfully
    Notice: Finished catalog run in 0.23 seconds

To enforce the sequence that we were expecting, we can use the chaining arrow syntax, as follows:

    exec {'arrow_debug_before':
      command => 'echo debug_before',
      path    => '/usr/bin:/bin',
    }->
    exec {'arrow_example':
      command => 'echo arrow',
      path    => '/usr/bin:/bin',
      require => File['arrow'],
    }->
    exec {'arrow_debug_after':
      command => 'echo debug_after',
      path    => '/usr/bin:/bin',
    }

Now, when we run the agent this time, the order is what we expected:

    [root@trouble ~]# puppet agent -t
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Info: Caching catalog for trouble.example.com
    Info: Applying configuration version '1431872778'
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_debug_before]/returns: executed successfully
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_example]/returns: executed successfully
    Notice: /Stage[main]/Main/Node[default]/Exec[arrow_debug_after]/returns: executed successfully
    Notice: Finished catalog run in 0.23 seconds

A good way to use this sort of arrangement is to create an exec resource that outputs the environment information before your failing resource is applied. For example, you can create a class that runs a debug script and then use chaining arrows to have it applied before your failing resource.
If your resource uses variables, then creating a notify resource that outputs the values of those variables can also help with debugging.

Defined types

Defined types are great for reducing complexity and improving the readability of your code. However, they can lead to some interesting problems that may be difficult to diagnose. In the following code, we create a defined type that creates a host entry:

    define myhost ($short,$ip) {
      host {"$short":
        ip           => $ip,
        host_aliases => [ "$title.example.com", "$title.example.net", "$short" ],
      }
    }

In this defined type, the namevar of the host resource is an argument of the define, the $short variable. In Puppet, there are two important attributes of any resource: the namevar and the title. The confusion lies in the fact that, sometimes, both of these attributes have the same value. Both values must be unique, but they are used differently. The title is used to uniquely identify the resource to the compiler and need not be related to the actual resource. The namevar uniquely identifies the resource to the agent after the catalog is compiled. The namevar is specific to each resource type; for example, the namevar for a package is the package name, and the namevar for a file is the full path to the file.

The problem with the preceding defined type is that you can end up with a duplicate resource that is difficult to find. The resource is declared within the defined type, so when Puppet reports the duplicate declaration, it reports both declarations as coming from the same line. Let's create the following node definition with two myhost resources:

    node default {
      $short = "trb"
      myhost {'trouble': short => 'trb', ip => '192.168.50.1' }
      myhost {'tribble': short => "$short", ip => '192.168.50.2' }
    }

Even though the two myhost resources have different titles, when we run Puppet, we see a duplicate declaration, as follows:

    [root@trouble ~]# puppet agent -t
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate declaration: Host[trb] is already declared in file /etc/puppet/environments/production/modules/myhost/manifests/init.pp:5; cannot redeclare at /etc/puppet/environments/production/modules/myhost/manifests/init.pp:5 on node trouble.example.com
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run

Tracking down this issue can be difficult if there are several myhost declarations throughout the node definition. To make this problem a lot easier to solve, we should use the title attribute of the defined type as the title attribute of the resources within the define. The following rewrite shows this difference:

    define myhost ($short,$ip) {
      host {"$title":
        ip           => $ip,
        host_aliases => [ "$title.example.com", "$title.example.net", "$short" ],
      }
    }

Custom facts

When you define custom facts within your modules (in the lib/facter directory), they are automatically transferred to your nodes via the pluginsync mechanism. The issue here is that the facts are synced to the same directory, so if you create two facts with the same filename, it can be difficult to determine which fact will be synced down to your node.

Facter is run at the beginning of a Puppet agent run, and its results are used to compile the catalog. If any of your facts take longer than the configured timeout (config_timeout in the [agent] section of puppet.conf), then the agent run will fail.
Instead of increasing this timeout, keep your custom facts simple enough that they take no longer than a few seconds to run.

You can debug Facter from the command line using the -d switch. To load the custom facts that are synced from Puppet, add the -p option as well. If you are having trouble with the output of your fact, you can also have the output formatted as a JSON document by adding the -j option. Combining all of these options, the following is a good starting point for debugging your Facter output:

    [root@puppet ~]# facter -p -d -j |more
    Found no suitable resolves of 1 for ec2_metadata
    value for ec2_metadata is still nil
    Found no suitable resolves of 1 for gce
    value for gce is still nil
    ...
    {
      "lsbminordistrelease": "6",
      "puppetversion": "3.7.5",
      "blockdevice_sda_vendor": "ATA",
      "ipaddress_lo": "127.0.0.1",
    ...

Having Facter output a JSON document is helpful because the returned values are wrapped in quotes, so any trailing spaces or control characters will be visible.

The easiest way to debug custom facts is to run them through Ruby directly. To run a custom fact through Ruby, start in the directory containing the custom fact and use the irb command to run interactive Ruby, as follows:

    [root@puppet facter]# irb -r facter -r iptables_version.rb
    irb(main):001:0> puts Facter.value("iptables_version")
    1.4.7
    => nil

This displays the value of the iptables_version fact. From within irb, you can step through the code line by line to figure out your problem. The preceding command was executed on a Linux host. Doing this on a Windows host is not as easy, but it is possible. Locate the irb executable on your system; for a Puppet Enterprise installation, this should be in C:/Program Files (x86)/Puppet Labs/Puppet Enterprise/sys/ruby/bin. Run irb and then alter the $LOAD_PATH variable to add the path to facter.rb (the Facter library), as follows:

    irb(main):001:0> $LOAD_PATH.push("C:/Program Files (x86)/Puppet Labs/Puppet Enterprise/facter/lib")

Now require the Facter library, as follows:

    irb(main):002:0> require 'facter'
    => true

Finally, run Facter.value with the name of a fact, which is similar to what we did in the previous example:

    irb(main):003:0> Facter.value("uptime")
    => "0:08 hours"

Pry

When debugging any Ruby code, the Pry library will allow you to inspect the running Ruby environment at any breakpoint that you define. In the earlier iptables_version example, we could use the Pry library to inspect the calculation of the fact. To do so, modify the fact definition and comment out the setcode section (the breakpoint definition will not work within a setcode block). Then define a breakpoint by adding binding.pry to the fact at the point that you wish to inspect, as follows:

    Facter.add(:iptables_version) do
      confine :kernel => :linux
      #setcode do
        version = Facter::Util::Resolution.exec('iptables --version')
        if version
          version.match(/\d+\.\d+\.\d+/).to_s
        else
          nil
        end
        binding.pry
      #end
    end

Now run Ruby with the Pry and Facter libraries on the iptables_version fact definition, as follows:

    root@mylaptop # ruby -r pry -r facter iptables_version.rb

    From: /var/lib/puppet/lib/facter/iptables_version.rb @ line 10 :

         5:     if version
         6:       version.match(/\d+\.\d+\.\d+/).to_s
         7:     else
         8:       nil
         9:     end
     => 10:     binding.pry
        11:   #end
        12: end

This will cause the evaluation of the iptables_version fact to halt at the binding.pry line.
We can then inspect the value of the version variable and execute the regular expression matching ourselves to verify that it is working correctly, as follows:

    [1] pry(#<Facter::Util::Resolution>)> version
    => "iptables v1.4.21"
    [2] pry(#<Facter::Util::Resolution>)> version.match(/\d+\.\d+\.\d+/).to_s
    => "1.4.21"

Environment

When developing custom facts, it is useful to make your Ruby fact file executable and run the Ruby script from the command line. When you run custom facts from the command line, the environment variables defined in your current shell can affect how the fact is calculated. This can result in different values being returned for the fact when it is run through the Puppet agent. One of the most common variables that causes this sort of problem is JAVA_HOME. This can also be a problem when testing exec resources: environment variables and shell aliases will be available to the exec when it is run interactively, but these customizations will not be available when it is run through the agent, which has the potential to cause inconsistency.

Files

Files are transferred between the master and the node via Puppet's internal fileserver. When working with files, it is important to remember that all the files served via Puppet are read into memory by the Puppet server. Transferring large files via Puppet is therefore inefficient, and you should avoid transferring large and/or binary files. Most of the problems with files are related to path and URL syntax errors. The source parameter contains a URL with the following syntax:

    source => "puppet:///path/to/file"

In the preceding syntax, the three slashes specify the beginning of the URL location and that the default Puppet server should be contacted. The following is also valid:

    source => "puppet://myserver/path/to/file"

The path from which a file is downloaded depends on the context of the manifest. If the manifest is found within the manifest directory or is the site.pp manifest, then the path to the file is relative to this location, starting at the files subdirectory. If the manifest is found within a module, then the path starts with the module's path, and the files are found within the files directory of the module.

Templates

ERB templates are written in Ruby. Current releases of Puppet also support EPP templates, which are written in the Puppet language. ERB templates can be debugged by running them through Ruby. To simply check the syntax, use the following command:

    $ erb -P -x -T '-' template.erb | ruby -c
    Syntax OK

If your template does not pass the preceding test, then you know that your syntax is incorrect. The usual type of error that you will see is as follows:

    -:8: syntax error, unexpected end-of-input, expecting keyword_end

The problem with the preceding command is that the line number refers to the evaluated code returned by the erb script, not the original file. When checking for the syntax error, you will have to inspect the intermediate code generated by the erb command. Unfortunately, doing anything more than checking simple syntax is a problem. Although ERB templates can be evaluated using the ERB library, the <%= %> block markers used in Puppet ERB templates break the normal evaluation. The simplest way to evaluate ERB templates is to create a simple manifest with a file resource that applies the template.
As an example, the resolv.conf template is shown in the following code:

    # resolv.conf built by Puppet
    domain <%= @domain %>
    search <% searchdomains.each do |domain| -%>
    <%= domain -%> <% end -%><%= @domain %>
    <% nameservers.each do |server| -%>
    nameserver <%= server %>
    <% end -%>

This template is then saved into a file named template.erb. We then create a file resource using this template.erb file, as shown in the following code:

    $searchdomains = ['trouble.example.com','packt.example.com']
    $nameservers = ['8.8.8.8','8.8.4.4']
    $domains = 'example.com'
    file {'/tmp/test':
      content => template('/tmp/template.erb')
    }

We then use puppet apply to apply this manifest and create the /tmp/test file, as follows:

    $ puppet apply file.pp
    Notice: Compiled catalog for mylaptop.example.net in environment production in 0.20 seconds
    Notice: /Stage[main]/Main/File[/tmp/test]/ensure: defined content as '{md5}4d1c547c40a27c06726ecaf784b99e84'
    Notice: Finished catalog run in 0.04 seconds

The following are the contents of the /tmp/test file:

    # resolv.conf built by Puppet
    domain example.net
    search trouble.example.com packt.example.com example.net
    nameserver 8.8.8.8
    nameserver 8.8.4.4

Debugging templates

Templates can also be used for debugging. You can create a file resource that uses a template to output all the defined variables and their values. You can include the following resource in your node definition:

    file { "/tmp/puppet-debug.txt":
      content => inline_template("<% vars = scope.to_hash.reject { |k,v| !( k.is_a?(String) && v.is_a?(String) ) }; vars.sort.each do |k,v| %><%= k %>=<%= v %>\n<% end %>"),
    }

This uses an inline template, which may make it slightly hard to read. The template loops through the output of the scope function and prints each value if it is a string. Focusing only on the inner loop, this can be shown as follows:

    vars = scope.to_hash.reject { |k,v| !( k.is_a?(String) && v.is_a?(String) ) }
    vars.sort.each do |k,v|
      k=v\n
    end

Summary

In this article, we examined metaparameters and how to deal with resource-ordering issues. We built custom facts and defined types and discussed the issues that may arise when using them. We then moved on to templates and showed how to use templates as an aid in debugging.

Storage Configurations

Packt
07 Sep 2015
21 min read
In this article by Wasim Ahmed, author of the book Proxmox Cookbook, we will cover topics such as local storage, shared storage, and Ceph storage, including how to configure Ceph RBD storage.

A storage is where the virtual disk images of virtual machines reside. There are many different types of storage systems, with different features, performance characteristics, and use case scenarios. Whether it is a local storage configured with directly attached disks or a shared storage with hundreds of disks, the main responsibility of a storage is to hold virtual disk images, templates, backups, and so on.

Proxmox supports different types of storage, such as NFS, Ceph, GlusterFS, and ZFS. Different storage types can hold different types of data. For example, a local storage can hold any type of data, such as disk images, ISO/container templates, and backup files. A Ceph storage, on the other hand, can only hold a .raw format disk image. In order to provide the right type of storage for the right scenario, it is vital to have a proper understanding of the different types of storage. The full details of each storage type are beyond the scope of this article, but we will look at how to connect them to Proxmox and maintain a storage system for VMs.

Storage can be configured into two main categories:

Local storage
Shared storage

Local storage

Any storage that resides in the node itself, using directly attached disks, is known as local storage. This type of storage has no redundancy other than a RAID controller managing an array. If the node itself fails, the storage becomes completely inaccessible. Live migration of a VM is impossible when the VM is stored on local storage, because during migration the virtual disk of the VM has to be copied entirely to another node. A VM can only be live-migrated when there are several Proxmox nodes in a cluster and the virtual disk is stored on a shared storage accessible to all the nodes in the cluster.

Shared storage

A shared storage is one that is available to all the nodes in a cluster through some form of network media. In a virtual environment with shared storage, the actual virtual disk of a VM may be stored on the shared storage while the VM actually runs on another Proxmox host node. With shared storage, live migration of a VM becomes possible without powering down the VM. Multiple Proxmox nodes can share one shared storage, and VMs can be moved around freely since their virtual disks are stored on the shared storage. Usually, a few dedicated nodes with their own resources are used to configure a shared storage, rather than sharing the resources of a Proxmox node that is used to host VMs.

In recent releases, Proxmox has added some new storage plugins that allow users to take advantage of some great storage systems by integrating them with the Proxmox environment. Most of the storage configuration can be performed through the Proxmox GUI.

Ceph storage

Ceph is a powerful distributed storage system, which provides RADOS Block Device (RBD) object storage, the Ceph filesystem (CephFS), and Ceph Object Storage. Ceph is built with a very high level of reliability, scalability, and performance in mind. A Ceph cluster can be expanded to several petabytes without compromising data integrity, and it can be configured using commodity hardware. Any data written to the storage gets replicated across the Ceph cluster. Ceph was originally designed with big data in mind.
Ceph storage
Ceph is a powerful distributed storage system, which provides RADOS Block Device (RBD) storage, the Ceph filesystem (CephFS), and Ceph Object Storage. Ceph is built with a very high level of reliability, scalability, and performance in mind. A Ceph cluster can be expanded to several petabytes without compromising data integrity, and it can be configured using commodity hardware. Any data written to the storage gets replicated across the Ceph cluster. Ceph was originally designed with big data in mind. Unlike other types of storage, the performance of a Ceph cluster actually increases as the cluster grows bigger. However, it can also be used in small environments just as easily for data redundancy. Lower performance can be mitigated by using SSDs to store the Ceph journals. Refer to the OSD Journal subsection in this section for information on journals. The built-in self-healing features of Ceph provide unprecedented resilience without a single point of failure. In a multinode Ceph cluster, the storage can tolerate not just a hard drive failure, but also an entire node failure, without losing data. Currently, only an RBD block device is supported in Proxmox. Ceph comprises a few components that are crucial for you to understand in order to configure and operate the storage. The following components are what Ceph is made of:

Monitor daemon (MON)
Object Storage Daemon (OSD)
OSD Journal
Metadata Server (MDS)
Controlled Replication Under Scalable Hashing map (CRUSH map)
Placement Group (PG)
Pool

MON
Monitor daemons form quorums for a Ceph distributed cluster. There must be a minimum of three monitor daemons configured on separate nodes for each cluster. Monitor daemons can also be configured as virtual machines instead of using physical nodes. Monitors require a very small amount of resources to function, so the allocated resources can be very small. A monitor can be set up through the Proxmox GUI after the initial cluster creation.

OSD
Object Storage Daemons (OSDs) are responsible for the storage and retrieval of actual cluster data. Usually, each physical storage device, such as an HDD or SSD, is configured as a single OSD. Although several OSDs can be configured on a single physical disk, it is not recommended for any production environment at all. Each OSD requires a journal device where data first gets written and later gets transferred to the actual OSD. By storing journals on fast-performing SSDs, we can increase the Ceph I/O performance significantly. Thanks to the Ceph architecture, as more and more OSDs are added into the cluster, the I/O performance also increases. An SSD journal works very well on small clusters with about eight OSDs per node. OSDs can be set up through the Proxmox GUI after the initial MON creation.

OSD Journal
Every single piece of data that is destined for a Ceph OSD first gets written to a journal. The journal lets an OSD daemon commit small, fast writes up front, giving the slower underlying drive more time to commit the writes permanently. In simpler terms, all data gets written to the journals first, and then the journal filesystem sends the data to the actual drive for permanent writes. So, if the journal is kept on a fast-performing drive, such as an SSD, incoming data will be written at a much higher speed, while behind the scenes, the slower-performing SATA drives can commit the writes at a slower speed. Journals on SSDs can really improve the performance of a Ceph cluster, especially if the cluster is small, with only a few terabytes of data. It should also be noted that a journal failure will take down all the OSDs whose journals are kept on that journal drive. In some environments, it may be necessary to put two SSDs in a mirrored RAID and use it for journaling. In a large environment with more than 12 OSDs per node, performance can actually be gained by collocating the journal on the same OSD drive instead of using an SSD for the journal.

MDS
The Metadata Server (MDS) daemon is responsible for providing the Ceph filesystem (CephFS) in a Ceph distributed storage system.
MDS can be configured on separate nodes or coexist with already configured monitor nodes or virtual machines. Although CephFS has come a long way, it is still not fully recommended for use in a production environment. It is worth mentioning here that there are many virtual environments actively running MDS and CephFS without any issues. Currently, it is not recommended to configure more than two MDSs in a Ceph cluster. CephFS is not currently supported by a Proxmox storage plugin. However, it can be configured as a local mount and then connected to a Proxmox cluster through the Directory storage. MDS cannot be set up through the Proxmox GUI as of version 3.4.

CRUSH map
A CRUSH map is the heart of the Ceph distributed storage. The algorithm for storing and retrieving user data in Ceph clusters is laid out in the CRUSH map. CRUSH allows a Ceph client to directly access an OSD. This eliminates a single point of failure and any physical limitations of scalability, since there are no centralized servers or controllers to manage data in and out. Throughout Ceph clusters, CRUSH maintains a map of all MONs and OSDs. CRUSH determines how data should be chunked and replicated among OSDs spread across several local nodes or even nodes located remotely. A default CRUSH map is created on a freshly installed Ceph cluster. This can be further customized based on user requirements. For smaller Ceph clusters, this map should work just fine. However, when Ceph is deployed with very big data in mind, this map should be customized. A customized map will allow better control of a massive Ceph cluster. To operate Ceph clusters of any size successfully, a clear understanding of the CRUSH map is mandatory. For more details on the Ceph CRUSH map, visit http://ceph.com/docs/master/rados/operations/crush-map/ and http://cephnotes.ksperis.com/blog/2015/02/02/crushmap-example-of-a-hierarchical-cluster-map. As of Proxmox VE 3.4, we cannot customize the CRUSH map through the Proxmox GUI. It can only be viewed through the GUI and edited through a CLI.

PG
In a Ceph storage, data objects are aggregated in groups determined by CRUSH algorithms. This is known as a Placement Group (PG), since CRUSH places each group in various OSDs depending on the replication level set in the CRUSH map and the number of OSDs and nodes. By tracking a group of objects instead of the objects themselves, a massive amount of hardware resources can be saved. It would be impossible to track millions of individual objects in a cluster. The following diagram shows how objects are aggregated in groups and how a PG relates to an OSD: To balance the available hardware resources, it is necessary to assign the right number of PGs. The number of PGs should vary depending on the number of OSDs in a cluster. The following is a table of PG suggestions made by the Ceph developers:

Number of OSDs        Number of PGs
Less than 5 OSDs      128
Between 5-10 OSDs     512
Between 10-50 OSDs    4096

Selecting the proper number of PGs is crucial, since each PG will consume node resources. Too many PGs for the wrong number of OSDs will actually penalize the resource usage of an OSD node, while very few assigned PGs in a large cluster will put data at risk. A rule of thumb is to start with the lowest number of PGs possible, and then increase them as the number of OSDs increases. For details on Placement Groups, visit http://ceph.com/docs/master/rados/operations/placement-groups/.
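As a quick cross-check against the preceding table, the rule of thumb commonly quoted by the Ceph project is (number of OSDs x 100) / replica count, rounded up to the nearest power of two. The following short shell sketch illustrates the arithmetic; the OSD and replica counts are assumed example values:

OSDS=6
REPLICAS=2
RAW=$(( OSDS * 100 / REPLICAS ))                      # 300 in this example
PGS=1
while [ $PGS -lt $RAW ]; do PGS=$(( PGS * 2 )); done  # round up to a power of two
echo "Suggested pg_num: $PGS"                         # prints 512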
There's a great PG calculator created by the Ceph developers to calculate the recommended number of PGs for various sizes of Ceph clusters at http://ceph.com/pgcalc/.

Pools
Pools in Ceph are like partitions on a hard drive. We can create multiple pools on a Ceph cluster to separate stored data. For example, a pool named accounting can hold all the accounting department data, while another pool can store the human resources data of a company. When creating a pool, assigning the number of PGs is necessary. During the initial Ceph configuration, three default pools are created. They are data, metadata, and rbd. Deleting a pool will delete all stored objects permanently. For details on Ceph and its components, visit http://ceph.com/docs/master/. The following diagram shows a basic Proxmox+Ceph cluster: The preceding diagram shows four Proxmox nodes, three Monitor nodes, three OSD nodes, and two MDS nodes comprising a Proxmox+Ceph cluster. Note that Ceph is on a different network than the Proxmox public network. Depending on the configured replication number, each incoming data object needs to be written more than once. This causes high bandwidth usage. By separating Ceph onto a dedicated network, we can ensure that the Ceph network can fully utilize the bandwidth. On advanced clusters, a third network is created only between Ceph nodes for cluster replication, improving network performance even further. As of Proxmox VE 3.4, the same node can be used for both Proxmox and Ceph. This provides a great way to manage all the nodes from the same Proxmox GUI. It is not advisable to put Proxmox VMs on a node that is also configured as Ceph. During day-to-day operations, Ceph nodes do not consume large amounts of resources, such as CPU or memory. However, when Ceph goes into rebalancing mode due to an OSD or node failure, a large amount of data replication occurs, which takes up a lot of resources. Performance will degrade significantly if resources are shared by both VMs and Ceph. The Ceph RBD storage can only store .raw virtual disk image files. Ceph itself does not come with a GUI to manage it, so having the option to manage Ceph nodes through the Proxmox GUI makes administrative tasks much easier. Refer to the Monitoring the Ceph storage subsection under the How to do it... section of the Connecting the Ceph RBD storage recipe later in this article to learn how to install a great read-only GUI to monitor Ceph clusters.

Connecting the Ceph RBD storage
In this recipe, we are going to see how to configure a Ceph block storage with a Proxmox cluster.

Getting ready
The initial Ceph configuration on a Proxmox cluster must be accomplished through a CLI. After the Ceph installation, the initial configuration, and the creation of one monitor, all other tasks can be accomplished through the Proxmox GUI.

How to do it...
We will now see how to configure the Ceph block storage with Proxmox.

Installing Ceph on Proxmox
Ceph is not installed by default. Prior to configuring a Proxmox node for the Ceph role, Ceph needs to be installed and the initial configuration must be created through a CLI. The following steps need to be performed on all Proxmox nodes that will be part of the Ceph cluster:

Log in to each node through SSH or a console.
Configure a second network interface to create a separate Ceph network with a different subnet.
Reboot the nodes to initialize the network configuration.
Using the following command, install the Ceph package on each node: # pveceph install -version giant

Initializing the Ceph configuration
Before Ceph is usable, we have to create the initial Ceph configuration file on one Proxmox+Ceph node. The following steps need to be performed on only one Proxmox node that will be part of the Ceph cluster:

Log in to the node using SSH or a console.
Run the following command to create the initial Ceph configuration: # pveceph init -network <ceph_subnet>/CIDR
Run the following command to create the first monitor: # pveceph createmon

Configuring Ceph through the Proxmox GUI
After the initial Ceph configuration and the creation of the first monitor, we can continue with further Ceph configurations through the Proxmox GUI, or simply run the Ceph Monitor creation command on the other nodes. The following steps show how to create Ceph Monitors and OSDs from the Proxmox GUI:

Log in to the Proxmox GUI as root or with any other administrative privilege.
Select the node where the initial monitor was created in the previous steps, and then click on Ceph from the tabbed menu. The following screenshot shows a Ceph cluster as it appears after the initial Ceph configuration: Since no OSDs have been created yet, it is normal for a new Ceph cluster to show a PGs stuck and unclean error.
Click on Disks on the bottom tabbed menu under Ceph to display the disks attached to the node, as shown in the following screenshot:
Select an available attached disk, then click on the Create: OSD button to open the OSD dialog box, as shown in the following screenshot:
Click on the Journal Disk drop-down menu to select a different device, or collocate the journal on the same OSD by keeping the default.
Click on Create to finish the OSD creation. Create additional OSDs on the Ceph nodes as needed. The following screenshot shows a Proxmox node with three OSDs configured:

By default, Proxmox creates OSDs with an ext3 partition. However, sometimes it may be necessary to create OSDs with a different partition type due to a requirement or for performance improvement. Enter the following command format through the CLI to create an OSD with a different partition type: # pveceph createosd -fstype ext4 /dev/sdX

The following steps show how to create Monitors through the Proxmox GUI:

Click on Monitor from the tabbed menu under the Ceph feature. The following screenshot shows the Monitor status with the initial Ceph Monitor we created earlier in this recipe:
Click on Create to open the Monitor dialog box.
Select a Proxmox node from the drop-down menu.
Click on the Create button to start the monitor creation process. Create a total of three Ceph monitors to establish a Ceph quorum.

The following screenshot shows the Ceph status with three monitors and the OSDs added: Note that even with three OSDs added, the PGs are still stuck with errors. This is because, by default, the Ceph CRUSH map is set up for two replicas, and so far we have only created OSDs on one node. For a successful replication, we need to add some OSDs on the second node so that data objects can be replicated twice. Follow the steps described earlier to create three additional OSDs on the second node. After creating three more OSDs, the Ceph status should look like the following screenshot:

Managing Ceph pools
It is possible to perform basic tasks, such as creating and removing Ceph pools, through the Proxmox GUI. Besides these, we can also check the list, status, number of PGs, and usage of the Ceph pools.
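Before walking through the GUI steps that follow, here is a rough CLI equivalent for creating, inspecting, and removing a pool. The pool name and the PG/replica values are illustrative assumptions rather than values taken from this recipe:

# ceph osd pool create vm-images 256 256
# ceph osd pool set vm-images size 2
# ceph osd lspools
# ceph osd pool delete vm-images vm-images --yes-i-really-really-mean-it

The first command creates a pool named vm-images with 256 PGs, the second sets its replica count to 2, the third lists all the pools to confirm the creation, and the last one removes the pool permanently (Ceph deliberately requires the pool name twice plus the confirmation flag).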
The following steps show how to check, create, and remove Ceph pools through the Proxmox GUI:

Click on the Pools tabbed menu under Ceph in the Proxmox GUI. The following screenshot shows the status of the default rbd pool, which has replica 1, 256 PGs, and 0% usage:
Click on Create to open the pool creation dialog box.
Fill in the required information, such as the name of the pool, the replica size, and the number of PGs. Unless the CRUSH map has been fully customized, the ruleset should be left at the default value 0.
Click on OK to create the pool.
To remove a pool, select the pool and click on Remove. Remember that once a Ceph pool is removed, all the data stored in this pool is deleted permanently.

To increase the number of PGs, run the following commands through the CLI:
# ceph osd pool set <pool_name> pg_num <value>
# ceph osd pool set <pool_name> pgp_num <value>
It is only possible to increase the PG value. Once increased, the PG value can never be decreased.

Connecting RBD to Proxmox
Once a Ceph cluster is fully configured, we can proceed to attach it to the Proxmox cluster. During the initial configuration file creation, Ceph also creates an authentication keyring at the /etc/ceph/ceph.client.admin.keyring file path. This keyring needs to be copied and renamed to match the name of the storage ID to be created in Proxmox. Run the following commands to create a directory and copy the keyring:
# mkdir /etc/pve/priv/ceph
# cd /etc/ceph/
# cp ceph.client.admin.keyring /etc/pve/priv/ceph/<storage>.keyring
For our storage, we are naming it rbd.keyring. After the keyring is copied, we can attach the Ceph RBD storage to Proxmox using the GUI:

Click on Datacenter, then click on Storage from the tabbed menu.
Click on the Add drop-down menu and select the RBD storage plugin.
Enter the information as follows:
ID: rbd (the name of the storage)
Pool: rbd (the name of the Ceph pool)
Monitor Host: 172.16.0.71:6789;172.16.0.72:6789;172.16.0.73:6789 (the IP addresses and port numbers of the Ceph MONs; multiple MON hosts can be entered for redundancy)
User name: admin (the default Ceph administrator)
Nodes: All (the Proxmox nodes that will be able to use the storage)
Enable: Enabled (the checkbox for enabling/disabling the storage)
Click on Add to attach the RBD storage. The following screenshot shows the RBD storage under Summary:

Monitoring the Ceph storage
Ceph itself does not come with any GUI to manage or monitor the cluster. We can view the cluster status and perform various Ceph-related tasks through the Proxmox GUI. There are also several third-party tools that provide a Ceph-only GUI to manage and monitor the cluster. Some of them provide management features, while others provide read-only features for Ceph monitoring. Ceph Dash is one such tool that provides an appealing read-only GUI to monitor the entire Ceph cluster without logging on to the Proxmox GUI. Ceph Dash is freely available through GitHub. There are other heavyweight Ceph GUI dashboards, such as Kraken, Calamari, and others. In this section, we are only going to see how to set up the Ceph Dash cluster monitoring GUI. The following steps can be used to download and start Ceph Dash to monitor a Ceph cluster using any browser:

Log in to any Proxmox node, which is also a Ceph MON.
Run the following commands to download and start the dashboard:
# mkdir /home/tools
# apt-get install git
# cd /home/tools
# git clone https://github.com/Crapworks/ceph-dash
# cd /home/tools/ceph-dash
# ./ceph_dash.py
Ceph Dash will now start listening on port 5000 of the node. If the node is behind a firewall, open port 5000 or set up port forwarding in the firewall. Open any browser and enter <node_ip>:5000 to open the dashboard. The following screenshot shows the dashboard of the Ceph cluster we have created: We can also monitor the status of the Ceph cluster through a CLI using the following commands:

To check the Ceph status: # ceph -s
To view OSDs on different nodes: # ceph osd tree
To display real-time Ceph logs: # ceph -w
To display a list of Ceph pools: # rados lspools
To change the number of replicas of a pool: # ceph osd pool set <pool_name> size <value>

Besides the preceding commands, there are many more CLI commands to manage Ceph and perform advanced tasks. The official Ceph documentation has a wealth of information and how-to guides, along with the CLI commands to perform them. The documentation can be found at http://ceph.com/docs/master/.

How it works…
At this point, we have successfully integrated a Ceph cluster with a Proxmox cluster; the Ceph cluster comprises six OSDs, three MONs, and three nodes. By viewing the Ceph Status page, we can get a lot of information about the Ceph cluster at a quick glance. From the previous figure, we can see that there are 256 PGs in the cluster and that the total cluster storage space is 1.47 TB. A healthy cluster will have the PG status active+clean. Based on the nature of an issue, the PGs can have various states, such as active+unclean, inactive+degraded, active+stale, and so on. To learn the details of all the states, visit http://ceph.com/docs/master/rados/operations/pg-states/. By configuring a second network interface, we can separate the Ceph network from the main network. The # pveceph init command creates a Ceph configuration file at /etc/pve/ceph.conf. A newly configured Ceph configuration file looks similar to the following screenshot: Since the ceph.conf configuration file is stored in pmxcfs, any changes made to it are immediately replicated to all the Proxmox nodes in the cluster. As of Proxmox VE 3.4, Ceph RBD can only store the .raw image format. No templates, containers, or backup files can be stored on the RBD block storage. Here is the content of the storage configuration file after adding the Ceph RBD storage:
rbd: rbd
monhost 172.16.0.71:6789;172.16.0.72:6789;172.16.0.73:6789
pool rbd
content images
username admin
If a situation dictates an IP address change for any node, we can simply edit this content in the configuration file to manually change the IP addresses of the Ceph MON nodes.

See also
To learn about Ceph in greater detail, visit http://ceph.com/docs/master/ for the official Ceph documentation. Also, visit https://indico.cern.ch/event/214784/session/6/contribution/68/material/slides/0.pdf to find out why Ceph is being used at CERN to store the massive data generated by the Large Hadron Collider (LHC).

Summary
In this article, we came across different configurations for a variety of storage categories and got hands-on practice with the various stages of configuring the Ceph RBD storage.

Resources for Article:
Further resources on this subject:
Deploying New Hosts with vCenter [article]
Let's Get Started with Active Directory [article]
Basic Concepts of Proxmox Virtual Environment [article]

Reusable Grid System With SASS

Brian Hough
07 Sep 2015
6 min read
Grid systems have become an essential part of front-end web development. Whether you are building a web app or a marketing landing page, a grid system is the core of your layout. The problem I kept coming across is that grid systems are not one size fits all, so I would often have to go and find a new system for each project. This led me to look for a way to avoid that search, which in turn led me to this solution. Thanks to some of SASS's functionality, we can actually create a grid system we can reuse and customize quickly for every project.

Getting Started
We're going to start out by setting up some variables that we will need to generate our grid system:

$columnCount: 12;
$columnPadding: 20px;
$gridWidth: 100%;
$baseColumnWidth: $gridWidth / $columnCount;

$columnCount will do just what it says on the tin and set the number of columns for our grid layout. $columnPadding sets the spacing between each column as well as our outside gutters. $gridWidth sets how wide we want our layout to be. This can be set to a percentage for a fluid layout or another unit (such as px) for a fixed layout. Finally, we have $baseColumnWidth, which is a helper variable that determines the width of a single column based on the total layout width and the number of columns. We are going to finish our initial setup by adding some high-level styles:

*, *:before, *:after { box-sizing: border-box; }
img, picture { max-width: 100%; }

This will set everything on our page to use box-sizing: border-box, making it much easier to calculate our layout, since the browser will handle the math of deducting padding from our widths for us. The other thing we did was make all the images in our layout responsive, so we set max-width: 100% on our image and picture tags.

Rows
Now that we have our basic setup done, let's start crafting our actual grid system. Our first task is to create our row wrapper:

.row {
  width: $gridWidth;
  padding: 0 ( $columnPadding / 2 );

  &:after {
    content: "";
    display: table;
    clear: both;
  }
}

Here we set the width of our row to the $gridWidth value from earlier. For this example, we are using a fully fluid width of 100%, but you could also add a max-width here in order to constrain the layout on larger screens. Next, we apply our outside gutters by taking $columnPadding and dividing it in half. We do this because each column will have 10px of padding on either side of it, so that 10px plus the 10px we are adding to the outside of the row will give us our desired 20px gutter. Lastly, since we will be using floats to lay out our columns, we clear them after we close out each row. One of the features I always require of a grid system is the ability to create nested columns. This is the ability to start a new row of columns that is nested within another column. Let's modify our row styling to accommodate nesting:

.row {
  width: $gridWidth;
  padding: 0 ( $columnPadding / 2 );

  &:after {
    content: "";
    display: table;
    clear: both;
  }

  .row {
    width: auto;
    padding: 0 ( $columnPadding / -2 );
  }
}

This second .row class will handle our nesting. We set width: auto so that our nested row will fill its parent column, and to override a possible fixed width that could be inherited from the original un-nested .row class. Since this row is nested, we are now going to remove those outside gutters. We achieve this by taking our $columnPadding value and dividing it by -2, which will pull the edges of the row out to compensate for the parent column's padding. This will now let us nest to our heart's content.
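If you do want to cap the layout on larger screens, one lightweight approach is to keep the fluid grid and add a contained row variant alongside it. This is only a sketch; the 1140px cap and the class name are arbitrary assumptions:

$maxGridWidth: 1140px;

// Apply together with .row in the markup, e.g. class="row row--contained"
.row--contained {
  max-width: $maxGridWidth;
  margin: 0 auto;
}

Because the padding math still uses $columnPadding, the gutters stay consistent whether or not the cap kicks in.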
Columns Columns are the meat of our grid system. Most of the styles for our columns will be shared, so let's create that block first: [class*="column-"] { float: left; padding: 0 ( $columnPadding / 2 ); } Using a wildcard attribute selector, we target all of our column classes, floating them left and applying our familiar padding formula. If [browser compatibility] is a concern for you, you can also make this block a placeholder and @extend it from your individual column classes. Now that we have that out of the way, it's time for the real magic. To generate our individual column styles, we will use a SASS loop that iterates over $columnCount: @for $i from 1 through $columnCount { .column-#{$i} { width: ( $baseColumnWidth * $i) ; } } If you are familiar with JavaScript loops, then this code shouldn't be too foreign to you. For every column, we create a .column-x block that will span X number of columns. The #{$i} at the end of the class name prints out i to create each columns class name. We then set its width to $baseColumnWidth times the number of columns we want to span (represented by i). This will loop for the number of columns we set $columnCount. This is the core of what makes this pattern so powerful, as no matter how many or few columns we need this loop will generate all the necessary styles. This same pattern can be extended to make our grid-system even more flexible. Let's add the ability to offset columns to the left or right by making the following modifications: @for $i from 1 through $columnCount { .column-#{$i} { width: ( $baseColumnWidth * $i) ; } .prepend-#{$i} { margin-left: ( $baseColumnWidth * $i); } .append-#{$i} { margin-right: ( $baseColumnWidth * $i ); } } This creates two new blocks on each iteration that can be used to offset a row by X number of columns to the left or right. Plus, because you can have multiple loops, you can also use this pattern to create styles for different breakpoints: @media only screen and (min-width: 768px) { @for $i from 1 through $columnCount { .tablet-column-#{$i} { width: ( $baseColumnWidth * $i) ; } .tablet-prepend-#{$i} { margin-left: ( $baseColumnWidth * $i); } .tablet-append-#{$i} { margin-right: ( $baseColumnWidth * $i ); } } } Conclusion We now have a full-featured grid system that we can customize for individual use cases by adjusting just a few variables. As technologies and browser support changes, we can continue to modify this base file including support for things like flex-box and continue to use it for years to come. This has been a great addition to my toolbox, and I hope it is to yours as well. About The Author Brian is a Front-End Architect, Designer, and Product Manager at Piqora. By day, he is working to prove that the days of bad Enterprise User Experiences are a thing of the past. By night, he obsesses about ways to bring designers and developers together using technology. He blogs about his early stage startup experience at lostinpixelation.com, or you can read his general musings on twitter @b_hough.

Working with PowerShell

Packt
07 Sep 2015
17 min read
In this article, you will cover: Retrieving system information – Configuration Service cmdlets Administering hosts and machines – Host and MachineCreation cmdlets Managing additional components – StoreFront Admin and Logging cmdlets (For more resources related to this topic, see here.) Introduction With hundreds or thousands of hosts to configure and machines to deploy, configuring all the components manually could be difficult. As for the previous XenDesktop releases, and also with the XenDesktop 7.6 version, you can find an integrated set of PowerShell modules. With its use, IT technicians are able to reduce the time required to perform management tasks by the creation of PowerShell scripts, which will be used to deploy, manage, and troubleshoot at scale the greatest part of the XenDesktop components. Working with PowerShell instead of the XenDesktop GUI will give you more flexibility in terms of what kind of operations to execute, having a set of additional features to use during the infrastructure creation and configuration phases. Retrieving system information – Configuration Service cmdlets In this recipe, we will use and explain a general-purpose PowerShell cmdlet: the Configuration Service category. This is used to retrieve general configuration parameters, and to obtain information about the implementation of the XenDesktop Configuration Service. Getting ready No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (.ps1 format), you have to enable the script execution from the PowerShell prompt in the following way, using its application: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force How to do it… In this section, we will explain and execute the commands associated with the XenDesktop System and Services configuration area: Connect to one of the Desktop Broker servers, by using a remote Desktop connection, for instance. Right-click on the PowerShell icon installed on the Windows taskbar and select the Run as Administrator option. Load the Citrix PowerShell modules by typing the following command and then press the Enter key: Asnp Citrix* As an alternative to the Asnp command, you can use the Add-PSSnapin command. Retrieve the active and configured Desktop Controller features by running the following command: Get-ConfigEnabledFeature To retrieve the current status of the Config Service, run the following command. The output result will be OK in the absence of configuration issues: Get-ConfigServiceStatus To get the connection string used by the Configuration Service and to connect to the XenDesktop database, run the following command: Get-ConfigDBConnection Starting from the previously received output, it's possible to configure the connection string to let the Configuration Service use the system DB. For this command, you have to specify the Server, Initial Catalog, and Integrated Security parameters: Set-ConfigDBConnection –DBConnection"Server=<ServernameInstanceName>; Initial Catalog=<DatabaseName>; Integrated Security=<True | False>" Starting from an existing Citrix database, you can generate a SQL procedure file to use as a backup to recreate the database. 
Run the following command to complete this task, specifying the DatabaseName and ServiceGroupName parameters: Get-ConfigDBSchema -DatabaseName<DatabaseName> -ServiceGroupName<ServiceGroupName>> Path:FileName.sql You need to configure a destination database with the same name as that of the source DB, otherwise the script will fail! To retrieve information about the active Configuration Service objects (Instance, Service, and Service Group), run the following three commands respectively: Get-ConfigRegisteredServiceInstance Get-ConfigService Get-ConfigServiceGroup To test a set of operations to check the status of the Configuration Service, run the following script: #------------ Script - Configuration Service #------------ Define Variables $Server_Conn="SqlDatabaseServer.xdseven.localCITRIX,1434" $Catalog_Conn="CitrixXD7-Site-First" #------------ write-Host"XenDesktop - Configuration Service CmdLets" #---------- Clear the existing Configuration Service DB connection $Clear = Set-ConfigDBConnection -DBConnection $null Write-Host "Clearing any previous DB connection - Status: " $Clear #---------- Set the Configuration Service DB connection string $New_Conn = Set-ConfigDBConnection -DBConnection"Server=$Server_Conn; Initial Catalog=$Catalog_Conn; Integrated Security=$true" Write-Host "Configuring the DB string connection - Status: " $New_Conn $Configured_String = Get-configDBConnection Write-Host "The new configured DB connection string is: " $Configured_String You have to save this script with the .ps1 extension, in order to invoke it with PowerShell. Be sure to change the specific parameters related to your infrastructure, in order to be able to run the script. This is shown in the following screenshot: How it works... The Configuration Service cmdlets of XenDesktop PowerShell permit the managing of the Configuration Service and its related information: the Metadata for the entire XenDesktop infrastructure, the Service instances registered within the VDI architecture, and the collections of these services, called Service Groups. This set of commands offers the ability to retrieve and check the DB connection string to contact the configured XenDesktop SQL Server database. These operations are permitted by the Get-ConfigDBConnection command (to retrieve the current configuration) and the Set-ConfigDBConnection command (to configure the DB connection string); both the commands use the DB Server Name with the Instance name, DB name, and Integrated Security as information fields. In the attached script, we have regenerated a database connection string. To be sure to be able to recreate it, first of all we have cleared any existing connection, setting it to null (verify the command associated with the $Clear variable), then we have defined the $New_Conn variable, using the Set-ConfigDBConnection command; all the parameters are defined at the top of the script, in the form of variables. Use the Write-Host command to echo results on the standard output. There's more... In some cases, you may need to retrieve the state of the registered services, in order to verify their availability. You can use the Test-ConfigServiceInstanceAvailability cmdlet, retrieving whether the service is responding or not and its response time. Run the following example to test the use of this command: Get-ConfigRegisteredServiceInstance | Test-ConfigServiceInstanceAvailability | more Use the –ForceWaitForOneOfEachType parameter to stop the check for a service category, when one of its services responds. 
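As a practical illustration of the cmdlets in this recipe, the following sketch backs up the current Configuration Service connection string before replacing it, and rolls the change back if the service does not report OK afterwards. The server and catalog values are placeholders, and the assumption that these cmdlets return simple status/string output follows the script shown earlier:

# Back up the current DB connection string before changing it
$oldConn = Get-ConfigDBConnection
$newConn = "Server=SqlDatabaseServer.xdseven.local\CITRIX,1434; Initial Catalog=CitrixXD7-Site-First; Integrated Security=True"

Set-ConfigDBConnection -DBConnection $null | Out-Null    # clear the existing connection first
Set-ConfigDBConnection -DBConnection $newConn | Out-Null

$status = Get-ConfigServiceStatus
if ("$status" -ne "OK") {
    Write-Host "New connection string rejected - restoring the previous one"
    Set-ConfigDBConnection -DBConnection $oldConn | Out-Null
}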
Administering hosts and machines – Host and MachineCreation cmdlets In this recipe, we will describe how to create the connection between the Hypervisor and the XenDesktop servers, and the way to generate machines to assign to the end users, all by using Citrix PowerShell. Getting ready No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be sure to be able to run a PowerShell script (the.ps1 format), you have to enable the script execution from the PowerShell prompt in this way: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force How to do it… In this section, we will discuss the PowerShell commands used to connect XenDesktop with the supported hypervisors plus the creation of the machines from the command line: Connect to one of the Desktop Broker servers. Click on the PowerShell icon installed on the Windows taskbar. Load the Citrix PowerShell modules by typing the following command, and then press the Enter key: Asnp Citrix* To list the available Hypervisor types, execute this task: Get-HypHypervisorPlugin –AdminAddress<BrokerAddress> To list the configured properties for the XenDesktop root-level location (XDHyp:), execute the following command: Get-ChildItemXDHyp:HostingUnits Please refer to the PSPath, Storage, and PersonalvDiskStorage output fields to retrieve information on the storage configuration. Execute the following cmdlet to add a storage resource to the XenDesktop Controller host: Add-HypHostingUnitStorage –LiteralPath<HostPathLocation> -StoragePath<StoragePath> -StorageType<OSStorage|PersonalvDiskStorage> - AdminAddress<BrokerAddress> To generate a snapshot for an existing VM, perform the following task: New-HypVMSnapshot –LiteralPath<HostPathLocation> -SnapshotDescription<Description> Use the Get-HypVMMacAddress -LiteralPath<HostPathLocation> command to list the MAC address of specified desktop VMs. To provision machine instances starting from the Desktop base image template, run the following command: New-ProvScheme –ProvisioningSchemeName<SchemeName> -HostingUnitName<HypervisorServer> -IdentityPoolName<PoolName> -MasterImageVM<BaseImageTemplatePath> -VMMemoryMB<MemoryAssigned> -VMCpuCount<NumberofCPU> To specify the creation of instances with the Personal vDisk technology, use the following option: -UsePersonalVDiskStorage. After the creation process, retrieve the provisioning scheme information by running the following command: Get-ProvScheme –ProvisioningSchemeName<SchemeName> To modify the resources assigned to desktop instances in a provisioning scheme, use the Set-ProvScheme cmdlet. The permitted parameters are –ProvisioningSchemeName, -VMCpuCount, and –VMMemoryMB. To update the desktop instances to the latest version of the Desktop base image template, run the following cmdlet: Publish-ProvMasterVmImage –ProvisioningSchemeName<SchemeName> -MasterImageVM<BaseImageTemplatePath> If you do not want to maintain the pre-update instance version to use as a restore checkpoint, use the –DoNotStoreOldImage option. To create machine instances, based on the previously configured provisioning scheme for an MCS architecture, run this command: New-ProvVM –ProvisioningSchemeName<SchemeName> -ADAccountName"DomainMachineAccount" Use the -FastBuild option to make the machine creation process faster. On the other hand, you cannot start up the machines until the process has been completed. 
Retrieve the configured desktop instances by using the next cmdlet: Get-ProvVM –ProvisioningSchemeName<SchemeName> -VMName<MachineName> To remove an existing virtual desktop, use the following command: Remove-ProvVM –ProvisioningSchemeName<SchemeName> -VMName<MachineName> -AdminAddress<BrokerAddress> The next script will combine the use of part of the commands listed in this recipe: #------------ Script - Hosting + MCS #----------------------------------- #------------ Define Variables $LitPath = "XDHyp:HostingUnitsVMware01" $StorPath = "XDHyp:HostingUnitsVMware01datastore1.storage" $Controller_Address="192.168.110.30" $HostUnitName = "Vmware01" $IDPool = $(Get-AcctIdentityPool -IdentityPoolName VDI-DESKTOP) $BaseVMPath = "XDHyp:HostingUnitsVMware01VMXD7-W8MCS-01.vm" #------------ Creating a storage location Add-HypHostingUnitStorage –LiteralPath $LitPath -StoragePath $StorPath -StorageTypeOSStorage -AdminAddress $Controller_Address #---------- Creating a Provisioning Scheme New-ProvScheme –ProvisioningSchemeName Deploy_01 -HostingUnitName $HostUnitName -IdentityPoolName $IDPool.IdentityPoolName -MasterImageVM $BaseVMPathT0-Post.snapshot -VMMemoryMB 4096 -VMCpuCount 2 -CleanOnBoot #---------- List the VM configured on the Hypervisor Host dir $LitPath*.vm exit How it works... The Host and MachineCreation cmdlet groups manage the interfacing with the Hypervisor hosts, in terms of machines and storage resources. This allows you to create the desktop instances to assign to the end user, starting from an existing and mapped Desktop virtual machine. The Get-HypHypervisorPlugin command retrieves and lists the available hypervisors to use to deploy virtual desktops and to configure the storage types. You need to configure an operating system storage area or a Personal vDisk storage zone. The way to map an existing storage location from the Hypervisor to the XenDesktop controller is by running the Add-HypHostingUnitStorage cmdlet. In this case you have to specify the destination path on which the storage object will be created (LiteralPath), the source storage path on the Hypervisor machine(s) (StoragePath), and the StorageType previously discussed. The storage types are in the form of XDHyp:HostingUnits<UnitName>. To list all the configured storage objects, execute the following command: dirXDHyp:HostingUnits<UnitName> *.storage After configuring the storage area, we have discussed the Machine Creation Service (MCS) architecture. In this cmdlets collection, we have the availability of commands to generate VM snapshots from which we can deploy desktop instances (New-HypVMSnapshot), and specify a name and a description for the generated disk snapshot. Starting from the available disk image, the New-ProvScheme command permits you to create a resource provisioning scheme, on which to specify the desktop base image, and the resources to assign to the desktop instances (in terms of CPU and RAM -VMCpuCount and –VMMemoryMB), and if generating these virtual desktops in a non-persistent mode (-CleanOnBoot option), with or without the use of the Personal vDisk technology (-UsePersonalVDiskStorage). It's possible to update the deployed instances to the latest base image update through the use of the Publish-ProvMasterVmImage command. 
In the generated script, we have located all the main storage locations (the LitPath and StorPath variables) that are needed to realize a provisioning scheme, and then we have implemented a provisioning procedure for a desktop based on an existing base image snapshot, with two vCPUs and 4 GB of RAM for the delivered instances, which will be cleaned every time they stop and start (by using the -CleanOnBoot option). You can navigate the local and remote storage paths configured with the XenDesktop Broker machine; to list an object category (such as VM or Snapshot), you can execute this command: dir XDHyp:\HostingUnits\<UnitName>\*.<category>

There's more...
The discussed cmdlets also offer you a technique to protect a virtual desktop from accidental deletion or unauthorized use. With the Machine Creation cmdlets group, you have the ability to use a particular command, which allows you to lock critical desktops: Lock-ProvVM. This cmdlet requires as parameters the name of the scheme to which the desktops refer (-ProvisioningSchemeName) and the ID of the virtual desktop to lock (-VMID). You can retrieve the Virtual Machine ID by running the Get-ProvVM command discussed previously. To revert the machine lock, and free the desktop instance from accidental deletion or improper use, you have to execute the Unlock-ProvVM cmdlet, using the same parameters shown for the lock procedure.

Managing additional components – StoreFront admin and logging cmdlets
In this recipe, we will use and explain how to manage and configure the StoreFront component by using the available Citrix PowerShell cmdlets. Moreover, we will explain how to manage and check the configurations for the system logging activities.

Getting ready
No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (in the .ps1 format), you have to enable the script execution from the PowerShell prompt in this way: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…
In this section, we will explain and execute the commands associated with the Citrix StoreFront system:

Connect to one of the Desktop Broker servers.
Click on the PowerShell icon installed on the Windows taskbar.
Load the Citrix PowerShell modules by typing the following command, and then press the Enter key: Asnp Citrix*
To execute a command, you have to press the Enter button after completing the right command syntax.
Retrieve the currently existing StoreFront service instances by running the following command: Get-SfService
To limit the number of rows in the output, you can add the -MaxRecordCount <value> parameter.
To list detailed information about the StoreFront service(s) currently configured, execute the following command: Get-SfServiceInstance -AdminAddress <ControllerAddress>
The status of the currently active StoreFront instances can be retrieved by using the Get-SfServiceStatus command. The OK output will confirm the correct service execution.
To list the task history associated with the configured StoreFront instances, you have to run the following command: Get-SfTask
You can filter the desired information by the ID of the researched task (-taskid) and sort the results with the -sortby parameter.
To retrieve the installed database schema versions, you can execute the following command: Get-SfInstalledDBVersion By applying the –Upgrade and –Downgrade filters, you will receive respectively the schemas for which the database version can be updated or reverted to a previous compatible one. To modify the StoreFront configurations to register its state on a different database, you can use the following command: Set-SfDBConnection-DBConnection<DBConnectionString> -AdminAddress<ControllerAddress> Be careful when you specify the database connection string; if not specified, the existing database connections and configurations will be cleared! To check that the database connection has been correctly configured, the following command is available: Test-SfDBConnection-DBConnection<DBConnectionString>-AdminAddress<ControllerAddress> The second discussed cmdlets allows the logging group to retrieve information about the current status of the logging service and run the following command: Get-LogServiceStatus To verify the language used and whether the logging service has been enabled, run the following command: Get-LogSite The available configurable locales are en, ja, zh-CN, de, es, and fr. The available states are Enabled, Disabled, NotSupported, and Mandatory. The NotSupported state will show you an incorrect configuration for the listed parameters. To retrieve detailed information about the running logging service, you have to use the following command: Get-LogService As discussed earlier for the StoreFront commands, you can filter the output by applying the –MaxRecordCount<value> parameter. In order to get all the operations logged within a specified time range, run the following command; this will return the global operations count: Get-LogSummary –StartDateRange<StartDate>-EndDateRange<EndDate> The date format must be the following: AAAA-MM-GGHH:MM:SS. To list the collected operations per day in the specified time period, run the previous command in the following way: Get-LogSummary –StartDateRange<StartDate> -EndDateRange<EndDate>-intervalSeconds 86400 The value 86400 is the number of seconds that are present in a day. To retrieve the connection string information about the database on which logging data is stored, execute the following command: Get-LogDataStore To retrieve detailed information about the high level operations performed on the XenDesktop infrastructure, you have to run the following command: Get-LogHighLevelOperation –Text <TextincludedintheOperation> -StartTime<FormattedDateandTime> -EndTime<FormattedDateandTime>-IsSuccessful<true | false>-User <DomainUserName>-OperationType<AdminActivity | ConfigurationChange> The indicated filters are not mandatory. If you do not apply any filters, all the logged operations will be returned. This could be a very long output. The same information can be retrieved for the low level system operations in the following way: Get-LogLowLevelOperation-StartTime<FormattedDateandTime> -EndTime<FormattedDateandTime> -IsSuccessful<true | false>-User <DomainUserName> -OperationType<AdminActivity | ConfigurationChange> In the How it works section we will explain the difference between the high and low level operations. 
To log when a high level operation starts and stops respectively, use the following two commands: Start-LogHighLevelOperation –Text <OperationDescriptionText>- Source <OperationSource> -StartTime<FormattedDateandTime> -OperationType<AdminActivity | ConfigurationChange> Stop-LogHighLevelOperation –HighLevelOperationId<OperationID> -IsSuccessful<true | false> The Stop-LogHighLevelOperation must be related to an existing start high level operation, because they are related tasks. How it works... Here, we have discussed two new PowerShell command collections for the XenDesktop 7 versions: the cmdlet related to the StoreFront platform; and the activities Logging set of commands. The first collection is quite limited in terms of operations, despite the other discussed cmdlets. In fact, the only actions permitted with the StoreFront PowerShell set of commands are retrieving configurations and settings about the configured stores and the linked database. More activities can be performed regarding the modification of existing StoreFront clusters, by using the Get-SfCluster, Add-SfServerToCluster, New-SfCluster, and Set-SfCluster set of operations. More interesting is the PowerShell Logging collection. In this case, you can retrieve all the system-logged data, putting it into two principal categories: High-level operations: These tasks group all the system configuration changes that are executed by using the Desktop Studio, the Desktop Director, or Citrix PowerShell. Low-level operations: This category is related to all the system configuration changes that are executed by a service and not by using the system software's consoles. With the low level operations command, you can filter for a specific high level operation to which the low level refers, by specifying the -HighLevelOperationId parameter. This cmdlet category also gives you the ability to track the start and stop of a high level operation, by the use of Start-LogHighLevelOperation and Stop-LogHighLevelOperation. In this second case, you have to specify the previously started high level operation. There's more... In case of too much information in the log store, you have the ability to clear all of it. To refresh all the log entries, we use the following command: Remove-LogOperation -UserName<DBAdministrativeCredentials> -Password <DBUserPassword>-StartDateRange <StartDate> -EndDateRange <EndDate> The not encrypted –Password parameter can be substituted by –SecurePassword, the password indicated in secure string form. The credentials must be database administrative credentials, with deleting permissions on the destination database. This is a not reversible operation, so ensure that you want to delete the logs in the specified time range, or verify that you have some form of data backup. Resources for Article: Further resources on this subject: Working with Virtual Machines [article] Virtualization [article] Upgrading VMware Virtual Infrastructure Setups [article]

My First Puppet Module

Packt
07 Sep 2015
15 min read
In this article by Jussi Heinonen, the author of Learning Puppet, we will get started with creating a Puppet module and the various aspects associated with it. Together with all the manifest files that we created so far, there are several of them already, and we haven't yet started to develop Puppet manifests. As the number of manifests expand, one may start wondering how files can be distributed and applied efficiently across multiple systems. This article will introduce you to Puppet modules and show you how to prepare a simple web server environment with Puppet. (For more resources related to this topic, see here.) Introducing the Puppet module The Puppet module is a collection of code and data that usually solves a particular problem, such as the installation and configuration of a web server. A module is packaged and distributed in the TAR (tape archive) format. When a module is installed, Puppet extracts the archive file on the disk, and the output of the installation process is a module directory that contains Puppet manifests (code), static files (data), and template files (code and data). Static files are typically some kind of configuration files that we want to distribute across all the nodes in the cluster. For example, if we want to ensure that all the nodes in the cluster are using the same DNS server configuration, we can include the /etc/resolv.conf file in the module and tell Puppet to apply it across all the nodes. This is just an example of how static files are used in Puppet and not a recommendation for how to configure DNS servers. Like static files, template files can also be used to provide configuration. The difference between a static and template file is that a static file will always have the same static content when applied across multiple nodes, whereas the template file can be customized based on the unique characteristics of a node. A good example of a unique characteristic is an IP address. Each node (or a host) in the network must have a unique IP address. Using the template file, we can easily customize the configuration on every node, wherever the template is applied. It's a good practice to keep the manifest files short and clean to make them easy to read and quick to debug. When I write manifests, I aim to keep the length of the manifest file in less than a hundred lines. If the manifest length exceeds 100 lines, then this means that I may have over-engineered the process a little bit. If I can't simplify the manifest to reduce the number of lines, then I have to split the manifest into multiple smaller manifest files and store these files within a Puppet module. The Puppet module structure The easiest way to get familiar with a module structure is to create an empty module with the puppet module generate command. As we are in the process of building a web server that runs a web application, we should give our module a meaningful name, such as learning-webapp. The Puppet module name format Before we create our first module, let's take a quick look at the Puppet module naming convention. The Puppet module name is typically in the format of <author>-<modulename>. A module name must contain one hyphen character (no more, no less) that separates the <author> and the <modulename> names. In the case of our learning-webapp module that we will soon create, the author is called learning and the module name is webapp, thus the module name learning-webapp. 
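As a side note on the static and template files mentioned earlier, here is a minimal sketch of how each is typically referenced from a class. The webapp::dns class name and the resolv.conf example are illustrative assumptions, not part of the module we are about to generate:

class webapp::dns {
  # Static approach: ship files/resolv.conf from the module unchanged
  file { '/etc/resolv.conf':
    ensure => file,
    source => 'puppet:///modules/webapp/resolv.conf',
    # Template alternative: render templates/resolv.conf.erb per node instead:
    # content => template('webapp/resolv.conf.erb'),
  }
}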
Generating a Puppet module Let's take a look at the following steps to create the learning-webapp Puppet module: Start the puppet-agent virtual machine. Using the cd command, navigate to the directory that is shared via the shared folder. On my virtual machine, my shared folder appears as /media/sf_learning, and I can move to the directory by running the following command: # cd /media/sf_learning Then, I'll create an empty puppet module with the command puppet module generate learning-webapp --skip-interview and the command returns a list of files and directories that the module contains: # puppet module generate learning-webapp --skip-interview Notice: Generating module at /media/sf_learning/learning-webapp Notice: Populating templates... Finished; module generated in learning-webapp. learning-webapp/metadata.json learning-webapp/Rakefile learning-webapp/manifests learning-webapp/manifests/init.pp learning-webapp/spec learning-webapp/spec/spec_helper.rb learning-webapp/spec/classes learning-webapp/spec/classes/init_spec.rb learning-webapp/Gemfile learning-webapp/tests learning-webapp/tests/init.pp learning-webapp/README.md To get a better view of how the files in the directories are organized in the learning-webapp module, you can run the tree learning-webapp command, and this command will produce the following tree structure of the files:   Here, we have a very simple Puppet module structure. Let's take a look at the files and directories inside the module in more detail: Gemfile: A file used for describing the Ruby package dependencies that are used for unit testing. For more information on Gemfile, visit http://bundler.io/v1.3/man/gemfile.5.html. manifests: A directory for all the Puppet manifest files in the module. manifests/init.pp: A default manifest file that declares the main Puppet class called webapp. metadata.json: A file that contains the module metadata, such as the name, version, and module dependencies. README.md: A file that contains information about the usage of the module. Spec: An optional directory for automated tests. Tests: A directory that contains examples that show how to call classes that are stored in the manifests directory. tests/init.pp: A file containing an example how to call the main class webapp in file manifests/init.pp. A Puppet class A Puppet class is a container for Puppet resources. A class typically includes references to multiple different types of resources and can also reference other Puppet classes. The syntax for declaring a Puppet class is not that different from declaring Puppet resources. A class definition begins with the keyword class, followed by the name of the class (unquoted) and an opening curly brace ({). A class definition ends with a closing curly brace (}). Here is a generic syntax of the Puppet class: class classname { } Let's take a look at the manifests/init.pp file that you just created with the puppet module generate command. Inside the file, you will find an empty Puppet class called webapp. You can view the contents of the manifests/init.pp file using the following command: # cat /media/sf_learning/learning-webapp/manifests/init.pp The init.pp file mostly contains the comment lines, which are prefixed with the # sign, and these lines can be ignored. At the end of the file, you can find the following declaration for the webapp class: class webapp { } The webapp class is a Puppet class that does nothing as it has no resources declared inside it. 
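To make the idea of a class as a container for resources more concrete before we continue, here is a hedged sketch of what a fuller webapp class could eventually contain, using the common package/config/service pattern. The httpd package name and the file content are illustrative assumptions and not yet part of the module we are building:

class webapp {
  package { 'httpd':
    ensure => installed,
  }

  file { '/var/www/html/index.html':
    ensure  => file,
    content => "It works!\n",
    require => Package['httpd'],
  }

  service { 'httpd':
    ensure  => running,
    enable  => true,
    require => Package['httpd'],
  }
}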
Resources inside the Puppet class Let's add a notify resource to the webapp class in the manifests/init.pp file before we go ahead and apply the class. The notify resource does not manage any operating system resources, such as files or users, but instead, it allows Puppet to report a message when a resource is processed. As the webapp module was created inside shared folders, you no longer have to use the Nano editor inside the virtual machine to edit manifests. Instead, you can use a graphical text editor, such as a Notepad on Windows or Gedit on the Linux host. This should make the process of editing manifests a bit easier and more user friendly. The directory that I shared on the host computer is /home/jussi/learning. When I take a look inside this directory, I can find a subdirectory called learning-webapp, which is the Puppet module directory that we created a moment ago. Inside this, there is a directory called manifests, which contains the init.pp file. Open the init.pp file in the text editor on the host computer and scroll down the file until you find the webapp class code block that looks like the following: class webapp { } If you prefer to carry on using the Nano editor to edit manifest files (I salute you!), you can open the init.pp file inside the virtual machine with the nano /media/sf_learning/learning-webapp/manifests/init.pp command. The notify resource that we are adding must be added inside the curly braces that begins and ends the class statement; otherwise, the resource will not be processed when we apply the class. Now we can add a simple notify resource that makes the webapp class look like the following when completed: class webapp { notify { 'Applying class webapp': } } Let's take a look at the preceding lines one by one: Line 1 begins with the webapp class, followed by the opening curly brace. Line 2 declares a notify resource and a new opening curly brace, followed by the resource name. The name of the notify resource will become the message that Puppet prints on the screen when the resource from a class is processed. Line 3 closes the notify resource statement. Line 4 indicates that the webapp class finishes here. Once you have added the notify resource to the webapp class, save the init.pp file. Rename the module directory Before we can apply our webapp class, we must rename our module directory. It is unclear to me as to why the puppet module generate command creates a directory name that contains a hyphen character (as in learning-webapp). The hyphen character is not allowed to be present in the Puppet module directory name. For this reason, we must rename the learning-webapp directory before we can apply the webapp class inside it. As the learning-webapp module directory lives in the shared folders, you can either use your preferred file manager program to rename the directory, or you can run the following two commands inside the Puppet Learning VM to change the directory name from learning-webapp to webapp: # cd /media/sf_learning # mv learning-webapp webapp Your module directory name should now be webapp, and we can move on to apply the webapp class inside the module and see what happens. Applying a Puppet class You can try running the puppet apply webapp/manifests/init.pp command but don't be disappointed when nothing happens. Why is that? The reason is because there is nothing inside the init.pp file that references the webapp class. 
If you are familiar with object-oriented programming, you may know that a class must be instantiated in order to get services from it. In this case, Puppet behaves in a similar way to object-oriented programming languages, as you must make a reference to the class in order to tell Puppet to process the class. Puppet has an include keyword that is used to reference a class. The include keyword in Puppet is only available for class resources, and it cannot be used in conjunction with any other type of Puppet resources. To apply the webapp class, we can make use of the init.pp file under the tests directory that was created when the module was generated. If you take a look inside the tests/init.pp file, you will find a line include webapp. The tests/init.pp file is the one that we should use to apply the webapp class. Here are the steps on how to apply the webapp class inside the Puppet Learning VM: Go to the parent directory of the webapp module: # cd /media/sf_learning Apply the webapp class that is included in the tests/init.pp file: # puppet apply --modulepath=./ webapp/tests/init.pp When the class is applied successfully, you should see the notify resource that was added to the webapp class that appears on lines 2 and 3 in the following Puppet report: Notice: Compiled catalog for web.development.vm in environment production in 0.05 seconds Notice: Applying class webapp Notice: /Stage[main]/Webapp/Notify[Applying class webapp]/message: defined 'message' as 'Applying class webapp' Notice: Finished catalog run in 0.81 seconds Let's take a step back and look again at the command that we used to apply to the webapp class: # puppet apply --modulepath=./ webapp/tests/init.pp The command can be broken down into three elements: puppet apply: The puppet apply command is used when applying a manifest from the command line. modulepath=./: This option is used to tell Puppet what filesystem path to use to look for the webapp module. The ./ (dot forward slash) notation means that we want our current /media/sf_learning working directory to be used as the modulepath value. webapp/tests/init.pp: This is the file that the puppet apply command should read. Installing a module from Puppet Forge Puppet Forge is a public Puppet module repository (https://forge.puppetlabs.com) for modules that are created by the community around Puppet. Making use of the modules in Puppet Forge is a great way to build a software stack quickly, without having to write all the manifests yourself from scratch. The web server that we are going to install is a highly popular Apache HTTP Server (http://httpd.apache.org/), and there is a module in Puppet Forge called puppetlabs-apache that we can install. The Puppetlabs-apache module provides all the necessary Puppet resources for the Apache HTTP Server installation. Note that the puppet module installation requires an Internet connection. To test whether the Puppet Learning VM can connect to the Internet, run the following command on the command line: # host www.google.com On successful completion, the command will return the following output: www.google.com has address 216.58.211.164www.google.com has IPv6 address 2a00:1450:400b:801::2004 Note that the reported IP address may vary. As long as the host command returns www.google.com has address …, the Internet connection works. Now that the Internet connection has been tested, you can now proceed with the module installation. 
Before we install the puppetlabs-apache module, let's do a quick search to confirm that the module is available in Puppet Forge. The following command will search for the puppetlabs-apache module: # puppet module search puppetlabs-apache When the search is successful, it returns the following results:   Then, we can install the module. Follow these steps to install the puppetlabs-apache module: In the Puppet Learning VM, go to the shared folders /media/sf_learning directory by running the cd /media/sf_learning command. Then, run the following command: # puppet module install --modulepath=./ puppetlabs-apache The --modulepath=./ option specifies that the module should be installed in the current /media/sf_learning working directory The installation will take a couple of minutes to complete, and once it is complete, you will see the following lines appear on the screen: Notice: Preparing to install into /media/sf_learning ... Notice: Preparing to install into /media/sf_learning ... Notice: Downloading from https://forgeapi.puppetlabs.com ... Notice: Installing -- do not interrupt ... /media/sf_learning └─┬ puppetlabs-apache (v1.2.0) ├── puppetlabs-concat (v1.1.2) └── puppetlabs-stdlib (v4.8.0) Let's take a look at the output line by line to fully understand what happened during the installation process: Line 1 tells us that the module is going to be installed in the /media/sf_learning directory, which is our current working directory. This directory was specified with the --modulepath=./ option in the puppet module install command. Line 2 says that the module is going to be installed from https://forgeapi.puppetlabs.com/, which is the address for Puppet Forge. Line 3 is fairly self-explanatory and indicates that the installation process is running. Lines 4 and 5 tell us that the puppetlabs-apache module was installed in the current /media/sf_learning working directory. Line 6 indicates that as part of the puppetlabs-apache module installation, a puppetlabs-concat dependency module was also installed. Line 7 lists another dependency module called puppetlabs-stdlib that got installed in the process. Now you can run the tree -L 1 command to see what new directories got created in /media/sf_learning as a result of the puppet module install command: # tree -L 1 ├── apache ├── concat ├── stdlib └── webapp 4 directories, 0 files The argument -L 1 in the tree command specifies that it should only traverse one level of directory hierarchy. Installing Apache HTTP Server Now that the puppetlabs-apache module is installed in the filesystem, we can proceed with the Apache HTTP Server installation. Earlier, we talked about how a Puppet class can be referenced with the include keyword. Let's see how this works in practice by adding the include apache statement to our webapp class, and then applying the webapp class from the command line. Open the webapp/manifests/init.pp file in your preferred text editor, and add the include apache statement inside the webapp class. I like to place the include statements at the beginning of the class before any resource statement. In my text editor, the webapp class looks like the following after the include statement has been added to it:   Once you have saved the webapp/manifests/init.pp file, you can apply the webapp class with the following command: # puppet apply --modulepath=./ webapp/tests/init.pp This time, the command output is much longer compared to what it was when we applied the webapp class for the first time. 
In fact, the output is too long to be included in full, so I'm only going to show you the last two lines of the Puppet report, which show you the step where the state of the Service[httpd] resource changed from stopped to running:

Notice: /Stage[main]/Apache::Service/Service[httpd]/ensure: ensure changed 'stopped' to 'running'
Notice: Finished catalog run in 65.20 seconds

Summary

So we have now come to the end of this article. I hope you found the content useful and not too challenging. One of the key deliverables of this article was to experiment with Puppet modules and learn how to create your own module.
Storage Policy-based Management

Packt
07 Sep 2015
10 min read
In this article by Jeffery Taylor, the author of the book VMware Virtual SAN Cookbook, we will see that now that we have a functional VSAN cluster, we can leverage the power of Storage Policy-based Management (SPBM) to control how we deploy our virtual machines (VMs). We will discuss the following topics, with a recipe for each:

Creating storage policies
Applying storage policies to a new VM or a VM deployed from a template
Applying storage policies to an existing VM migrating to VSAN
Viewing a VM's storage policies and object distribution
Changing storage policies on a VM already residing in VSAN
Modifying existing storage policies

(For more resources related to this topic, see here.)

Introduction

SPBM is where the administrative power of converged infrastructure becomes apparent. You can define VM thick provisioning on a sliding scale, define how fault tolerant the VM's storage should be, make distribution and performance decisions, and more. RAID-type decisions for VMs resident on VSAN are also driven through the use of SPBM. VSAN can provide RAID-1 (mirrored) and RAID-0 (striped) configurations, or a combination of the two in the form of RAID-10 (a mirrored set of stripes). All of this is done on a per-VM basis. As the storage and compute infrastructures are now converged, you can define how you want a VM to run in the most logical place: at the VM level or its disks. The need for datastore-centric configuration, storage tiering, and so on is obviated and made redundant through the power of SPBM.

Technically, the configuration of storage policies is optional. If you choose not to define any storage policies, VSAN will create VMs and disks according to its default cluster-wide storage policy. While this will provide for generic levels of fault tolerance and performance, it is strongly recommended to create and apply storage policies according to your administrative need. Much of the power given to you through a converged infrastructure and VSAN is in the policy-driven and VM-centric nature of policy-based management. While some of these options will be discussed throughout the following recipes, it is strongly recommended that you review the storage-policy appendix to familiarize yourself with all the storage-policy options prior to continuing.

Creating VM storage policies

Before a storage policy can be applied, it must be created. Once created, the storage policy can be applied to any part of any VM resident on VSAN-connected storage. You will probably want to create a number of storage policies to suit your production needs. Once created, all storage policies are tracked by vCenter and enforced/maintained by VSAN itself. Because of this, your policy selections remain valid and production continues even in the event of a vCenter outage.

In the example policy that we will create in this recipe, the VM policy will be defined as tolerating the failure of a single VSAN host. The VM will not be required to stripe across multiple disks, and it will be 30 percent thick-provisioned.

Getting ready

Your VSAN should be deployed and functional as per the previous article. You should be logged in to the vSphere Web Client as an administrator or as a user with rights to create, modify, apply, and delete storage policies.

How to do it…

From the vSphere 5.5 Web Client, navigate to Home | VM Storage Policies. From the vSphere 6.0 Web Client, navigate to Home | Policies and Profiles | VM Storage Policies. Initially, there will be no storage policies defined unless you have already created storage policies for other solutions.
This is normal. In VSAN 6.0, you will have the VSAN default policy defined here prior to the creation of your own policies. Click the Create a new VM storage policy button:   A wizard will launch to guide you through the process. If you have multiple vCenter Server systems in linked-mode, ensure that you have selected the appropriate vCenter Server system from the drop-down menu. Give your storage policy a name that will be useful to you and a description of what the policy does. Then, click Next:   The next page describes the concept of rule sets and requires no interaction. Click the Next button to proceed. When creating the rule set, ensure that you select VSAN from the Rules based on vendor-specific capabilities drop-down menu. This will expose the <Add capability> button. Select Number of failures to tolerate from the drop-down menu and specify a value of 1:   Add other capabilities as desired. For this example, we will specify a single stripe with 30% space reservation. Once all required policy definitions have been applied, click Next:   The next page will tell you which datastores are compatible with the storage policy you have created. As this storage policy is based on specific capabilities exposed by VSAN, only your VSAN datastore will appear as a matching resource. Verify that the VSAN datastore appears, and then click Next. Review the summary page and ensure that the policy is being created on the basis of your specifications. When finished, click Finish. The policy will be created. Depending on the speed of your system, this operation should be nearly instantaneous but may take several seconds to finish. How it works… The VSAN-specific policy definitions are presented through VMware Profile-Driven Storage service, which runs with vCenter Server. Profile-Driven Storage Service determines which policy definitions are available by communicating with the ESXi hosts that are enabled for VSAN. Once VSAN is enabled, each host activates a VASA provider daemon, which is responsible for communicating policy options to and receiving policy instructions from Profile-Driven Storage Service. There's more… The nature of the storage policy definitions enabled by VSAN is additive. No policy option mutually excludes any other, and they can be combined in any way that your policy requirements demand. For example, specifying a number of failures to tolerate will not preclude the specification cache reservation. See also For a full explanation of all policy options and when you might want to use them Applying storage policies to a new VM or a VM deployed from a template When creating a new VM on VSAN, you will want to apply a storage policy to that VM according to your administrative needs. As VSAN is fully integrated into vSphere and vCenter, this is a straightforward option during the normal VM deployment wizard. The workflow described in this recipe is for creating a new VM on VSAN. If deployed from a template, the wizard process is functionally identical from step 4 of the How to do it… section in this recipe. Getting ready You should be logged into vSphere Web Client as an administrator or a user authorized to create virtual machines. You should have at least one storage policy defined (see previous recipe). How to do it… Navigate to Home | Hosts and Clusters | Datacenter | Cluster. Right-click the cluster, and then select New Virtual Machine…:   In the subsequent screen, select Create a new virtual machine. Proceed through the wizard through Step 2b. 
For the compute resource, ensure that you select your VSAN cluster or one of its hosts:   On the next step, select one of the VM storage policies that you created in the previous recipe. Once you select a VSAN storage policy, only the VSAN datastore will appear as compatible. Any other datastores that you have present will be ineligible for selection:   Complete the rest of the VM-deployment wizard as you normally would to select the guest OS, resources, and so on. Once completed, the VM will deploy and it will populate in the inventory tree on the left side. The VM summary will reflect that the VM resides on the VSAN storage:   How it works… All VMs resident on the VSAN storage will have a storage policy applied. Selecting the appropriate policy during VM creation means that the VM will be how you want it to be from the beginning of the VM's life. While policies can be changed later, this could involve a reconfiguration of the object, which can take time to complete and can result in increased disk and network traffic once it is initiated. Careful decision making during deployment can help you save time later. Applying storage policies to an existing VM migrating to VSAN When introducing VSAN into an existing infrastructure, you may have existing VMs that reside on the external storage, such as NFS, iSCSI, or Fibre Channel (FC). When the time comes to move these VMs into your converged infrastructure and VSAN, we will have to make policy decisions about how these VMs should be handled. Getting ready You should be logged into vSphere Web Client as an administrator or a user authorized to create, migrate, and modify VMs. How to do it… Navigate to Home | Hosts and Clusters | Datacenter | Cluster. Identify the VM that you wish to migrate to VSAN. For the example used in this recipe, we will migrate the VM called linux-vm02 that resides on NFS Datastore. Right-click the VM and select Migrate… from the context menu:   In the resulting page, select Change datastore or Change both host and datastore as applicable, and then click Next. If the VM does not already reside on one of your VSAN-enabled hosts, you must choose the Change both host and datastore option for your migration. In the next step, select one of the VM storage policies that you created in the previous recipe. Once you select a VSAN storage policy, only the VSAN datastore will appear as compatible. Any other datastores that you have present will be ineligible for selection:   You can apply different storage policies to different VM disks. This can be done by performing the following steps: Click on the Advanced >> button to reveal various parts of the VM: Once clicked, the Advanced >> button will change to << Basic.   In the Storage column, click the existing datastore to reveal a drop-down menu. Click Browse. In the subsequent window, select the desired policy from the VM Storage Policy drop-down menu. You will find that the only compatible datastore is your VSAN datastore. Click OK:   Repeat the preceding step as needed for other disks and the VM configuration file. After performing the preceding steps, click on Next. Review your selection on the final page, and then click Finish. Migrations can potentially take a long time, depending on how large the VM is, the speed of the network, and other considerations. Please monitor the progress of your VM relocation tasks using the Recent Tasks pane:   Once the migration task finishes, the VM's Summary tab will reflect that the datastore is now the VSAN datastore. 
For the example of this VM, the VM moved from NFS Datastore to vsanDatastore:   How it works… Much like the new VM workflow, we select the storage policy that we want to use during the migration of the VM to VSAN. However, unlike the deploy-from-template or VM-creation workflows, this process requires none of the VM configuration steps. We only have to select the storage policy, and then SPBM instructs VSAN how to place and distribute the objects. All object-distribution activities are completely transparent and automatic. This process can be used to change the storage policy of a VM already resident in the VSAN cluster, but it is more cumbersome than modifying the policies by other means. Summary In this article, we learned that storage policies give you granular control over how the data for any given VM or VM disk is handled. Storage policies allow you to define how many mirrors (RAID-1) and how many stripes (RAID-0) are associated with any given VM or VM disk. Resources for Article: Further resources on this subject: Working with Virtual Machines [article] Virtualization [article] Upgrading VMware Virtual Infrastructure Setups [article]
How to Optimize PUT Requests

Packt
04 Sep 2015
8 min read
In this article by Naoya Hashimoto, the author of the book Amazon S3 Cookbook, we will see how to optimize PUT requests. It is effective to use multipart uploads because they can aggregate throughput by parallelizing PUT requests and uploading a large object in parts. It is recommended that the size of each part be between 25 and 50 MB on higher-bandwidth networks and around 10 MB on mobile networks. (For more resources related to this topic, see here.)

Amazon S3 is a highly scalable, reliable, and low-latency data storage service at a very low cost, designed for mission-critical and primary data storage. It provides the Amazon S3 APIs to simplify your programming tasks. S3 performance optimization is composed of several factors, for example, which region to choose to reduce latency, considering the naming scheme, and optimizing the PUT and GET operations.

Multipart upload consists of a three-step process: the first step is initiating the upload, the next is uploading the object parts, and finally, after uploading all the parts, the multipart upload is completed. The following methods are currently supported to upload objects with multipart upload:

AWS SDK for Android
AWS SDK for iOS
AWS SDK for Java
AWS SDK for JavaScript
AWS SDK for PHP
AWS SDK for Python
AWS SDK for Ruby
AWS SDK for .NET
REST API
AWS CLI

In order to try multipart upload and see how much it aggregates throughput, we use the AWS SDK for Node.js and the s3 package via npm (the package manager for Node.js). The AWS CLI also supports multipart upload: when you use the AWS CLI s3 or s3api subcommand to upload an object, the object is automatically uploaded via multipart requests.

Getting ready

You need to complete the following setup in advance:

Sign up on AWS and be able to access S3 with your IAM credentials
Install and set up the AWS CLI on your PC, or use an Amazon Linux AMI
Install Node.js

It is recommended that you run the benchmark from your local PC, or if you use an EC2 instance, you should launch the instance and create the S3 bucket in different regions. For example, if you launch an instance in the Asia Pacific Tokyo region, you should create an S3 bucket in the US Standard region. The reason is that the latency between EC2 and S3 within the same region is very low, and it is hard to see the difference.

How to do it…

We upload a 300 MB file to an S3 bucket over HTTP in two ways; one is to use multipart upload and the other is not to use multipart upload, so that we can compare the time taken. To clearly see how the performance differs, I launched an instance and created an S3 bucket in different regions as follows:

EC2 instance: Asia Pacific Tokyo Region (ap-northeast-1)
S3 bucket: US Standard region (us-east-1)

First, we install the s3 Node.js module via npm, create a dummy file, upload the object into a bucket using a sample Node.js script without enabling multipart upload, and then do the same enabling multipart upload, so that we can see how multipart upload performs the operation.
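As an aside before we dive into the Node.js walkthrough, the same multipart tuning knobs are exposed by other SDKs too. A rough Python equivalent using boto3 (my own sketch, not part of the original recipe; the bucket name is a placeholder) maps multipartUploadThreshold, multipartUploadSize, and maxAsyncS3 onto multipart_threshold, multipart_chunksize, and max_concurrency:

import boto3
from boto3.s3.transfer import TransferConfig

# Roughly mirrors the Node.js defaults used below: 20 MB threshold, 15 MB parts, 20 parallel requests.
config = TransferConfig(
    multipart_threshold=20 * 1024 * 1024,
    multipart_chunksize=15 * 1024 * 1024,
    max_concurrency=20,
)

s3 = boto3.client('s3')
# Placeholder bucket and key names; upload_file switches to a parallel multipart
# upload automatically whenever the file exceeds the threshold.
s3.upload_file('300mb.dmp', 'your-bucket-name', '300mb.dmp', Config=config)

With such a configuration, varying the chunk size and concurrency lets you run the same kind of comparison from Python as the Node.js script below does.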
Now, let's move on to the instructions:

Install s3 via the npm command:

$ cd aws-nodejs-sample/
$ npm install s3

Create a 300 MB dummy file:

$ file=300mb.dmp
$ dd if=/dev/zero of=${file} bs=10M count=30

Put the following script in place and save it as s3_upload.js:

// Load the SDK
var AWS = require('aws-sdk');
var s3 = require('s3');
var conf = require('./conf');

// Load parameters
var client = s3.createClient({
  maxAsyncS3: conf.maxAsyncS3,
  s3RetryCount: conf.s3RetryCount,
  s3RetryDelay: conf.s3RetryDelay,
  multipartUploadThreshold: conf.multipartUploadThreshold,
  multipartUploadSize: conf.multipartUploadSize,
});

var params = {
  localFile: conf.localFile,
  s3Params: {
    Bucket: conf.Bucket,
    Key: conf.localFile,
  },
};

// upload objects
console.log("## s3 Parameters");
console.log(conf);
console.log("## Begin uploading.");
var uploader = client.uploadFile(params);
uploader.on('error', function(err) {
  console.error("Unable to upload:", err.stack);
});
uploader.on('progress', function() {
  console.log("Progress", uploader.progressMd5Amount, uploader.progressAmount, uploader.progressTotal);
});
uploader.on('end', function() {
  console.log("## Finished uploading.");
});

Create a configuration file and save it as conf.js in the same directory as s3_upload.js:

exports.maxAsyncS3 = 20; // default value
exports.s3RetryCount = 3; // default value
exports.s3RetryDelay = 1000; // default value
exports.multipartUploadThreshold = 20971520; // default value
exports.multipartUploadSize = 15728640; // default value
exports.Bucket = "your-bucket-name";
exports.localFile = "300mb.dmp";
exports.Key = "300mb.dmp";

How it works…

First of all, let's try uploading the 300 MB object using multipart upload, and then upload the same file without using multipart upload. You can upload an object and see how long it takes by typing the following command:

$ time node s3_upload.js
## s3 Parameters
{ maxAsyncS3: 20,
  s3RetryCount: 3,
  s3RetryDelay: 1000,
  multipartUploadThreshold: 20971520,
  multipartUploadSize: 15728640,
  localFile: './300mb.dmp',
  Bucket: 'bucket-sample-us-east-1',
  Key: './300mb.dmp' }
## Begin uploading.
Progress 0 16384 314572800
Progress 0 32768 314572800
…
Progress 0 314572800 314572800
Progress 0 314572800 314572800
## Finished uploading.

real 0m16.111s
user 0m4.164s
sys 0m0.884s

As it took about 16 seconds to upload the object, the transfer rate was 18.75 MB/sec.

Then, let's change the following parameters in the configuration (conf.js) and see the result. With exports.maxAsyncS3 = 1; and exports.multipartUploadThreshold = 2097152000;, the 300 MB object is uploaded through only one S3 client connection and without multipart upload:

exports.maxAsyncS3 = 1;
exports.s3RetryCount = 3; // default value
exports.s3RetryDelay = 1000; // default value
exports.multipartUploadThreshold = 2097152000;
exports.multipartUploadSize = 15728640; // default value
exports.Bucket = "your-bucket-name";
exports.localFile = "300mb.dmp";
exports.Key = "300mb.dmp";

Let's see the result after changing the parameters in the configuration (conf.js):

$ time node s3_upload.js
## s3 Parameters
…
## Begin uploading.
Progress 0 16384 314572800
…
Progress 0 314572800 314572800
## Finished uploading.

real 0m41.887s
user 0m4.196s
sys 0m0.728s

As it took about 42 seconds to upload the object, the transfer rate was 7.14 MB/sec.

Now, let's quickly check each parameter, and then get to the conclusion. maxAsyncS3 defines the maximum number of simultaneous requests that the S3 client opens to Amazon S3. The default value is 20.
s3RetryCount defines the number of retries when a request fails. The default value is 3.

s3RetryDelay is how many milliseconds the S3 client will wait when a request fails. The default value is 1000.

multipartUploadThreshold defines the size threshold for uploading objects via multipart requests. The object will be uploaded via a multipart request if it is greater than the size you specify. The default value is 20 MB, the minimum is 5 MB, and the maximum is 5 GB.

multipartUploadSize defines the size of each part when uploaded via a multipart request. The default value is 15 MB, the minimum is 5 MB, and the maximum is 5 GB.

The following table shows the speed test scores with different parameters:

maxAsyncS3:               1            20          20          40          30
s3RetryCount:             3            3           3           3           3
s3RetryDelay:             1000         1000        1000        1000        1000
multipartUploadThreshold: 2097152000   20971520    20971520    20971520    20971520
multipartUploadSize:      15728640     15728640    31457280    15728640    10728640
Time (seconds):           41.88        16.11       17.41       16.37       9.68
Transfer rate (MB/sec):   7.51         19.53       18.07       19.22       32.50

In conclusion, multipart upload is effective for optimizing the PUT operation, aggregating throughput. However, you need to consider the following: benchmark your scenario and evaluate the retry count, retry delay, number of parts, and the multipart upload size based on the network that your application uses.

There's more…

Multipart upload specification

There are limits to using multipart upload. The following table shows the specification of multipart upload:

Item                                                                               Specification
Maximum object size                                                                5 TB
Maximum number of parts per upload                                                 10,000
Part numbers                                                                       1 to 10,000 (inclusive)
Part size                                                                          5 MB to 5 GB; the last part can be smaller than 5 MB
Maximum number of parts returned for a list parts request                          1,000
Maximum number of multipart uploads returned in a list multipart uploads request   1,000

Multipart upload and charging

If you initiate a multipart upload and abort the request, Amazon S3 deletes the upload artifacts and any parts you have uploaded, and you are not charged for them. However, you are charged for all storage, bandwidth, and requests for the multipart upload requests and the associated parts of an object after the operation is completed. The point is that you are charged when a multipart upload is completed (not aborted).

See also

Multipart Upload Overview: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html
AWS SDK for Node.js: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/node-intro.htm
Node.js S3 package (npm): https://www.npmjs.com/package/s3
Amazon Simple Storage Service: Introduction to Amazon S3: http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html
(PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014: http://www.slideshare.net/AmazonWebServices/pfc403-maximizing-amazon-s3-performance-aws-reinvent-2014
AWS re:Invent 2014 | (PFC403) Maximizing Amazon S3 Performance: https://www.youtube.com/watch?v=_FHRzq7eHQc

Summary

In this article, we learned how to optimize PUT requests by uploading a large object in parts.

Resources for Article:

Further resources on this subject:
Amazon Web Services [article]
Achieving High-Availability on AWS Cloud [article]
Architecture and Component Overview [article]
First Look and Blinking Lights

Packt
04 Sep 2015
19 min read
This article, by Tony Olsson, the author of the book Arduino Wearable Projects, explains the Arduino platform based on three different aspects: software, hardware, and the Arduino philosophy. (For more resources related to this topic, see here.)

The hardware is the Arduino board, and there are multiple versions available for different needs. Here, we will be focusing on Arduino boards that were made with wearables in mind. The software used to program the boards is also known as the Arduino IDE. IDE stands for Integrated Development Environment, which is a program used to write programs in programming code. The programs written for the board are known as sketches, because the idea behind how you write programs works in a similar way to a sketchpad: if you have an idea, you can quickly try it out in code. This is also a part of the Arduino philosophy.

Arduino is based on the open source philosophy, which also reflects on how we learn about Arduino. Arduino has a large community, and there are tons of projects to learn from.

First, we have the Arduino hardware, which we will use to build all the examples along with different additional electronic components. When the Arduino project started back in 2005, there was only one piece of hardware to speak of, which was the serial Arduino board. Since then, there have been several iterations of this board, and it has inspired new designs of the Arduino hardware to fit different needs. If you have been familiar with Arduino for a while, you probably started out with the standard Arduino board. Today, there are different Arduino boards that fit different needs, and there are countless clones available for specific purposes. In this article, we will be using different specialized Arduino boards, such as the FLORA board.

The Arduino software, that is, the Arduino IDE, is what we will use to program our projects. The IDE is the software used to write programs for the hardware. Once a program is compiled in the IDE, the IDE will upload it to the Arduino board, and the processor on the board will do whatever your program says. Arduino programs are also known as sketches. The name sketches is borrowed from another open source project and software called Processing. Processing was developed as a tool for digital artists, where the idea was to use Processing as a digital sketchpad. The idea behind sketches and other aspects of Arduino is what we call the Arduino philosophy, and this is the third thing that makes up Arduino.

Arduino is based on open source, which is a type of licensing model where you are free to develop your own designs based on the original Arduino board. This is one of the reasons why you can find so many different models and clones of the Arduino boards. Open source is also a philosophy that allows ideas and knowledge to be shared freely. The Arduino community has grown strong, and there are many great resources to be found, and Arduino friends to be made. The only problem may be where to start.

This article is based on a project that will take you from the start, all the way to a finished "prototype". I call all the projects prototypes because these are not finished products. As your knowledge progresses, you can develop new sketches to run on your prototypes, develop new functions, or change the physical appearance to fit your needs and preferences.
In this article, you will have a look at: Installing the IDE Working with the IDE and writing sketches The FLORA board layout Connecting the FLORA board to the computer Controlling and connecting LEDs to the FLORA board Wearables This is all about wearables, which are defined as computational devices that are worn on the body. A computational device is something that can make calculations of any sort. Some consider mechanical clocks to be the first computers, since they make calculations on time. According to this definition, wearables have been around for centuries, if you think about it. Pocket watches were invented in the 16th century, and a watch is basically as small device that calculates time. Glasses are also an example of wearable technology that can be worn on your head, which have also been around for a long time. Even if glasses do not fit our more specified definition of wearables, they serve as a good example of how humans have modified materials and adapted their bodies to gain new functionality. If we are cold, we dress in clothing to keep us warm, if we break a leg, we use crutches to get around, or even if an organ fails, we can implant a device that replicates their functionality. Humans have a long tradition of developing technology to extend the functionality of the human body. With the development of technology for the army, health care, and professional sport, wearables have a long tradition. But in recent years, more and more devices have been developed for the consumer market. Today, we have smart watches, smart glasses, and different types of smart clothing. Here, we will carry on this ancient tradition and develop some wearable projects for you to learn about electronics and programming. Some of these projects are just for fun and some have a specific application. If you are already familiar with Arduino, you can pick any project and get started. Installing and using software The projects will be based on different boards made by the company Adafruit. Later in this article, we will take a look at one of these boards, called the FLORA, and explain the different parts. These boards come with a modified version of the Arduino IDE, which we will be using in the article. The Adafruit IDE looks exactly the same as the Arduino IDE. The FLORA board, for example, is based on the same microprocessor as the Arduino Leonardo board and can be used with the standard Arduino IDE but programmed using the Leonardo board option. With the use of the Adafruit IDE the FLORA board is properly named. The Adafruit version of the IDE comes preloaded with the necessary libraries for programming these boards, so there is no need to install them separately. For downloading and instructions on installing the IDE, head over to the Adafruit website and follow the steps on the website: https://learn.adafruit.com/getting-started-with-flora/download-software Make sure to download the software corresponding to your operating system. The process for installing the software depends on your operating system. These instructions may change over time and may be different for different versions of the operating system. The installation is a very straightforward process if you are working with OS X. On Windows, you will need to install some additional USB drivers. The process for installing on Linux depends on which distribution you are using. For the latest instructions, take a look at the Arduino website for the different operating systems. 
The Arduino IDE On the following website, you can find the original Arduino IDE if you need it in the future. Here, you will be fine sticking with the Adafruit version of the IDE, since the most common original Arduino boards are also supported. The following is the link for downloading the Arduino software: https://www.arduino.cc/en/Main/Software. First look at the IDE The IDE is where we will be doing all of our programming. The first time you open up the IDE, it should look like Figure 1.1: Figure 1.1: The Arduino IDE The main white area of the IDE is blank when you open a new sketch, and this is the area of the IDE where we will write our code later on. First, we need to get familiar with the functionality of the IDE. At the top left of the IDE, you will find five buttons. The first one, which looks like a check sign, is the compile button. When you press this button, the IDE will try to compile the code in your sketch, and if it succeeds, you will get a message in the black window at the bottom of you IDE that should look similar to this: Figure 1.2: The compile message window When writing code in an IDE, we will be using what is known as a third-level programming language. The problem with microprocessors on Arduino boards is that they are very hard to communicate with using their native language, and this is why third-level languages have been developed with human readable commands. The code you will see later needs to be translated into code that the Arduino board understands, and this is what is done when we compile the code. The compile button also makes a logical check of your code so that it does not contain any errors. If you have any errors, the text in the black box in the IDE will appear in red, indicating the line of code that is wrong by highlighting it in yellow. Don't worry about errors. They are usually misspelling errors and they happen a lot even to the most experienced programmers. One of the error messages can be seen in the following screenshot: Figure 1.3: Error message in the compile window Adjacent to the compile button, you will find the Upload button. Once this button is pressed, it does the same thing as the compile button, and if your sketch is free from errors, it will send the code from your computer to the board: Figure 1.4: The quick buttons The next three buttons are quick buttons for opening a new sketch, opening an old sketch, or saving your sketch. Make sure to save your sketches once in a while when working on them. If something happens and the IDE closes unexpectedly, it does not autosave, so manually saving once in a while is always a good idea. At the far right of the IDE you will find a button that looks like a magnifying glass. This is used to open the Serial monitor. This button will open up a new window that lets you see the communication form from, and to, the computer and the board At the top of the screen you will find a classic application menu, which may look a bit different depending on your operating system, but will follow the same structure. Under File, you will find the menu for opening your previous sketches and different example sketches that come with the IDE, as shown in Figure 1.5. Under Edit, you will find different options and quick commands for editing your code. In Sketch, you can find the same functions as in the buttons in the IDE window: Figure 1.5: The File menu Under Tools, you will find two menus that are very important to keep track of when uploading sketches to your board. 
Navigate to Tools | Board and you will find many different types of Arduino boards. In this menu, you will need to select the type of board you are working with. Under Tools | Serial port, you will need to select the USB port which you have connected to your board. Depending on your operating system, the port will be named differently. In Windows, they are named COM*. On OS X, they are named /dev/tty.****: Figure 1.6: The Tools menu Since there may be other things inside your computer also connected to a port, these will also show up in the list. The easiest way to figure out which port is connected to your board is to: Plug you board in to your computer using a USB cable. Then check the Serial port list and remember which port is occupied. Unplug the board and check the list again. The board missing in the list is the port where your board is connected. Plug your board back in and select it in the list. All Arduino boards connected to you computer will be given a new number. In most cases, when your sketch will not upload to you board, you have either selected the wrong board type or serial port in the tools menu. Getting to know you board As mentioned earlier, we will not be using the standard Uno Arduino boards, which is the board most people think of when they hear Arduino board. Most Arduino variations and clones use the same microprocessors as the standard Arduino boards, and it is the microprocessors that are the heart of the board. As long as they use the same microprocessors, they can be programmed as normal by selecting the corresponding standard Arduino board in the Tools menu. In our case, we will be using a modified version of the Arduino IDE, which features the types of boards we will be using. What sets other boards apart from the standard Uno Arduino boards is usually the form factor of the board and pin layout. We will be using a board called the FLORA. This board was created with wearables in mind. The FLORA is based on the same chip used in the Arduino Leonardo board, but uses a much smaller form factor and has been made round to ease the use in a wearable context. You can complete all the projects using most Arduino boards and clones, but remember that the code and construction of the project may need some modifying. The FLORA board In the following Figure 1.7 you will find the FLORA board: Figure 1.7: The FLORA board The biggest difference to normal Arduino boards besides the form factor is the number of pins available. The pins are the copper-coated areas at the edge of the FLORA. The form factor of the pins on FLORA boards is also a bit different from other Arduino boards. In this case, the pin holes and soldering pads are made bigger on FLORA boards so they can be easily sewn into garments, which is common when making wearable projects. The larger pins also make it easier to prototype with alligator clips. 
The pins available on the FLORA are as follows, starting from the right of the USB connector, which is located at the top of the board in Figure 1.7:

3.3V: Regulated 3.3 volt output at a 100mA max
D10: Is both a digital pin 10 and an analog pin 10 with PWM
D9: Is both a digital pin 9 and an analog pin 9 with PWM
GND: Ground pin
D6: Is both a digital pin 6 and an analog pin 7 with PWM
D12: Is both a digital pin 12 and an analog pin 11
VBATT: Raw battery voltage, can be used as a battery power output
GND: Ground pin
TX: Transmission communication pin or digital pin 1
RX: Receive communication pin or digital pin 0
3.3V: Regulated 3.3 volt output at a 100mA max
SDA: Communication pin or digital pin 2
SCL: Clock pin or digital pin 3 with PWM

As you can see, most of the pins have more than one function. The most interesting pins are the D* pins. These are the pins we will use to connect to other components. These pins can be either a digital pin or an analog pin. Digital pins operate only in 1 or 0, which means that they can only be on or off. You can receive information on these pins, but again, this is only in terms of on or off. The pins marked PWM have a special function, which is called Pulse Width Modulation. On these pins, we can control the output voltage level. The analog pins, however, can handle information in the range from 0 to 1023.

The 3.3V pins are used to power any components connected to the board. In this case, an electronic circuit needs to be completed, and that's why there are two GND pins. In order to make an electronic circuit, power always needs to go back to where it came from. For example, if you want to power a motor, you need power from a power source connected via a cable, with another cable directing the power back to the power source, or the motor will not spin. TX, RX, SDA, and SCL are pins used for communication when dealing with more complex sensors. The VBATT pin can be used to output the same voltage as your power source, which you connect to the connector located at the bottom of the FLORA board shown in Figure 1.7.

Other boards

In Figure 1.8 you will find the other board types we will be using:

Figure 1.8: The Gemma, Trinket and Trinket pro board

In Figure 1.8, the first one from the left is the Gemma board. In the middle, you will find the Trinket board, and to the right, you have the Trinket pro board. Both the Gemma and Trinket boards are based on the ATtiny85 microprocessor, which is a much smaller and cheaper processor, but comes with limitations. These boards only have three programmable pins, but what they lack in functionality, they make up for in size. The difference between the Gemma and Trinket board is the form factor, but the Trinket board also lacks a battery connector. The Trinket Pro board runs on an ATmega328 chip, which is the same chip used as the main microprocessor on the standard Arduino Uno board. This chip has 20 programmable pins, but the board also lacks a battery connector. The reason for using different types of boards is that different projects require different functionalities, and in some cases, space for adding components will be limited. Don't worry though, since all of them can be programmed in the same way.
Connecting and testing your board

In order to make sure that you have installed your IDE correctly and to ensure your board is working, we need to connect it to your computer using a USB to USB micro cable, as shown in Figure 1.9:

Figure 1.9: USB to USB micro cable

The small connector of the cable connects to your board, and the larger connector connects to your computer. As long as your board is connected to your computer, the USB port on the computer will power your board. Once your board is connected to the computer, open up your IDE and enter the following code. Follow the basic structure of writing sketches:

First, declare your variables at the top of the sketch.
The setup function is the first segment of code that runs when the board is powered up.
Then, add the loop function, which is the second segment of the code that runs, and will keep on looping until the board is powered off:

int led = 7;

void setup() {
  pinMode(led, OUTPUT);
}

void loop() {
  digitalWrite(led, HIGH);
  delay(1000);
  digitalWrite(led, LOW);
  delay(1000);
}

The first line of code declares pin number 7 as an integer and gives it the name led. An integer is a data type, and declaring the variable using the name int allows you to store whole numbers in memory. On the FLORA board, there is a small on-board LED connected to the digital pin 7.

The next part is void setup(), which is one of the functions that always needs to be in your sketch in order for it to compile. All functions use curly brackets to indicate where the function starts and ends. The { bracket is used for the start, and the } bracket is used to indicate the end of the function. In void setup(), we have declared the mode of the pin we are using. All digital pins can be used as either an input or an output. An input is used for reading the state of anything connected to it, and an output is used to control anything connected to the pin. In this case, we are using pin 7, which is connected to the on-board LED. In order to control this pin, we need to declare it as an output. If you are using a different board, remember to change the pin number in your code. On most other Arduino boards, the onboard LED is connected to pin 13.

The void loop() function is where the magic happens. This is where we put the actual commands that operate the pins on the board. In the preceding code, the first thing we do is turn the led pin HIGH by using the digitalWrite() command. The digitalWrite() function is a built-in function that takes two parameters. The first is the number of the pin; in this case, we put in the variable led that has the value 7. The second parameter is the state of the pin, and we can use the HIGH or LOW shortcuts to turn the pin on or off, respectively. Then, we make a pause in the program using the delay() command. The delay command takes one parameter, which is the number of milliseconds you want to pause your program for. After this, we use the same command as before to control the state of the pin, but this time we turn it LOW, which is the same as turning the pin off. Then we wait for an additional 1000 milliseconds.

Once the sketch reaches the end of the loop function, the sketch will start over from the start of the same function and keep on looping until a new sketch is uploaded, the reset button on the FLORA board is pressed, or the power is disconnected. Now that we have the sketch ready, you can press the upload button. If everything goes as planned, the on-board LED should start to blink with a 1 second delay.
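As a side note, once your sketches start printing debug output with Serial.begin() and Serial.println(), you do not have to rely on the IDE's Serial monitor mentioned earlier to read it; a few lines of Python with the third-party pyserial package do the same job. This is only a sketch of the idea and not part of the original project; the port name is an assumption and should be replaced with the port you selected under Tools | Serial port (/dev/tty.* on OS X and Linux, COM* on Windows):

import serial  # third-party package: pip install pyserial

# Port name is an assumption; replace it with the port you selected in the IDE.
PORT = '/dev/ttyACM0'

with serial.Serial(PORT, 9600, timeout=2) as board:
    # Read and print up to ten lines of whatever the sketch sends over serial.
    for _ in range(10):
        line = board.readline().decode('ascii', errors='replace').strip()
        if line:
            print(line)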
The sketch you have uploaded will stay on the board even if the board is powered off, until you upload a new sketch that overwrites the old one. If you run into problems with uploading the code, remember to perform the following steps: Check your code for errors or misspelling Check your connections and USB cable Make sure you have the right board type selected Make sure your have the right USB port selected Summary In this article, we have had a look at the different parts of the FLORA board and how to install the IDE. We also made some small sketches to work with the on-board LED. We made our first electronic circuit using an external LED. Resources for Article: Further resources on this subject: Dealing with Interrupts [article] Programmable DC Motor Controller with an LCD [article] Prototyping Arduino Projects using Python [article]
Installing Red Hat CloudForms on Red Hat OpenStack

Packt
04 Sep 2015
8 min read
In this article by Sangram Rath, the author of the book Hybrid Cloud Management with Red Hat CloudForms, this article takes you through the steps required to install, configure, and use Red Hat CloudForms on Red Hat Enterprise Linux OpenStack. However, you should be able to install it on OpenStack running on any other Linux distribution. The following topics are covered in this article: System requirements Deploying the Red Hat CloudForms Management Engine appliance Configuring the appliance Accessing and navigating the CloudForms web console (For more resources related to this topic, see here.) System requirements Installing the Red Hat CloudForms Management Engine Appliance requires an existing virtual or cloud infrastructure. The following are the latest supported platforms: OpenStack Red Hat Enterprise Virtualization VMware vSphere The system requirements for installing CloudForms are different for different platforms. Since this book talks about installing it on OpenStack, we will see the system requirements for OpenStack. You need a minimum of: Four VCPUs 6 GB RAM 45 GB disk space The flavor we select to launch the CloudForms instance must meet or exceed the preceding requirements. For a list of system requirements for other platforms, refer to the following links: System requirements for Red Hat Enterprise Virtualization: https://access.redhat.com/documentation/en-US/Red_Hat_CloudForms/3.1/html/Installing_CloudForms_on_Red_Hat_Enterprise_Virtualization/index.html System requirements for installing CloudForms on VMware vSphere: https://access.redhat.com/documentation/en-US/Red_Hat_CloudForms/3.1/html/Installing_CloudForms_on_VMware_vSphere/index.html Additional OpenStack requirements Before we can launch a CloudForms instance, we need to ensure that some additional requirements are met: Security group: Ensure that a rule is created to allow traffic on port 443 in the security group that will be used to launch the appliance. Flavor: Based on the system requirements for running the CloudForms appliance, we can either use an existing flavor, such as m1.large, or create a new flavor for the CloudForms Management Engine Appliance. To create a new flavor, click on the Create Flavor button under the Flavor option in Admin and fill in the required parameters, especially these three: At least four VCPUs At least 6144 MB of RAM At least 45 GB of disk space Key pair: Although, at the VNC console, you can just use the default username and password to log in to the appliance, it is good to have access to a key pair as well, if required, for remote SSH. Deploying the Red Hat CloudForms Management Engine appliance Now that we are aware of the resource and security requirements for Red Hat CloudForms, let's look at how to obtain a copy of the appliance and run it. Obtaining the appliance The CloudForms Management appliance for OpenStack can be downloaded from your Red Hat customer portal under the Red Hat CloudForms product page. You need access to a Red Hat CloudForms subscription to be able to do so. At the time of writing this book, the direct download link for this is https://rhn.redhat.com/rhn/software/channel/downloads/Download.do?cid=20037. For more information on obtaining the subscription and appliance, or to request a trial, visit http://www.redhat.com/en/technologies/cloud-computing/cloudforms. Note If you are unable to get access to Red Hat CloudForms, ManageIQ (the open source version) can also be used for hands-on experience. 
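The next two sections walk through the image creation and instance launch using the Horizon dashboard. If you would rather script those steps, a rough Python sketch using the openstacksdk library is shown below. This is an assumption on my part rather than part of the original recipe; the names mycloud, cfme.qcow2, m1.large, my-keypair, and private are placeholders, and a clouds.yaml entry is presumed to be configured already:

import openstack

# Connect using a clouds.yaml entry named 'mycloud' (placeholder).
conn = openstack.connect(cloud='mycloud')

# Upload the CloudForms appliance image (QCOW2) downloaded from the customer portal.
image = conn.create_image(
    'cfme-appliance',
    filename='cfme.qcow2',   # placeholder path to the downloaded appliance file
    disk_format='qcow2',
    container_format='bare',
    wait=True,
)

# Launch the appliance with a flavor that meets the 4 VCPU / 6 GB RAM / 45 GB disk minimum.
server = conn.create_server(
    'cloudforms',
    image=image.id,
    flavor='m1.large',       # placeholder; any flavor meeting the requirements
    key_name='my-keypair',   # placeholder key pair
    network='private',       # placeholder network
    wait=True,
    auto_ip=True,            # associate a floating IP so the web console is reachable
)
print(server.name, server.id)

The floating IP attached by auto_ip is what you would then point your browser at, exactly as described in the web console sections that follow.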
Creating the appliance image in OpenStack

Before launching the appliance, we need to create an image in OpenStack for it, since OpenStack requires instances to be launched from an image. You can create a new Image under Project with the following parameters (see the screenshot given for assistance):

Enter a name for the image.
Enter the image location in Image Source (HTTP URL).
Set the Format as QCOW2.
Optionally, set the Minimum Disk size.
Optionally, set Minimum Ram.
Make it Public if required, and then click on Create An Image.

Note that if you have a newer release of OpenStack, there may be some additional options, but the preceding are the ones that need to be filled in—most importantly, the download URL of the Red Hat CloudForms appliance. Wait for the Status field to show Active before launching the instance, as shown in this screenshot:

Launching the appliance instance

In OpenStack, under Project, select Instances and then click on Launch Instance. In the Launch Instance wizard, enter the following instance information in the Details tab:

Select an Availability Zone.
Enter an Instance Name.
Select Flavor.
Set Instance Count.
Set Instance Boot Source as Boot from image.
Select CloudForms Management Engine Appliance under Image Name.

The final result should appear similar to the following figure:

Under the Access & Security tab, ensure that the correct Key Pair and Security Group are selected, like this:

For Networking, select the proper networks that will provide the required IP addresses and routing, as shown here:

Other options, such as Post-Creation and Advanced Options, can be left blank. Click on Launch when you are ready to start creating the instance. Wait for the instance state to change to Running before proceeding to the next step.

Note: If you are accessing the CloudForms Management Engine from the Internet, a Floating IP address needs to be associated with the instance. This can be done from Project, under Access & Security and then the Floating IPs tab.

The Red Hat CloudForms web console

The web console provides a graphical user interface for working with the CloudForms Management Engine Appliance. It can be accessed from a browser on any machine that has network access to the CloudForms Management Engine server.

System requirements

The system requirements for accessing the Red Hat CloudForms web console are:

A Windows, Linux, or Mac computer
A modern browser, such as Mozilla Firefox, Google Chrome, or Internet Explorer 8 or above
Adobe Flash Player 9 or above
The CloudForms Management Engine Appliance must already be installed and activated in your enterprise environment

Accessing the Red Hat CloudForms Management Engine web console

To access the appliance, type the hostname or floating IP assigned to the instance, prefixed by https, into a supported browser. Enter the default username, admin, and the password, smartvm, to log in to the appliance, as shown in this screenshot:

You should log in to only one tab in each browser, as the console settings are saved for the active tab only. The CloudForms Management Engine also does not guarantee that the browser's Back button will produce the desired results. Use the breadcrumbs provided in the console.

Navigating the web console

The web console has a primary top-level menu that provides access to feature sets such as Insight, Control, and Automate, along with menus used to add infrastructure and cloud providers, create service catalogs, and view or raise requests.
The secondary menu appears below the primary menu, and its options change based on the primary menu option selected. In certain cases, a third, sublevel menu may also appear with additional options based on the selection in the secondary menu. The feature sets available in Red Hat CloudForms are categorized under eight menu items:

Cloud Intelligence: This provides a dashboard view of your hybrid cloud infrastructure for the selected parameters. Whatever is displayed here can be configured as a widget. It also provides additional insights into the hybrid cloud in the form of reports, chargeback configuration and information, timeline views, and an RSS feeds section.
Services: This provides options for creating templates and service catalogs that help in provisioning multitier workloads across providers. It also lets you create and approve requests for these service catalogs.
Clouds: This option in the top menu lets you add cloud providers; define availability zones; and create tenants, flavors, security groups, and instances.
Infrastructure: This option, in a way similar to Clouds, lets you add infrastructure providers; define clusters; view, discover, and add hosts; provision VMs; work with data stores and repositories; view requests; and configure the PXE.
Control: This section lets you define compliance and control policies for the infrastructure providers using events, conditions, and actions based on those conditions. You can further combine these policies into policy profiles. Another important feature is alerting the administrators, which is configured from here. You can also simulate these policies, import and export them, and view logs.
Automate: This menu option lets you manage life cycle tasks, such as provisioning and retirement, and the automation of resources. You can create provisioning dialogs to provision hosts and virtual machines, and service dialogs to provision service catalogs. Dialog import/export, logs, and requests for automation are all managed from this menu option.
Optimize: This menu option provides utilization, planning, and bottleneck summaries for the hybrid cloud environment. You can also generate reports for these individual metrics.
Configure: Here, you can customize the look of the dashboard; view queued and running tasks; and check errors and warnings for VMs and the UI. It lets you configure the CloudForms Management Engine appliance settings, such as the database, additional worker appliances, SmartProxy, and white labelling. You can also perform maintenance tasks, such as updates and manual modification of the CFME server configuration files.

Summary

In this article, we deployed the Red Hat CloudForms Management Engine Appliance in an OpenStack environment, and you learned where to configure the hostname, network settings, and time zone. We then used the floating IP of the instance to access the appliance from a web browser, and you learned where the different feature sets are and how to navigate around.

Resources for Article:

Further resources on this subject:

Introduction to Microsoft Azure Cloud Services[article]
Apache CloudStack Architecture[article]
Using OpenStack Swift[article]

Introducing Liferay for Your Intranet

Packt
04 Sep 2015
32 min read
In this article by Navin Agarwal, author of the book Liferay Portal 6.2 Enterprise Intranets, we will learn that Liferay is an enterprise application solution. It provides a lot of functionalities, which helps an organization to grow and is a one-solution package as a portal and content management solution. In this article, we will look at the following topics: The complete features you want your organization's intranet solution to have Reasons why Liferay is an excellent choice to build your intranet Where and how Liferay is used besides intranet portals Easy integration with other open source tools and applications Getting into more technical information about what Liferay is and how it works So, let's start looking at exactly what kind of site we're going to build. (For more resources related to this topic, see here.) Liferay Portal makes life easy We're going to build a complete corporate intranet solution using Liferay. Let's discuss some of the features your intranet portal will have. Hosted discussions Are you still using e-mail for group discussions? Then, it's time you found a better way! Running group discussions over e-mail clogs up the team's inbox—this means you have to choose your distribution list in advance, and that makes it hard for team members to opt in and out of the discussion. Using Liferay, we will build a range of discussion boards for discussion within and between teams. The discussions are archived in one place, which means that it's always possible to go back and refer to them later. On one level, it's just more convenient to move e-mail discussions to a discussion forum designed for the purpose. But once the forum is in place, you will find that a more productive group discussion takes place here than it ever did over e-mail. Collaborative documents using wikis Your company probably has guideline documents that should be updated regularly but swiftly lose their relevance as practices and procedures change. Even worse, each of your staff will know useful, productive tricks and techniques—but there's probably no easy way to record that knowledge in a way that is easy for others to find and use. We will see how to host wikis within Liferay. A wiki enables anybody to create and edit web pages and link all of those web pages together without requiring any HTML or programming skills. You can put your guideline documents into a wiki, and as practices change, your frontline staff can quickly and effortlessly update the guideline documentation. Wikis can also act as a shared notebook, enabling team members to collaborate and share ideas and findings and work together on documents. Team and individual blogs Your company probably needs frequent, chronological publications of personal thoughts and web links in the intranet. Your company probably has teams and individuals working on specific projects in order to share files and blogs about a project process and more. By using the Liferay Blog features, you can use HTML text editors to create or update files and blogs and to provide RSS feeds. Liferay provides an easy way for teams and individuals to share files with the help of blogs. Blogs provide a straightforward blogging solution with features such as RSS, user and guest comments, browsable categories, tags and labels, and a rating system. Liferay's RSS with the subscription feature provides the ability to frequently read RSS feeds from within the portal framework. 
At the same time, What You See Is What You Get (WYSIWYG) editors provide the ability to edit web content, including the blogs' content. Less technical people can use the WYSIWYG editor instead of sifting through complex code. Shared calendars Many companies require calendar information and share the calendar among users from different departments. We will see how to share a calendar within Liferay. The shared calendar can satisfy the basic business requirements incorporated into a featured business intranet, such as scheduling meetings, sending meeting invitations, checking for attendees' availability, and so on. Therefore, you can provide an environment for users to manage events and share calendars. Document management – CMS When there is a need for document sharing and document management, Liferay's Documents and Media library helps you with lots of features. The Documents and Media portlet allows you to add folders and subfolders for documents and media files, and also allows users to publish documents. It serves as a repository for all types of files and makes Content management systems (CMSes) available for intranets. The Documents and Media library portlet is equipped with customizable folders and acts as a web-based solution to share documents and media files among all your team members—just as a shared drive would. All the intranet users will be able to access the files from anywhere, and the content is accessible only by those authorized by administrators. All the files are secured by the permission layer by the administrator. Web content management – WCM Your company may have a lot of images and documents, and you may need to manage all these images and documents as well. Therefore, you require the ability to manage a lot of web content and then publish web content in intranets. We will see how to manage web content and how to publish web content within Liferay. Liferay Journal (Web Content) not only provides high availability to publish, manage, and maintain web content and documents, but it also separates content from the layout. Liferay WCM allows us to create, edit, and publish web content (articles). It also allows quick changes in the preview of the web content by changing the layout. It has built-in functionality, such as workflow, search, article versioning, scheduling, and metadata. Personalization and internalization All users can get a personal space that can be either made public (published as a website with a unique, friendly URL) or kept private. You can also customize how the space looks, what tools and applications are included, what goes into Documents and Media, and who can view and access all of this content. In addition, Liferay supports multiple languages, where you can select your own language. Multilingual organizations get out-of-the-box support for up to 45 languages. Users can toggle among different language settings with just one click and produce/publish multilingual documents and web content. Users can make use of the internalization feature to define the specific site in a localized language. Workflow, staging, scheduling, and publishing You can use a workflow to manage definitions, instances, and predetermined sequences of connected steps. Workflow can be used for web content management, assets, and so on. Liferay's built-in workflow engine is called Kaleo. It allows users to set up the review and publishing process on the web content article of any document that needs to end up on the live site. 
Liferay 6.2 integrates the powerful workflow and data capabilities of dynamic data lists in Kaleo Forms; this is only available in Liferay Enterprise Edition. Staging environments are integrated with Liferay's workflow engine. To have a review process for staged pages, you need to make sure that you have a workflow engine configured and a staging setup in the workflow. As a content creator, you can update what you've created and publish it in a staging workflow. Other users can then review and modify it. Moreover, content editors can decide whether to publish web content from staging to live; that is, you can easily create and manage everything from a simple article of text and images to fully functional websites in staging and then publish them live. Before going live, you can schedule web content as well. For instance, you can publish web content immediately or schedule it for publishing on a specific date.

Social networks and Social Office

Liferay Portal supports social networks—you can easily manage your Google Plus, Facebook, MySpace, Twitter, and other social network accounts in Liferay. In addition, you can manage your instant messenger accounts, such as AIM, ICQ, Jabber, MSN, Skype, YM, and so on, smoothly from inside Liferay. Liferay Social Office gives us social collaboration on top of the portal—a fully virtual workspace that streamlines communication and builds up group cohesion. It provides a holistic enhancement to the way you and your colleagues work together. All components in Social Office are tied together seamlessly, getting everyone on the same page by sharing the same look and feel. More importantly, the dynamic activity tracking gives us a bird's-eye view of who has been doing what and when within each individual site. Using Liferay Social Office, you can enhance your existing personal workflow with social tools, keep your team up to date, and turn collective knowledge into collective action. Note that Liferay 6.2 supports the current version of Liferay Social Office, 3.0.

Liferay Sync and Marketplace

Liferay Sync is Liferay's newest product, designed to make file sharing as easy as a simple drag and drop! Liferay Sync is an add-on product for Liferay 6.1 CE, EE, and later versions, which makes it a more robust product and enables end users to publish and access documents and files from multiple environments and devices, including Windows and Mac OS systems and iOS-based mobile platforms. Liferay Sync is one of the best features, and it is fully integrated into the Liferay platform. Liferay 6.1 introduced the new concept of the Marketplace, which allows developers to build components or functionality and then release and share them with other users. It's a user-friendly, one-stop place to share apps. Liferay Marketplace provides the portal product with add-on features and a new hub to share, browse, and download Liferay-compatible applications. In Liferay 6.2, Marketplace comes under App Manager, where all the app-related controls are available.

More features

The intranet also arranges staff members into teams and sites, provides a way of real-time IM and chatting, and gives each user an appropriate level of access. This means that they can get all the information they need and edit and add content as necessary, but won't be able to mess with sensitive information that they have no reason to see. In particular, the portal provides an integrating framework so that you can integrate external applications easily.
For example, you can integrate external applications with the portal, such as Alfresco, OpenX, LDAP, SSO CAS, Orbeon Forms, Konakart, PayPal, Solr, and so on. In a word, the portal offers compelling benefits to today's enterprises—reduced operational costs, improved customer satisfaction, and streamlined business processes. Everything in one place All of these features are useful on their own. However, it gets better when you consider that all of these features will be combined into one easy-to-use searchable portal. A user of the intranet, for example, can search for a topic—let's say financial report—and find the following in one go: Any group discussions about financial reports Blog entries within the intranet concerning financial reports Documents and files—perhaps the financial reports themselves Wiki entries with guidelines on preparing financial reports Calendar entries for meetings to discuss the financial report Of course, users can also restrict their search to just one area if they already know exactly what they are looking for. Liferay provides other features, such as tagging, in order to make it even easier to organize information across the whole intranet. We will do all of this and more. Introducing Palm Tree Publications We are going to build an intranet for a fictional company as an example, focusing on how to install, configure, and integrate it with other applications and also implement portals and plugins (portlets, themes, layout templates, hooks, and webs) within Liferay. By applying the instructions to your own business, you will be able to build an intranet to meet your own company's needs. "Palm Tree Publications" needs an intranet of its own, which we will call bookpub.com. The enterprise's global headquarters are in the United States. It has several departments—editorial, website, engineering, marketing, executive, and human resources. Each department has staff in the U.S., Germany, and India or in all three places. The intranet site provides a site called "Book Street and Book Workshop" consisting of users who have an interest in reading books. The enterprise needs to integrate collaboration tools, such as wikis, discussion forums, blogs, instant messaging, mail, RSS, shared calendars, tagging, and so on. Palm Tree Publications has more advanced needs too: a workflow to edit, approve, and publish books. Furthermore, the enterprise has a lot of content, such as books stored and managed alfresco currently. 
In order to build the intranet site, the following functionality should be considered: Installing the portal, experiencing the portal and portlets, and customizing the portal and personal web pages Bringing the features of enabling document sharing, calendar sharing, and other collaboration within a business to the users of the portal Discussion forums—employees should be able to discuss book ideas and proposals Wikis—keeping track of information about editorial guidance and other resources that require frequent editing Dissemination of information via blogs—small teams working on specific projects share files and blogs about a project process Sharing a calendar among employees Web content management creation by the content author and getting approved by the publisher Document repository—using effective content management systems (CMSes), a natural fit for a portal for secure access, permissions, and distinct roles (such as writers, editors, designers, administrators, and so on) Collaborative chat and instant messaging, social network, Social Office, and knowledge management tools Managing a site named Book Street and Book Workshop that consists of users who have the same interest in reading books as staging, scheduling, and publishing web content related to books Federated search for discussion forum entries, blog posts, wiki articles, users in the directory, and content in both the Document and Media libraries; search by tags Integrating back-of-the-house software applications, such as Alfresco, Orbeon Forms, the Drools rule server, Jasper Server, and BI/Reporting Pentaho; strong authentication and authorization with LDAP; and single authentication to access various company sites besides the intranet site The enterprise can have the following groups of people: Admin: This group installs systems, manages membership, users, user groups, organizations, roles and permissions, security on resources, workflow, servers and instances, and integrates with third-party systems Executives: Executive management handles approvals Marketing: This group handles websites, company brochures, marketing campaigns, projects, and digital assets Sales: This group makes presentations, contracts, documents, and reports Website editors: This group manages pages of the intranet—writes articles, reviews articles, designs the layout of articles, and publishes articles Book editors: This group writes, reviews, and publishes books and approves and rejects the publishing of books Human resources: This group manages corporate policy documents Finance: This group manages accounts documents, scanned invoices and checks accounts Corporate communications: This group manages external public relations, internal news releases, and syndication Engineering: This group sets up the development environment and collaborates on engineering projects and presentation templates Introducing Liferay Portal's architecture and framework Liferay Portal's architecture supports high availability for mission-critical applications using clustering and the fully distributed cache and replication support across multiple servers. The following diagram has been taken from the Liferay forum written by Jorge Ferrer. This diagram depicts the various architectural layers and functionalities of portlets: Figure 1.1: The Liferay architecture The preceding image was taken from https://www.liferay.com/web/jorge.ferrer/blog/-/blogs/liferay-s-architecture-the-beginning-of-a-blog-series site blog. 
The Liferay Portal architecture is designed in such a way that it provides tons of features at one place: Frontend layer: This layer is the end user's interface Service layer: This contains the great majority of the business logic for the portal platform and all of the portlets included out of the box Persistence layer: Liferay relies on Hibernate to do most of its database access Web services API layer: This handles web services, such as JSON and SOAP In Liferay, the service layer, persistence layer, and web services API layer are built automatically by that wonderful tool called Service Builder. Service Builder is the tool that glues together all of Liferay's layers and that hides the complexities of using Spring or Hibernate under the hood. Service-oriented architecture Liferay Portal uses service-oriented architecture (SOA) design principles throughout and provides the tools and framework to extend SOA to other enterprise applications. Under the Liferay enterprise architecture, not only can the users access the portal from traditional and wireless devices, but developers can also access it from the exposed APIs via REST, SOAP, RMI, XML-RPC, XML, JSON, Hessian, and Burlap. Liferay Portal is designed to deploy portlets that adhere to the portlet API compliant with both JSR-168 and JSR-286. A set of useful portlets are bundled with the portal, including Documents and Media, Calendar, Message Boards, Blogs, Wikis, and so on. They can be used as examples to add custom portlets. In a word, the key features of Liferay include using SOA design principles throughout, such as reliable security, integrating the portal with SSO and LDAP, multitier and limitless clustering, high availability, caching pages, dynamic virtual hosting, and so on. Understanding Enterprise Service Bus Enterprise Service Bus (ESB) is a central connection manager that allows applications and services to be added quickly to an enterprise infrastructure. When an application needs to be replaced, it can easily be disconnected from the bus at a single point. Liferay Portal uses Mule or ServiceMix as ESB. Through ESB, the portal can integrate with SharePoint, BPM (such as the jBPM workflow engine and Intalio | BPMS engine), BI Xforms reporting, JCR repository, and so on. It supports JSR 170 for content management systems with the integration of JCR repositories, such as Jackrabbit. It also uses Hibernate and JDBC to connect to any database. Furthermore, it supports an event system with synchronous and asynchronous messaging and a lightweight message bus. Liferay Portal uses the Spring framework for its business and data services layers. It also uses the Spring framework for its transaction management. Based on service interfaces, portal-impl is implemented and exposed only for internal usage—for example, they are used for the extension environment. portal-kernel and portal-service are provided for external usage (or for internal usage)—for example, they are used for the Plugins SDK environment. Custom portlets, both JSR-168 and JSR-286, and web services can be built based on portal-kernel and portal-service. In addition, the Web 2.0 Mail portlet and the Web 2.0 Chat portlet are supported as well. More interestingly, scheduled staging and remote staging and publishing serve as a foundation through the tunnel web for web content management and publishing. Liferay Portal supports web services to make it easy for different applications in an enterprise to communicate with each other. 
Java, .NET, and proprietary applications can work together easily because web services use XML standards. It also supports REST-style JSON web services for lightweight, maintainable code and supports AJAX-based user interfaces. Liferay Portal uses industry-standard, government-grade encryption technologies, including advanced algorithms, such as DES, MD5, and RSA. Liferay was benchmarked as one of the most secure portal platforms using LogicLibrary's Logiscan suite. Liferay offers customizable single sign-on (SSO) that integrates into Yale CAS, JAAS, LDAP, NTLM, CA Siteminder, Novell Identity Manager, OpenSSO, and more. Open ID, OpenAuth, Yale CAS, Siteminder, and OpenAM integration are offered by it out of the box. In short, Liferay Portal uses ESB in general with an abstraction layer on top of an enterprise messaging system. It allows integration architects to exploit the value of messaging systems, such as reporting, e-commerce, and advertisements. Understanding the advantages of using Liferay to build an intranet Of course, there are lots of ways to build a company intranet. What makes Liferay such a good choice to create an intranet portal? It has got the features we need All of the features we outlined for our intranet come built into Liferay: discussions, wikis, calendars, blogs, and so on are part of what Liferay is designed to do. It is also designed to tie all of these features together into one searchable portal, so we won't be dealing with lots of separate components when we build and use our intranet. Every part will work together with others. Easy to set up and use Liferay has an intuitive interface that uses icons, clear labels, and drag and drop to make it easy to configure and use the intranet. Setting up the intranet will require a bit more work than using it, of course. However, you will be pleasantly surprised by how simple it is—no programming is required to get your intranet up and running. Free and open source How much does Liferay cost? Nothing! It's a free, open source tool. Here, being free means that you can go to Liferay's website and download it without paying anything. You can then go ahead and install it and use it. Liferay comes with an enterprise edition too, for which users need to pay. In addition, Liferay provides full support and access to additional enterprise edition plugins/applications. Liferay makes its money by providing additional services, including training. However, the standard use of Liferay is completely free. Now you probably won't have to pay another penny to get your intranet working. Being open source means that the program code that makes Liferay work is available to anybody to look at and change. Even if you're not a programmer, this is still good for you: If you need Liferay to do something new, then you can hire a programmer to modify Liferay to do it. There are lots of developers studying the source code, looking for ways to make it better. Lots of improvements get incorporated into Liferay's main code. Developers are always working to create plugins—programs that work together with Liferay to add new features. Probably, for now, the big deal here is that it doesn't cost any money. However, as you use Liferay more, you will come to understand the other benefits of open source software for you. Grows with you Liferay is designed in a way that means it can work with thousands and thousands of users at once. 
No matter how big your business is or how much it grows, Liferay will still work and handle all of the information you throw at it. It also has features especially suited to large, international businesses. Are you opening offices in non-English speaking countries? No problem! Liferay has internationalization features tailored to many of the world's popular languages. Works with other tools Liferay is designed to work with other software tools—the ones that you're already using and the ones that you might use in the future—for instance: You can hook up Liferay to your LDAP directory server and SSO so that user details and login credentials are added to Liferay automatically Liferay can work with Alfresco—a popular and powerful Enterprise CMS (used to provide extremely advanced document management capabilities, which are far beyond what Liferay does on its own) Based on "standards" This is a more technical benefit; however, it is a very useful one if you ever want to use Liferay in a more specialized way. Liferay is based on standard technologies that are popular with developers and other IT experts and that confer the following benefits on users: Built using Java: Java is a popular programming language that can run on just about any computer. There are millions of Java programmers in the world, so it won't be too hard to find developers who can customize Liferay. Based on tried and tested components: With any tool, there's a danger of bugs. Liferay uses lots of well-known, widely tested components to minimize the likelihood of bugs creeping in. If you are interested, here are some of the well-known components and technologies Liferay uses—Apache ServiceMix, Mule, ehcache, Hibernate, ICEfaces, Java J2EE/JEE, jBPM, Activiti, JGroups, Alloy UI, Lucene, PHP, Ruby, Seam, Spring and AOP, Struts and Tiles, Tapestry, Velocity, and FreeMarker. Uses standard ways to communicate with other software: There are various standards established to share data between pieces of software. Liferay uses these so that you can easily get information from Liferay into other systems. The standards implemented by Liferay include AJAX, iCalendar and Microformat, JSR-168, JSR-127, JSR-170, JSR-286 (Portlet 2.0), JSR-314 (JSF 2.0), OpenSearch, the Open platform with support for web services, including JSON, Hessian, Burlap, REST, RMI, and WSRP, WebDAV, and CalDAV. Makes publication and collaboration tools Web Content Accessibility Guidelines 2.0 (WCAG 2.0) compliant: The new W3C recommendation is to make web content accessible to a wide range of people with disabilities, including blindness and low vision, deafness and hearing loss, learning disabilities, cognitive limitations, limited movement, speech disabilities, photosensitivity, and combinations of these. For example, the portal integrates CKEditor-standards support, such as W3C (WAI-AA and WCAG), 508 (Section 508). Alloy UI: The Liferay UI supports HTML 5, CSS 3, and Yahoo! User Interface Library 3 (YUI 3). Supports Apache Ant 1.8 and Maven 2: Liferay Portal can be built through Apache Ant by default, where you can build services; clean, compile, and build JavaScript CMD; build language native to ASCII, deploy, fast deploy; and so on. Moreover, Liferay supports Maven 2 SDK, providing Community Edition (CE) releases through public maven repositories as well as Enterprise Edition (EE) customers to install maven artifacts in their local maven repository. Bootstrap: Liferay 6.2 provides support for Twitter Bootstrap out of the box. 
With its fully responsive UI, the benefit of bootstrap is that it will support any device to render the content. Even content authors can use bootstrap markup and styles to make the content nicer. Many of these standards are things that you will never need to know much about, so don't worry if you've never heard of them. Liferay is better for using them, but mostly, you won't even know they are there. Other advantages of Liferay Liferay isn't just for intranets! Users and developers are building all kinds of different websites and systems based on Liferay. Corporate extranets An intranet is great for collaboration and information sharing within a company. An extranet extends this facility to suppliers and customers, who usually log in over the Internet. In many ways, this is similar to an intranet—however, there are a few technical differences. The main difference is that you create user accounts for people who are not part of your company. Collaborative websites Collaborative websites not only provide a secure and administrated framework, but they also empower users with collaborative tools, such as blogs, instant e-mail, message boards, instant messaging, shared calendars, and so on. Moreover, they encourage users to use other tools, such as tag administration, fine-grained permissions, delegable administrator privileges, enterprise taxonomy, and ad hoc user groups. By means of these tools, as an administrator, you can ultimately control what people can and cannot do in Liferay. In many ways, this is similar to an intranet too; however, there are a few technical differences. The main difference is that you use collaborative tools simply, such as blogs, instant e-mail, message boards, instant messaging, shared calendars, and so on. Content management and web publishing You can also use Liferay to run your public company website with content management and web publishing. Content management and web publishing are useful features in websites. It is a fact that the volume of digital content for any organization is increasing on a daily basis. Therefore, an effective CMS is a vital part of any organization. Meanwhile, document management is also useful and more effective when repositories have to be assigned to different departments and groups within the organization. Content management and document management are effective in Liferay. Moreover, when managing and publishing content, we may have to answer many questions, such as "who should be able to update and delete a document from the system?". Fortunately, Liferay's security and permissions model can satisfy the need for secure access and permissions and distinct roles (for example, writer, editor, designer, and administrator). Furthermore, Liferay integrates with the workflow engine. Thus, users can follow a flow to edit, approve, and publish content in the website. Content management and web publishing are similar to an intranet; however, there are a few technical differences. The main difference is that you can manage content and publish web content smoothly. Infrastructure portals Infrastructure portals integrate all possible functions, as we stated previously. This covers collaboration and information sharing within a company in the form of collaborative tools, content management, and web publishing. In infrastructure portals, users can create a unified interface to work with content, regardless of source via content interaction APIs. 
Furthermore, using the same API and the same interface as that of the built-in CMS, users can also manage content and publish web content from third-party systems, such as Alfresco, Vignette, Magnolia, FatWire, Microsoft SharePoint, and so on. Infrastructure portals are similar to an intranet; there are a few technical differences though. The main difference is that you can use collaborative tools, manage content, publish web content, and integrate other systems in one place. Why do you need a portal? The main reason is that a portal can serve as a framework to aggregate content and applications. A portal normally provides a secure and manageable framework where users can easily make new and existing enterprise applications available. In order to build an infrastructure portal smoothly, Liferay Portal provides an SOA-based framework to integrate third-party systems. Out-of-the-box portlets and features Liferay provides out-of-the-box (OOTB) portlets that have key features and can be used in the enterprise intranet very efficiently. These portlets are very scalable and powerful and provide the developer with the tools to customize it very easily. Let's see some of the most frequently used portlets in Liferay Portal. Content management Content management is a common feature in any web-based portal or website: The Web Content portlet has the features of full web publishing, office integration, and the asset library, which contains documents, images, and videos. This portlet also has the structure and templates that help with the designing of the web content's look and feel. Structure can be designed with the help of a visual editor with drag and drop. It has the integrated help feature with tooltips to name the attributes of the fields. The Asset Publisher portlet provides you with the feature to select any type of content/asset, such as wiki pages, web content, calendar events, message board messages, documents, media documents, and many more. It also allows us to use filter on them by types, categories, tags, and sources. The display settings provide configurable settings, which helps the content to be displayed to the end users perfectly. The Document and Media portlet is one of the most usable portlets to store any type of document. It allows you to store and manage your documents. It allows you to manage Liferay documents from your own machine's filesystem with the help of WebDAV integration. It has lots of new, built-in features, such as the inline document preview, image preview, and video player. Document metadata is displayed in document details, which makes it easier for you to review the metadata of the document. Also, Document and Media has features named checkin and checkout that helps editing the document in a group very easily. The Document and Media portlet has the multi-repository integration feature, which allows you to configure or mount any other repository very easily, such as SharePoint, Documentum, and Alfresco, utilizing the CMIS standard. Collaboration Collaboration features are generally ways in which users communicate with each other, such as the ones shown in the following list: The Dynamic data list portlet provides you with the facility of not writing a single line of code to create the form or data list. Say, for example, your corporate intranet needs the job posting done on a daily basis by the HR administrator. The administrator needs to develop the custom portlet to fulfill that requirement. 
Now, the dynamic data list portlet will allow the administrator to create a form for job posting. It's very easy to create and display new data types. The Blog portlet is one of the best features of Liferay. Blog portlets have two other related portlets, namely Recent Bloggers and Blogs Aggregator. The blog portlet provides the best possible ways for chronological publications of personal thoughts and web links in the intranet. Blog portlets can be placed for users of different sites/departments under the respective site//department page. The Calendar portlet provides the feature to create the event and schedule the event. It has many features that help the users in viewing the meeting schedule. The Message Board portlet is a full-featured forum solution with threaded views, categories, RSS capability, avatars, file attachments, previews, dynamic lists of recent posts, and forum statistics. Message Board portlets work with the fine-grained permissions and role-based access control model to give detailed levels of control to administrators and users. The Wiki portlet, like the Message Boards portlet, provides a straightforward wiki solution for both intranet and extranet portals that provides knowledge management among the users. It has all of the features you would expect in a state-of-the-art wiki. Again, it has the features of a file attachment preview, publishing the content, and versioning, and works with a fine-grained permission and role-based access control model. This again takes all the features of the Liferay platform. The Social Activity portlet allows you to tweak the measurements used to calculate user involvement within a site. The contribution and participation values determine the reward value of an action. It uses the blog entry, wiki, and message board points to calculate the user involvement in the site. The Marketplace portlet is placed inside the control panel. It's a hub for the applications provided by Liferay and other partners. You can find that many applications are free, and for certain applications, you need to pay an amount. It's more like an app store. This feature was introduced in Liferay Version 6.1. In the Liferay 6.2 control panel, under the Apps | Store link section, you will see apps that are stored in the Marketplace portlet. Liferay 6.2 comes with a new control panel that is very easy to manage for the portal's Admin users. Liferay Sync is not a portlet; it's a new feature of Liferay that allows you to synchronize documents of Liferay Document and Media with your local system. Liferay provide the Liferay Sync application, which has to be installed in your local system or mobile device. News RSS portlets provide RSS feeds. RSS portlets are used for the publishers by letting them syndicate content automatically. They benefit readers who want to subscribe to timely updates from their favorite websites or to aggregate feeds from many sites into one place. A Liferay RSS portlet is fully customizable, and it allows you to set the URL from which site you would like to get feeds. Social Activities portlets display portal-wide user activity, such as posting on message boards, creating wikis, and adding documents to Documents and Media. There are more portlets for social categories, such as User Statistics portlets, Group Statistics portlets, and Requests portlets. All these portlets are used for the social media. Tools The Search portlet provides faceted search features. 
When a search is performed, facet information will appear based on the results of the search. The number of each asset type and the most frequently occurring tags and categories as well as their frequency will all appear in the left-hand side column of the portlet. It searches through Bookmarks, Blogs Entries, Web Content Articles, Document Library Files, Users, Message Board, and Wiki. Finding more information on Liferay In this article, we looked at what Liferay can do for your corporate intranet and briefly saw why it's a good choice. If you want more background information on Liferay, the best place to start is the Liferay corporate website (http://www.liferay.com) itself. You can find the latest news and events, various training programs offered worldwide, presentations, demonstrations, and hosted trails. More interestingly, Liferay eats its own dog food; corporate websites within forums (called message boards), blogs, and wikis are built by Liferay using its own products. It is a real demo of Liferay Portal's software. Liferay is 100 percent open source and all downloads are available from the Liferay Portal website at http://www.liferay.com/web/guest/downloads/portal and the SourceForge website at http://sourceforge.net/projects/lportal/files. The source code repository is available at https://github.com/liferay. The Liferay website's wiki (http://www.liferay.com/web/guest/community/wiki) contains documentation, including a tutorial, user guide, developer guide, administrator guide, roadmap, and so on. The Liferay website's discussion forums can be accessed at http://www.liferay.com/web/guest/community/forums and the blogs at http://www.liferay.com/community/blogs/highlighted. The official plugins and the community plugins are available at http://www.liferay.com/marketplace and are the best place to share your thoughts, get tips and tricks about Liferay implementation, and use and contribute community plugins. If you would like to file a bug or know more about the fixes in a specific release, then you must visit the bug-tracking system at http://issues.liferay.com/. Summary In this article, we looked at what Liferay can offer your intranet and what we should consider while designing the company's enterprise site. We saw that our final intranet will provide shared documents, discussions, collaborative wikis, and more in a single, searchable portal. Well, Liferay is a great choice for an intranet because it provides so many features and is easy to use, free and open source, extensible, and well-integrated with other tools and standards. We also saw the other kinds of sites Liferay is good for, such as extranets, collaborative websites, content management, web publishing, and infrastructure portals. For the best example of an intranet and extranet, you can visit www.liferay.com. It will provide you with more background information. Resources for Article: Further resources on this subject: Working with a Liferay User / User Group / Organization[article] Liferay, its Installation and setup[article] Building your First Liferay Site [article]

JavaScript Execution with Selenium

Packt
04 Sep 2015
23 min read
In this article, by Mark Collin, the author of the book, Mastering Selenium WebDriver, we will look at how we can directly execute JavaScript snippets in Selenium. We will explore the sort of things that you can do and how they can help you work around some of the limitations that you will come across while writing your scripts. We will also have a look at some examples of things that you should avoid doing. (For more resources related to this topic, see here.) Introducing the JavaScript executor Selenium has a mature API that caters to the majority of automation tasks that you may want to throw at it. That being said, you will occasionally come across problems that the API doesn't really seem to support. This was very much on the development team's mind when Selenium was written. So, they provided a way for you to easily inject and execute arbitrary blocks of JavaScript. Let's have a look at a basic example of using a JavaScript executor in Selenium: JavascriptExecutor js = (JavascriptExecutor) driver; js.executeScript("console.log('I logged something to the Javascript console');"); Note that the first thing we do is cast a WebDriver object into a JavascriptExecutor object. The JavascriptExecutor interface is implemented through the RemoteWebDriver class. So, it's not a part of the core set of API functions. Since we normally pass around a WebDriver object, the executeScript functions will not be available unless we perform this cast. If you are directly using an instance of RemoteWebDriver or something that extends it (most driver implementations now do this), you will have direct access to the .executeScript() function. Here's an example: FirefoxDriver driver = new FirefoxDriver(new FirefoxProfile()); driver.executeScript("console.log('I logged something to the Javascript console');"); The second line (in both the preceding examples) is just telling Selenium to execute an arbitrary piece of JavaScript. In this case, we are just going to print something to the JavaScript console in the browser. We can also get the .executeScript() function to return things to us. For example, if we tweak the script of JavaScript in the first example, we can get Selenium to tell us whether it managed to write to the JavaScript console or not, as follows: JavascriptExecutor js = (JavascriptExecutor) driver; Object response = js.executeScript("return console.log('I logged something to the Javascript console');"); In the preceding example, we will get a result of true coming back from the JavaScript executor. Why does our JavaScript start with return? Well, the JavaScript executed by Selenium is executed as a body of an anonymous function. This means that if we did not add a return statement to the start of our JavaScript snippet, we would actually be running this JavaScript function using Selenium: var anonymous = function () { console.log('I logged something to the Javascript console'); }; This function does log to the console, but it does not return anything. So, we can't access the result of the JavaScript snippet. If we prefix it with a return, it will execute this anonymous function: var anonymous = function () { return console.log('I logged something to the Javascript console'); }; This does return something for us to work with. In this case, it will be the result of our attempt to write some text to the console. If we succeeded in writing some text to the console, we will get back a true value. If we failed, we will get back a false value. 
Note that in our example, we saved the response as an object—not a string or a Boolean. This is because the JavaScript executor can return lots of different types of objects. What we get as a response can be one of the following: If the result is null or there is no return value, a null will be returned If the result is an HTML element, a WebElement will be returned If the result is a decimal, a double will be returned If the result is a nondecimal number, a long will be returned If the result is a Boolean, a Boolean will be returned If the result is an array, a List object with each object that it contains, along with all of these rules, will be returned (nested lists are supported) For all other cases, a string will be returned It is an impressive list, and it makes you realize just how powerful this method is. There is more as well. You can also pass arguments into the .executeScript() function. The arguments that you pass in can be any one of the following: Number Boolean String WebElement List They are then put into a magic variable called arguments, which can be accessed by the JavaScript. Let's extend our example a little bit to pass in some arguments, as follows: String animal = "Lion"; int seen = 5; JavascriptExecutor js = (JavascriptExecutor) driver; js.executeScript("console.log('I have seen a ' + arguments[0] + ' ' + arguments[1] + ' times(s)');", animal, seen); This time, you will see that we managed to print the following text into the console: I have seen a Lion 5 times(s) As you can see, there is a huge amount of flexibility with the JavaScript executor. You can write some complex bits of JavaScript code and pass in lots of different types of arguments from your Java code. Think of all the things that you could do! Let's not get carried away We now know the basics of how one can execute JavaScript snippets in Selenium. This is where some people can start to get a bit carried away. If you go through the mailing list of the users of Selenium, you will see many instances of people asking why they can't click on an element. Most of the time, this is due to the element that they are trying to interact with not being visible, which is blocking a click action. The real solution to this problem is to perform an action (the same one that they would perform if they were manually using the website) to make the element visible so that they can interact with it. However, there is a shortcut offered by many, which is a very bad practice. You can use a JavaScript executor to trigger a click event on this element. Doing this will probably make your test pass. So why is it a bad solution? The Selenium development team has spent quite a lot of time writing code that works out if a user can interact with an element. It's pretty reliable. So, if Selenium says that you cannot currently interact with an element, it's highly unlikely that it's wrong. When figuring out whether you can interact with an element, lots of things are taken into account, including the z-index of an element. For example, you may have a transparent element that is covering the element that you want to click on and blocking the click action so that you can't reach it. Visually, it will be visible to you, but Selenium will correctly see it as not visible. If you now invoke a JavaScript executor to trigger a click event on this element, your test will pass, but users will not be able to interact with it when they try to manually use your website. 
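If the element is only temporarily blocked (say, an overlay is still fading out), the honest fix is usually an explicit wait until a real click becomes possible, not a JavaScript shortcut. Here is a minimal sketch of that approach; the By.id("save") locator and the ten-second timeout are illustrative assumptions rather than values taken from this article:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class ClickWhenReady {

    // Wait until Selenium itself agrees that the element can be interacted with,
    // and then perform a real click, with no JavaScript executor involved.
    public void clickWhenClickable(WebDriver driver) {
        WebDriverWait wait = new WebDriverWait(driver, 10);
        WebElement saveButton = wait.until(
                ExpectedConditions.elementToBeClickable(By.id("save")));
        saveButton.click();
    }
}

If the element never becomes clickable, the wait times out and the test fails, which is exactly the signal you want when the application really is broken.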
However, what if Selenium got it wrong and I can interact with the element that I want to click manually? Well, that's great, but there are two things that you need to think about. First of all, does it work in all browsers? If Selenium thinks that it is something that you cannot interact with, it's probably for a good reason. Is the markup, or the CSS, overly complicated? Can it be simplified? Secondly, if you invoke a JavaScript executor, you will never know whether the element that you want to interact with really does get blocked at some point in the future. Your test may as well keep passing when your application is broken. Tests that can't fail when something goes wrong are worse than no test at all! If you think of Selenium as a toolbox, a JavaScript executor is a very powerful tool that is present in it. However, it really should be seen as a last resort when all other avenues have failed you. Too many people use it as a solution to any slightly sticky problem that they come across. If you are writing JavaScript code that attempts to mirror existing Selenium functions but are removing the restrictions, you are probably doing it wrong! Your code is unlikely to be better. The Selenium development team have been doing this for a long time with a lot of input from a lot of people, many of them being experts in their field. If you are thinking of writing methods to find elements on a page, don't! Use the .findElement() method provided by Selenium. Occasionally, you may find a bug in Selenium that prevents you from interacting with an element in the way you would expect to. Many people first respond by reaching for the JavascriptExecutor to code around the problem in Selenium. Hang on for just one moment though. Have you upgraded to the latest version of Selenium to see if that fixes your problem? Alternatively, did you just upgrade to the latest version of Selenium when you didn't need to? Using a slightly older version of Selenium that works correctly is perfectly acceptable. Don't feel forced to upgrade for no reason, especially if it means that you have to write your own hacks around problems that didn't exist before. The correct thing to do is to use a stable version of Selenium that works for you. You can always raise bugs for functionality that doesn't work, or even code a fix and submit a pull request. Don't give yourself the additional work of writing a workaround that's probably not the ideal solution, unless you need to. So what should we do with it? Let's have a look at some examples of the things that we can do with the JavaScript executor that aren't really possible using the base Selenium API. First of all, we will start off by getting the element text. Wait a minute, element text? But, that’s easy! You can use the existing Selenium API with the following code: WebElement myElement = driver.findElement(By.id("foo")); String elementText = myElement.getText(); So why would we want to use a JavaScript executor to find the text of an element? Getting text is easy using the Selenium API, but only under certain conditions. The element that you are collecting the text from needs to be displayed. If Selenium thinks that the element from which you are collecting the text is not displayed, it will return an empty string. If you want to collect some text from a hidden element, you are out of luck. You will need to implement a way to do it with a JavaScript executor. Why would you want to do this? 
Well, maybe you have a responsive website that shows different elements based on different resolutions. You may want to check whether these two different elements are displaying the same text to the user. To do this, you will need to get the text of the visible and invisible elements so that you can compare them. Let's create a method to collect some hidden text for us:

private String getHiddenText(WebElement element) {
    JavascriptExecutor js = (JavascriptExecutor) ((RemoteWebElement) element).getWrappedDriver();
    return (String) js.executeScript("return arguments[0].textContent", element);
}

There is some cleverness in this method. First of all, we took the element that we wanted to interact with and then extracted the driver object associated with it. We did this by casting the WebElement into a RemoteWebElement, which allowed us to use the getWrappedDriver() method. This removes the need to pass a driver object around all the time (something that happens a lot in some code bases). We then took the driver object and cast it into a JavascriptExecutor so that we would have the ability to invoke the executeScript() method. Next, we executed the JavaScript snippet, passing in the original element as an argument; reading its textContent property returns the text even when the element is not displayed. Finally, we took the response of the executeScript() call and cast it into a string that we can return as the result of the method.

Generally, getting text is a code smell. Your tests should not rely on specific text being displayed on a website because content always changes. Maintaining tests that check the content of a site is a lot of work, and it makes your functional tests brittle. The best thing to do is test the mechanism that injects the content into the website. If you use a CMS that injects text into a specific template key, you can test whether each element has the correct template key associated with it.

I want to see a more complex example!

So you want to see something more complicated. The Advanced User Interactions API cannot deal with HTML5 drag and drop. So, what happens if we come across an HTML5 drag-and-drop implementation that we want to automate? Well, we can use the JavascriptExecutor.
Let's have a look at the markup for the HTML5 drag-and-drop page: <!DOCTYPE html> <html lang="en"> <head> <meta charset=utf-8> <title>Drag and drop</title> <style type="text/css"> li { list-style: none; } li a { text-decoration: none; color: #000; margin: 10px; width: 150px; border-width: 2px; border-color: black; border-style: groove; background: #eee; padding: 10px; display: block; } *[draggable=true] { cursor: move; } ul { margin-left: 200px; min-height: 300px; } #obliterate { background-color: green; height: 250px; width: 166px; float: left; border: 5px solid #000; position: relative; margin-top: 0; } #obliterate.over { background-color: red; } </style> </head> <body> <header> <h1>Drag and drop</h1> </header> <article> <p>Drag items over to the green square to remove them</p> <div id="obliterate"></div> <ul> <li><a id="one" href="#" draggable="true">one</a></li> <li><a id="two" href="#" draggable="true">two</a></li> <li><a id="three" href="#" draggable="true">three</a></li> <li><a id="four" href="#" draggable="true">four</a></li> <li><a id="five" href="#" draggable="true">five</a></li> </ul> </article> </body> <script> var draggableElements = document.querySelectorAll('li > a'), obliterator = document.getElementById('obliterate'); for (var i = 0; i < draggableElements.length; i++) { element = draggableElements[i]; element.addEventListener('dragstart', function (event) { event.dataTransfer.effectAllowed = 'copy'; event.dataTransfer.setData('being-dragged', this.id); }); } obliterator.addEventListener('dragover', function (event) { if (event.preventDefault) event.preventDefault(); obliterator.className = 'over'; event.dataTransfer.dropEffect = 'copy'; return false; }); obliterator.addEventListener('dragleave', function () { obliterator.className = ''; return false; }); obliterator.addEventListener('drop', function (event) { var elementToDelete = document.getElementById( event.dataTransfer.getData('being-dragged')); elementToDelete.parentNode.removeChild(elementToDelete); obliterator.className = ''; return false; }); </script> </html> Note that you need a browser that supports HTML5/CSS3 for this page to work. The latest versions of Google Chrome, Opera Blink, Safari, and Firefox will work. You may have issues with Internet Explorer (depending on the version that you are using). For an up-to-date list of HTML5/CSS3 support, have a look at http://caniuse.com. If you try to use the Advanced User Interactions API to automate this page, you will find that it just doesn't work. It looks like it's time to reach for JavascriptExecutor. First of all, we need to write some JavaScript that can simulate the events that we need to trigger to perform the drag-and-drop action. To do this, we are going to create three JavaScript functions. The first function is going to create a JavaScript event: function createEvent(typeOfEvent) { var event = document.createEvent("CustomEvent"); event.initCustomEvent(typeOfEvent, true, true, null); event.dataTransfer = { data: {}, setData: function (key, value) { this.data[key] = value; }, getData: function (key) { return this.data[key]; } }; return event; } We then need to write a function that will fire events that we have created. This also allows you to pass in the dataTransfer value set on an element. 
We need this to keep track of the element that we are dragging:

function dispatchEvent(element, event, transferData) {
    if (transferData !== undefined) {
        event.dataTransfer = transferData;
    }
    if (element.dispatchEvent) {
        element.dispatchEvent(event);
    } else if (element.fireEvent) {
        element.fireEvent("on" + event.type, event);
    }
}

Finally, we need something that will use these two functions to simulate the drag-and-drop action:

function simulateHTML5DragAndDrop(element, target) {
    var dragStartEvent = createEvent('dragstart');
    dispatchEvent(element, dragStartEvent);
    var dropEvent = createEvent('drop');
    dispatchEvent(target, dropEvent, dragStartEvent.dataTransfer);
    var dragEndEvent = createEvent('dragend');
    dispatchEvent(element, dragEndEvent, dropEvent.dataTransfer);
}

Note that the simulateHTML5DragAndDrop function needs us to pass in two elements: the element that we want to drag, and the element that we want to drag it to. It's always a good idea to try out your JavaScript in a browser first. You can copy the preceding functions into the JavaScript console in a modern browser and then try using them to make sure that they work as expected. If things go wrong in your Selenium test, you then know that it is most likely an error invoking it via the JavascriptExecutor rather than a bad piece of JavaScript. We now need to take these scripts and put them into a JavascriptExecutor along with something that will call the simulateHTML5DragAndDrop function:

private void simulateDragAndDrop(WebElement elementToDrag, WebElement target) throws Exception {
    WebDriver driver = getDriver();
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeScript("function createEvent(typeOfEvent) {\n" +
            "    var event = document.createEvent(\"CustomEvent\");\n" +
            "    event.initCustomEvent(typeOfEvent, true, true, null);\n" +
            "    event.dataTransfer = {\n" +
            "        data: {},\n" +
            "        setData: function (key, value) {\n" +
            "            this.data[key] = value;\n" +
            "        },\n" +
            "        getData: function (key) {\n" +
            "            return this.data[key];\n" +
            "        }\n" +
            "    };\n" +
            "    return event;\n" +
            "}\n" +
            "\n" +
            "function dispatchEvent(element, event, transferData) {\n" +
            "    if (transferData !== undefined) {\n" +
            "        event.dataTransfer = transferData;\n" +
            "    }\n" +
            "    if (element.dispatchEvent) {\n" +
            "        element.dispatchEvent(event);\n" +
            "    } else if (element.fireEvent) {\n" +
            "        element.fireEvent(\"on\" + event.type, event);\n" +
            "    }\n" +
            "}\n" +
            "\n" +
            "function simulateHTML5DragAndDrop(element, target) {\n" +
            "    var dragStartEvent = createEvent('dragstart');\n" +
            "    dispatchEvent(element, dragStartEvent);\n" +
            "    var dropEvent = createEvent('drop');\n" +
            "    dispatchEvent(target, dropEvent, dragStartEvent.dataTransfer);\n" +
            "    var dragEndEvent = createEvent('dragend');\n" +
            "    dispatchEvent(element, dragEndEvent, dropEvent.dataTransfer);\n" +
            "}\n" +
            "\n" +
            "var elementToDrag = arguments[0];\n" +
            "var target = arguments[1];\n" +
            "simulateHTML5DragAndDrop(elementToDrag, target);",
            elementToDrag, target);
}

This method is really just a wrapper around the JavaScript code. We take a driver object and cast it into a JavascriptExecutor. We then pass the JavaScript code into the executor as a string. We have made a couple of additions to the JavaScript functions that we previously wrote. Firstly, we set a couple of variables (mainly for code clarity; they can quite easily be inlined) that take the WebElements that we have passed in as arguments. Finally, we invoke the simulateHTML5DragAndDrop function using these elements.
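Building up the script through string concatenation works, but it is awkward to read and easy to break when editing. If you find yourself maintaining larger pieces of embedded JavaScript, one option is to keep them in a file that ships with your tests and read them in at runtime. The following is a rough sketch of that approach rather than the book's implementation; it assumes a hypothetical /simulateDragAndDrop.js resource on the test classpath that contains the three functions above plus a final call to simulateHTML5DragAndDrop(arguments[0], arguments[1]), and it uses java.util.Scanner to read the file:

private String readScript(String resourcePath) {
    // The \\A delimiter makes the Scanner return the entire stream as a single token
    try (Scanner scanner = new Scanner(getClass().getResourceAsStream(resourcePath), "UTF-8")) {
        return scanner.useDelimiter("\\A").next();
    }
}

private void simulateDragAndDropFromFile(WebElement elementToDrag, WebElement target) throws Exception {
    JavascriptExecutor js = (JavascriptExecutor) getDriver();
    // The resource file ends by calling simulateHTML5DragAndDrop(arguments[0], arguments[1])
    js.executeScript(readScript("/simulateDragAndDrop.js"), elementToDrag, target);
}

The JavaScript then lives in its own file, where an editor can highlight and lint it, at the cost of shipping one extra resource with your tests.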
The final piece of the puzzle is to write a test that utilizes the simulateDragAndDrop method, as follows:

@Test
public void dragAndDropHTML5() throws Exception {
    WebDriver driver = getDriver();
    driver.get("http://ch6.masteringselenium.com/dragAndDrop.html");
    final By destroyableBoxes = By.cssSelector("ul > li > a");
    WebElement obliterator = driver.findElement(By.id("obliterate"));
    WebElement firstBox = driver.findElement(By.id("one"));
    WebElement secondBox = driver.findElement(By.id("two"));
    assertThat(driver.findElements(destroyableBoxes).size(), is(equalTo(5)));
    simulateDragAndDrop(firstBox, obliterator);
    assertThat(driver.findElements(destroyableBoxes).size(), is(equalTo(4)));
    simulateDragAndDrop(secondBox, obliterator);
    assertThat(driver.findElements(destroyableBoxes).size(), is(equalTo(3)));
}

This test finds a couple of boxes and destroys them one by one using the simulated drag and drop. As you can see, the JavascriptExecutor is extremely powerful.

Can I use JavaScript libraries?

The logical progression is, of course, to write your own JavaScript libraries that you can import instead of sending everything over as a string. Alternatively, maybe you would just like to import an existing library. Let's write some code that allows you to import a JavaScript library of your choice. It's not particularly complex JavaScript. All that we are going to do is create a new <script> element in a page and then load our library into it, as follows:

public void injectScript(String scriptURL) throws Exception {
    WebDriver driver = getDriver();
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeScript("function injectScript(url) {\n" +
            "    var script = document.createElement('script');\n" +
            "    script.src = url;\n" +
            "    var head = document.getElementsByTagName('head')[0];\n" +
            "    head.appendChild(script);\n" +
            "}\n" +
            "\n" +
            "var scriptURL = arguments[0];\n" +
            "injectScript(scriptURL);", scriptURL);
}

We have again set arguments[0] to a variable before injecting it for clarity, but you can inline this part if you want to. All that remains now is to inject this into a page and check whether it works. Let's write a test! We are going to use this function to inject jQuery into the Google website. The first thing that we need to do is write a method that can tell us whether jQuery has been loaded or not, as follows:

public Boolean isjQueryLoaded() throws Exception {
    WebDriver driver = getDriver();
    JavascriptExecutor js = (JavascriptExecutor) driver;
    return (Boolean) js.executeScript("return typeof jQuery != 'undefined';");
}

Now, we need to put all of this together in a test, as follows:

@Test
public void injectjQueryIntoGoogle() throws Exception {
    WebDriver driver = DriverFactory.getDriver();
    driver.get("http://www.google.com");
    assertThat(isjQueryLoaded(), is(equalTo(false)));
    injectScript("https://code.jquery.com/jquery-latest.min.js");
    assertThat(isjQueryLoaded(), is(equalTo(true)));
}

It's a very simple test. We loaded the Google website. Then, we checked whether jQuery existed. Once we were sure that it didn't exist, we injected jQuery into the page. Finally, we again checked whether jQuery existed. We have used jQuery in our example, but you don't have to use jQuery. You can inject any script that you desire.

Should I inject JavaScript libraries?

It's very easy to inject JavaScript into a page, but stop and think before you do it. Adding lots of different JavaScript libraries may affect the existing functionality of the site.
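One simple precaution, sketched below under the assumption that you pass in the name of the global object the library defines (for example, "jQuery"), is to check for that global before injecting anything, reusing the injectScript() method we just wrote:

public void injectScriptIfAbsent(String scriptURL, String globalName) throws Exception {
    JavascriptExecutor js = (JavascriptExecutor) getDriver();
    // true when window.<globalName> is already defined on the page
    Boolean alreadyLoaded = (Boolean) js.executeScript(
            "return typeof window[arguments[0]] != 'undefined';", globalName);
    if (!alreadyLoaded) {
        injectScript(scriptURL);
    }
}

This way you never overwrite a copy of the library that the site itself ships, and the same check can be turned into an assertion if you would rather the test fail loudly when the library is already present.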
You may have functions in your JavaScript that overwrite existing functions that are already on the page and break the core functionality. If you are testing a site, it may make all of your tests invalid. Failures may arise because there is a clash between the scripts that you inject and the existing scripts used on the site. The flip side is also true: injecting a script may make functionality that is actually broken appear to work. If you are going to inject scripts into an existing site, be sure that you know what the consequences are. If you are going to regularly inject a script, it may be a good idea to add some assertions to ensure that the functions that you are injecting do not already exist before you inject the script. This way, your tests will fail if the developers add a JavaScript function with the same name at some point in the future without your knowledge.

What about asynchronous scripts?

Everything that we have looked at so far has been a synchronous piece of JavaScript. However, what if we wanted to perform some asynchronous JavaScript calls as a part of our test? Well, we can do this. The JavascriptExecutor also has a method called executeAsyncScript(). This will allow you to run some JavaScript that does not respond instantly. Let's have a look at some examples. First of all, we are going to write a very simple bit of JavaScript that will wait for 25 seconds before triggering a callback, as follows:

@Test
public void javascriptExample() throws Exception {
    WebDriver driver = DriverFactory.getDriver();
    driver.manage().timeouts().setScriptTimeout(60, TimeUnit.SECONDS);
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeAsyncScript("var callback = arguments[arguments.length - 1]; window.setTimeout(callback, 25000);");
    driver.get("http://www.google.com");
}

Note that we defined a JavaScript variable named callback, which uses a script argument that we have not set. For asynchronous scripts, Selenium needs to have a callback defined, which is used to detect when the JavaScript that you are executing has finished. This callback object is automatically added to the end of your arguments array. This is what we have defined as the callback variable. If we now run the script, it will load our browser and then sit there for 25 seconds as it waits for the JavaScript snippet to complete and call the callback. It will then load the Google website and finish. We have also set a script timeout on the driver object that will wait for up to 60 seconds for our piece of JavaScript to execute. Let's see what happens if our script takes longer to execute than the script timeout:

@Test
public void javascriptExample() throws Exception {
    WebDriver driver = DriverFactory.getDriver();
    driver.manage().timeouts().setScriptTimeout(5, TimeUnit.SECONDS);
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeAsyncScript("var callback = arguments[arguments.length - 1]; window.setTimeout(callback, 25000);");
    driver.get("http://www.google.com");
}

This time, when we run our test, it waits for 5 seconds and then throws a TimeoutException. It is important to set a script timeout on the driver object when running asynchronous scripts, to give them enough time to execute. What do you think will happen if we execute this as a normal script?
@Test
public void javascriptExample() throws Exception {
    WebDriver driver = DriverFactory.getDriver();
    driver.manage().timeouts().setScriptTimeout(5, TimeUnit.SECONDS);
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeScript("var callback = arguments[arguments.length - 1]; window.setTimeout(callback, 25000);");
    driver.get("http://www.google.com");
}

You may have been expecting an error, but that's not what you got. The script was executed as normal because Selenium was not waiting for a callback, so it did not wait for the script to complete. Since Selenium did not wait for the script to complete, it never hit the script timeout. Hence, no error was thrown.

Wait a minute. What about the callback definition? There was no argument that was used to set the callback variable. Why didn't it blow up? Well, JavaScript isn't as strict as Java. It tried to work out what arguments[arguments.length - 1] resolves to and found that it is not defined, so the callback variable was left undefined. Our test then completed before setTimeout() had a chance to complete its call, so you won't see any console errors. As you can see, it's very easy to make a small error that stops things from working when you work with asynchronous JavaScript. It's also very hard to find these errors because there can be very little user feedback. Always take extra care when using the JavascriptExecutor to execute asynchronous bits of JavaScript.

Summary

In this article, we:

Learned how to use a JavaScript executor to execute JavaScript snippets in the browser through Selenium
Learned about passing arguments into a JavaScript executor and the sort of arguments that are supported
Learned what the possible return types are for a JavaScript executor
Gained a good understanding of when we shouldn't use a JavaScript executor
Worked through a series of examples that showed ways in which we can use a JavaScript executor to enhance our tests

Resources for Article:

Further resources on this subject:
JavaScript tech page
Cross-browser Tests using Selenium WebDriver
Selenium Testing Tools
Learning Selenium Testing Tools with Python
Introduction to Odoo

Packt
04 Sep 2015
12 min read
 In this article by Greg Moss, author of Working with Odoo, he explains that Odoo is a very feature-filled business application framework with literally hundreds of applications and modules available. We have done our best to cover the most essential features of the Odoo applications that you are most likely to use in your business. Setting up an Odoo system is no easy task. Many companies get into trouble believing that they can just install the software and throw in some data. Inevitably, the scope of the project grows and what was supposed to be a simple system ends up being a confusing mess. Fortunately, Odoo's modular design will allow you to take a systematic approach to implementing Odoo for your business. (For more resources related to this topic, see here.) What is an ERP system? An Enterprise Resource Planning (ERP) system is essentially a suite of business applications that are integrated together to assist a company in collecting, managing, and reporting information throughout core business processes. These business applications, typically called modules, can often be independently installed and configured based on the specific needs of the business. As the needs of the business change and grow, additional modules can be incorporated into an existing ERP system to better handle the new business requirements. This modular design of most ERP systems gives companies great flexibility in how they implement the system. In the past, ERP systems were primarily utilized in manufacturing operations. Over the years, the scope of ERP systems have grown to encompass a wide range of business-related functions. Recently, ERP systems have started to include more sophisticated communication and social networking features. Common ERP modules The core applications of an ERP system typically include: Sales Orders Purchase Orders Accounting and Finance Manufacturing Resource Planning (MRP) Customer Relationship Management (CRM) Human Resources (HR) Let's take a brief look at each of these modules and how they address specific business needs. Selling products to your customer Sales Orders, commonly abbreviated as SO, are documents that a business generates when they sell products and services to a customer. In an ERP system, the Sales Order module will usually allow management of customers and products to optimize efficiency for data entry of the sales order. Many sales orders begin as customer quotes. Quotes allow a salesperson to collect order information that may change as the customer makes decisions on what they want in their final order. Once a customer has decided exactly what they wish to purchase, the quote is turned into a sales order and is confirmed for processing. Depending on the requirements of the business, there are a variety of methods to determine when a customer is invoiced or billed for the order. This preceding screenshot shows a sample sales order in Odoo. Purchasing products from suppliers Purchase Orders, often known as PO, are documents that a business generates when they purchase products from a vendor. The Purchase Order module in an ERP system will typically include management of vendors (also called suppliers) as well as management of the products that the vendor carries. Much like sales order quotes, a purchase order system will allow a purchasing department to create draft purchase orders before they are finalized into a specific purchasing request. 
Often, a business will configure the Sales Order and Purchase Order modules to work together to streamline business operations. When a valid sales order is entered, most ERP systems will allow you to configure the system so that a purchase order can be automatically generated if the required products are not in stock to fulfill the sales order. ERP systems will allow you to set minimum quantities on-hand or order limits that will automatically generate purchase orders when inventory falls below a predetermined level. When properly configured, a purchase order system can save a significant amount of time in purchasing operations and assist in preventing supply shortages. Managing your accounts and financing in Odoo Accounting and finance modules integrate with an ERP system to organize and report business transactions. In many ERP systems, the accounting and finance module is known as GL for General Ledger. All accounting and finance modules are built around a structure known as the chart of accounts. The chart of accounts organizes groups of transactions into categories such as assets, liabilities, income, and expenses. ERP systems provide a lot of flexibility in defining the structure of your chart of accounts to meet the specific requirements for your business. Accounting transactions are grouped by date into periods (typically by month) for reporting purposes. These reports are most often known as financial statements. Common financial statements include balance sheets, income statements, cash flow statements, and statements of owner's equity. Handling your manufacturing operations The Manufacturing Resource Planning (MRP) module manages all the various business operations that go into the manufacturing of products. The fundamental transaction of an MRP module is a manufacturing order, which is also known as a production order in some ERP systems. A manufacturing order describes the raw products or subcomponents, steps, and routings required to produce a finished product. The raw products or subcomponents required to produce the finished product are typically broken down into a detailed list called a bill of materials or BOM. A BOM describes the exact quantities required of each component and are often used to define the raw material costs that go into manufacturing the final products for a company. Often an MRP module will incorporate several submodules that are necessary to define all the required operations. Warehouse management is used to define locations and sublocations to store materials and products as they move through the various manufacturing operations. For example, you may receive raw materials in one warehouse location, assemble those raw materials into subcomponents and store them in another location, then ultimately manufacture the end products and store them in a final location before delivering them to the customer. Managing customer relations in Odoo In today's business environment, quality customer service is essential to being competitive in most markets. A Customer Relationship Management (CRM) module assists a business in better handling the interactions they may have with each customer. Most CRM systems also incorporate a presales component that will manage opportunities, leads, and various marketing campaigns. Typically, a CRM system is utilized the most by the sales and marketing departments within a company. For this reason, CRM systems are often considered to be sales force automation tools or SFA tools. 
Sales personnel can set up appointments, schedule call backs, and employ tools to manage their communication. More modern CRM systems have started to incorporate social networking features to assist sales personnel in utilizing these newly emerging technologies. Configuring human resource applications in Odoo Human Resource modules, commonly known as HR, manage the workforce- or employee-related information in a business. Some of the processes ordinarily covered by HR systems are payroll, time and attendance, benefits administration, recruitment, and knowledge management. Increased regulations and complexities in payroll and benefits have led to HR modules becoming a major component of most ERP systems. Modern HR modules typically include employee kiosk functions to allow employees to self-administer many tasks such as putting in a leave request or checking on their available vacation time. Finding additional modules for your business requirements In addition to core ERP modules, Odoo has many more official and community-developed modules available. At the time of this article's publication, the Odoo application repository had 1,348 modules listed for version 7! Many of these modules provide small enhancements to improve usability like adding payment type to a sales order. Other modules offer e-commerce integration or complete application solutions, such as managing a school or hospital. Here is a short list of the more common modules you may wish to include in an Odoo installation: Point of Sale Project Management Analytic Accounting Document Management System Outlook Plug-in Country-Specific Accounting Templates OpenOffice Report Designer You will be introduced to various Odoo modules that extend the functionality of the base Odoo system. You can find a complete list of Odoo modules at http://apps.Odoo.com/. This preceding screenshot shows the module selection page in Odoo. Getting quickly into Odoo Do you want to jump in right now and get a look at Odoo 7 without any complex installations? Well, you are lucky! You can access an online installation of Odoo, where you can get a peek at many of the core modules right from your web browser. The installation is shared publicly, so you will not want to use this for any sensitive information. It is ideal, however, to get a quick overview of the software and to get an idea for how the interface functions. You can access a trial version of Odoo at https://www.Odoo.com/start. Odoo – an open source ERP solution Odoo is a collection of business applications that are available under an open source license. For this reason, Odoo can be used without paying license fees and can be customized to suit the specific needs of a business. There are many advantages to open source software solutions. We will discuss some of these advantages shortly. Free your company from expensive software license fees One of the primary downsides of most ERP systems is they often involve expensive license fees. Increasingly, companies must pay these license fees on an annual basis just to receive bug fixes and product updates. Because ERP systems can require companies to devote great amounts of time and money for setup, data conversion, integration, and training, it can be very expensive, often prohibitively so, to change ERP systems. For this reason, many companies feel trapped as their current ERP vendors increase license fees. Choosing open source software solutions, frees a company from the real possibility that a vendor will increase license fees in the years ahead. 
Modify the software to meet your business needs With proprietary ERP solutions, you are often forced to accept the software solution the vendor provides chiefly "as is". While you may have customization options and can sometimes pay the company to make specific changes, you rarely have the freedom to make changes directly to the source code yourself. The advantages to having the source code available to enterprise companies can be very significant. In a highly competitive market, being able to develop solutions that improve business processes and give your company the flexibility to meet future demands can make all the difference. Collaborative development Open source software does not rely on a group of developers who work secretly to write proprietary code. Instead, developers from all around the world work together transparently to develop modules, prepare bug fixes, and increase software usability. In the case of Odoo, the entire source code is available on Launchpad.net. Here, developers submit their code changes through a structure called branches. Changes can be peer reviewed, and once the changes are approved, they are incorporated into the final source code product. Odoo – AGPL open source license The term open source covers a wide range of open source licenses that have their own specific rights and limitations. Odoo and all of its modules are released under the Affero General Public License (AGPL) version 3. One key feature of this license is that any custom-developed module running under Odoo must be released with the source code. This stipulation protects the Odoo community as a whole from developers who may have a desire to hide their code from everyone else. This may have changed or has been appended recently with an alternative license. You can find the full AGPL license at http://www.gnu.org/licenses/agpl-3.0.html. A real-world case study using Odoo The goal is to do more than just walk through the various screens and reports of Odoo. Instead, we want to give you a solid understanding of how you would implement Odoo to solve real-world business problems. For this reason, this article will present a real-life case study in which Odoo was actually utilized to improve specific business processes. Silkworm, Inc. – a mid-sized screen printing company Silkworm, Inc. is a highly respected mid-sized silkscreen printer in the Midwest that manufactures and sells a variety of custom apparel products. They have been kind enough to allow us to include some basic aspects of their business processes as a set of real-world examples implementing Odoo into a manufacturing operation. Using Odoo, we will set up the company records (or system) from scratch and begin by walking through their most basic sales order process, selling T-shirts. From there, we will move on to manufacturing operations, where custom art designs are developed and then screen printed onto raw materials for shipment to customers. We will come back to this real-world example so that you can see how Odoo can be used to solve real-world business solutions. Although Silkworm is actively implementing Odoo, Silkworm, Inc. does not directly endorse or recommend Odoo for any specific business solution. Every company must do their own research to determine whether Odoo is a good fit for their operation. Summary In this article, we have learned about the ERP system and common ERP modules. An introduction about Odoo and features of it. 
Resources for Article: Further resources on this subject: Getting Started with Odoo Development[article] Machine Learning in IPython with scikit-learn [article] Making Goods with Manufacturing Resource Planning [article]
Creating the POI ListView layout

Packt
04 Sep 2015
27 min read
In this article by Nilanchala Panigrahy, author of the book Xamarin Mobile Application Development for Android Second Edition, will walk you through the activities related to creating and populating a ListView, which includes the following topics: Creating the POIApp activity layout Creating a custom list row item layout The ListView and ListAdapter classes (For more resources related to this topic, see here.) It is technically "possible to create and attach the user interface elements to your activity using C# code. However, it is a bit of a mess. We will go with the most common approach by declaring the XML-based layout. Rather than deleting these files, let's give them more appropriate names and remove unnecessary content as follows: Select the Main.axml file in Resources | Layout and rename it to "POIList.axml. Double-click on the POIList.axml file to open it in a layout "designer window.Currently, the POIList.axml file contains the layout that was created as part of the default Xamarin Studio template. As per our requirement, we need to add a ListView widget that takes the complete screen width and a ProgressBar in the middle of the screen. The indeterminate progress "bar will be displayed to the user while the data is being downloaded "from the server. Once the download is complete and the data is ready, "the indeterminate progress bar will be hidden before the POI data is rendered on the list view. Now, open the "Document Outline tab in the designer window and delete both the button and LinearLayout. Now, in the designer Toolbox, search for RelativeLayout and drag it onto the designer layout preview window. Search for ListView in the Toolbox search field and drag it over the layout designer preview window. Alternatively, you can drag and drop it over RelativeLayout in the Document Outline tab. We have just added a ListView widget to POIList.axml. Let's now open the Properties pad view in the designer window and edit some of its attributes: There are five buttons at the top of the pad that switch the set of properties being edited. The @+id notation notifies the compiler that a new resource ID needs to be created to identify the widget in API calls, and listView1 identifies the name of the constant. Now, perform the following steps: Change the ID name to poiListView and save the changes. Switch back to the Document Outline pad and notice that the ListView ID is updated. Again, switch back to the Properties pad and click on the Layout button. Under the View Group section of the layout properties, set both the Width and Height properties to match_parent. The match_parent value "for the Height and Width properties tells us that the ListView can use the entire content area provided by the parent, excluding any margins specified. In our case, the parent would be the top-level RelativeLayout. Prior to API level 8, fill_parent was used instead of match_parent to accomplish the same effect. In API level 8, fill_parent was deprecated and replaced with match_parent for clarity. Currently, both the constants are defined as the same value, so they have exactly the same effect. However, fill_ parent may be removed from the future releases of the API; so, going forward, match_parent should be used. So far, we have added a ListView to RelativeLayout, let's now add a Progress Bar to the center of the screen. Search for Progress Bar in the Toolbox search field. You will notice that several types of progress bars will be listed, including horizontal, large, normal, and small. 
Drag the normal progress bar onto RelativeLayout.By default, the Progress Bar widget is aligned to the top left of its parent layout. To align it to the center of the screen, select the progress bar in the Document Outline tab, switch to the Properties view, and click on the Layout tab. Now select the Center In Parent checkbox, and you will notice that the progress bar is aligned to the center of the screen and will appear at the top of the list view. Currently, the progress bar is visible in the center of the screen. By default, this could be hidden in the layout and will be made visible only while the data is being downloaded. Change the Progress Bar ID to progressBar and save the changes. To hide the Progress Bar from the layout, click on the Behavior tab in the Properties view. From Visibility, select Box, and then select gone.This behavior can also be controlled by calling setVisibility() on any view by passing any of the following behaviors. The View.Visibility property" allows you to control whether a view is visible or not. It is based on the ViewStates enum, which defines the following values: Value Description Gone This value tells the parent ViewGroup to treat the View as though it does not exist, so no space will be allocated in the layout Invisible This value tells the parent ViewGroup to hide the content for the View; however, it occupies the layout space Visible This value tells the parent ViewGroup to display the content of the View Click on the Source tab to switch the IDE context from visual designer to code, and see what we have built so far. Notice that the following code is generated for the POIList.axml layout: <?xml version="1.0" encoding="utf-8"?> <RelativeLayout p1_layout_width="match_parent" p1_layout_height="match_parent" p1_id="@+id/relativeLayout1"> <ListView p1_minWidth="25px" p1_minHeight="25px" p1_layout_width="match_parent" p1_layout_height="match_parent" p1_id="@+id/poiListView" /> <ProgressBar p1_layout_width="wrap_content" p1_layout_height="wrap_content" p1_id="@+id/progressBar" p1_layout_centerInParent="true" p1_visibility="gone" /> </RelativeLayout> Creating POIListActivity When we created the" POIApp solution, along with the default layout, a default activity (MainActivity.cs) was created. Let's rename the MainActivity.cs file "to POIListActivity.cs: Select the MainActivity.cs file from Solution Explorer and rename to POIListActivity.cs. Open the POIListActivity.cs file in the code editor and rename the "class to POIListActivity. The POIListActivity class currently contains the code that was "created automatically while creating the solution using Xamarin Studio. We will write our own activity code, so let's remove all the code from the POIListActivity class. Override the OnCreate() activity life cycle callback method. This method will be used to attach the activity layout, instantiate the views, and write other activity initialization logic. Add the following code blocks to the POIListActivity class: namespace POIApp { [Activity (Label = "POIApp", MainLauncher = true, Icon = "@ drawable/icon")] public class POIListActivity : Activity { protected override void OnCreate (Bundle savedInstanceState) { base.OnCreate (savedInstanceState); } } } Now let's set the activity content layout by calling the SetContentView(layoutId) method. This method places the layout content directly into the activity's view hierarchy. Let's provide the reference to the POIList layout created in previous steps. 
At this point, the POIListActivity class looks as follows: namespace POIApp { [Activity (Label = "POIApp", MainLauncher = true, Icon = "@drawable/icon")] public class POIListActivity : Activity { protected override void OnCreate (Bundle savedInstanceState) { base.OnCreate (savedInstanceState); SetContentView (Resource.Layout.POIList); } } } Notice that in the preceding code snippet, the POIListActivity class uses some of the [Activity] attributes such as Label, MainLauncher, and Icon. During the build process, Xamarin.Android uses these attributes to create an entry in the AndroidManifest.xml file. Xamarin makes it easier by allowing all of the Manifest properties to set using attributes so that you never have to modify them manually in AndroidManifest.xml. So far, we have "declared an activity and attached the layout to it. At this point, if you run the app on your Android device or emulator, you will notice that a blank screen will be displayed. Creating the POI list row layout We now turn our attention to the" layout for each row in the ListView widget. "The" Android platform provides a number of default layouts out of the box that "can be used with a ListView widget: Layout Description SimpleListItem1 A "single line with a single caption field SimpleListItem2 A "two-line layout with a larger font and a brighter text color for the first field TwoLineListItem A "two-line layout with an equal sized font for both lines and a brighter text color for the first line ActivityListItem A "single line of text with an image view All of the preceding three layouts provide a pretty standard design, but for more control over content layout, a custom layout can also be created, which is what is needed for poiListView. To create a new layout, perform the following steps: In the Solution pad, navigate to Resources | Layout, right-click on it, and navigate to Add | New File. Select Android from the list on the left-hand side, Android Layout from the template list, enter POIListItem in the name column, and click on New. Before we proceed to lay out the design for each of the row items in the list, we must draw on a piece of paper and analyze how the UI will look like. In our example, the POI data will be organized as follows: There are a "number of ways to achieve this layout, but we will use RelativeLayout to achieve the same result. There is a lot going on in this diagram. Let's break it down as follows: A RelativeLayout view group is used as the top-level container; it provides a number of flexible options for positioning relative content, its edges, or other content. An ImageView widget is used to display a photo of the POI, and it is anchored to the left-hand side of the RelativeLayout utility. Two TextView widgets are used to display the POI name and address information. They need to be anchored to the right-hand side of the ImageView widget and centered within the parent RelativeLayout "utility. The easiest way to accomplish this is to place both the TextView classes inside another layout; in this case, a LinearLayout widget with "the orientation set to vertical. An additional TextView widget is used to display the distance, and it is anchored on the right-hand side of the RelativeLayout view group and centered vertically. Now, our task is to get this definition into POIListItem.axml. The next few sections describe how to "accomplish this using the Content view of the designer when feasible and the Source view when required. 
Adding a RelativeLayout view group The RelativeLayout layout "manager allows its child views to be positioned relative to each other or relative to the container or another container. In our case, for building the row layout, as shown in the preceding diagram, we can use RelativeLayout as a top-level view group. When the POIListItem.axml layout file was created, by default a top-level LinearLayout was added. First, we need to change the top-level ViewGroup to RelativeLayout. The following section will take you through the steps to complete the layout design for the POI list row: With POIListItem.axml opened in the content mode, select the entire layout by clicking on the content area. You should see a blue outline going around the edge. Press Delete. The LinearLayout view group will be deleted, and you will see a message indicating that the layout is empty. Alternatively, you can also select the LinearLayout view group from the Document Outline tab and press Delete. Locate the RelativeLayout view group in the toolbox and drag it onto the layout. Select the RelativeLayout view group from Document Outline. Open the Properties pad and change the following properties: The Padding option to 5dp The Layout Height option to wrap_content The Layout Width option to match_parent The padding property controls how much space will be placed around each item as a margin, and the height determines the height of each list row. Setting the Layout Width option to match_ parent will cause the POIListItem content to consume the entire width of the screen, while setting the Layout Height option to wrap_content will cause each row to be equal to the longest control. Switch to the Code view to see what has been added to the layout. Notice that the following lines of code have been added to RelativeLayout: <RelativeLayout p1_minWidth="25px" p1_minHeight="25px" p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/relativeLayout1" p1_padding="5dp"/> Android runs on a "variety of devices that offer different screen sizes and densities. When specifying dimensions, you can use a number of different units, including pixels (px), inches (in), and density-independent pixels (dp). Density-independent pixels are abstract units based on 1 dp being 1 pixel on a 160 dpi screen. At runtime, Android will scale the actual size up or down based on the actual screen density. It is a best practice to specify dimensions using density-independent pixels. Adding an ImageView widget The ImageView widget in "Android is used to display the arbitrary image for different sources. In our case, we will download the images from the server and display them in the list. Let's add an ImageView widget to the left-hand side of the layout and set the following configurations: Locate the ImageView widget in the toolbox and drag it onto RelativeLayout. With the ImageView widget selected, use the Properties pad to set the ID to poiImageView. Now, click on the Layout tab in the Properties pad and set the Height and Width values to 65 dp. In the property grouping named RelativeLayout, set Center Vertical to true. Simply clicking on the checkbox does not seem to work, but you can click on the small icon that looks like an edit box, which is to the right-hand side, and just enter true. If everything else fails, just switch to the Source view and enter the following code: p1:layout_centerVertical="true" In the property grouping named ViewGroup, set the Margin Right to 5dp. This brings some space between the POI image and the POI name. 
Switch to the Code view to see what has been added to the layout. Notice the following lines of code added to ImageView: <ImageView p1_src="@android:drawable/ic_menu_gallery" p1_layout_width="65dp" p1_layout_height="65dp" p1_layout_marginRight="5dp" p1_id="@+id/poiImageView" /> Adding a LinearLayout widget LinearLayout is one of the "most basic layout managers that organizes its child "views either horizontally or vertically based on the value of its orientation property. Let's add a LinearLayout view group that will be used to lay out "the POI name and address data as follows: Locate the LinearLayout (vertical) view group in the toolbox. Adding this widget is a little trickier because we want it anchored to the right-hand side of the ImageView widget. Drag the LinearLayout view group to the right-hand side of the ImageView widget until the edge turns to a blue dashed line, and then drop the LinearLayout view group. It will be aligned with the right-hand side of the ImageView widget. In the property grouping named RelativeLayout of the Layout section, set Center Vertical to true. As before, you will need to enter true in the edit box or manually add it to the Source view. Switch to the Code view to see what has been added to the layout. Notice "the following lines of code added to LinearLayout: <LinearLayout p1_orientation="vertical" p1_minWidth="25px" p1_minHeight="25px" p1_layout_width="wrap_content" p1_layout_height="wrap_content" p1_layout_toRightOf="@id/poiImageView" p1_id="@+id/linearLayout1" p1_layout_centerVertical="true" /> Adding the name and address TextView classes Add the TextView classes to display the POI name and address: Locate TextView in" the Toolbox and add a TextView class to the layout. "This TextView needs to be added within the LinearLayout view group we just added, so drag TextView over the LinearLayout view group until it turns blue and then drop it. Name the TextView ID as nameTextView and set the text size to 20sp. "The text size can be set in the Style section of the Properties pad; you will need to expand the Text Appearance group by clicking on the ellipsis (...) button on the right-hand side. Scale-independent pixels (sp) "are like dp units, but they are also scaled by the user's font size preference. Android allows users to select a font size in the Accessibility section of Settings. When font sizes are specified using sp, Android will not only take into account the screen density when scaling text, but will also consider the user's accessibility settings. It is recommended that you specify font sizes using sp. Add "another TextView to the LinearLayout view group using the same technique except dragging the new widget to the bottom edge of the nameTextView until it changes to a blue dashed line and then drop it. This will cause the second TextView to be added below nameTextView. Set the font size to 14sp. Change the ID of the newly added TextView to addrTextView. Now change the sample text for both nameTextView and addrTextView to POI Name and City, State, Postal Code. To edit the text shown in TextView, just double tap the widget on the content panel. This enables a small editor that allows you to enter the text directly. Alternately, you can change the text by entering a value for the Text property in the Widget section of the Properties pad. It is a design practice to declare all your static strings in the Resources/values/string.xml file. 
By declaring the strings in the strings.xml file, you can easily translate your whole app to support other languages. Let's add the following strings to string.xml: <string name="poi_name_hint">POI Name</string> <string name="address_hint">City, State, Postal Code.</string> You can now change the Text property of both nameTextView and addrTextView by selecting the ellipsis (…) button, which is next to the Text property in the Widget section of the Properties pad. Notice that this will open a dialog window that lists all the strings declared in the string.xml file. Select the appropriate strings for both TextView objects. Now let's switch to the Code view to see what has been added to the layout. Notice the following lines of code added inside LinearLayout: <TextView p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/nameTextView " p1_textSize="20sp" p1_text="@string/app_name" /> <TextView p1_text="@string/address_hint" p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/addrTextView " p1_textSize="14sp" /> Adding the distance TextView Add a TextView to show "the distance from POI: Locate the TextView in the toolbox and add a TextView to the layout. This TextView needs to be anchored to the right-hand side of the RelativeLayout view group, but there is no way to visually accomplish this; so, we will use a multistep process. Initially, align the TextView with the right-hand edge of the LinearLayout view group by dragging it to the left-hand side until the edge changes to a dashed blue line and drop it. In the Widget section of the Properties pad, name the widget as distanceTextView and set the font size to 14sp. In the Layout section of the Properties pad, set Align Parent Right to true, Center Vertical to true, and clear out the linearLayout1 view group name in the To Right Of layout property. Change the sample text to 204 miles. To do this, let's add a new string entry to string.xml and set the Text property from the Properties pad.The following screenshot depicts what should be seen from the Content view "at this point: Switch back to the "Source tab in the layout designer, and notice the following code generated for the POIListItem.axml layout: <?xml version="1.0" encoding="utf-8"?> <RelativeLayout p1_minWidth="25px" p1_minHeight="25px" p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/relativeLayout1" p1_padding="5dp"> <ImageView p1_src="@android:drawable/ic_menu_gallery" p1_layout_width="65dp" p1_layout_height="65dp" p1_layout_marginRight="5dp" p1_id="@+id/poiImageView" /> <LinearLayout p1_orientation="vertical" p1_layout_width="wrap_content" p1_layout_height="wrap_content" p1_layout_toRightOf="@id/poiImageView" p1_id="@+id/linearLayout1" p1_layout_centerVertical="true"> <TextView p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/nameTextView " p1_textSize="20sp" p1_text="@string/app_name" /> <TextView p1_text="@string/address_hint" p1_layout_width="match_parent" p1_layout_height="wrap_content" p1_id="@+id/addrTextView " p1_textSize="14sp" /> </LinearLayout> <TextView p1_text="@string/distance_hint" p1_layout_width="wrap_content" p1_layout_height="wrap_content" p1_id="@+id/textView1" p1_layout_centerVertical="true" p1_layout_alignParentRight="true" /> </RelativeLayout> Creating the PointOfInterest apps entity class The first class that is needed is the one that represents the primary focus of the application, a PointofInterest class. 
POIApp will allow the following attributes "to be captured for the Point Of Interest app: Id Name Description Address Latitude Longitude Image The POI entity class can be nothing more than a simple .NET class, which houses these attributes. To create a POI entity class, perform the following steps: Select the POIApp project from the Solution Explorer in Xamarin Studio. Select the POIApp project and not the solution, which is the top-level node in the Solution pad. Right-click on it and select New File. On the left-hand side of the New File dialog box, select General. At the top of the template list, in the middle of the dialog box, select Empty Class (C#). Enter the name PointOfInterest and click on OK. The class will be created in the POIApp project folder. Change the "visibility of the class to public and fill in the attributes based on the list previously identified. The following code snippet is from POIAppPOIAppPointOfInterest.cs from the code bundle available for this article: public class PointOfInterest { public int Id { get; set;} public string Name { get; set; } public string Description { get; set; } public string Address { get; set; } public string Image { get; set; } public double? Latitude { get; set; } public double? Longitude { get; set; } } Note that the Latitude and Longitude attributes are all marked as nullable. In the case of latitude and longitude, (0, 0) is actually a valid location so a null value indicates that the attributes have never been set. Populating the ListView item All the adapter "views such as ListView and GridView use an Adapter that acts as a bridge between the data and views. The Adapter iterates through the content and generates Views for each data item in the list. The Android SDK provides three different adapter implementations such as ArrayAdapter, CursorAdapter, and SimpleAdapter. An ArrayAdapter expects an array or a list as input, while CursorAdapter accepts the instance of the Cursor, and SimpleAdapter maps the static data defined in the resources. The type of adapter that suits your app need is purely based on the input data type. The BaseAdapter is the generic implementation for all of the three adapter types, and it implements the IListAdapter, ISpinnerAdapter, and IDisposable interfaces. This means that the BaseAdapter can be used for ListView, GridView, "or Spinners. For POIApp, we will create a subtype of BaseAdapter<T> as it meets our specific needs, works well in many scenarios, and allows the use of our custom layout. Creating POIListViewAdapter In order to create "POIListViewAdapter, we will start by creating a custom adapter "as follows: Create a new class named POIListViewAdapter. Open the POIListViewAdapter class file, make the class a public class, "and specify that it inherits from BaseAdapter<PointOfInterest>. Now that the adapter class has been created, we need to provide a constructor "and implement four abstract methods. Implementing a constructor Let's implement a "constructor that accepts all the information we will need to work with to populate the list. Typically, you need to pass at least two parameters: an instance of an activity because we need the activity context while accessing the standard common "resources and an input data list that can be enumerated to populate the ListView. 
The following code shows the constructor from the code bundle: private readonly Activity context; private List<PointOfInterest> poiListData; public POIListViewAdapter (Activity _context, List<PointOfInterest> _poiListData) :base() { this.context = _context; this.poiListData = _poiListData; } Implementing Count { get } The BaseAdapter<T> class "provides an abstract definition for a read-only Count property. In our case, we simply need to provide the count of POIs as provided in poiListData. The following code example demonstrates the implementation from the code bundle: public override int Count { get { return poiListData.Count; } } Implementing GetItemId() The BaseAdapter<T> class "provides an abstract definition for a method that returns a long ID for a row in the data source. We can use the position parameter to access a POI object in the list and return the corresponding ID. The following code example demonstrates the implementation from the code bundle: public override long GetItemId (int position) { return position; } Implementing the index getter method The BaseAdapter<T> class "provides an abstract definition for an index getter method that returns a typed object based on a position parameter passed in as an index. We can use the position parameter to access the POI object from poiListData and return an instance. The following code example demonstrates the implementation from the code bundle: public override PointOfInterest this[int index] { get{ return poiListData [index]; } } Implementing GetView() The BaseAdapter<T> class "provides an abstract definition for GetView(), which returns a view instance that represents a single row in the ListView item. As in other scenarios, you can choose to construct the view entirely in code or to inflate it from a layout file. We will use the layout file we previously created. The following code example demonstrates inflating a view from a layout file: view = context.LayoutInflater.Inflate (Resource.Layout.POIListItem, null, false); The first parameter of Inflate is a resource ID and the second is a root ViewGroup, which in this case can be left null since the view will be added to the ListView item when it is returned. Reusing row Views The GetView() method is called" for each row in the source dataset. For datasets with large numbers of rows, hundreds, or even thousands, it would require a great deal of resources to create a separate view for each row, and it would seem wasteful since only a few rows are visible at any given time. The AdapterView architecture addresses this need by placing row Views into a queue that can be reused as they" scroll out of view of the user. The GetView() method accepts a parameter named convertView, which is of type view. When a view is available for reuse, convertView will contain a reference to the view; otherwise, it will be null and a new view should be created. The following code example depicts the use of convertView to facilitate the reuse of row Views: var view = convertView; if (view == null){ view = context.LayoutInflater.Inflate (Resource.Layout.POIListItem, null); } Populating row Views Now that we have an instance of the "view, we need to populate the fields. The View class defines a named FindViewById<T> method, which returns a typed instance of a widget contained in the view. You pass in the resource ID defined in the layout file to specify the control you wish to access. 
Populating row Views

Now that we have an instance of the view, we need to populate its fields. The View class defines a FindViewById<T> method, which returns a typed instance of a widget contained in the view. You pass in the resource ID defined in the layout file to specify the control you wish to access. The following code returns access to nameTextView and sets its Text property:

PointOfInterest poi = this [position];
view.FindViewById<TextView> (Resource.Id.nameTextView).Text = poi.Name;

Populating addrTextView is slightly more complicated because we only want to use the portions of the address we have, and we want to hide the TextView if none of the address components are present. The View.Visibility property allows you to control the visibility of a view. In our case, we want to use the ViewStates.Gone value if none of the components of the address are present. The following code shows the logic in GetView():

if (String.IsNullOrEmpty (poi.Address)) {
    view.FindViewById<TextView> (Resource.Id.addrTextView).Visibility = ViewStates.Gone;
}
else {
    view.FindViewById<TextView> (Resource.Id.addrTextView).Text = poi.Address;
}

Populating the value for the distance text view requires an understanding of the location services; we need to calculate the distance between the user's current location and the POI's latitude and longitude.

Populating the list thumbnail image

Image downloading and processing is a complex task. You need to consider various aspects, such as the network logic to download images from the server, caching downloaded images for performance, and resizing images to avoid out-of-memory conditions. Instead of writing our own logic for all of these tasks, we can use UrlImageViewHelper, which is a free component available in the Xamarin Component Store. The Xamarin Component Store provides a set of reusable components, including both free and premium components, that can easily be plugged into any Xamarin-based application.

Using UrlImageViewHelper

The following steps will walk you through the process of adding a component from the Xamarin Component Store:

1. To include the UrlImageViewHelper component in POIApp, either double-click on the Components folder in the Solution pad, or right-click on it and select Edit Components.
2. Notice that the component manager loads the already downloaded components along with a Get More Components button that opens the Component Store from the window. Note that to access the component manager, you need to log in to your Xamarin account.
3. Search for UrlImageViewHelper in the components search box available in the left-hand side pane.
4. Now click on the download button to add the component to your Xamarin Studio solution.

Now that we have added the UrlImageViewHelper component, let's go back to the GetView() method in the POIListViewAdapter class and take a look at the following section of the code:

var imageView = view.FindViewById<ImageView> (Resource.Id.poiImageView);
if (!String.IsNullOrEmpty (poi.Image)) {
    Koush.UrlImageViewHelper.SetUrlDrawable (imageView, poi.Image, Resource.Drawable.ic_placeholder);
}

Let us examine how the preceding code snippet works:

The SetUrlDrawable() method defined in the UrlImageViewHelper component provides the logic to download an image with a single line of code. It accepts three parameters: the instance of imageView where the image is to be displayed after the download, the image source URL, and the placeholder image.
Add a new image, ic_placeholder.png, to the drawable Resources directory. While the image is being downloaded, the placeholder image will be displayed on imageView.

Downloading the image over the network requires Internet permissions. The following section will walk you through the steps involved in defining permissions in your AndroidManifest.xml file.

Adding Internet permissions

Android apps must be granted permissions to access certain features, such as downloading data from the Internet, saving an image in storage, and so on. You must specify the permissions that an app requires in the AndroidManifest.xml file. This allows the installer to show potential users the set of permissions an app requires at the time of installation. To set the appropriate permissions, perform the following steps:

1. Double-click on AndroidManifest.xml in the Properties directory in the Solution pad. The file will open in the manifest editor. There are two tabs, Application and Source, at the bottom of the screen, which can be used to toggle between a form for editing the file and the raw XML, as shown in the following screenshot:
2. In the Required permissions list, check Internet and navigate to File | Save.
3. Switch to the Source view to view the XML as follows:
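The exact contents of your manifest will vary by project, but checking Internet adds a standard Android permission entry along these lines (shown here as a reference sketch):

<!-- Added by checking Internet under Required permissions -->
<uses-permission android:name="android.permission.INTERNET" />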
Summary

In this article, we covered how to create user interface elements using different layout managers and widgets such as TextView, ImageView, ProgressBar, and ListView.

Resources for Article:

Further resources on this subject:

Code Sharing Between iOS and Android [article]
XamChat – a Cross-platform App [article]
Heads up to MvvmCross [article]


Introduction to WatchKit

Packt
04 Sep 2015
7 min read
In this article by Hossam Ghareeb, author of the book Application Development with Swift, we will talk about a new technology, WatchKit, and a new era of wearable technologies. Technology is now a part of all aspects of our lives, even wearable objects; you can see smart watches such as the new Apple Watch or glasses such as Google Glass. We will go through the new WatchKit framework to learn how to extend your iPhone app's functionality to your wrist.

(For more resources related to this topic, see here.)

Apple Watch

Apple Watch is a new device for your wrist that can be used to extend your iPhone app's functionality; you can access the most important information and respond in easy ways using the watch. The watch is now available in most countries in different styles and models, so that everyone can find a watch that suits them. When you get your Apple Watch, you can pair it with your iPhone. The watch can't be paired with the iPad; it can only be paired with your iPhone. To run third-party apps on your watch, the iPhone must be paired with the watch. Once paired, when you install any app on your iPhone that has an extension for the watch, the app will be installed on the watch automatically and wirelessly.

WatchKit

WatchKit is a new framework for building apps for Apple Watch. To run third-party apps on Apple Watch, the watch needs to be connected to the iPhone. WatchKit helps you create apps for Apple Watch by creating two targets in Xcode:

The WatchKit app: The WatchKit app is an executable app that is installed on your watch, and you can install or uninstall it from your iPhone. The WatchKit app contains the storyboard file and resource files. It doesn't contain any source code, just the interface and resource files.
The WatchKit extension: This extension runs on the iPhone and contains the interface controller files for your storyboard. The extension contains just the model and controller classes. The actions and outlets from the WatchKit app's storyboard are linked to these controller files in the WatchKit extension.

These two bundles, the WatchKit extension and the WatchKit app, are packed together inside the iPhone application. When the user installs the iPhone app, the system will prompt the user to install the WatchKit app if there is a paired watch.

Using WatchKit, you can extend your iOS app in three different ways:

The WatchKit app

As we mentioned earlier, the WatchKit app is an app installed on Apple Watch, and the user can find it in the list of watch apps. The user can launch, control, and interact with the app. Once the app is launched, the WatchKit extension in the iPhone app will run in the background to update the user interface, perform any required logic, and respond to user actions. Note that the iPhone app can't launch or wake up the WatchKit extension or the WatchKit app. However, the WatchKit extension can ask the system to launch the iPhone app, and this will be performed in the background.

Glances

Glances are single interfaces that the user can navigate between. A glance view is read-only, which means that you can't add any interactive UI controls such as buttons and switches. Apps should use glances to display very important and timely information. The glance view is a nonscrolling view, so your glance view should fit the watch screen. Avoid using tables and maps in glance interface controllers and focus on delivering the most important information in a nice way. Once the user taps on the glance view, the watch app will be launched.
The glance view is optional in your app. The glance interface and its interface controller files are part of your WatchKit extension and WatchKit app: the glance interface resides in the storyboard, which lives in the WatchKit app, while the interface controller responsible for filling the view with timely, important information is located in the WatchKit extension, which runs in the background in the iPhone app, as we said before.

Actionable notifications

You can handle and respond to local and remote notifications in an easy and fast way using Apple Watch. WatchKit helps you build user interfaces for the notifications that you want to handle in your WatchKit app, and it lets you add actionable buttons so that the user can take action based on the notification. For example, if a notification for an invitation is sent to you, you can accept or reject it right from your wrist. You can respond to these actions easily in the interface controllers of the WatchKit extension.

Working with WatchKit

Enough talking about theory, let's see some action. Go to our lovely Xcode and create a new single-view application and name it WatchKitDemo. Don't forget to select Swift as the app language. Then navigate to File | New | Target to create a new target for the WatchKit app:

After you select the target, in the pop-up window, from the left-hand side under iOS, choose Apple Watch and select WatchKit App. Check the following screenshot:

After you click on Next, it will ask you which application to embed the target in and which scenes to include. Please check the Include Notification Scene and Include Glance Scene options, as shown in the following screenshot:

Click on Finish, and now you have an iPhone app with the built-in WatchKit extension and WatchKit app.

Xcode targets

Now your project should be divided into three parts. Check the following screenshot and let's explain these parts:

As you can see in the screenshot, the project files are divided into three sections. In section 1, you can see the iPhone app source files, interface files or storyboard, and resource files. In section 2, you can find the WatchKit extension, which contains only interface controllers and model files. Again, as we said before, this extension runs on the iPhone in the background. In section 3, you can see the WatchKit app, which runs on Apple Watch itself. As we can see, it contains the storyboard and resource files. No source code can be added to this target.

Interface controllers

In the WatchKit extension of your Xcode project, open InterfaceController.swift. You will find the interface controller file for the scene that exists in Interface.storyboard in the WatchKit app. The InterfaceController class extends WKInterfaceController, which is the base class for interface controllers. Forget the UI classes from the UIKit framework that you were using in iOS apps; WatchKit has its own interface controller classes, and they are very limited in configuration and customization. In the InterfaceController file, you can find three important methods that explain the lifecycle of your controller: awakeWithContext, willActivate, and didDeactivate. Another important lifecycle method that can be overridden is init, but it's not implemented in the controller file. Let's now explain the four lifecycle methods:

init: You can consider this your first chance to update your interface elements.
awakeWithContext: This is called after the init method and receives context data that can be used to update your interface elements or to perform some logical operations on that data. Context data is passed between interface controllers when you push or present another controller and want to pass some data along.
willActivate: Here, your scene is about to become visible onscreen, and it's your last chance to update your UI. Try to put only simple UI changes in this method so as not to cause the UI to freeze.
didDeactivate: Your scene is about to become invisible; if you want to clean up, this is the time to stop animations or timers.
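To make the lifecycle concrete, here is a minimal sketch of what an InterfaceController subclass overriding these methods can look like. The method signatures follow the Swift style in use when WatchKit was introduced, and the comments simply restate the roles described above:

import WatchKit
import Foundation

class InterfaceController: WKInterfaceController {

    override init() {
        super.init()
        // First chance to prepare interface elements
    }

    override func awakeWithContext(context: AnyObject?) {
        super.awakeWithContext(context)
        // Configure the interface using the passed-in context data
    }

    override func willActivate() {
        super.willActivate()
        // The scene is about to be visible; apply simple, last-minute UI updates
    }

    override func didDeactivate() {
        super.didDeactivate()
        // The scene is no longer visible; stop timers and animations here
    }
}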
Summary

In this article, we covered a very important topic: how to develop apps for the new wearable technology, Apple Watch. We first gave a quick introduction to the new device and how it communicates with a paired iPhone. We then talked about WatchKit, the new framework that enables you to develop apps for Apple Watch and design their interfaces. Apple has designed the watch app to contain only the storyboard and resource files; all logic and operations are performed in the iPhone app in the background.

Resources for Article:

Further resources on this subject:

Flappy Swift [article]
Playing with Swift [article]
Using OpenStack Swift [article]