
How-To Tutorials

7018 Articles
Packt
15 Apr 2010
7 min read

Control of File Types in Ubuntu

What is a file type?

Let's not go as deep as "What is a file?", but before we start, let's take a look at file types. File types are determined by the contents of the files themselves, and are used to allow the opening program to be chosen wisely.

In Microsoft Windows, file extension globbing is the sole method of identifying file types. Users must provide a common suffix at the end of file names, and files are matched by name, which in turn provides the correct icon and program.

Things are a little different in Ubuntu. Of course, globbing (the most basic method) is present for file types, but Ubuntu has a few other tricks up its sleeve. One of these is magic numbers. The magic number of a binary file is the first few bytes, which identify the file type. The definition of a "magic" number has somewhat loosened in recent years; it can now mean any piece of data, generally near the beginning of a file, that can be used to uniquely identify the type.

Another, more powerful, but too rarely used feature is XML namespace matching. Without this feature, XML files could not be identified any more specifically, except by extension globbing. Namespace matching allows for quick detection of an XML-based format based on not only the namespace, but also the root element. For example, XHTML files (application/xhtml+xml) can be matched not only by an xhtml file extension, but also by their namespace URI (http://www.w3.org/1999/xhtml) and root element (html).

How are file types detected?

In Ubuntu, programs such as Nautilus use the shared MIME database as the sole location for file type information. Unfortunately, other Gnome facilities such as file open and save dialogs only use extension globbing, and are independent from the MIME database.

These databases are stored in tiers, in a similar way to how programs can be located in four tiers: /bin, /usr/bin, /usr/local/bin and ~/bin. The databases can be found in the following directories:

  • /usr/share/mime
  • /usr/local/share/mime
  • ~/.local/share/mime

Like the program tiers, it is generally agreed that only MIME types installed from Ubuntu packages should be located in the first level. System-wide changes by the user or programs installed via make install are placed in the second tier, while changes local to the user are in the third.

The directories inside these MIME databases represent MIME groups, for example ./video for video/* MIME types, and ./application for application/* types. Not all of these directories may exist; they'll be created on demand for file types. In these directories, there are multiple XML files, each named by their MIME suffix. They contain nodes with information about magic numbers, extension globs, parent types, child and alias types, and the file type description (often in multiple languages).

The update-mime-database command, invoked manually or by a trigger that runs when packages are changed, draws upon the information in these files and turns it into fast-seeking formats that aren't as friendly as XML. These real databases are in the following files:

  • aliases: alternate names for MIME types
  • generic-icons: system icons to be used for files
  • globs: extension globbing without priority values (deprecated)
  • globs2: extension globbing with priority values (current)
  • icons: custom icons for odd file types
  • magic: magic number database
  • mime.cache: master cache with the entire database
  • subclasses: child file types
  • treemagic: detection of directory structures
  • types: a list of MIME types
  • XMLnamespaces: detection through XML namespaces and elements

A long time ago, when I was learning about the MIME database, I used Bless to directly edit these files to create changes, but was always confused by my changes immediately disappearing. This is because the information is converted one way only, from the XML files to the cache files.
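To make the role of the weighted glob file more concrete, here is a minimal Python sketch of glob matching with priorities. It assumes the globs2 layout is one rule per line in the form weight:mime/type:pattern, with # marking comments. The sample rules, and the names parse_globs2 and guess_by_glob, are made up for illustration and are not taken from a real system.

```python
import fnmatch

# Illustrative sample in the assumed "weight:mime/type:pattern" layout.
SAMPLE_GLOBS2 = """\
# sample rules for illustration only
55:video/x-matroska:*.mkv
50:application/xhtml+xml:*.xhtml
50:text/plain:*.txt
60:text/x-readme:README*
"""


def parse_globs2(text):
    """Parse globs2-style text into a list of (weight, mime_type, pattern)."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        weight, mime_type, pattern = line.split(":")[:3]
        rules.append((int(weight), mime_type, pattern))
    return rules


def guess_by_glob(filename, rules):
    """Return the MIME type of the highest-weight matching glob, or None."""
    matches = [(w, m) for w, m, p in rules if fnmatch.fnmatchcase(filename, p)]
    if not matches:
        return None
    # On equal weights the first matching rule wins in this sketch.
    return max(matches, key=lambda wm: wm[0])[1]


rules = parse_globs2(SAMPLE_GLOBS2)
print(guess_by_glob("README.txt", rules))
```

Here README.txt matches two patterns, and the rule with the higher weight decides the result. A real implementation also falls back to magic numbers and XML namespaces when globs are missing or ambiguous.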
The structure of the XML files

Before we use programs to modify the MIME database for convenience, here's a quick breakdown of the format of the XML files in the database. The root element is mime-info, with the shared MIME info namespace:

    <mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info"/>

This root element contains any number of mime-type nodes, providing detection information about a file type. You could even have an empty mime-info node, but that isn't productive at all. The following are a selection of the most important elements that can be found in mime-type nodes:

  • glob nodes with a simple wildcard glob in a pattern attribute. A weight attribute from 0 to 100 is optional, and defaults to 50:

        <glob pattern="*.mkv" weight="55"/>

  • glob-deleteall and magic-deleteall nodes, which clear any cascading of globs or magic numbers from previously parsed files and start afresh

  • magic nodes with an optional priority attribute from 0 to 100 (again defaulting to 50). These contain match nodes, which define rules for matching using magic numbers. These are the attributes to be used with match elements:

      type: one of string, host16, host32, big16, big32, little16, little32 or byte
      offset: where to check for the magic, using a single numeric offset or a range notated start:end
      value: the value to match with (numeric for any type other than string)
      mask: an optional attribute that can be used for more detailed matches by running a bitwise AND on the potential match before testing. The mask is either numeric (in the type specified) or, for strings, a hexadecimal value starting with 0x

        <magic priority="60">
          <match type="string" offset="0" value="DVDVIDEO"/>
        </magic>

  • alias nodes, with a type attribute specifying alternate or deprecated MIME types that are equivalent:

        <alias type="video/x-matroska-mkv"/>

  • sub-class-of nodes, with a type attribute specifying the parent MIME type

  • comment, acronym and expanded-acronym nodes that help describe the file type to people; xml:lang attributes can be used to distinguish languages

  • root-XML elements, which determine types using XML namespaces, and have namespaceURI and localName (root element) attributes

Here's an example XML source file that uses a couple of these features (this file type is bogus, I just created it for the example):

    <mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
      <mime-type>
        <comment xml:lang="en-AU">DML source document</comment>
        <acronym>DML</acronym>
        <expanded-acronym xml:lang="en-AU">Delan's Markup Language</expanded-acronym>
        <sub-class-of type="application/xml"/>
        <glob-deleteall/>
        <glob pattern="*.dml"/>
        <root-XML namespaceURI="http://azabani.com/dml" localName="dml"/>
      </mime-type>
    </mime-info>

Assogiate: a GUI editor for the Gnome MIME database

Assogiate is a neat little program that allows you to create and modify file types, changing the database in a quick and user-friendly way. It can access the user database, ~/.local/share/mime, or the system override database, /usr/local/share/mime. Changes are not, however, placed in XML files named after the MIME type; instead they are placed in ./packages/Override.xml, which keeps a record of the user-changed file types.
Assogiate can be found in the Ubuntu universe repository:

    sudo apt-get install assogiate

In the case that it is not, you can download and compile it:

    curl http://azabani.com/files/apps/assogiate-0.2.1.tar.gz | tar xvz
    cd assogiate-0.2.1; ./configure; make; sudo make install

You aren't allowed to change the system override database without running the program as a privileged user, so to edit it, run Assogiate as root:

    gksu assogiate

In the Assogiate window, you can use the toolbar buttons to add and modify selected file types, remove and revert changes, or search for file types. The left pane allows you to narrow your view to groups of MIME types, or user modified types.

Adding and editing file types

The process for these two actions is very similar. When you are in the Edit Type dialog, you can edit canonical information, alias and parent types, globbing, magic numbers and XML namespace matching, each in its own tab.
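To make the magic-number rules that the Edit Type dialog exposes more concrete, here is a small Python sketch of how a single match rule of type string could be evaluated. It is a simplification, not the database's actual matcher. The function name match_magic is hypothetical, the start:end range is assumed to be inclusive, and the mask is assumed to have the same length as the value.

```python
def match_magic(data, offset, value, mask=None):
    """Check whether `value` occurs in `data` at `offset`.

    `offset` is a single int or a (start, end) tuple (assumed inclusive).
    `mask`, if given, is ANDed byte by byte with the candidate bytes
    before the comparison.
    """
    if isinstance(offset, int):
        start, end = offset, offset
    else:
        start, end = offset
    for pos in range(start, end + 1):
        window = data[pos:pos + len(value)]
        if len(window) < len(value):
            break  # ran past the end of the data
        if mask is not None:
            window = bytes(b & m for b, m in zip(window, mask))
        if window == value:
            return True
    return False


# Equivalent of: <match type="string" offset="0" value="DVDVIDEO"/>
header = b"DVDVIDEO-VMG\x00\x00"
print(match_magic(header, 0, b"DVDVIDEO"))
```

The mask lets one rule accept a family of values: only the bits set in the mask take part in the comparison.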

Bhagyashree R
01 Nov 2018
2 min read

Facebook's CEO, Mark Zuckerberg summoned for hearing by UK and Canadian Houses of Commons

Yesterday, the chairs of the UK and Canadian Houses of Commons issued a letter calling for Mark Zuckerberg, Facebook's CEO, to appear before them. The primary aim of this hearing is to get a clear idea of what measures Facebook is taking to avoid the spread of disinformation on the social media platform and to protect user data. It is scheduled to take place at the Westminster Parliament on Tuesday 27th November.

The committee has already gathered evidence regarding several data breaches and process failures, including the Cambridge Analytica scandal, and is now seeking answers from Mark Zuckerberg on what led to all of these incidents.

Zuckerberg last attended a hearing in April this year, with the Senate's Commerce and Judiciary committees, in which he was asked about the company's failure to protect its user data, its perceived bias against conservative speech, and the platform's use for selling illegal material like drugs. Since then he has not attended any hearings and has instead sent other senior representatives such as Sheryl Sandberg, COO at Facebook. The letter pointed out: "You have chosen instead to send less senior representatives, and have not yourself appeared, despite having taken up invitations from the US Congress and Senate, and the European Parliament."

Throughout this year we saw major security and data breaches involving Facebook. The social media platform faced a security issue last month which impacted almost 50 million user accounts. Its engineering team discovered that hackers were able to exploit a series of bugs related to Facebook's View As feature. Earlier this year, Facebook faced a backlash over the Facebook-Cambridge Analytica data scandal, a major political scandal in which Cambridge Analytica used the personal data of millions of Facebook users for political purposes without their permission.

The reports of this hearing will be shared in December, if Zuckerberg agrees to attend at all. The committee has requested his response by 7th November.

Read the full letter issued by the committee.

Facebook is at it again. This time with Candidate Info where politicians can pitch on camera
Facebook finds 'no evidence that hackers accessed third party Apps via user logins', from last week's security breach
How far will Facebook go to fix what it broke: Democracy, Trust, Reality

Packt
18 Jan 2016
19 min read

Controlling Relevancy

In this article written by Bharvi Dixit, author of the book Elasticsearch Essentials, we see that getting a search engine to behave can be very hard. Whether you are a newbie or have years of experience with Elasticsearch or Solr, you have almost certainly struggled with low-quality search results in your application. The default algorithm of Lucene often does not come close to meeting your requirements, and there is always a struggle to deliver relevant search results.

We will be covering the following topics:

  • Introducing relevant search
  • Out-of-the-box tools from Elasticsearch
  • Controlling relevancy with custom scoring

Introducing relevant search

Relevancy is the root of a search engine's value proposition and can be defined as the art of ranking content for a user's search based on how much that content satisfies the needs of the user or the business. In an application, it does not matter how beautiful your user interface looks or how many functionalities you are providing to the user; search relevancy cannot be avoided at any cost. So, despite the mystical behavior of search engines, you have to find a solution to get relevant results.

Relevancy matters all the more because a user does not care about the whole bunch of documents that you have. The user enters keywords, selects filters, and focuses on a very small amount of data: the relevant results. If your search engine fails to deliver according to expectations, the user might be annoyed, which might be a loss for your business.

A search engine like Elasticsearch comes with built-in intelligence. You enter a keyword and, within the blink of an eye, it returns the results that it thinks are relevant according to its intelligence. However, Elasticsearch does not have built-in intelligence about your application domain. Relevancy is not defined by a search engine; rather, it is defined by your users, their business needs, and the domain. Take Google or Twitter as an example: they have put in years of engineering experience, but still occasionally fail to provide relevant results.

Further, the challenges of search differ with the domain: search on an e-commerce platform is about driving sales and bringing positive customer outcomes, whereas in fields such as medicine, it can be a matter of life and death. The lives of search engineers become more complicated because they often lack the domain-specific knowledge needed to understand the semantics of user queries.

However, despite all the challenges, the implementation of search relevancy is up to you, and it depends on what information you can extract from the users, their queries, and the content they see. We continuously take feedback from the users, create funnels, or enable logging to capture the search behavior of the users so that we can improve our algorithms to provide relevant results.

The Elasticsearch out-of-the-box tools

Elasticsearch primarily works with two models of information retrieval: the Boolean model and the Vector Space model. In addition to these, there are other scoring algorithms available in Elasticsearch as well, such as Okapi BM25, Divergence from Randomness (DFR), and Information Based (IB). Working with these three algorithms requires extensive mathematical knowledge and some extra configuration in Elasticsearch.

The Boolean model uses the AND, OR, and NOT conditions in a query to find all the matching documents. This Boolean model can be further combined with the Lucene scoring formula, TF/IDF, to rank documents.

The Vector Space model works differently from the Boolean model, as it represents both queries and documents as vectors. In the vector space model, each number in the vector is the weight of a term that is calculated using TF/IDF.
The queries and documents are compared using cosine similarity, in which the angle between two vectors is measured to find their similarity, which ultimately determines the relevancy of the documents.

An example: why defaults are not enough

Let's build an index with sample documents to understand the examples in a better way. First, create an index with the name profiles:

    curl -XPUT 'localhost:9200/profiles'

Then, put the mapping with the document type as candidate:

    curl -XPUT 'localhost:9200/profiles/_mapping/candidate' -d '{
      "properties": {
        "geo_code": {
          "type": "geo_point",
          "lat_lon": true
        }
      }
    }'

Please note that in the preceding mapping, we are putting a mapping only for the geo data type. The rest of the fields will be indexed dynamically. Now, you can create a data.json file with the following content in it:

    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 1 }}
    { "name" : "Sam", "geo_code" : "12.9545163,77.3500487", "total_experience":5, "skills":["java","python"] }
    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 2 }}
    { "name" : "Robert", "geo_code" : "28.6619678,77.225706", "total_experience":2, "skills":["java"] }
    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 3 }}
    { "name" : "Lavleen", "geo_code" : "28.6619678,77.225706", "total_experience":4, "skills":["java","Elasticsearch"] }
    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 4 }}
    { "name" : "Bharvi", "geo_code" : "28.6619678,77.225706", "total_experience":3, "skills":["java","lucene"] }
    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 5 }}
    { "name" : "Nips", "geo_code" : "12.9545163,77.3500487", "total_experience":7, "skills":["grails","python"] }
    { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 6 }}
    { "name" : "Shikha", "geo_code" : "28.4250666,76.8493508", "total_experience":10, "skills":["c","java"] }

If you are indexing skills that contain spaces or special characters, such as c++, c#, or core java, you need to map the skills field as not_analyzed in advance to have exact term matching.

Once the file is created, execute the following command to put the data inside the index we have just created:

    curl -XPOST 'localhost:9200/_bulk' --data-binary @data.json

If you look carefully at the example, the documents contain the data of candidates who might be looking for jobs. For hiring candidates, a recruiter can have the following criteria:

  • Candidates should know Java
  • Candidates should have between 3 and 5 years of experience
  • Candidates should be within 100 kilometers of the recruiter's office

You can construct a simple bool query that combines a term query on the skills field with geo_distance and range filters on the geo_code and total_experience fields respectively. However, does this give a relevant set of results? The answer is no.

The problem is that if you restrict the range of experience and distance, you might get zero results or no suitable candidate. For example, you can put a range of 0 to 100 kilometers of distance, but your perfect candidate might be at a distance of 101 kilometers. At the same time, if you define a wide range, you might get a huge number of non-relevant results.

The other problem is that if you search for candidates who know Java, there are chances that a person who knows only Java and no other programming language will be at the top, while a person who knows other languages apart from Java will be at the bottom. This happens because the lengths of the fields are taken into account when documents are ranked with TF/IDF: the shorter the field in which the term matches, the more relevant the document is considered.

Elasticsearch is not intelligent enough to understand the semantic meaning of your queries, but for these scenarios it offers you the full power to redefine how scoring and document ranking should be done.
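The field-length effect described above can be illustrated with a short Python sketch. It assumes the classic Lucene TF/IDF length norm of 1 / sqrt(number of terms in the field). It ignores IDF, query normalization, and the lossy way Lucene stores norms, all of which are the same or irrelevant across these documents. The helper names are hypothetical.

```python
import math

# Skills of the sample candidates who match the term "java".
candidates = {
    "Sam": ["java", "python"],
    "Robert": ["java"],
    "Lavleen": ["java", "Elasticsearch"],
    "Bharvi": ["java", "lucene"],
    "Shikha": ["c", "java"],
}


def field_length_norm(num_terms):
    """Assumed classic TF/IDF length norm: shorter fields get a larger norm."""
    return 1.0 / math.sqrt(num_terms)


def simplified_score(skills, term):
    """Term-frequency factor times length norm; IDF is left out on purpose."""
    tf = math.sqrt(skills.count(term))
    return tf * field_length_norm(len(skills))


ranked = sorted(
    candidates.items(),
    key=lambda item: simplified_score(item[1], "java"),
    reverse=True,
)
for name, skills in ranked:
    print(name, round(simplified_score(skills, "java"), 3))
```

Under these assumptions the candidate who lists only Java gets the largest norm and ranks first, which is exactly the behavior a recruiter is unlikely to want.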
Controlling relevancy with custom scoring

In most cases, you are good to go with the default scoring algorithms of Elasticsearch to return the most relevant results. However, some cases require you to have more control over the calculation of a score. This is especially required while implementing domain-specific logic such as finding the relevant candidates for a job, where you need to implement a very specific scoring formula. Elasticsearch provides you with the function_score query to take control of all these things.

Here we cover the code examples only in Java because the Python client gives you the flexibility to pass the query inside the body parameter of a search function. Python programmers can simply use the example queries in the same way. There is no extra module required to execute these queries.

function_score query

The function_score query allows you to take complete control of how a score needs to be calculated for a particular query.

Syntax of a function_score query:

    {
      "query": {"function_score": {
        "query": {},
        "boost": "boost for the whole query",
        "functions": [
          {}
        ],
        "max_boost": number,
        "score_mode": "(multiply|max|...)",
        "boost_mode": "(multiply|replace|...)",
        "min_score" : number
      }}
    }

The function_score query has two parts: the first is the base query that finds the overall pool of results you want. The second part is the list of functions, which are used to adjust the scoring. These functions can be applied to each document that matches the main query in order to alter or completely replace the original query _score.

In a function_score query, each function is composed of an optional filter that tells Elasticsearch which records should have their scores adjusted (defaults to "all records") and a description of how to adjust the score.

The other parameters that can be used with a function_score query are as follows:

  • boost: An optional parameter that defines the boost for the entire query.
  • max_boost: The maximum boost that will be applied by a function score.
  • boost_mode: An optional parameter, which defaults to multiply. Boost mode defines how the combined result of the score functions will influence the final score together with the subquery score. This can be replace (only the function score is used, the query score is ignored), max (the maximum of the query score and the function score), min (the minimum of the query score and the function score), sum (the query score and the function score are added), avg, or multiply (the query score and the function score are multiplied).
  • score_mode: This parameter specifies how the results of individual score functions will be aggregated. The possible values can be first (the first function that has a matching filter is applied), avg, max, sum, min, and multiply.
  • min_score: The minimum score to be used.

Excluding non-relevant documents with min_score

To exclude documents that do not meet a certain score threshold, the min_score parameter can be set to the desired score threshold.

The following are the built-in functions that are available to be used with the function score query:

  • weight
  • field_value_factor
  • script_score
  • The decay functions: linear, exp, and gauss

Let's see them one by one and then you will learn how to combine them in a single query.

weight

A weight function allows you to apply a simple boost to each document without the boost being normalized: a weight of 2 results in 2 * _score.
For example:

    GET profiles/candidate/_search
    {
      "query": {
        "function_score": {
          "query": {
            "term": {
              "skills": {
                "value": "java"
              }
            }
          },
          "functions": [
            {
              "filter": {
                "term": {
                  "skills": "python"
                }
              },
              "weight": 2
            }
          ],
          "boost_mode": "replace"
        }
      }
    }

The preceding query will match all the candidates who know Java, but will give a higher score to the candidates who also know Python. Please note that boost_mode is set to replace, so the _score calculated by the query is overridden by the weight function for documents that match the filter clause. In the query output, the candidates who know both Java and Python will be on top with a _score of 2.

Java example

The previous query can be implemented in Java in the following way. First, you need to import the following classes into your code:

    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
    import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;

Then the following code snippet can be used to implement the query:

    // client, indexName and docType are assumed to be defined elsewhere
    FunctionScoreQueryBuilder functionQuery =
        new FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))
            .add(QueryBuilders.termQuery("skills", "python"),
                 ScoreFunctionBuilders.weightFactorFunction(2))
            .boostMode("replace");

    SearchResponse response = client.prepareSearch().setIndices(indexName)
        .setTypes(docType).setQuery(functionQuery)
        .execute().actionGet();

field_value_factor

It uses the value of a field in the document to alter the _score:

    GET profiles/candidate/_search
    {
      "query": {
        "function_score": {
          "query": {
            "term": {
              "skills": {
                "value": "java"
              }
            }
          },
          "functions": [
            {
              "field_value_factor": {
                "field": "total_experience"
              }
            }
          ],
          "boost_mode": "multiply"
        }
      }
    }

The preceding query finds all the candidates with java in their skills, but influences the total score depending on the total experience of the candidate. So, the more experience the candidate has, the higher the ranking he or she will get. Please note that boost_mode is set to multiply, which will yield the following formula for the final scoring:

    _score = _score * doc['total_experience'].value

However, there are two issues with the preceding approach. First, documents that have a total experience value of 0 will have their final score reset to 0. Second, the Lucene _score usually falls between 0 and 10, so a candidate with more than 10 years of experience will completely swamp the effect of the full text search score.

To get rid of these problems, apart from the field parameter, the field_value_factor function provides you with the following extra parameters:

  • factor: This is an optional factor to multiply the field value with. This defaults to 1.
  • modifier: This is a mathematical modifier to apply to the field value. This can be none, log, log1p, log2p, ln, ln1p, ln2p, square, sqrt, or reciprocal. It defaults to none.
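To see how factor and modifier reshape the score, here is a minimal Python sketch of the arithmetic. It assumes the function value is modifier(factor * field_value), that log is the base-10 logarithm and ln the natural one, and that boost_mode is multiply. The names MODIFIERS, field_value_factor, and final_score are illustrative and not part of any Elasticsearch client.

```python
import math

# Assumed meaning of each modifier name.
MODIFIERS = {
    "none": lambda v: v,
    "log": math.log10,
    "log1p": lambda v: math.log10(v + 1),
    "log2p": lambda v: math.log10(v + 2),
    "ln": math.log,
    "ln1p": math.log1p,
    "ln2p": lambda v: math.log(v + 2),
    "square": lambda v: v * v,
    "sqrt": math.sqrt,
    "reciprocal": lambda v: 1.0 / v,
}


def field_value_factor(value, factor=1.0, modifier="none"):
    """Apply the modifier to factor * value."""
    return MODIFIERS[modifier](factor * value)


def final_score(query_score, value, factor=1.0, modifier="none"):
    """Combine query score and function value with boost_mode = multiply."""
    return query_score * field_value_factor(value, factor, modifier)


# With no modifier, 0 years of experience zeroes the score and 10 years
# dominates it; log2p keeps both inside a narrow range.
for years in (0, 2, 10):
    print(years, final_score(1.5, years), round(final_score(1.5, years, modifier="log2p"), 3))
```

A logarithmic modifier that adds a constant before taking the log avoids the zero-score problem and keeps very large field values from swamping the text score.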
Java example

The preceding query can be implemented in Java in the following way. First, you need to import the following classes into your code:

    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.functionscore.*;
    // A wildcard import does not cover subpackages; in Elasticsearch 2.x this
    // builder is assumed to live in the fieldvaluefactor subpackage
    import org.elasticsearch.index.query.functionscore.fieldvaluefactor.FieldValueFactorFunctionBuilder;

Then the following code snippet can be used to implement the query:

    FunctionScoreQueryBuilder functionQuery =
        new FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))
            .add(new FieldValueFactorFunctionBuilder("total_experience"))
            .boostMode("multiply");

    SearchResponse response = client.prepareSearch().setIndices("profiles")
        .setTypes("candidate").setQuery(functionQuery)
        .execute().actionGet();

script_score

script_score is the most powerful function available in Elasticsearch. It uses a custom script to take complete control of the scoring logic. You can write a custom script to implement the logic you need, from very simple to very complex. Scripts are cached, too, to allow faster execution of repetitive queries. Let's see an example:

    {
      "script_score": {
        "script": "doc['total_experience'].value"
      }
    }

Look at the special syntax used to access the field values inside the script parameter. This is how field values are accessed in the Groovy scripting language.

Scripting is, by default, disabled in Elasticsearch, so to use script score functions, first you need to add this line to your elasticsearch.yml file:

    script.inline: on

To see some of the power of this function, look at the following example:

    GET profiles/candidate/_search
    {
      "query": {
        "function_score": {
          "query": {
            "term": {
              "skills": {
                "value": "java"
              }
            }
          },
          "functions": [
            {
              "script_score": {
                "params": {
                  "skill_array_provided": [
                    "java",
                    "python"
                  ]
                },
                "script": "final_score=0; skill_array = doc['skills'].toArray(); counter=0; while(counter<skill_array.size()){for(skill in skill_array_provided){if(skill_array[counter]==skill){final_score = final_score+doc['total_experience'].value};};counter=counter+1;};return final_score"
              }
            }
          ],
          "boost_mode": "replace"
        }
      }
    }

Let's understand the preceding query:

  • params is the placeholder where you can pass parameters to your function, similar to how you use parameters inside a method signature in other languages. Inside the script parameter, you write your complete logic.
  • This script runs for each document that has Java mentioned in the skills, and for each document, it fetches all the skills and stores them inside the skill_array variable. Then, each skill that we have passed inside the params section is compared with the skills inside skill_array. For every match, the value of the final_score variable is incremented by the value of the total_experience field of that document. The score calculated by the script will be used to rank the documents because boost_mode is set to replace the original _score value.

Do not try to work with analyzed fields while writing scripts. You might get weird results.
This is because, had our skills field contained a value such as "core java", you could not have matched it exactly inside the script section. So, fields with space-separated values need to be set as not_analyzed or analyzed with the keyword analyzer in advance.

To write these script functions, you need to have some command over Groovy scripting. However, if you find it complex, you can write these scripts in other languages, such as Python, using the language plugin of Elasticsearch. More on this can be found here: https://github.com/elastic/elasticsearch-lang-python

For fast performance, use Groovy or Java functions. Python and JavaScript code requires the marshalling and unmarshalling of values, which hurts performance through higher CPU and memory usage.

Java example

The previous query can be implemented in Java in the following way. First, you need to import the following classes into your code:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.Map;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.functionscore.*;
    // Assumed Elasticsearch 2.x locations; adjust for your version
    import org.elasticsearch.index.query.functionscore.script.ScriptScoreFunctionBuilder;
    import org.elasticsearch.script.Script;
    import org.elasticsearch.script.ScriptService.ScriptType;

Then, the following code snippet can be used to implement the query:

    // Groovy script: add total_experience once for every requested skill the document has
    String script = "final_score=0; skill_array = doc['skills'].toArray(); "
        + "counter=0; while(counter<skill_array.size())"
        + "{for(skill in skill_array_provided)"
        + "{if(skill_array[counter]==skill)"
        + "{final_score = final_score+doc['total_experience'].value};};"
        + "counter=counter+1;};return final_score";

    ArrayList<String> skills = new ArrayList<String>();
    skills.add("java");
    skills.add("python");

    Map<String, Object> params = new HashMap<String, Object>();
    params.put("skill_array_provided", skills);

    FunctionScoreQueryBuilder functionQuery =
        new FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))
            .add(new ScriptScoreFunctionBuilder(
                new Script(script, ScriptType.INLINE, "groovy", params)))
            .boostMode("replace");

    SearchResponse response = client.prepareSearch().setIndices(indexName)
        .setTypes(docType).setQuery(functionQuery)
        .execute().actionGet();

As you can see, the script logic is a simple string that is used to instantiate the Script class inside ScriptScoreFunctionBuilder.

Decay functions - linear, exp, gauss

We have seen that restricting the range of experience and distance could result in zero results or no suitable candidates. Maybe a recruiter would like to hire a candidate from a different province because of a good candidate profile. So, instead of completely restricting results with range filters, we can incorporate sliding-scale values such as geo_location or dates into _score to prefer documents near a latitude/longitude point or recently published documents.

The function score query supports this sliding scale with the help of three decay functions: linear, exp (that is, exponential), and gauss (that is, Gaussian). All three functions take the same parameters, which control the shape of the decay curve: origin, scale, decay, and offset.

The origin is the point from which distance is calculated. For date fields, the default is the current timestamp. The scale parameter defines the distance from the origin at which the computed score will be equal to the decay parameter.

The origin and scale parameters can be thought of as your min and max that define a bounding box within which the curve will be defined. If we want to give more boost to the documents that have been published in the past 10 days, it would be best to define the origin as the current timestamp and the scale as 10d.

The offset specifies that the decay function will only be computed for documents with a distance greater than the defined offset. The default is 0.
Finally, the decay option alters how severely the document is demoted based on its position. The default decay value is 0.5.

All three decay functions work only on numeric, date, and geo-point fields.

    GET profiles/candidate/_search
    {
      "query": {
        "function_score": {
          "query": {
            "match_all": {}
          },
          "functions": [
            {
              "exp": {
                "geo_code": {
                  "origin": {
                    "lat": 28.66,
                    "lon": 77.22
                  },
                  "scale": "100km"
                }
              }
            }
          ],
          "boost_mode": "multiply"
        }
      }
    }

In the preceding query, we have used the exponential decay function with a scale of 100 km. Since no offset is given, the score starts to decay as soon as a candidate moves away from the origin, and at exactly 100 km it is multiplied by the decay value (0.5 by default). So, candidates who are farther than 100 km from the given origin will be ranked low, but not discarded. These candidates can still get a higher rank if we combine other function_score functions, such as weight or field_value_factor, with the decay function and combine the results of all the functions together.
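To make the roles of origin, scale, offset, and decay more concrete, here is a minimal Python sketch of the three curves, written from the formulas given in the Elasticsearch function_score documentation. It is not Elasticsearch code: the function names (linear_decay, exp_decay, gauss_decay) are made up for illustration, and the distance from the origin is assumed to be a plain number (for example, kilometers) rather than a geo-point.

```python
import math


def _effective_distance(distance, offset):
    # Documents within `offset` of the origin are not decayed at all.
    return max(0.0, abs(distance) - offset)


def linear_decay(distance, scale, decay=0.5, offset=0.0):
    # Straight line that equals `decay` at `scale` and hits 0 at scale / (1 - decay).
    s = scale / (1.0 - decay)
    return max((s - _effective_distance(distance, offset)) / s, 0.0)


def exp_decay(distance, scale, decay=0.5, offset=0.0):
    # Exponential curve: the score is multiplied by `decay` for every `scale` travelled.
    lam = math.log(decay) / scale
    return math.exp(lam * _effective_distance(distance, offset))


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    # Bell curve whose variance is chosen so that the score equals `decay` at `scale`.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = _effective_distance(distance, offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


if __name__ == "__main__":
    # Compare the three curves for a scale of 100 (hypothetical kilometers).
    for km in (0, 50, 100, 200, 400):
        print(km,
              round(linear_decay(km, 100), 4),
              round(exp_decay(km, 100), 4),
              round(gauss_decay(km, 100), 4))
```

With a scale of 100 and the default decay of 0.5, all three functions return 0.5 at a distance of 100. They differ in how quickly the score falls beyond that point: at a distance of 200, linear has already reached 0, exp gives 0.25, and gauss gives 0.0625.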
Java example

The preceding query can be implemented in Java in the following way:

First, you need to import the following classes into your code:

    import java.util.HashMap;
    import java.util.Map;

    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.functionscore.*;

Then, the following code snippets can be used to implement the query:

    // Origin of the decay curve and the distance at which the score equals the decay value.
    Map<String, Object> origin = new HashMap<String, Object>();
    String scale = "100km";
    origin.put("lat", "28.66");
    origin.put("lon", "77.22");

    FunctionScoreQueryBuilder functionQuery = new FunctionScoreQueryBuilder()
        .add(new ExponentialDecayFunctionBuilder("geo_code", origin, scale))
        .boostMode("multiply");

    // For the linear decay function, use the following syntax instead:
    // .add(new LinearDecayFunctionBuilder("geo_code", origin, scale)).boostMode("multiply");

    // For the Gauss decay function, use the following syntax instead:
    // .add(new GaussDecayFunctionBuilder("geo_code", origin, scale)).boostMode("multiply");

    SearchResponse response = client.prepareSearch().setIndices(indexName)
        .setTypes(docType).setQuery(functionQuery)
        .execute().actionGet();

In the preceding example, we have used the exp decay function, but the commented lines show how the other decay functions can be used.

Finally, as always, remember that Elasticsearch lets you use multiple functions in a single function_score query to calculate a score that combines the results of each function.

Summary

Overall, we covered one of the most important aspects of search engines, that is, relevancy. We discussed the powerful scoring capabilities available in Elasticsearch, with practical examples that show how you can control the scoring process according to your needs. Despite the relevancy challenges faced while working with search engines, out-of-the-box features such as function scores and custom scoring allow us to tackle these challenges with ease.

Louis Owen
10 Oct 2023
7 min read

Build your Personal Assistant with AgentGPT

Introduction

In a world where technology is progressing at an exponential rate, the concept of a personal assistant is no longer confined to high-profile executives with hectic schedules. Today, due to the incredible advancements in artificial intelligence (AI), each one of us has the chance to take advantage of a personal assistant's services, even for tasks that may have appeared beyond reach just a few years ago. Imagine having an entity that can aid you in conducting research, examining your daily financial expenditures, organizing your travel itinerary, and much more. This entity is known as AI and, more precisely, it is embodied in AgentGPT.

You have likely heard of AI's incredible capabilities, ranging from diagnosing diseases to defeating world-class chess champions. While AI has undoubtedly made significant strides, here's the caveat: unless you possess technical expertise, devising the workflow to fully utilize AI's potential can be an intimidating endeavor. This is where the concepts of "tools" and "agents" become relevant, and AgentGPT excels in this domain.

An "agent" is essentially the mastermind behind your AI assistant. It's the entity that “thinks”, strategizes, and determines how to achieve your objectives based on the available "tools." These "tools" represent the skills your agent possesses, such as web searching, code writing, generating images, retrieving knowledge from your personal data, and a myriad of other capabilities. Creating a seamless workflow where your agent utilizes these tools effectively is no simple task. It entails connecting the agent to the tools, managing errors that may arise, devising prompts to guide the agent, and more.

Fortunately, there's a game-changer in the world of AI personal assistants, and it goes by the name of AgentGPT.
Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to utilize AgentGPT to build your personal assistant!

What is AgentGPT?

AgentGPT is an open-source project that streamlines the intricate process of creating and configuring AI personal assistants. This powerful tool enables you to deploy Autonomous AI agents, each equipped with distinct capabilities and skills. You can even name your AI, fostering a sense of personalization and relatability. With AgentGPT, you can assign your AI any mission you can conceive, and it will strive to accomplish it.

The magic of AgentGPT lies in its ability to empower your AI agent to think, act, and learn. Here's how it operates:

1. Select the Tools: You start by selecting the tools for the agent. It can be web searching, code writing, generating images, or even retrieving knowledge from your personal data.
2. Setting the Goal: You then need to define the goal you want your AI to achieve. Whether it's conducting research, managing your finances, or planning your dream vacation, the choice is yours.
3. Task Generation: Once the goal is set and the tools are selected, your AI agent "thinks" about the tasks required to accomplish it. This involves considering the available tools and formulating a plan of action.
4. Task Execution: Your AI agent then proceeds to execute the tasks it has devised. This can include searching the web for information, performing calculations, generating content, and more.
5. Learning and Adaptation: As your AI agent carries out its tasks, it learns from the results. If something doesn't go as planned, it adapts its approach for the future, continuously improving its performance.

In a world where time is precious and efficiency is crucial, AgentGPT emerges as a ray of hope. It's a tool that empowers individuals from all walks of life to harness the might of AI to streamline their daily tasks, realize their goals, and amplify their productivity.
Thus, whether you're a business professional seeking to optimize your daily operations or an inquisitive individual eager to explore the boundless possibilities of AI, AgentGPT stands ready to propel you into a new era of personalized assistance.

Initialize AgentGPT

To build your own personal assistant with AgentGPT, you can follow these simple instructions. Alternatively, you can just go to the website and try the demo.

1. Open Your Terminal: You can usually access the terminal from a 'Terminal' tab or by using a shortcut.
2. Clone the Repository: Copy and paste the following commands into your terminal and press Enter. This will clone the AgentGPT repository to your local machine and start the setup script.

   a. For Mac/Linux users

       git clone https://github.com/reworkd/AgentGPT.git
       cd AgentGPT
       ./setup.sh

   b. For Windows users

       git clone https://github.com/reworkd/AgentGPT.git
       cd AgentGPT
       ./setup.bat

3. Follow Setup Instructions: The setup script will guide you through the setup process. You'll need to add the appropriate API keys and other required information as instructed.
4. Access the Web Interface: Once all the services are up and running, you can access the AgentGPT web interface by opening your web browser and navigating to http://localhost:3000.

Build Your Own Assistant with AgentGPT

Let’s start with an example of how to build your own assistant. First and foremost, let’s select the tools for our agent. Here, we’re selecting image generation, web search, and code writing as the tools. Once we finish selecting the tools, we can define the goal for our assistant. AgentGPT provides three templates for us:

  • ResearchGPT: Create a comprehensive report of the Nike company
  • TravelGPT: Plan a detailed trip to Hawaii
  • PlatformerGPT: Write some code to make a platformer game

Note that we can also create our own assistant name with a specific goal apart from these three templates.
For now, let’s select the PlatformerGPT template.

Once the goal is defined, the agent will generate all the tasks required to accomplish it. This involves considering the available tools and formulating a plan of action.

Then, based on the generated tasks, the agent will execute each task and learn from the results of each one.

This process will continue until the goal is achieved or, in this case, until the agent succeeds in writing the code for a platformer game. If something doesn't go as planned, it adapts its approach for the future, continuously improving its performance.

Conclusion

Congratulations on keeping up to this point! Throughout this article, you have learned what AgentGPT is capable of and how to build your own personal assistant with it. I wish you the best with your experiments in creating your personal assistant, and see you in the next article!

Author Bio

Louis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects.

Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.

Victor Mejia
10 Nov 2016
5 min read

Testing Components with Service Dependencies

It is very common for your Angular 2 components to depend on a service that performs actions, such as fetching data. In this post we will look at testing components with service dependencies, and at testing asynchronous actions. We will be using Jasmine for our tests. If you have not read Getting Started Testing Angular 2 Components, I strongly suggest you do so before continuing.

Angular 2 Component with a Service Dependency

Continuing with our contact manager application, we need to have a ContactService that fetches data from a server:

    import { Injectable } from '@angular/core';
    import { Http } from '@angular/http';
    import 'rxjs/add/operator/map';

    @Injectable()
    export class ContactService {
      constructor(private http: Http) { }

      getContacts() {
        return this.http.get('/contacts.json')
          .map(res => res.json());
      }
    }

The Http service is injected here, and TypeScript will automatically assign the injected service to this.http. With this service ready to use, we are now ready to inject it into our ContactsComponent:

    import { Component, OnInit } from '@angular/core';
    import { ContactService } from '../shared/contact.service';

    @Component({
      selector: 'contacts',
      template: `
        <button (click)="getContacts()">Get Contacts</button>
        <profile *ngFor="let profile of contacts" [info]="profile"></profile>
      `
    })
    export class ContactsComponent implements OnInit {
      contacts: Array<any>;

      constructor(private contactService: ContactService) { }

      ngOnInit() { }

      getContacts() {
        this.contactService.getContacts()
          .subscribe(data => {
            this.contacts = data;
          });
      }
    }

We have an action set up, so when we click on the button, we make a call to our ContactService to fetch the data and assign the result. Once the call is resolved, the data will display.

Setting up your unit test

What we have to keep in mind is that we want to test our components in isolation.
What this means is that instead of using the actual ContactService implementation, we create a MockContactService that returns mock data (an array of Profile objects).

    let mockData = [
      { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' }
    ];

    class MockContactService {
      getContacts(url) {
        return Observable.create((observer: Observer<Array<Profile>>) => {
          observer.next(mockData);
        });
      }
    }

When configuring our testing module, we add a new property, providers, where we specify the usage of our mock service:

    TestBed.configureTestingModule({
      declarations: [ContactsComponent],
      providers: [
        { provide: ContactService, useClass: MockContactService }
      ]
    });

We can now go ahead and get handles on fixture, component, and element:

    import { TestBed, async, ComponentFixture } from '@angular/core/testing';
    import { ContactsComponent } from './contacts.component';
    import { ContactService } from '../shared/contact.service';
    import { Profile } from '../shared/profile.model';
    import { Observable } from 'rxjs/Observable';
    import { Observer } from 'rxjs/Observer';

    let mockData = [
      { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' }
    ];

    // Stand-in for ContactService that emits the mock data instead of calling a server.
    class MockContactService {
      getContacts(url) {
        return Observable.create((observer: Observer<Array<Profile>>) => {
          observer.next(mockData);
        });
      }
    }

    let fixture: ComponentFixture<ContactsComponent>;
    let component: ContactsComponent;
    let element: HTMLElement;

    describe('Component: Contacts', () => {
      beforeEach(async(() => {
        TestBed.configureTestingModule({
          declarations: [ContactsComponent],
          providers: [
            { provide: ContactService, useClass: MockContactService }
          ]
        });

        TestBed.compileComponents()
          .then(() => {
            fixture = TestBed.createComponent(ContactsComponent);
            component = fixture.debugElement.componentInstance;
            element = fixture.debugElement.nativeElement;
          });
      }));
    });

Ensuring calls to our service

A good test to always perform is to ensure that your actions are making the correct calls to your service.
To do so, we can spy on the getContacts() function on the service, calling the component action and then ensuring that the function was indeed called:

    describe('getContacts', () => {
      it('should make a call to contactService.getContacts()', () => {
        spyOn(component.contactService, 'getContacts').and.callThrough();
        component.getContacts();
        expect(component.contactService.getContacts).toHaveBeenCalled();
      });
    });

Ensuring data is set

A follow-up test to be performed is to ensure that the data is being set on the component after the call to the API is resolved. Since our call to getContacts() is performing an asynchronous action, we should use the async function in the it:

    it('should set the contacts property after fetching data', async(() => {
      ...
    }));

It wraps the test function in an asynchronous “test zone”. Basically, it automatically completes when the asynchronous actions are complete. Next, we can make a call to component.getContacts(). However, we don’t want to run our specs until after that call has been resolved. There is a useful function we can use in our fixture, fixture.whenStable(). This returns a promise that resolves after asynchronous activity. Our test should now look as follows:

    it('should set the contacts property after fetching data', async(() => {
      component.getContacts();

      fixture.whenStable().then(() => {
        expect(component.contacts).toEqual(mockData);
      });
    }));

We simply run a check to ensure that the contacts property is set to what the API call returns.

Finer Async Control

There are times when you want finer control, such as dealing with time intervals, and so on. To do so, we can simply use fakeAsync in conjunction with the tick() function to simulate the passage of time.

    it('asynchronous timed test...', fakeAsync(() => {
      component.asyncActionWithTime();
      tick(2000); // "advance" 2 seconds
      expect(...).toBe(...);
    }));

Conclusion

Angular 2 has wonderful APIs that make it really easy to test your components.
We have seen how to test components with service dependencies, along with asynchronous actions. Time to start writing tests!

Packt
17 May 2013
37 min read

Techniques for Creating a Multimedia Database

Tier architecture

The rules surrounding technology are constantly changing. Decisions and architectures based on current technology might easily become out of date with hardware changes. To best understand how multimedia and unstructured data fit and can adapt to the changing technology, it's important to understand how and why we arrived at our different current architectural positions. In some cases we have come full circle and reinvented concepts that were in use 20 years ago. Only by learning from the lessons of the past can we see how to move forward to deal with this complex environment.

In the past 20 years a variety of architectures have come about in an attempt to satisfy some core requirements:

  • Allow as many users as possible to access the system
  • Ensure those users have good performance when accessing the data
  • Enable those users to perform DML (insert/update/delete) safely and securely (safely implies the ability to restore data in the event of failure)

The goal of a database management system was to provide an environment where these points could be met. The first databases were not relational. They were heavily I/O focused, as the computers did not have much memory and the idea of caching data was deemed to be too expensive. The servers had kilobytes and then, eventually, megabytes of memory. This memory was required foremost by the programs running in them. The most efficient architecture was to use pointers to link the data together. The architecture that emerged naturally was hierarchical, and a program would navigate the hierarchy to find rows related to each other.

Users connected in via a dumb terminal. This was a monitor with a keyboard that could process input and output from a basic protocol and display it on the screen. All the processing of information, including how the screen should display it (using simple escape sequence commands), was controlled in the server.
Traditional no tier

The mainframes used a block mode structure, where the user would enter a screen full of data and press the Enter key. After doing this, the whole screen of information was sent to the server for processing. Other servers used asynchronous protocols, where each letter, as it was typed, was sent to the server for processing. This method was not as efficient as block mode because it required more server processing power to handle the data coming in. It did provide a friendlier interface for data entry, as mistakes could be relayed immediately back to the user. Block mode could only display errors once the screen of data was sent, processed, and returned.

As more users started using these systems, the amount of data in them began to grow and the users wanted to get more intelligence out of the data entered. Requirements for reporting appeared, as well as the ability to do ad hoc querying. The databases were also very hard to maintain and enhance, as the pointer structure linked everything together tightly. It was very difficult to perform maintenance and changes to code.

In the 1970s the relational database concept was formulated, and it was based on sound mathematical principles. In the early 1980s the first conceptual relational databases appeared in the marketplace, with Oracle leading the way. The relational databases were not received well. They performed poorly and used a huge amount of server resources. Though they achieved a stated goal of being flexible and adaptable, enabling more complex applications to be built quicker, the performance overheads of performing joins proved to be a major issue. Benefits could be seen in them, but they were not seen as usable in any environment that required tens, hundreds, or thousands of concurrent users. The technology wasn't there to handle them.

To initially achieve better performance, the relational database vendors focused on using a changing hardware feature, and that was memory.
By the late 1980s the computer servers were starting to move from 16 bit to 32 bit. Memory was increasing and its price was dropping. By adapting to this, the vendors managed to take advantage of memory and improved join performance.

The relational databases in effect achieved a balancing act between memory and disk I/O. Accessing a disk was about a thousand times slower than accessing memory. Memory was transient, meaning that if there was a power failure, any data stored in memory would be lost. Memory was also measured in megabytes, but disk was measured in gigabytes. Disk was not transient and was generally reliable, but still required safeguards to be put in place to protect from disk failure. So the balancing act the databases performed involved caching frequently accessed data in memory, while ensuring any modifications made to that data were always stored to disk. Additionally, the database had to ensure no data was lost if a disk failed.

To improve join performance, the database vendors came up with their own solutions involving indexing, optimization techniques, locking, and specialized data storage structures. Databases were judged on the speed at which they could perform joins. The flexibility and ease with which applications could be updated and modified compared to the older systems soon made the relational database popular and a must-have.

As all relational databases conformed to an international SQL standard, there was a perception that a customer was never locked into a proprietary system and could move their data between different vendors. Though there were elements of truth to this, the reality has shown otherwise. The Oracle Database's key strength was that you were not locked into the hardware, and it offered the ability to move a database between mainframe, Windows, and Unix.
This portability across hardware effectively broke the stranglehold a number of hardware vendors had, and opened up the competition, enabling hardware vendors to focus on the physical architecture rather than the operating system within it.

In the early 1990s, with the rise in popularity of the Apple Macintosh, the rules changed dramatically and the concept of a user-friendly graphical environment appeared. The Graphical User Interface (GUI) screen offered a powerful interface for the user to perform data entry. Though it can be argued that data entry was not (and is still not) as fast as data entry via a dumb terminal interface, the use of colors, varying fonts, widgets, comboboxes, and a whole repository of specialized frontend data entry features made the interface easier to use, and more data could be entered with less typing. Arguably, the GUI opened up the computer to users who could not type well. The interface was easier to learn and less training was needed to use it.

Two tier

The GUI interface had one major drawback: it was expensive to run on the CPU. Some vendors experimented with running the GUI directly on the server (the Solaris operating system offered this capability), but it became obvious that this solution would not scale. To address this, the two-tier architecture was born. This involved using the GUI, which was running on an Apple Macintosh, Microsoft Windows, or another windowing environment (Microsoft Windows wasn't the only GUI to run on Intel platforms), to handle the display processing. This was achieved by moving the application displayed to the computer that the user was using, thus splitting the GUI presentation layer and application from the database. This seemed like an ideal solution, as the database could now just focus on handling and processing SQL queries and DML. It did not have to be burdened with application processing as well.
As there were no agreed network protocols, a number had to be used, including named pipes, LU6.2, DECNET, and TCP/IP. The database had to handle language conversion as the data was moved between the client and the server. The client might be running on a 16-bit platform using US7ASCII as the character set, but the server might be running on 32-bit using EBCDIC as the character set. The network suddenly became very complex to manage.

What proved to be the ultimate show stopper with the architecture had nothing to do with the scalability of client or database performance, but rather something which is always neglected in any architecture, and that is the scalability of maintenance. Having an environment of a hundred users, each with their own computer accessing the server, requires a team of experts to manage those computers and ensure the software on them is correct. Application upgrades meant upgrading hundreds of computers at the same time. This was a time-consuming and manual task. Compounding this, if the client computer is running multiple applications, upgrading one might impact the other applications. Even applying an operating system patch could impact other applications. Users also might install their own software on their computer and impact the application running on it. A lot of time was spent supporting users and ensuring their computers were stable and could correctly communicate with the server.

Three tier

Specialized software vendors tried to come to the rescue by offering the ability to lock down a client computer from being modified and allowing remote access to the computer to perform remote updates. Even then, the maintenance side proved very difficult to deal with, and when the idea of a three tier architecture was pushed by vendors, it was very quickly adopted as the ideal solution to move towards because it critically addressed the maintenance issue.

In the mid 1990s the rules changed again.
The Internet started to gain in popularity and the web browser was invented. The browser opened up the concept of a smart presentation layer that is very flexible and configured using a simple markup language. The browser ran on top of the protocol called HTTP, which uses TCP/IP as the underlying network protocol.

The idea of splitting the presentation layer from the application became a reality as more applications appeared in the browser. The web browser was not an ideal platform for data entry, as the HTTP protocol was stateless, making it very hard to perform transactions in it. The HTTP protocol could scale, however. The actual usage involved the exact same concepts as block mode data entry performed on mainframe computers. In a web browser all the data is entered on the screen, and then sent in one go to the application handling the data.

The web browser also pushed the idea that the operating system the client is running on is immaterial. The web browsers were ported to Apple computers, Windows, Solaris, and Unix platforms.

The web browser also introduced the idea of a standard for the presentation layer. All vendors producing a web browser had to conform to the agreed HTML standard. This ensured that anyone building an application that conformed to HTML would be able to run it on any web browser.

The web browser pushed the concept that the presentation layer had to run on any client computer (later on, any mobile device as well) irrespective of the operating system and what else was installed on it. The web browser was essentially immune from anything else running on the client computer. If all the client had to use was a browser, maintenance on the client machine would be simplified.

HTML had severe limitations and it was not designed for data entry. To address this, the Java language came about and provided the concept of an applet, which could run inside the browser, be safe, and provide an interface to the user for data entry.
Different vendors came up with different architectures for splitting their two tier application into a three tier one. Oracle achieved this by taking their Oracle Forms product and moving it to the middle application tier, and providing a framework where the presentation layer would run as a Java applet inside the browser. The Java applet would communicate with a process on the application server, which would give it instructions for how to draw the display. When the Forms product was replaced with JDeveloper, the same concept was maintained and enhanced. The middle tier became more flexible, and multiple middle application tiers could be configured, enabling more concurrent users.

The three tier architecture has proven to be an ideal environment for legacy systems, giving them a new life and enabling them to be put in an environment where they can scale.

The three tier environment has a major flaw preventing it from truly scaling. The flaw is the bottleneck between the application layer and the database. The three tier environment is also designed for relational databases. It is not designed for multimedia databases. In this architecture, if the digital objects are stored in the database, then to be delivered to the customer they need to pass through the application-database network (exacerbating the bottleneck capacity issues), and from there be passed to the presentation layer.

Those building in this environment naturally lean towards the concept that the best location for the digital objects is the middle tier. This then leads to issues of security, backing up, management, and all the issues previously cited for why storing the digital objects in the database is ideal. The logical conclusion to this is to move the database to the middle tier to address this. In reality, the logical conclusion is to move the application tier back into the database tier.

Virtualized architecture

In the mid 2000s the idea of virtualization began to appear in the marketplace.
Virtualization was not really a new idea; the concept has existed on the IBM MVS environment since the late 1980s. What made this virtualization concept powerful was that it could run Windows, Linux, Solaris, and Mac environments within it. A virtualized environment was basically the ability to run a complete operating system within another operating system. If the computer server had sufficient power and memory, it could run multiple virtualizations (VMs). We can take a snapshot of a VM, which involves taking a view of the disk and memory and storing it. It then became possible to roll back to the snapshot.

A VM could be easily cloned (copied) and backed up. VMs could also be easily transferred to different computer servers. The VM was not tied to a physical server, and the same environment could be moved to new servers as their capacity increased.

A VM environment became attractive to administrators simply because it was easy to manage. Rather than running five separate servers, an administrator could have one server with five virtualizations in it.

The VM environment entered at a critical moment in the evolution of computer servers. Prior to 2005 most computer servers had one or two CPUs in them. The most advanced could have as many as 64 (for example, the Sun E10000), but generally, one or two was the simplest solution. The reason was that computer power was doubling every two years, following Moore's law. By around 2005 the market began to realize that there was a limit to the speed of an individual CPU due to physical limitations in the size of the transistors in the chips. The solution was to grow the CPUs sideways, and the concept of cores came about. A CPU could be broken down into multiple cores, where each one acted like a separate CPU but was contained in one chip. With the introduction of smart threading, the number of virtual cores increased. A single CPU could now simulate eight or more CPUs.

This concept has changed the rules.
A server can now run with a large number of cores, whereas 10 years ago it was physically limited to one or two CPUs. Back then, if a process went wild and consumed all the resources of one CPU, it impacted all users. In the multicore CPU environment, a rogue process that consumes one core no longer has to impact the others. In a VM environment the controlling operating system (which is also called a hypervisor, and can be hardware, firmware, or software centric) can constrain VMs to certain cores as well as to CPU thresholds within those cores. This allows a VM to be fenced in. This concept was taken up by Amazon, and the idea of the cloud environment formed.

This architecture is now moving down a new path, where users use remote desktop to connect to their own VM on a server. The user now needs only a simple laptop (resulting in the demise of the tower computer) to use remote desktop (or equivalent) to reach the VM. They then become responsible for managing their own laptop, and in the event of an issue, it can be replaced or wiped and reinstalled with a base operating system. This simplifies the management. As all the business data and application logic is in the VM, the administrator can now control it, easily back it up, and access it.

Though this VM cloud environment seems like a good solution to the maintenance and scalability issue, a spanner has been thrown in the works: at the same time as VMs were becoming popular, the mobile phone was evolving into a portable handheld device with applications running on it.

Mobile applications architecture

The iPhone, iPad, Android, Samsung, and other devices have caused a disruption in the marketplace as to how the relationship between the user and the application is perceived and managed. These devices are simpler and, on the face of it, employ a variety of architectures including two tier and three tier.
Quality control of the application is managed by having an independent and separate environment, where the user can obtain their application for the mobile device. The strict controls Apple employs for using iTunes are primarily to ensure that Trojan code or viruses are not embedded in the application, with the result that a mobile device does not require complex and constantly updating anti-virus software. Though the interface is not ideal for heavy data entry, the applications are naturally designed to be very friendly and use touch screen controls. The low cost of these devices combined with their simple interface has made them an ideal product for most people, and they are replacing the need for a laptop in a number of cases. Application vendors that have applications that naturally lend themselves to this environment are taking full advantage of it to provide a powerful interface for clients to use.

The result is that there are two architectures today that exist and are moving in different directions. Each one is popular and resolves certain issues. Each has different interfaces, and when building and configuring a storage repository for digital objects, both these environments need to be taken into consideration.

For a multimedia environment the ideal solution for implementing the application is based on the Web. This is because the web environment over the last 15 years has evolved into one which is very flexible and adaptable for dealing with the display of those objects. From the display of digital images to streaming video, the web browser (sometimes with plugins to improve the display) is ideal. This includes the display of documents. The browser environment, though, is not strong for the editing of these digital objects. Adobe Photoshop, GIMP, GarageBand, Office, and a whole suite of other products are available that are designed to edit each type of digital object.
This means that currently the editing of those digital objects requires a different solution to the loading, viewing, and delivery of those digital objects. There is no single right solution for the tier architecture to manage digital objects.

The N-Tier model moves the application and database back into the database tier. An HTTP server can also be located in this tier, or for higher availability it can be located externally. Optimal performance is achieved by locating the application as close to the database as possible. This reduces the network bottleneck. By locating the application within the database (in Oracle this is done by using PL/SQL or Java), an ideal environment is configured where there is no overhead between the application and database.

The N-Tier model also supports the concept of having the digital objects stored outside the environment and delivered using other methods. This could include a streaming server. The N-Tier model also supports the concept of transformation servers. Scalability is achieved by adding more tiers and spreading the database between them. The model also deals with the issue of the connection to the Internet becoming a bottleneck. A database server in the tier is moved to another network to help balance the load. For Oracle this can be done using RAC to achieve a form of transparent scalability. In most situations, scalability at the server is achieved using manual methods, using a form of application partitioning.

Basic database configuration concepts

When a database administrator first creates a database that they know will contain digital objects, they will be confronted with some basic database configuration questions covering key sizing features of the database. When looking at the Oracle Database there are a number of physical and logical structures built inside the database.
To avoid confusion with other database management systems, it's important to note that an Oracle Database is a collection of schemas, whereas in other database management systems the term database equates to exactly one schema. This confusion has caused a lot of issues in the past. An Oracle Database administrator will say it can take 30 minutes to an hour to create a database, whereas a SQL Server administrator will say it takes seconds to create a database. In Oracle, creating a schema (the same as a SQL Server database) also takes seconds.

For the physical storage of tables, the Oracle Database is composed of logical structures called tablespaces. The tablespace is designed to provide a transparent layer between the developer creating a table and the physical disk system, and to ensure the two are independent. Data in a table that resides in a tablespace can span multiple disks, a disk subsystem, or a network storage system. A subsystem equating to a RAID structure is covered in greater detail at the end of this article.

A tablespace is composed of many physical datafiles. Each datafile equates to one physical file on the disk. The goal when creating a datafile is to ensure its allocation of storage is contiguous, in that the operating system doesn't split its location into different areas on the disk (RAID and NAS structures store the data in different locations based on their core structure, so this rule does not apply to them). A contiguous file will result in less disk activity when full tablespace scans are performed. In some cases, especially when reading in very large images, this can improve performance.

A datafile is fragmented (when using locally managed tablespaces, the default in Oracle) into fixed size extents. Access to the extents is controlled via a bitmap which is managed in the header of the tablespace (which will reside on a datafile). An extent is based on the core Oracle block size.
So if the extent is 128 KB and the database block size is 8 KB, 16 Oracle blocks will exist within the extent. An Oracle block is the smallest unit of storage within the database. Blocks are read into memory for caching and updated, and changes are stored in the redo logs.

Even though the Oracle block is the smallest unit of storage, a datafile is an operating system file, so the unit of storage at this level can change based on the type of server filesystem (UNIX can be UFS and Windows can be NTFS). The default in Windows was once 512 bytes, but with NTFS it can be as high as 64 KB. This means that every time a request is made to the disk to retrieve data from the filesystem, it does a read that returns this amount of data. So if the Oracle block size was 8 KB and the filesystem block size was 64 KB, when Oracle requests a block to be read in, the filesystem will read in 64 KB, return the 8 KB requested, and discard the rest. Most filesystems cache this data to improve performance, but this example highlights how in some cases not balancing the database block size with the filesystem block size can result in wasted I/O. The actual answer to this is operating system and filesystem dependent, and it also depends on whether Oracle is doing read aheads (using the init.ora parameter db_file_multiblock_read_count).

When Oracle introduced the Exadata they put forward the idea of putting smarts into the disk layer. Rather than the database working out how best to retrieve the physical blocks of data, the database passes a request for information to the disk system. As the Exadata knows about its own disk performance, channel speed, and I/O throughput, it is in a much better position to work out the optimal method for extracting the data. It then works out the best way of retrieving it based on the request (which can be a query). In some cases it might do a full table scan because it can process the blocks faster than if it used an index.
It now becomes a smart disk system rather than a dumb/blind one. This capability has changed the rules for how a database works with the underlying storage system.

ASM (Automatic Storage Management)

In Oracle 10g, Oracle introduced ASM primarily to improve the performance of Oracle RAC (clustered systems, where multiple separate servers share the same database on the same disk). It replaces the server filesystem and can handle mirroring and load balancing of datafiles. ASM takes the filesystem and operating system out of the equation and enables the database administrator to have a different degree of control over the management of the disk system.

Block size

The database block size is the fundamental unit of storage within an Oracle Database. Though the database can support different block sizes, a tablespace is restricted to one fixed block size. The block sizes available are 4 KB, 8 KB, 16 KB, and 32 KB (a 32 KB block size is valid only on 64-bit platforms). The current tuning mentality says it's best to have one block size for the whole database. This is based on the idea that one block size makes it easier to manage the SGA and ensure that memory isn't wasted. If multiple block sizes are used, the database administrator has to partition the SGA into multiple areas and assign each a block size. So if the administrator decided to have the database at 8 KB and 16 KB, they would have to set up database startup parameters indicating the size of each:

    DB_8K_CACHE_SIZE = 2G
    DB_16K_CACHE_SIZE = 1G

The problem that an administrator faces is that it can be hard to match memory usage with table usage. In the above scenario the tables residing in the 8 KB blocks might be accessed a lot more than the 16 KB ones, meaning the memory needs to be adjusted to deal with that. This balancing act of tuning invariably results in the decision that, unless exceptional situations warrant it, it's best to keep to the same database block size across the whole database.
This makes the job of tuning simpler.

As is always the case when dealing with unstructured data, the rules change. The current thinking is that it's more efficient to store the data in a large block size. This ensures there is less wasted overhead and fewer block reads to read in a row of data. The challenge is that the size of the unstructured data can vary dramatically. It's realistic for an image thumbnail to be under 4 KB in size. This makes it an ideal candidate to be stored in the row with the other relational data. Even if an 8 KB block size is used, the thumbnail and other relational data might happily exist in the one block. A photo might be 10 MB in size, requiring a large number of blocks to store it. If a 16 KB block size is used, it requires about 64 blocks to store 1 MB (slightly more once the overhead of the block header is taken into account). An 8 KB block size requires about 130 blocks. If you have to store 10 MB, the number of blocks increases 10 times. For an 8 KB block that is around 1,300 reads for one small-sized 10 MB image. With images now coming close to 100 MB in size, this figure again increases by a factor of 10. It soon becomes obvious that a very large block size is needed. When storing video at over 4 GB in size, even a 32 KB block size seems too small.

As is covered later in the article, unstructured data stored in an Oracle blob does not have to be cached in the SGA. In fact, it's discouraged because in most situations the data is not likely to be accessed on a frequent basis. This generally holds true, but there are cases, especially with video, where it does not, and this situation is covered later. Under the assumption that the thumbnails are accessed frequently and should be cached, and the originals are accessed infrequently and should not be cached, the conclusion is that it now becomes practical to split the SGA in two.
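As a quick check on the block counts quoted above, the arithmetic can be sketched in a few lines of Java. This is only an illustration: the class and method names are made up for this example, and the per-block header overhead is a parameter because the text does not state its real size (0 gives the lower bound; 100 bytes is an arbitrary assumed value).

```java
// Sketch only: BlockMath and blocksNeeded are hypothetical names, not an Oracle API.
class BlockMath {

    /**
     * Blocks needed to hold objectBytes when every block of blockBytes
     * loses overheadBytes to its header. The overhead is an assumed value.
     */
    static long blocksNeeded(long objectBytes, int blockBytes, int overheadBytes) {
        long usable = blockBytes - overheadBytes;
        if (usable <= 0) {
            throw new IllegalArgumentException("overhead must be smaller than the block");
        }
        return (objectBytes + usable - 1) / usable; // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        int[] blockSizes = {8 * 1024, 16 * 1024, 32 * 1024};
        long[] objectSizes = {mb, 10 * mb, 100 * mb};
        for (int blockSize : blockSizes) {
            for (long size : objectSizes) {
                // Overhead 0 here, so these are lower bounds on the real counts.
                System.out.printf("%2d KB block, %3d MB object: %d blocks%n",
                        blockSize / 1024, size / mb, blocksNeeded(size, blockSize, 0));
            }
        }
    }
}
```

With no overhead, 1 MB needs 64 blocks at 16 KB and 128 blocks at 8 KB, so the "about 130" figure in the text corresponds to a small per-block overhead.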
The unstructured, uncached data is stored in a tablespace using a large block size (32 KB) and the remaining data is stored in a more acceptable and reasonable 8 KB block. The SGA for the 32 KB block size is kept to a bare minimum as it will not be used, thus bypassing the issue of perceived wasted memory when splitting the SGA in two.

For the following table, a simple test was done using three tablespace block sizes. The aim was to see if the block size would impact load and read times. The load involved reading in 67 TIF images totaling 3 GB in size. The result was that the tablespace block size made no statistically significant difference. The test was done using a 50 MB extent size and, as shown in the next segment, this size will impact performance. So to correctly understand how important block size can be, one has to look at not only the block size but also the extent size.

Details of the environment used to perform these tests

    -- BLOCKSIZE was set to 4096, 8192, or 16384, one value per test run
    CREATE TABLESPACE tbls_name
      DATAFILE 'directory/datafile' SIZE 5G REUSE
      BLOCKSIZE 4096/8192/16384
      EXTENT MANAGEMENT LOCAL UNIFORM SIZE 50M
      SEGMENT SPACE MANAGEMENT AUTO;

The following table compares the various block sizes:

    Tablespace block size  Blocks  Extents  Load time     Read time
    4 KB                   819200  64       3.49 minutes  1.02 minutes
    8 KB                   403200  63       3.46 minutes  0.59 minutes
    16 KB                  201600  63       3.55 minutes  0.59 minutes

UNIFORM extent size and AUTOALLOCATE

When creating a tablespace to store the unstructured data, the next step after the block size is determined is to work out what the most efficient extent size will be. As a table might contain data ranging from hundreds of gigabytes to terabytes, determining the extent size is important. The larger the extent, the greater the potential to waste space if the table doesn't use it all. The smaller the extent size, the greater the risk that the table will grow into tens or hundreds of thousands of extents.
Although a locally managed tablespace uses a bitmap to manage access to the extents and is generally quite fast, having it manage tens of thousands of extents might be pushing its performance capabilities.

There are two methods available to the administrator when creating a tablespace. They can manually specify the fragment size using the UNIFORM extent size clause, or they can let the Oracle Database calculate it using the AUTOALLOCATE clause. Tests were done to determine what the optimal fragment size was when AUTOALLOCATE was not used. AUTOALLOCATE is more of a set-and-forget method, and one goal was to see if this clause was as efficient as setting the size manually.

Locally managed tablespace UNIFORM extent size

This section covers testing performed to try to find an optimal extent and block size. The results showed that a block size of 16384 (16 KB) is ideal, though 8192 (8 KB) is acceptable. The block size of 32 KB was not tested. The administrator who might be tempted to think that the larger the extent size, the better the performance, would be surprised that the results show that this is not always the case, and that an extent size between 50 MB and 200 MB is optimal. For reads with SECUREFILES the number of extents was not a major performance factor, but it was for writes.

When the UNIFORM results were compared to the AUTOALLOCATE clause, there was no real performance improvement or loss when AUTOALLOCATE was used. The testing showed that an administrator can use this clause knowing they will get a good all round result when it comes to performance. The syntax for configuration is as follows:

    EXTENT MANAGEMENT LOCAL AUTOALLOCATE segment space management auto

Repeated tests showed that this configuration produced optimal read/write times without the database administrator having to worry about what the extent size should be. For a 300 GB tablespace it produced a similar number of extents as when a 50M extent size was used.

As has been covered, once an image is loaded it is rare that it is updated.
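To put rough numbers on the extent discussion, the following Java sketch cross-checks the block size comparison table (blocks multiplied by block size, divided by the 50 MB uniform extent size) and counts the extents a 300 GB segment would need at that extent size. It is plain arithmetic rather than anything Oracle-specific; the class and method names are hypothetical, and binary units (1 MB = 1024 KB) are an assumption.

```java
// Sketch only: ExtentMath and its methods are hypothetical helper names.
class ExtentMath {
    static final long KB = 1024L, MB = 1024L * KB, GB = 1024L * MB;

    /** Number of fixed-size extents needed to hold segmentBytes (rounded up). */
    static long extentsNeeded(long segmentBytes, long extentBytes) {
        return (segmentBytes + extentBytes - 1) / extentBytes;
    }

    /** Storage covered by a given number of blocks of one size. */
    static long allocatedBytes(long blocks, long blockBytes) {
        return blocks * blockBytes;
    }

    public static void main(String[] args) {
        long extentSize = 50 * MB;
        // {block size, blocks, extents} rows taken from the block size comparison table.
        long[][] rows = {
            {4 * KB, 819200, 64},
            {8 * KB, 403200, 63},
            {16 * KB, 201600, 63},
        };
        for (long[] row : rows) {
            long extents = extentsNeeded(allocatedBytes(row[1], row[0]), extentSize);
            System.out.printf("%2d KB blocks: computed %d extents, table reports %d%n",
                    row[0] / KB, extents, row[2]);
        }
        System.out.println("300 GB at 50 MB extents: " + extentsNeeded(300 * GB, extentSize));
    }
}
```

The three table rows are internally consistent under these assumptions, and a 300 GB segment at a 50 MB uniform extent size comes to 6144 extents, well below the tens of thousands the text warns about.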
In a relational database, fragmentation within a tablespace is caused by repeated creation and dropping of schema objects and by extents of different sizes, resulting in physical storage gaps which are not easily reused. Storage is lost. This is analogous to the Microsoft Windows environment with its disk storage. After a period of time, the disk becomes fragmented, making it hard to find contiguous storage and locate similar items together. Locating all the pieces of a file as close together as possible can dramatically reduce the number of disk reads required to read it in. With NTFS (a Microsoft disk filesystem format) the system administrator can, on creation, determine whether extents are autoallocated or fragmented. This is similar in concept to the Oracle tablespace creation. Testing was not done to check if the fragmentation scenario is avoided with the AUTOALLOCATE clause. The database administrator should therefore be aware of the tablespace usage and whether it is likely to be stable once rows are added (in which case AUTOALLOCATE can be used, simplifying storage management). If it is volatile, the UNIFORM clause might be considered a better option.

Temporary tablespace

For working with unstructured data, the primary use of the TEMPORARY tablespace is to hold the contents of temporary tables and temporary lobs. A temporary lob is used for processing a temporary multimedia object. In the following example, a temporary blob is created. It is not cached in memory. A multimedia image type is created and loaded into it. Information is extracted and the blob is freed. This is useful if images are stored temporarily outside the database. This is not the same as using a bfile, which Oracle Multimedia supports. The bfile is a permanent pointer to an image stored outside the database.
    SQL> declare
      image ORDSYS.ORDImage;
      ctx   raw(4000);
    begin
      image := ordsys.ordimage.init();
      dbms_lob.createtemporary(image.source.localdata, FALSE);
      image.importfrom(ctx, 'file', 'LOADING_DIR', 'myimg.tif');
      image.setProperties;
      dbms_output.put_line('width x height = ' || image.width || 'x' || image.height);
      dbms_lob.freetemporary(image.source.localdata);
    end;
    /
    width x height = 2809x4176

It's important when using this tablespace to ensure that all code, especially on failure, calls dbms_lob.freetemporary, so that storage leakage doesn't occur. Such leakage will result in the tablespace continuing to grow until it runs out of room. In this case the only way to clean it up is either to stop all database processes referencing it and then resize the datafile (or drop and recreate the temporary tablespace after creating another interim one), or to restart the database and mount it. The tablespace can then be resized or dropped and recreated.

UNDO tablespace

The UNDO tablespace is used by the database to store sufficient information to roll back a transaction. In a database containing a lot of digital objects, the size of the database just for storage of the objects can exceed terabytes. In this situation the UNDO tablespace can be sized larger, giving added opportunity for the database administrator to perform flashback recovery from user error. It's reasonable to size the UNDO tablespace at 50 GB, even growing it to 100 GB in size. The larger the UNDO tablespace, the further back in time the administrator can go, and the greater the breathing space between the user failure occurring, the failure being detected and reported, and the database administrator doing the flashback recovery. The following is an example flashback SQL statement.
The as of timestamp clause tells Oracle to find the rows as they were at the given timestamp, counted back from the current time, so that we can have a look at a table as it was an hour ago:

    select t.vimg.source.srcname || '=' ||
           dbms_lob.getlength(t.vimg.source.localdata)
    from test_load as of timestamp systimestamp - (1/24) t;

SYSTEM tablespace

The SYSTEM tablespace contains the data dictionary. In Oracle 11g R2 it also contains any compiled PL/SQL code (where PLSQL_CODE_TYPE=NATIVE). The recommended initial starting size of the tablespace is 1500 MB.

Redo logs

The following test results highlight how important it is to get the size and placement of the redo logs correct. The goal was to determine what combination of database parameters and redo/undo size was optimal. In addition, an SSD was used as a comparison. Based on the result of each test, the parameters and/or storage were modified to see whether this would improve the results. When it appeared an optimal parameter/storage setting was found, it was locked in while the other parameters were tested further. This enabled multiple concurrent configurations to be tested and an optimal result to be calculated.

The test involved loading 67 images into the database. Each image varied in size between 40 and 80 MB, resulting in 2.87 GB of data being loaded. As the test involved only image loading, no processing such as setting properties or extraction of metadata was performed. Archiving on the database was not enabled. All database files resided on hard disk unless specified. In between each test a full database reboot was done.
The test was run at least three times, with the range of results shown as follows.

Database parameter descriptions used:
Redo Buffer Size = LOG_BUFFER
Multiblock Read Count = db_file_multiblock_read_count

    Source disk  Redo logs            Redo buffer size  Multiblock read count  UNDO tablespace  Table datafile  Fastest time  Slowest time
    Hard disk    Hard disk 3 x 50 MB  4 MB              64                     10 GB on HD      HD              3 min 22 sec  3 min 53 sec
    Hard disk    Hard disk 3 x 1 GB   4 MB              64                     10 GB on HD      HD              2 min 49 sec  2 min 57 sec
    Hard disk    SSD 3 x 1 GB         4 MB              64                     10 GB on HD      HD              1 min 30 sec  1 min 41 sec
    Hard disk    SSD 3 x 1 GB         64 MB             64                     10 GB on HD      HD              1 min 23 sec  1 min 48 sec
    Hard disk    SSD 3 x 1 GB         8 MB              64                     10 GB on HD      HD              1 min 18 sec  1 min 29 sec
    Hard disk    SSD 3 x 1 GB         16 MB             64                     10 GB on HD      HD              1 min 19 sec  1 min 27 sec
    Hard disk    SSD 3 x 1 GB         16 MB             256                    10 GB on HD      HD              1 min 27 sec  1 min 41 sec
    Hard disk    SSD 3 x 1 GB         8 MB              64                     1 GB on SSD      HD              1 min 21 sec  1 min 49 sec
    SSD          SSD 3 x 1 GB         8 MB              64                     1 GB on SSD      HD              53 sec        54 sec
    SSD          SSD 3 x 1 GB         8 MB              64                     1 GB on SSD      SSD             1 min 20 sec  1 min 20 sec

Analysis

The tests show a huge improvement when the redo logs were moved to a Solid State Drive
(SSD). Though the conclusion that might be drawn is that this is the optimal step to perform, it might be self-defeating. A number of SSD manufacturers acknowledge there are limitations with SSDs when it comes to repeated writes. The Mean Time to Failure (MTTF) might be 2 million hours for reads; for writes the failure rate can be very high. Modern SSDs and flash cards offer much improved wear leveling algorithms to reduce failures and make performance more consistent. No doubt improvements will continue in the future. A redo log by its nature is subject to constant, heavy writes. So, moving the redo logs to the SSD might quickly result in it becoming damaged and failing. For an organization that on configuration performs one very large load of multimedia, the solution might be to initially keep the redo logs on SSD, and once the load is finished, to move the redo logs to a hard drive.

Increasing the size of the redo logs from 50 MB to 1 GB improves performance, and all databases containing unstructured data should have a redo log size of at least 1 GB. The number of logs should be at least 10; from 50 to 100 is preferred. As is covered later, disk is cheaper today than it once was, and 100 GB of redo logs is not as large a volume of data as it once was. The redo logs should always be mirrored.

The placement or size of the UNDO tablespace makes no difference to performance.

The redo buffer size (LOG_BUFFER) showed a minor improvement when it was increased in size, but the results were inconclusive as the figures varied. A figure of LOG_BUFFER=8691712 showed the best results, and database administrators might use this figure as a starting point for tuning.

Changing the multiblock read count (DB_FILE_MULTIBLOCK_READ_COUNT) from the default value of 64 to 256 showed no improvement. As the default value (in this case 64) is set by the database as optimal for the platform, the conclusion that can be drawn is that the database has set this figure to be a good size.
Moving the original images to an SSD showed another huge improvement in performance. This highlighted how critical the I/O bottleneck of reading from disk and writing to disk (redo logs) is for digital object loading.

The final test involved moving the datafile containing the table to the SSD. It highlighted a realistic issue that DBAs face in dealing with I/O. The disk speed and seek time might not be critical in tuning if the bottleneck is the actual time it takes to transfer the data between the disk and the server. In the test case the datafile was moved to the same SSD as the redo logs, resulting in I/O competition. In the previous tests the datafile was on the hard disk, and the database could write to the disk (separate I/O channel) and to the redo logs (separate I/O channel) without one impacting the other. Even though the SSD is an order of magnitude faster than the disk, it quickly became swamped with calls for reads and writes. The lesson is that it's better to have multiple smaller SSDs on different I/O channels into the server than one larger SSD on a single channel. Sites using a SAN will soon realize that even though a SAN might offer speed, unless it offers multiple I/O channels into the server, its channel to the server will quickly become the bottleneck, especially if the datafiles and the images for loading are all located on the server. The original tuning notion of separating datafiles onto separate disks, practiced more than 15 years ago, still makes sense when it comes to image loading into a multimedia database.

It's important to stress that this is a tuning issue when dealing with image loading, not when running the database in general. Tuning the database in general is a completely different story and might result in a completely different architecture.
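To compare the configurations on a common scale, the fastest times in the redo log test table can be converted into approximate load throughput. The Java sketch below is only a back-of-the-envelope calculation: the class and method names are hypothetical, the 2.87 GB total and the times come from the test description, and treating 1 GB as 1024 MB is an assumption.

```java
// Sketch only: LoadThroughput is a hypothetical helper, not part of any Oracle tool.
class LoadThroughput {

    /** Average MB per second for a load of the given size (assumes 1 GB = 1024 MB). */
    static double mbPerSecond(double gigabytes, int minutes, int seconds) {
        double totalSeconds = minutes * 60 + seconds;
        return gigabytes * 1024.0 / totalSeconds;
    }

    public static void main(String[] args) {
        double loadedGb = 2.87; // total size of the 67 images in the test
        // Fastest times for three of the configurations in the results table.
        System.out.printf("All on hard disk, 3 x 50 MB redo: %.1f MB/s%n",
                mbPerSecond(loadedGb, 3, 22));
        System.out.printf("Redo on SSD, 8 MB redo buffer:    %.1f MB/s%n",
                mbPerSecond(loadedGb, 1, 18));
        System.out.printf("Source images and redo on SSD:    %.1f MB/s%n",
                mbPerSecond(loadedGb, 0, 53));
    }
}
```

Under these assumptions the all-hard-disk baseline works out to roughly 14.5 MB/s and the run with both source images and redo logs on SSD to roughly 55 MB/s, which is one way to see how strongly I/O placement dominated the results.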
Packt
21 Sep 2015
25 min read

Introducing JAX-RS API

In this article by Jobinesh Purushothaman, author of the book RESTful Java Web Services, Second Edition, we will see that there are many tools and frameworks available in the market today for building RESTful web services. There are some recent developments with respect to the standardization of various framework APIs by providing unified interfaces for a variety of implementations. Let's take a quick look at this effort.

As you may know, Java EE is the industry standard for developing portable, robust, scalable, and secure server-side Java applications. The Java EE 6 release took the first step towards standardizing RESTful web service APIs by introducing a Java API for RESTful web services (JAX-RS). JAX-RS is an integral part of the Java EE platform, which ensures portability of your REST API code across all Java EE-compliant application servers. The first release of JAX-RS was based on JSR 311. The latest version is JAX-RS 2 (based on JSR 339), which was released as part of the Java EE 7 platform. There are multiple JAX-RS implementations available today from various vendors. Some of the popular JAX-RS implementations are as follows:

Jersey RESTful web service framework: This framework is an open source framework for developing RESTful web services in Java. It serves as a JAX-RS reference implementation. You can learn more about this project at https://jersey.java.net.

Apache CXF: This framework is an open source web services framework. CXF supports both JAX-WS and JAX-RS web services. To learn more about CXF, refer to http://cxf.apache.org.

RESTEasy: This framework is an open source project from JBoss, which provides various modules to help you build a RESTful web service. To learn more about RESTEasy, refer to http://resteasy.jboss.org.

Restlet: This framework is a lightweight, open source RESTful web service framework.
It has good support for building both scalable RESTful web service APIs and lightweight REST clients, which suits mobile platforms well. You can learn more about Restlet at http://restlet.com.

Remember that you are not locked down to any specific vendor here. The RESTful web service APIs that you build using JAX-RS will run on any JAX-RS implementation as long as you do not use any vendor-specific APIs in the code.

JAX-RS annotations

The main goal of the JAX-RS specification is to make RESTful web service development easier than it has been in the past. As JAX-RS is a part of the Java EE platform, your code becomes portable across all Java EE-compliant servers.

Specifying the dependency of the JAX-RS API

To use JAX-RS APIs in your project, you need to add the javax.ws.rs-api JAR file to the class path. If the consuming project uses Maven for building the source, the dependency entry for the javax.ws.rs-api JAR file in the Project Object Model (POM) file may look like the following:

    <dependency>
        <groupId>javax.ws.rs</groupId>
        <artifactId>javax.ws.rs-api</artifactId>
        <version>2.0.1</version><!-- set the right version -->
        <scope>provided</scope><!-- compile time dependency -->
    </dependency>

Using JAX-RS annotations to build RESTful web services

Java annotations provide the metadata for your Java class, which can be used during compilation, during deployment, or at runtime in order to perform designated tasks. The use of annotations allows us to create RESTful web services as easily as we develop a POJO class. Here, we leave the interception of the HTTP requests and representation negotiations to the framework and concentrate on the business rules necessary to solve the problem at hand. If you are not familiar with Java annotations, go through the tutorial available at http://docs.oracle.com/javase/tutorial/java/annotations/.
Annotations for defining a RESTful resource

REST resources are the fundamental elements of any RESTful web service. A REST resource can be defined as an object that is of a specific type with the associated data and is optionally associated with other resources. It also exposes a set of standard operations corresponding to the HTTP method types such as the HEAD, GET, POST, PUT, and DELETE methods.

@Path

The @javax.ws.rs.Path annotation indicates the URI path to which a resource class or a class method will respond. The value that you specify for the @Path annotation is relative to the URI of the server where the REST resource is hosted. This annotation can be applied at both the class and the method levels.

A @Path annotation value is not required to have leading or trailing slashes (/), as you may see in some examples. The JAX-RS runtime will parse the URI path templates in the same way even if they have leading or trailing slashes.

Specifying the @Path annotation on a resource class

The following code snippet illustrates how you can make a POJO class respond to a URI path template containing the /departments path fragment:

import javax.ws.rs.Path;

@Path("departments")
public class DepartmentService {
    //Rest of the code goes here
}

The /departments path fragment that you see in this example is relative to the base path in the URI. The base path typically takes the following URI pattern: http://host:port/<context-root>/<application-path>.

Specifying the @Path annotation on a resource class method

The following code snippet shows how you can specify @Path on a method in a REST resource class. Note that for an annotated method, the base URI is the effective URI of the containing class. For instance, you will use a URI of the following form to invoke the getTotalDepartments() method defined in the DepartmentService class: /departments/count, where departments is the value of the @Path annotation set on the class:
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;

@Path("departments")
public class DepartmentService {
    @GET
    @Path("count")
    @Produces("text/plain")
    public Integer getTotalDepartments() {
        return findTotalRecordCount();
    }
    //Rest of the code goes here
}

Specifying variables in the URI path template

It is very common for a client to retrieve data for a specific object by passing the desired parameter to the server. JAX-RS allows you to do this via URI path variables, as discussed here.

The URI path template allows you to define variables that appear as placeholders in the URI. These variables are replaced at runtime with the values set by the client.

The following example illustrates the use of a path variable to request a specific department resource. The URI path template looks like /departments/{id}. At runtime, the client can pass an appropriate value for the id parameter to get the desired resource from the server. For instance, a URI path of the form /departments/10 returns the IT department details to the caller.

The following code snippet illustrates how you can pass the department ID as a path variable for deleting a specific department record. The path URI looks like /departments/10.

import javax.ws.rs.DELETE;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;

@Path("departments")
public class DepartmentService {
    @DELETE
    @Path("{id}")
    public void removeDepartment(@PathParam("id") short id) {
        removeDepartmentEntity(id);
    }
    //Other methods removed for brevity
}

In the preceding code snippet, the @PathParam annotation is used for copying the value of the path variable to the method parameter.

Restricting values for path variables with regular expressions

JAX-RS lets you use regular expressions in the URI path template for restricting the values set for the path variables at runtime by the client. By default, the JAX-RS runtime ensures that all the URI variables match the following regular expression: [^/]+?.
The default regular expression allows the path variable to take any character except the forward slash (/). What if you want to override this default regular expression imposed on the path variable values? The good news is that JAX-RS lets you specify your own regular expression for the path variables. For example, you can set the regular expression as given in the following code snippet in order to ensure that the department name variable present in the URI path starts with a letter and contains only letters, digits, and underscores:

@DELETE
@Path("{name: [a-zA-Z][a-zA-Z_0-9]*}")
public void removeDepartmentByName(@PathParam("name") String deptName) {
    //Method implementation goes here
}

If the path variable does not match the regular expression set on the resource class or method, the system reports the status back to the caller with an appropriate HTTP status code, such as 404 Not Found, which tells the caller that the requested resource could not be found at this moment.

Annotations for specifying request-response media types

The Content-Type header field in HTTP describes the content type of the body present in the request and response messages. The content types are represented using the standard Internet media types. A RESTful web service makes use of this header field to indicate the type of content in the request or response message body.

JAX-RS allows you to specify which Internet media types of representations a resource can produce or consume by using the @javax.ws.rs.Produces and @javax.ws.rs.Consumes annotations, respectively.

@Produces

The @javax.ws.rs.Produces annotation is used for defining the Internet media type(s) that a REST resource class method can return to the client. You can define this either at the class level (which becomes the default for all methods) or at the method level. The method-level annotations override the class-level annotations.
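The override rule can be illustrated without a JAX-RS runtime. The following is a minimal sketch, not how any JAX-RS implementation actually resolves media types: DemoProduces is a hypothetical stand-in for @Produces, and DemoDepartmentService and ProducesResolver are names invented for this illustration. It uses plain Java reflection to pick the method-level value when present and to fall back to the class-level value otherwise.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for @Produces so that the sketch runs without a JAX-RS JAR.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface DemoProduces {
    String value();
}

// Hypothetical resource class with a class-level default media type.
@DemoProduces("application/json")
class DemoDepartmentService {
    // No method-level annotation: the class-level default applies.
    public String findAll() {
        return "[]";
    }

    // The method-level annotation overrides the class-level default.
    @DemoProduces("text/plain")
    public String count() {
        return "0";
    }
}

final class ProducesResolver {
    // Returns the method-level media type if present, else the class-level one, else null.
    public static String effectiveMediaType(Class<?> resource, String methodName) {
        Method method;
        try {
            method = resource.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No public no-arg method: " + methodName, e);
        }
        DemoProduces onMethod = method.getAnnotation(DemoProduces.class);
        if (onMethod != null) {
            return onMethod.value();
        }
        DemoProduces onClass = resource.getAnnotation(DemoProduces.class);
        return onClass != null ? onClass.value() : null;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMediaType(DemoDepartmentService.class, "findAll"));
        System.out.println(effectiveMediaType(DemoDepartmentService.class, "count"));
    }
}
```

In this sketch, findAll() inherits the class-level media type, while count() reports its own method-level value.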
Commonly used Internet media types that a REST API can produce include the following:

application/atom+xml
application/json
application/octet-stream
application/svg+xml
application/xhtml+xml
application/xml
text/html
text/plain
text/xml

The following example uses the @Produces annotation at the class level in order to set the default response media type as JSON for all resource methods in this class. At runtime, the binding provider will convert the Java representation of the return value to the JSON format.

import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("departments")
@Produces(MediaType.APPLICATION_JSON)
public class DepartmentService {
    //Class implementation goes here...
}

@Consumes

The @javax.ws.rs.Consumes annotation defines the Internet media type(s) that the resource class methods can accept. You can define the @Consumes annotation either at the class level (which becomes the default for all methods) or at the method level. The method-level annotations override the class-level annotations.

Commonly used Internet media types that a REST API can consume include the following:

application/atom+xml
application/json
application/octet-stream
application/svg+xml
application/xhtml+xml
application/xml
text/html
text/plain
text/xml
multipart/form-data
application/x-www-form-urlencoded

The following example illustrates how you can use the @Consumes annotation to designate a method in a class to consume a payload presented in the JSON media type. The binding provider will copy the JSON representation of an input message to the Department parameter of the createDepartment() method.

import javax.ws.rs.Consumes;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.POST;

@POST
@Consumes(MediaType.APPLICATION_JSON)
public void createDepartment(Department entity) {
    //Method implementation goes here...
}

The javax.ws.rs.core.MediaType class defines constants for all media types supported in JAX-RS.
To learn more about the MediaType class, visit the API documentation available at http://docs.oracle.com/javaee/7/api/javax/ws/rs/core/MediaType.html.

Annotations for processing HTTP request methods

In general, RESTful web services communicate over HTTP with the standard HTTP verbs (also known as method types) such as GET, PUT, POST, DELETE, HEAD, and OPTIONS.

@GET

A RESTful system uses the HTTP GET method type for retrieving the resources referenced in the URI path. The @javax.ws.rs.GET annotation designates a method of a resource class to respond to the HTTP GET requests.

The following code snippet illustrates the use of the @GET annotation to make a method respond to the HTTP GET request type. In this example, the REST URI for accessing the findAllDepartments() method may look like /departments. The complete URI path may take the following URI pattern: http://host:port/<context-root>/<application-path>/departments.

//imports removed for brevity
@Path("departments")
public class DepartmentService {
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public List<Department> findAllDepartments() {
        //Find all departments from the data store
        List<Department> departments = findAllDepartmentsFromDB();
        return departments;
    }
    //Other methods removed for brevity
}

@PUT

The HTTP PUT method is used for updating or creating the resource pointed to by the URI. The @javax.ws.rs.PUT annotation designates a method of a resource class to respond to the HTTP PUT requests. The PUT request generally has a message body carrying the payload. The payload could be of any valid Internet media type, such as a JSON object, an XML structure, plain text, HTML content, or a binary stream. When a request reaches a server, the framework intercepts the request and directs it to the appropriate method that matches the URI path and the HTTP method type. The request payload will be mapped to the method parameter as appropriate by the framework.
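The dispatch step described above, matching on both the HTTP method type and the URI path, can be sketched in plain Java. This is only an illustration under simplifying assumptions: TinyRouter, its register() and resolve() methods, and the handler names are hypothetical, every path variable is matched with the default [^/]+? expression mentioned earlier, and a real JAX-RS runtime applies considerably more elaborate matching and precedence rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified dispatcher; not part of JAX-RS.
final class TinyRouter {

    // One registered route: HTTP method, compiled path pattern, and handler name.
    private static final class Route {
        final String httpMethod;
        final Pattern pathPattern;
        final String handlerName;

        Route(String httpMethod, Pattern pathPattern, String handlerName) {
            this.httpMethod = httpMethod;
            this.pathPattern = pathPattern;
            this.handlerName = handlerName;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    // Registers a handler for an HTTP method and a path template such as "/departments/{id}".
    // Assumes the literal parts of the template contain no regex metacharacters.
    public void register(String httpMethod, String pathTemplate, String handlerName) {
        // Replace each {variable} with a group that matches anything except '/'.
        String regex = pathTemplate.replaceAll("\\{[^/}]+\\}", "([^/]+?)");
        routes.add(new Route(httpMethod, Pattern.compile(regex), handlerName));
    }

    // Returns the first handler whose method and path both match, or null if none does.
    public String resolve(String httpMethod, String path) {
        for (Route route : routes) {
            if (route.httpMethod.equals(httpMethod)
                    && route.pathPattern.matcher(path).matches()) {
                return route.handlerName;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyRouter router = new TinyRouter();
        router.register("GET", "/departments", "findAllDepartments");
        router.register("PUT", "/departments/{id}", "editDepartment");
        router.register("DELETE", "/departments/{id}", "removeDepartment");
        System.out.println(router.resolve("PUT", "/departments/10"));
        System.out.println(router.resolve("DELETE", "/departments/10"));
        System.out.println(router.resolve("POST", "/departments/10"));
    }
}
```

The same path, /departments/10, resolves to different handlers depending on the HTTP method type, and a method with no registered handler resolves to nothing.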
The following code snippet shows how you can use the @PUT annotation to designate the editDepartment() method to respond to the HTTP PUT request. The payload present in the message body will be converted and copied to the department parameter by the framework:

@PUT
@Path("{id}")
@Consumes(MediaType.APPLICATION_JSON)
public void editDepartment(@PathParam("id") Short id, Department department) {
    //Updates department entity to data store
    updateDepartmentEntity(id, department);
}

@POST

The HTTP POST method posts data to the server. Typically, this method type is used for creating a resource. The @javax.ws.rs.POST annotation designates a method of a resource class to respond to the HTTP POST requests.

The following code snippet shows how you can use the @POST annotation to designate the createDepartment() method to respond to the HTTP POST request. The payload present in the message body will be converted and copied to the department parameter by the framework:

@POST
public void createDepartment(Department department) {
    //Create department entity in data store
    createDepartmentEntity(department);
}

@DELETE

The HTTP DELETE method deletes the resource pointed to by the URI. The @javax.ws.rs.DELETE annotation designates a method of a resource class to respond to the HTTP DELETE requests.

The following code snippet shows how you can use the @DELETE annotation to designate the removeDepartment() method to respond to the HTTP DELETE request. The department ID is passed as the path variable in this example.

@DELETE
@Path("{id}")
public void removeDepartment(@PathParam("id") Short id) {
    //remove department entity from data store
    removeDepartmentEntity(id);
}

@HEAD

The @javax.ws.rs.HEAD annotation designates a method to respond to the HTTP HEAD requests. This method is useful for retrieving the metadata present in the response headers, without having to retrieve the message body from the server.
You can use this method to check whether a URI pointing to a resource is active or to check the content size by using the Content-Length response header field, and so on.

The JAX-RS runtime will offer a default implementation for the HEAD method type if the REST resource is missing an explicit implementation. The default implementation provided by the runtime for the HEAD method will call the method designated for the GET request type, ignoring the response entity returned by the method.

@OPTIONS

The @javax.ws.rs.OPTIONS annotation designates a method to respond to the HTTP OPTIONS requests. This method is useful for obtaining a list of HTTP methods allowed on a resource.

The JAX-RS runtime will offer a default implementation for the OPTIONS method type if the REST resource is missing an explicit implementation. The default implementation offered by the runtime sets the Allow response header to all the HTTP method types supported by the resource.

Annotations for accessing request parameters

JAX-RS offers annotations for extracting the following parameters from a request: query, URI path, form, cookie, header, and matrix. Mostly, these parameters are used in conjunction with the GET, POST, PUT, and DELETE methods.

@PathParam

A URI path template, in general, has a URI part pointing to the resource. It can also take path variables embedded in the syntax; this facility is used by clients to pass parameters to the REST APIs as appropriate. The @javax.ws.rs.PathParam annotation injects (or binds) the value of the matching path parameter present in the URI path template into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. Typically, this annotation is used in conjunction with the HTTP method type annotations such as @GET, @POST, @PUT, and @DELETE.

The following example illustrates the use of the @PathParam annotation to read the value of the path parameter, id, into the deptId method parameter.
The URI path template for this example looks like /departments/{id}:

//Other imports removed for brevity
import javax.ws.rs.PathParam;

@Path("departments")
public class DepartmentService {
    @DELETE
    @Path("{id}")
    public void removeDepartment(@PathParam("id") Short deptId) {
        removeDepartmentEntity(deptId);
    }
    //Other methods removed for brevity
}

The REST API call to remove the department resource identified by id=10 looks like DELETE /departments/10 HTTP/1.1.

We can also use multiple variables in a URI path template. For example, we can have the URI path template embedding the path variables to query a list of departments from a specific city and country, which may look like /departments/{country}/{city}. The following code snippet illustrates the use of @PathParam to extract variable values from the preceding URI path template:

@GET
@Produces(MediaType.APPLICATION_JSON)
@Path("{country}/{city}")
public List<Department> findAllDepartments(
        @PathParam("country") String countryCode,
        @PathParam("city") String cityCode) {
    //Find all departments from the data store for a country
    //and city
    List<Department> departments =
        findAllMatchingDepartmentEntities(countryCode, cityCode);
    return departments;
}

@QueryParam

The @javax.ws.rs.QueryParam annotation injects the value(s) of an HTTP query parameter into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter.

The following example illustrates the use of @QueryParam to extract the value of the desired query parameter present in the URI. This example extracts the value of the query parameter, name, from the request URI and copies the value into the deptName method parameter.
The URI that accesses the IT department resource looks like /departments?name=IT:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(@QueryParam("name") String deptName) {
    List<Department> depts = findAllMatchingDepartmentEntities(deptName);
    return depts;
}

@MatrixParam

Matrix parameters are another way of defining parameters in the URI path template. The matrix parameters take the form of name-value pairs in the URI path, where each pair is preceded by a semicolon (;). For instance, the URI path that uses a matrix parameter to list all departments in Bangalore city looks like /departments;city=Bangalore.

The @javax.ws.rs.MatrixParam annotation injects the matrix parameter value into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter.

The following code snippet demonstrates the use of the @MatrixParam annotation to extract the matrix parameters present in the request. Because the method is mapped to the matrix path segment, the URI path used in this example looks like /departments/matrix;name=IT;city=Bangalore.

@GET
@Produces(MediaType.APPLICATION_JSON)
@Path("matrix")
public List<Department> findAllDepartmentsByNameWithMatrix(
        @MatrixParam("name") String deptName,
        @MatrixParam("city") String locationCode) {
    List<Department> depts = findAllDepartmentsFromDB(deptName, locationCode);
    return depts;
}

You can use PathParam, QueryParam, and MatrixParam to pass the desired search parameters to the REST APIs. Now, you may ask which one to use when. Although there are no strict rules here, a very common practice is to use PathParam to drill down through the entity class hierarchy. For example, you may use a URI of the following form to identify an employee working in a specific department: /departments/{dept}/employees/{id}. QueryParam can be used for specifying attributes to locate the instance of a class.
For example, you may use a URI with QueryParam to identify employees who joined on January 1, 2015, which may look like /employees?doj=2015-01-01. The MatrixParam annotation is not used frequently. It is useful when you need to make a complex REST-style query to multiple levels of resources and subresources. MatrixParam is applicable to a particular path element, while the query parameter is applicable to the entire request.

@HeaderParam

The HTTP header fields provide necessary information about the request and response contents in HTTP. For example, the header field, Content-Length: 348, for an HTTP request says that the size of the request body content is 348 octets (8-bit bytes). The @javax.ws.rs.HeaderParam annotation injects the header values present in the request into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter.

The following example extracts the Referer header parameter and logs it for audit purposes. The Referer header field in HTTP contains the address of the previous web page from which a request to the currently processed page originated:

@POST
public void createDepartment(@HeaderParam("Referer") String referer, Department entity) {
    logSource(referer);
    createDepartmentInDB(entity);
}

Remember that HTTP provides a very wide selection of headers that cover most of the header parameters that you are looking for. Although you can use custom HTTP headers to pass some application-specific data to the server, try using standard headers whenever possible. Further, avoid using a custom header for holding properties specific to a resource, or the state of the resource, or parameters directly affecting the resource.

@CookieParam

The @javax.ws.rs.CookieParam annotation injects the matching cookie parameters present in the HTTP headers into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter.
The following code snippet uses the Default-Dept cookie parameter present in the request to return the default department details:

@GET
@Path("cook")
@Produces(MediaType.APPLICATION_JSON)
public Department getDefaultDepartment(@CookieParam("Default-Dept") short departmentId) {
    Department dept = findDepartmentById(departmentId);
    return dept;
}

@FormParam

The @javax.ws.rs.FormParam annotation injects the matching HTML form parameters present in the request body into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The request body carrying the form elements must have the content type specified as application/x-www-form-urlencoded.

Consider the following HTML form that contains the data capture form for a department entity. This form allows the user to enter the department entity details:

<!DOCTYPE html>
<html>
<head>
    <title>Create Department</title>
</head>
<body>
    <form method="POST" action="/resources/departments">
        Department Id: <input type="text" name="departmentId"> <br>
        Department Name: <input type="text" name="departmentName"> <br>
        <input type="submit" value="Add Department" />
    </form>
</body>
</html>

Upon clicking on the submit button on the HTML form, the department details that you entered will be posted to the REST URI, /resources/departments.
The following code snippet shows the use of the @FormParam annotation for extracting the HTML form fields and copying them to the resource class method parameters:

@Path("departments")
public class DepartmentService {
    @POST
    //Specifies content type as
    //"application/x-www-form-urlencoded"
    @Consumes(MediaType.APPLICATION_FORM_URLENCODED)
    public void createDepartment(
            @FormParam("departmentId") short departmentId,
            @FormParam("departmentName") String departmentName) {
        createDepartmentEntity(departmentId, departmentName);
    }
}

@DefaultValue

The @javax.ws.rs.DefaultValue annotation specifies a default value for the request parameters accessed using one of the following annotations: PathParam, QueryParam, MatrixParam, CookieParam, FormParam, or HeaderParam. The default value is used if no matching parameter value is found for the variables annotated using one of the preceding annotations.

The following REST resource method will make use of the default values set for the from and to method parameters if the corresponding query parameters are missing from the request URI:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsInRange(
        @DefaultValue("0") @QueryParam("from") Integer from,
        @DefaultValue("100") @QueryParam("to") Integer to) {
    return findAllDepartmentEntitiesInRange(from, to);
}

@Context

The JAX-RS runtime offers different context objects, which can be used for accessing information associated with the resource class, operating environment, and so on. You may find various context objects that hold information associated with the URI path, request, HTTP header, security, and so on. Some of these context objects also provide utility methods for dealing with the request and response content. JAX-RS allows you to reference the desired context objects in the code via dependency injection. JAX-RS provides the @javax.ws.rs.Context annotation that injects the matching context object into the target field.
You can specify the @Context annotation on a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter.

The following example illustrates the use of the @Context annotation to inject the javax.ws.rs.core.UriInfo context object into a method parameter. The UriInfo instance provides access to the application and request URI information. This example uses UriInfo to read the path parameter, name, present in the request URI path template, /departments/{name} (for example, /departments/IT):

@GET
@Path("{name}")
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(@Context UriInfo uriInfo) {
    String deptName = uriInfo.getPathParameters().getFirst("name");
    List<Department> depts = findAllMatchingDepartmentEntities(deptName);
    return depts;
}

Here is a list of the commonly used classes and interfaces, which can be injected using the @Context annotation:

javax.ws.rs.core.Application: This class defines the components of a JAX-RS application and supplies additional metadata.
javax.ws.rs.core.UriInfo: This interface provides access to the application and request URI information.
javax.ws.rs.core.Request: This interface provides methods for request processing, such as reading the method type and precondition evaluation.
javax.ws.rs.core.HttpHeaders: This interface provides access to the HTTP header information.
javax.ws.rs.core.SecurityContext: This interface provides access to security-related information.
javax.ws.rs.ext.Providers: This interface offers the runtime lookup of a provider instance such as MessageBodyReader, MessageBodyWriter, ExceptionMapper, and ContextResolver.
javax.ws.rs.ext.ContextResolver<T>: This interface supplies the requested context to the resource classes and other providers.
javax.servlet.http.HttpServletRequest: This interface provides the client request information for a servlet.
javax.servlet.http.HttpServletResponse: This interface is used for sending a response to a client.
javax.servlet.ServletContext: This interface provides methods for a servlet to communicate with its servlet container.
javax.servlet.ServletConfig: This interface carries the servlet configuration parameters.

@BeanParam

The @javax.ws.rs.BeanParam annotation allows you to inject all matching request parameters into a single bean object. The @BeanParam annotation can be set on a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The bean class can have fields or properties annotated with one of the request parameter annotations, namely @PathParam, @QueryParam, @MatrixParam, @HeaderParam, @CookieParam, or @FormParam. Apart from the request parameter annotations, the bean can have the @Context annotation if there is a need.

Consider the example that we discussed for @FormParam.
The createDepartment() method that we used in that example has two parameters annotated with @FormParam:

public void createDepartment(
    @FormParam("departmentId") short departmentId,
    @FormParam("departmentName") String departmentName)

Let's see how we can use @BeanParam for the preceding method to give a more logical, meaningful signature by grouping all the related fields into an aggregator class, thereby avoiding too many parameters in the method signature.

The DepartmentBean class that we use for this example is as follows:

public class DepartmentBean {
    @FormParam("departmentId")
    private short departmentId;

    @FormParam("departmentName")
    private String departmentName;

    //getter and setter for the above fields
    //are not shown here to save space
}

The following code snippet demonstrates the use of the @BeanParam annotation to inject the DepartmentBean instance that contains all the FormParam values extracted from the request message body:

@POST
public void createDepartment(@BeanParam DepartmentBean deptBean) {
    createDepartmentEntity(deptBean.getDepartmentId(),
        deptBean.getDepartmentName());
}

@Encoded

By default, the JAX-RS runtime decodes all request parameters before injecting the extracted values into the target variables annotated with one of the following annotations: @FormParam, @PathParam, @MatrixParam, or @QueryParam. You can use @javax.ws.rs.Encoded to disable the automatic decoding of the parameter values. With the @Encoded annotation, the value of parameters will be provided in the encoded form itself. This annotation can be used on a class, method, or parameters. If you set this annotation on a method, it will disable decoding for all parameters defined for this method. You can use this annotation on a class to disable decoding for all parameters of all methods.

In the following example, the value of the query parameter called name is injected into the method parameter in the URL-encoded form (without decoding). The method implementation should take care of decoding the value in such cases:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(
        @Encoded @QueryParam("name") String deptName) {
    //Method body is removed for brevity
}

URL encoding converts a string into a valid URL format, which may contain alphabetic characters, numerals, and some special characters supported in the URL string. To learn about the URL specification, visit http://www.w3.org/Addressing/URL/url-spec.html.

Summary

With the use of annotations, the JAX-RS API provides a simple development model for RESTful web service programming.
Packt
19 Feb 2010
8 min read

Working With ASP.NET DataList Control

In this article by Joydip Kanjilal, we will discuss the ASP.NET DataList control, which can be used to display a list of repeated data items. We will learn about the following:

Using the DataList control
Binding images to a DataList control dynamically
Displaying data using the DataList control
Selecting, editing, and deleting data using this control
Handling the DataList control events

The ASP.NET DataList Control

The DataList control, like the Repeater control, is a template-driven, lightweight control that acts as a container of repeated data items. The templates in this control are used to define the data that it will contain. It is flexible in the sense that you can easily customize the display of one or more records that are displayed in the control. The DataList control has a property called RepeatDirection that can be used to customize the layout of the control.

The RepeatDirection property can accept one of two values, that is, Vertical or Horizontal. The RepeatDirection is Vertical by default. However, if you change it to Horizontal, rather than displaying the data as rows and columns, the DataList control will display it as a list of records, with the columns of each record rendered as rows.

This comes in handy, especially in situations where you have too many columns in your database table or columns with larger widths of data. As an example, imagine what would happen if there is a field called Address in our Employee table having data of large size and you are displaying the data using a Repeater, a DataGrid, or a GridView control. You will not be able to display columns of such large data sizes with any of these controls, as the display would look awkward. This is where the DataList control fits in.

In a sense, you can think of the DataList control as a combination of the DataGrid and the Repeater controls.
You can use templates with it much as you did with a Repeater control, and you can also edit the records displayed in the control, much like the DataGrid control of ASP.NET. The next section compares the features of the three controls that we have mentioned so far, that is, the Repeater, the DataList, and the DataGrid control of ASP.NET.

When the web page is in execution with the data bound to it using the Page_Load event, the data in the DataList control is rendered as DataListItem objects, that is, each item displayed is actually a DataListItem. Similar to the Repeater control, the DataList control does not have Paging and Sorting functionalities built into it.

Using the DataList Control

To use this control, drag and drop it from the toolbox onto the design view of the web form. Refer to the following screenshot, which displays a DataList control on a web form:

The following list outlines the steps that you can follow to add a DataList control to a web page and make it work:

1. Drag and drop a DataList control onto the web form from the toolbox.
2. Set the DataSourceID property of the control to the data source that you will use to bind data to the control, that is, you can set this to an SQL Data Source control.
3. Open the .aspx file, declare the <ItemTemplate> element, and define the fields as per your requirements.
4. Use data binding syntax through the Eval() method to display data in these defined fields of the control.

You can bind data to the DataList control in two different ways, that is, using the DataSourceID and the DataSource properties. You can use the inbuilt features like selecting and updating data when using the DataSourceID property. Note that when you bind through the DataSource property instead, you need to write custom code for selecting and updating data; the source can be any object that implements the ICollection or IEnumerable interfaces. We will discuss more on this later. The next section discusses how you can display data in the DataList control.
Displaying Data

Similar to the Repeater control, the DataList control contains a template that is used to display the data items within the control. Since there are no data columns associated with this control, you use templates to display data. Every column in a DataList control is rendered as a <span> element. A DataList control is useless without templates. Let us now learn what templates are, the types of templates, and how to work with them.

A template is a combination of HTML elements, controls, and embedded server controls that can be used to customize and manipulate the layout, and the look and feel, of controls like Repeater, DataGrid, or DataList. There are seven templates and seven styles in all. You can use templates for the DataList control in the same way you did when using the Repeater control. The templates in the DataList control are as follows:

  • ItemTemplate
  • AlternatingItemTemplate
  • EditItemTemplate
  • FooterTemplate
  • HeaderTemplate
  • SelectedItemTemplate
  • SeparatorTemplate

The following screenshot illustrates the different templates of this control. As you can see from this figure, the templates are grouped under three broad categories. These are:

  • Item Templates
  • Header and Footer Templates
  • Separator Template

Note that out of the templates given above, the ItemTemplate is the only mandatory template when working with a DataList control. Here is a sample of how your DataList control's templates are arranged:

    <asp:DataList id="dlEmployee" runat="server">
        <HeaderTemplate>...</HeaderTemplate>
        <ItemTemplate>...</ItemTemplate>
        <AlternatingItemTemplate>...</AlternatingItemTemplate>
        <FooterTemplate>...</FooterTemplate>
    </asp:DataList>

The following screenshot displays a DataList control populated with data and with its templates indicated.
Customizing a DataList control at run time

You can customize the DataList control at run time by checking the ItemType property (a ListItemType value) in the ItemCreated event of this control as follows:

    // Color comes from the System.Drawing namespace.
    private void DataList1_ItemCreated(object sender,
        System.Web.UI.WebControls.DataListItemEventArgs e)
    {
        switch (e.Item.ItemType)
        {
            case System.Web.UI.WebControls.ListItemType.Item:
                e.Item.BackColor = Color.Red;
                break;
            case System.Web.UI.WebControls.ListItemType.AlternatingItem:
                e.Item.BackColor = Color.Blue;
                break;
            case System.Web.UI.WebControls.ListItemType.SelectedItem:
                e.Item.BackColor = Color.Green;
                break;
            default:
                break;
        }
    }

The styles that you can use with the DataList control to customize the look and feel are:

  • AlternatingItemStyle
  • EditItemStyle
  • FooterStyle
  • HeaderStyle
  • ItemStyle
  • SelectedItemStyle
  • SeparatorStyle

You can use any of these styles to format the control, that is, format the HTML code that is rendered. You can also use the layouts of the DataList control for further customization of your user interface. The available layouts are as follows:

  • Flow and Table, set through the RepeatLayout property
  • Vertical and Horizontal, set through the RepeatDirection property

You can specify your desired flow or table format at design time by specifying the following in the .aspx file:

    RepeatLayout = "Flow"

You can also do the same at run time by specifying your desired layout using the RepeatLayout property of the DataList control, as shown in the following code snippet:

    DataList1.RepeatLayout = RepeatLayout.Flow;

In the code snippet, it is assumed that the name of the DataList control is DataList1. Let us now understand how we can display data using the DataList control. For this, we first drag and drop a DataList control onto our web form and specify the templates for displaying data.
The code in the .aspx file is as follows:

    <asp:DataList ID="DataList1" runat="server">
        <HeaderTemplate>
            <table border="1">
                <tr>
                    <th>Employee Code</th>
                    <th>Employee Name</th>
                    <th>Basic</th>
                    <th>Dept Code</th>
                </tr>
        </HeaderTemplate>
        <ItemTemplate>
            <tr bgcolor="#0xbbbb">
                <td><%# DataBinder.Eval(Container.DataItem, "EmpCode")%></td>
                <td><%# DataBinder.Eval(Container.DataItem, "EmpName")%></td>
                <td><%# DataBinder.Eval(Container.DataItem, "Basic")%></td>
                <td><%# DataBinder.Eval(Container.DataItem, "DeptCode")%></td>
            </tr>
        </ItemTemplate>
        <FooterTemplate>
            </table>
        </FooterTemplate>
    </asp:DataList>

The table opened in the HeaderTemplate is closed in the FooterTemplate. The DataList control is populated with data in the Page_Load event of the web form using the DataManager class as usual:

    protected void Page_Load(object sender, EventArgs e)
    {
        DataManager dataManager = new DataManager();
        DataList1.DataSource = dataManager.GetEmployees();
        DataList1.DataBind();
    }

Note that the DataBinder.Eval() method has been used as usual to display the values of the corresponding fields from the data container in the DataList control. The data container in our case is the DataSet instance that is returned by the GetEmployees() method of the DataManager class. When you execute the application, the output is as follows:

Packt
18 Jun 2013
2 min read

Fundamental Razor syntaxes

Getting ready

In this view page you can try all the Razor syntaxes given in this section.

How to do it...

Here, let's start learning the fundamental Razor syntaxes. Razor code can be written using three different approaches: inline, code block, and mixed.

Inline code expressions

Inline code expressions are always written in a single line, as follows:

    I always enjoy @DateTime.Now.DayOfWeek with my family.

At runtime, the inline code expression, which is @DateTime.Now.DayOfWeek, will be converted into a day, such as Sunday. This can be seen in the following screenshot:

Let's look at one more example, which passes the controller's ViewBag and ViewData messages to the view. The rendered output will be as follows:

Code block expression

A code block expression is a set of multiple code lines enclosed between @{ and }. The opening (@{) and closing (}) characters are mandatory, even for a single line of C# or VB code, as shown in the following screenshot:

This will render the following output:

Mixed code expression

A mixed code expression is a set of multiple inline code expressions in a code block where we switch between C# and HTML. The magical key here is @:, which allows writing HTML in a code block, as follows:

This will render the following output:

So, this is all about how we write code on a Razor view page.

Summary

In this article, you learned about inline code expressions, code block expressions, and mixed code expressions.

Packt
06 Feb 2015
10 min read

Hyper-V Basics

This article by Vinith Menon, the author of Microsoft Hyper-V PowerShell Automation, delves into the basics of Hyper-V, from installing Hyper-V to resizing virtual hard disks. The Hyper-V PowerShell module includes several significant features that extend its use, improve its usability, and allow you to control and manage your Hyper-V environment with more granular control. Various organizations have moved on from Hyper-V (V2) to Hyper-V (V3). In Hyper-V (V2), the Hyper-V management shell was not built in and the PowerShell module had to be installed manually. In Hyper-V (V3), Microsoft has provided an exhaustive set of cmdlets that can be used to manage and automate all configuration activities of the Hyper-V environment. The cmdlets are executed across the network using Windows Remote Management.

In this article, we will cover:

  • The basics of setting up a Hyper-V environment using PowerShell
  • The fundamental concepts of Hyper-V management with the Hyper-V management shell
  • The updated features in Hyper-V

Here is a list of the new features introduced in Hyper-V in Windows Server 2012 R2.
We will be going in depth through the important changes that have come into the Hyper-V PowerShell module with the following features and functions:

  • Shared virtual hard disk
  • Resizing the live virtual hard disk

Installing and configuring your Hyper-V environment

Installing and configuring Hyper-V using PowerShell

Before you proceed with the installation and configuration of Hyper-V, there are some prerequisites that need to be taken care of:

  • The user account that is used to install the Hyper-V role should have administrative privileges on the computer
  • There should be enough RAM on the server to run newly created virtual machines

Once the prerequisites have been taken care of, let's start with installing the Hyper-V role:

Open a PowerShell prompt in Run as Administrator mode:

Type the following into the PowerShell prompt to install the Hyper-V role along with the management tools. Once the installation is complete, the server will reboot and the Hyper-V role will be installed:

    Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -Restart

Once the server boots up, verify the installation of Hyper-V using the Get-WindowsFeature cmdlet:

    Get-WindowsFeature -Name hyper*

You will be able to see that the Hyper-V role, the Hyper-V PowerShell management shell, and the GUI management tools are successfully installed:

Fundamental concepts of Hyper-V management with the Hyper-V management shell

In this section, we will look at some of the fundamental concepts of Hyper-V management with the Hyper-V management shell. Once you get the Hyper-V role installed as per the steps illustrated in the previous section, a PowerShell module to manage your Hyper-V environment also gets installed. Now, perform the following steps:

Open a PowerShell prompt in the Run as Administrator mode.
PowerShell uses cmdlets that are built using a verb-noun naming system (for more details, refer to Learning Windows PowerShell Names at http://technet.microsoft.com/en-us/library/dd315315.aspx). Type the following command into the PowerShell prompt to get a list of all the cmdlets in the Hyper-V PowerShell module:

    Get-Command -Module Hyper-V

Hyper-V in Windows Server 2012 R2 ships with about 178 cmdlets. These cmdlets allow a Hyper-V administrator to handle everything from very simple, basic tasks to advanced ones such as setting up a Hyper-V replica for virtual machine disaster recovery. To get the count of all the available Hyper-V cmdlets, you can type the following command in PowerShell:

    Get-Command -Module Hyper-V | Measure-Object

The Hyper-V PowerShell cmdlets follow a very simple approach and are very user friendly. The name of a cmdlet itself tells the Hyper-V administrator what it does. The following screenshot shows the output of the Get-Command cmdlet:

For example, in the following screenshot, the name of the Remove-VMSwitch cmdlet itself says that it's used to delete a previously created virtual machine switch:

If the administrator is still not sure about the task that can be performed by a cmdlet, he or she can get help with detailed examples using the Get-Help cmdlet. To get help on a cmdlet, type the cmdlet name in the format shown next. To make sure that the latest version of the help files is installed on the server, run the Update-Help cmdlet before executing the following cmdlet:

    Get-Help <Hyper-V cmdlet> -Full

The following screenshot is an example of the Get-Help cmdlet:

Shared virtual hard disks

This new and improved feature in Windows Server 2012 R2 allows an administrator to share a virtual hard disk file (the .vhdx file format) between multiple virtual machines. These .vhdx files can be used as shared storage for a failover cluster created between virtual machines (also known as guest clustering).
A shared virtual hard disk allows you to create data disks and witness disks using .vhdx files, with some advantages:

  • Shared disks are ideal for SQL database files and file servers
  • Shared disks can be run on generation 1 and generation 2 virtual machines

This new feature allows you to save on storage costs and use the .vhdx files for guest clustering, enabling easier deployment than virtual Fibre Channel or Internet Small Computer System Interface (iSCSI), which are complicated and require storage configuration changes such as zoning and Logical Unit Number (LUN) masking. In Windows Server 2012 R2, virtual SCSI disks (both shared and unshared virtual hard disk files) show up as virtual SAS disks when you add a SCSI hard disk to a virtual machine. Shared virtual hard disk (.vhdx) files can be placed on Cluster Shared Volumes (CSV) or a Scale-Out File Server cluster.

Let's look at the ways you can automate and manage your shared .vhdx guest clustering configuration using PowerShell. In the following example, we will demonstrate how you can create a two-node file server cluster using the shared VHDX feature. First, let's set up a testing environment within which we can start learning these new features. The steps are as follows:

We will start by creating two virtual machines, each with a 50 GB OS drive that contains a sysprepped image of Windows Server 2012 R2. Each virtual machine will have 4 GB RAM and four virtual CPUs. D:\vhd\base_1.vhdx and D:\vhd\base_2.vhdx are already existing VHDX files with a sysprepped image of Windows Server 2012 R2. The following code is used to create the two virtual machines:

    New-VM -Name "Fileserver_VM1" -MemoryStartupBytes 4GB -NewVHDPath d:\vhd\base_1.vhdx -NewVHDSizeBytes 50GB
    New-VM -Name "Fileserver_VM2" -MemoryStartupBytes 4GB -NewVHDPath d:\vhd\base_2.vhdx -NewVHDSizeBytes 50GB

Next, we will install the file server role and configure a failover cluster on both the virtual machines using PowerShell.
You need to enable PowerShell remoting on both the file servers and also have them joined to a domain. The following is the code:

    Install-WindowsFeature -ComputerName Fileserver_VM1 File-Services, FS-FileServer, Failover-Clustering
    Install-WindowsFeature -ComputerName Fileserver_VM1 RSAT-Clustering -IncludeAllSubFeature
    Install-WindowsFeature -ComputerName Fileserver_VM2 File-Services, FS-FileServer, Failover-Clustering
    Install-WindowsFeature -ComputerName Fileserver_VM2 RSAT-Clustering -IncludeAllSubFeature

Once we have the virtual machines created and the file server and failover clustering features installed, we will create the failover cluster as per Microsoft's best practices using the following cmdlet:

    New-Cluster -Name Cluster1 -Node FileServer_VM1, FileServer_VM2 -StaticAddress 10.0.0.59 -NoStorage -Verbose

You will need to choose a name and IP address that fit your organization.

Next, we will create two .vhdx files named sharedvhdx_data.vhdx (which will be used as a data disk) and sharedvhdx_quorum.vhdx (which will be used as the quorum or witness disk). To do this, the following commands need to be run on the Hyper-V cluster:

    New-VHD -Path C:\ClusterStorage\Volume1\sharedvhdx_data.VHDX -Fixed -SizeBytes 10GB
    New-VHD -Path C:\ClusterStorage\Volume1\sharedvhdx_quorum.VHDX -Fixed -SizeBytes 1GB

Once we have created these virtual hard disk files, we will add them as shared .vhdx files.
We will attach these newly created VHDX files to the Fileserver_VM1 and Fileserver_VM2 virtual machines and specify the -ShareVirtualDisk parameter so that the VHDX files are shared for guest clustering:

    Add-VMHardDiskDrive -VMName Fileserver_VM1 -Path C:\ClusterStorage\Volume1\sharedvhdx_data.VHDX -ShareVirtualDisk
    Add-VMHardDiskDrive -VMName Fileserver_VM2 -Path C:\ClusterStorage\Volume1\sharedvhdx_data.VHDX -ShareVirtualDisk

Finally, we will bring the disks online and add them to the failover cluster using the following command:

    Get-ClusterAvailableDisk | Add-ClusterDisk

Once we have executed the preceding set of steps, we will have a highly available file server infrastructure using shared VHD files.

Live virtual hard disk resizing

With Windows Server 2012 R2, a newly added feature in Hyper-V allows administrators to expand or shrink a virtual hard disk attached to the SCSI controller while the virtual machine is still running. Hyper-V administrators can now perform maintenance operations on a live VHD and avoid downtime, because the virtual machine no longer has to be shut down for these maintenance activities. Prior to Windows Server 2012 R2, a virtual machine had to be turned off to resize a VHD attached to it, leading to costly downtime. In the GUI, the VHD resize is done with the Edit Virtual Hard Disk wizard. Also, note that VHDs that were previously expanded can be shrunk. The Windows PowerShell way of doing a VHD resize is by using the Resize-VHD cmdlet of the Hyper-V module. Let's look at the ways you can automate a VHD resize using PowerShell. In the next example, we will demonstrate how you can expand and shrink a virtual hard disk connected to a VM's SCSI controller. We will continue using the virtual machine that we created for our previous example. We have a pre-created VHD of 50 GB that is connected to the virtual machine's SCSI controller.
Expanding the virtual hard disk

Let's resize the aforementioned virtual hard disk to 57 GB using the Resize-VHD cmdlet (replace the placeholder with the path to the .vhdx file):

    Resize-VHD -Path <path to the .vhdx file> -SizeBytes 57GB

Next, if we open the VM settings and perform an inspect disk operation, we'll be able to see that the VHDX file size has become 57 GB:

One can also verify this by logging in to the VM, opening disk management, and extending the partition into the unused space. You can see that the disk size has increased to 57 GB:

Shrinking the virtual hard disk

Let's shrink the earlier mentioned VHD back to 50 GB using the Resize-VHD cmdlet. For this exercise, the primary requirement is to first shrink the disk partition by logging in to the VM and using disk management, as you can see in the following screenshot; we're shrinking the partition by 7 GB:

Next, click on Shrink. Once you complete this step, you will see that the unallocated space is 7 GB. You can also execute this step using the Resize-Partition PowerShell cmdlet:

    Get-Partition -DiskNumber 1 | Resize-Partition -Size 50GB

The following screenshot shows the partition:

Next, we will shrink the VHD to 50 GB:

    Resize-VHD -Path <path to the .vhdx file> -SizeBytes 50GB

Once the previous steps have been executed successfully, run a disk rescan using disk management and you will see that the disk size is 50 GB:

Summary

In this article, we went through the basics of setting up a Hyper-V environment using PowerShell. We also explored the fundamental concepts of Hyper-V management with the Hyper-V management shell.
Packt
09 Mar 2016
13 min read

Keystone – OpenStack Identity Service

In this article by Cody Bunch, Kevin Jackson, and Egle Sigler, the authors of OpenStack Cloud Computing Cookbook, Third Edition, we will cover the following topics:

  • Creating tenants in Keystone
  • Configuring roles in Keystone
  • Adding users to Keystone

The OpenStack Identity service, known as Keystone, provides services for authenticating and managing user accounts and role information for our OpenStack cloud environment. It is a crucial service that underpins the authentication and verification between all of our OpenStack cloud services, and it is the first service that needs to be installed within an OpenStack environment. The OpenStack Identity service authenticates users and tenants by sending a validated authorization token between all OpenStack services. This token is used for authentication and verification so that you can use a given service, such as OpenStack Storage or Compute. Therefore, configuration of the OpenStack Identity service must be completed first. It consists of creating appropriate roles for users and services, tenants, the user accounts, and the service API endpoints that make up our cloud infrastructure.

In Keystone, we have the concepts of tenants, roles, and users. A tenant is like a project and has resources such as users, images, and instances, as well as networks, that are only known to that particular project. A user can belong to one or more tenants and is able to switch between these projects to gain access to those resources. Users within a tenant can have various roles assigned. In the most basic scenario, a user can be assigned either the role of admin or just be a member.
When a user has admin privileges within a tenant, they are able to utilize features that can affect the tenant (such as modifying external networks), whereas a normal user is assigned the member role, which is generally used to perform user-related tasks, such as spinning up instances, creating volumes, and creating tenant-only networks.

Creating tenants in Keystone

A tenant in OpenStack is a project, and the two terms are generally used interchangeably. Users can't be created without having a tenant assigned to them, so tenants must be created first. Here, we will create a tenant called cookbook for our users.

Getting ready

We will be using the keystone client to operate Keystone. If the python-keystoneclient tool isn't available, follow the steps described at http://bit.ly/OpenStackCookbookClientInstall. Ensure that we have our environment set correctly to access our OpenStack environment for administrative purposes:

    export OS_TENANT_NAME=cookbook
    export OS_USERNAME=admin
    export OS_PASSWORD=openstack
    export OS_AUTH_URL=https://192.168.100.200:5000/v2.0/
    export OS_NO_CACHE=1
    export OS_KEY=/vagrant/cakey.pem
    export OS_CACERT=/vagrant/ca.pem

You can use the controller node if no other machines are available on your network, as this has the python-keystoneclient and the relevant access to the OpenStack environment. If you are using the Vagrant environment, issue the following command to get access to the controller:

    vagrant ssh controller

How to do it...
To create a tenant in our OpenStack environment, perform the following steps:

1. We start by creating a tenant called cookbook:

    keystone tenant-create \
        --name cookbook \
        --description "Default Cookbook Tenant" \
        --enabled true

This will produce output similar to:

    +-------------+----------------------------------+
    |   Property  |              Value               |
    +-------------+----------------------------------+
    | description |     Default Cookbook Tenant      |
    |   enabled   |               True               |
    |      id     | fba7b31689714d1ab39a751bc9483efd |
    |     name    |             cookbook             |
    +-------------+----------------------------------+

2. We also need an admin tenant so that when we create users in this tenant, they have access to our complete environment. We do this in the same way as in the previous step:

    keystone tenant-create \
        --name admin \
        --description "Admin Tenant" \
        --enabled true

How it works...

Creation of the tenants is achieved by using the keystone client, specifying the tenant-create option with the following syntax:

    keystone tenant-create \
        --name tenant_name \
        --description "A description" \
        --enabled true

The tenant_name is an arbitrary string and must not contain spaces. On creation of the tenant, this returns an ID associated with it that we use when adding users to this tenant. To see a list of tenants and the associated IDs in our environment, we can issue the following command:

    keystone tenant-list

Configuring roles in Keystone

Roles are the permissions given to users within a tenant. Here, we will configure two roles: an admin role that allows for the administration of our environment, and a Member role that is given to ordinary users who will be using the cloud environment.

Getting ready

We will be using the keystone client to operate Keystone. If the python-keystoneclient tool isn't available, follow the steps described at http://bit.ly/OpenStackCookbookClientInstall.
Ensure that we have our environment set correctly to access our OpenStack environment for administrative purposes:

    export OS_TENANT_NAME=cookbook
    export OS_USERNAME=admin
    export OS_PASSWORD=openstack
    export OS_AUTH_URL=https://192.168.100.200:5000/v2.0/
    export OS_NO_CACHE=1
    export OS_KEY=/vagrant/cakey.pem
    export OS_CACERT=/vagrant/ca.pem

You can use the controller node if no other machines are available on your network, as this has the python-keystoneclient and the relevant access to the OpenStack environment. If you are using the Vagrant environment, issue the following command to get access to the controller:

    vagrant ssh controller

How to do it...

To create the required roles in our OpenStack environment, perform the following steps:

1. Create the admin role as follows:

    # admin role
    keystone role-create --name admin

You will get an output like this:

    +----------+----------------------------------+
    | Property |              Value               |
    +----------+----------------------------------+
    |    id    | 625b81ae9f024366bbe023a62ab8a18d |
    |   name   |              admin               |
    +----------+----------------------------------+

2. To create the Member role, we repeat the step and specify the Member role:

    # Member role
    keystone role-create --name Member

How it works...

Creation of the roles is achieved by using the keystone client and specifying the role-create option with the following syntax:

    keystone role-create --name role_name

The role_name attribute can't be arbitrary for the admin and Member roles. The admin role has been set by default in /etc/keystone/policy.json as having administrative rights:

    {
        "admin_required": [["role:admin"], ["is_admin:1"]]
    }

The Member role is also configured by default in the OpenStack Dashboard, Horizon, for a non-admin user created through the web interface. On creation of the role, the ID associated with it is returned, and we can use it when assigning roles to users.
To see a list of roles and the associated IDs in our environment, we can issue the following command:

    keystone role-list

Adding users to Keystone

Adding users to the OpenStack Identity service requires that there is a tenant the user can exist in and a defined role that can be assigned to them. Here, we will create two users. The first user will be named admin and will have the admin role assigned to them in the cookbook tenant. The second user will be named demo and will have the Member role assigned to them in the same cookbook tenant.

Getting ready

We will be using the keystone client to operate Keystone. If the python-keystoneclient tool isn't available, follow the steps described at http://bit.ly/OpenStackCookbookClientInstall. Ensure that we have our environment set correctly to access our OpenStack environment for administrative purposes:

    export OS_TENANT_NAME=cookbook
    export OS_USERNAME=admin
    export OS_PASSWORD=openstack
    export OS_AUTH_URL=https://192.168.100.200:5000/v2.0/
    export OS_NO_CACHE=1
    export OS_KEY=/vagrant/cakey.pem
    export OS_CACERT=/vagrant/ca.pem

You can use the controller node if no other machines are available on your network, as this has the python-keystoneclient and the relevant access to the OpenStack environment. If you are using the Vagrant environment, issue the following command to get access to the controller:

    vagrant ssh controller

How to do it...

To create the required users in our OpenStack environment, perform the following steps:

1. To create a user in the cookbook tenant, we first need to get the cookbook tenant ID.
To do this, issue the following command, which uses the tenant-list option and conveniently stores the result in a variable named TENANT_ID:

    TENANT_ID=$(keystone tenant-list \
        | awk '/\ cookbook\ / {print $2}')

2. Now that we have the tenant ID, the admin user in the cookbook tenant is created using the user-create option and a password is chosen for the user:

    PASSWORD=openstack
    keystone user-create \
        --name admin \
        --tenant_id $TENANT_ID \
        --pass $PASSWORD \
        --email root@localhost \
        --enabled true

The preceding code will produce the following output:

    +----------+----------------------------------+
    | Property |              Value               |
    +----------+----------------------------------+
    |  email   |          root@localhost          |
    | enabled  |               True               |
    |    id    | 2e23d0673e8a4deabe7c0fb70dfcb9f2 |
    |   name   |              admin               |
    | tenantId | 14e34722ac7b4fe298886371ec17cf40 |
    | username |              admin               |
    +----------+----------------------------------+

3. As we are creating the admin user, to which we are assigning the admin role, we need the admin role ID. We pick out the ID of the admin role with the role-list option and conveniently store it in a variable to use when assigning the role to the user:

    ROLE_ID=$(keystone role-list \
        | awk '/\ admin\ / {print $2}')

4. To assign the role to our user, we need to use the user ID that was returned when we created that user. To get this, we can list the users and pick out the ID for that particular user with the following user-list option:

    USER_ID=$(keystone user-list \
        | awk '/\ admin\ / {print $2}')

5. With the tenant ID, user ID, and an appropriate role ID available, we can assign that role to the user with the following user-role-add option:

    keystone user-role-add \
        --user $USER_ID \
        --role $ROLE_ID \
        --tenant_id $TENANT_ID

Note that there is no output produced on successfully running this command.
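The ID lookups in the steps so far all rely on the same awk filter, which matches the name anywhere in a row of the table. As a minimal sketch, that filter can be wrapped in a helper that compares the name column exactly. The function name id_for_name is hypothetical and not part of python-keystoneclient, and the sketch assumes the table layout used in this recipe, with the id in the first column and the name in the second:

```shell
#!/bin/sh
# Hypothetical helper, not part of python-keystoneclient.
# Reads a keystone-style ASCII table on stdin and prints the id of the
# first row whose name column is exactly "$1".
# Assumption: column 1 holds the id and column 2 holds the name.
id_for_name() {
    awk -F'|' -v want="$1" '
        NF >= 3 {
            id = $2
            name = $3
            # Trim the padding that the table layout adds around cells.
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == want) { print id; exit }
        }'
}

# Usage (needs a working keystone client and credentials):
#   TENANT_ID=$(keystone tenant-list | id_for_name cookbook)
#   [ -n "$TENANT_ID" ] || echo "tenant not found" >&2
```

Border rows contain no | separators, so they are skipped, and an unknown name yields an empty result that can be checked before calling user-role-add.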
6. The admin user also needs to be in the admin tenant for us to be able to administer the complete environment. To do this, we need to get the admin tenant ID and then repeat the previous step using this new tenant ID:

    ADMIN_TENANT_ID=$(keystone tenant-list \
        | awk '/\ admin\ / {print $2}')

    keystone user-role-add \
        --user $USER_ID \
        --role $ROLE_ID \
        --tenant_id $ADMIN_TENANT_ID

7. To create the demo user in the cookbook tenant with the Member role assigned, we repeat the process defined in steps 1 to 5:

    # Get the cookbook tenant ID
    TENANT_ID=$(keystone tenant-list \
        | awk '/\ cookbook\ / {print $2}')

    # Create the user
    PASSWORD=openstack
    keystone user-create \
        --name demo \
        --tenant_id $TENANT_ID \
        --pass $PASSWORD \
        --email demo@localhost \
        --enabled true

    # Get the Member role ID
    ROLE_ID=$(keystone role-list \
        | awk '/\ Member\ / {print $2}')

    # Get the demo user ID
    USER_ID=$(keystone user-list \
        | awk '/\ demo\ / {print $2}')

    # Assign the Member role to the demo user in cookbook
    keystone user-role-add \
        --user $USER_ID \
        --role $ROLE_ID \
        --tenant_id $TENANT_ID

How it works...

Adding users in the OpenStack Identity service involves a number of steps and dependencies. First, a tenant is required for the user to be part of. Once the tenant exists, the user can be added. At this point, the user has no role associated, so the final step is to assign a role to this user, such as Member or admin. Use the following syntax to create a user with the user-create option:

    keystone user-create \
        --name user_name \
        --tenant_id TENANT_ID \
        --pass PASSWORD \
        --email email_address \
        --enabled true

The user_name attribute is an arbitrary name but cannot contain any spaces. A password attribute must be present. In the previous examples, these were set to openstack. The email_address attribute must also be present.
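The constraints just listed (no spaces in the user name, and both a password and an e-mail address present) can be checked before the client is called. The following is a minimal sketch only: check_user_args is a hypothetical helper, not a keystone subcommand, and it mirrors nothing more than the rules stated in this recipe:

```shell
#!/bin/sh
# Hypothetical pre-flight check that mirrors the rules described above.
# Usage: check_user_args NAME PASSWORD EMAIL
# Returns 0 when the arguments look usable, 1 otherwise.
check_user_args() {
    _name=$1
    _pass=$2
    _email=$3

    if [ -z "$_name" ]; then
        echo "user name is empty" >&2
        return 1
    fi
    case "$_name" in
        *" "*)
            echo "user name must not contain spaces" >&2
            return 1
            ;;
    esac
    if [ -z "$_pass" ]; then
        echo "password is missing" >&2
        return 1
    fi
    if [ -z "$_email" ]; then
        echo "email address is missing" >&2
        return 1
    fi
    return 0
}

# Usage:
#   check_user_args demo "$PASSWORD" demo@localhost && keystone user-create ...
```

Keeping the check separate from the keystone call means a typo is reported before any request reaches the Identity service.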
To assign a role to a user with the user-role-add option, use the following syntax:

    keystone user-role-add \
        --user USER_ID \
        --role ROLE_ID \
        --tenant_id TENANT_ID

This means that we need to have the ID of the user, the ID of the role, and the ID of the tenant in order to assign roles to users. These IDs can be found using the following commands:

    keystone tenant-list
    keystone user-list
    keystone role-list

Summary

In this article, we have looked at the basic operations with respect to Keystone, such as creating tenants, configuring roles, and adding users. For more on cloud computing with OpenStack, see OpenStack Cloud Computing Cookbook, Third Edition, which is also currently being used at CERN. The book has chapters on the Identity Service, Image Service, Networking, Object Storage, and Block Storage, as well as on how to manage OpenStack in production environments.

Packt
25 Feb 2015
20 min read

Christmas Light Sequencer

In this article, Sai Yamanoor and Srihari Yamanoor, authors of the book Raspberry Pi Mechatronics Projects Hotshot, have picked a Christmas-themed project to demonstrate controlling appliances connected to a local network using the Raspberry Pi. We will design the automation and control of Christmas lights in our homes. The same approach works for decorating our homes with lights for any festive occasion. We will build a local server to control the devices, and we will use the web.py framework to design the web server. We'd like to dedicate this article to the memory of Aaron Swartz, who was the founder of the web.py framework.

Mission briefing

In this article, we will set up local web server-based control of the GPIO pins on the Raspberry Pi. We will use this web server framework to control them via a web page.

[Figure: The Raspberry Pi on top of the tree is just an ornament for decoration]

Why is it awesome?

We celebrate festive occasions by decorating our homes. The decorations reflect our hearts, and they can be enhanced using the Raspberry Pi. This article involves interfacing AC-powered devices to the Raspberry Pi. You should exercise extreme caution while interfacing the devices, and it is strongly recommended that you stick to the recommended devices.

Your objectives

In this article, we will work on the following aspects:
- Interfacing the Christmas tree lights and other decorative equipment to the Raspberry Pi
- Setting up the digitally addressable RGB matrix
- Interfacing an audio device
- Setting up the web server
- Interfacing devices to the web server

You can buy your sustainable and UK grown Christmas trees from christmastrees.co.uk.

Mission checklist

This article is based on a broad concept. You are free to choose decorative items of your own interest.
We chose to show the following items for demonstration:

Item | Estimated cost
Christmas tree x 1 | 30 USD
Outdoor decoration (optional) | 30 USD
Santa Claus figurine x 1 | 20 USD
Digitally addressable strip x 1 | 30 USD approximately
Power Switch Tail 2 from Adafruit Industries (http://www.adafruit.com/product/268) | 25 USD approximately
Arduino Uno (any variant) | 20 to 30 USD approximately

Interface the devices to the Raspberry Pi

It is important to exercise caution while connecting electrical appliances to the Raspberry Pi. If you don't know what you are doing, please skip this section. Adult supervision is required while connecting appliances.

In this task, we will look into interfacing decorative appliances (operated with an AC power supply), such as the Christmas tree. It is important to interface AC appliances to the Raspberry Pi in accordance with safety practices. It is possible to connect AC appliances to the Raspberry Pi using solid state relays. However, if the prototype boards aren't connected properly, they are a potential hazard. Hence, we use the Power Switch Tail II sold by Adafruit Industries. The Power Switch Tail II is rated for 110V. According to the specifications provided on the Adafruit website, the Power Switch Tail's relay can switch resistive loads of up to 15A. It can be controlled by providing a 3-12V DC signal. We will look into controlling the lights on a Christmas tree in this task.

[Figure: Power Switch Tail II. Photo courtesy: Adafruit.com]

Prepare for lift off

We have to connect the Power Switch Tail II to the Raspberry Pi to test it. The following Fritzing schematic shows the connection of the switch to the Raspberry Pi using the Pi Cobbler. Pin 25 is connected to in+, while the in- pin is connected to the Ground pin of the Raspberry Pi.
The Pi Cobbler breakout board is connected to the Raspberry Pi as shown in the following image:

[Figure: The Raspberry Pi connection to the Power Switch Tail II using Pi Cobbler]

Engage thrusters

In order to test the device, there are two options for controlling the GPIO pins of the Raspberry Pi: the quick2wire GPIO library or the Raspi GPIO library (the RPi.GPIO module). The main difference between the two is that the quick2wire GPIO library does not require the Python script to be run with root user privileges (for those who are not familiar with root privileges, this means running the Python script using sudo). In the case of the Raspi GPIO library, it is possible to set the ownership of the pins to avoid executing the script as root.

Once the installation is complete, let's turn the lights on the tree on and off with a three-second interval. The code for it is given as follows:

    # Import the RPi.GPIO module.
    import RPi.GPIO as GPIO
    # Import the delay function.
    from time import sleep

    # Use BCM GPIO numbering.
    GPIO.setmode(GPIO.BCM)
    # BCM pin 25 is the output.
    GPIO.setup(25, GPIO.OUT)
    # Initialise pin 25 to low (False) so that the Christmas tree lights are switched off.
    GPIO.output(25, False)

    while 1:
        GPIO.output(25, False)
        sleep(3)
        GPIO.output(25, True)
        sleep(3)

In the preceding code, we start by importing the RPi.GPIO module and the sleep function from the time module, which introduces a delay between turning the lights on and off:

    import RPi.GPIO as GPIO
    # Import the delay function.
    from time import sleep

We need to set the mode in which the GPIO pins are being used. There are two modes, namely the board's GPIO mode and the BCM GPIO mode (more information is available at http://sourceforge.net/p/raspberry-gpio-python/wiki/). The former refers to the pin numbers on the Raspberry Pi board, while the latter refers to the pin numbers found on the Broadcom chipset. In this example, we will adopt the BCM chipset's pin description.
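The off/on cycle in the listing above can also be tried without any hardware attached. The following is a minimal sketch, not the authors' code: the function name blink and its parameters are made up for illustration, and the pin writer and the delay are passed in so that a dry run can simply record the calls.

```python
from time import sleep


def blink(set_pin, pin=25, cycles=3, interval=3.0, wait=sleep):
    """Switch `pin` off and then on `cycles` times, pausing after each change."""
    for _ in range(cycles):
        set_pin(pin, False)   # lights off
        wait(interval)
        set_pin(pin, True)    # lights on
        wait(interval)


# Dry run: record the calls instead of touching GPIO pins or really sleeping.
log = []
blink(lambda pin, value: log.append((pin, value)), cycles=2, wait=lambda seconds: None)
print(log)
```

On the Raspberry Pi itself, after GPIO.setmode and GPIO.setup have been called as in the listing, passing GPIO.output as set_pin should drive the real pin, since it takes the pin number and the value in the same order.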
We will set pin 25 to be an output pin and set it to False so that the Christmas tree lights are switched off at the start of the program:

    GPIO.setup(25, GPIO.OUT)
    GPIO.output(25, False)

In the following loop, we switch off the lights and turn them back on with a three-second interval:

    while 1:
        GPIO.output(25, False)
        sleep(3)
        GPIO.output(25, True)
        sleep(3)

When pin 25 is set to high, the device is turned on, and it is turned off when the pin is set to low, with a three-second interval between the two.

Connecting multiple appliances to the Raspberry Pi

Let's consider a scenario where we have to control multiple appliances using the Raspberry Pi. It is possible to connect a maximum of 15 devices to the GPIO interface of the Raspberry Pi. (There are 17 GPIO pins on the Raspberry Pi Model B, but two of those pins, namely GPIO14 and GPIO15, are set to be UART in the default state. This can be changed after startup. It is also possible to use a GPIO expander to connect more devices to the Raspberry Pi.)

In the case of appliances that need to be connected to the 110V AC mains, it is recommended that you use multiple Power Switch Tails to adhere to safety practices. In the case of decorative lights that operate using a battery (for example, a two-foot Christmas tree) or appliances that operate at low voltage levels of 12V DC, a simple transistor circuit and a relay can be used to connect the devices. A sample circuit is shown in the figure that follows:

[Figure: A transistor switching circuit]

In the preceding circuit, since the GPIO pins operate at 3.3V levels, we connect the GPIO pin to the base of the NPN transistor. The collector pin of the transistor is connected to one end of the relay. The transistor acts as a switch: when the GPIO pin is set to high, the collector is connected to the emitter (which in turn is connected to the ground), and hence the relay is energized.
Relays usually have three terminals, namely, the common terminal, the Normally Open terminal, and the Normally Closed terminal. When the relay is not energized, the common terminal is connected to the Normally Closed terminal. Upon energization, the Normally Open terminal is connected to the common terminal, thus turning on the appliance. The freewheeling diode across the relay is used to protect the circuit from any reverse current caused by the switching of the relay.

The transistor switching circuit makes it possible to operate an appliance that runs at 12V DC using the Raspberry Pi's GPIO pins (which operate at 3.3V levels). Together, the relay and the transistor switching circuit enable the control of high current devices using the Raspberry Pi.

It is possible to use an array of relays (as shown in the following image) and control an array of decorative lighting arrangements. It would be cool to control lighting arrangements according to the music that is being played on the Raspberry Pi (a project idea for the holidays!). The relay board (shown in the following image) operates at 5V DC and comes with the circuitry described earlier in this section. We can make use of the board by powering it with a 5V power supply and connecting the GPIO pins to the pins highlighted in red. As explained earlier, the relay can be energized by setting the GPIO pin to high.

[Figure: A relay board]

Objective complete – mini debriefing

In this section, we discussed controlling decorative lights and other holiday appliances by running a Python script on the Raspberry Pi. Let's move on to the next section to set up the digitally addressable RGB LED strip!

Setting up the digitally addressable RGB matrix

In this task, we will talk about the options available for setting up LED lighting. We will discuss two types of LED strips, namely analog RGB LED strips and digitally addressable RGB LED strips.
A sample of the digitally addressable RGB LED strip is shown in the image that follows:

[Figure: A digitally addressable RGB LED strip]

Prepare for lift off

As the name suggests, digitally addressable RGB LED strips are those where the color of each RGB LED can be individually controlled (in the case of the analog strip, the colors cannot be individually controlled).

Where can I buy them?

There are different models of digitally addressable RGB LED strips based on different chips, such as the LPD6803, LPD8806, and WS2811. The strips are sold in reels with a maximum length of 5 meters. Some sources for the LED strips include Adafruit (http://www.adafruit.com/product/306) and Banggood (http://www.banggood.com/5M-5050-RGB-Dream-Color-6803-IC-LED-Strip-Light-Waterproof-IP67-12V-DC-p-931386.html), and they cost about 50 USD for a reel. Some vendors (including Adafruit) sell them in strips of one meter as well.

Engage thrusters

Let's review how to control and use these digitally addressable RGB LED strips.

How does it work?

Most digitally addressable RGB strips come with terminals for powering the LEDs, a clock pin, and a data pin. The LEDs are serially connected to each other and are controlled through SPI (Serial Peripheral Interface). The RGB LEDs on the strip are controlled by a chip that latches data from the microcontroller or Raspberry Pi onto the LEDs with reference to the clock cycles received on the clock pin. In the case of the LPD8806 strip, each chip can control about 2 LEDs. It can control each channel of the RGB LED using a seven-bit PWM channel. More information on the function of the RGB LED strip is available at https://learn.adafruit.com/digital-led-strip.

It is possible to break the LED strip into individual segments. Each segment contains about 2 LEDs, and Adafruit Industries has provided an excellent tutorial on separating the individual segments of the LED strip (https://learn.adafruit.com/digital-led-strip/advanced-separating-strips).
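Two numbers in the description above lend themselves to a quick calculation: the seven-bit PWM resolution per color channel and the figure of about 2 LEDs per LPD8806 chip. The following is a rough sketch under those stated figures only. The helper names to_7bit and chips_needed are made up for illustration, and halving an 8-bit value by a right shift is one simple way to scale it, not necessarily what any particular driver library does.

```python
PWM_BITS = 7       # per-channel resolution stated in the text
LEDS_PER_CHIP = 2  # "about 2 LEDs" per LPD8806 chip, per the text


def to_7bit(value_8bit):
    """Scale an 8-bit channel value (0-255) down to 7 bits (0-127)."""
    if not 0 <= value_8bit <= 255:
        raise ValueError("channel value must be in 0..255")
    return value_8bit >> (8 - PWM_BITS)


def chips_needed(n_leds):
    """Number of driver chips needed for n_leds, rounding up."""
    return -(-n_leds // LEDS_PER_CHIP)


print([to_7bit(v) for v in (255, 128, 0)])
print(chips_needed(64))
```

Seven bits give 128 brightness levels per channel, so a color picked on an 8-bit scale loses its least significant bit when it is sent to the strip.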
Lighting up the RGB LED strip

There are two ways of connecting the RGB LED strip. It can either be connected to an Arduino and controlled by the Raspberry Pi, or it can be controlled by the Raspberry Pi directly.

An Arduino-based control

It is assumed that you are familiar with programming microcontrollers, especially those on the Arduino platform.

[Figure: An Arduino connection to the digitally addressable interface]

In the preceding figure, the LED strip is powered by an external power supply (the tiny green adapter represents the external power supply). The recommended power supply for the RGB LED strip is 5V/2A per meter of LEDs (while writing this article, we used an old computer power supply to power up the LEDs). The Clock pin (CI) and the Data pin (DI) of the first segment of the RGB strip are connected to the pins D2 and D3 respectively. (We are doing this since we will test the example from Adafruit Industries. The example is available at https://github.com/adafruit/LPD8806/tree/master/examples.) Since the RGB strip consists of multiple segments that are serially connected, the Clock Out (CO) and Data Out (DO) pins of the first segment are connected to the Clock In (CI) and Data In (DI) pins of the second segment, and so on.

Let's review the example, strandtest.pde, to test the RGB LED strip. The example makes use of software SPI (bit banging of the clock and data pins for lighting effects). It is also possible to use the SPI interface of the Arduino platform. In the example, we need to set the number of LEDs used for the test. For example, we need to set the number of LEDs on the strip to 64 for a two-meter strip. To do this, change the following line:

    int nLEDs = 64;

Once the code is uploaded, the RGB matrix should light up, as shown in this image:

[Figure: 8 x 8 RGB matrix lit up]

Let's quickly review the Arduino sketch from Adafruit. We get started by setting up an LPD8806 object as follows:

    // nLEDs refers to the number of LEDs in the strip.
    // This cannot exceed 160 LEDs/5m due to current draw.
    LPD8806 strip = LPD8806(nLEDs, dataPin, clockPin);

In the setup() section of the Arduino sketch, we initialize the RGB strip as follows:

    // Start up the LED strip
    strip.begin();
    // Update the strip, to start they are all 'off'
    strip.show();

As soon as we enter the main loop, routines such as colorChase and rainbow are executed. We can extend this Arduino sketch with serial port commands so that the lighting routines can be controlled from the Raspberry Pi.

This task merely provides some ideas for connecting and lighting up the RGB LED strip. You should familiarize yourself with the working principles of the RGB LED strip. The Raspberry Pi has an SPI port, and hence, it is possible to control the RGB strip directly from the Raspberry Pi.

Objective complete – mini debriefing

In this task, we reviewed options for decorative lighting and controlling them using the Raspberry Pi and Arduino.

Interface of an audio device

In this task, we will work on installing MP3 and WAV file audio player tools on the Raspbian operating system.

Prepare for lift off

The Raspberry Pi is equipped with a 3.5mm audio jack, and speakers can be connected to that output. In order to get started, we install the ALSA utilities package and a command-line MP3 player:

    sudo apt-get install alsa-utils
    sudo apt-get install mpg321

Engage thrusters

In order to use the alsa-utils or mpg321 players, we have to activate the BCM2835's sound drivers, and this can be done using the modprobe command:

    sudo modprobe snd_bcm2835

After activating the drivers, it is possible to play WAV files using the aplay command (aplay is a command-line player available as part of the alsa-utils package):

    aplay testfile.wav

An MP3 file can be played using the mpg321 command (a command-line MP3 player):

    mpg321 testfile.mp3

In the preceding examples, the commands were executed in the directory where the WAV file or the MP3 file was located.
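Since aplay handles WAV files and mpg321 handles MP3 files, a script that plays sounds has to pick the right tool for each file. The following is a minimal sketch of that choice. The helper names player_command and play are made up for illustration, and mapping by file extension is an assumption rather than something the tools require.

```python
import os
import subprocess

# Players named in the text; choosing between them by extension is an assumption.
PLAYERS = {".wav": "aplay", ".mp3": "mpg321"}


def player_command(path):
    """Return the argument list that would play `path`."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in PLAYERS:
        raise ValueError("unsupported audio type: " + ext)
    return [PLAYERS[ext], path]


def play(path):
    """Run the player and wait for it to finish (needs the tools installed)."""
    return subprocess.call(player_command(path))


print(player_command("testfile.wav"))
print(player_command("testfile.mp3"))
```

Passing the command as a list avoids going through a shell, so file names with spaces need no extra quoting.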
In the Linux environment, it is possible to stop playing a file by pressing CTRL + C.

Objective complete – mini debriefing

We were able to install sound utilities in this task. Later, we will use the installed utilities to play audio from a web page. It is also possible to play sound files on the Raspberry Pi using modules available in Python. Some examples include the Snack sound toolkit, Pygame, and so on.

Installing the web server

In this section, we will install a local web server on the Raspberry Pi. There are different web server frameworks that can be installed on the Raspberry Pi. They include Apache v2.0, Boost, the REST framework, and so on.

Prepare for lift off

As mentioned earlier, we will build a web server based on the web.py framework. This section is entirely referenced from the web.py tutorials (http://webpy.github.io/). In order to install web.py, a Python module installer such as pip or easy_install is required. We will install easy_install (part of the setuptools package) using the following command:

    sudo apt-get install python-setuptools

Engage thrusters

The web.py framework can be installed using the easy_install tool:

    sudo easy_install web.py

Once the installation is complete, it is time to test it with a Hello World! example. We will open a new file using the text editor available with Python IDLE and get started with a Hello World! example for the web.py framework using the following steps:

1. The first step is to import the web.py framework:

    import web

2. The next step is defining the class that will handle the landing page. In this case, it is index:

    urls = ('/', 'index')

3. We need to define what needs to be done when one tries to access the URL. We would like to return the Hello world! text:

    class index:
        def GET(self):
            return "Hello world!"
4. The next step is to ensure that a web page is set up using the web.py framework when the Python script is launched:

    if __name__ == '__main__':
        app = web.application(urls, globals())
        app.run()

When everything is put together, the following code is what we'll see:

    import web

    urls = ('/', 'index')

    class index:
        def GET(self):
            return "Hello world!"

    if __name__ == '__main__':
        app = web.application(urls, globals())
        app.run()

We should be able to start the web page by executing the Python script:

    python helloworld.py

We should be able to launch the website from the IP address of the Raspberry Pi. For example, if the IP address is 10.0.0.10, the web page can be accessed at http://10.0.0.10:8080, and it displays the text Hello world!. Yay!

[Figure: A Hello world! example using the web.py framework]

Objective complete – mission debriefing

We built a simple web page to display the Hello world! text. In the next task, we will be interfacing the Christmas tree and other decorative appliances to our web page so that we can control them from anywhere on the local network.

It is possible to change the default port number for the web page by launching the Python script as follows:

    python helloworld.py 1234

Now, the web page can be accessed at http://<IP_Address_of_the_Pi>:1234.

Interfacing the web server

In this task, we will learn to interface one decorative appliance and a speaker. We will create a form and buttons on an HTML page to control the devices.

Prepare for lift off

In this task, we will review the code (available along with this article) required to interface decorative appliances and lighting arrangements to a web page so that they can be controlled over a local network. Let's get started by opening the file using a text editing tool (Python IDLE's text editor or any other text editor).
Engage thrusters

We will import the following modules to get started with the program:

    import web
    from web import form
    import RPi.GPIO as GPIO
    import os

Next, we initialize the GPIO module, set the pin numbering, ensure that all appliances are turned off by setting the GPIO pins to low (False), and declare any global variables:

    # Set the pin numbering
    GPIO.setmode(GPIO.BCM)
    # Initialize the pins that have to be controlled
    GPIO.setup(25, GPIO.OUT)
    GPIO.output(25, False)
    # Global variable that tracks whether the tree lights are on
    state = False

This is followed by defining the URL mapping and the template location:

    urls = ('/', 'index')
    render = web.template.render('templates')

The buttons used in the web page are also defined:

    appliances_form = form.Form(
        form.Button("appbtn", value="tree", class_="btntree"),
        form.Button("appbtn", value="Santa", class_="btnSanta"),
        form.Button("appbtn", value="audio", class_="btnaudio")
    )

In this example, we are using three buttons, and the name of each is appbtn. Each button is assigned a class and a value, and the value determines the desired action when the button is clicked. For example, when the Christmas tree button is clicked, the lights need to be turned on. This action can be executed based on the value that is returned during the button press.

The home page is defined in the index class. The GET method is used to render the web page and the POST method handles button click actions:

    class index:
        def GET(self):
            form = appliances_form()
            return render.index(form, "Raspberry Pi Christmas lights controller")

        def POST(self):
            global state
            userData = web.input()
            if userData.appbtn == "tree":
                state = not state
            elif userData.appbtn == "Santa":
                # do something here for another appliance
                pass
            elif userData.appbtn == "audio":
                os.system("mpg321 /home/pi/test.mp3")
            GPIO.output(25, state)
            raise web.seeother('/')

In the POST method, we need to monitor the button clicks and perform an action accordingly.
For example, when the button with the tree value is returned, we can change the Boolean value, state. This in turn switches the state of GPIO pin 25. Earlier, we connected the Power Switch Tail to pin 25.

The index page file that contains the form and buttons is as follows:

    $def with (form,title)
    <html>
    <head>
    <title>$title</title>
    <link rel="stylesheet" type="text/css" href="/static/styles.css">
    </head>
    <body>
    <P><center><H1>Christmas Lights Controller</H1></center>
    <br />
    <br />
    <form class="form" method="post">
    $:form.render()
    </form>
    </body>
    </html>

The styles of the buttons used on the web page are described as follows in styles.css:

    form .btntree {
        margin-left : 200px;
        margin-right : auto;
        background:transparent url("images/topic_button.png") no-repeat top left;
        width : 186px;
        height: 240px;
        padding : 0px;
        position : absolute;
    }
    form .btnSanta {
        margin-left : 600px;
        margin-right : auto;
        background:transparent url("images/Santa-png.png") no-repeat top left;
        width : 240px;
        height: 240px;
        padding : 40px;
        position : absolute;
    }
    body {
        background-image:url('bg-snowflakes-3.gif');
    }

The web page looks like what is shown in the following figure:

[Figure: The Christmas lights controller web page]

Yay! We have a Christmas lights controller interface.

Objective complete – mini debriefing

We have written a simple web page that interfaces a Christmas tree and an RGB tree and plays MP3 files. This is a great project for a holiday weekend. It is possible to view this web page from anywhere on the Internet and turn these appliances on and off. (Fans of the TV show The Big Bang Theory might like this idea. Step-by-step instructions for setting it up are available at http://www.everydaylinuxuser.com/2013/06/connecting-to-raspberry-pi-from-outside.html.)
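The decision made in the POST method (toggle the lights for tree, play a file for audio, do nothing yet for Santa) can be separated from web.py and the GPIO pins so that it can be checked on any machine. The following is a minimal sketch of that idea, not the authors' code: the function name handle_button and its return convention are made up for illustration, while the button values and the MP3 path are the ones used in this article.

```python
def handle_button(value, state):
    """Return (new_state, command) for a clicked button value.

    `state` is the current on/off state of the tree lights. `command` is an
    argument list to run, or None when there is nothing to execute.
    """
    if value == "tree":
        return (not state, None)
    if value == "audio":
        return (state, ["mpg321", "/home/pi/test.mp3"])
    # "Santa" and unknown values leave everything unchanged for now.
    return (state, None)


state = False
state, command = handle_button("tree", state)
print(state, command)
state, command = handle_button("audio", state)
print(state, command)
```

Inside a web.py POST method, the returned state could then be written to pin 25 and the command, if any, handed to a process launcher, which keeps the hardware calls in one place.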
Summary

In this article, we have accomplished the following:
- Interfacing the RGB matrix
- Interfacing AC appliances to the Raspberry Pi
- Designing a web page
- Interfacing devices to the web page

Resources for Article:

Further resources on this subject:
- Raspberry Pi Gaming Operating Systems [article]
- Calling your fellow agents [article]
- Our First Project – A Basic Thermometer [article]

Packt
18 Aug 2015
10 min read

The ROS Filesystem levels

In this article by Enrique Fernández, Luis Sánchez Crespo, Anil Mahtani, and Aaron Martinez, authors of the book Learning ROS for Robotics Programming - Second Edition, you will see the different levels of the filesystem in ROS.

When you start to use or develop projects with ROS, this concept can sound strange in the beginning, but you will become familiar with it over time. Similar to an operating system, a ROS program is divided into folders, and these folders have files that describe their functionalities:

- Packages: Packages form the atomic level of ROS. A package has the minimum structure and content to create a program within ROS. It may have ROS runtime processes (nodes), configuration files, and so on.
- Package manifests: Package manifests provide information about a package, licenses, dependencies, compilation flags, and so on. A package manifest is managed with a file called package.xml.
- Metapackages: When you want to aggregate several packages in a group, you will use metapackages. In ROS Fuerte, this form of ordering packages was called Stacks. To maintain the simplicity of ROS, the stacks were removed, and now metapackages fulfill this function. In ROS, there are a lot of these metapackages; one of them is the navigation stack.
- Metapackage manifests: Metapackage manifests (package.xml) are similar to a normal package manifest but with an export tag in XML. They also have certain restrictions in their structure.
- Message (msg) types: A message is the information that a process sends to other processes. ROS has a lot of standard types of messages. Message descriptions are stored in my_package/msg/MyMessageType.msg.
- Service (srv) types: Service descriptions, stored in my_package/srv/MyServiceType.srv, define the request and response data structures for services provided by each process in ROS.
The workspace

Basically, the workspace is a folder where we keep packages, edit the source files, and compile packages. It is useful when you want to compile various packages at the same time, and it is a good place to keep all our developments in one location.

A typical workspace is shown in the following screenshot. Each folder is a different space with a different role:

- The Source space: In the Source space (the src folder), you put your packages, projects, cloned packages, and so on. One of the most important files in this space is CMakeLists.txt. The src folder has this file because it is invoked by CMake when you configure the packages in the workspace. This file is created with the catkin_init_workspace command.
- The Build space: In the build folder, CMake and catkin keep the cache information, configuration, and other intermediate files for our packages and projects.
- The Development (devel) space: The devel folder is used to keep the compiled programs. This is used to test the programs without the installation step. Once the programs are tested, you can install or export the package to share it with other developers.

You have two options with regard to building packages with catkin. The first one is to use the standard CMake workflow. With this, you can compile one package at a time, as shown in the following commands:

    $ cmake packageToBuild/
    $ make

If you want to compile all your packages, you can use the catkin_make command line, as shown in the following commands:

    $ cd workspace
    $ catkin_make

Both commands build the executables in the build space directory configured in ROS.

Another interesting feature of ROS is its overlays. When you are working with a ROS package, for example, Turtlesim, you can use the installed version, or you can download the source files and compile them to use your modified version. ROS permits you to use your version of this package instead of the installed version.
This is very useful if you are working on an upgrade of an installed package.

Packages

Usually, when we talk about packages, we refer to a typical structure of files and folders. This structure looks as follows:

- include/package_name/: This directory includes the headers of the libraries that you would need.
- msg/: If you develop nonstandard messages, put them here.
- scripts/: These are executable scripts that can be in Bash, Python, or any other scripting language.
- src/: This is where the source files of your programs are present. You can create a folder for nodes and nodelets or organize it as you want.
- srv/: This represents the service (srv) types.
- CMakeLists.txt: This is the CMake build file.
- package.xml: This is the package manifest.

To create, modify, or work with packages, ROS gives us tools for assistance, some of which are as follows:

- rospack: This command is used to get information or find packages in the system.
- catkin_create_pkg: This command is used when you want to create a new package.
- catkin_make: This command is used to compile a workspace.
- rosdep: This command installs the system dependencies of a package.
- rqt_dep: This command is used to see the package dependencies as a graph. You will also find a plugin called package graph in rqt. Select a package and see the dependencies.

To move between packages and their folders and files, ROS gives us a very useful package called rosbash, which provides commands that are very similar to Linux commands. The following are a few examples:

- roscd: This command helps us change the directory. This is similar to the cd command in Linux.
- rosed: This command is used to edit a file.
- roscp: This command is used to copy a file from a package.
- rosd: This command lists the directories of a package.
- rosls: This command lists the files from a package. This is similar to the ls command in Linux.
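The typical package structure listed above can be checked with a few lines of Python. This is a rough sketch for illustration only, not a ROS tool: the helper names looks_like_package and missing_parts are made up, and treating package.xml plus CMakeLists.txt as the expected minimum is an assumption based on the structure list.

```python
import os
import tempfile


def looks_like_package(path):
    """True when `path` contains a package.xml manifest."""
    return os.path.isfile(os.path.join(path, "package.xml"))


def missing_parts(path, expected=("package.xml", "CMakeLists.txt")):
    """List the expected files that are absent from `path`."""
    return [name for name in expected
            if not os.path.exists(os.path.join(path, name))]


# Demo on a throwaway folder that only has a manifest.
with tempfile.TemporaryDirectory() as pkg:
    open(os.path.join(pkg, "package.xml"), "w").close()
    print(looks_like_package(pkg))
    print(missing_parts(pkg))
```

For real work, rospack remains the tool to use, since it also knows about the package path configured in the ROS environment.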
The package.xml file must be in a package, and it is used to specify information about the package. If you find this file inside a folder, this folder is probably a package or a metapackage.

If you open the package.xml file, you will see information about the name of the package, dependencies, and so on. All of this is to make the installation and the distribution of these packages easy.

Two typical tags that are used in the package.xml file are <build_depend> and <run_depend>. The <build_depend> tag shows what packages must be installed before installing the current package. This is because the new package might use a functionality of another package. The <run_depend> tag shows the packages that are needed to run the code of the package.

Metapackages

As we have shown earlier, metapackages are special packages with only one file inside; this file is package.xml. This package does not have other files, such as code, includes, and so on.

Metapackages are used to refer to other packages that are normally grouped by a common functionality, for example, the navigation stack, ros_tutorials, and so on.

You can convert your stacks and packages from ROS Fuerte to Hydro and catkin using certain rules for migration. These rules can be found at http://wiki.ros.org/catkin/migrating_from_rosbuild.

In the following screenshot, you can see the content of the package.xml file in the ros_tutorials metapackage. You can see the <export> tag and the <run_depend> tag. These are necessary in the package manifest.

If you want to locate the ros_tutorials metapackage, you can use the following command:

    $ rosstack find ros_tutorials

The output will be a path, such as /opt/ros/hydro/share/ros_tutorials. To see the code inside, you can use the following command line:

    $ vim /opt/ros/hydro/share/ros_tutorials/package.xml

Remember that Hydro uses metapackages, not stacks, but the rosstack find command line still works to find metapackages.

Messages

ROS uses a simplified message description language to describe the data values that ROS nodes publish.
With this description, ROS can generate the right source code for these types of messages in several programming languages.

ROS has a lot of predefined messages, but if you develop a new message, it will be in the msg/ folder of your package. Inside that folder, files with the .msg extension define the messages.

A message description can have two kinds of entries: fields and constants. Fields define the data to be transmitted in the message. Each field has a type, for example, int32, float32, and string, or new types that you have created earlier, such as type1 and type2, followed by a name. Constants define fixed values that help to interpret those fields.

An example of a msg file is as follows:

    int32 id
    float32 vel
    string name

In ROS, you can find a lot of standard types to use in messages, as shown in the following table:

Primitive type | Serialization | C++ | Python
bool | unsigned 8-bit int | uint8_t | bool
int8 | signed 8-bit int | int8_t | int
uint8 | unsigned 8-bit int | uint8_t | int
int16 | signed 16-bit int | int16_t | int
uint16 | unsigned 16-bit int | uint16_t | int
int32 | signed 32-bit int | int32_t | int
uint32 | unsigned 32-bit int | uint32_t | int
int64 | signed 64-bit int | int64_t | long
uint64 | unsigned 64-bit int | uint64_t | long
float32 | 32-bit IEEE float | float | float
float64 | 64-bit IEEE float | double | float
string | ascii string | std::string | string
time | secs/nsecs signed 32-bit ints | ros::Time | rospy.Time
duration | secs/nsecs signed 32-bit ints | ros::Duration | rospy.Duration

A special type in ROS is the header type. This is used to add the time, frame, and so on. It permits you to have the messages numbered, to see who is sending the message, and to have more functions that are transparent to the user and that ROS handles.

The header type contains the following fields:

    uint32 seq
    time stamp
    string frame_id

You can see the structure using the following command:

    $ rosmsg show std_msgs/Header

Thanks to the header type, it is possible to record the timestamp and frame of what is happening with the robot.

In ROS, there exist tools to work with messages.
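The field layout of a message description, a type followed by a name on each line, is simple enough to read with a few lines of Python. This is a simplified sketch for illustration and not how the ROS message generators work: the function name parse_msg is made up, and constants, default values, and array types are deliberately not handled.

```python
def parse_msg(text):
    """Return a list of (type, name) pairs from a .msg-style description."""
    fields = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("expected '<type> <name>': " + raw)
        fields.append((parts[0], parts[1]))
    return fields


# The example message from the text.
EXAMPLE = "int32 id\nfloat32 vel\nstring name\n"
print(parse_msg(EXAMPLE))

# The header fields from the text.
HEADER = "uint32 seq\ntime stamp\nstring frame_id\n"
print(parse_msg(HEADER))
```

Keeping the type and the name as a pair mirrors how each field is later turned into a typed member in the generated C++ or Python code.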
The rosmsg tool prints out the message definition information and can find the source files that use a message type. In the upcoming sections, we will see how to create messages with the right tools.

Services

ROS uses a simplified service description language to describe ROS service types. This builds directly upon the ROS msg format to enable request/response communication between nodes. Service descriptions are stored in .srv files in the srv/ subdirectory of a package.

To call a service, you need to use the package name, along with the service name; for example, you will refer to the sample_package1/srv/sample1.srv file as sample_package1/sample1.

There are tools to work with services. The rossrv tool prints out the service descriptions and the packages that contain the .srv files, and finds source files that use a service type.

If you want to create a service, ROS can help you with the service generator. These tools generate code from an initial specification of the service. You only need to add the gensrv() line to your CMakeLists.txt file.

Summary

In this article, we saw the different types of filesystems present in the ROS architecture.

Resources for Article:

Further resources on this subject:
• Building robots that can walk [article]
• Avoiding Obstacles Using Sensors [article]
• Managing Test Structure with Robot Framework [article]
Packt
04 Jun 2015
8 min read

Deploying New Hosts with vCenter

In this article by Konstantin Kuminsky, author of the book VMware vCenter Cookbook, we will review some options and features available in vCenter to improve an administrator's efficiency.

Deploying new hosts faster with scripted installation

Scripted installation is an alternative way to deploy ESXi hosts. It can be used when several hosts need to be deployed or upgraded. The installation script contains ESXi settings and can be accessed by a host during the ESXi boot from the following locations:

• FTP
• HTTP or HTTPS
• NFS
• USB flash drive or CD-ROM

How to do it...

The following sections describe the process of creating an installation script and using it to boot the ESXi host.

Creating an installation script

An installation script contains installation options for ESXi. It's a text file with the .cfg extension. The best way to create an installation script is to use the default script supplied with the ESXi installer and modify it. The default script is located in the /etc/vmware/weasel/ folder and is called ks.cfg. Commands that can be modified include, but are not limited to:

• The install, installorupgrade, or upgrade commands, which define the disk on which ESXi will be installed or upgraded. The available options are:
  • --disk: The disk name, which can be specified as a path (/vmfs/devices/disks/vmhbaX:X:X), a VML name (vml.xxxxxxxx), or a LUN UID (vmkLUN_UID)
  • --overwritevmfs: Wipes the existing datastore
  • --preservevmfs: Keeps the existing datastore
  • --novmfsondisk: Prevents a new VMFS partition from being created on the disk
• The network command, which specifies the network settings.
  Most of the available options are self-explanatory:
  • --bootproto=[dhcp|static]
  • --device: MAC address of the NIC to use
  • --ip
  • --gateway
  • --nameserver
  • --netmask
  • --hostname
  • --vlanid

A full list of installation and upgrade commands can be found in the vSphere 5 documentation on the VMware website at https://www.vmware.com/support/pubs/.

Use the installation script to configure ESXi

In order to use the installation script, you will need to use additional ESXi boot options:

1. Boot a host from the ESXi installation disk.
2. When the ESXi installer screen appears, press Shift + O to provide additional boot options.
3. In the command prompt, type the following:

    ks=<location of the script> <additional boot options>

The valid locations are as follows:

• ks=cdrom:/path
• ks=file://path
• ks=protocol://path
• ks=usb:/path

The additional options available are as follows:

• gateway: The default gateway
• ip: The IP address
• nameserver: The DNS server
• netmask: The subnet mask
• vlanid: The VLAN ID
• netdevice: The MAC address of the NIC to use
• bootif: The MAC address of the NIC to use, in PXELINUX format

For example, for the HTTP location, the command will look like this:

    ks=http://XX.XX.XX.XX/scripts/ks-v1.cfg nameserver=XX.XX.XX.XX ip=XX.XX.XX.XX netmask=255.255.255.0 gateway=XX.XX.XX.XX

Deploying new hosts faster with Auto Deploy

vSphere Auto Deploy is VMware's solution to simplify the deployment of large numbers of ESXi hosts. It is one of the available options for ESXi deployment, along with interactive and scripted installation.

The main difference between Auto Deploy and the other deployment options is that the ESXi configuration is not stored on the host's disk. Instead, it's managed with image and host profiles by the Auto Deploy server.

Getting ready

Before using Auto Deploy, confirm the following:

• The Auto Deploy server is installed and registered with vCenter.
  It can be installed as a standalone server or as part of the vCenter installation.
• A DHCP server exists in the environment.
• The DHCP server is configured to point to the TFTP server for PXE boot (option 66) with the boot filename undionly.kpxe.vmw-hardwired (option 67).
• The TFTP server that will be used for PXE boot exists and is configured properly.
• The machine where Auto Deploy cmdlets will run has the following installed:
  • Microsoft .NET 2.0 or later
  • PowerShell 2.0 or later
  • PowerCLI including Auto Deploy cmdlets

New hosts that will be provisioned with Auto Deploy must:

• Meet the hardware requirements for ESXi 5
• Have network connectivity to vCenter, preferably 1 Gbps or higher
• Have PXE boot enabled

How to do it...

Once the prerequisites are met, the following steps are required to start deploying hosts.

Configuring the TFTP server

In order to configure the TFTP server with the correct boot image for ESXi, execute the following steps:

1. In vCenter, go to Home | Auto Deploy.
2. Switch to the Administration tab.
3. From the Auto Deploy page, click on Download TFTP Boot ZIP.
4. Download the file and unzip it to the appropriate folder on the TFTP server.

Creating an image profile

Image profiles are created using Image Builder PowerCLI cmdlets. Image Builder requires PowerCLI and can be installed on a machine that's used to run administrative tasks. It doesn't have to be a vCenter server or Auto Deploy server; the only requirement for this machine is that it must have access to the software depot, a file server that stores image profiles.

Image profiles can be created from scratch or by cloning an existing profile. The following steps outline the process of creating an image profile by cloning. The steps assume that:

• The Image Builder has been installed.
• The appropriate software depot has been downloaded from the VMware website by going to http://www.vmware.com/downloads and searching for the software depot.
Cloning an existing profile included in the depot is the easiest way to create a new profile. The steps to do so are as follows:

1. Add a depot with the image profile to be cloned:

    Add-EsxSoftwareDepot -DepotUrl <Path to softwaredepot>

2. Find the name of the profile to be cloned using Get-EsxImageProfile.
3. Clone the profile:

    New-EsxImageProfile -CloneProfile <Existing profile name> -Name <New profile name>

4. Add a software package to the new image profile:

    Add-EsxSoftwarePackage -ImageProfile <New profile name> -SoftwarePackage <Package>

At this point, the software package will be validated. If there are errors, or dependencies that need to be resolved, an appropriate message will be displayed.

Assigning an image profile to hosts

To create a rule that assigns an image profile to a host, execute the following steps:

1. Connect to vCenter with PowerCLI:

    Connect-VIServer <vCenter IP or FQDN>

2. Add the software depot with the correct image profile to the PowerCLI session:

    Add-EsxSoftwareDepot <depot URL>

3. Locate the image profile using the Get-EsxImageProfile cmdlet.
4. Define a rule that assigns hosts with certain attributes to an image profile. For example, for hosts with IP addresses in a range, run the following commands:

    New-DeployRule -Name <Rule name> -Item <Profile name> -Pattern "ipv4=192.168.1.10-192.168.1.20"
    Add-DeployRule <Rule name>

Assigning a host profile to hosts

Optionally, an existing host profile can be assigned to hosts. To accomplish this, execute the following steps:

1. Connect to vCenter with PowerCLI:

    Connect-VIServer <vCenter IP or FQDN>

2. Locate the host profile name using the Get-VMHostProfile command.
3. Define a rule that assigns hosts with certain attributes to a host profile.
   For example, for hosts with IP addresses in a range, run the following commands:

    New-DeployRule -Name <Rule name> -Item <Profile name> -Pattern "ipv4=192.168.1.10-192.168.1.20"
    Add-DeployRule <Rule name>

Assigning a host to a folder or cluster in vCenter

To make sure a host is placed in a certain folder or cluster once it boots, do the following:

1. Connect to vCenter with PowerCLI:

    Connect-VIServer <vCenter IP or FQDN>

2. Define a rule that assigns hosts with certain attributes to a folder or cluster. For example, for hosts with IP addresses in a range, run the following commands:

    New-DeployRule -Name <Rule name> -Item <Folder name> -Pattern "ipv4=192.168.1.10-192.168.1.20"
    Add-DeployRule <Rule name>

If a host is assigned to a cluster, it inherits that cluster's host profile.

How it works...

Auto Deploy uses PXE boot to connect to the Auto Deploy server and get an image profile, a vCenter location, and optionally, host profiles. The detailed process is as follows:

1. The host gets the gPXE executable and gPXE configuration files from the PXE TFTP server.
2. As gPXE executes, it uses instructions from the configuration file to query the Auto Deploy server for specific information.
3. The Auto Deploy server returns the requested information specified in the image and host profiles.
4. The host boots using this information.
5. Auto Deploy adds the host to the specified vCenter server.
6. The host is placed in maintenance mode when additional information, such as an IP address, is required from the administrator. To exit maintenance mode, the administrator will need to provide this information and reapply the host profile.

When a new host boots for the first time, vCenter creates a new object and stores it together with the host and image profiles in the database. For any subsequent reboots, the existing object is used to get the correct host profile and any changes that have been made.
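To make the ipv4 range pattern used in the deploy rules above more concrete, here is a minimal Python sketch of which addresses such a range covers. It only illustrates the range semantics and is not how Auto Deploy evaluates rules; the helper name ip_in_pattern is hypothetical.

```python
import ipaddress


def ip_in_pattern(ip, pattern):
    """Return True if ip falls inside an 'ipv4=<first>-<last>' range pattern.

    Hypothetical helper for illustration only; Auto Deploy itself
    evaluates the rule patterns on the server side.
    """
    key, _, value = pattern.partition("=")
    if key != "ipv4":
        raise ValueError("only ipv4 range patterns are handled in this sketch")
    first, _, last = value.partition("-")
    low = ipaddress.IPv4Address(first)
    # a pattern without '-' is treated as a single address
    high = ipaddress.IPv4Address(last or first)
    return low <= ipaddress.IPv4Address(ip) <= high
```

A check like this can be handy when planning address ranges for rules, before any host is actually booted against them.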
More details can be found in the vSphere 5 documentation on the VMware website at https://www.vmware.com/support/pubs/.

Summary

In this article, we learned how new hosts can be deployed with scripted installation and Auto Deploy techniques.

Resources for Article:

Further resources on this subject:
• VMware vRealize Operations Performance and Capacity Management [Article]
• Backups in the VMware View Infrastructure [Article]
• Application Packaging in VMware ThinApp 4.7 Essentials [Article]
Packt
19 Nov 2009
10 min read

Plotting data using Matplotlib: Part 1

The examples are:

• Plotting data from a database
• Plotting data from a web page
• Plotting the data extracted by parsing an Apache log file
• Plotting the data read from a comma-separated values (CSV) file
• Plotting extrapolated data using curve fitting
• Third-party tools using Matplotlib (NetworkX and mpmath)

Let's begin.

Plotting data from a database

Databases often tend to collect much more information than we can simply extract and watch in a tabular format (let's call it the "Excel sheet" report style).

Databases not only use efficient techniques to store and retrieve data, but they are also very good at aggregating it.

One suggestion we can give is to let the database do the work. For example, if we need to sum up a column, let's make the database sum the data, and not sum it up in the code. In this way, the whole process is much more efficient because:

• There is a smaller memory footprint for the Python code, since only the aggregate value is returned, not the whole result set needed to generate it
• The database has to read all the rows in any case. However, if it's smart enough, then it can sum values up as they are read
• The database can efficiently perform such an operation on more than one column at a time

The data source we're going to query is from an open source project: the Debian distribution. Debian has an interesting project called UDD, Ultimate Debian Database, which is a relational database where a lot of information (either historical or current) about the distribution is collected and can be analyzed.
On the project website, http://udd.debian.org/, we can find a full dump of the database (quite big, honestly) that can be downloaded and imported into a local PostgreSQL instance (refer to http://wiki.debian.org/UltimateDebianDatabase/CreateLocalReplica for import instructions).

Now that we have a local replica of UDD, we can start querying it:

    # module to access PostgreSQL databases
    import psycopg2
    # matplotlib pyplot module
    import matplotlib.pyplot as plt

Since UDD is stored in a PostgreSQL database, we need psycopg2 to access it. psycopg2 is a third-party module available at http://initd.org/projects/psycopg

    # connect to UDD database
    conn = psycopg2.connect(database="udd")
    # prepare a cursor
    cur = conn.cursor()

We will now connect to the database server to access the udd database instance, and then open a cursor on the connection just created.

    # this is the query we'll be making
    query = """
    select to_char(date AT TIME ZONE 'UTC', 'HH24'), count(*)
      from upload_history
     where to_char(date, 'YYYY') = '2008'
     group by 1
     order by 1"""

We have prepared the select statement to be executed on UDD. What we wish to do here is extract the number of packages uploaded to the Debian archive (per hour) in the whole year of 2008.

• date AT TIME ZONE 'UTC': As the date field is of the type timestamp with time zone, it also contains time zone information, while we want something independent from the local time. This is the way to get a date in the UTC time zone.
• group by 1: This is what we have encouraged earlier, that is, let the database do the work. We let the query return the already aggregated data, instead of coding it into the program.

    # execute the query
    cur.execute(query)
    # retrieve the whole result set
    data = cur.fetchall()

We execute the query and fetch the whole result set from it.
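The advice to let the database do the aggregation can be tried without a UDD replica. The following is a minimal sketch that uses Python's built-in sqlite3 module and a tiny in-memory table as a stand-in for upload_history. The sample rows are invented for illustration, SQLite's strftime replaces PostgreSQL's to_char, and a column alias is used in place of "group by 1"; none of this comes from the original example.

```python
import sqlite3

# in-memory stand-in for the upload_history table (sample rows are made up)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table upload_history (date text)")
cur.executemany(
    "insert into upload_history values (?)",
    [
        ("2008-01-01 09:15:00",),
        ("2008-03-02 09:40:00",),
        ("2008-05-07 21:05:00",),
        ("2007-12-31 21:30:00",),  # outside 2008, filtered out by the query
    ],
)

# the database groups and counts; Python receives only one row per hour
query = """
select strftime('%H', date) as hour, count(*)
  from upload_history
 where strftime('%Y', date) = '2008'
 group by hour
 order by hour"""
cur.execute(query)
data = cur.fetchall()

# release the resources once the result set has been fetched
cur.close()
conn.close()
```

Here the Python side never sees the individual rows: whatever the size of the table, the result set holds at most 24 tuples, one per hour.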
    # close cursor and connection
    cur.close()
    conn.close()

Remember to always close the resources that we've acquired in order to avoid memory or resource leakage and reduce the load on the server (removing connections that aren't needed anymore).

    # unpack data in hours (first column) and
    # uploads (second column)
    hours, uploads = zip(*data)

The query result is a list of tuples (in this case, hour and number of uploads), but we need two separate lists: one for the hours and another with the corresponding number of uploads. zip() solves this: with *data, we unpack the list, passing the sublists as separate arguments to zip(), which in turn aggregates the elements in the same position in the parameters into separate lists. Consider the following example:

    In [1]: zip(['a1', 'a2'], ['b1', 'b2'])
    Out[1]: [('a1', 'b1'), ('a2', 'b2')]

To complete the code:

    # graph code
    plt.plot(hours, uploads)
    # set the x limits to the 'hours' limit
    plt.xlim(0, 23)
    # set the X ticks every 2 hours
    plt.xticks(range(0, 23, 2))
    # draw a grid
    plt.grid()
    # set title, X/Y labels
    plt.title("Debian packages uploads per hour in 2008")
    plt.xlabel("Hour (in UTC)")
    plt.ylabel("No. of uploads")

The previous code snippet is the standard plotting code, which results in the following screenshot:

From this graph we can see that in 2008, the main part of Debian packages uploads came from European contributors. In fact, uploads were made mainly in the evening hours (European time), after the working day is over (as we can expect from a voluntary project).

Plotting data from the Web

Often, the information we need is not distributed in an easy-to-use format such as XML or a database export, but is available only on websites, for example.

More and more often we find interesting data on a web page, and in that case we have to parse it to extract that information: this is called web scraping.

In this example, we will parse a Wikipedia article to extract some data to plot.
The article is at http://it.wikipedia.org/wiki/Demografia_d'Italia and contains lots of information about Italian demography (it's in Italian because the English version lacks a lot of data); in particular, we are interested in the population evolution over the years.

Probably the best known Python module for web scraping is BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/). It's a really nice library that gets the job done quickly, but there are situations (in particular with JavaScript embedded in the web page, such as for Wikipedia) that prevent it from working.

As an alternative, we find lxml quite productive (http://codespeak.net/lxml/). It's a library mainly used to work with XML (as the name suggests), but it can also be used with HTML (given their quite similar structures), and it is powerful and easy to use.

Let's dig into the code now:

    # to get the web pages
    import urllib2
    # lxml submodule for html parsing
    from lxml.html import parse
    # regular expression module
    import re
    # Matplotlib module
    import matplotlib.pyplot as plt

Along with the Matplotlib module, we need the following modules:

• urllib2: This is the module (from the standard library) that is used to access resources through URL (we will download the webpage with this).
• lxml: This is the parsing library.
• re: Regular expressions are needed to parse the returned data to extract the information we need. re is a module from the standard library, so we don't need to install a third-party module to use it.

    # general urllib2 config
    user_agent = 'Mozilla/5.0 (compatible; MSIE 5.5; Windows NT)'
    headers = { 'User-Agent' : user_agent }
    url = "http://it.wikipedia.org/wiki/Demografia_d'Italia"

Here, we prepare some configuration for urllib2: in particular, the user_agent header used to access Wikipedia, and the URL of the page.

    # prepare the request and open the url
    req = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(req)

Then we make a request for the URL and get the HTML back.
    # we parse the webpage, getroot() returns the document root
    doc = parse(response).getroot()

We parse the HTML using the parse() function of lxml.html and then we get the root element. XML can be seen as a tree, with a root element (the node at the top of the tree from where every other node descends), and a hierarchical structure of elements.

    # find the data table, using css elements
    table = doc.cssselect('table.wikitable')[0]

We leverage the structure of HTML, accessing the first element of type table of class wikitable, because that's the table we're interested in.

    # prepare data structures, will contain actual data
    years = []
    people = []

We prepare the lists that will contain the parsed data.

    # iterate over the rows of the table, except first and last ones
    for row in table.cssselect('tr')[1:-1]:

We can start parsing the table. Since there is a header and a footer in the table, we skip the first and the last line from the lines (selected by the tr tag) to loop over.

        # get the row cell (we will use only the first two)
        data = row.cssselect('td')

We get the elements with the td tag, which stands for table data: those are the cells in an HTML table.

        # the first cell is the year
        tmp_years = data[0].text_content()
        # cleanup for cases like 'YYYY[N]' (date + footnote link)
        tmp_years = re.sub(r'\[.\]', '', tmp_years)

We take the first cell, which contains the year, but we need to remove the additional characters (used by Wikipedia to link to footnotes). The brackets are escaped in the pattern so that it matches a literal '[N]' marker rather than a single '.' character.

        # the second cell is the population count
        tmp_people = data[1].text_content()
        # cleanup from '.', used as separator
        tmp_people = tmp_people.replace('.', '')

We also take the second cell, which contains the population for a given year. It's quite common in Italy to separate thousands in numbers with a '.' character: we have to remove them to have an appropriate value.
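The two cleanup steps can be tried in isolation, without downloading the page. The following is a minimal sketch written for Python 3; the function names and the sample strings are made up for illustration, and the pattern \[\d+\] is an assumption that also covers multi-digit footnote markers.

```python
import re


def clean_year(text):
    """Strip footnote markers such as '[1]' from a year cell and return an int."""
    # escaped brackets match a literal '[N]' marker, not a character class
    return int(re.sub(r'\[\d+\]', '', text).strip())


def clean_population(text):
    """Remove '.' thousands separators (Italian convention) and return an int."""
    return int(text.replace('.', '').strip())
```

Keeping the cleanup in small functions like these makes it easy to check the parsing rules against a few sample cells before running the whole scraper.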
        # append current data to data lists, converting to integers
        years.append(int(tmp_years))
        people.append(int(tmp_people))

We append the parsed values to the data lists, explicitly converting them to integer values.

    # plot data
    plt.plot(years, people)
    # ticks every 10 years
    plt.xticks(range(min(years), max(years), 10))
    plt.grid()
    # add a note for 2001 Census
    plt.annotate("2001 Census", xy=(2001, people[years.index(2001)]),
                 xytext=(1986, 54.5*10**6),
                 arrowprops=dict(arrowstyle='fancy'))

Running the example results in the following screenshot that clearly shows why the annotation is needed:

In 2001, we had a national census in Italy, and that's the reason for the drop in that year: the values released by the National Institute for Statistics (and reported in the Wikipedia article) are just an estimate of the population. However, with a census, we have a precise count of the people living in Italy.

Plotting data by parsing an Apache log file

Plotting data from a log file can be seen as the art of extracting information from it.

Every service has a log format different from the others. There are some exceptions of similar or same format (for example, for services that come from the same development teams) but then they may be customized and we're back at the beginning.

The main differences in log files are:

• Fields orders: Some have time information at the beginning, others in the middle of the line, and so on
• Fields types: We can find several different data types such as integers, strings, and so on
• Fields meanings: For example, log levels can have very different meanings

From all the data contained in the log file, we need to extract the information we are interested in from the surrounding data that we don't need (and hence we skip).

In our example, we're going to analyze the log file of one of the most common services: Apache. In particular, we will parse the access.log file to extract the total number of hits and amount of data transferred per day.
Apache is highly configurable, and so is the log format. Our Apache configuration, contained in the httpd.conf file, has this log format:

    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

This is in the LogFormat specification, where:

    Log directive     Description
    %h                The host making the request
    %l                Identity of the client (which is usually not available)
    %u                User making the request (usually not available)
    %t                The time the request was received
    %r                The request
    %>s               The status code
    %b                The size (in bytes) of the response sent to the client (excluding the headers)
    %{Referer}i       The page from where the request originated (for example, the HTML page where a PNG image is requested)
    %{User-Agent}i    The user agent used to make the request
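As a minimal sketch of how lines in this format could be reduced to hits and bytes per day, the following Python 3 code matches each line against a regular expression that mirrors the directives above. The regular expression, the function name hits_and_bytes_per_day, and the sample log lines are assumptions made for illustration, not code from the original example.

```python
import re
from collections import defaultdict
from datetime import datetime

# regex mirroring: %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
LINE_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_and_bytes_per_day(lines):
    """Return {date: [hits, bytes]} for every line that matches LINE_RE."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the expected format
        # %t looks like 10/Oct/2009:13:55:36 +0200; keep only the date part
        day_text = match.group('time').split(':', 1)[0]
        day = datetime.strptime(day_text, '%d/%b/%Y').date()
        size = match.group('size')
        totals[day][0] += 1
        # %b is logged as '-' when no bytes were sent
        totals[day][1] += 0 if size == '-' else int(size)
    return dict(totals)


# made-up sample lines in the format described above
sample = [
    '127.0.0.1 - - [10/Oct/2009:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '127.0.0.1 - - [10/Oct/2009:14:01:02 +0200] "GET /logo.png HTTP/1.1" 304 - "http://localhost/index.html" "Mozilla/5.0"',
    '127.0.0.1 - - [11/Oct/2009:08:12:45 +0200] "GET /about.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]
result = hits_and_bytes_per_day(sample)
```

The dictionary keys can be sorted to obtain the x values for a plot, with the hits and bytes as two separate y series. Note that the month abbreviation is parsed according to the current locale, so this sketch assumes an English (or C) locale.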