Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-continuous-delivery-and-devops-faqs
Packt
04 Jan 2013
9 min read
Save for later

Continuous Delivery and DevOps FAQs

Packt
04 Jan 2013
9 min read
(For more resources related to this topic, see here.) What is continuous delivery and DevOps? Let's start by looking at each separately: Continuous delivery, more commonly referred to CD, is an agile way of working whereby quality products—normally software assets—can be developed, built, tested, and shipped in quick succession. CD is a refinement on traditional agile techniques such as scrum and lean. The main refinements being the ability to decrease the time between deliveries and the ability to do this many many times—continuously in fact. Agile delivery techniques encourage you to time-box the activities/steps needed to deliver software: definition, analysis, development, testing, and sign-off but this doesn't necessarily mean that you deliver as frequently as you would like. Most agile techniques encourage you to have "potentially shippable code" at the end of your iteration/sprint. CD encourages you to ship the finished code, as frequently as possible. To achieve this ability one must look at the process of writing software in a slightly different way. Instead of focusing on delivery of a feature or a project in one go, you need to look at how you can deliver it in smaller, incremental chunks. This can be easier said than done. DevOps is another way of working whereby developers and system operators work in harmony with little or no organizational barriers between them towards a common goal. This may not sound too radical however when you consider how mainstream organizations function—especially those within government, corporate, or financial sectors—and take into account the many barriers that are in place between the development and the operational teams, this can be quite a radical shift. A simple way to imagine this is to consider how a small start-up software house operates. They have a limited amount of resource and a minimum level of bureaucracy. Developing, shipping, and supporting software is normally seamless—sometimes being done be the same people. Now look at a corporate business with R&D and operations teams. These two parts of the organization are normally separated by rules, regulations, and red tape—in some cases they will also be separated by other part of the organization (QA teams, transition teams, and so on). DevOps encourages one to break through these barriers and create an environment whereby those creating the software and those supporting the software work together in close synergy and the lines between them become blurred. DevOps is more to do with hearts and minds than anything else so can be quite tough to implement when long established bad habits are in place. Do I have to use both? You don't necessarily need to have implemented both CD and DevOps to speed up the delivery of quality software but it is true to say that both complement each other very well. If you want to continuously deliver quality software you need to have a pretty slick and streamlined process for getting the software into the production environment. This will need very close working relationships between the development teams and the operations teams. The more they work as one unit, the more seamless the process. CD will give you the tools and best of breed practices to deliver quality software quickly. DevOps will help you establish the behaviors, culture, and ways of working to fully utilize CD. If you have the opportunity to implement both, it would be foolish not to at least try. Can CD and DevOps bring quantifiable business benefit? As with any business the idea is that you make more money than you spend. The quicker you can ship and sell your service/software/widget, the quicker you generate income. If you use this same model and apply it to a typical software project you may well notice that you are investing a vast amount of effort and cash simply getting your software complete and ready to ship—this is all cost. That's not the only thing that costs, releasing the software into the production environment are not free either. It can sometimes be more costly—especially if you need large QA teams, and release teams, and change management teams, and project coordination teams, and change control boards staffed by senior managers, and bug fixing teams and ... I think you got the idea. In simple terms being able to develop, test, and ship small changes frequently (say many times per day or week) will drastically reduce the cost of delivery and allow the business to start earning money sooner. Implementing CD and/or DevOps will help you in realizing this goal. Other business benefits include; reduction in risk of release failures, reduction in delivering redundant changes (those that are no longer needed but released anyway), increased productivity (people spend less time releasing product and more time creating products), increases competitiveness, and so on. How does it fit with other product delivery methodologies? CD and DevOps should be seen as a set of tools that can be utilized to best fit your business needs. This is nothing new, any mature agile methodology should be used in this way. As such you should be able to "bolt" elements of CD and DevOps into your existing process. For example, if you currently use scrum as your product delivery methodology and have implemented CD, you could tweak your definition of done (DoD) to be "it is live" and incorporate checks against automated testing, continuous integration, and automated deployment within your acceptance criteria for each story. This therefore encourages the scrum team to work in such a way as to want to deliver their product to the live platform rather than have it sat potentially shippable waiting for someone to pick it up and release it—eventually. There may be some complexity when it comes to more traditional waterfall product delivery methodologies such as PRINCE2 or similar however there is always room for improvement—for example, the requirements gathering stage could include developers and operational team members to flesh out and understand any issues around releasing and supporting the solution when it goes live. It may not seem much but these sort of behaviors encourage closer working and collaboration. All in all CD and DevOps should be seen as complementary to your ways of working rather than a direct replacement—at least to begin with. Is it too radical for most businesses? For some organizations being able to fully implement the CD and DevOps utopia may be very complex and complicated, especially where rules, regulations, and corporate governance are strongly embedded. That isn't to say that it can't be done. Even the most cumbersome and sloth like institutions can and do benefit from utilizing CD and DevOps—the UK government for example. It does need more time and effort to prove the value of implementing all/some elements of CD and DevOps and those that are proposing such a shift really need to do their homework as there may well be high degrees of resistance to overcome. However, as more and more corporate, government organizations, and other large business embrace CD and DevOps, the body of proof to convince the doubters becomes easier to find. Look at your business and examine the pain points in the process for delivering your goods/services/software. If you can find a specific problem (or problems) which can be solved by implementing CD or DevOps then you will have a much clearer and more convincing business case. Where do I start? As with any change, you should start small and build up—incrementally. If you see an opportunity, say a small project or some organizational change within your IT dept, you should look into introducing elements of CD and DevOps. For some this may be in the form of kicking off a project to examine/build tooling to allow for automated testing, continuous integration, or automated deployments. For some it may be to break down some of the barriers between development and operational teams during the development of a specific project/product. You may be able to introduce CD and/or DevOps in one big bang but just consider how things go when you try and do that with a large software release—you don't want CD and DevOps to fail so maybe big bang is not the best way. All in all it depends on your needs, the maturity of your organization, and what problems you need CD and DevOps to solve. If you are convinced CD and DevOps will bring value and solve specific problems then the most important thing to do is start looking at ways to convince others. Do I just need to implement some tooling to get the benefits? This is a misconception which is normally held by those management types who have lost their connection to reality. They may have read something about CD and DevOps in some IT management periodical and decided that you just need to implement Jenkins and some automated tests. CD and DevOps do utilize tools but only where they bring value or solve a specific problem. Tooling alone will not remove ingrained business issues. Being able to build and test software quickly is good but without open, honest, and collaborative ways of working to ensure the code can be shipped into production just as quickly you will end up with another problem—a backlog of software which needs to be released. Sometimes tweaking an existing process through face to face discussion is all the tooling that is needed—for example changes to the change process to allow for manual release of software as soon as it's ready. Where can I find more information? There are many forms of information available in relation to CD and DevOps one of which is a small book entitled "Continuous Delivery and DevOps: A Quickstart guide" which is written by your truly and available here (http://www.packtpub.com/devops-quickstart-guide/book ) Summary In this article we saw some of the most frequently asked questions on Continuous Delivery and DevOps. Resources for Article : Further resources on this subject: Negotiation Strategy for Effective Implementation of COTS Software [Article] An Overview of Microsoft Sure Step [Article] ER Diagrams, Domain Model, and N-Layer Architecture with ASP.NET 3.5 (part1) [Article]
Read more
  • 0
  • 0
  • 10048

article-image-animating-panda3d
Packt
01 Mar 2011
8 min read
Save for later

Animating in Panda3D

Packt
01 Mar 2011
8 min read
Panda3D 1.6 Game Engine Beginner's Guide Create your own computer game with this 3D rendering and game development framework The first and only guide to building a finished game using Panda3D Learn about tasks that can be used to handle changes over time Respond to events like keyboard key presses, mouse clicks, and more Take advantage of Panda3D's built-in shaders and filters to decorate objects with gloss, glow, and bump effects Follow a step-by-step, tutorial-focused process that matches the development process of the game with plenty of screenshots and thoroughly explained code for easy pick up        Actors and Animations An Actor is a kind of object in Panda3D that adds more functionality to a static model. Actors can include joints within them. These joints have parts of the model tied to them and are rotated and repositioned by animations to make the model move and change. Actors are stored in .egg and .bam files, just like models. Animation files include information on the position and rotation of joints at specific frames in the animation. They tell the Actor how to posture itself over the course of the animation. These files are also stored in .egg and .bam files. Time for action – loading Actors and Animations Let's load up an Actor with an animation and start it playing to get a feel for how this works: Open a blank document in NotePad++ and save it as Anim_01.py in the Chapter09 folder. We need a few imports to start with. Put these lines at the top of the file: import direct.directbase.DirectStart from pandac.PandaModules import * from direct.actor.Actor import Actor We won't need a lot of code for our class' __init__ method so let's just plow through it here : class World: def __init__(self): base.disableMouse() base.camera.setPos(0, -5, 1) self.setupLight() self.kid = Actor("../Models/Kid.egg", {"Walk" : "../Animations/Walk.egg"}) self.kid.reparentTo(render) self.kid.loop("Walk") self.kid.setH(180) The next thing we want to do is steal our setupLight() method from the Track class and paste it into this class: def setupLight(self): primeL = DirectionalLight("prime") primeL.setColor(VBase4(.6,.6,.6,1)) self.dirLight = render.attachNewNode(primeL) self.dirLight.setHpr(45,-60,0) render.setLight(self.dirLight) ambL = AmbientLight("amb") ambL.setColor(VBase4(.2,.2,.2,1)) self.ambLight = render.attachNewNode(ambL) render.setLight(self.ambLight) return Lastly, we need to instantiate the World class and call the run() method. w = World() run() Save the file and run it from the command prompt to see our loaded model with an animation playing on it, as depicted in the following screenshot: What just happened? Now, we have an animated Actor in our scene, slowly looping through a walk animation. We made that happen with only three lines of code: self.kid = Actor("../Models/Kid.egg", {"Walk" : "../Animations/Walk.egg"}) self.kid.reparentTo(render) self.kid.loop("Walk") The first line creates an instance of the Actor class. Unlike with models, we don't need to use a method of loader. The Actor class constructor takes two arguments: the first is the filename for the model that will be loaded. This file may or may not contain animations in it. The second argument is for loading additional animations from separate files. It's a dictionary of animation names and the files that they are contained in. The names in the dictionary don't need to correspond to anything; they can be any string. myActor = Actor( modelPath, {NameForAnim1 : Anim1Path, NameForAnim2 : Anim2Path, etc}) The names we give animations when the Actor is created are important because we use those names to control the animations. For instance, the last line calls the method loop() with the name of the walking animation as its argument. If the reference to the Actor is removed, the animations will be lost. Make sure not to remove the reference to the Actor until both the Actor and its animations are no longer needed. Controlling animations Since we're talking about the loop() method, let's start discussing some of the different controls for playing and stopping animations. There are four basic methods we can use: play("AnimName"): This method plays the animation once from beginning to end. loop("AnimName"): This method is similar to play, but the animation doesn't stop when it's over; it replays again from the beginning. stop() or stop("AnimName"): This method, if called without an argument, stops all the animations currently playing on the Actor. If called with an argument, it only stops the named animation. Note that the Actor will remain in whatever posture they are in when this method is called. pose("AnimName", FrameNumber): Places the Actor in the pose dictated by the supplied frame without playing the animation. We have some more advanced options as well. Firstly, we can provide option fromFrame and toFrame arguments to play or loop to restrict the animation to specific frames. myActor.play("AnimName", fromFrame = FromFrame, toFrame = toFrame) We can provide both the arguments, or just one of them. For the loop() method, there is also the optional argument restart, which can be set to 0 or 1. It defaults to 1, which means to restart the animation from the beginning. If given a 0, it will start looping from the current frame. We can also use the getNumFrames("AnimName") and getCurrentFrame("AnimName") methods to get more information about a given animation. The getCurrentAnim() method will return a string that tells us which animation is currently playing on the Actor. The final method we have in our list of basic animation controls sets the speed of the animation. myActor.setPlayRate(1.5, "AnimName") The setPlayRate() method takes two arguments. The first is the new play rate, and it should be expressed as a multiplier of the original frame rate. If we feed in .5, the animation will play half as fast. If we feed in 2, the animation will play twice as fast. If we feed in -1, the animation will play at its normal speed, but it will play in reverse. Have a go hero – basic animation controls Experiment with the various animation control methods we've discussed to get a feel for how they work. Load the Stand and Thoughtful animations from the animations folder as well, and use player input or delayed tasks to switch between animations and change frame rates. Once we're comfortable with what we've gone over so far, we'll move on. Animation blending Actors aren't limited to playing a single animation at a time. Panda3D is advanced enough to offer us a very handy functionality, called blending. To explain blending, it's important to understand that an animation is really a series of offsets to the basic pose of the model. They aren't absolutes; they are changes from the original. With blending turned on, Panda3D can combine these offsets. Time for action – blending two animations We'll blend two animations together to see how this works. Open Anim_01.py in the Chapter09 folder. We need to load a second animation to be able to blend. Change the line where we create our Actor to look like the following code: self.kid = Actor("../Models/Kid.egg", {"Walk" : "../Animations/Walk.egg", "Thoughtful" : "../Animations/Thoughtful.egg"}) Now, we just need to add this code above the line where we start looping the Walk animation: self.kid.enableBlend() self.kid.setControlEffect("Walk", 1) self.kid.setControlEffect("Thoughtful", 1) Resave the file as Anim_02.py and run it from the command prompt. What just happened? Our Actor is now performing both animations to their full extent at the same time. This is possible because we made the call to the self.kid.enableBlend() method and then set the amount of effect each animation would have on the model with the self.kid.setControlEffect() method. We can turn off blending later on by using the self.kid.disableBlend() method, which will return the Actor to the state where playing or looping a new animation will stop any previous animations. Using the setControlEffect method, we can alter how much each animation controls the model. The numeric argument we pass to setControlEffect() represents a percentage of the animation's offset that will be applied, with 1 being 100%, 0.5 being 50%, and so on. When blending animations together, the look of the final result depends a great deal on the model and animations being used. Much of the work needed to achieve a good result depends on the artist. Blending works well for transitioning between animations. In this case, it can be handy to use Tasks to dynamically alter the effect animations have on the model over time. Honestly, though, the result we got with blending is pretty unpleasant. Our model is hardly walking at all, and he looks like he has a nervous twitch or something. This is because both animations are affecting the entire model at full strength, so the Walk and Thoughtful animations are fighting for control over the arms, legs, and everything else, and what we end up with is a combination of both animation's offsets. Furthermore, it's important to understand that when blending is enabled, every animation with a control effect higher than 0 will always be affecting the model, even if the animation isn't currently playing. The only way to remove an animation's influence is to set the control effect to 0. This obviously can cause problems when we want to play an animation that moves the character's legs and another animation that moves his arms at the same time, without having them screw with each other. For that, we have to use subparts.  
Read more
  • 0
  • 0
  • 10043

article-image-how-interact-database-using-rhom
Packt
28 Jul 2011
5 min read
Save for later

How to Interact with a Database using Rhom

Packt
28 Jul 2011
5 min read
  Rhomobile Beginner's Guide Step-by-step instructions to build an enterprise mobile web application from scratch         Read more about this book       (For more resources on this topic, see here.) What is ORM? ORM connects business objects and database tables to create a domain model where logic and data are presented in one wrapping. In addition, the ORM classes wrap our database tables to provide a set of class-level methods that perform table-level operations. For example, we might need to find the Employee with a particular ID. This is implemented as a class method that returns the corresponding Employee object. In Ruby code, this will look like: employee= Employee.find(1) This code will return an employee object whose ID is 1.   Exploring Rhom Rhom is a mini Object Relational Mapper (ORM) for Rhodes. It is similar to another ORM, Active Record in Rails but with limited features. Interaction with the database is simplified, as we don't need to worry about which database is being used by the phone. iPhone uses SQLite and Blackberry uses HSQL and SQLite depending on the device. Now we will create a new model and see how Rhom interacts with the database.   Time for action – Creating a company model We will create a model company. In addition to a default attribute ID that is created by Rhodes, we will have one attribute name that will store the name of the company. Now, we will go to the application directory and run the following command: $ rhogen model company name which will generate the following: [ADDED] app/Company/index.erb[ADDED] app/Company/edit.erb[ADDED] app/Company/new.erb[ADDED] app/Company/show.erb[ADDED] app/Company/index.bb.erb[ADDED] app/Company/edit.bb.erb[ADDED] app/Company/new.bb.erb[ADDED] app/Company/show.bb.erb[ADDED] app/Company/company_controller.rb[ADDED] app/Company/company.rb[ADDED] app/test/company_spec.rb We can notice the number of files generated by the Rhogen command. Now, we will add a link on the index page so that we can browse it from our homepage. Add a link in the index.erb file for all the phones except Blackberry. If the target phone is a Blackberry, add this link to the index.bb.erb file inside the app folder. We will have different views for Blackberry. <li> <a href="<%= url_for :controller => :Company %>"><span class="title"> Company</span><span class="disclosure_indicator"/></a></li> We can see from the image that a Company link is created on the homepage of our application. Now, we can build our application to add some dummy data. You can see that we have added three companies Google, Apple, and Microsoft. What just happened? We just created a model company with an attribute name, made a link to access it from our homepage, and added some dummy data to it. We will add a few companies' names because it will help us in the next section. Association Associations are connections between two models, which make common operations simpler and easier for your code. So we will create an association between the Employee model and the Company model.   Time for action – Creating an association between employee and company The relationship between an employee and a company can be defined as "An employee can be in only one company but one company may have many employees". So now we will be adding an association between an employee and the company model. After we make entries for the company in the company model, we would be able to see the company select box populated in the employee form. The relationship between the two models is defined in the employee.rb file as: belongs_to :company_id, 'Company' Here, Company corresponds to the model name and company_id corresponds to the foreign key. Since at present we have the company field instead of company_id in the employee model, we will rename company to company_id. To retrieve all the companies, which are stored in the Company model, we need to add this line in the new action of the employee_controller: @companies = Company.find(:all) The find command is provided by Rhom, which is used to form a query and retrieve results from the database. Company.find(:all) will return all the values stored in the Company model in the form of an array of objects. Now, we will edit the new.erb and edit.erb files present inside the Employee folder. <h4 class="groupTitle">Company</h4><ul> <li> <select name="employee[company_id]"> <% @companies.each do |company|%> <option value="<%= company.object%>" <%= "selected" if company.object == @employee.company_id%> > <%=company.name %></option> <%end%> </select> </li></ul> If you observe in the code, we have created a select box for selecting a company. Here we have defined a variable @companies that is an array of objects. And in each object we have two fields named company name and its ID. We have created a loop and shown all the companies that are there in the @companies object. In the above image the companies are populated in the select box, which we added before and it is displayed in the employee form. What just happened? We just created an association between the employee and company model and used this association to populate the company select box present in the employee form. As of now, Rhom has fewer features then other ORMs like Active Record. As of now there is very little support for database associations.  
Read more
  • 0
  • 0
  • 10041

article-image-java-data-objects-and-service-data-objects-soa
Packt
16 Oct 2009
11 min read
Save for later

Java Data Objects and Service Data Objects in SOA

Packt
16 Oct 2009
11 min read
JDO Java Data Objects (JDO) is a complementing standard of accessing data from your data store using a standard interface-based abstraction model of persistence in java. The original JDO (JDO 1.0) specification is quite old and is based on Java Specification Request 12 (JSR 12). The current major version of JDO (JDO 2.0) is based on JSR 243. The original specifications were done under the supervision of Sun and starting from 2.0, the development of the API and the reference implementation happens as an Apache open-source project. Why JDO? We have been happily programming to retrieve data from relational stores using JDBC, and now the big question is do we need yet another standard, JDO? If you think that as software programmers you need to provide solutions to your business problems, it makes sense for you to start with the business use cases and then do a business analysis at the end of which you will come out with a Business Domain Object Model (BDOM). The BDOM will drive the design of your entity classes, which are to be persisted to a suitable data store. Once you design your entity classes and their relationship, the next question is should you be writing code to create tables, and persist or query data from these tables (or data stores, if there are no tables). I would like to answer 'No' for this question, since the more code you write, the more are the chances of making errors, and further, developer time is costly. Moreover, today you may write JDBC for doing the above mentioned "technical functionalities", and tomorrow you may want to change all your JDBC to some other standard since you want to port your data from a relational store to a different persistence mechanism. To sum up, let us list down a few of the features of JDO which distinguishes itself from other similar frameworks. Separation of Concerns: Application developers can focus on the BDOM and leave the persistence details (storage and retrieval) to the JDO implementation. API-based: JDO is based on a java interface-based programming model. Hence all persistence behavior including most commonly used features of OR mapping is available as metadata, external to your BDOM source code. We can also Plug and Play (PnP) multiple JDO implementations, which know how to interact well with the underlying data store. Data store portability: Irrespective of whether the persistent store is a relational or object-based file, or just an XML DB or a flat file, JDO implementations can still support the code. Hence, JDO applications are independent of the underlying database. Performance: A specific JDO implementation knows how to interact better with its specific data store, which will improve performance as compared to developer written code. J2EE integration: JDO applications can take advantage of J2EE features like EJB and thus the enterprise features such as remote message processing, automatic distributed transaction coordination, security, and so on. JPOX—Java Persistent Objects JPOX is an Apache open-source project, which aims at a heterogeneous persistence solution for Java using JDO. By heterogeneous we mean, JPOX JDO will support any combination of the following four main aspects of persistence: Persistence Definition: The mechanism of defining how your BDOM classes are to be persisted to the data store. Persistence API: The programming API used to persist your BDOM objects. Query Language: The language used to find objects due to certain criteria. Data store: The underlying persistent store you are persisting your objects to. JDO Sample Using JPOX In this sample, we will take the familiar Order and LineItems scenario, and expand it to have a JDO implementation. It is assumed that you have already downloaded and extracted the JPOX libraries to your local hard drive. BDOM for the Sample We will limit our BDOM for the sample discussion to just two entity classes, that is, OrderList and LineItem. The class attributes and relationships are shown in the following screenshot: The BDOM illustrates that an Order can contain multiple line items. Conversely, each line item is related to one and only one Order. Code BDOM Entities for JDO The BDOM classes are simple entity classes with getter and setter methods for each attribute. These classes are then required to be wired for JDO persistence capability in a JDO specific configuration file, which is completely external to the core entity classes. OrderList.java OrderList is the class representing the Order, and is having a primary key attribute that is number. public class OrderList{ private int number; private Date orderDate; private Set lineItems; // other getter & setter methods go here // Inner class for composite PK public static class Oid implements Serializable{ public int number; public Oid(){ } public Oid(int param){ this.number = param; } public String toString(){ return String.valueOf(number); } public int hashCode(){ return number; } public boolean equals(Object other){ if (other != null && (other instanceof Oid)){ Oid k = (Oid)other; return k.number == this.number; } return false; } } } LineItem.java LineItem represents each item container in the Order. We don't explicitly define a primary key for LineItem even though JDO will have its own mechanism to do that. public class LineItem{ private String productId; private int numberOfItems; private OrderList orderList; // other getter & setter methods go here } package.jdo JDO requires an XML configuration file, which defines the fields that are to be persisted and to what JDBC or JDO wrapper constructs should be mapped to. For this, we can create an XML file called package.jdo with the following content and put it in the same directory where we have the entities. <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE jdo SYSTEM "file:/javax/jdo/jdo.dtd"> <jdo> <package name="com.binildas.jdo.jpox.order"> <class name="OrderList" identity-type="application" objectid-class="OrderList$Oid" table="ORDERLIST"> <field name="number" primary-key="true"> <column name="ORDERLIST_ID"/> </field> <field name="orderDate"> <column name="ORDER_DATE"/> </field> <field name="lineItems" persistence-modifier="persistent" mapped-by="orderList"> <collection element-type="LineItem"> </collection> </field> </class> <class name="LineItem" table="LINEITEM"> <field name="productId"> <column name="PRODUCT_ID"/> </field> <field name="numberOfItems"> <column name="NUMBER_OF_ITEMS"/> </field> <field name="orderList" persistence-modifier="persistent"> <column name="LINEITEM_ORDERLIST_ID"/> </field> </class> </package> </jdo> jpox.PROPERTIES In this sample, we will persist our entities to a relational database, Oracle. We specify the main connection parameters in jpox.PROPERTIES file. javax.jdo.PersistenceManagerFactoryClass=org.jpox.jdo.JDOPersistenceManagerFactory javax.jdo.option.ConnectionDriverName=oracle.jdbc.driver.OracleDriver javax.jdo.option.ConnectionURL=jdbc:oracle:thin:@127.0.0.1:1521:orcl javax.jdo.option.ConnectionUserName=scott javax.jdo.option.ConnectionPassword=tiger org.jpox.autoCreateSchema=true org.jpox.validateTables=false org.jpox.validateConstraints=false Main.java This class contains the code to test the JDO functionalities. As shown here, it creates two Orders and adds few line items to each order. First it persists these entities and then queries back these entities using the id. public class Main{ static public void main(String[] args){ Properties props = new Properties(); try{ props.load(new FileInputStream("jpox.properties")); } catch (Exception e){ e.printStackTrace(); } PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(props); PersistenceManager pm = pmf.getPersistenceManager(); Transaction tx = pm.currentTransaction(); Object id = null; try{ tx.begin(); LineItem lineItem1 = new LineItem("CD011", 1); LineItem lineItem2 = new LineItem("CD022", 2); OrderList orderList = new OrderList(1, new Date()); orderList.getLineItems().add(lineItem1); orderList.getLineItems().add(lineItem2); LineItem lineItem3 = new LineItem("CD033", 3); LineItem lineItem4 = new LineItem("CD044", 4); OrderList orderList2 = new OrderList(2, new Date()); orderList2.getLineItems().add(lineItem3); orderList2.getLineItems().add(lineItem4); pm.makePersistent(orderList); id = pm.getObjectId(orderList); System.out.println("Persisted id : "+ id); pm.makePersistent(orderList2); id = pm.getObjectId(orderList2); System.out.println("Persisted id : "+ id); orderList = (OrderList) pm.getObjectById(id); System.out.println("Retreived orderList : " + orderList); tx.commit(); } catch (Exception e){ e.printStackTrace(); if (tx.isActive()){ tx.rollback(); } } finally{ pm.close(); } } } Build and Run the JDO Sample You can download the required code for this article from http://www.packtpub.com/files//code/3216_Code.zip. Unzip the file and the code of our interest is in the folder 3216_04_Code. There is also a README.txt file, which gives detailed steps to build and run the samples. Since we use Oracle to persist entities, we need the following two libraries in the classpath: jpox-rdbms*.jar classes12.jar We require a couple of other libraries too which are specified in the build.xml file. Download these libraries and change the path in examples.PROPERTIES accordingly. To build the sample, first bring up your database server. Then to build the sample in a single command, it is easy for you to go to ch04jdo folder and execute the following command. cd ch04jdo ant The above command will execute the following steps: First it compiles the java source files Then for every class you persist, use JPOX libraries to enhance the byte code. As the last step, we create the required schema in the data store. To run the sample, execute: ant run You can now cross check whether the entities are persisted to your data store. This is as shown in the following screenshot where you can see that each line item is related to the parent order by the foreign key.   Data Services Good that you now know how to manage the basic data operations in a generic way using JDO and other techniques. By now, you also have good hands-on experience in defining and deploying web services. We all appreciate that web services are functionalities exposed in standard, platform, and technology neutral way. When we say functionality we mean the business use cases translated in the form of useful information. Information is always processed out of data. So, once we retrieve data, we need to process it to translate them into information. When we define SOA strategies at an enterprise level, we deal with multiple Line of Business (LOB) systems; some of them will be dealing with the same kind of business entity. For example, a customer entity is required for a CRM system as well as for a sales or marketing system. This necessitates a Common Data Model (CDM), which is often referred to as the Canonical Data Model or Information Model. In such a model, you will often have entities that represent "domain" concepts, for example, customer, account, address, order, and so on. So, multiple LOB systems will make use of these domain entities in different ways, seeking different information-based on the business context. OK, now we are in a position to introduce the next concept in SOA, which is "Data Services". Data Services are specialization of web services which are data and information oriented. They need to manage the traditional CRUD (Create, Read, Update, and Delete) operations as well as a few other data functionalities such as search and information modeling. The Create operation will give you back a unique ID whereas Read, Update, and Delete operations are performed on a specific unique ID. Search will usually be done with some form of search criteria and information modeling, or retrieval happens when we pull useful information out of the CDM, for example, retrieving the address for a customer. The next important thing is that no assumptions should be made that the data will be in a java resultset form or in a collection of transfer object form. Instead, you are now dealing with data in SOA context and it makes sense to visualize data in XML format. Hence, XML Schema Definition (XSDs) can be used to define the format of your requests and responses for each of these canonical data definitions. You may also want to use ad hoc queries using XQuery or XPath expressions, similar to SQL capabilities on relational data. In other words, your data retrieval and data recreation for information processing at your middle tier should support XML tools and mechanisms, and should also support the above six basic data operations. If so, higher level of abstractions in the processing tier can make use of the above data services to provide Application Specialization capabilities, specialized for the LOB systems. To make the concept clear, let us assume that we need to get the order status for a particular customer (getCustomerOrderStatus()) which will take the customer ID argument. The data services layer will have a retrieve operation passing the customer ID and the XQuery or the XPath statement will obtain the requested order information from the retrieved customer data. High level processing layers (such as LOB service tiers) can use high-level interface (for example, our getCustomerOrderStatus operation) of the Application Specialization using a web services (data services) interface and need not know or use XQuery or XPath directly. The underlying XQuery or XPath can be encapsulated, reused, and optimized.
Read more
  • 0
  • 0
  • 10035

article-image-aspnet-site-performance-reducing-page-load-time
Packt
28 Oct 2010
10 min read
Save for later

ASP.Net Site Performance: Reducing Page Load Time

Packt
28 Oct 2010
10 min read
The JavaScript code for a page falls into two groups—code required to render the page, and code required to handle user interface events, such as button clicks. The code to render the page is used to make the page look better , and to attach event handlers to for example, buttons. Although the rendering code needs to be loaded and executed in conjunction with the page itself, the user interface code can be loaded later, in response to a user interface event, such as a button click. That reduces the amount of code to be loaded, and therefore the time rendering of the page is blocked. It also reduces your bandwidth costs, because the user interface code is loaded only when it's actually needed. On the other hand, it does require separating the user interface code from the rendering code. You then need to invoke code that potentially hasn't loaded yet, tell the visitor that the code is loading, and finally invoke the code after it has loaded. Let's see how to make this all happen. Separating user interface code from render code Depending on how your JavaScript code is structured, this could be your biggest challenge in implementing on-demand loading. Make sure the time you're likely to spend on this and the subsequent testing and debugging is worth the performance improvement you're likely to gain. A very handy tool that identifies which code is used while loading the page is Page Speed, an add-on for Firefox. Besides identifying code that doesn't need to be loaded upfront, it reports many speed-related issues on your web page. Information on Page Speed is available at http://code.google.com/speed/page-speed/. OnDemandLoader library Assuming your user interface code is separated from your render code, it is time to look at implementing actual on-demand loading. To keep it simple, we'll use OnDemandLoader, a simple low-footprint object. You'll find it in the downloaded code bundle in the folder OnDemandLoad in the file OnDemandLoader.js. OnDemandLoader has the following features: It allows you to specify the script, in which it is defined, for each event-handler function. It allows you to specify that a particular script depends on some other script; for example Button1Code.js depends on library code in UILibrary1.js. A script file can depend on multiple other script files, and those script files can in turn be dependent on yet other script files. It exposes function runf, which takes the name of a function, arguments to call it with, and the this pointer to use while it's being executed. If the function is already defined, runf calls it right away. Otherwise, it loads all the necessary script files and then calls the function. It exposes the function loadScript, which loads a given script file and all the script files it depends on. Function runf uses this function to load script files. While script files are being loaded in response to a user interface event, a "Loading..." box appears on top of the affected control. That way, the visitor knows that the page is working to execute their action. If a script file has already been loaded or if it is already loading, it won't be loaded again. If the visitor does the same action repeatedly while the associated code is loading, such as clicking the same button, that event is handled only once. If the visitor clicks a second button or takes some other action while the code for the first button is still loading, both events are handled. A drawback of OnDemandLoader is that it always loads all the required scripts in parallel. If one script automatically executes a function that is defined in another script , there will be a JavaScript error if the other script hasn't loaded yet. However, if your library script files only define functions and other objects, OnDemandLoader will work well. Initializing OnDemandLoader OnDemandLoading.aspx in folder OnDemandLoad in the downloaded code bundle is a worked-out example of a page using on-demand loading. It delays the loading of JavaScript files by five seconds, to simulate slowly loading files. Only OnDemandLoader.js loads at normal speed. If you open OnDemandLoading.aspx, you'll find that it defines two arrays—the script map array and the script dependencies array. These are needed to construct the loader object that will take care of the on-demand loading. The script map array shows the script file, in which it is defined, for each function: var scriptMap = [ { fname: 'btn1a_click', src: 'js/Button1Code.js' }, { fname: 'btn1b_click', src: 'js/Button1Code.js' }, { fname: 'btn2_click', src: 'js/Button2Code.js' } ]; Here, functions btn1a_click and btn1b_click live in script file js/Button1Code. js, while function btn2_click lives in script file js/Button2Code.js. The second array defines which other script files it needs to run for each script file: var scriptDependencies = [ { src: '/js/Button1Code.js', testSymbol: 'btn1a_click', dependentOn: ['/js/UILibrary1.js', '/js/UILibrary2.js'] }, { src: '/js/Button2Code.js', testSymbol: 'btn2_click', dependentOn: ['/js/UILibrary2.js'] }, { src: '/js/UILibrary2.js', testSymbol: 'uifunction2', dependentOn: [] }, { src: '/js/UILibrary1.js', testSymbol: 'uifunction1', dependentOn: ['/js/UILibrary2.js'] } ]; This says that Button1Code.js depends on UILibrary1.js and UILibrary2.js. Further, Button2Code.js depends on UILibrary2.js. Further, UILibrary1.js relies on UILibrary2.js, and UILibrary2.js doesn't require any other script files. The testSymbol field holds the name of a function defined in the script. Any function will do, as long as it is defined in the script. This way, the on-demand loader can determine whether a script has been loaded by testing whether that name has been defined. With these two pieces of information, we can construct the loader object: <script type="text/javascript" src="js/OnDemandLoader.js"> </script> var loader = new OnDemandLoader(scriptMap, scriptDependencies); Now that the loader object has been created, let's see how to invoke user interface handler functions before their code has been loaded. Invoking not-yet-loaded functions The point of on-demand loading is that the visitor is allowed to take an action for which the code hasn't been loaded yet. How do you invoke a function that hasn't been defined yet? Here, you'll see two approaches: Call a loader function and pass it the name of the function to load and execute Create a stub function with the same name as the function you want to execute, and have the stub load and execute the actual function Let's focus on the first approach first. The OnDemandLoader object exposes a loader function runf that takes the name of a function to call, the arguments to call it with, and the current this pointer: function runf(fname, thisObj) { // implementation } Wait a minute! This signature shows a function name parameter and the this pointer, but what about the arguments to call the function with? One of the amazing features of JavaScript is that can you pass as few or as many parameters as you want to a function, irrespective of the signature. Within each function, you can access all the parameters via the built-in arguments array. The signature is simply a convenience that allows you to name some of the arguments. This means that you can call runf as shown: loader.runf('myfunction', this, 'argument1', 'argument2'); If for example, your original HTML has a button as shown: <input id="btn1a" type="button" value="Button 1a" onclick="btn1a_click(this.value, 'more info')" /> To have btn1a_click loaded on demand, rewrite this to the following (file OnDemandLoading.aspx): <input id="btn1a" type="button" value="Button 1a" onclick="loader.runf('btn1a_click', this, this.value, 'more info')" /> If, in the original HTML, the click handler function was assigned to a button programmatically as shown: <input id="btn1b" type="button" value="Button 1b" /> <script type="text/javascript"> window.onload = function() { document.getElementById('btn1b').onclick = btn1b_click; } </script> Then, use an anonymous function that calls loader.runf with the function to execute: <input id="btn1b" type="button" value="Button 1b" /> <script type="text/javascript"> window.onload = function() { document.getElementById('btn1b').onclick = function() { loader.runf('btn1b_click', this); } } </script> This is where you can use the second approach—the stub function. Instead of changing the HTML of your controls, you can load a stub function upfront before the page renders (file OnDemandLoading.aspx): function btn1b_click() { loader.runf('btn1b_click', this); } When the visitor clicks the button, the stub function is executed. It then calls loader.runf to load and execute its namesake that does the actual work, overwriting the stub function in the process. This leaves behind one problem. The on-demand loader checks whether a function with the given name is already defined before initiating a script load. And a function with that same name already exists—the stub function itself.   The solution is based on the fact that functions in JavaScript are objects. And all JavaScript objects can have properties. You can tell the on-demand loader that a function is a stub by attaching the property "stub": btn1b_click.stub = true; To see all this functionality in action, run the OnDemandLoading.aspx page in folder OnDemandLoad in the downloaded code bundle. Click on one of the buttons on the page, and you'll see how the required code is loaded on demand. It's best to do this in Firefox with Firebug installed, so that you can see the script files getting loaded in a Waterfall chart. Preloading Now that you have on-demand loading working, there is one more issue to consider: trading off bandwidth against visitor wait time. Currently, when a visitor clicks a button and the code required to process the click hadn't been loaded, loading starts in response to the click. This can be a problem if loading the code takes too much time. An alternative is to initiate loading the user interface code after the page has been loaded, instead of when a user interface event happens. That way, the code may have already loaded by the time the visitor clicks the button; or at least it will already be partly loaded, so that the code finishes loading sooner. On the other hand, this means expending bandwidth on loading code that may never be used by the visitor. You can implement preloading with the loadScript function exposed by the OnDemandLoader object. As you saw earlier, this function loads a JavaScript file plus any files it depends on, without blocking rendering. Simply add calls to loadScript in the onload handler of the page, as shown (page PreLoad.aspx in folder OnDemandLoad in the downloaded code bundle): <script type="text/javascript"> window.onload = function() { document.getElementById('btn1b').onclick = btn1b_click; loader.loadScript('js/Button1Code.js'); loader.loadScript('js/Button2Code.js'); } </script> You could preload all your user interface code, or just the code you think is likely to be needed. Now that you've looked at the load on demand approach, it's time to consider the last approach—loading your code without blocking page rendering and without getting into stub functions or other complications inherent in on-demand loading.
Read more
  • 0
  • 0
  • 10031

article-image-tensorflow-lstm-that-writes-stories-tutorial
Packt Editorial Staff
09 Jul 2018
18 min read
Save for later

Create a TensorFlow LSTM that writes stories [Tutorial]

Packt Editorial Staff
09 Jul 2018
18 min read
LSTMs are heavily employed for tasks such as text generation and image caption generation. For example, language modeling is very useful for text summarization tasks or generating captivating textual advertisements for products, where image caption generation or image annotation is very useful for image retrieval, and where a user might need to retrieve images representing some concept (for example, a cat). In this tutorial, we will implement an LSTM which will generate new stories after training on a dataset of folk stories. This article is extracted from the book Natural Language Processing with Tensorflow by Thushan Ganegedara. The application that we will cover in this article is the use of an LSTM to generate new text. For this task, we will download translations of some folk stories by the Brothers Grimm. We will use these stories to train an LSTM and ask it at the end to output a fresh new story. We will process the text by breaking it into character level bigrams (n-grams, where n=2) and make a vocabulary out of the unique bigrams. The code for this article is available on Github. First, we will discuss the data we will use for text generation and various preprocessing steps employed to clean data. About the dataset We will understand what the dataset looks like so that when we see the generated text, we can assess whether it makes sense, given the training data. We will download the first 100 books from the website, https://www.cs.cmu.edu/~spok/grimmtmp/. These are translations of a set of books (from German to English) by the Brothers Grimm. Initially, we will download the first 100 books in the website, with an automated script, as shown here: url = 'https://www.cs.cmu.edu/~spok/grimmtmp/' # Create a directory if needed dir_name = 'stories' if not os.path.exists(dir_name):    os.mkdir(dir_name)    def maybe_download(filename):  """Download a file if not present"""  print('Downloading file: ', dir_name+ os.sep+filename) if not os.path.exists(dir_name+os.sep+filename): filename, _ = urlretrieve(url + filename, dir_name+os.sep+filename) else: print('File ',filename, ' already exists.') return filename num_files = 100 filenames = [format(i, '03d')+'.txt' for i in range(1,101)] for fn in filenames: maybe_download(fn) We will now show example text snippets extracted from two randomly picked stories. The following is the first snippet: Then she said, my dearest benjamin, your father has had these coffins made for you and for your eleven brothers, for if I bring a little girl into the world, you are all to be killed and buried in them.  And as she wept while she was saying this, the son comforted her and said, weep not, dear mother, we will save ourselves, and go hence… The second text snippet is as follows: Red-cap did not know what a wicked creature he was, and was not at all afraid of him. "Good-day, little red-cap," said he. "Thank you kindly, wolf." "Whither away so early, little red-cap?" "To my grandmother's." "What have you got in your apron?" "Cake and wine.  Yesterday was baking-day, so poor sick grandmother is to have something good, to make her stronger."… Preprocessing data In terms of preprocessing, we will initially make all the text lowercase and break the text into character n-grams, where n=2. Consider the following sentence: The king was hunting in the forest. This would break down to a sequence of n-grams, as follows: ['th,' 'e ,' 'ki,' 'ng,' ' w,' 'as,' …] We will use character level bigrams because it greatly reduces the size of the vocabulary compared with using individual words. Moreover, we will be replacing all the bigrams that appear fewer than 10 times in the corpus with a special token (that is, UNK), representing that bigram is unknown. This helps us to reduce the size of the vocabulary even further. Implementing an LSTM Though there are sublibraries in TensorFlow that have already implemented LSTMs ready to go, we will implement one from scratch. This will be very valuable, as in the real world there might be situations where you cannot use these off-the-shelf components directly. We will discuss the hyperparameters and their effects used for the LSTM. Thereafter, we will discuss the parameters (weights and biases) required to implement the LSTM. We will then discuss how these parameters are used to write the operations taking place within the LSTM. This will be followed by understanding how we will sequentially feed data to the LSTM. Next, we will discuss how we can implement the optimization of the parameters using gradient clipping. Finally, we will investigate how we can use the learned model to output predictions, which are essentially bigrams that will eventually add up to a meaningful story. Defining hyperparameters We will define some hyper-parameters required for the LSTM: # Number of neurons in the hidden state variables num_nodes = 128 # Number of data points in a batch we process batch_size = 64 # Number of time steps we unroll for during optimization num_unrollings = 50 dropout = 0.2 # We use dropout The following list describes each of the hyperparameters: num_nodes: This denotes the number of neurons in the cell memory state. When data is abundant, increasing the complexity of the cell memory will give you a better performance; however, at the same time, it slows down the computations. batch_size: This is the amount of data processed in a single step. Increasing the size of the batch gives a better performance, but poses higher memory requirements. num_unrollings: This is the number of time steps used in truncated-BPTT. The higher the num_unrollings steps, the better the performance, but it will increase both the memory requirement and the computational time. dropout: Finally, we will employ dropout (that is, a regularization technique) to reduce overfitting of the model and produce better results; dropout randomly drops information from inputs/outputs/state variables before passing them to their successive operations. This creates redundant features during learning, leading to better performance. Defining parameters Now we will define TensorFlow variables for the actual parameters of the LSTM. First, we will define the input gate parameters: ix: These are weights connecting the input to the input gate im: These are weights connecting the hidden state to the input gate ib: This is the bias Here we will define the parameters: # Input gate (it) - How much memory to write to cell state # Connects the current input to the input gate ix = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02)) # Connects the previous hidden state to the input gate im = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02)) # Bias of the input gate ib = tf.Variable(tf.random_uniform([1, num_nodes],-0.02, 0.02)) Similarly, we will define such weights for the forget gate, candidate value (used for memory cell computations), and output gate. The forget gate is defined as follows: # Forget gate (ft) - How much memory to discard from cell state # Connects the current input to the forget gate fx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02)) # Connects the previous hidden state to the forget gate fm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02)) # Bias of the forget gate fb = tf.Variable(tf.random_uniform([1, num_nodes],-0.02, 0.02)) The candidate value (used to compute the cell state) is defined as follows: # Candidate value (c~t) - Used to compute the current cell state # Connects the current input to the candidate cx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02)) # Connects the previous hidden state to the candidate cm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02)) # Bias of the candidate cb = tf.Variable(tf.random_uniform([1, num_nodes],-0.02,0.02)) The output gate is defined as follows: # Output gate - How much memory to output from the cell state # Connects the current input to the output gate ox = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02)) # Connects the previous hidden state to the output gate om = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02)) # Bias of the output gate ob = tf.Variable(tf.random_uniform([1, num_nodes],-0.02,0.02)) Next, we will define variables for the state and output. These are the TensorFlow variables representing the internal cell state and the external hidden state of the LSTM cell. When defining the LSTM computational operation, we define these to be updated with the latest cell state and hidden state values we compute, using the tf.control_dependencies(...) function. # Variables saving state across unrollings. # Hidden state saved_output = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False, name='train_hidden') # Cell state saved_state = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False, name='train_cell') # Same variables for validation phase saved_valid_output = tf.Variable(tf.zeros([1, num_nodes]),trainable=False, name='valid_hidden') saved_valid_state = tf.Variable(tf.zeros([1, num_nodes]),trainable=False, name='valid_cell') Finally, we will define a softmax layer to get the actual predictions out: # Softmax Classifier weights and biases. w = tf.Variable(tf.truncated_normal([num_nodes, vocabulary_size], stddev=0.02)) b = tf.Variable(tf.random_uniform([vocabulary_size],-0.02,0.02)) Note: that we're using the normal distribution with zero mean and a small standard deviation. This is fine as our model is a simple single LSTM cell. However, when the network gets deeper (that is, multiple LSTM cells stacked on top of each other), more careful initialization techniques are required. One such initialization technique is known as Xavier initialization, proposed by Glorot and Bengio in their paper Understanding the difficulty of training deep feedforward neural networks, Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, 2010. This is available as a variable initializer in TensorFlow, as shown here: https://www.tensorflow.org/api_docs/python/tf/contrib/layers/xavier_initializer. Defining an LSTM cell and its operations With the weights and the bias defined, we can now define the operations within an LSTM cell. These operations include the following: Calculating the outputs produced by the input and forget gates Calculating the internal cell state Calculating the output produced by the output gate Calculating the external hidden state The following is the implementation of our LSTM cell: def lstm_cell(i, o, state):    input_gate = tf.sigmoid(tf.matmul(i, ix) +                            tf.matmul(o, im) + ib)    forget_gate = tf.sigmoid(tf.matmul(i, fx) +                             tf.matmul(o, fm) + fb)    update = tf.matmul(i, cx) + tf.matmul(o, cm) + cb    state = forget_gate * state + input_gate * tf.tanh(update)    output_gate = tf.sigmoid(tf.matmul(i, ox) +                             tf.matmul(o, om) + ob)    return output_gate * tf.tanh(state), state Defining inputs and labels Now we will define training inputs (unrolled) and labels. The training inputs is a list with the num_unrolling batches of data (sequential), where each batch of data is of the [batch_size, vocabulary_size] size: train_inputs, train_labels = [],[] for ui in range(num_unrollings):    train_inputs.append(tf.placeholder(tf.float32,                               shape=[batch_size,vocabulary_size],                               name='train_inputs_%d'%ui))    train_labels.append(tf.placeholder(tf.float32,                               shape=[batch_size,vocabulary_size],                               name = 'train_labels_%d'%ui)) We also define placeholders for validation inputs and outputs, which will be used to compute the validation perplexity. Note that we do not use unrolling for validation-related computations. # Validation data placeholders valid_inputs = tf.placeholder(tf.float32, shape=[1,vocabulary_size],               name='valid_inputs') valid_labels = tf.placeholder(tf.float32, shape=[1,vocabulary_size],               name = 'valid_labels') Defining sequential calculations required to process sequential data Here we will calculate the outputs produced by a single unrolling of the training inputs in a recursive manner. We will also use dropout (refer to Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Nitish, and others, Journal of Machine Learning Research 15 (2014): 1929-1958), as this gives a slightly better performance. Finally we compute the logit values for all the hidden output values computed for the training data: # Keeps the calculated state outputs in all the unrollings # Used to calculate loss outputs = list() # These two python variables are iteratively updated # at each step of unrolling output = saved_output state = saved_state # Compute the hidden state (output) and cell state (state) # recursively for all the steps in unrolling for i in train_inputs:    output, state = lstm_cell(i, output, state)    output = tf.nn.dropout(output,keep_prob=1.0-dropout)    # Append each computed output value    outputs.append(output) # calculate the score values logits = tf.matmul(tf.concat(axis=0, values=outputs), w) + b Next, before calculating the loss, we have to make sure that the output and the external hidden state are updated to the most current value we calculated earlier. This is achieved by adding a tf.control_dependencies condition and keeping the logit and loss calculation within the condition: with tf.control_dependencies([saved_output.assign(output), saved_state.assign(state)]):    # Classifier.    loss = tf.reduce_mean(        tf.nn.softmax_cross_entropy_with_logits_v2(            logits=logits, labels=tf.concat(axis=0,                                            values=train_labels))) We also define the forward propagation logic for validation data. Note that we do not use dropout during validation, but only during training: # Validation phase related inference logic # Compute the LSTM cell output for validation data valid_output, valid_state = lstm_cell(    valid_inputs, saved_valid_output, saved_valid_state) # Compute the logits valid_logits = tf.nn.xw_plus_b(valid_output, w, b) Defining the optimizer Here we will define the optimization process. We will use a state-of-the-art optimizer known as Adam, which is one of the best stochastic gradient-based optimizers to date. Here in the code, gstep is a variable that is used to decay the learning rate over time. We will discuss the details in the next section. Furthermore, we will use gradient clipping to avoid the exploding gradient: # Decays learning rate everytime the gstep increases tf_learning_rate = tf.train.exponential_decay(0.001,gstep,                   decay_steps=1, decay_rate=0.5) # Adam Optimizer. And gradient clipping. optimizer = tf.train.AdamOptimizer(tf_learning_rate) gradients, v = zip(*optimizer.compute_gradients(loss)) gradients, _ = tf.clip_by_global_norm(gradients, 5.0) optimizer = optimizer.apply_gradients(    zip(gradients, v)) Decaying learning rate over time As mentioned earlier, I use a decaying learning rate instead of a constant learning rate. Decaying the learning rate over time is a common technique used in deep learning for achieving better performance and reducing overfitting. The key idea here is to step-down the learning rate (for example, by a factor of 0.5) if the validation perplexity does not decrease for a predefined number of epochs. Let's see how exactly this is implemented, in more detail: First we define gstep and an operation to increment gstep, called inc_gstep as follows: # learning rate decay gstep = tf.Variable(0,trainable=False,name='global_step') # Running this operation will cause the value of gstep # to increase, while in turn reducing the learning rate inc_gstep = tf.assign(gstep, gstep+1) With this defined, we can write some simple logic to call the inc_gstep operation whenever validation loss does not decrease, as follows: # Learning rate decay related # If valid perplexity does not decrease # continuously for this many epochs # decrease the learning rate decay_threshold = 5 # Keep counting perplexity increases decay_count = 0 min_perplexity = 1e10 # Learning rate decay logic def decay_learning_rate(session, v_perplexity):    global decay_threshold, decay_count, min_perplexity    # Decay learning rate    if v_perplexity < min_perplexity:        decay_count = 0        min_perplexity= v_perplexity else:   decay_count += 1    if decay_count >= decay_threshold:        print('\t Reducing learning rate')        decay_count = 0        session.run(inc_gstep) Here we update min_perplexity whenever we experience a new minimum validation perplexity. Also, v_perplexity is the current validation perplexity. Making predictions Now we can make predictions, simply by applying a softmax activation to the logits we calculated previously. We also define prediction operation for validation logits as well: train_prediction = tf.nn.softmax(logits) # Make sure that the state variables are updated # before moving on to the next iteration of generation with tf.control_dependencies([saved_valid_output.assign(valid_output),                             saved_valid_state.assign(valid_state)]):    valid_prediction = tf.nn.softmax(valid_logits) Calculating perplexity (loss) Perplexity is a measure of how surprised the LSTM is to see the next n-gram, given the current n-gram. Therefore, a higher perplexity means poor performance, whereas a lower perplexity means a better performance: train_perplexity_without_exp = tf.reduce_sum(    tf.concat(train_labels,0)*-tf.log(tf.concat(        train_prediction,0)+1e-10))/(num_unrollings*batch_size) # Compute validation perplexity valid_perplexity_without_exp = tf.reduce_sum(valid_labels*-tf. log(valid_prediction+1e-10)) Resetting states We employ state resetting, as we are processing multiple documents. So, at the beginning of processing a new document, we reset the hidden state back to zero. However, it is not very clear whether resetting the state helps or not in practice. On one hand, it sounds intuitive to reset the memory of the LSTM cell at the beginning of each document to zero, when starting to read a new story. On the other hand, this creates a bias in state variables toward zero. We encourage you to try running the algorithm both with and without state resetting and see which method performs well. # Reset train state reset_train_state = tf.group(tf.assign(saved_state,                             tf.zeros([batch_size, num_nodes])),                             tf.assign(saved_output, tf.zeros(                             [batch_size, num_nodes]))) # Reset valid state reset_valid_state = tf.group(tf.assign(saved_valid_state,                             tf.zeros([1, num_nodes])),                             tf.assign(saved_valid_output,                             tf.zeros([1, num_nodes]))) Greedy sampling to break unimodality This is quite a simple technique where we can stochastically sample the next prediction out of the n best candidates found by the LSTM. Furthermore, we will give the probability of picking one candidate to be proportional to the likelihood of that candidate being the next bigram: def sample(distribution):    best_inds = np.argsort(distribution)[-3:]    best_probs = distribution[best_inds]/    np.sum(distribution[best_inds])    best_idx = np.random.choice(best_inds,p=best_probs)    return best_idx Generating new text Finally, we will define the placeholders, variables, and operations required for generating new text. These are defined similarly to what we did for the training data. First, we will define an input placeholder and variables for state and output. Next, we will define state resetting operations. Finally, we will define the LSTM cell calculations and predictions for the new text to be generated: # Text generation: batch 1, no unrolling. test_input = tf.placeholder(tf.float32, shape=[1, vocabulary_size], name = 'test_input') # Same variables for testing phase saved_test_output = tf.Variable(tf.zeros([1,                                num_nodes]),                                trainable=False, name='test_hidden') saved_test_state = tf.Variable(tf.zeros([1,                               num_nodes]),                               trainable=False, name='test_cell') # Compute the LSTM cell output for testing data test_output, test_state = lstm_cell( test_input, saved_test_output, saved_test_state) # Make sure that the state variables are updated # before moving on to the next iteration of generation with tf.control_dependencies([saved_test_output.assign(test_output),                             saved_test_state.assign(test_state)]):    test_prediction = tf.nn.softmax(tf.nn.xw_plus_b(test_output,                                    w, b)) # Reset test state reset_test_state = tf.group(    saved_test_output.assign(tf.random_normal([1,                             num_nodes],stddev=0.05)),    saved_test_state.assign(tf.random_normal([1,                            num_nodes],stddev=0.05))) Example generated text Let's take a look at some of the data generated by the LSTM after 50 steps of learning: they saw that the birds were at her bread, and threw behind him a comb which made a great ridge with a thousand times thousands of spikes. that was a collier. the nixie was at church, and thousands of spikes, they were flowers, however, and had hewn through the glass, the children had formed a hill of mirrors, and was so slippery that it was impossible for the nixie to cross it. then she thought, i will go home quickly and fetch my axe, and cut the hill of glass in half. long before she returned, however, and had hewn through the glass, the children saw her from afar, and he sat down close to it, and was so slippery that it was impossible for the nixie to cross it. To summarize, you can see from the output, that we actually formed a story of a water-nixie in our training corpus. However, our LSTM does not merely output the text, but it adds more color to that story by introducing new things, such as talking about a church and flowers, which were not found in the original text. To write modern natural language processing applications using deep learning algorithms and TensorFlow, you may refer to this book Natural Language Processing with TensorFlow. What is LSTM? Implement Long-short Term Memory (LSTM) with TensorFlow Recurrent Neural Network and the LSTM Architecture
Read more
  • 0
  • 1
  • 10030
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-why-mesos
Packt
31 Mar 2016
8 min read
Save for later

Why Mesos?

Packt
31 Mar 2016
8 min read
In this article by Dipa Dubhasi and Akhil Das authors of the book Mastering Mesos, delves into understanding the importance of Mesos. Apache Mesos is an open source, distributed cluster management software that came out of AMPLab, UC Berkeley in 2011. It abstracts CPU, memory, storage, and other computer resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. It is referred to as a metascheduler (scheduler of schedulers) and a "distributed systems kernel/distributed datacenter OS". It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be deployed by leveraging its pluggable architecture. It is scalable and efficient and provides a host of features, such as resource isolation and high availability, which, along with a strong and vibrant open source community, makes this one of the most exciting projects. (For more resources related to this topic, see here.) Introduction to the datacenter OS and architecture of Mesos Over the past decade, datacenters have graduated from packing multiple applications into a single server box to having large datacenters that aggregate thousands of servers to serve as a massively distributed computing infrastructure. With the advent of virtualization, microservices, cluster computing, and hyper-scale infrastructure, the need of the hour is the creation of an application-centric enterprise that follows a software-defined datacenter strategy. Currently, server clusters are predominantly managed individually, which can be likened to having multiple operating systems on the PC, one each for processor, disk drive, and so on. With an abstraction model that treats these machines as individual entities being managed in isolation, the ability of the datacenter to effectively build and run distributed applications is greatly reduced. Another way of looking at the situation is comparing running applications in a datacenter to running them on a laptop. One major difference is that while launching a text editor or web browser, we are not required to check which memory modules are free and choose ones that suit our need. Herein lies the significance of a platform that acts like a host operating system and allows multiple users to run multiple applications simultaneously by utilizing a shared set of resources. Datacenters now run varied distributed application workloads, such as Spark, Hadoop, and so on, and need the capability to intelligently match resources and applications. The datacenter ecosystem today has to be equipped to manage and monitor resources and efficiently distribute workloads across a unified pool of resources with the agility and ease to cater to a diverse user base (noninfrastructure teams included). A datacenter OS brings to the table a comprehensive and sustainable approach to resource management and monitoring. This not only reduces the cost of ownership but also allows a flexible handling of resource requirements in a manner that isolated datacenter infrastructure cannot support. The idea behind a datacenter OS is that of an intelligent software that sits above all the hardware in a datacenter and ensures efficient and dynamic resource sharing. Added to this is the capability to constantly monitor resource usage and improve workload and infrastructure management in a seamless way that is not tied to specific application requirements. In its absence, we have a scenario with silos in a datacenter that force developers to build software catering to machine-specific characteristics and make the moving and resizing of applications a highly cumbersome procedure. The datacenter OS acts as a software layer that aggregates all servers in a datacenter into one giant supercomputer to deliver the benefits of multilatency, isolation, and resource control across all microservice applications. Another major advantage is the elimination of human-induced error during the continual assigning and reassigning of virtual resources. From a developer's perspective, this will allow them to easily and safely build distributed applications without restricting them to a bunch of specialized tools, each catering to a specific set of requirements. For instance, let's consider the case of Data Science teams who develop analytic applications that are highly resource intensive. An operating system that can simplify how the resources are accessed, shared, and distributed successfully alleviates their concern about reallocating hardware every time the workloads change. Of key importance is the relevance of the datacenter OS to DevOps, primarily a software development approach that emphasizes automation, integration, collaboration, and communication between traditional software developers and other IT professionals. With a datacenter OS that effectively transforms individual servers into a pool of resources, DevOps teams can focus on accelerating development and not continuously worry about infrastructure issues. In a world where distributed computing becomes the norm, the datacenter OS is a boon in disguise. With freedom from manually configuring and maintaining individual machines and applications, system engineers need not configure specific machines for specific applications as all applications would be capable of running on any available resources from any machine, even if there are other applications already running on them. Using a datacenter OS results in centralized control and smart utilization of resources that eliminate hardware and software silos to ensure greater accessibility and usability even for noninfrastructural professionals. Examples of some organizations administering their hyperscale datacenters via the datacenter OS are Google with the Borg (and next geneneration Omega) systems. The merits of the datacenter OS are undeniable, with benefits ranging from the scalability of computing resources and flexibility to support data sharing across applications to saving team effort, time, and money while launching and managing interoperable cluster applications. It is this vision of transforming the datacenter into a single supercomputer that Apache Mesos seeks to achieve. Born out of a Berkeley AMPLab research paper in 2011, it has since come a long way with a number of leading companies, such as Apple, Twitter, Netflix, and AirBnB among others, using it in production. Mesosphere is a start-up that is developing a distributed OS product with Mesos at its core. The architecture of Mesos Mesos is an open-source platform for sharing clusters of commodity servers between different distributed applications (or frameworks), such as Hadoop, Spark, and Kafka among others. The idea is to act as a centralized cluster manager by pooling together all the physical resources of the cluster and making it available as a single reservoir of highly available resources for all the different frameworks to utilize. For example, if an organization has one 10-node cluster (16 CPUs and 64 GB RAM) and another 5-node cluster (4 CPUs and 16 GB RAM), then Mesos can be leveraged to pool them into one virtual cluster of 720 GB RAM and 180 CPUs, where multiple distributed applications can be run. Sharing resources in this fashion greatly improves cluster utilization and eliminates the need for an expensive data replication process per-framework. Some of the important features of Mesos are: Scalability: It can elastically scale to over 50,000 nodes Resource isolation: This is achieved through Linux/Docker containers Efficiency: This is achieved through CPU and memory-aware resource scheduling across multiple frameworks High availability: This is through Apache ZooKeeper Interface: A web UI for monitoring the cluster state Mesos is based on the same principles as the Linux kernel and aims to provide a highly available, scalable, and fault-tolerant base for enabling various frameworks to share cluster resources effectively and in isolation. Distributed applications are varied and continuously evolving, a fact that leads Mesos' design philosophy towards a thin interface that allows an efficient resource allocation between different frameworks and delegates the task of scheduling and job execution to the frameworks themselves. The two advantages of doing so are: Different frame data replication works can independently devise methods to address their data locality, fault-tolerance, and other such needs. It simplifies the Mesos codebase and allows it to be scalable, flexible, robust, and agile Mesos' architecture hands over the responsibility of scheduling tasks to the respective frameworks by employing a resource offer abstraction that packages a set of resources and makes offers to each framework. The Mesos master node decides the quantity of resources to offer each framework, while each framework decides which resource offers to accept and which tasks to execute on these accepted resources. This method of resource allocation is shown to achieve good degree of data locality for each framework sharing the same cluster. An alternative architecture would implement a global scheduler that took framework requirements, organizational priorities, and resource availability as inputs and provided a task schedule breakdown by framework and resource as output, essentially acting as a matchmaker for jobs and resources with priorities acting as constraints. The challenges with this architecture, such as developing a robust API that could capture all the varied requirements of different frameworks, anticipating new frameworks, and solving a complex scheduling problem for millions of jobs, made the former approach a much more attractive option for the creators. Summary Thus in this article, we introduced Mesos, and then dived deep into its architecture to understand importance of Mesos. Resources for Article:   Further resources on this subject: Understanding Mesos Internals [article] Leveraging Python in the World of Big Data [article] Self-service Business Intelligence, Creating Value from Data [article]
Read more
  • 0
  • 0
  • 10027

article-image-oracle-hyperion-pivots
Packt
08 Sep 2010
7 min read
Save for later

Oracle Hyperion: Pivots

Packt
08 Sep 2010
7 min read
(For more resources on Oracle, see here.) Creating Pivot sections Pivots are added to a document by selecting the New Pivot item from the Insert Menu. When a new Pivot section is added the document, the Pivot will appear in the Section Catalogue with the default name Pivot. Renaming the section is consistent with other sections, where the section is renamed using one of the three following options: Right-click method Edit menu method Double-click method Right-click the section in the Section Catalogue and select Rename Section Select Rename Section from the Edit menu while viewing the section Double-click the name of the section in the Section Title Bar Change the section to the desired name and select OK Change the section to the desired name and select OK Change the section to the desired name and select OK   Pivot this Chart Interactive Reporting provides the ability to create a Pivot from the data contained in a Chart section. While viewing a chart, the Pivot this Chart menu option in the Insert menu can be selected. Upon selecting the menu item, a new Pivot section is added to the document and is activated in the main window. Interactive Reporting translates the axis information of the Chart to the Pivot row and column labels and the Fact elements to the Facts in the Pivot. This feature can be used with all chart types except the Scatter and Bubble charts. Pivots created using results or tables Pivot sections can be added underneath a Results section or Table section to allow for reporting on subsets of data. When a Pivot is added to the document and a Query section, Results section, or a section reporting off of a Results section is active (displayed in the main window and highlighted in the Section Catalogue), the Pivot will be allowed access to all of the elements and will reflect all of the data stored in the Results section. However, a Pivot may also be added to report off a subset of results contained in a Table section. When a Pivot is added to the document and the parent section is a Table section or presentation section with a Table section as the parent, the Pivot appears in the Section Catalogue indented underneath the Table section to denote the section is reporting off of the data contained in a table. Notice how the US Sales Pivot and US Sales Chart in the following screenshot appear indented underneath the US Sales Table. This indentation denotes that the Sales Pivot and Chart are both presenting information from the US Sales Table. Notice also how the Sales Pivot and Cost Pivot appears underneath the Results section and not indented. This shows that the Sales Pivot and the Cost Pivot present information from the Results section. Removing Pivot sections Pivot sections are removed from the document using one of the two options shown in the following table: Right-click method Edit menu method Right-click the section in the Section Catalogue and select Delete Section Select Delete Section from the Edit menu Confirm section delete by selecting Delete from the Delete Section confirmation window Confirm section delete by selecting Delete from the Delete Section confirmation window Adding content Once a Pivot is added to the document, data from the parent Results or Table section may be added to the Pivot by adding elements to the Data Layout window in the section. The Data Layout section is displayed at the bottom of the main window and can be shown or hidden by clicking on the Data Layout button in the Section Title Bar. An example of the Data Layout window is shown in the following screenshot: The Data Layout window contains three different areas controlling the display of content in the Pivot named Row Labels, Column Labels, and Facts. Items are added to the different areas by the following three methods: Drag-and-drop method Right-click method Pivot menu method Highlight the column(s) in the Elementswindow Highlight and right-click the column(s) in the Elements window Highlight the column(s) in the Elements Window Left-click and drag the column(s) to the Row Labels, Column Labels, or Fact area in the Data Layout window Select from the menu the desired location to add the column(s) to the pivot Click on the Pivot menu and highlight the Add Select Items menu item     Select from menu the desired location to add the column(s) in the pivot Row and column labels Row labels are the items displayed on the left-hand side of the Pivot, and column labels are the items displayed on the top of the Pivot. Multiple elements may be added to both the row and column labels, where the order of the columns drives the order and summarization of the data in the Pivot. Facts Facts are elements of data summarized by the elements added to the row and column labels. Facts are typically numeric data elements, but Interactive Reporting also allows for text columns to be added as a Fact. While numeric items are aggregated by a data function, text items are separated by commas creating a string of values. Data functions The Pivot differentiates the aggregation of each Fact by the Data function. The default is the Sum function and can be changed to another preset configuration. Other functions include Average, Count, Count Distinct, Maximum, Minimum, % of Column, % of Row, and % of Grand. The data function of a fact item can easily be adjusted by one of the following methods: Right-click method Pivot menu method Highlight the Fact column of data in the pivot. Highlight the Fact column of data in the pivot. Right-click and highlight the Data Function menu item Open the Pivot menu and highlight the Data Function menu item Select the Data Function of interest Select the Data Function of interest Removing content Items in the Row labels, Column labels, and Facts are easily removed. Use one of the following steps to remove a column from the Data Layout: Right-click method Keyboard method Pivot menu method Highlight and right-click on the column(s) in the Data Layout Highlight the column(s) in the Data Layout or the entire column in the Pivot Highlight the column(s) in the Data Layout Select Remove Press the Delete key on the keyboard Open the Pivot menu and select the Remove Selected Items menu item Arranging label content The Pivot section provides flexibility to easily modify and rearrange content while performing analysis. The first and most commonly used method of rearranging content is the ability to easily move row and column element in the pivot. Row and column elements are rearranged by dragging-and-dropping elements in the Data Layout, or by highlighting and moving the column of data in the physical pivot. A tab exists at the end of each row and column label allowing for the highlighting of the entire column as shown in the following screenshot. When the element is highlighted, the column can be pivoted from the row labels to the column labels, and the row and column labels can be rearranged respectively by dragging the element in front or behind other sections. When the element is dragged, a graphical outline appears as shown in the following screenshot:
Read more
  • 0
  • 0
  • 10026

article-image-why-ajax-is-a-different-type-of-software
Packt
27 Jan 2011
14 min read
Save for later

Why Ajax is a different type of software

Packt
27 Jan 2011
14 min read
Ajax is not a piece of software in the way we think about JavaScript or CSS being a piece of software. It's actually a lot more like an overlaid function. But what does that mean, exactly? Why Ajax is a bit like human speech Human speech is an overlaid function. What is meant by this is reflected in the answer to a question: "What part of the human body has the basic job of speech?" The tongue, for one answer, is used in speech, but it also tastes food and helps us swallow. The lungs and diaphragm, for another answer, perform the essential task of breathing. The brain cannot be overlooked, but it also does a great many other jobs. All of these parts of the body do something more essential than speech and, for that matter, all of these can be found among animals that cannot talk. Speech is something that is overlaid over organs that are there in the first place because of something other than speech. Something similar to this is true for Ajax, which is not a technology in itself, but something overlaid on top of other technologies. Ajax, some people say, stands for Asynchronous JavaScript and XML, but that was a retroactive expansion. JavaScript was introduced almost a decade before people began seriously talking about Ajax. Not only is it technically possible to use Ajax without JavaScript (one can substitute VBScript at the expense of browser compatibility), but there are quite a few substantial reasons to use JavaScript Object Notation (JSON) in lieu of heavy-on-the-wire eXtensible Markup Language (XML). Performing the overlaid function of Ajax with JSON replacing XML is just as eligible to be considered full-fledged Ajax as a solution incorporating XML. Ajax helps the client talk to the server Ajax is a way of using client-side technologies to talk with a server and perform partial page updates. Updates may be to all or part of the page, or simply to data handled behind the scenes. It is an alternative to the older paradigm of having a whole page replaced by a new page loaded when someone clicks on a link or submits a form. Partial page updates, in Ajax, are associated with Web 2.0, while whole page updates are associated with Web 1.0; it is important to note that "Web 2.0" and "Ajax" are not interchangeable. Web 2.0 includes more decentralized control and contributions besides Ajax, and for some objectives it may make perfect sense to develop an e-commerce site that uses Ajax but does not open the door to the same kind of community contributions as Web 2.0. Some of the key features common in Web 2.0 include: Partial page updates with JavaScript communicating with a server and rendering to a page An emphasis on user-centered design Enabling community participation to update the website Enabling information sharing as core to what this communication allows The concept of "partial page updates" may not sound very big, but part of its significance may be seen in an unintended effect. The original expectation of partial page updates was that it would enable web applications that were more responsive. The expectation was that if submitting a form would only change a small area of a page, using Ajax to just load the change would be faster than reloading the entire page for every minor change. That much was true, but once programmers began exploring, what they used Ajax for was not simply minor page updates, but making client-side applications that took on challenges more like those one would expect a desktop program to do, and the more interesting Ajax applications usually became slower. Again, this was not because you could not fetch part of the page and update it faster, but because programmers were trying to do things on the client side that simply were not possible under the older way of doing things, and were pushing the envelope on the concept of a web application and what web applications can do. Which technologies does Ajax overlay? Now let us look at some of the technologies where Ajax may be said to be overlaid. JavaScript JavaScript deserves pride of place, and while it is possible to use VBScript for Internet Explorer as much more than a proof of concept, for now if you are doing Ajax, it will almost certainly be Ajax running JavaScript as its engine. Your application will have JavaScript working with XMLHttpRequest, JavaScript working with HTML, XHTML, or HTML5; JavaScript working with the DOM, JavaScript working with CSS, JavaScript working with XML or JSON, and perhaps JavaScript working with other things. While addressing a group of Django developers or Pythonistas, it would seem appropriate to open with, "I share your enthusiasm." On the other hand, while addressing a group of JavaScript programmers, in a few ways it is more appropriate to say, "I feel your pain." JavaScript is a language that has been discovered as a gem, but its warts were enough for it to be largely unappreciated for a long time. "Ajax is the gateway drug to JavaScript," as it has been said—however, JavaScript needs a gateway drug before people get hooked on it. JavaScript is an excellent language and a terrible language rolled into one. Before discussing some of the strengths of JavaScript—and the language does have some truly deep strengths—I would like to say "I feel your pain" and discuss two quite distinct types of pain in the JavaScript language. The first source of pain is some of the language decisions in JavaScript: The Wikipedia article says it was designed to resemble Java but be easier for non-programmers, a decision reminiscent of SQL and COBOL. The Java programmer who finds the C-family idiom of for(i = 0; i < 100; ++i) available will be astonished to find that the functions are clobbering each other's assignments to i until they are explicitly declared local to the function by declaring the variables with var. There is more pain where that came from. The following two functions will not perform the naively expected mathematical calculation correctly; the assignments to i and the result will clobber each other: function outer() { result = 0; for(i = 0; i < 100; ++i) { result += inner(i); } return result } function inner(limit) { result = 0; for(i = 0; i < limit; ++i) { result += i; } return result; } The second source of pain is quite different. It is a pain of inconsistent implementation: the pain of, "Write once, debug everywhere." Strictly speaking, this is not JavaScript's fault; browsers are inconsistent. And it need not be a pain in the server-side use of JavaScript or other non-browser uses. However, it comes along for the ride for people who wish to use JavaScript to do Ajax. Cross-browser testing is a foundational practice in web development of any stripe; a good web page with semantic markup and good CSS styling that is developed on Firefox will usually look sane on Internet Explorer (or vice versa), even if not quite pixel-perfect. But program directly for the JavaScript implementation on one version of a browser, and you stand rather sharp odds of your application not working at all on another browser. The most important object by far for Ajax is the XMLHttpRequest and not only is it not the case that you may have to do different things to get an XMLHttpRequest in different browsers or sometimes different (common) versions of the same browser, and, even when you have code that will get an XMLHttpRequest object, the objects you have can be incompatible so that code that works on one will show strange bugs for another. Just because you have done the work of getting an XMLHttpRequest object in all of the major browsers, it doesn't mean you're home free. Before discussing some of the strengths of the JavaScript language itself, it would be worth pointing out that a good library significantly reduces the second source of pain. Almost any sane library will provide a single, consistent way to get XMLHttpRequest functionality, and consistent behavior for the access it provides. In other words, one of the services provided by a good JavaScript library is a much more uniform behavior, so that you are programming for only one model, or as close as it can manage, and not, for instance, pasting conditional boilerplate code to do simple things that are handled differently by different browser versions, often rendering surprisingly different interpretations of JavaScript. Many of the things we will see done well as we explore jQuery are also done well in other libraries. We previously said that JavaScript is an excellent language and a terrible language rolled into one; what is to be said in favor of JavaScript? The list of faults is hardly all that is wrong with JavaScript, and saying that libraries can dull the pain is not itself a great compliment. But in fact, something much stronger can be said for JavaScript: If you can figure out why Python is a good language, you can figure out why JavaScript is a good language. I remember, when I was chasing pointer errors in what became 60,000 lines of C, teasing a fellow student for using Perl instead of a real language. It was clear in my mind that there were interpreted scripting languages, such as the bash scripting that I used for minor convenience scripts, and then there were real languages, which were compiled to machine code. I was sure that a real language was identified with being compiled, among other things, and that power in a language was the sort of thing C traded in. (I wonder why he didn't ask me if he wasn't a real programmer because he didn't spend half his time chasing pointer errors.) Within the past year or so I've been asked if "Python is a real programming language or is just used for scripting," and something similar to the attitude shift I needed to appreciate Perl and Python is needed to properly appreciate JavaScript. The name "JavaScript" is unfortunate; like calling Python "Assembler Kit", it's a way to ask people not to see its real strengths. (Someone looking for tools for working on an assembler would be rather disgusted to buy an "Assembler Kit" and find Python inside. People looking for Java's strengths in JavaScript will almost certainly be disappointed.) JavaScript code may look like Java in an editor, but the resemblance is a façade; besides Mocha, which had been renamed LiveScript, being renamed to JavaScript just when Netscape was announcing Java support in web browsers, it is has been described as being descended from NewtonScript, Self, Smalltalk, and Lisp, as well as being influenced by Scheme, Perl, Python, C, and Java. What's under the Java façade is pretty interesting. And, in the sense of the simplifying "façade" design pattern, JavaScript was marketed in a way almost guaranteed not to communicate its strengths to programmers. It was marketed as something that nontechnical people could add snippets of, in order to achieve minor, and usually annoying, effects on their web pages. It may not have been a toy language, but it sure was dressed up like one. Python may not have functions clobbering each other's variables (at least not unless they are explicitly declared global), but Python and JavaScript are both multiparadigm languages that support object-oriented programming, and their versions of "object-oriented" have a lot in common, particularly as compared to (for instance) Java. In Java, an object's class defines its methods and the type of its fields, and this much is set in stone. In Python, an object's class defines what an object starts off as, but methods and fields can be attached and detached at will. In JavaScript, classes as such do not exist (unless simulated by a library such as Prototype), but an object can inherit from another object, making a prototype and by implication a prototype chain, and like Python it is dynamic in that fields can be attached and detached at will. In Java, the instanceof keyword is important, as are class casts, associated with strong, static typing. Python doesn't have casts, and its isinstance() function is seen by some as a mistake. The concern is that Python, like JavaScript, is a duck-typing language: "If it looks like a duck, and it quacks like a duck, it's a duck!" In a duck-typing language, if you write a program that polls weather data, and there's a ForecastFromScreenscraper object that is several years old and screenscrapes an HTML page, you should be able to write a ForecastFromRSS object that gets the same information much more cleanly from an RSS feed. You should be able to use it as a drop-in replacement as long as you have the interface right. That is different from Java; at least if it were a ForecastFromScreenscraper object, code would break immediately if you handed it a ForecastFromRSS object. Now, in fairness to Java, the "best practices" Java way to do it would probably separate out an IForecast interface, which would be implemented by both ForecastFromScreenscraper and later ForecastFromRSS, and Java has ways of allowing drop-in replacements if they have been explicitly foreseen and planned for. However, in duck-typed languages, the reality goes beyond the fact that if the people in charge designed things carefully and used an interface for a particular role played by an object, you can make a drop-in replacement. In a duck-typed language, you can make a drop-in replacement for things that the original developers never imagined you would want to replace. JavaScript's reputation is changing. More and more people are recognizing that there's more to the language than design flaws. More and more people are looking past the fact that JavaScript is packaged like Java, like packaging a hammer to give the impression that it is basically like a wrench. More and more people are looking past the silly "toy language" Halloween costume that JavaScript was stuffed into as a kid. One of the ways good programmers grow is by learning new languages, and JavaScript is not just the gateway to mainstream Ajax; it is an interesting language in itself. With that much stated, we will be making a carefully chosen, selective use of JavaScript, and not make a language lover's exploration of the JavaScript language, overall. Much of our work will be with the jQuery library; if you have just programmed a little "bare JavaScript", discovering jQuery is a bit like discovering Python, in terms of a tool that cuts like a hot knife through butter. It takes learning, but it yields power and interesting results soon as well as having some room to grow. What is XMLHttpRequest in relation to Ajax? The XMLHttpRequest object is the reason why the kind of games that can be implemented with Ajax technologies do not stop at clones of Tetris and other games that do not know or care if they are attached to a network. They include massive multiplayer online role-playing games where the network is the computer. Without having something like XMLHttpRequest, "Ajax chess" would probably mean a game of chess against a chess engine running in your browser's JavaScript engine; with XMLHttpRequest, "Ajax chess" is more likely man-to-man chess against another human player connected via the network. The XMLHttpRequest object is the object that lets Gmail, Google Maps, Bing Maps, Facebook, and many less famous Ajax applications deliver on Sun's promise: the network is the computer. There are differences and some incompatibilities between different versions of XMLHttpRequest, and efforts are underway to advance "level-2-compliant" XMLHttpRequest implementations, featuring everything that is expected of an XMLHttpRequest object today and providing further functionality in addition, somewhat in the spirit of level 2 or level 3 CSS compliance. We will not be looking at level 2 efforts, but we will look at the baseline of what is expected as standard in most XMLHttpRequest objects. The basic way that an XMLHttpRequest object is used is that the object is created or reused (the preferred practice usually being to reuse rather than create and discard a large number), a callback event handler is specified, the connection is opened, the data is sent, and then when the network operation completes, the callback handler retrieves the response from XMLHttpRequest and takes an appropriate action. A bare-bones XMLHttpRequest object can be expected to have the following methods and properties.
Read more
  • 0
  • 0
  • 10014

Packt
11 Aug 2015
17 min read
Save for later

Ext JS 5 – an Introduction

Packt
11 Aug 2015
17 min read
In this article by Carlos A. Méndez, the author of the book Learning Ext JS - Fourth Edition, we will see some of the important features in Ext JS. When learning a new technology such as Ext JS, some developers face a hard time to begin with, so this article will cover up certain important points that have been included in the recent version of Ext JS. We will be referencing certain online documentations, blogs and forums looking for answers, trying to figure out how the library and all the components work together. Even though there are tutorials in the official learning center, it would be great to have a guide to learn the library from the basics to a more advanced level. Ext JS is a state-of-the-art framework to create Rich Internet Applications (RIAs). The framework allows us to create cross-browser applications with a powerful set of components and widgets. The idea behind the framework is to create user-friendly applications in rapid development cycles, facilitate teamwork (MVC or MVVM), and also have a long-term maintainability. Ext JS is not just a library of widgets anymore; the brand new version is a framework full of new exciting features for us to play with. Some of these features are the new class system, the loader, the new application package, which defines a standard way to code our applications, and much more awesome stuff. The company behind the Ext JS library is Sencha Inc. They work on great products that are based on web standards. Some of the most famous products that Sencha also have are Sencha Touch and Sencha Architect. In this article, we will cover some of the basic concepts of the framework of version 5 and take a look at some of the new features in Ext JS 5. (For more resources related to this topic, see here.) Considering Ext JS for your next project Ext JS is a great library to create RIAs that require a lot of interactivity with the user. If you need complex components to manage your information, then Ext is your best option because it contains a lot of widgets such as the grid, forms, trees, panels, and a great data package and class system. Ext JS is best suited for enterprise or intranet applications; it's a great tool to develop an entire CRM or ERP software solution. One of the more appealing examples is the Desktop sample (http://dev.sencha.com/ext/5.1.0/examples/desktop/index.html). It really looks and feels like a native application running in the browser. In some cases, this is an advantage because the users already know how to interact with the components and we can improve the user experience. Ext JS 5 came out with a great tool to create themes and templates in a very simple way. The framework for creating themes is built on top of Compass and Sass, so we can modify some variables and properties and in a few minutes we can have a custom template for our Ext JS applications. If we want something more complex or unique, we can modify the original template to suit our needs. This might be more time-consuming depending on our experience with Compass and Sass. Compass and Sass are extensions for CSS. We can use expressions, conditions, variables, mixins, and many more awesome things to generate well-formatted CSS. You can learn more about Compass on their website at http://compass-style.org/. The new class system allows us to define classes incredibly easily. We can develop our application using the object-oriented programming paradigm and take advantage of the single and multiple inheritances. This is a great advantage because we can implement any of the available patterns such as MVC, MVVM, Observable, or any other. This will allow us to have a good code structure, which leads us to have easy access for maintenance. Another thing to keep in mind is the growing community around the library; there are lot of people around the world that are working with Ext JS right now. You can even join the meeting groups that have local reunions frequently to share knowledge and experiences; I recommend you to look for a group in your city or create one. The new loader system is a great way to load our modules or classes on demand. We can load only the modules and applications that the user needs just in time. This functionality allows us to bootstrap our application faster by loading only the minimal code for our application to work. One more thing to keep in mind is the ability to prepare our code for deployment. We can compress and obfuscate our code for a production environment using the Sencha Command, a tool that we can run on our terminal to automatically analyze all the dependencies of our code and create packages. Documentation is very important and Ext JS has great documentation, which is very descriptive with a lot of examples, videos, and sample code so that we can see it in action right on the documentation pages, and we can also read the comments from the community. What's new in Ext JS 5 Ext JS 5 introduces a great number of new features, and we'll briefly cover a few of the significant additions in version 5 as follows: Tablet support and new themes: This has introduced the ability to create apps compatible with touch-screen devices (touch-screen laptops, PCs, and tablets). The Crisp theme is introduced and is based on the Neptune theme. Also, there are new themes for tablet support, which are Neptune touch and Crisp touch. New application architecture – MVVM: Adding a new alternative to MVC Sencha called MVVM (which stands for Model-View-ViewModel), this new architecture has data binding and two-way data binding, allowing us to decrease much of the extra code that some of us were doing in past versions. This new architecture introduces: Data binding View controllers View models Routing: Routing provides deep linking of application functionality and allows us to perform certain actions or methods in our application by translating the URL. This gives us the ability to control the application state, which means that we can go to a specific part or a direct link to our application. Also, it can handle multiple actions in the URL. Responsive configurations: Now we have the ability to set the responsiveConfig property (new property) to some components, which will be a configuration object that represents conditions and criteria on which the configurations set will be applied, if the rule meets these configurations. As an example: responsiveConfig: { 'width > 800': { region: 'west' }, 'width <= 800':{ region: 'north' } } Data package improvements: Some good changes came in version 5 relating to data handling and data manipulation. These changes allowed developers an easier journey in their projects, and some of the new things are: Common Data (the Ext JS Data class, Ext.Data, is now part of the core package) Many-to-many associations Chained stores Custom field types Event system: The event logic was changed, and is now a single listener attached at the very top of the DOM hierarchy. So this means when a DOM element fires an event, it bubbles to the top of the hierarchy before it's handled. So Ext JS intercepts this and checks the relevant listeners you added to the component or store. This reduces the number of interactions on the DOM and also gives us the ability to enable gestures. Sencha Charts: Charts can work on both Ext JS and Sencha Touch, and have enhanced performance on tablet devices. Legacy Ext JS 4 charts were converted into a separate package to minimize the conversion/upgrade. In version 5, charts have new features such as: Candlestick and OHLC series Pan, zoom, and crosshair interactions Floating axes Multiple axes SVG and HTML Canvas support Better performance Greater customization Chart themes Tab Panels: Tab panels have more options to control configurations such as icon alignment and text rotation. Thanks to new flexible Sass mixins, we can easily control presentation options. Grids: This component, which has been present since version 2x, is one of the most popular components, and we may call it one of the cornerstones of this framework. In version 5, it got some awesome new features: Components in Cells Buffered updates Cell updaters Grid filters (The popular "UX" (user extension) has been rewritten and integrated into the framework. Also filters can be saved in the component state.) Rendering optimizations Widgets: This is a lightweight component, which is a middle ground between Ext.Component and the Cell renderer. Breadcrumb bars: This new component displays the data of a store (a specific data store for the tree component) in a toolbar form. This new control can be a space saver on small screens or tablets. Form package improvements: Ext JS 5 introduces some new controls and significant changes on others: Tagfield: This is a new control to select multiple values. Segmented buttons: These are buttons with presentation such as multiple selection on mobile interfaces. Goodbye to TriggerField: In version 5, TriggerField is deprecated and now the way to create triggers is by using the Text field and implementing the triggers on the TextField configuration. (TriggerField in version 4 is a text field with a configured button(s) on the right side.)  Field and Form layouts: Layouts were refactored using HTML and CSS, so there is improvement as the performance is now better. New SASS Mixins (http://sass-lang.com/): Several components that were not able to be custom-themed now have the ability to be styled in multiple ways in a single theme or application. These components are: Ext.menu.Menu Ext.form.Labelable Ext.form.FieldSet Ext.form.CheckboxGroup Ext.form.field.Text Ext.form.field.Spinner Ext.form.field.Display Ext.form.field.Checkbox The Sencha Core package: The core package contains code shared between Ext JS and Sencha Touch and in the future, this core will be part of the next major release of Sencha Touch. The Core includes: Class system Data Events Element Utilities Feature/environment detection Preparing for deployment So far, we have seen a few features that helps to architect a JavaScript code; but we need to prepare our application for a production environment. So initially, when an application is in the development environment, we need to make sure that Ext JS classes (also our own classes) are dynamically loaded when the application requires to use them. In this environment, it's really helpful to load each class in a separate file. This will allow us to debug the code easily, and find and fix bugs. Now, before the application is compiled, we must know the three basic parts of an application, as marked here: app.json: This file contains specific details about our application. Also, Sencha CMD processes this file first. build.xml: This file contains a minimal initial Ant script, and imports a task file located at .sencha/app/build-impl.xml. .sencha: This folder contains many files related to, and are to be used for, the build process. The app.json file As we said before, the app.json file contains the information about the settings of the application. Open the file and take a look. We can make changes to this file, such as the theme that our application is going to use. For example, we can use the following line of code: "theme": "my-custom-theme-touch", Alternatively, we can use a normal theme: "theme": "my-custom-theme", We can also use the following for using charts: "requires": [ "sencha-charts" ], This was to specify that we are going to use the charts or draw classes in our application (the chart package for Ext JS 5). Now, at the end of the file, there is an ID for the application: "id": "7833ee81-4d14-47e6-8293-0cb8120281ab" After this ID, we can add other properties. As an example, suppose the application will be generated for Central and South America. Then we need to include the locale (ES or PT), so we can add the following: ,"locales":["es"] We can also add multiple languages: ,"locales":["es","pt","en"] This will cause the compilation process to include the corresponding locale files located at ext/packages/ext-locale/build. However, this article can't cover each property in the file, so it's recommended that you take a deep look into the Sencha CMD documentation at: http://docs-origin.sencha.com/cmd/5.x/microloader.html to learn more about the app.json file. The Sencha command To create our production build, we need to use the Sencha Command. This tool will help us in our purpose. If you are running Sencha CMD on Windows 7 or Windows 8, it's recommended that you run the tool with "administrator privileges". So let's type this in our console tool: [path of my app]\sencha app build In my case (Windows OS 7; 64-bit), I typed: K:\x_extjsdev\app_test\myapp>sencha app build After the command runs, you will see something like this in your console tool: So, let's check out the build folder inside our application folder. We may have the following list of files: Notice that the build process has created these: resources: This file will contain a copy of our resources folder, plus one or more CSS files starting with myApp-all app.js: This file contains all of the necessary JS (Ext JS core classes, components, and our custom application classes) app.json: This is a small manifest file compressed index.html: This file is similar to our index file in development mode, except for the line: <script id="microloader" type="text/javascript" src="bootstrap.js"></script> This was replaced by some compressed JavaScript code, which will act in a similar way to the micro loader. Notice that the serverside folder, where we use some JSON files (other cases can be PHP, ASP, and so on), does not exist in the production folder. Well, the reason is that that folder is not part of what Sencha CMD and build files consider. Normally, many developers will say, "Hey, let's copy the folder and let's move on." However, the good news is that we can include that folder with an Apache Ant task Customizing the build.xml file We can add custom code (Apache Ant style) to perform new tasks and things we need in order to make our application build even better. Let's open the build.xml file. You will see something like this: <?xml version="1.0" encoding="utf-8"?> <project name="myApp" default=".help"> <!-- comments... --> <import file="${basedir}/.sencha/app/build-impl.xml"/> <!-- comments... --> </project> So, let's place the following code before </project>: <target name="-after-build" depends="init"> <copy todir="${build.out.base.path}/serverside" overwrite="false"> <fileset dir="${app.dir}/serverside" includes="**/*"/> </copy> </target> </project> This new code inside the build.xml file establishes that after making the whole building process, if there is no error during the Init process then it will copy the (${app.dir}/ serverside) folder to the (${build.out.base.path}/serverside) output path. So now, let's type the command for building the application again: sencha app build –c In this case, we added -c to first clean the build/production folder and create a new set of files. After the process completes, take a look at the folder contents, and you will see this: Notice that now the serverside folder has been copied to the production build folder, thanks to the custom code we placed in build.xml file. Compressing the code After building our application, let's open the app.js file. We may see something like what is shown here: By default, the build process uses the YUI compressor to compact the JS code (http://yui.github.io/yuicompressor/). Inside the .sencha folder, there are many files, and depending on the type of build we are creating, there are some files such as the base file, where the properties are defined in defaults.properties. This file must not be changed whatsoever; for that, we have other files that can override the values defined in this file. As an example for the production build, we have the following files: production.defaults.properties: This file will contain some properties/variables that will be used for the production build. production.properties: This file has only comments. The idea behind this file is that developers place the variables they want in order to customize the production build. By default, in the production.defaults.properties file, you will see something like the following code: # Comments ...... # more comments...... build.options.logger=no build.options.debug=false # enable the full class system optimizer app.output.js.optimize=true build.optimize=${build.optimize.enable} enable.cache.manifest=true enable.resource.compression=true build.embedded.microloader.compressor=-closure Now, as an example of compression, let's make a change and place some variables inside the production.properties file. The code we will place here will override the properties set in defaults.properties and production.defaults.properties. So, let's write the following code after the comments: build.embedded.microloader.compressor=-closure build.compression.yui=0 build.compression.closure=1 build.compression=-closure With this code, we are setting up the build process to use closure as the JavaScript compressor and also for the micro loader. Now save the file and use the Sencha CMD tool once again: sencha app build Wait for the process to end and take a look at app.js. You can notice that the code is quite different. This is because the code compiler (closure) was the one that made the compression. Run the app and you will notice no change in the behavior and use of the application. As we have used the production.properties file in this example, notice that in the .sencha folder, we have some other files for different environments, such as: Environment File (or files) Testing testing.defaults.properties and testing.properties Development development.defaults.properties and development.properties Production production.defaults.properties and production.properties It's not recommended that you change the *.default.properties file. That's the reason of the *.properties file, so that you can set your own variables, and doing this will override the settings on default file. Packaging and deploying Finally, after we have built our application, we have our production build/package ready to be deployed. We will have the following structure in our folder: Now we have all the files required to make our application work on a public server. We don't need to upload anything from the Ext JS folder because we have all that we need in app.js (all of the Ext JS code and our code). Also, the resources file contains the images, CSS (the theme used in the app), and of course our serverside folder. So now, we need to upload all of the content to the server: And we are ready to test the production in a public server. Summary In this article, you learned the reasons that will make us to consider using Ext JS 5 for developing projects. We briefly mentioned some of the significant additional features in version 5 that are instrumental in developing applications. Later, we talked about compiling and preparing an application for a production environment. Using Sencha CMD and also configuring JSON or XML files to build a project can sometimes be an overwhelming situation, but don't panic! Check out the documentation of Sencha and Apache. Do remember that there's no reason to be afraid of testing and playing with the configurations. It's all part of learning and knowing how to use Sencha Ext JS. Resources for Article: Further resources on this subject: The Login Page using Ext JS [Article] So, what is Ext JS? [Article] AngularJS Performance [Article]
Read more
  • 0
  • 0
  • 9996
article-image-getting-started-with-microsofts-guidance
Prakhar Mishra
26 Sep 2023
8 min read
Save for later

Getting Started with Microsoft’s Guidance

Prakhar Mishra
26 Sep 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionThe emergence of a massive language model is a watershed moment in the field of artificial intelligence (AI) and natural language processing (NLP). Because of their extraordinary capacity to write human-like text and perform a range of language-related tasks, these models, which are based on deep learning techniques, have earned considerable interest and acceptance. This field has undergone significant scientific developments in recent years. Researchers all over the world have been developing better and more domain-specific LLMs to meet the needs of various use cases.Large Language Models (LLMs) such as GPT-3 and its descendants, like any technology or strategy, have downsides and limits. And, in order to use LLMs properly, ethically, and to their maximum capacity, it is critical to grasp their downsides and limitations. Unlike large language models such as GPT-4, which can follow the majority of commands. Language models that are not equivalently large enough (such as GPT-2, LLaMa, and its derivatives) frequently suffer from the difficulty of not following instructions adequately, particularly the part of instruction that asks for generating output in a specific structure. This causes a bottleneck when constructing a pipeline in which the output of LLMs is fed to other downstream functions.Introducing Guidance - an effective and efficient means of controlling modern language models compared to conventional prompting methods. It supports both open (LLaMa, GPT-2, Alpaca, and so on) and closed LLMs (ChatGPT, GPT-4, and so on). It can be considered as a part of a larger ecosystem of tools for expanding the capabilities of language models.Guidance uses Handlebars - a templating language. Handlebars allow us to build semantic templates effectively by compiling templates into JavaScript functions. Making it’s execution faster than other templating engines. Guidance also integrates well with Jsonformer - a bulletproof way to generate structured JSON from language models. Here’s a detailed notebook on the same. Also, in case you were to use OpenAI from Azure AI then Guidance has you covered - notebook.Moving on to some of the outstanding features that Guidance offers. Feel free to check out the entire list of features.Features1. Guidance Acceleration - This addition significantly improves inference performance by efficiently utilizing the Key/Value caches as we proceed through the prompt by keeping a session state with the LLM inference. Benchmarking revealed a 50% reduction in runtime when compared to standard prompting approaches. Here’s the link to one of the benchmarking exercises. The below image shows an example of generating a character profile of an RPG game in JSON format. The green highlights are the generations done by the model, whereas the blue and no highlights are the ones that are copied as it is from the input prompt, unlike the traditional method that tries to generate every bit of it.SourceNote: As of now, the Guidance Acceleration feature is implemented for open LLMs. We can soon expect to see if working with closed LLMs as well.2.  Token Healing - This feature attempts to correct tokenization artifacts that commonly occur at the border between the end of a prompt and the start of a group of generated tokens.For example - If we ask LLM to auto-complete a URL with the below-mentioned Input, it’s likely to produce the shown output. Apart from the obvious limitation that the URL might not be valid. I'd like to draw your attention to the extra space it creates (highlighted in red). Such considerations make it difficult to construct a dependable parsing function and robustly absorb its result into subsequent phases.Input: “The link is <a href=http:”Actual Output: “The link is <a href=http: //www.google.com/search?q”Expected Output: “The link is <a href=http://www.google.com/search?q” This is the exact bucket of problems that Token Healing tries to solve using the backtracking method. Feel free to check out this jupyter notebook for more examples.3. Guaranteed Output Structure - Large language models are fantastic at producing useful outputs, but not so much at producing outputs in a specified format (especially open-source ones like LLaMa, GPT-2, and so on). When we want to use the output of a language model as input to another system, this is frequently an issue. With Handlebars, guidance guarantees the output format to be the same as what was being asked for.Let’s now see Guidance in action -InstallationInstalling guidance is a breeze, just do a pip :$ pip install guidanceAssume we are now creating a product description for an e-commerce website. Here's how the traditional generation compares to the guidance generation. Feel free to play with this colab notebook with both the below examples.Traditional GenerationInput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. { prod_id: <numeric value of 5 digits>, prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description isOutput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. {   prod_id: <numeric value of 5 digits>,     prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description is { resentprod_id: <numeric value of 5 digits>, resentprod_name: <name begins with the prefix 'p_'>, resentprod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } In the above example, the product description has 5 constraint fields and 5 attribute fields. The constraints are as follows: resentprod_id: - value of 5 digits, resentprod_name: - name of the product, resentprod_price: - price of the product, resentprod_price_suffix: - suffix of the product price, resentprod_id: - the product id, resentpro diabetic_id: value of 4 digits, resentprod_ astronomer_id: - value of 4 digits, resentprod_ star_id: - value of 4 digits, resentprod_is_generic: - if the product is generic and not the generic type, resentprod_type: - the type of the product, resentprod_is_generic_typeHere’s the code for the above example with GPT-2 language model -``` from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("gpt2-large") model = AutoModelForCausalLM.from_pretrained("gpt2-large") inputs = tokenizer(Input, return_tensors="pt") tokens = model.generate( **inputs, max_new_tokens=256, temperature=0.7, do_sample=True, )Output:tokenizer.decode(tokens[0], skip_special_tokens=True)) ```Guidance GenerationInput w/ code:guidance.llm = guidance.llms.Transformers("gpt-large") # define the prompt program = guidance("""Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The following is the format ```json { "prod_id": "{{gen 'id' pattern='[0-9]{5}' stop=','}}", "prod_name": "{{gen 'name' pattern='p_[A-Za-z]+' stop=','}}", "prod_price": "{{gen 'price' pattern='\b([1-9]|1[0-6])\b\$' stop=','}}" }```""") # execute the prompt Output = program()Output:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The following is the format```json { "prod_id": "11231", "prod_name": "p_pizzas", "prod_price": "11$" }```As seen in the preceding instances, with guidance, we can be certain that the output format will be followed within the given restrictions no matter how many times we execute the identical prompt. This capability makes it an excellent choice for constructing any dependable and strong multi-step LLM pipeline.I hope this overview of Guidance has helped you realize the value it may provide to your daily prompt development cycle. Also, here’s a consolidated notebook showcasing all the features of Guidance, feel free to check it out.Author BioPrakhar has a Master’s in Data Science with over 4 years of experience in industry across various sectors like Retail, Healthcare, Consumer Analytics, etc. His research interests include Natural Language Understanding and generation, and has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn
Read more
  • 0
  • 0
  • 9993

article-image-asynchronous-programming-futures-and-promises
Packt
30 Dec 2016
18 min read
Save for later

Asynchronous Programming with Futures and Promises

Packt
30 Dec 2016
18 min read
This article by Aleksandar Prokopec, author of the book Learning Concurrent Programming in Scala - Second Edition, explains the concepts of asynchronous programming in Scala. Asynchronous programming helps you eliminate blocking; instead of suspending the thread whenever a resource is not available, a separate computation is scheduled to proceed once the resource becomes available. In a way, many of the concurrency patterns seen so far support asynchronous programming; thread creation and scheduling execution context tasks can be used to start executing a computation concurrent to the main program flow. Still, it is not straightforward to use these facilities directly when avoiding blocking or composing asynchronous computations. In this article, we will focus on two abstractions in Scala that are specifically tailored for this task—futures and promises. More specifically, we will study the following topics: Starting asynchronous computations and using Future objects Using Promise objects to interface Blocking threads inside asynchronous computations Alternative future frameworks (For more resources related to this topic, see here.) Futures The parallel executions in a concurrent program proceed on entities called threads. At any point, the execution of a thread can be temporarily suspended until a specific condition is fulfilled. When this happens, we say that the thread is blocked. Why do we block threads in the first place in concurrent programming? One of the reasons is that we have a finite amount of resources; so, multiple computations that share these resources sometimes need to wait. In other situations, a computation needs specific data to proceed, and if that data is not yet available, threads responsible for producing the data could be slow or the source of the data could be external to the program. A classic example is waiting for the data to arrive over the network. Let's assume that we have a getWebpage method that returns that webpage's contents when given a url string with the location of the webpage: def getWebpage(url: String): String The return type of the getWebpage method is String; the method must return a string with the webpage's contents. Upon sending an HTTP request, though, the webpage's contents are not available immediately. It takes some time for the request to travel over the network to the server and back before the program can access the document. The only way for the method to return the contents of the webpage as a string value is to wait for the HTTP response to arrive. However, this can take a relatively long period of time from the program's point of view even with a high-speed Internet connection, the getWebpage method needs to wait. Since the thread that called the getWebpage method cannot proceed without the contents of the webpage, it needs to pause its execution; therefore, the only way to correctly implement the getWebpage method is for blocking. We already know that blocking can have negative side effects, so can we change the return value of the getWebpage method to some special value that can be returned immediately? The answer is yes. In Scala, this special value is called a future. The future is a placeholder, that is, a memory location for the value. This placeholder does not need to contain a value when the future is created; the value can be placed into the future eventually by getWebpage. We can change the signature of the getWebpage method to return a future as follows: def getWebpage(url: String): Future[String] Here, the Future[String] type means that the future object can eventually contain a String value. We can now implement getWebpage without blocking—we can start the HTTP request asynchronously and place the webpage's contents into the future when they become available. When this happens, we can say that the getWebpage method completes the future. Importantly, after the future is completed with some value, that value can no longer change. The Future[T] type encodes latency in the program—use it to encode values that will become available later during execution. This removes blocking from the getWebpage method, but it is not clear how the calling thread can extract the content of the future. Polling is one non-blocking way of extracting the content. In the polling approach, the calling thread calls a special method for blocking until the value becomes available. While this approach does not eliminate blocking, it transfers the responsibility of blocking from the getWebpage method to the caller thread. Java defines its own Future type to encode values that will become available later. However, as a Scala developer, you should use Scala's futures instead; they allow additional ways of handling future values and avoid blocking, as we will soon see. When programming with futures in Scala, we need to distinguish between future values and future computations. A future value of the Future[T] type denotes some value of the T type in the program that might not be currently available, but could become available later. Usually, when we say a future, we really mean a future value. In the scala.concurrent package, futures are represented with the Future[T] trait: trait Future[T] By contrast, a future computation is an asynchronous computation that produces a future value. A future computation can be started by calling the apply method on the Future companion object. This method has the following signature in the scala.concurrent package: def apply[T](b: =>T)(implicit e: ExecutionContext): Future[T] This method takes a by-name parameter of the T type. This is the body of the asynchronous computation that results in some value of type T. It also takes an implicit ExecutionContext parameter, which abstracts over where and when the thread gets executed. Recall that Scala's implicit parameters can either be specified when calling a method, in the same way as normal parameters, or they can be left out; in this case, the Scala compiler searches for a value of the ExecutionContext type in the surrounding scope. Most Future methods take an implicit execution context. Finally, the Future.apply method returns a future of the T type. This future is completed with the value resulting from the asynchronous computation, b. Starting future computations Let's see how to start a future computation in an example. We first import the contents of the scala.concurrent package. We then import the global execution context from the Implicits object. This makes sure that the future computations execute on the global context—the default execution context you can use in most cases: import scala.concurrent._ import ExecutionContext.Implicits.global object FuturesCreate extends App { Future { log("the future is here") } log("the future is coming") Thread.sleep(1000) } The order in which the log method calls (in the future computation and the main thread) execute is nondeterministic. The Future singleton object followed by a block is syntactic sugar for calling the Future.apply method. In the following example, we can use the scala.io.Source object to read the contents of our build.sbt file in a future computation. The main thread calls the isCompleted method on the future value, buildFile, returned from the future computation. Chances are that the build file was not read so fast, so isCompleted returns false. After 250 milliseconds, the main thread calls isCompleted again, and this time, isCompleted returns true. Finally, the main thread calls the value method, which returns the contents of the build file: import scala.io.Source object FuturesDataType extends App { val buildFile: Future[String] = Future { val f = Source.fromFile("build.sbt") try f.getLines.mkString("n") finally f.close() } log(s"started reading the build file asynchronously") log(s"status: ${buildFile.isCompleted}") Thread.sleep(250) log(s"status: ${buildFile.isCompleted}") log(s"value: ${buildFile.value}") } In this example, we used polling to obtain the value of the future. The Future singleton object's polling methods are non-blocking, but they are also nondeterministic; the isCompleted method will repeatedly return false until the future is completed. Importantly, completion of the future is in a happens-before relationship with the polling calls. If the future completes before the invocation of the polling method, then its effects are visible to the thread after the polling completes. Shown graphically, polling looks as shown in the following figure: Polling diagram Polling is like calling your potential employer every five minutes to ask if you're hired. What you really want to do is hand in a job application and then apply for other jobs instead of busy-waiting for the employer's response. Once your employer decides to hire you, he will give you a call on the phone number you left him. We want futures to do the same; when they are completed, they should call a specific function that we left for them. Promises Promises are objects that can be assigned a value or an exception only once. This is why promises are sometimes also called single-assignment variables. A promise is represented with the Promise[T] type in Scala. To create a promise instance, we use the Promise.apply method on the Promise companion object: def apply[T](): Promise[T] This method returns a new promise instance. Like the Future.apply method, the Promise.apply method returns immediately; it is non-blocking. However, the Promise.apply method does not start an asynchronous computation, it just creates a fresh Promise object. When the Promise object is created, it does not contain a value or an exception. To assign a value or an exception to a promise, we use the success or failure method, respectively. Perhaps you have noticed that promises are very similar to futures. Both futures and promises are initially empty and can be completed with either a value or an exception. This is intentional—every promise object corresponds to exactly one future object. To obtain the future associated with a promise, we can call the future method on the promise. Calling this method multiple times always returns the same future object. A promise and a future represent two aspects of a single-assignment variable. The promise allows you to assign a value to the future object, whereas the future allows you to read that value. In the following code snippet, we create two promises, p and q, that can hold string values. We then install a foreach callback on the future associated with the p promise and wait for 1 second. The callback is not invoked until the p promise is completed by calling the success method. We then fail the q promise in the same way and install a failed.foreach callback: object PromisesCreate extends App { val p = Promise[String] val q = Promise[String] p.future foreach { case x => log(s"p succeeded with '$x'") } Thread.sleep(1000) p success "assigned" q failure new Exception("not kept") q.future.failed foreach { case t => log(s"q failed with $t") } Thread.sleep(1000) } Alternatively, we can use the complete method and specify a Try[T] object to complete the promise. Depending on whether the Try[T] object is a success or a failure, the promise is successfully completed or failed. Importantly, after a promise is either successfully completed or failed, it cannot be assigned an exception or a value again in any way. Trying to do so results in an exception. Note that this is true even when there are multiple threads simultaneously calling success or complete. Only one thread completes the promise, and the rest throw an exception. Assigning a value or an exception to an already completed promise is not allowed and throws an exception. We can also use the trySuccess, tryFailure, and tryComplete methods that correspond to success, failure, and complete states, respectively, but return a Boolean value to indicate whether the assignment was successful. Recall that using the Future.apply method and callback methods with referentially transparent functions results in deterministic concurrent programs. As long as we do not use the trySuccess, tryFailure, and tryComplete methods, and none of the success, failure, and complete methods ever throws an exception, we can use promises and retain determinism in our programs. We now have everything we need to implement our custom Future.apply method. We call it the myFuture method in the following example. The myFuture method takes a b by-name parameter that is the asynchronous computation. First, it creates a p promise. Then, it starts an asynchronous computation on the global execution context. This computation tries to evaluate b and complete the promise. However, if the b body throws a nonfatal exception, the asynchronous computation fails the promise with that exception. In the meanwhile, the myFuture method returns the future immediately after starting the asynchronous computation: import scala.util.control.NonFatal object PromisesCustomAsync extends App { def myFuture[T](b: =>T): Future[T] = { val p = Promise[T] global.execute(new Runnable { def run() = try { p.success(b) } catch { case NonFatal(e) => p.failure(e) } }) p.future } val f = myFuture { "naa" + "na" * 8 + " Katamari Damacy!" } f foreach { case text => log(text) } } This is a common pattern when producing futures. We create a promise, let some other computation complete that promise, and return the corresponding future. However, promises were not invented just for our custom future's myFuture computation method. In the following sections, we will study use cases in which promises are useful. Futures and blocking Futures and asynchronous computations mainly exist to avoid blocking, but in some cases, we cannot live without it. It is, therefore, valid to ask how blocking interacts with futures. There are two ways to block with futures. The first way is to wait until a future is completed. The second way is by blocking from within an asynchronous computation. We will study both the topics in this section. Awaiting futures In rare situations, we cannot use callbacks or future combinators to avoid blocking. For example, the main thread that starts multiple asynchronous computations has to wait for these computations to finish. If an execution context uses daemon threads, as is the case with the global execution context, the main thread needs to block to prevent the JVM process from terminating. In these exceptional circumstances, we use the ready and result methods on the Await object from the scala.concurrent package. The ready method blocks the caller thread until the specified future is completed. The result method also blocks the caller thread, but returns the value of the future if it was completed successfully or throws the exception in the future if the future is failed. Both the methods require specifying a timeout parameter; the longest duration that the caller should wait for the completion of the future before a TimeoutException method is thrown. To specify a timeout, we import the scala.concurrent.duration package. This allows us to write expressions such as 10.seconds: import scala.concurrent.duration._ object BlockingAwait extends App { val urlSpecSizeFuture = Future { val specUrl = "http://www.w3.org/Addressing/URL/url-spec.txt" Source.fromURL(specUrl).size } val urlSpecSize = Await.result(urlSpecSizeFuture, 10.seconds) log(s"url spec contains $urlSpecSize characters") } In this example, the main thread starts a computation that retrieves the URL specification and then awaits. By this time, the World Wide Web Consortium (W3C) is worried that a DOS attack is under way, so this is the last time we download the URL specification. Blocking in asynchronous computations Waiting for the completion of a future is not the only way to block. Some legacy APIs do not use callbacks to asynchronously return results. Instead, such APIs expose the blocking methods. After we call a blocking method, we lose control over the thread; it is up to the blocking method to unblock the thread and return the control back. Execution contexts are often implemented using thread pools. By starting future computations that block, it is possible to reduce parallelism and even cause deadlocks. This is illustrated in the following example, in which 16 separate future computations call the sleep method, and the main thread waits until they complete for an unbounded amount of time: val startTime = System.nanoTime val futures = for (_ <- 0 until 16) yield Future { Thread.sleep(1000) } for (f <- futures) Await.ready(f, Duration.Inf) val endTime = System.nanoTime log(s"Total time = ${(endTime - startTime) / 1000000} ms") log(s"Total CPUs = ${Runtime.getRuntime.availableProcessors}") Assume that you have eight cores in your processor. This program does not end in one second. Instead, a first batch of eight futures started by Future.apply will block all the worker threads for one second, and then another batch of eight futures will block for another second. As a result, none of our eight processor cores can do any useful work for one second. Avoid blocking in asynchronous computations, as it can cause thread starvation. If you absolutely must block, then the part of the code that blocks should be enclosed within the blocking call. This signals to the execution context that the worker thread is blocked and allows it to temporarily spawn additional worker threads if necessary:   val futures = for (_ <- 0 until 16) yield Future { blocking { Thread.sleep(1000) } } With the blocking call around the sleep call, the global execution context spawns additional threads when it detects that there is more work than the worker threads. All 16 future computations can execute concurrently and the program terminates after one second. The Await.ready and Await.result statements block the caller thread until the future is completed and are in most cases used outside asynchronous computations. They are blocking operations. The blocking statement is used inside asynchronous code to designate that the enclosed block of code contains a blocking call. It is not a blocking operation by itself. Alternative future frameworks Scala futures and promises API resulted from an attempt to consolidate several different APIs for asynchronous programming, among them, legacy Scala futures, Akka futures, Scalaz futures, and Twitter's Finagle futures. Legacy Scala futures and Akka futures have already converged to the futures and promises API that you've learned about so far in this article. Finagle's com.twitter.util.Future type is planned to eventually implement the same interface as scala.concurrent.Future package, while the Scalaz scalaz.concurrent.Future type implements a slightly different interface. In this section, we give a brief of Scalaz futures. To use Scalaz, we add the following dependency to the build.sbt file: libraryDependencies += "org.scalaz" %% "scalaz-concurrent" % "7.0.6" We now encode an asynchronous tombola program using Scalaz. The Future type in Scalaz does not have the foreach method. Instead, we use its runAsync method, which asynchronously runs the future computation to obtain its value and then calls the specified callback: import scalaz.concurrent._ object Scalaz extends App { val tombola = Future { scala.util.Random.shuffle((0 until 10000).toVector) } tombola.runAsync { numbers => log(s"And the winner is: ${numbers.head}") } tombola.runAsync { numbers => log(s"... ahem, winner is: ${numbers.head}") } } Unless you are terribly lucky and draw the same permutation twice, running this program reveals that the two runAsync calls print different numbers. Each runAsync call separately computes the permutation of the random numbers. This is not surprising, as Scalaz futures have the pull semantics, in which the value is computed each time some callback requests it, in contrast to Finagle and Scala futures' push semantics, in which the callback is stored, and applied if and when the asynchronously computed value becomes available. To achieve the same semantics, as we would have with Scala futures, we need to use the start combinator that runs the asynchronous computation once, and caches its result: val tombola = Future { scala.util.Random.shuffle((0 until 10000).toVector) } start With this change, the two runAsync calls use the same permutation of random numbers in the tombola variable and print the same values. We will not dive further into the internals of alternate frameworks. The fundamentals about futures and promises that you learned about in this article should be sufficient to easily familiarize yourself with other asynchronous programming libraries, should the need arise. Summary This article presented some powerful abstractions for asynchronous programming. We saw how to encode latency with the Future type. You learned that futures and promises are closely tied together and that promises allow interfacing with legacy callback-based systems. In cases, where blocking was unavoidable, you learned how to use the Await object and the blocking statement. Finally, you learned that the Scala Async library is a powerful alternative for expressing future computations more concisely. Resources for Article: Further resources on this subject: Introduction to Scala [article] Concurrency in Practice [article] Integrating Scala, Groovy, and Flex Development with Apache Maven [article]
Read more
  • 0
  • 0
  • 9988

article-image-deploying-your-own-server
Packt
30 Sep 2015
16 min read
Save for later

Deploying on your own server

Packt
30 Sep 2015
16 min read
In this article by Jack Stouffer, the author of the book Mastering Flask, you will learn how to deploy and host your application on the different options available, and the advantages and disadvantages related to them. The most common way to deploy any web app is to run it on a server that you have control over. Control in this case means access to the terminal on the server with an administrator account. This type of deployment gives you the most amount of freedom out of the other choices as it allows you to install any program or tool you wish. This is in contrast to other hosting solutions where the web server and database are chosen for you. This type of deployment also happens to be the least expensive option. The downside to this freedom is that you take the responsibility of keeping the server up, backing up user data, keeping the software on the server up to date to avoid security issues, and so on. Entire books have been written on good server management, so if this is not a responsibility that you believe you or your company can handle, it would be best if you choose one of the other deployment options. This section will be based on a Debian Linux-based server, as Linux is far and away the most popular OS for running web servers, and Debian is the most popular Linux distro (a particular combination of software and the Linux kernel released as a package). Any OS with Bash and a program called SSH (which will be introduced in the next section) will work for this article, the only differences will be the command-line programs to install software on the server. (For more resources related to this topic, see here.) Each of these web servers will use a protocol named Web Server Gateway Interface (WSGI), which is a standard designed to allow Python web applications to easily communicate with web servers. We will never directly work with WSGI. However, most of the web server interfaces we will be using will have WSGI in their name, and it can be confusing if you don't know what the name is. Pushing code to your server with fabric To automate the process of setting up and pushing our application code to the server, we will use a Python tool called fabric. Fabric is a command-line program that reads and executes Python scripts on remote servers using a tool called SSH. SSH is a protocol that allows a user of one computer to remotely log in to another computer and execute commands on the command line, provided that the user has an account on the remote machine. To install fabric, we will use pip: $ pip install fabric Fabric commands are collections of command-line programs to be run on the remote machine's shell, in this case, Bash. We are going to make three different commands: one to run our unit tests, one to set up a brand new server to our specifications, and one to have the server update its copy of the application code with git. We will store these commands in a new file at the root of our project directory called fabfile.py. As it's the easiest to create, let's make the test command first: from fabric.api import local def test(): local('python -m unittest discover') To run this function from the command line, we can use fabric's command-line interface by passing the name of the command to run: $ fab test [localhost] local: python -m unittest discover ..... --------------------------------------------------------------------- Ran 5 tests in 6.028s OK Fabric has three main commands: local, run, and sudo. The local function, as seen in the preceding function, runs commands on the local computer. The run and sudo functions run commands on a remote machine, but sudo runs commands as an administrator. All of these functions notify fabric if the command ran successfully or not. If a command didn't run successfully, meaning that our tests failed in this case, any other commands in the function will not be run. This is useful for our commands because it allows us to force ourselves not to push any code to the server that does not pass our tests. Now we need to create the command to set up a new server from scratch. What this command will do is install the software our production environment needs as well as downloads the code from our centralized git repository. It will also create a new user that will act as the runner of the web server as well as the owner of the code repository. Do not run your webserver or have your code deployed by the root user. This opens your application to a whole host of security vulnerabilities. This command will differ based on your operating system, and we will be adding to this command in the rest of the article based on what server you choose: from fabric.api import env, local, run, sudo, cd env.hosts = ['deploy@[your IP]'] def upgrade_libs(): sudo("apt-get update") sudo("apt-get upgrade") def setup(): test() upgrade_libs() # necessary to install many Python libraries sudo("apt-get install -y build-essential") sudo("apt-get install -y git") sudo("apt-get install -y python") sudo("apt-get install -y python-pip") # necessary to install many Python libraries sudo("apt-get install -y python-all-dev") run("useradd -d /home/deploy/ deploy") run("gpasswd -a deploy sudo") # allows Python packages to be installed by the deploy user sudo("chown -R deploy /usr/local/") sudo("chown -R deploy /usr/lib/python2.7/") run("git config --global credential.helper store") with cd("/home/deploy/"): run("git clone [your repo URL]") with cd('/home/deploy/webapp'): run("pip install -r requirements.txt") run("python manage.py createdb") There are two new fabric features in this script. One is the env.hosts assignment, which tells fabric the user and IP address of the machine it should be logging in to. Second, there is the cd function used in conjunction with the with keyword, which executes any functions in the context of that directory instead of the home directory of the deploy user. The line that modifies the git configuration is there to tell git to remember your repository's username and password, so you do not have to enter it every time you wish to push code to the server. Also, before the server is set up, we make sure to update the server's software to keep the server up to date. Finally, we have the function to push our new code to the server. In time, this command will also restart the web server and reload any configuration files that come from our code. But this depends on the server you choose, so this is filled out in the subsequent sections: def deploy(): test() upgrade_libs() with cd('/home/deploy/webapp'): run("git pull") run("pip install -r requirements.txt") So, if we were to begin working on a new server, all we would need to do to set it up is to run the following commands: $ fabric setup $ fabric deploy Running your web server with supervisor Now that we have automated our updating process, we need some program on the server to make sure that our web server, and database if you aren't using SQLite, is running. To do this, we will use a simple program called supervisor. All that supervisor does is automatically run command-line programs in background processes and allows you to see the status of running programs. Supervisor also monitors all of the processes its running, and if the process dies, it tries to restart it. To install supervisor, we need to add it to the setup command in our fabfile.py: def setup(): … sudo("apt-get install -y supervisor") To tell supervisor what to do, we need to create a configuration file and then copy it to the /etc/supervisor/conf.d/ directory of our server during the deploy fabric command. Supervisor will load all of the files in this directory when it starts and attempt to run them. In a new file in the root of our project directory named supervisor.conf, add the following: [program:webapp] command= directory=/home/deploy/webapp user=deploy [program:rabbitmq] command=rabbitmq-server user=deploy [program:celery] command=celery worker -A celery_runner directory=/home/deploy/webapp user=deploy This is the bare minimum configuration needed to get a web server up and running. But, supervisor has a lot more configuration options. To view all of the customizations, go to the supervisor documentation at http://supervisord.org/. This configuration tells supervisor to run a command in the context of /home/deploy/webapp under the deploy user. The right hand of the command value is empty because it depends on what server you are running and will be filled in for each section. Now we need to add a sudo call in the deploy command to copy this configuration file to the /etc/supervisor/conf.d/ directory: def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp supervisord.conf /etc/supervisor/conf.d/webapp.conf") sudo('service supervisor restart') A lot of projects just create the files on the server and forget about them, but having the configuration file stored in our git repository and copied on every deployment gives several advantages. First, this means that it easy to revert changes if something goes wrong using git. Second, it means that we don't have to log in to our server in order to make changes to the files. Don't use the Flask development server in production. Not only it fails to handle concurrent connections, but it also allows arbitrary Python code to be run on your server. Gevent The simplest option to get a web server up and running is to use a Python library called gevent to host your application. Gevent is a Python library that adds an alternative way of doing concurrent programming outside of the Python threading library called coroutines. Gevent has an interface for running WSGI applications that is both simple and has good performance. A simple gevent server can easily handle hundreds of concurrent users, which is more in number than 99 percent of websites on the Internet will ever have. The downside to this option is that its simplicity means a lack of configuration options. There is no way, for example, to add rate limiting to the server or to add HTTPS traffic. This deployment option is purely for sites that you don't expect to receive a huge amount of traffic. Remember YAGNI (short for You Aren't Gonna Need It); only upgrade to a different web server if you really need to. Coroutines are a bit outside of the scope of this book, so a good explanation can be found at https://en.wikipedia.org/wiki/Coroutine. To install gevent, we will use pip: $ pip install gevent In a new file in the root of the project directory named gserver.py, add the following: from gevent.wsgi import WSGIServer from webapp import create_app app = create_app('webapp.config.ProdConfig') server = WSGIServer(('', 80), app) server.serve_forever() To run the server with supervisor, just change the command value to the following: [program:webapp] command=python gserver.py directory=/home/deploy/webapp user=deploy Now when you deploy, gevent will be automatically installed for you by running your requirements.txt on every deployment, that is, if you are properly pip freeze–ing after every new dependency is added. Tornado Tornado is another very simple way to deploy WSGI apps purely with Python. Tornado is a web server that is designed to handle thousands of simultaneous connections. If your application needs real-time data, Tornado also supports websockets for continuous, long-lived connections to the server. Do not use Tornado in production on a Windows server. The Windows version of Tornado is not only much slower, but it is considered beta quality software. To use Tornado with our application, we will use Tornado's WSGIContainer in order to wrap the application object to make it Tornado compatible. Then, Tornado will start to listen on port 80 for requests until the process is terminated. In a new file named tserver.py, add the following: from tornado.wsgi import WSGIContainer from tornado.httpserver import HTTPServer from tornado.ioloop import IOLoop from webapp import create_app app = WSGIContainer(create_app("webapp.config.ProdConfig")) http_server = HTTPServer(app) http_server.listen(80) IOLoop.instance().start() To run the Tornado with supervisor, just change the command value to the following: [program:webapp] command=python tserver.py directory=/home/deploy/webapp user=deploy Nginx and uWSGI If you need more performance or customization, the most popular way to deploy a Python web application is to use the web server Nginx as a frontend for the WSGI server uWSGI by using a reverse proxy. A reverse proxy is a program in networks that retrieves contents for a client from a server as if they returned from the proxy itself as shown in the following figure: Nginx and uWSGI are used in this way because we get the power of the Nginx frontend while having the customization of uWSGI. Nginx is a very powerful web server that became popular by providing the best combination of speed and customization. Nginx is consistently faster than other web severs, such as Apache httpd, and has native support for WSGI applications. The way it achieves this speed is several good architecture decisions as well as the decision early on that they were not going to try to cover a large amount of use cases like Apache does. Having a smaller feature set makes it much easier to maintain and optimize the code. From a programmer's perspective, it is also much easier to configure Nginx, as there is no giant default configuration file (httpd.conf) that needs to be overridden with .htaccess files in each of your project directories. One downside is that Nginx has a much smaller community than Apache, so if you have an obscure problem, you are less likely to be able to find answers online. Also, it's possible that a feature that most programmers are used to in Apache isn't supported in Nginx. uWSGI is a web server that supports several different types of server interfaces, including WSGI. uWSGI handles severing the application content as well as things such as load balancing traffic across several different processes and threads. To install uWSGI, we will use pip in the following way: $ pip install uwsgi In order to run our application, uWSGI needs a file with an accessible WSGI application. In a new file named wsgi.py in the top level of the project directory, add the following: from webapp import create_app app = create_app("webapp.config.ProdConfig") To test uWSGI, we can run it from the command line with the following: $ uwsgi --socket 127.0.0.1:8080 --wsgi-file wsgi.py --callable app --processes 4 --threads 2 If you are running this on your server, you should be able to access port 8080 and see your app (if you don't have a firewall that is). What this command does is load the app object from the wsgi.py file and makes it accessible from localhost on port 8080. It also spawns four different processes with two threads each, which are automatically load balanced by a master process. This amount of processes is the overkill for the vast, vast majority of websites. To start off, use a single process with two threads and scale up from there. Instead of adding all of the configuration options on the command line, we can create a text file to hold our configuration, which brings the same benefits for configuration that were listed in the section on supervisor. In a new file in the root of the project directory named uwsgi.ini, add the following: [uwsgi] socket = 127.0.0.1:8080 wsgi-file = wsgi.py callable = app processes = 4 threads = 2 uWSGI supports hundreds of configuration options as well as several official and unofficial plugins. To leverage the full power of uWSGI, you can explore the documentation at http://uwsgi-docs.readthedocs.org/. Let's run the server now from supervisor: [program:webapp] command=uwsgi uwsgi.ini directory=/home/deploy/webapp user=deploy We also need to install Nginx during the setup function: def setup(): … sudo("apt-get install -y nginx") Because we are installing Nginx from the OS's package manager, the OS will handle running Nginx for us. At the time of writing, the Nginx version in the official Debian package manager is several years old. To install the most recent version, follow the instructions here: http://wiki.nginx.org/Install. Next, we need to create an Nginx configuration file and then copy it to the /etc/nginx/sites-available/ directory when we push the code. In a new file in the root of the project directory named nginx.conf, add the following server { listen 80; server_name your_domain_name; location / { include uwsgi_params; uwsgi_pass 127.0.0.1:8080; } location /static { alias /home/deploy/webapp/webapp/static; } } What this configuration file does is tell Nginx to listen for incoming requests on port 80 and forward all requests to the WSGI application that is listening on port 8080. Also, it makes an exception for any requests for static files and instead sends those requests directly to the file system. Bypassing uWSGI for static files gives a great performance boost, as Nginx is really good at serving static files quickly. Finally, in the fabfile.py file: def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp nginx.conf " "/etc/nginx/sites-available/[your_domain]") sudo("ln -sf /etc/nginx/sites-available/your_domain " "/etc/nginx/sites-enabled/[your_domain]") sudo("service nginx restart") Apache and uWSGI Using Apache httpd with uWSGI has mostly the same setup. First off, we need an apache configuration file in a new file in the root of our project directory named apache.conf: <VirtualHost *:80> <Location /> ProxyPass / uwsgi://127.0.0.1:8080/ </Location> </VirtualHost> This file just tells Apache to pass all requests on port 80 to the uWSGI web server listening on port 8080. But, this functionality requires an extra Apache plugin from uWSGI called mod proxy uWSGI. We can install this as well as Apache in the set command: def setup(): … sudo("apt-get install -y apache2") sudo("apt-get install -y libapache2-mod-proxy-uwsgi") Finally, in the deploy command, we need to copy our Apache configuration file into Apache's configuration directory. def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp apache.conf " "/etc/apache2/sites-available/[your_domain]") sudo("ln -sf /etc/apache2/sites-available/[your_domain] " "/etc/apache2/sites-enabled/[your_domain]") sudo("service apache2 restart") Summary In this article you learnt that there are many different options to hosting your application, each having their own pros and cons. Deciding on one depends on the amount of time and money you are willing to spend as well as the total number of users you expect. Resources for Article: Further resources on this subject: Handling sessions and users[article] Snap – The Code Snippet Sharing Application[article] Man, Do I Like Templates! [article] from fabric.api import local def test():     local('python -m unittest discover')
Read more
  • 0
  • 0
  • 9985
Packt
04 Sep 2013
7 min read
Save for later

Quick start – writing your first MDX query

Packt
04 Sep 2013
7 min read
(For more resources related to this topic, see here.) Step 1 – open the SQL Server Management Studio and connect to the cube The Microsoft SQL Server Management Studio (SSMS) is a client application used by the administrators to manage instances and by developers to create object and write queries. We will use SSMS to connect on the cube and write our first MDX query. Here's a screenshot of SSMS with a connection on a SSAS server: Click on the Windows button, click on All Programs, click on Microsoft SQL Server 2012, and then click on SQL Server Management Studio. In the Connect to Server window, in the Server type box, select Analysis Services. In the Server name box, type the name of your Analysis Services server. Click on Connect. In the SQL Server Management Studio window, click on the File menu, click on New, and then click on Analysis Services MDX Query. In the Connect to Analysis Services window, in the Server name box, type the name of you Analysis Services server, and then click on Connect. SELECT FROM WHERE If you have already written SQL queries, you might have already made connections with the T-SQL language. Here's my tip for you: don't, you will only hurt yourself. Some words are the same, but it is better to think MDX when writing MDX rather than to think SQL when writing MDX. Step 2 – SELECT The SELECT clause is the main part of the MDX query. You will define what are the measure and dimension members that you want to display. You also have to define on which axis of your result set you want to display the measure and dimension members. Axes Axes are the columns and rows of the result set. With SQL Server Analysis Services, upto 128 axes can be specified. The axes have a number which is zero-based. The first axe is 0, the second on is 1, and so on. So, if you want to use two axes, the first one will be 0 and the second will be 1. You cannot use axe 0 and axe 2, if you don't define axe 1. For the first five axes, you can use the axis alias instead. After the axe 4, you will have to revert to the number because no other aliases are available. Axe Number Alias 0 Columns 1 Rows 2 Pages 3 Sections 4 Chapters Even if SSAS supports 128 axes, if you try to use more than two axes in SSMS in your query, you will get this error when you execute your MDX query: Results cannot be displayed for cellsets with more than two axes. So, always write your MDX queries using only two axes in SSMS and separate them with a comma. Tuples A tuple is a specific point in the cube where dimensions meet. A tuple can contain one or more members from the cube's dimensions, but you cannot have two members from the same dimension. If you want to display only the calendar year 2008, you will have to write [Date].[CY 2008]. If you want to have more than one dimension, you have to enclose them using parenthesis () and separate them with a comma. Calendar year for United States will look like ([Date].[CY 2008], [Geography].[United States]). Even if you are writing a tuple with only a single member from a single dimension, it is good practice to enclose it in parenthesis. Sets If you want to display the year 2005 to 2008, you will write four single-dimension tuples which composes a set. When writing the set, you separate the tuples with commas and wrap it all with curly braces {} and separate the tuples with commas such as {[Date].[CY 2005], [Date].[CY 2006] , [Date].[CY 2007] , [Date].[CY 2008]} to have the calendar years from 2005 to 2008. Since all the tuples are from the same dimension, you can also write it using a colon (:), such as {[Date].[CY 2005]: [Date].[CY 2008]} which will give you the years 2005 to 2008. With SSAS 2012, you can write {[Date].[CY 2008]: [Date].[CY 2005]} and the result will still be from 2005 to 2008. What about the calendar year 2008 for both Canada and the United States? You will write two tuples. A set can be composed of one or more tuples. The tuples must have the same dimensionality; otherwise, an error will occur. Meaning that the first member is from the Date dimension and the second from the Geography dimension. You cannot have the first tuple with Date-Geography and the second being Geography-Date; you will encounter an error. So the calendar year 2008 with Canada and United States will look such as {([Date].[CY 2008], [Geography].[Canada]), ([Date].[CY 2008], [Geography].[United States])}. When writing tuples, always use the form [Dimension].[Level].[MemberName]. So, [Geography].[Canada] should be written as [Geography].[Country].[Canada]. You could also use the member key instead of the member name. In SSAS, use the ampersand (&) when using the key; [Geography].[State-Province].[Quebec] with the name becomes [Geography].[State-Province].&[QC]&[CA] using the keys. What happens when you want to write bigger sets such as for the bikes and components product category in Canada and the United States from 2005 to 2008? Enter the Crossjoin function. Crossjoin takes two or more sets for arguments and returns you a set with the cross products or the specified sets. Crossjoin ({[Product].[Category].[Bikes], [Product].[Category].[Components]}, {[Geography].[Country].[Canada], [Geography].[Country].[United States]}, {[Date].[CY 2005] : [Date].[CY 2008]}) The MDX queries can be written using line-break to add visibility to the code. So each time we write a new set and even tuples, we write it on a new line and add some indentation: Crossjoin ({[Product].[Category].[Bikes], [Product].[Category].[Components]},{[Geography].[Country].[Canada], [Geography].[Country].[United States]}, {[Date].[CY 2005] : [Date].[CY 2008]}) Step 3 – FROM The FROM clause defines where the query will get the data. It can be one of the following four things: A cube. A perspective (a subset of dimensions and measures). A subcube (a MDX query inside a MDX query). A dimension (a dimension inside your SSAS database, you must use the dollar sign ($) before the name of the dimension). Step 4 – WHERE The WHERE clause is used to filter the dimensions and members out of the MDX query. The set used in the WHERE clause won't be displayed in your result set. Step 5 – comments Comment your code. You never know when somebody else will take a look on your queries and trying to understand what has been written could be harsh. There are three ways to use delimit comments inside the query: /* and */ // -- (pair of dashes) The /* and */ symbols can be used to comment multiple lines of text in your query. Everything between the /* and the */ symbols will be ignored when the MDX query is parsed. Use // or -- to begin a comment on a single line. Step 6 – your first MDX query So if you want to display the Resellers Sales Amount and Reseller Order Quantity measures on the columns, the years from 2006 to 2008 with the bikes and components product categories for Canada. First, identify what will go where. Start with the two axes, continue with the FROM clause, and finish with the WHERE clause. SELECT{[Measures].[Reseller Sales Amount], [Measures].[Reseller Order Quantity]} on columns,Crossjoin({[Date].[CY 2006] : [Date].[CY 2008]}, {[Product].[Category].[Bikes], [Product].[Category].[Components]}) on rowsFROM [Adventure Works]WHERE {[Geography].[Country].[Canada]} This query will return the following result set:     Reseller Sales Amount Reseller Order Quantity CY 2006 Bikes $3,938,283.99 4,563 CY 2006 Components $746,576.15 2,954 CY 2007 Bikes $4,417,665.71 5,395 CY 2007 Components $997,617.89 4,412 CY 2008 Bikes $1,909,709.62 2,209 CY 2008 Components $370,698.68 1,672 Summary In this article, we saw how to write the MDX queries in various steps. We used the FROM, WHERE, and SELECT clauses in writing the queries. This article was a quick start guide for starting to query and it will help you write more complex queries. Happy querying! Resources for Article : Further resources on this subject: Connecting to Microsoft SQL Server Compact 3.5 with Visual Studio [Article] MySQL Linked Server on SQL Server 2008 [Article] Microsoft SQL Azure Tools [Article]
Read more
  • 0
  • 0
  • 9984

article-image-getting-started-with-vagrant
Timothy Messier
25 Sep 2014
6 min read
Save for later

Getting started with Vagrant

Timothy Messier
25 Sep 2014
6 min read
As developers, one of the most frustrating types of bugs are those that only happen in production. Continuous integration servers have gone a long way in helping prevent these types of configuration bugs, but wouldn't it be nice to avoid these bugs altogether? Developers’ machines tend to be a mess of software. Multiple versions of languages, random services with manually tweaked configs, and debugging tools all contribute to a cluttered and unpredictable environment. Sandboxes Many developers are familiar with sandboxing development. Tools like virtualenv and rvm were created to install packages in isolated environments, allowing for multiple versions of packages to be installed on the same machine. Newer languages like nodejs install packages in a sandbox by default. These are very useful tools for managing package versions in both development and production, but they also lead to some of the aforementioned clutter. Additionally, these tools are specific to particular languages. They do not allow for easily installing multiple versions of services like databases. Luckily, there is Vagrant to address all these issues (and a few more). Getting started Vagrant at its core is simply a wrapper around virtual machines. One of Vagrant's strengths is its ease of use. With just a few commands, virtual machines can be created, provisioned, and destroyed. First grab the latest download for your OS from Vagrant's download page (https://www.vagrantup.com/downloads.html). NOTE: For Linux users, although your distro may have a version of Vagrant via its package manager, it is most likely outdated. You probably want to use the version from the link above to get the latest features. Also install Virtualbox (https://www.virtualbox.org/wiki/Downloads) using your preferred method. Vagrant supports other virtualization providers as well as docker, but Virtualbox is the easiest to get started with. When most Vagrant commands are run, they will look for a file named Vagrantfile (or vagrantfile) in the current directory. All configuration for Vagrant is done in this file and it is isolated to a particular Vagrant instance. Create a new Vagrantfile: $ vagrant init hashicorp/precise64 This creates a Vagrantfile using the base virtual image, hashicorp/precise64, which is a stock Ubuntu 12.04 image provided by Hashicorp. This file contains many useful comments about the various configurations that can be done. If this command is run with the --minimal flag, it will create a file without comments, like the following: # -*- mode: ruby -*- # vi: set ft=ruby : # Vagrantfile API/syntax version. Don't touch unless you know what you're doing! VAGRANTFILE_API_VERSION = "2" Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.box = "hashicorp/precise64" end Now create the virtual machine: $ vagrant up This isn’t too difficult. You now have a nice clean machine in just two commands. What did Vagrant up do? First, if you did not already have the base box, hashicorp/precise64, it was downloaded from Vagrant Cloud. Then, Vagrant made a new machine based on the current directory name and a unique number. Once the machine was booted it created a shared folder between the host's current directory and the guest's /vagrant. To access the new machine run: $ vagrant ssh Provisioning At this point, additional software needs to be installed. While a clean Ubuntu install is nice, it does not have the software to develop or run many applications. While it may be tempting to just start installing services and libraries with apt-get, that would just start to cause the same old clutter. Instead, use Vagrant's provisioning infrastructure. Vagrant has support for all of the major provision tools like Salt (http://www.saltstack.com/), Chef (http://www.getchef.com/chef/) or Ansible (http://www.ansible.com/home). It also supports calling shell commands. For the sake of keeping this post focused on Vagrant, an example using the shell provisioner will be used. However, to unleash the full power of Vagrant, use the same provisioner used for production systems. This will enable the virtual machine to be configured using the same provisioning steps as production, thus ensuring that the virtual machine mirrors the production environment. To continue this example, add a provision section to the Vagrantfile: # -*- mode: ruby -*- # vi: set ft=ruby : # Vagrantfile API/syntax version. Don't touch unless you know what you're doing! VAGRANTFILE_API_VERSION = "2" Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.box = "hashicorp/precise64" config.vm.provision "shell", path: "provision.sh" end This tells Vagrant to run the script provision.sh. Create this file with some install commands: #!/bin/bash # Install nginx apt-get -y install nginx Now tell Vagrant to provision the machine: $ vagrant provision NOTE: When running Vagrant for the first time, Vagrant provision is automatically called. To force a provision, call vagrant up --provision. The virtual machine should now have nginx installed with the default page being served. To access this page from the host, port forwarding can be set in the Vagrantfile: # -*- mode: ruby -*- # vi: set ft=ruby : # Vagrantfile API/syntax version. Don't touch unless you know what you're doing! VAGRANTFILE_API_VERSION = "2" Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.box = "hashicorp/precise64" config.vm.network "forwarded_port", guest: 80, host: 8080 config.vm.provision "shell", path: "provision.sh" end For the forwarding to take effect, the virtual machine must be restarted: $ vagrant halt && vagrant up Now in a browser go to http://localhost:8080 to see the nginx welcome page. Next steps This simple example shows how easy Vagrant is to use, and how quickly machines can be created and configured. However, to truly battle issues like "it works on my machine", Vagrant needs to leverage the same provisioning tools that are used for production servers. If these provisioning scripts are cleanly implemented, Vagrant can easily leverage them and make creating a pristine production-like environment easy. Since all configuration is contained in a single file, it becomes simple to include a Vagrant configuration along with code in a repository. This allows developers to easily create identical environments for testing code. Final notes When evaluating any open source tool, at some point you need to examine the health of the community supporting the tool. In the case of Vagrant, the community seems very healthy. The primary developer continues to be responsive about bugs and improvements, despite having launched a new company in the wake of Vagrant's success. New features continue to roll in, all the while keeping a very stable product. All of this is good news since there seem to be no other tools that make creating clean sandbox environments as effortless as Vagrant. About the author Timothy Messier is involved in many open source devops projects, including Vagrant and Salt, and can be contacted at tim.messier@gmail.com.
Read more
  • 0
  • 0
  • 9978
Modal Close icon
Modal Close icon