Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7008 Articles
article-image-connecting-react-redux-and-firebase-part-2
AJ Webb
10 Nov 2016
11 min read
Save for later

Connecting React to Redux and Firebase – Part 2

AJ Webb
10 Nov 2016
11 min read
This is the second part in a series on using Redux and Firebase with React. If you haven't read through the first part, then you should go back and do so, since this post will build on the other. If you have gone through the first part, then you're in the right place. In this final part of this two-part series, you will be creating actions in your app to update the store. Then you will create a Firebase app and set up async actions that will subscribe to Firebase. Whenever the data on Firebase changes, the store will automatically update and your app will receive the latest state. After all of that is wired up, you'll create a quick user interface. Lets get started. Creating Actions in Redux For your actions, you are going to create a new file inside of src/: [~/code/yak]$ touch src/actions.js Inside of actions.js, you will create your first action; an action in Redux is simply an object that contains a type and a payload. The type allows you to catch the action inside of your reducer and then use the payload to update state. The type can be a string literal or a constant. I recommend using constants for the same reasons given by the Redux team. In this app you will set up some constants to follow this practice. [~/code/yak]$ touch src/constants.js Create your first constant in src/constants.js: export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE'; It is good practice to namespace your constants, like the constant above the CHATS. Namespace can be specific to the CHATS portion of the application and helps document your constants. Now go ahead and use that constant to create an action. In src/actions.js, add: import { SEND_MESSAGE } from './constants'; export function sendMessage(message) { return { type: SEND_MESSAGE, payload: message }; } You now have your first action! You just need to add a way to handle that action in your reducer. So open up src/reducers.js and do just that. First import the constant. import { SEND_MESSAGE } from './constants'; Then inside of the yakApp function, you'll handle different actions: export function yakApp(state = initialState, action) { switch(action.type) { case SEND_MESSAGE: return Object.assign({}, state, { messages: state.messages.concat([action.payload]) }); default: return state; } } A couple of things are happening here, you'll notice that there is a default case that returns state, your reducer must return some sort of state. If you don't return state your app will stop working. Redux does a good job of letting you know what's happening, but it is still good to know. Another thing to notice is that the SEND_MESSAGE case does not mutate the state. Instead it creates a new state and returns it. Mutating the state can result in bad side effects that are hard to debug; do your very best to never mutate the state. You should also avoid mutating the arguments passed to a reducer, performing side effects and calling any non-pure functions within your reducer. For the most part, your reducer is set up to take the new state and return it in conjunction with the old state. Now that you have an action and a reducer that is handling that action, you are ready for your component to interact with them. In src/App.js add an input and a button: <p className="App-intro"> <input type="text" />{' '} <button>Send Message</button> </p> Now that you have the input and button in, you're going to add two things. The first is a function that runs when the button is clicked. The second is a reference on the input so that the function can easily find the value of your input. js <p className="App-intro"> <input type="text" ref={(i) => this.message = i}/>{' '} <button onClick={this.handleSendMessage.bind(this)}>Send Message</button> </p> Now that you have those set, you will need to add the sendMessage function, but you'll also need to be able to dispatch an action. So you'll want to map your actions to props, similar to how you mapped state to props in the previous guide. At the end of the file, under the mapStateToProps function, add the following function: function mapDispatchToProps(dispatch) { return { sendMessage: (msg) => dispatch(sendMessage(msg)) }; } Then you'll need to add the mapDispatchToProps function to the export statement: export default connect(mapStateToProps, mapDispatchToProps)(App); And you'll need to import the sendMessage action into src/App.js: import { sendMessage } from './actions'; Finally you'll need to create the handleSendMessage method inside of your App class, just above the render method: handleSendMessage() { const { sendMessage } = this.props; const { value } = this.message; if (!value) { return false; } sendMessage(value); this.message.value = ''; } If you still have a console.log statement inside of your render method, from the last guide, you should see your message being added to the array each time you click on the button. The final piece of UI that you need to create is a list of all the messages you've received. Add the following to the render method in src/App.js, just below the <p/> that contains your input: {messages.map((message, i) => ( <p key={i} style={{textAlign: 'left', padding: '0 10px'}}>{message}</p> ))} Now you have a crude chat interface; you can improve it later. Set up Firebase If you've never used Firebase before, you'll first need a Google account. If you don't have a Google account, sign up for one; then you'll be able to create a new Firebase project. Head over to Firebase and create a new project. If you have trouble naming it, try YakApp_[YourName]. After you have created your Firebase project, you'll be taken to the project. There is a button that says "Add Firebase to your web app"; click on it and you'll be able to get the configuration information that you will need for your app to be able to work with Firebase. A dialog will open and you should see something like this: You will only be using the database for this project, so you'll only need a portion of the config. Keep the config handy while you prepare the app to use Firebase. First add the Firebase package to your app: [~/code/yak]$ npm install --save firebase Open src/actions.js and import firebase: import firebase from 'firebase' You will only be using Firebase in your actions; that is the reason you are importing it here. Once imported, you'll need to initialize Firebase. After the import section, add your Firebase config (the one from the image above), initialize Firebase and create a reference to messages: const firebaseConfig = { apiKey: '[YOUR API KEY]', databaseURL: '[YOUR DATABASE URL]' } firebase.initializeApp(firebaseConfig); const messages = firebase.database().ref('messages'); You'll notice that you only need the apiKey and databaseUrl for now. Eventually you might want to add auth and storage, and other facets of Firebase, but for now you only need database. Now that Firebase is set up and initialized, you can use it to store the chat history. One last thing before you leave the Firebase console: Firebase automatically sets the database rules to require users to be logged in. That is a great idea, but authentication is outside of the scope of this post. So you'll need to turn that off in the rules. In the Firebase console, click on the "Database" navigation item on the left side of the page; now there should be a tab bar with a "Rules" option. Click on "Rules" and replace what is in the textbox with: { "rules": { ".read": true, ".write": true } } Create subscriptions In order to subscribe to Firebase, you will need a way to send async actions. So you'll need another package to help with that. Go ahead and install redux-thunk. [~/code/yak]$ npm install --save redux-thunk After it's finished installing, you'll need to add it as middleware to your store. That means it's time to head back to src/index.js and add the extra parameters to the createStore function. First import redux-thunk into src/index.js and import another method from redux itself. import { createStore, applyMiddleware } from 'redux';import thunk from 'redux-thunk'; Now update the createStore call to be: const store = createStore(yakApp, undefined, applyMiddleware(thunk)); You are now ready to create async actions. In order to create more actions, you're going to need more constants. Head over to src/constants.js and add the following constant. export const RECEIVE_MESSAGE = 'CHATS:RECEIVE_MESSAGE'; This is the only constant you will need for now; after this guide, you should create edit and delete actions. At that point, you'll need more constants for those actions. For now, only concern yourself with adding and subscribing. You'll need to refactor your actions and reducer to handle these new features. In src/actions.js, refactor the sendMessage action to be an async action. The way redux-thunk works is that it intercepts any action that is dispatched and inspects it. If the action returns a function instead of an object, redux-thunk stops it from reaching the reducer and runs the function. If the action returns an object, redux-thunk ignores it and allows it to continue to the reducer. Change sendMessage to look like the following: export function sendMessage(message) { return function() { messages.push(message); }; } Now when you type a message in and hit the "Submit Message" button, the message will get stored in Firebase. Try it out! UH OH! There's a problem! You are adding messages to Firebase but they aren't showing up in your app anymore! That's ok. You can fix that! You'll need to create a few more actions though, and refactor your reducer. First off, you can delete one of your constants. You no longer need the constant from earlier: export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE'; Go ahead and remove it; that leaves you with only one constant. Change your constant import in both src/actions.js and src/reducer.js: import { RECEIVE_MESSAGE } from './constants'; Now in actions, add the following action: function receiveMessage(message) { return { type: RECEIVE_MESSAGE, payload: message }; } That should look familiar; it's almost identical to the original sendMessage action. You'll also need to rename the action that your reducer is looking for. So now your reducer function should look like this: export function yakApp(state = initialState, action) { switch(action.type) { case RECEIVE_MESSAGE: return Object.assign({}, state, { messages: state.messages.concat([action.payload]) }); default: return state; } } Before the list starts to show up again, you'll need to create a subscription to Firebase. Sounds more complicated than it really is. Add the following action to src/actions.js: export function subscribeToMessages() { return function(dispatch) { messages.on('child_added', data => dispatch(receiveMessage(data.val()))); } } Now, in src/App.js you'll need to dispatch that action and set up the subscription. First change your import statement from ./actions to include subscribeToMessages: import { sendMessage, subscribeToMessages } from './actions'; Now, in mapDispatchToProps you need to map subscribeToMessages: function mapDispatchToProps(dispatch) { return { sendMessage: (msg) => dispatch(sendMessage(msg)), subscribeToMessages: () => dispatch(subscribeToMessages()), }; } Finally, inside of the App class, add the componentWillMount life cycle method just above the handleSendMessage method, and call subscribeToMessages: componentWillMount() { this.props.subscribeToMessages(); } Once you save the file, you should see that your app is subscribed to the Firebase database, which is automatically updating your Redux state, and your UI is displaying all the messages stored in Firebase. Have some fun and open another browser window! Conclusion You have now created a React app from scratch, added Redux to it, refactored it to connect to Firebase and, via subscriptions, updated your entire app state! What will you do now? Of course the app could use a better UI; you could redesign it! Maybe make it a little more usable. Try learning how to create users and show who yakked about what! The sky is the limit, now that you know how to put all the pieces together! Feel free to share what you've done with the app in the comments! About the author AJ Webb is team lead and frontend engineer for @tannerlabs and a co-creator of Payba.cc.
Read more
  • 0
  • 0
  • 17251

article-image-why-we-need-design-patterns
Packt
10 Nov 2016
16 min read
Save for later

Why we need Design Patterns?

Packt
10 Nov 2016
16 min read
In this article by Praseed Pai, and Shine Xavier, authors of the book .NET Design Patterns, we will try to understand the necessity of choosing a pattern-based approach to software development. We start with some principles of software development, which one might find useful while undertaking large projects. The working example in the article starts with a requirements specification and progresses towards a preliminary implementation. We will then try to iteratively improve the solution using patterns and idioms, and come up with a good design that supports a well-defined programming Interface. In this process, we will learn about some software development principles (listed below) one can adhere to, including the following: SOLID principles for OOP Three key uses of design patterns Arlow/Nuestadt archetype patterns Entity, value, and data transfer objects Leveraging the .NET Reflection API for plug-in architecture (For more resources related to this topic, see here.) Some principles of software development Writing quality production code consistently is not easy without some foundational principles under your belt. The purpose of this section is to whet the developer's appetite, and towards the end, some references are given for detailed study. Detailed coverage of these principles warrants a separate book on its own scale. The authors have tried to assimilate the following key principles of software development which would help one write quality code: KISS: Keep it simple, Stupid DRY: Don't repeat yourself YAGNI: You aren't gonna need it Low coupling: Minimize coupling between classes SOLID principles: Principles for better OOP William of Ockham had framed the maxim Keep it simple, Stupid (KISS). It is also called law of parsimony. In programming terms, it can be translated as "writing code in a straightforward manner, focusing on a particular solution that solves the problem at hand". This maxim is important because, most often, developers fall into the trap of writing code in a generic manner for unwarranted extensibility. Even though it initially looks attractive, things slowly go out of bounds. The accidental complexity introduced in the code base for catering to improbable scenarios, often reduces readability and maintainability. The KISS principle can be applied to every human endeavor. Learn more about KISS principle by consulting the Web. Don't repeat yourself (DRY), a maxim which most programmers often forget while implementing their domain logic. Most often, in a collaborative development scenario, code gets duplicated inadvertently due to lack of communication and proper design specifications. This bloats the code base, induces subtle bugs, and make things really difficult to change. By following the DRY maxim at all stages of development, we can avoid additional effort and make the code consistent. The opposite of DRY is write everything twice (WET). You aren't gonna need it (YAGNI), a principle that compliments the KISS axiom. It serves as a warning for people who try to write code in the most general manner, anticipating changes right from the word go,. Too often, in practice, most of this code is not used to make potential code smells. While writing code, one should try to make sure that there are no hard-coded references to concrete classes. It is advisable to program to an interface as opposed to an implementation. This is a key principle which many patterns use to provide behavior acquisition at runtime. A dependency injection framework could be used to reduce coupling between classes. SOLID principles are a set of guidelines for writing better object-oriented software. It is a mnemonic acronym that embodies the following five principles: 1 Single Responsibility Principle (SRP) A class should have only one responsibility. If it is doing more than one unrelated thing, we need to split the class. 2 Open Close Principle (OCP) A class should be open for extension, closed for modification. 3 Liskov Substitution Principle (LSP) Named after Barbara Liskov, a Turing Award laureate, who postulated that a sub-class (derived class) could substitute any super class (base class) references without affecting the functionality. Even though it looks like stating the obvious, most implementations have quirks which violate this principle. 4 Interface segregation principle (ISP) It is more desirable to have multiple interfaces for a class (such classes can also be called components) than having one Uber interface that forces implementation of all methods (both relevant and non-relevant to the solution context). 5 Dependency Inversion (DI) This is a principle which is very useful for Framework design. In the case of Frameworks, the client code will be invoked by server code, as opposed to the usual process of client invoking the server. The main principle here is that abstraction should not depend upon details, rather, details should depend upon abstraction. This is also called the "Hollywood Principle" (Do not call us, we will call you back). The authors consider the preceding five principles primarily as a verification mechanism. This will be demonstrated by verifying the ensuing case study implementations for violation of these principles. Karl Seguin has written an e-book titled Foundations of Programming – Building Better Software, which covers most of what has been outlined here. Read his book to gain an in-depth understanding of most of these topics. The SOLID principles are well covered in the Wikipedia page on the subject, which can be retrieved from https://en.wikipedia.org/wiki/SOLID_(object-oriented_design). Robert Martin's Agile Principles, Patterns, and Practices in C# is a definitive book on learning about SOLID, as Robert Martin itself is the creator of these principles, even though Michael Feathers coined the acronym. Why patterns are required? According to the authors, the three key advantages of pattern-oriented software development that stand out are as follows: A language/platform-agnostic way to communicate about software artifacts A tool for refactoring initiatives (targets for refactoring) Better API design With the advent of the pattern movement, the software development community got a canonical language to communicate about software design, architecture, and implementation. Software development is a craft which has got trade-offs attached to each strategy, and there are multiple ways to develop software. The various pattern catalogues brought some conceptual unification for this cacophony in software development. Most developers around the world today who are worth their salt, can understand and speak this language. We believe you will be able to do the same at the end of the article. Fancy yourself stating the following about your recent implementation: For our tax computation example, we have used command pattern to handle the computation logic. The commands (handlers) are configured using an XML file, and a factory method takes care of the instantiation of classes on the fly using Lazy loading. We cache the commands, and avoid instantiation of more objects by imposing singleton constraints on the invocation. We support prototype pattern where command objects can be cloned. The command objects have a base implementation, where concrete command objects use the template method pattern to override methods which are necessary. The command objects are implemented using the design by contracts idiom. The whole mechanism is encapsulated using a Façade class, which acts as an API layer for the application logic. The application logic uses entity objects (reference) to store the taxable entities, attributes like tax parameters are stored as value objects. We use data transfer object (DTO) to transfer the data from the application layer to the computational layer. Arlow/Nuestadt-based archetype pattern is the unit of structuring the tax computation logic. For some developers, the preceding language/platform-independent description of the software being developed is enough to understand the approach taken. This will boos developer productivity (during all phases of SDLC, including development, maintenance, and support) as the developers will be able to get a good mental model of the code base. Without Pattern catalogs, such succinct descriptions of the design or implementation would have been impossible. In an Agile software development scenario, we develop software in an iterative fashion. Once we reach a certain maturity in a module, developers refactor their code. While refactoring a module, patterns do help in organizing the logic. The case study given next will help you to understand the rationale behind "Patterns as refactoring targets". APIs based on well-defined patterns are easy to use and impose less cognitive load on programmers. The success of the ASP.NET MVC framework, NHibernate, and API's for writing HTTP modules and handlers in the ASP.NET pipeline are a few testimonies to the process. Personal income tax computation - A case study Rather than explaining the advantages of patterns, the following example will help us to see things in action. Computation of the annual income tax is a well-known problem domain across the globe. We have chosen an application domain which is well known to focus on the software development issues. The application should receive inputs regarding the demographic profile (UID, Name, Age, Sex, Location) of a citizen and the income details (Basic, DA, HRA, CESS, Deductions) to compute his tax liability. The System should have discriminants based on the demographic profile, and have a separate logic for senior citizens, juveniles, disabled people, old females, and others. By discriminant we mean demographic that parameters like age, sex and location should determine the category to which a person belongs and apply category-specific computation for that individual. As a first iteration, we will implement logic for the senior citizen and ordinary citizen category. After preliminary discussion, our developer created a prototype screen as shown in the following image: Archetypes and business archetype pattern The legendary Swiss psychologist, Carl Gustav Jung, created the concept of archetypes to explain fundamental entities which arise from a common repository of human experiences. The concept of archetypes percolated to the software industry from psychology. The Arlow/Nuestadt patterns describe business archetype patterns like Party, Customer Call, Product, Money, Unit, Inventory, and so on. An Example is the Apache Maven archetype, which helps us to generate projects of different natures like J2EE apps, Eclipse plugins, OSGI projects, and so on. The Microsoft patterns and practices describes archetypes for targeting builds like Web applications, rich client application, mobile applications, and services applications. Various domain-specific archetypes can exist in respective contexts as organizing and structuring mechanisms. In our case, we will define some archetypes which are common in the taxation domain. Some of the key archetypes in this domain are: Sr.no Archetype Description 1 SeniorCitizenFemale Tax payers who are female, and above the age of 60 years 2 SeniorCitizen Tax payers who are male, and above the age of 60 years 3 OrdinaryCitizen Tax payers who are Male/Female, and above 18 years of age 3 DisabledCitizen Tax payers who have any disability 4 MilitaryPersonnel Tax payers who are military personnel 5 Juveniles Tax payers whose age is less than 18 years We will use demographic parameters as discriminant to find the archetype which corresponds to the entity. The whole idea of inducing archetypes is to organize the tax computation logic around them. Once we are able to resolve the archetypes, it is easy to locate and delegate the computations corresponding to the archetypes. Entity, value, and data transfer objects We are going to create a class which represents a citizen. Since citizen needs to be uniquely identified, we are going to create an entity object, which is also called reference object (from DDD catalog). The universal identifier (UID) of an entity object is the handle which an application refers. Entity objects are not identified by their attributes, as there can be two people with the same name. The ID uniquely identifies an entity object. The definition of an entity object is given as follows: public class TaxableEntity { public int Id { get; set; } public string Name { get; set; } public int Age { get; set; } public char Sex { get; set; } public string Location { get; set; } public TaxParamVO taxparams { get; set; } } In the preceding class definition, Id uniquely identifies the entity object. TaxParams is a value object (from DDD catalog) associated with the entity object. Value objects do not have a conceptual identity. They describe some attributes of things (entities). The definition of TaxParams is given as follows: public class TaxParamVO { public double Basic {get;set;} public double DA { get; set; } public double HRA { get; set; } public double Allowance { get; set; } public double Deductions { get; set; } public double Cess { get; set; } public double TaxLiability { get; set; } public bool Computed { get; set; } } While writing applications ever since Smalltalk, Model-view-controller (MVC) is the most dominant paradigm for structuring applications. The application is split into a model layer ( which mostly deals with data), view layer (which acts as a display layer), and a controller (to mediate between the two). In the Web development scenario, they are physically partitioned across machines. To transfer data between layers, the J2EE pattern catalog identified the DTO to transfer data between layers. The DTO object is defined as follows: public class TaxDTO { public int id { } public TaxParamVO taxparams { } } If the layering exists within the same process, we can transfer these objects as-is. If layers are partitioned across processes or systems, we can use XML or JSON serialization to transfer objects between the layers. A computation engine We need to separate UI processing, input validation, and computation to create a solution which can be extended to handle additional requirements. The computation engine will execute different logic depending upon the command received. The GoF command pattern is leveraged for executing the logic based on the command received. The command pattern consists of four constituents. They are: Command object Parameters Command Dispatcher Client The command object's interface has an Execute method. The parameters to the command objects are passed through a bag. The client invokes the command object by passing the parameters through a bag to be consumed by the Command Dispatcher. The Parameters are passed to the command object through the following data structure: public class COMPUTATION_CONTEXT { private Dictionary<String, Object> symbols = new Dictionary<String, Object>(); public void Put(string k, Object value) { symbols.Add(k, value); } public Object Get(string k) { return symbols[k]; } } The ComputationCommand interface, which all the command objects implement, has only one Execute method, which is shown next. The Execute method takes a bag as parameter. The COMPUTATION_CONTEXT data structure acts as the bag here. Interface ComputationCommand { bool Execute(COMPUTATION_CONTEXT ctx); } Since we have already implemented a command interface and bag to transfer the parameters, it is time that we implement a command object. For the sake of simplicity, we will implement two commands where we hardcode the tax liability. public class SeniorCitizenCommand : ComputationCommand { public bool Execute(COMPUTATION_CONTEXT ctx) { TaxDTO td = (TaxDTO)ctx.Get("tax_cargo"); //---- Instead of computation, we are assigning //---- constant tax for each arcetypes td.taxparams.TaxLiability = 1000; td.taxparams.Computed = true; return true; } } public class OrdinaryCitizenCommand : ComputationCommand { public bool Execute(COMPUTATION_CONTEXT ctx) { TaxDTO td = (TaxDTO)ctx.Get("tax_cargo"); //---- Instead of computation, we are assigning //---- constant tax for each arcetypes td.taxparams.TaxLiability = 1500; td.taxparams.Computed = true; return true; } } The commands will be invoked by a CommandDispatcher Object, which takes an archetype string and a COMPUTATION_CONTEXT object. The CommandDispatcher acts as an API layer for the application. class CommandDispatcher { public static bool Dispatch(string archetype, COMPUTATION_CONTEXT ctx) { if (archetype == "SeniorCitizen") { SeniorCitizenCommand cmd = new SeniorCitizenCommand(); return cmd.Execute(ctx); } else if (archetype == "OrdinaryCitizen") { OrdinaryCitizenCommand cmd = new OrdinaryCitizenCommand(); return cmd.Execute(ctx); } else { return false; } } } The application to engine communication The data from the application UI, be it Web or Desktop, has to flow to the computation engine. The following ViewHandler routine shows how data, retrieved from the application UI, is passed to the engine, via the Command Dispatcher, by a client: public static void ViewHandler(TaxCalcForm tf) { TaxableEntity te = GetEntityFromUI(tf); if (te == null){ ShowError(); return; } string archetype = ComputeArchetype(te); COMPUTATION_CONTEXT ctx = new COMPUTATION_CONTEXT(); TaxDTO td = new TaxDTO { id = te.id, taxparams = te.taxparams}; ctx.Put("tax_cargo",td); bool rs = CommandDispatcher.Dispatch(archetype, ctx); if ( rs ) { TaxDTO temp = (TaxDTO)ctx.Get("tax_cargo"); tf.Liabilitytxt.Text = Convert.ToString(temp.taxparams.TaxLiability); tf.Refresh(); } } At this point, imagine that a change in requirement has been received from the stakeholders. Now, we need to support tax computation for new categories. Initially, we had different computations for senior citizen and ordinary citizen. Now we need to add new Archetypes. At the same time, to make the software extensible (loosely coupled) and maintainable, it would be ideal if we provide the capability to support new Archetypes in a configurable manner as opposed to recompiling the application for every new archetype owing to concrete references. The Command Dispatcher object does not scale well to handle additional archetypes. We need to change the assembly whenever a new archetype is included, as the tax computation logic varies for each archetype. We need to create a pluggable architecture to add or remove archetypes at will. The plugin system to make system extensible Writing system logic without impacting the application warrants a mechanism—that of loading a class on the fly. Luckily, the .NET Reflection API provides a mechanism for one to load a class during runtime, and invoke methods within it. A developer worth his salt should learn the Reflection API to write systems which change dynamically. In fact, most of the technologies like ASP.NET, Entity framework, .NET Remoting, and WCF work because of the availability of Reflection API in the .NET stack. Henceforth, we will be using an XML configuration file to specify our tax computation logic. A sample XML file is given next: <?xml version="1.0"?> <plugins> <plugin archetype ="OrindaryCitizen" command="TaxEngine.OrdinaryCitizenCommand"/> <plugin archetype="SeniorCitizen" command="TaxEngine.SeniorCitizenCommand"/> </plugins> The contents of the XML file can be read very easily using LINQ to XML. We will be generating a Dictionary object by the following code snippet: private Dictionary<string,string> LoadData(string xmlfile) { return XDocument.Load(xmlfile) .Descendants("plugins") .Descendants("plugin") .ToDictionary(p => p.Attribute("archetype").Value, p => p.Attribute("command").Value); } Summary In this article, we have covered quite a lot of ground in understanding why pattern-oriented software development is a good way to develop modern software. We started the article citing some key principles. We progressed further to demonstrate the applicability of these key principles by iteratively skinning an application which is extensible and resilient to changes. Resources for Article: Further resources on this subject: Debugging Your .NET Application [article] JSON with JSON.Net [article] Using ASP.NET Controls in SharePoint [article]
Read more
  • 0
  • 0
  • 37642

article-image-using-ros-uavs
Packt
10 Nov 2016
11 min read
Save for later

Using ROS with UAVs

Packt
10 Nov 2016
11 min read
In this article by Carol Fairchild and Dr. Thomas L. Harman, co-authors of the book ROS Robotics by Example, you will discover the field of ROS Unmanned Air Vehicles (UAVs), quadrotors, in particular. The reader is invited to learn about the simulated hector quadrotor and take it for a flight. The ROS wiki currently contains a growing list of ROS UAVs. These UAVs are as follows: (For more resources related to this topic, see here.) AscTec Pelican and Hummingbird quadrotors Berkeley's STARMAC Bitcraze Crazyflie DJI Matrice 100 Onboard SDK ROS support Erle-copter ETH sFly Lily CameraQuadrotor Parrot AR.Drone Parrot Bebop Penn's AscTec Hummingbird Quadrotors PIXHAWK MAVs Skybotix CoaX helicopter Refer to http://wiki.ros.org/Robots#UAVs for future additions to this list and to the website http://www.ros.org/news/robots/uavs/ to get the latest ROS UAV news. The preceding list contains primarily quadrotors except for the Skybotix helicopter. A number of universities have adopted the AscTec Hummingbird as their ROS UAV of choice. For this book, we present a simulator called Hector Quadrotor and two real quadrotors Crazyflie and Bebop that use ROS. Introducing Hector quadrotor The hardest part of learning about flying robots is the constant crashing. From the first-time learning of flight control to testing new hardware or flight algorithms, the resulting failures can have a huge cost in terms of broken hardware components. To answer this difficulty, a simulated air vehicle designed and developed for ROS is ideal. A simulated quadrotor UAV for the ROS Gazebo environment has been developed by the Team Hector Darmstadt of Technische Universität Darmstadt. This quadrotor, called Hector Quadrotor, is enclosed in the hector_quadrotor metapackage. This metapackage contains the URDF description for the quadrotor UAV, its flight controllers, and launch files for running the quadrotor simulation in Gazebo. Advanced uses of the Hector Quadrotor simulation allows the user to record sensor data such as Lidar, depth camera, and many more. The quadrotor simulation can also be used to test flight algorithms and control approaches in simulation. The hector_quadrotor metapackage contains the following key packages: hector_quadrotor_description: This package provides a URDF model of Hector Quadrotor UAV and the quadrotor configured with various sensors. Several URDF quadrotor models exist in this package each configured with specific sensors and controllers. hector_quadrotor_gazebo: This package contains launch files for executing Gazebo and spawning one or more Hector Quadrotors. hector_quadrotor_gazebo_plugins: This package contains three UAV specific plugins, which are as follows: The simple controller gazebo_quadrotor_simple_controller subscribes to a geometry_msgs/Twist topic and calculates the required forces and torques A gazebo_ros_baro sensor plugin simulates a barometric altimeter The gazebo_quadrotor_propulsion plugin simulates the propulsion, aerodynamics, and drag from messages containing motor voltages and wind vector input hector_gazebo_plugins: This package contains generic sensor plugins not specific to UAVs such as IMU, magnetic field, GPS, and sonar data. hector_quadrotor_teleop: This package provides a node and launch files for controlling a quadrotor using a joystick or gamepad. hector_quadrotor_demo: This package provides sample launch files that run the Gazebo quadrotor simulation and hector_slam for indoor and outdoor scenarios. The entire list of packages for the hector_quadrotor metapackage appears in the next section. Loading Hector Quadrotor The repository for the hector_quadrotor software is at the following website: https://github.com/tu-darmstadt-ros-pkg/hector_quadrotor The following commands will install the binary packages of hector_quadrotor into the ROS package repository on your computer. If you wish to install the source files, instructions can be found at the following website: http://wiki.ros.org/hector_quadrotor/Tutorials/Quadrotor%20outdoor%20flight%20demo (It is assumed that ros-indigo-desktop-full has been installed on your computer.) For the binary packages, type the following commands to install the ROS Indigo version of Hector Quadrotor: $ sudo apt-get update $ sudo apt-get install ros-indigo-hector-quadrotor-demo A large number of ROS packages are downloaded and installed in the hector_quadrotor_demo download with the main hector_quadrotor packages providing functionality that should now be somewhat familiar. This installation downloads the following packages: hector_gazebo_worlds hector_geotiff hector_map_tools hector_mapping hector_nav_msgs hector_pose_estimation hector_pose_estimation_core hector_quadrotor_controller hector_quadrotor_controller_gazebo hector_quadrotor_demo hector_quadrotor_description hector_quadrotor_gazebo hector_quadrotor_gazebo_plugins hector_quadrotor_model hector_quadrotor_pose_estimation hector_quadrotor_teleop hector_sensors_description hector_sensors_gazebo hector_trajectory_serve hector_uav_msgs message_to_tf A number of these packages will be discussed as the Hector Quadrotor simulations are described in the next section. Launching Hector Quadrotor in Gazebo Two demonstration tutorials are available to provide the simulated applications of the Hector Quadrotor for both outdoor and indoor environments. These simulations are described in the next sections. Before you begin the Hector Quadrotor simulations, check your ROS master using the following command in your terminal window: $ echo $ROS_MASTER_URI If this variable is set to localhost or the IP address of your computer, no action is needed. If not, type the following command: $ export ROS_MASTER_URI=http://localhost:11311 This command can also be added to your .bashrc file. Be sure to delete or comment out (with a #) any other commands setting the ROS_MASTER_URI variable. Flying Hector outdoors The quadrotor outdoor flight demo software is included as part of the hector_quadrotor metapackage. Start the simulation by typing the following command: $ roslaunch hector_quadrotor_demo outdoor_flight_gazebo.launch This launch file loads a rolling landscape environment into the Gazebo simulation and spawns a model of the Hector Quadrotor configured with a Hokuyo UTM-30LX sensor. An rviz node is also started and configured specifically for the quadrotor outdoor flight. A large number of flight position and control parameters are initialized and loaded into the Parameter Server. Note that the quadrotor propulsion model parameters for quadrotor_propulsion plugin and quadrotor drag model parameters for quadrotor_aerodynamics plugin are displayed. Then look for the following message: Physics dynamic reconfigure ready. The following screenshots show both the Gazebo and rviz display windows when the Hector outdoor flight simulation is launched. The view from the onboard camera can be seen in the lower left corner of the rviz window. If you do not see the camera image on your rviz screen, make sure that Camera has been added to your Displays panel on the left and that the checkbox has been checked. If you would like to pilot the quadrotor using the camera, it is best to uncheck the checkboxes for tf and robot_model because the visualizations sometimes block the view: Hector Quadrotor outdoor gazebo view Hector Quadrotor outdoor rviz view The quadrotor appears on the ground in the simulation ready for takeoff. Its forward direction is marked by a red mark on its leading motor mount. To be able to fly the quadrotor, you can launch the joystick controller software for the Xbox 360 controller. In a second terminal window, launch the joystick controller software with a launch file from the hector_quadrotor_teleop package: $ roslaunch hector_quadrotor_teleop xbox_controller.launch This launch file launches joy_node to process all joystick input from the left stick and right stick on the Xbox 360 controller as shown in the following figure. The message published by joy_node contains the current state of the joystick axes and buttons. The quadrotor_teleop node subscribes to these messages and publishes messages on the cmd_vel topic. These messages provide the velocity and direction for the quadrotor flight. Several joystick controllers are currently supported by the ROS joy package including PS3 and Logitech devices. For this launch, the joystick device is accessed as /dev/input/js0 and is initialized with a deadzone of 0.050000. Parameters to set the joystick axes are as follows: * /quadrotor_teleop/x_axis: 5 * /quadrotor_teleop/y_axis: 4 * /quadrotor_teleop/yaw_axis: 1 * /quadrotor_teleop/z_axis: 2 These parameters map to the Left Stick and the Right Stick controls on the Xbox 360 controller shown in the following figure. The direction of these sticks control are as follows: Left Stick: Forward (up) is to ascend Backward (down) is to descend Right is to rotate clockwise Left is to rotate counterclockwise Right Stick: Forward (up) is to fly forward Backward (down) is to fly backward Right is to fly right Left is to fly left Xbox 360 joystick controls for Hector Now use the joystick to fly around the simulated outdoor environment! The pilot's view can be seen in the Camera image view on the bottom left of the rviz screen. As you fly around in Gazebo, keep an eye on the Gazebo launch terminal window. The screen will display messages as follows depending on your flying ability: [ INFO] [1447358765.938240016, 617.860000000]: Engaging motors! [ WARN] [1447358778.282568898, 629.410000000]: Shutting down motors due to flip over! When Hector flips over, you will need to relaunch the simulation. Within ROS, a clearer understanding of the interactions between the active nodes and topics can be obtained by using the rqt_graph tool. The following diagram depicts all currently active nodes (except debug nodes) enclosed in oval shapes. These nodes publish to the topics enclosed in rectangles that are pointed to by arrows. You can use the rqt_graph command in a new terminal window to view the same display: ROS nodes and topics for Hector Quadrotor outdoor flight demo The rostopic list command will provide a long list of topics currently being published. Other command line tools such as rosnode, rosmsg, rosparam, and rosservice will help gather specific information about Hector Quadrotor's operation. To understand the orientation of the quadrotor on the screen, use the Gazebo GUI to show the vehicle's tf reference frame. Select quadrotor in the World panel on the left, then select the translation mode on the top environment toolbar (looks like crossed double-headed arrows). This selection will bring up the red-green-blue axis for the x-y-z axes of the tf frame, respectively. In the following figure, the x axis is pointing to the left, the y axis is pointing to the right (toward the reader), and the z axis is pointing up. Hector Quadrotor tf reference frame An YouTube video of hector_quadrotor outdoor scenario demo shows the hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=9CGIcc0jeuI Flying Hector indoors The quadrotor indoor SLAM demo software is included as part of the hector_quadrotor metapackage. To launch the simulation, type the following command: $ roslaunch hector_quadrotor_demo indoor_slam_gazebo.launch The following screenshots show both the rviz and Gazebo display windows when the Hector indoor simulation is launched: Hector Quadrotor indoor rviz and gazebo views If you do not see this image for Gazebo, roll your mouse wheel to zoom out of the image. Then you will need to rotate the scene to a top-down view, in order to find the quadrotor press Shift + right mouse button. The environment was the offices at Willow Garage and Hector starts out on the floor of one of the interior rooms. Just like in the outdoor demo, the xbox_controller.launch file from the hector_quadrotor_teleop package should be executed: $ roslaunch hector_quadrotor_teleop xbox_controller.launch If the quadrotor becomes embedded in the wall, waiting a few seconds should release it and it should (hopefully) end up in an upright position ready to fly again. If you lose sight of it, zoom out from the Gazebo screen and look from a top-down view. Remember that the Gazebo physics engine is applying minor environment conditions as well. This can create some drifting out of its position. The rqt graph of the active nodes and topics during the Hector indoor SLAM demo is shown in the following figure. As Hector is flown around the office environment, the hector_mapping node will be performing SLAM and be creating a map of the environment. ROS nodes and topics for Hector Quadrotor indoor SLAM demo The following screenshot shows Hector Quadrotor mapping an interior room of Willow Garage: Hector mapping indoors using SLAM The 3D robot trajectory is tracked by the hector_trajectory_server node and can be shown in rviz. The map along with the trajectory information can be saved to a GeoTiff file with the following command: $ rostopic pub syscommand std_msgs/String "savegeotiff" The savegeotiff map can be found in the hector_geotiff/map directory. An YouTube video of hector_quadrotor stack indoor SLAM demo shows hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=IJbJbcZVY28 Summary In this article, we learnt about Hector Quadrotors, loading Hector Quadrotors, launching Hector Quadrotor in Gazebo, and also about flying Hector outdoors and indoors. Resources for Article: Further resources on this subject: Working On Your Bot [article] Building robots that can walk [article] Detecting and Protecting against Your Enemies [article]
Read more
  • 0
  • 0
  • 37490

article-image-data-clustering
Packt
10 Nov 2016
6 min read
Save for later

Data Clustering

Packt
10 Nov 2016
6 min read
In this article by Rodolfo Bonnin, the author of the book Building Machine Learning Projects with TensorFlow, we will start applying data transforming operations. We will begin finding interesting patterns in some given information, discovering groups of data, or clusters, and using clustering techniques. (For more resources related to this topic, see here.) In this process we'll also gain two new tools: the ability to generate synthetic sample sets from a collection of representative data structures via the scikit-learn library, and the ability to graphically plot our data and model results, this time via the matplotlib library. The topics we will cover in this article are as follows: Getting an idea of how clustering works, and comparing it to other alternative existent classification techniques Using scikit-learn and matplotlib to enrichen the possibilities of dataset choices, and to get professional looking graphical representation of the data Implementing the K-means clustering algorithm Test some variations of the K-means methods to improve the fit and/or the convergence rate Three types of learning from data Based on how we approach the supervision of the samples, we can extract three types of learning: Unsupervised learning: The fully unsupervised approach directly takes a number of undetermined elements and builds a classification of them, looking at different properties that could determine its class Semi-supervised learning: The semi-supervised approach has a number of known classified items and then applies techniques to discover the class of the remaining items Supervised learning: In supervised learning, we start from a population of samples, which have a known type beforehand, and then build a model from it Normally there are three sample populations: one from which the model grows, called training set, one that is used to test the model, called training set, and then there are the samples for which we will be doing classification. Types of data learning based on supervision: unsupervised, semi-supervised, and supervised Unsupervised data clustering One of the simplest operations that can be initially applied to an unknown dataset is to try to understand the possible grouping or common features that the dataset members have. To do so, we could try to find representative points in them that summarize a balance of the parameters of the members of the group. This value could be, for example, the mean or the median of all the cluster members. This also guides to the idea of defining a notion of distance between members: all the members of the groups should be obviously at short distances between them and the representative points, that from the central points of the other groups. In the following image, we can see the results of a typical clustering algorithm and the representation of the cluster centers: Sample clustering algorithm output K-means K-means is a very well-known clustering algorithm that can be easily implemented. It is very straightforward and can guide (depending on the data layout) to a good initial understanding of the provided information. Mechanics of K-means K-means tries to divide a set of samples into K disjoint groups or clusters, using as a main indicator the mean value (be it 1D, 2D, and so on) of the members. This point is normally called centroid, referring to the arithmetic entity with the same name. One important characteristic of K-means is that K should be provided beforehand, and so some previous knowledge of the data is needed to avoid a non-representative result. Algorithm iteration criterion The criterion and goal of this method is to minimize the sum of squared distances from the cluster's member to the actual centroid of all cluster contained samples. This is also known as minimization of inertia. Error minimization criteria for K-means K-means algorithm breakdown The mechanism of the K-means algorithm can be summarized in the following graphic: Simplified flow chart of the K-means process And this is a simplified summary of the algorithm: We start with unclassified samples and take K elements as the starting centroids. There are also possible simplifications of this algorithm that take the first elements in the element list, for the sake of brevity. We then calculate the distances between the samples and the first chosen samples, and so we get the first calculated centroids (or other representative values). You can see in the moving centroids in the illustration toward a more common sense centroid. After the centroids change, their displacement will provoke the individual distances to change, and so the cluster membership can change. So this is the time when we recalculate the centroids and repeat the first steps, in case the stop condition isn't met. The stopping conditions could be of various types: After n iterations (it could be that either we chose a too large number and we'll have unnecessary rounds of computing, or it could converge slowly and we will have a very unconvincing results) if the centroid doesn't have a very stable means. This stop condition could also be used as a last resort if we have a really long iterative process. Referring to the previous mean result, a possibly better criterion for the convergence of the iterations is to take a look at the changes of the centroids, be it in total displacement or total cluster element switches. The last one is employed normally, so we will stop the process once there are no more element-changing clusters: K-means simplified graphic Pros and cons of K-means The advantages of this method are: It scales very well (most of the calculations can be run in parallel) It has been used in a very large range of applications But its simplicity has also a price (no silver bullet rule applies): It requires apriori knowledge (the number of possible clusters should be known beforehand) The outlier values can push the values of the centroids, as they have the same value as any other sample As we assume that the figure is convex and isotropic, it doesn't work very well with non-circle-like delimited clusters Summary In this article, we got a simple overview of some of the most basic models we can implement, but we tried to be as detailed in the explanation as possible. From now on, we are able to generate synthetic datasets, allowing us to rapidly test the adequacy of a model for different data configurations and so evaluate the advantages and shortcoming of them without having to load models with a greater number of unknown characteristics. You can also refer to the following books on the similar topics: Getting Started with TensorFlow: https://www.packtpub.com/big-data-and-business-intelligence/getting-started-tensorflow R Machine Learning Essentials: https://www.packtpub.com/big-data-and-business-intelligence/r-machine-learning-essentials Building Machine Learning Systems with Python - Second Edition: https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition Resources for Article: Further resources on this subject: Supervised Machine Learning [article] Unsupervised Learning [article] Preprocessing the Data [article]
Read more
  • 0
  • 0
  • 2294

article-image-creating-personal-web-portal-pwp
Packt
10 Nov 2016
8 min read
Save for later

Creating a Personal Web Portal (PWP)

Packt
10 Nov 2016
8 min read
In this article by Sherwin John Calleja Tragura, author of the book Spring MVC Blueprints, we will discuss about creating a robust and simple personal web portal that can serve as a personal web page, or a professional reference site, for anyone. Usually, these kinds of web sites are used as mashups, or dashboards, of centralized sources of information describing and individual or group. (For more resources related to this topic, see here.) Technically, a personal web portal is a composition of web components like CSS, HTML, and JavaScript, woven together to create a formal, simple or exquisite presentation of any content. It can be used, in its simplest form, as a personal portfolio or an enterprise form like an e-commerce content management system. Commercially, these portals are drafted and designed using the principles of the Rich-client platform or responsive web designs. In the industry, most companies suggest that clients try easy-to-use-tools like PHP frameworks (for example, CodeIgniter, Laravel, Drupal) and seldom advise using JEE-based portals. Overview of the project The personal web portal (PWP) created publishes a simple biography, and professional information, one can at least share through the web. The prototype is a session-driven one that can do dynamic transactions, like updating information on the web pages, and posting notes on the page without using any back-end database. Through using wireframes, the following are the initial drafts and design of the web portal: The Home Page: This is the first page of the site that shows updatable quotes, and inspiring messages coming from the owner of the portal. It contains a sticky-note feature at the side that allows visitors to post their short greetings to the owner in real-time. The Personal Information Page: This page highlights personal information of the owner including the owner's name, age, hobbies, birth date, and age. This page contains some part of the blogger's educational history. The page is dynamic and can be updated at anytime by the owner. The Professional Information Page: This page presents details about the owner's career background. It lists down all the previous jobs of the account owner, and enumerates all skills-related information. This page is also updatable. The Reach Out Page: This serves as the contact information page of the owner. Moreover, it allows visitors to send their contact information, and specifically their electronic mail address, to the portal owner. Update pages: The Home, Personal and Professional pages has updateable pages for the owner to update the content of the portal. The prototype has the capability to update the information presented in the content at anytime the user desires. This simple prototype, called PWP, will give clear steps on how to build personal sites from the ground-up, using Spring MVC 4.x specifications. It will give enthusiasts the opportunity to start creating Spring-based web portals in just a day, without using any database backend. To those who are new to the Spring MVC 4.x concept, this article will be a good start in building full-blown portal sites. Technical requirements In order to start the development, the following tools need to be installed onto the platform: Java Development Kit (JDK) 1.7.x Spring Tool Suite (Eclipse) 3.6 Maven 3.x Spring Framework 4.1 Apache Tomcat 7.x Any operating system First, the JDK 1.7.x installer must be installed. Visit the site http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html to download the installer. Next, setup the Spring Tool Suite 3.6 (Eclipse-based) which will be the official Integrated Development Environment (IDE) of this article. Download the Spring Tool Suite 3.6 at https://spring.io/tools/sts. Setting-up the development environment This article recommends Spring Tool Suite (Eclipse) 3.6 since it has all the Spring Framework 4.x plug-ins, and other dependencies needed by the projects. To start us off, the following screenshot shows the dashboard of the STS IDE: Conversely, Apache Maven 3.x will be used to build and deploy the project for this article. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information (https://maven.apache.org/). There is already a Maven plugin installed in the STS IDE that can be used to generate the needed development directory structure. Among the many ways to create Spring MVC projects, this article focuses on two styles, namely: Converting a dynamic web project to a Maven specimen Creating a Maven project from scratch Converting a dynamic web project to a Maven project To start creating the project, press CTRL + N to browse the Menu wizard of the IDE. This menu wizard contains all the types of project modules you'll need to start a project. The Menu wizard should look similar to the following screenshot: Once on the menu, browse the Web option and choose Dynamic Web Project. Afterwards, just follow the series of instructions to create the chosen project module until you reached the last menu wizard, which looks like the following figure: This last instruction (Web Module panel) will auto-generate the deployment descriptor (web.xml) of the project. Always click on the Generate web-xml deployment descriptor checkbox option. The deployment descriptor is an XML file that must reside inside the /WEB-INF/ folder of all JEE projects. This file describes how a component, module or application can be deployed. A JEE project must always be in the web.xml file otherwise the project will be defective. Since Spring 4.x container supports the Servlet Specification 3.0 in Tomcat 7 and above, web.xml is no longer mandatory and can be replaced by org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer or org.springframework.web.servlet.support.AbstractDispatcherServletInitializer class. The next major step is to convert the newly created dynamic web project to a Maven one. To complete the conversion, right-click on the project and navigate to the Configure | Convert Maven Project command set, as shown in the following screenshot: It is always best for the developer to study the directory structure of the project folder created before the actual implementation starts. Below is the directory structure of the Maven project after the conversion: The project directories are just like the usual Eclipse Dynamic Web project without the pom.xml file. Creating a Maven project from scratch Another method of creating a Spring MVC web project is by creating a Maven project from the start. Be sure to install the Maven 3.2 plugin in STS Eclipse. Browse the Menu wizard again, and locate the Maven option. Click on the Maven Project to generate a new Maven project. After clicking this option, a wizard will pop up, asking if an archetype is needed or not to create the Maven project. An archetype is a Maven plugin whose main objective is to create a project structure as per its template. To start quickly, choose an archetype plugin to create a simple java application here. It is recommended to create the project using the archetype maven-archetype-webapp. However, skipping the archetype selection can still be a valid option. After you've done this, proceed with the Select an Archetype window shown in the following screenshot. Locate maven-archetype-webapp then proceed with the last process. The selection of the Archetype maven-archetype-webapp will require the input of Maven parameters before the ending the whole process with a new Maven project: The required parameters for the Maven group or project are as follows: Group Id (groupId): This is the ID of the project's group and must be unique among all the project's groups Artifact Id (artifactId): This is the ID of the project. This is generally the name of the project Version (version): This is the version of the project Package (package): The initial or core package of the sources For more information on Maven plugin and configuration details, visit the documentation and samples on the site http://maven.apache.org/. After providing the Maven parameters, the project source folder structure will be similar to the following screenshot: Summary Using the basic Spring Framework 4.x APIs, web portal creators can create their own platform to promote their personal philosophy, business, ideology, religion, and other concepts. Although it is an advantage to use existing portal platforms made in other language like PHP and Python, it is still fulfilling to design and develop our own portal based on an open-source framework. The PWP is just prototype software that needs to be upgraded to have a backend database, security, and other social media plugins, in order to make the software commercially competitive. Resources for Article: Further resources on this subject: Designing your very own ASP.NET MVC Application [article] ASP.NET MVC 2: Validating MVC [article] Using ASP.NET Controls in SharePoint [article]
Read more
  • 0
  • 0
  • 9877

article-image-testing-components-service-dependencies
Victor Mejia
10 Nov 2016
5 min read
Save for later

Testing Components with Service Dependencies

Victor Mejia
10 Nov 2016
5 min read
It is very common for your Angular 2 components to depend on a service that performs actions, such as fetching data. In this post we will look at testing components with service dependencies, and at testing asynchronous actions. We will be using Jasmine for our tests. If you have not read Getting Started Testing Angular 2 Components, I strongly suggest you do so before continuing. Angular 2 Component with a Service Dependency Continuing with our contact manager application, we need to have a ContactService that fetches data from a server: import { Injectable } from '@angular/core'; import { Http } from '@angular/http'; import 'rxjs/add/operator/map'; @Injectable() export class ContactService { constructor(private http: Http){ } getContacts() { return this.http.get('/contacts.json') .map(res => res.json()); } } The Http service is injected here, and TypeScript will automatically assign the injected service to this.http. With this service ready to use, we are now ready to inject it into our ContactsComponent : import { Component, OnInit } from '@angular/core'; import { ContactService } from '../shared/contact.service'; @Component({ selector: 'contacts', template: ` <button (click)="getContacts()">Get Contacts</button> <profile *ngFor="let profile of contacts" [info]="profile"></profile> ` }) export class ContactsComponent implements OnInit { contacts: Array<any>; constructor(private contactService: ContactService) { } ngOnInit() { } getContacts() { this.contactService.getContacts() .subscribe(data => { this.contacts = data; }); } } We have an action set up, so when we click on the button, we make a call to our ContactService to fetch the data and assign the result. Once the call is resolved, the data will display. Setting up your unit test What we have to keep in mind is that we want to test our components in isolation. What this means is that instead of using the actual ContactService implementation, we create a MockContactService that returns mock data (array of Profile s). let mockData = [ { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' } ]; class MockContactService { getContacts(url) { return Observable.create((observer: Observer<Array<Profile>>) => { observer.next(mockData); }); } } When configuring our testing module, we add a new property,providers, where we specify the usage of our mock service: TestBed.configureTestingModule({ declarations: [ContactsComponent], providers: [ { provide: ContactService, useClass: MockContactService } ] }); We can now go ahead and get handles on fixture, component, and element : import { TestBed, async, ComponentFixture } from '@angular/core/testing'; import { ContactsComponent } from './contacts.component'; import { ContactService } from '../shared/contact.service'; import { Profile } from '../shared/profile.model'; import { Observable } from 'rxjs/Observable'; import { Observer } from 'rxjs/Observer'; let mockData = [ { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' } ]; class MockContactService { getContacts(url) { return Observable.create((observer: Observer<Array<Profile>>) => { observer.next(mockData); }); } } let fixture: ComponentFixture<ContactsComponent>; let component: ContactsComponent; let element: HTMLElement; describe('Component: Contacts', () => { beforeEach(async(() => { TestBed.configureTestingModule({ declarations: [ContactsComponent], providers: [ { provide: ContactService, useClass: MockContactService } ] }); TestBed.compileComponents() .then(() => { fixture = TestBed.createComponent(ContactsComponent); component = fixture.debugElement.componentInstance; element = fixture.debugElement.nativeElement }); })); }); Ensuring calls to our service A good test to always perform is to ensure that your actions are making the correct calls to your service. To do so, we can spy on the getContacts() function on the service, calling the component action and then ensuring that the function was indeed called: describe('getContacts', () => { it('should make a call to contactService.getContacts()', () => { spyOn(component.contactService, 'getContacts').and.callThrough(); component.getContacts(); expect(component.contactService.getContacts).toHaveBeenCalled(); }); }); Ensuring data is set A follow-up test to be performed is to ensure that the data is being set on the component after the call to the API is resolved. Since our call to getContacts() is performing an asynchronous action, we should use the async function in the it: it('should set the contacts property after fetching data', async(() => { ... })); It wraps the test function in an asynchronous “test zone”. Basically, it automatically completes when the asynchronous actions are complete. Next, we can make a call to component.getContacts() . However, we don’t want to run our specs until after that call has been resolved. There is a useful function we can use in our fixture, fixture.whenStable(). This returns a promise that resolves after asynchronous activity. Our test should now look as follows: it('should set the contacts property after fetching data', async(() => { component.getContacts(); fixture.whenStable().then(() => { expect(component.contacts).toEqual(mockData); }); })); We simply run a check to ensure that the contacts property is set to what the API call returns. Finer Async Control There are times when you want finer control, such as dealing with time intervals, and so on. To do so, we can simply use the fakeAsync in conjunction with the tick() function to simulate the passage of time. it('asynchronous timed test...', fakeAsync(() => { component.asyncActionWithTime(); tick(2000); // "advance" 2 seconds expect(...).toBe(...); })); Conclusion Angular 2 has wonderful APIs that make it really easy to test your components. We have seen how to test components with service dependencies, along with asynchronous actions. Time to start writing tests!
Read more
  • 0
  • 0
  • 9531
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-internationalization
Packt
10 Nov 2016
16 min read
Save for later

Internationalization

Packt
10 Nov 2016
16 min read
In this article by Jérémie Bouchet author of the book Magento Extensions Development. We will see how to handle this aspect of our extension and how it is handled in a complex extension using an EAV table structure. In this article, we will cover the following topics: The EAV approach Store relation table Translation of template interface texts (For more resources related to this topic, see here.) The EAV approach The EAV structure in Magento is used for complex models, such as customer and product entities. In our extension, if we want to add a new field for our events, we would have to add a new column in the main table. With the EAV structure, each attribute is stored in a separate table depending on its type. For example, catalog_product_entity, catalog_product_entity_varchar and catalog_product_entity_int. Each row in the subtables has a foreign key reference to the main table. In order to handle multiple store views in this structure, we will add a column for the store ID in the subtables. Let's see an example for a product entity, where our main table contains only the main attribute: The varchar table structure is as follows: The 70 attribute corresponds to the product name and is linked to our 1 entity. There is a different product name for the store view, 0 (default) and 2 (in French in this example). In order to create an EAV model, you will have to extend the right class in your code. You can inspire your development on the existing modules, such as customers or products. Store relation table In our extension, we will handle the store views scope by using a relation table. This behavior is also used for the CMS pages or blocks, reviews, ratings, and all the models that are not EAV-based and need to be store views-related. Creating the new table The first step is to create the new table to store the new data: Create the [extension_path]/Setup/UpgradeSchema.php file and add the following code: <?php namespace BlackbirdTicketBlasterSetup; use MagentoEavSetupEavSetup; use MagentoEavSetupEavSetupFactory; use MagentoFrameworkSetupUpgradeSchemaInterface; use MagentoFrameworkSetupModuleContextInterface; use MagentoFrameworkSetupSchemaSetupInterface; /** * @codeCoverageIgnore */ class UpgradeSchema implements UpgradeSchemaInterface { /** * EAV setup factory * * @varEavSetupFactory */ private $eavSetupFactory; /** * Init * * @paramEavSetupFactory $eavSetupFactory */ public function __construct(EavSetupFactory $eavSetupFactory) { $this->eavSetupFactory = $eavSetupFactory; } public function upgrade(SchemaSetupInterface $setup, ModuleContextInterface $context) { if (version_compare($context->getVersion(), '1.3.0', '<')) { $installer = $setup; $installer->startSetup(); /** * Create table 'blackbird_ticketblaster_event_store' */ $table = $installer->getConnection()->newTable( $installer->getTable('blackbird_ticketblaster_event_store') )->addColumn( 'event_id', MagentoFrameworkDBDdlTable::TYPE_SMALLINT, null, ['nullable' => false, 'primary' => true], 'Event ID' )->addColumn( 'store_id', MagentoFrameworkDBDdlTable::TYPE_SMALLINT, null, ['unsigned' => true, 'nullable' => false, 'primary' => true], 'Store ID' )->addIndex( $installer->getIdxName('blackbird_ticketblaster_event_store', ['store_id']), ['store_id'] )->addForeignKey( $installer->getFkName('blackbird_ticketblaster_event_store', 'event_id', 'blackbird_ticketblaster_event', 'event_id'), 'event_id', $installer->getTable('blackbird_ticketblaster_event'), 'event_id', MagentoFrameworkDBDdlTable::ACTION_CASCADE )->addForeignKey( $installer->getFkName('blackbird_ticketblaster_event_store', 'store_id', 'store', 'store_id'), 'store_id', $installer->getTable('store'), 'store_id', MagentoFrameworkDBDdlTable::ACTION_CASCADE )->setComment( 'TicketBlaster Event To Store Linkage Table' ); $installer->getConnection()->createTable($table); $installer->endSetup(); } } } The upgrade method will handle all the necessary updates in our database for our extension. In order to differentiate the update for a different version of the extension, we surround the script with a version_compare() condition. Once this code is set, we need to tell Magento that our extension has new database upgrades to process. Open the [extension_path]/etc/module.xml file and change the version number 1.2.0 to 1.3.0: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="../../../../../lib/internal/Magento/Framework/Module/etc/module.xsd"> <module name="Blackbird_TicketBlaster" setup_version="1.3.0"> <sequence> <module name="Magento_Catalog"/> <module name="Blackbird_AnotherModule"/> </sequence> </module> </config> In your terminal, run the upgrade by typing the following command: php bin/magentosetup:upgrade The new table structure now contains two columns: event_id and store_id. This table will store which events are available for store views: If you have previously created events, we recommend emptying the existing blackbird_ticketblaster_event table, because they won't have a default store view and this may trigger an error output. Adding the new input to the edit form In order to select the store view for the content, we will need to add the new input to the edit form. Before running this code, you should add a new store view: Here's how to do that: Open the [extension_path]/Block/Adminhtml/Event/Edit/Form.php file and add the following code in the _prepareForm() method, below the last addField() call: /* Check is single store mode */ if (!$this->_storeManager->isSingleStoreMode()) { $field = $fieldset->addField( 'store_id', 'multiselect', [ 'name' => 'stores[]', 'label' => __('Store View'), 'title' => __('Store View'), 'required' => true, 'values' => $this->_systemStore->getStoreValuesForForm(false, true) ] ); $renderer = $this->getLayout()->createBlock( 'MagentoBackendBlockStoreSwitcherFormRendererFieldsetElement' ); $field->setRenderer($renderer); } else { $fieldset->addField( 'store_id', 'hidden', ['name' => 'stores[]', 'value' => $this->_storeManager->getStore(true)->getId()] ); $model->setStoreId($this->_storeManager->getStore(true)->getId()); } This results in a new multiselect field in the form. Saving the new data in the new table Now we have the form and the database table, we have to write the code to save the data from the form: Open the [extension_path]/Model/Event.php file and add the following method at its end: /** * Receive page store ids * * @return int[] */ public function getStores() { return $this->hasData('stores') ? $this->getData('stores') : $this->getData('store_id'); } Open the [extension_path]/Model/ResourceModel/Event.php file and replace all the code with the following code: <?php namespace BlackbirdTicketBlasterModelResourceModel; class Event extends MagentoFrameworkModelResourceModelDbAbstractDb { [...] The afterSave() method is handling our insert queries in the new table. The afterload() and getLoadSelect() methods are handling the new load mode to select the right events. Your new table is now filled when you save your events; they are also properly loaded when you go back to your edit form. Showing the store views in the admin grid In order to inform admin users of the selected store views for one event, we will add a new column in the admin grid: Open the [extension_path]/Model/ResourceModel/Event/Collection.php file and replace all the code with the following code: <?php namespace BlackbirdTicketBlasterModelResourceModelEvent; class Collection extends MagentoFrameworkModelResourceModelDbCollectionAbstractCollection { [...] Open the [extention_path]/view/adminhtml/ui_component/ticketblaster_event_listing.xml file and add the following XML instructions before the end of the </filters> tag: <filterSelect name="store_id"> <argument name="optionsProvider" xsi_type="configurableObject"> <argument name="class" xsi_type="string">MagentoCmsUiComponentListingColumnCmsOptions</argument> </argument> <argument name="data" xsi_type="array"> <item name="config" xsi_type="array"> <item name="dataScope" xsi_type="string">store_id</item> <item name="label" xsi_type="string" translate="true">Store View</item> <item name="captionValue" xsi_type="string">0</item> </item> </argument> </filterSelect> Before the actionsColumn tag, add the new column: <column name="store_id" class="MagentoStoreUiComponentListingColumnStore"> <argument name="data" xsi_type="array"> <item name="config" xsi_type="array"> <item name="bodyTmpl" xsi_type="string">ui/grid/cells/html</item> <item name="sortable" xsi_type="boolean">false</item> <item name="label" xsi_type="string" translate="true">Store View</item> </item> </argument> </column> You can refresh your grid page and see the new column added at the end. Magento remembers the previous column's order. If you add a new column, it will always be added at the end of the table. You will have to manually reorder them by dragging and dropping them. Modifying the frontend event list Our frontend list (/events) is still listing all the events. In order to list only the events available for our current store view, we need to change a file: Edit the [extension_path]/Block/EventList.php file and replace the code with the following code: <?php namespace BlackbirdTicketBlasterBlock; use BlackbirdTicketBlasterApiDataEventInterface; use BlackbirdTicketBlasterModelResourceModelEventCollection as EventCollection; use MagentoCustomerModelContext; class EventList extends MagentoFrameworkViewElementTemplate implements MagentoFrameworkDataObjectIdentityInterface { /** * Store manager * * @var MagentoStoreModelStoreManagerInterface */ protected $_storeManager; /** * @var MagentoCustomerModelSession */ protected $_customerSession; /** * Construct * * @param MagentoFrameworkViewElementTemplateContext $context * @param BlackbirdTicketBlasterModelResourceModelEventCollectionFactory $eventCollectionFactory, * @param array $data */ public function __construct( MagentoFrameworkViewElementTemplateContext $context, BlackbirdTicketBlasterModelResourceModelEventCollectionFactory $eventCollectionFactory, MagentoStoreModelStoreManagerInterface $storeManager, MagentoCustomerModelSession $customerSession, array $data = [] ) { parent::__construct($context, $data); $this->_storeManager = $storeManager; $this->_eventCollectionFactory = $eventCollectionFactory; $this->_customerSession = $customerSession; } /** * @return BlackbirdTicketBlasterModelResourceModelEventCollection */ public function getEvents() { if (!$this->hasData('events')) { $events = $this->_eventCollectionFactory ->create() ->addOrder( EventInterface::CREATION_TIME, EventCollection::SORT_ORDER_DESC ) ->addStoreFilter($this->_storeManager->getStore()->getId()); $this->setData('events', $events); } return $this->getData('events'); } /** * Return identifiers for produced content * * @return array */ public function getIdentities() { return [BlackbirdTicketBlasterModelEvent::CACHE_TAG . '_' . 'list']; } /** * Is logged in * * @return bool */ public function isLoggedIn() { return $this->_customerSession->isLoggedIn(); } } Note that we have a new property available and instantiated in our constructor: storeManager. Thanks to this class, we can filter our collection with the store view ID by calling the addStoreFilter() method on our events collection. Restricting the frontend access by store view The events will not be listed in our list page if they are not available for the current store view, but they can still be accessed with their direct URL, for example http://[magento_url]/events/view/index/event_id/2. We will change this to restrict the frontend access by store view: Open the [extention_path]/Helper/Event.php file and replace the code with the following code: <?php namespace BlackbirdTicketBlasterHelper; use BlackbirdTicketBlasterApiDataEventInterface; use BlackbirdTicketBlasterModelResourceModelEventCollection as EventCollection; use MagentoFrameworkAppActionAction; class Event extends MagentoFrameworkAppHelperAbstractHelper { /** * @var BlackbirdTicketBlasterModelEvent */ protected $_event; /** * @var MagentoFrameworkViewResultPageFactory */ protected $resultPageFactory; /** * Store manager * * @var MagentoStoreModelStoreManagerInterface */ protected $_storeManager; /** * Constructor * * @param MagentoFrameworkAppHelperContext $context * @param BlackbirdTicketBlasterModelEvent $event * @param MagentoFrameworkViewResultPageFactory $resultPageFactory * @SuppressWarnings(PHPMD.ExcessiveParameterList) */ public function __construct( MagentoFrameworkAppHelperContext $context, BlackbirdTicketBlasterModelEvent $event, MagentoFrameworkViewResultPageFactory $resultPageFactory, MagentoStoreModelStoreManagerInterface $storeManager, ) { $this->_event = $event; $this->_storeManager = $storeManager; $this->resultPageFactory = $resultPageFactory; $this->_customerSession = $customerSession; parent::__construct($context); } /** * Return an event from given event id. * * @param Action $action * @param null $eventId * @return MagentoFrameworkViewResultPage|bool */ public function prepareResultEvent(Action $action, $eventId = null) { if ($eventId !== null && $eventId !== $this->_event->getId()) { $delimiterPosition = strrpos($eventId, '|'); if ($delimiterPosition) { $eventId = substr($eventId, 0, $delimiterPosition); } $this->_event->setStoreId($this->_storeManager->getStore()->getId()); if (!$this->_event->load($eventId)) { return false; } } if (!$this->_event->getId()) { return false; } /** @var MagentoFrameworkViewResultPage $resultPage */ $resultPage = $this->resultPageFactory->create(); // We can add our own custom page handles for layout easily. $resultPage->addHandle('ticketblaster_event_view'); // This will generate a layout handle like: ticketblaster_event_view_id_1 // giving us a unique handle to target specific event if we wish to. $resultPage->addPageLayoutHandles(['id' => $this->_event->getId()]); // Magento is event driven after all, lets remember to dispatch our own, to help people // who might want to add additional functionality, or filter the events somehow! $this->_eventManager->dispatch( 'blackbird_ticketblaster_event_render', ['event' => $this->_event, 'controller_action' => $action] ); return $resultPage; } } The setStoreId() method called on our model will load the model only for the given ID. The events are no longer available through their direct URL if we are not on their available store view. Translation of template interface texts In order to translate the texts written directly in the template file, for the interface or in your PHP class, you need to use the __('Your text here') method. Magento looks for a corresponding match within all the translation CSV files. There is nothing to be declared in XML; you simply have to create a new folder at the root of your module and create the required CSV: Create the [extension_path]/i18n folder. Create [extension_path]/i18n/en_US.csv and add the following code: "Event time:","Event time:" "Please sign in to read more details.","Please sign in to read more details." "Read more","Read more" Create [extension_path]/i18n/en_US.csv and add the following code: "Event time:","Date de l'évènement :" "Pleasesign in to read more details.","Merci de vous inscrire pour plus de détails." "Read more","Lire la suite" The CSV file contains the correspondences between the key used in the code and the value in its final language. Translation of e-mail templates: creating and translating the e-mails We will add a new form in the Details page to share the event to a friend. The first step is to declare your e-mail template. To declare your e-mail template, create a new [extension_path]/etc/email_templates.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Email:etc/email_templates.xsd"> <template id="ticketblaster_email_email_template" label="Share Form" file="share_form.html" type="text" module="Blackbird_TicketBlaster" area="adminhtml"/> </config> This XML line declares a new template ID, label, file path, module, and area (frontend or adminhtml). Next, create the corresponding template by creating the [extension_path]/view/adminhtml/email/share_form.html file and add the following code: <!--@subject Share Form@--> <!--@vars { "varpost.email":"Sharer Email", "varevent.title":"Event Title", "varevent.venue":"Event Venue" } @--> <p>{{trans "Your friend %email is sharing an event with you:" email=$post.email}}</p> {{trans "Title: %title" title=$event.title}}<br/> {{trans "Venue: %venue" venue=$event.venue}}<br/> <p>{{trans "View the detailed page: %url" url=$event.url}}</p> Note that in order to translate texts within the HTML file, we use the trans function, which works like the default PHP printf() function. The function will also use our i18n CSV files to find a match for the text. Your e-mail template can also be overridden directly from the backoffice: Marketing | Email templates. The e-mail template is ready; we will also add the ability to change it in the system configuration and allow users to determine the sender's e-mail and name: Create the [extension_path]/etc/adminhtml/system.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Config:etc/system_file.xsd"> <system> <section id="ticketblaster" translate="label" type="text" sortOrder="100" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Ticket Blaster</label> <tab>general</tab> <resource>Blackbird_TicketBlaster::event</resource> <group id="email" translate="label" type="text" sortOrder="50" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Options</label> <field id="recipient_email" translate="label" type="text" sortOrder="10" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Send Emails To</label> <validate>validate-email</validate> </field> <field id="sender_email_identity" translate="label" type="select" sortOrder="20" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Sender</label> <source_model>MagentoConfigModelConfigSourceEmailIdentity</source_model> </field> <field id="email_template" translate="label comment" type="select" sortOrder="30" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Template</label> <comment>Email template chosen based on theme fallback when "Default" option is selected.</comment> <source_model>MagentoConfigModelConfigSourceEmailTemplate</source_model> </field> </group> </section> </system> </config> Create the [extension_path]/etc/config.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Store:etc/config.xsd"> <default> <ticketblaster> <email> <recipient_email> <![CDATA[hello@example.com]]> </recipient_email> <sender_email_identity>custom2</sender_email_identity> <email_template>ticketblaster_email_email_template</email_template> </email> </ticketblaster> </default> </config> Thanks to these two files, you can change the configuration for the e-mail template in the Admin panel (Stores | Configuration). Let's create our HTML form and the controller that will handle our submission: Open the existing [extension_path]/view/frontend/templates/view.phtml file and add the following code at the end: <form action="<?php echo $block->getUrl('events/view/share', array('event_id' => $event->getId())); ?>" method="post" id="form-validate" class="form"> <h3> <?php echo __('Share this event to my friend'); ?> </h3> <input type="email" name="email" class="input-text" placeholder="email" /> <button type="submit" class="button"><?php echo __('Share'); ?></button> </form> Create the [extension_path]/Controller/View/Share.php file and add the following code: <?php namespace BlackbirdTicketBlasterControllerView; use MagentoFrameworkExceptionNotFoundException; use MagentoFrameworkAppRequestInterface; use MagentoStoreModelScopeInterface; use BlackbirdTicketBlasterApiDataEventInterface; class Share extends MagentoFrameworkAppActionAction { [...] This controller will get the necessary configuration entirely from the admin and generate the e-mail to be sent. Testing our code by sending the e-mail Go to the page of an event and fill in the form we prepared. When you submit it, Magento will send the e-mail immediately. Summary In this article, we addressed all the main processes that are run for internationalization. We can now create and control the availability of our events with regards to Magento's stores and translate the contents of our pages and e-mails. Resources for Article: Further resources on this subject: Magento Theme Distribution [article] Installing Magento [article] Magento 2 – the New E-commerce Era [article]
Read more
  • 0
  • 0
  • 33122

article-image-introduction-practical-business-intelligence
Packt
10 Nov 2016
20 min read
Save for later

Introduction to Practical Business Intelligence

Packt
10 Nov 2016
20 min read
In this article by Ahmed Sherif, author of the book Practical Business Intelligence, is going to explain what is business intelligence? Before answering this question, I want to pose and answer another question. What isn't business intelligence? It is not spreadsheet analysis done with transactional data with thousands of rows. One of the goals of Business Intelligence or BI is to shield the users of the data from the intelligent logic lurking behind the scenes of the application that is delivering the data to them. If the integrity of the data is compromised in any way by an individual not intimately familiar with the data source, then there cannot, by definition, be intelligence in the business decisions made within that same data. The single source of truth is the key for any Business Intelligence operation whether it is a mom and pop soda shop or a Fortune 500 company. Any report, dashboard, or analytical application that is delivering information to a user through a BI tool but the numbers cannot be tied back to the original source will break the trust between the user and the data and will defeat the purpose of Business Intelligence. (For more resources related to this topic, see here.) In my opinion, the most successful tools used for business intelligence directly shield the business user from the query logic used for displaying that same data in some form of visual manner. Business Intelligence has taken many forms in terms of labels over the years. Business Intelligence is the process of delivering actionable business decisions from analytical manipulation and presentation of data within the confines of a business environment. The delivery process mentioned in the definition will focus its attention on. The beauty of BI is that it is not owned by any one particular tool that is proprietary to a specific industry or company. Business Intelligence can be delivered using many different tools, some even that were not originally intended to be used for BI. The tool itself should not be the source where the query logic is applied to generate the business logic of the data. The tool should primarily serve as the delivery mechanism of the query that is generated by the data warehouse that houses both the data, as well as the logic. In this chapter we will cover the following topics: Understanding the Kimball method Understanding business intelligence Data and SQL Working with data and SQL Working with business intelligence tools Downloading and installing MS SQL Server 2014 Downloading and installing AdventureWorks Understanding the Kimball method As we discuss the data warehouse where our data is being housed, we will be remised not to bring up Ralph Kimball, one of the original architects of the data warehouse.  Kimball's methodology incorporated dimensional modeling, which has become the standard for modeling a data warehouse for Business Intelligence purposes. Dimensional modeling incorporates joining tables that have detail data and tables that have lookup data. A detail table is known as a fact table in dimensional modeling. An example of a fact table would be a table holding thousands of rows of transactional sales from a retail store.  The table will house several ID's affiliated with the product, the sales person, the purchase date, and the purchaser just to name a few. Additionally, the fact table will store numeric data for each individual transaction, such as sales quantity for sales amount to name a few examples. These numeric values will be referred to as measures. While there is usually one fact table, there will also be several lookup or dimensional tables that will have one table for each ID that is used in a fact table. So, for example,  there would be one dimensional table for the product name affiliated with a product ID. There would be one dimensional table for the month, week, day, and year of the id affiliated with the date. These dimensional tables are also referred to as Lookup Tables, because they kind of look up what the name of a dimension ID is affiliated with. Usually, you would find as many dimensional tables as there are ID's in the fact table. The dimensional tables would all be joined to the one fact table creating something of a 'star' look. Hence, the name for this type of table join is known as a star schema which is represented diagrammatically in the following figure. It is customary that the fact table will be the largest table in a data warehouse while the lookup tables will all be quite small in rows, some as small as one row. The tables are joined by ID's, also known as surrogate keys. Surrogate keys allow for the most efficient join between a fact table and a dimensional table as they are usually a data type of integer. As more and more detail is added to a dimensional table, that new dimension is just given the next number in line, usually starting with 1. Query performance between tables joins suffers when we introduce non-numeric characters into the join or worse, symbols (although most databases will not allow that). Understanding business intelligence architecture I will continuously hammer home the point that the various tools utilized to deliver the visual and graphical BI components should not house any internal logic to filter data out of the tool nor should it be the source of any built in calculations. The tools themselves should not house this logic as they will be utilized by many different users. If each user who develops a BI app off of the tool incorporates different internal filters without the tool, the single source of truth tying back to the data warehouse will become multiple sources of truths.  Any logic applied to the data to filter out a specific dimension or to calculate a specific measure should be applied in the data warehouse and then pulled into the tool. For example, if the requirement for a BI dashboard was to show current year and prior year sales for US regions only, the filter for region code would be ideally applied in the data warehouse as opposed to inside of the tool. The following is a query written in SQL joining two tables from the AdventureWorks database that highlights the difference between dimenions and measures.  The 'region' column is a dimension column and the 'SalesYTD' and 'SalesPY' are measure columns. In this example, the 'TerritoryID' is serving as the key join between 'SalesTerritory' and 'SalesPerson'. Since the measures are coming from the 'SalesPerson' table, that table will serve as the fact table and 'SalesPerson.TerritoryID' will serve as the fact ID. Since the Region column is dimensional and coming from the 'SalesTerritory' table, that table will serve as the dimensional or lookup table and 'SalesTerritory.TerritoryID' will serve as the dimension ID. In a finely-tuned data warehouse both the fact ID and dimension ID would be indexed to allow for efficient query performance. This performance is obtained by sorting the ID's numerically so that a row from one table that is being joined to another table does not have to be searched through the entire table but only a subset of that table. When the table is only a few hundred rows, it may not seem necessary to index columns, but when the table grows to a few hundred million rows, it may become necessary. Select region.Name as Region ,round(sum(sales.SalesYTD),2) as SalesYTD ,round(sum(sales.SalesLastYear),2) as SalesPY FROM [AdventureWorks2014].[Sales].[SalesTerritory] region left outer join [AdventureWorks2014].[Sales].[SalesPerson] sales on sales.TerritoryID = region.TerritoryID where region.CountryRegionCode = 'US' Group by region.Name order by region.Name asc There are several reasons why applying the logic at the database level is considered a best practice. Most of the time, these requests for filtering data or manipulating calculations are done at the BI tool level because it is easier for the developer than to go to the source. However, if these filters are being performed due to data quality issues then applying logic at the reporting level is only masking an underlying data issue that needs to be addressed across the entire data warehouse. You would be doing yourself a disservice in the long run as you will be establishing a precedence that the data quality would be handled by the report developer as opposed to the database administrator. You are just adding additional work onto your plate. Ideal BI tools will quickly connect to the data source and then allow for slicing and dicing of your dimensions and measures in a manner that will quickly inform the business of useful and practical information. Ultimately, the choice of a BI tool by an individual or an organization will come down to the ease of use of the tool as well as the flexibility to showcase the data through various components such as graphs, charts, widgets, and infographics. Management If you are a Business Intelligence manager looking to establish a department with a variety of tools to help flesh out your requirements, could serve as a good source for interview questions to weed out unqualified candidates. A manager could use to distinguish some of the nuances between these different skillsets and prioritize hiring based on immediate needs. Data Scientist The term Data Scientist has been misused in the BI industry, in my humble opinion. It has been lumped in with Data Analyst as well as BI Developer. Unfortunately, these three positions have separate skillsets and you will do yourself a disservice by assuming one person can do multiple positions successfully. A Data Scientist will be able to apply statistical algorithms behind the data that is being extracted from the BI tools and make predictions about what will happen in the future with that same data set. Due to this skillset, a Data Scientist may find the chapters focusing on R and Python to be of particular importance because of their abilities to leverage predictive capabilities within their BI delivery mechanisms. Data Analyst A Data Analyst is probably the second most misused position behind a Data Scientist. Typically, a Data Analyst should be analyzing the data that is coming out of the BI tools that are connected to the data warehouse. Most Data Analysts are comfortable working with Microsoft Excel. Often times they are asked to take on additional roles in developing dashboards that require additional programming skills.  This is where they would find some comfort using a tool like Power BI, Tableau, or QlikView. These tools would allow for a Data Analyst to quickly develop a storyboard or visualization that would allow for quick analysis with minimal programming skills. Visualization Developer A 'dataviz' developer is someone who can create complex visualizations out of data and showcase interesting interactions between different measures inside of a dataset that cannot necessarily be seen with a traditional chart or graph. More often than not these developers possess some programming background such as JavaScript, HTML, or CSS. These developers are also used to developing applications directly for the web and therefore would find D3.js a comfortable environment to program in. Working with Data and SQL The examples and exercises that will come from the AdventureWorks database.  The AdventureWorks database has a comprehensive list of tables that mimics an actual bicycle retailor. The examples will draw on different tables from the database to highlight BI reporting from the various segments appropriate for the AdventureWorks Company. These segments include Human Resources, Manufacturing, Sales, Purchasing, and Contact Management. A different segment of the data will be highlighted in each chapter utilizing a specific set of tools. A cursory understanding of SQL (structured query language) will be helpful to get a grasp of how data is being aggregated with dimensions and measures. Additionally, an understanding of the SQL statements used will help with the validation process to ensure a single source of truth between the source data and the output inside of the BI tool of choice. For more information about learning SQL, visit the following website: www.sqlfordummies.com Working with business intelligence tools Over the course of the last 20 years, there have been a growing number of software products released that were geared towards Business Intelligence. In addition, there have been a number of software products and programming languages that were not initially built for BI but later on became a staple for the industry. The tools used were chosen based on the fact that they were either built off of open source technology or they were products from companies that provided free versions of their software for development purposes. Many companies from the big enterprise firms have their own BI tools and they are quite popular. However, unless you have a license with them, it is unlikely that you will be able to use their tool without having to shell out a small fortune. Power BI and Excel Power BI is one of the more relatively newer BI tools from Microsoft.  It is known as a self-service solution and integrates seamlessly with other data sources such as Microsoft Excel and Microsoft SQL Server.  Our primary purpose in using Power BI will be to generate interactive dashboards, reports, and datasets for users. In addition to using Power BI we will also focus on utilizing Microsoft Excel to assist with some data analysis and validation of results that are being pulled from our data warehouse.  Pivot tables are very popular within MS Excel and will be used to validate aggregation done inside of the data warehouse. D3.js D3.js, also known as data-driven documents, is a JavaScript library known for delivery beautiful visualizations by manipulating documents based on data. Since D3 is rooted in JavaScript, all visualizations make a seamless transition to the web. D3 allows for major customization to any part of visualization and because of this flexibility, it will require a steeper learning curve that probably any other software program. D3 can consume data easily as a .json or a .csv file.  Additionally, the data can also be imbedded directly within the JavaScript code that renders the visualization on the web. R R is a free and open source statistical programming language that produces beautiful graphics. The R language has been widely used among the statistical community and more recently in the data science and machine learning community as well. Due to this fact, it has picked up steam in recent years as a platform for displaying and delivering effective and practical BI. In addition to visualizing BI, R has the ability to also visualize predictive analysis with algorithms and forecasts. While R is a bit raw in its interface, there have been some IDE's (Integrated Development Environment) that have been developed to ease the user experience. RStudio will be used to deliver the visualisations developed within R. Python Python is considered the most traditional programming language of all the different languages. It is a widely used general purpose programming language with several modules that are very powerful in analysing and visualizing data. Similar to R, Python is a bit raw in its own form for delivering beautiful graphics as a BI tool; however, with the incorporation of an IDE the user interface becomes much more of a pleasurable development experience. PyCharm will be the IDE used to develop BI with Python. PyCharm is free to use and allows creation of the iPython notebook which delivers seamless integration between Python and the powerful modules that will assist with BI. As a note, all code in Python will be developed using the Python 3 syntax. QlikView QlikView is a software company specializing in delivering business intelligence solutions using their desktop tool. QlikView is one of the leaders in delivering quick visualizations based on data and queries through their desktop application. They advertise themselves to be self-service BI for business users. While they do offer solutions that target more enterprise organizations, they also offer a free version of their tool for personal use. Tableau is probably the closest competitor in terms of delivering similar BI solutions. Tableau Tableau is a software company specializing in delivering business intelligence solutions using their desktop tool. If this sounds familiar to QlikView, it's probably because it's true. Both are leaders in the field of establishing a delivery mechanism with easy installation, setup, and connectivity to the available data. Tableau has a free version of their desktop tool. Again, Tableau excels at delivering both beautiful visualizations quickly as well as self-service data discovery to more advanced business users. Microsoft SQL Server Microsoft SQL will serve as the data warehouse for the examples that we will with the BI Tools. Microsoft SQL Server is relatively simple to install and set up as well it is free to download. Additionally, there are example databases that configure seamlessly with it, such as the AdventureWorks database. Downloading and Installing MS SQL Server 2014 First things first. We will need to get started with getting our database and data warehouse up and running so that we can begin to develop our BI environment. We will visit the Microsoft website below to start the download selection process. https://www.microsoft.com/en-us/download/details.aspx?id=42299 Select the specified language that is applicable to you and also select the MS SQL Server Express version with Advanced features that is 64-bit edition as shown in the following screenshot. Ideally you'll want to be working in a 64-bit edition when dealing with servers. After selecting the file, the download process should begin. Depending on your connection speed it could take some time as the file is slightly larger than 1 GB. The next step in the process is selecting a new stand-alone instance of SQL Server 2014 unless you already have a version and wish to upgrade instead as shown in the following screenshot.. After accepting the license terms, continue through the steps in the Global Rules as well as the Product Updates to get to the setup installation files. For the feature selection tab, make sure the following features are selected for your installation as shown in the following screenshot. Our preference is to label a named instance of this database to something related to the work we are doing.  Since this will be used for Business Intelligence, I went ahead and name this instance 'SQLBI' as shown in the following screenshot: The default Server Configuration settings are sufficient for now, there is no need to change anything under that section as shown in the following screenshot. Unless you are required to do so within your company or organization, for personal use it is sufficient to just go with Windows Authentication mode for sign-on as shown in the following screenshot. We will not need to do any configuring of reporting services, so it is sufficient for our purposes to just with installing Reporting Services Native mode without any need for configuration at this time. At this point the installation will proceed and may take anywhere between 20-30 minutes depending on the cpu resources. If you continue to have issues with your installation, you can visit the following website from Microsoft for additional help. http://social.technet.microsoft.com/wiki/contents/articles/23878.installing-sql-server-2014-step-by-step-tutorial.aspx Ultimately, if everything with the installation is successful, you'll want to see all portions of the installation have a green check mark next to their name and be labeled 'Successful' as shown in the following screenshot. Downloading and Installing AdventureWorks We are almost finished with getting our business intelligence data warehouse complete. We are now at the stage where we will extract and load data into our data warehouse. The last part is to download and install the AdventureWorks database from Microsoft. The zipped file for AdventureWorks 2014 is located in the following website from Microsoft: https://msftdbprodsamples.codeplex.com/downloads/get/880661 Once the file is downloaded and unzipped, you will find a file named the following: AdventureWorks2014.bak Copy that file and paste it in the following folder where it will be incorporated with your Microsoft SQL Server 2014 Express Edition. C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLBackup Also note that the MSSQL12.SQLBI subfolder will vary user by user depending on what you named your SQL instance when you were installing MS SQL Server 2014. Once that has been copied over, we can fire up Management Studio for SQL Server 2014 and start up a blank new query by going to File New Query with Current Connection Once you have a blank query set up, copy and paste the following code in the and execute it: use [master] Restore database AdventureWorks2014 from disk = 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLBackupAdventureWorks2014.bak' with move 'AdventureWorks2014_data' to 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLDATAAdventureWorks2014.mdf', Move 'AdventureWorks2014_log' to 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLDATAAdventureWorks2014.ldf' , replace Once again, please note that the MSSQL12.SQLBI subfolder will vary user by user depending on what you named your SQL instance when you were installing MS SQL Server 2014. At this point in time within the database you should have received a message saying that Microsoft SQL Server has processed 24248 pages for database 'AdventureWorks2014'. Once you have refreshed your database tab on the upper left hand corner of SQL Server, the AdventureWorks database will become visible as well as all of the appropriate tables as shown in the following screenshot: One final step that we will need to verify just to make sure that your login account has all of the appropriate server settings. When you right-click on the SQL Server name on the upper left hand portion of Management Studio, select the properties.  Select Permissions inside Properties. Find your username and check all of the rights under the Grant column as shown in the following screenshot: Finally, we need to also ensure that the folder that houses Microsoft SQL Server 2014 also has the appropriate rights enabled for your current user.  That specific folder is located under C:Program FilesMicrosoft SQL Server. For purposes of our exercises, we will assign all rights for the SQL Server user to the following folder as shown in the following screenshot: We are now ready to begin connecting our BI tools to our data! Summary The emphasis will be placed on implementing Business Intelligence best practices within the various tools that will be used based on the different levels of data that is provided within the AdventureWorks database. In the next chapter we will cover extracting additional data from the web that will be joined to the AdventureWorks database. This process is known as web scraping and can be performed with great success using tools such as Python and R. In addition to collecting the data, we will focus on transforming the collected data for optimal query performance. Resources for Article: Further resources on this subject: LabVIEW Basics [article] Thinking Probabilistically [article] Clustering Methods [article]
Read more
  • 0
  • 1
  • 4035

article-image-connecting-react-redux-firebase-part-1
AJ Webb
09 Nov 2016
7 min read
Save for later

Connecting React to Redux & Firebase - Part 1

AJ Webb
09 Nov 2016
7 min read
Have you tried using React and now you want to increase your abilities? Are you ready to scale up your small React app? Have you wondered how to offload all of your state into a single place and keep your components more modular? Using Redux with React allows you to have a single source of truth for all of your app's state. The two of them together allow you to never have to set state on a component and allows your components to be completely reusable. For some added sugar, you'll also learn how to leverage Firebase and use Redux actions to subscribe to data that is updated in real time. In this two-part post, you'll walk through creating a new chat app called Yak using React's new CLI, integrating Redux into your app, updating your state and connecting it all to Firebase. Let's get started. Setting up This post is written with the assumption that you have Node.js and NPM already installed. Also, it assumes some knowledge of JavaScript and React.js. If you don't already have Node.js and NPM installed, head over to the Node.js install instructions. At the time of writing this post, the Node.js version is 6.6.0 and NPM version is 3.10.8. Once you have Node.js installed, open up your favorite terminal app and install the NPM package Create React App; the current version at the time of writing this post is 0.6.0, so make sure to specify that version. [~]$ npm install -g create-react-app@0.6.0 Now we'll want to set up our app and install our dependencies. First we'll navigate to where we want our project to live. I like to keep my projects at ~/code, so I'll navigate there. You may need to create the directory using mkdir if you don't have it, or you might want to store it elsewhere. It doesn't matter which you choose; just head to where you want to store your project. [~]$ cd ~/code Once there, use Create React App to create the app: [~/code]$ create-react-app yak This command is going to create a directory called yak and create all the necessary files you need in order to start a baseline React.js app. Once the command has completed you should see some commands that you can run in your new app. Create React App has created the boilerplate files for you. Take a moment to familiarize yourself with these files. .gitignore All the files and directories you want ignored by git. README.md Documentation on what has been created. This is a good resource to lean on as you're learning React.js and using your app. node_modules All the packages that are required to run and build the application up to this point. package.json Instructs NPM how scripts run on your app, which packages your app depending on and other meta things such as version and app name. public All the static files that aren't used within the app. Mainly for index.html and favicon.ico. src All the app files; the app is run by Webpack and is set up to watch all the files inside of this directory. This is where you will spend the majority of your time. There are two files that cannot be moved while working on the app; they are public/index.html and src/index.js. The app relies on these two files in order to run. You can change them, but don't move them. Now to get started, navigate into the app folder and start the app. [~/code]$ cd yak [~/code/yak]$ npm start The app should start and automatically open http://localhost:3000/ in your default browser. You should see a black banner with the React.js logo spinning and some instructions on how to get started. To stop the app, press ctrl-c in the terminal window that is running the app. Getting started with Redux Next install Redux and React-Redux: [~/code/yak]$ npm install --save redux react-redux Redux will allow the app to have a single source of truth for state throughout the app. The idea is to keep all the React components ignorant of the state, and to pass that state to them via props. Containers will be used to select data from the state and pass the data to the ignorant components via props. React-Redux is a utility that assists in integrating React with Redux. Redux's state is read-only and you can only change the state by emitting an action that a reducer function uses to take the previous state and return a new state. Make sure as you are writing your reducers to never mutate the state (more on that later). Now you will add Redux to your app, in src/index.js. Just below importing ReactDOM, add: import { createStore, compose } from 'redux'; import { Provider } from 'react-redux'; You now have the necessary functions and components to set up your Redux store and pass it to your React app. Go ahead and get your store initialized. After the last import statement and before ReactDOM.render() is where you will create your store. const store = createStore(); Yikes! If you run the app and open your inspector, you should see the following console error: Uncaught Error: Expected the reducer to be a function. That error is thrown because the createStore function requires a reducer as the first parameter. The second parameter is an optional initial state and the last parameter is for middleware that you may want for your store. Go ahead and create a reducer for your store, and ignore the other two parameters for now. [~/code/yak]$ touch src/reducer.js Now open reducer.js and add the following code to the reducer: const initialState = { messages: [] }; export function yakApp(state = initialState, action) { return state; } Here you have created an initial state for the current reducer, and a function that is either accepting a new state or using ES6 default arguments to set an undefined state to the initial state. The function is simply returning the state and not making any changes for now. This is a perfectly valid reducer and will work to solve the console error and get the app running again. Now it's time to add it to the store. Back in src/index.js, import the reducer in and then set the yakApp function to your store. import { yakApp } from './reducer'; const store = createStore(yakApp); Restart the app and you'll see that it is now working again! One last thing to get things set up in your bootstrapping file src/index.js. You have your store and have imported Provider; now it’s time to connect the two and allow the app to have access to the store. Update the ReactDOM.render method to look like the following. ReactDOM.render( <Provider store={store}> <App /> </Provider>, document.getElementById('root') ); Now you can jump into App.js and connect your store. In App.js, add the following import statement: import { connect } from 'react-redux'; At the bottom of the file, just before the export statement, add: function mapStateToProps(state) { return { messages: state.messages } } And change the export statement to be: export default connect(mapStateToProps)(App); That's it! Your App component is now connected to the redux store. The messages array is being mapped to this.props. Go ahead and try it; add a console log to the render() method just before the return statement. console.log(this.props.messages); The console should log an empty array. This is great! Conclusion In this post, you've learned to create a new React app without having to worry about tooling. You've integrated Redux into the app and created a simple reducer. You've also connected the reducer to your React component. But how do you add data to the array as messages are sent? How do you persist the array of messages after you leave the app? How do you connect all of this to your UI? How do you allow your users to create glorious data for you? In the next post, you'll learn to do all those things. Stay tuned! About the author AJ Webb is a team lead and frontend engineer for @tannerlabs, and the co-creator of Payba.cc.
Read more
  • 0
  • 0
  • 17314

article-image-data-access-layer
Packt
09 Nov 2016
13 min read
Save for later

Data Access Layer

Packt
09 Nov 2016
13 min read
In this article by Alexander Zaytsev, author of NHibernate 4.0 Cookbook, we will cover the following topics: Transaction Auto-wrapping for the data access layer Setting up an NHibernate repository Using Named Queries in the data access layer (For more resources related to this topic, see here.) Introduction There are two styles of data access layer common in today's applications. Repositories and Data Access Objects. In reality, the distinction between these two have become quite blurred, but in theory, it's something like this: A repository should act like an in-memory collection. Entities are added to and removed from the collection, and its contents can be enumerated. Queries are typically handled by sending query specifications to the repository. A DAO (Data Access Object) is simply an abstraction of an application's data access. Its purpose is to hide the implementation details of the database access, from the consuming code. The first recipe shows the beginnings of a typical data access object. The remaining recipes show how to set up a repository-based data access layer with NHibernate's various APIs. Transaction Auto-wrapping for the data access layer In this recipe, we'll show you how we can set up the data access layer to wrap all data access in NHibernate transactions automatically. How to do it... Create a new class library named Eg.Core.Data. Install NHibernate to Eg.Core.Data using NuGet Package Manager Console. Add the following two DOA classes: public class DataAccessObject<T, TId> where T : Entity<TId> { private readonly ISessionFactory _sessionFactory; private ISession session { get { return _sessionFactory.GetCurrentSession(); } } public DataAccessObject(ISessionFactory sessionFactory) { _sessionFactory = sessionFactory; } public T Get(TId id) { return WithinTransaction(() => session.Get<T>(id)); } public T Load(TId id) { return WithinTransaction(() => session.Load<T>(id)); } public void Save(T entity) { WithinTransaction(() => session.SaveOrUpdate(entity)); } public void Delete(T entity) { WithinTransaction(() => session.Delete(entity)); } private TResult WithinTransaction<TResult>(Func<TResult> func) { if (!session.Transaction.IsActive) { // Wrap in transaction TResult result; using (var tx = session.BeginTransaction()) { result = func.Invoke(); tx.Commit(); } return result; } // Don't wrap; return func.Invoke(); } private void WithinTransaction(Action action) { WithinTransaction<bool>(() => { action.Invoke(); return false; }); } } public class DataAccessObject<T> : DataAccessObject<T, Guid> where T : Entity { } How it works... NHibernate requires that all data access occurs inside an NHibernate transaction. Remember, the ambient transaction created by TransactionScope is not a substitute for an NHibernate transaction This recipe, however, shows a more explicit approach. To ensure that at least all our data access layer calls are wrapped in transactions, we create a private WithinTransaction method that accepts a delegate, consisting of some data access methods, such as session.Save or session.Get. This WithinTransaction method first checks if the session has an active transaction. If it does, the delegate is invoked immediately. If it doesn't, a new NHibernate transaction is created, the delegate is invoked, and finally the transaction is committed. If the data access method throws an exception, the transaction will be rolled back automatically as the exception bubbles up to the using block. There's more... This transactional auto-wrapping can also be set up using SessionWrapper from the unofficial NHibernate AddIns project at https://bitbucket.org/fabiomaulo/unhaddins. This class wraps a standard NHibernate session. By default, it will throw an exception when the session is used without an NHibernate transaction. However, it can be configured to check for and create a transaction automatically, much in the same way I've shown you here. See also Setting up an NHibernate repository Setting up an NHibernate Repository Many developers prefer the repository pattern over data access objects. In this recipe, we'll show you how to set up the repository pattern with NHibernate. How to do it... Create a new, empty class library project named Eg.Core.Data. Add a reference to Eg.Core project. Add the following IRepository interface: public interface IRepository<T>: IEnumerable<T> where T : Entity { void Add(T item); bool Contains(T item); int Count { get; } bool Remove(T item); } Create a new, empty class library project named Eg.Core.Data.Impl. Add references to the Eg.Core and Eg.Core.Data projects. Add a new abstract class named NHibernateBase using the following code: protected readonly ISessionFactory _sessionFactory; protected virtual ISession session { get { return _sessionFactory.GetCurrentSession(); } } public NHibernateBase(ISessionFactory sessionFactory) { _sessionFactory = sessionFactory; } protected virtual TResult WithinTransaction<TResult>( Func<TResult> func) { if (!session.Transaction.IsActive) { // Wrap in transaction TResult result; using (var tx = session.BeginTransaction()) { result = func.Invoke(); tx.Commit(); } return result; } // Don't wrap; return func.Invoke(); } protected virtual void WithinTransaction(Action action) { WithinTransaction<bool>(() => { action.Invoke(); return false; }); } Add a new class named NHibernateRepository using the following code: public class NHibernateRepository<T> : NHibernateBase, IRepository<T> where T : Entity { public NHibernateRepository( ISessionFactory sessionFactory) : base(sessionFactory) { } public void Add(T item) { WithinTransaction(() => session.Save(item)); } public bool Contains(T item) { if (item.Id == default(Guid)) return false; return WithinTransaction(() => session.Get<T>(item.Id)) != null; } public int Count { get { return WithinTransaction(() => session.Query<T>().Count()); } } public bool Remove(T item) { WithinTransaction(() => session.Delete(item)); return true; } public IEnumerator<T> GetEnumerator() { return WithinTransaction(() => session.Query<T>() .Take(1000).GetEnumerator()); } IEnumerator IEnumerable.GetEnumerator() { return WithinTransaction(() => GetEnumerator()); } } How it works... The repository pattern, as explained in http://martinfowler.com/eaaCatalog/repository.html, has two key features: It behaves as an in-memory collection Query specifications are submitted to the repository for satisfaction. In this recipe, we are concerned only with the first feature, behaving as an in-memory collection. The remaining recipes in this article will build on this base, and show various methods for satisfying the second point. Because our repository should act like an in-memory collection, it makes sense that our IRepository<T> interface should resemble ICollection<T>. Our NHibernateBase class provides both contextual session management and the automatic transaction wrapping explained in the previous recipe. NHibernateRepository simply implements the members of IRepository<T>. There's more... The Repository pattern reduces data access to its absolute simplest form, but this simplification comes with a price. We lose much of the power of NHibernate behind an abstraction layer. Our application must either do without even basic session methods like Merge, Refresh, and Load, or allow them to leak through the abstraction. See also Transaction Auto-wrapping for the data access layer Using Named Queries in the data access layer Using Named Queries in the data access layer Named Queries encapsulated in query objects is a powerful combination. In this recipe, we'll show you how to use Named Queries with your data access layer. Getting ready To complete this recipe you will need Common Service Locator from Microsoft Patterns & Practices. The documentation and source code could be found at http://commonservicelocator.codeplex.com. Complete the previous recipe Setting up an NHibernate repository. Include the Eg.Core.Data.Impl assembly as an additional mapping assembly in your test project's App.Config with the following xml: <mapping assembly="Eg.Core.Data.Impl"/> How to do it... In the Eg.Core.Data project, add a folder for the Queries namespace. Add the following IQuery interfaces: public interface IQuery { } public interface IQuery<TResult> : IQuery { TResult Execute(); } Add the following IQueryFactory interface: public interface IQueryFactory { TQuery CreateQuery<TQuery>() where TQuery :IQuery; } Change the IRepository interface to implement the IQueryFactory interface, as shown in the following code: public interface IRepository<T> : IEnumerable<T>, IQueryFactory where T : Entity { void Add(T item); bool Contains(T item); int Count { get; } bool Remove(T item); } In the Eg.Core.Data.Impl project, change the NHibernateRepository constructor and add the _queryFactory field, as shown in the following code: private readonly IQueryFactory _queryFactory; public NHibernateRepository( ISessionFactory sessionFactory, IQueryFactory queryFactory) : base(sessionFactory) { _queryFactory = queryFactory; } Add the following method to NHibernateRepository: public TQuery CreateQuery<TQuery>() where TQuery : IQuery { return _queryFactory.CreateQuery<TQuery>(); } In the Eg.Core.Data.Impl project, add a folder for the Queries namespace. Install Common Service Locator using NuGet Package Manager Console, using the command. Install-Package CommonServiceLocator To the Queries namespace, add this QueryFactory class: public class QueryFactory : IQueryFactory { private readonly IServiceLocator _serviceLocator; public QueryFactory(IServiceLocator serviceLocator) { _serviceLocator = serviceLocator; } public TQuery CreateQuery<TQuery>() where TQuery : IQuery { return _serviceLocator.GetInstance<TQuery>(); } } Add the following NHibernateQueryBase class: public abstract class NHibernateQueryBase<TResult> : NHibernateBase, IQuery<TResult> { protected NHibernateQueryBase( ISessionFactory sessionFactory) : base(sessionFactory) { } public abstract TResult Execute(); } Add an empty INamedQuery interface, as shown in the following code: public interface INamedQuery { string QueryName { get; } } Add a NamedQueryBase class, as shown in the following code: public abstract class NamedQueryBase<TResult> : NHibernateQueryBase<TResult>, INamedQuery { protected NamedQueryBase(ISessionFactory sessionFactory) : base(sessionFactory) { } public override TResult Execute() { var nhQuery = GetNamedQuery(); return Transact(() => Execute(nhQuery)); } protected abstract TResult Execute(IQuery query); protected virtual IQuery GetNamedQuery() { var nhQuery = session.GetNamedQuery(QueryName); SetParameters(nhQuery); return nhQuery; } protected abstract void SetParameters(IQuery nhQuery); public virtual string QueryName { get { return GetType().Name; } } } In Eg.Core.Data.Impl.Test, add a test fixture named QueryTests inherited from NHibernateFixture. Add the following test and three helper methods: [Test] public void NamedQueryCheck() { var errors = new StringBuilder(); var queryObjectTypes = GetNamedQueryObjectTypes(); var mappedQueries = GetNamedQueryNames(); foreach (var queryType in queryObjectTypes) { var query = GetQuery(queryType); if (!mappedQueries.Contains(query.QueryName)) { errors.AppendFormat( "Query object {0} references non-existent " + "named query {1}.", queryType, query.QueryName); errors.AppendLine(); } } if (errors.Length != 0) Assert.Fail(errors.ToString()); } private IEnumerable<Type> GetNamedQueryObjectTypes() { var namedQueryType = typeof(INamedQuery); var queryImplAssembly = typeof(BookWithISBN).Assembly; var types = from t in queryImplAssembly.GetTypes() where namedQueryType.IsAssignableFrom(t) && t.IsClass && !t.IsAbstract select t; return types; } private IEnumerable<string> GetNamedQueryNames() { var nhCfg = NHConfigurator.Configuration; var mappedQueries = nhCfg.NamedQueries.Keys .Union(nhCfg.NamedSQLQueries.Keys); return mappedQueries; } private INamedQuery GetQuery(Type queryType) { return (INamedQuery) Activator.CreateInstance( queryType, new object[] { SessionFactory }); } For our example query, in the Queries namespace of Eg.Core.Data, add the following interface: public interface IBookWithISBN : IQuery<Book> { string ISBN { get; set; } } Add the implementation to the Queries namespace of Eg.Core.Data.Impl using the following code: public class BookWithISBN : NamedQueryBase<Book>, IBookWithISBN { public BookWithISBN(ISessionFactory sessionFactory) : base(sessionFactory) { } public string ISBN { get; set; } protected override void SetParameters( NHibernate.IQuery nhQuery) { nhQuery.SetParameter("isbn", ISBN); } protected override Book Execute(NHibernate.IQuery query) { return query.UniqueResult<Book>(); } } Finally, add the embedded resource mapping, BookWithISBN.hbm.xml, to Eg.Core.Data.Impl with the following xml code: <?xml version="1.0" encoding="utf-8" ?> <hibernate-mapping > <query name="BookWithISBN"> <![CDATA[ from Book b where b.ISBN = :isbn ]]> </query> </hibernate-mapping> How it works... As we learned in the previous recipe, according to the repository pattern, the repository is responsible for fulfilling queries, based on the specifications submitted to it. These specifications are limiting. They only concern themselves with whether a particular item matches the given criteria. They don't care for other necessary technical details, such as eager loading of children, batching, query caching, and so on. We need something more powerful than simple where clauses. We lose too much to the abstraction. The query object pattern defines a query object as a group of criteria that can self-organize in to a SQL query. The query object is not responsible for the execution of this SQL. This is handled elsewhere, by some generic query runner, perhaps inside the repository. While a query object can better express the different technical requirements, such as eager loading, batching, and query caching, a generic query runner can't easily implement those concerns for every possible query, especially across the half-dozen query APIs provided by NHibernate. These details about the execution are specific to each query, and should be handled by the query object. This enhanced query object pattern, as Fabio Maulo has named it, not only self-organizes into SQL but also executes the query, returning the results. In this way, the technical concerns of a query's execution are defined and cared for with the query itself, rather than spreading into some highly complex, generic query runner. According to the abstraction we've built, the repository represents the collection of entities that we are querying. Since the two are already logically linked, if we allow the repository to build the query objects, we can add some context to our code. For example, suppose we have an application service that runs product queries. When we inject dependencies, we could specify IQueryFactory directly. This doesn't give us much information beyond "This service runs queries." If, however, we inject IRepository<Product>, we have a much better idea about what data the service is using. The IQuery interface is simply a marker interface for our query objects. Besides advertising the purpose of our query objects, it allows us to easily identify them with reflection. The IQuery<TResult> interface is implemented by each query object. It specifies only the return type and a single method to execute the query. The IQueryFactory interface defines a service to create query objects. For the purpose of explanation, the implementation of this service, QueryFactory, is a simple service locator. IQueryFactory is used internally by the repository to instantiate query objects. The NamedQueryBase class handles most of the plumbing for query objects, based on named HQL and SQL queries. As a convention, the name of the query is the name of the query object type. That is, the underlying named query for BookWithISBN is also named BookWithISBN. Each individual query object must simply implement SetParameters and Execute(NHibernate.IQuery query), which usually consists of a simple call to query.List<SomeEntity>() or query.UniqueResult<SomeEntity>(). The INamedQuery interface both identifies the query objects based on Named Queries, and provides access to the query name. The NamedQueryCheck test uses this to verify that each INamedQuery query object has a matching named query. Each query has an interface. This interface is used to request the query object from the repository. It also defines any parameters used in the query. In this example, IBookWithISBN has a single string parameter, ISBN. The implementation of this query object sets the :isbn parameter on the internal NHibernate query, executes it, and returns the matching Book object. Finally, we also create a mapping containing the named query BookWithISBN, which is loaded into the configuration with the rest of our mappings. The code used in the query object setup would look like the following code: var query = bookRepository.CreateQuery<IBookWithISBN>(); query.ISBN = "12345"; var book = query.Execute(); See also Transaction Auto-wrapping for the data access layer Setting up an NHibernate repository Summary In this article we learned how to transact Auto-wrapping for the data access layer, setting up an NHibernate repository and how to use Named Queries in the data access layer Resources for Article: Further resources on this subject: Memory Management [article] Getting Started with Spring Security [article] Design with Spring AOP [article]
Read more
  • 0
  • 0
  • 13630
article-image-introduction-r-programming-language-and-statistical-environment
Packt
09 Nov 2016
34 min read
Save for later

Introduction to R Programming Language and Statistical Environment

Packt
09 Nov 2016
34 min read
In this article by Simon Walkowiak author of the book Big Data Analytics with R, we will have the opportunity to learn some most important R functions from base R installation and well-known third party packages used for data crunching, transformation, and analysis. More specifically in this article you will learn to: Understand the landscape of available R data structures Be guided through a number of R operations allowing you to import data from standard and proprietary data formats Carry out essential data cleaning and processing activities such as subsetting, aggregating, creating contingency tables, and so on Inspect the data by implementing a selection of Exploratory Data Analysis techniques such as descriptive statistics Apply basic statistical methods to estimate correlation parameters between two (Pearson's r) or more variables (multiple regressions) or find the differences between means for two (t-tests) or more groups Analysis of variance (ANOVA) Be introduced to more advanced data modeling tasks like logistic and Poisson regressions (For more resources related to this topic, see here.) Learning R This book assumes that you have been previously exposed to R programming language, and this article would serve more as a revision, and an overview, of the most essential operations, rather than a very thorough handbook on R. The goal of this work is to present you with specific R applications related to Big Data and the way you can combine R with your existing Big Data analytics workflows instead of teaching you basics of data processing in R. There is a substantial number of great introductory and beginner-level books on R available at IT specialized bookstores or online, directly from Packt Publishing, and other respected publishers, as well as on the Amazon store. Some of the recommendations include the following: R in Action: Data Analysis and Graphics with R by Robert Kabacoff (2015), 2nd edition, Manning Publications R Cookbook by Paul Teetor (2011), O'Reilly Discovering Statistics Using R by Andy Field, Jeremy Miles, and Zoe Field (2012), SAGE Publications R for Data Science by Dan Toomey (2014), Packt Publishing An alternative route to the acquisition of good practical R skills is through a large number of online resources, or more traditional tutor-led in-class training courses. The first option offers you an almost limitless choice of websites, blogs, and online guides. A good starting point is the main and previously mentioned Comprehensive R Archive Network (CRAN) page (https://cran.r-project.org/), which, apart from the R core software, contains several well-maintained manuals and Task Views—community run indexes of R packages dealing with specific statistical or data management issues. R-bloggers on the other hand (http://www.r-bloggers.com/) deliver regular news on R in the form of R-related blog posts or tutorials prepared by R enthusiasts and data scientists. Other interesting online sources, which you will probably find yourself using quite often, are as follows: http://www.inside-r.org/—news and information from and by R community http://www.rdocumentation.org/—a useful search engine of R packages and functions http://blog.rstudio.org/—a blog run and edited by RStudio engineers http://www.statmethods.net/—a very informative tutorial-laden website based on the popular R book R in Action by Rob Kabacoff However, it is very likely that after some initial reading, and several months of playing with R, your most frequent destinations to seek further R-related information and obtain help on more complex use cases for specific functions will become StackOverflow(http://stackoverflow.com/) and, even better, StackExchange (http://stackexchange.com/). StackExchange is in fact a network of support and question-and-answer community-run websites, which address many problems related to statistical, mathematical, biological, and other methods or concepts, whereas StackOverflow, which is currently one of the sub-sites under the StackExchange label, focuses more on applied programming issues and provides users with coding hints and solutions in most (if not all) programming languages known to developers. Both tend to be very popular amongst R users, and as of late December 2015, there were almost 120,000 R-tagged questions asked on StackOverflow. The http://stackoverflow.com/tags/r/info page also contains numerous links and further references to free interactive R learning resources, online books and manuals and many other. Another good idea is to start your R adventure from user-friendly online training courses available through online-learning providers like Coursera (https://www.coursera.org), DataCamp (https://www.datacamp.com), edX (https://www.edx.org), or CodeSchool (https://www.codeschool.com). Of course, owing to the nature of such courses, a successful acquisition of R skills is somewhat subjective, however, in recent years, they have grown in popularity enormously, and they have also gained rather positive reviews from employers and recruiters alike. Online courses may then be very suitable, especially for those who, for various reasons, cannot attend a traditional university degree with R components, or just prefer to learn R at their own leisure or around their working hours. Before we move on to the practical part, whichever strategy you are going to use to learn R, please do not be discouraged by the first difficulties. R, like any other programming language, or should I say, like any other language (including foreign languages), needs time, patience, long hours of practice, and a large number of varied exercises to let you explore many different dimensions and complexities of its syntax and rich libraries of functions. If you are still struggling with your R skills, however, I am sure the next section will get them off the ground. Revisiting R basics In the following section we will present a short revision of the most useful and frequently applied R functions and statements. We will start from a quick R and RStudio installation guide and then proceed to creating R data structures, data manipulation, and transformation techniques, and basic methods used in the Exploratory Data Analysis (EDA). Although the R codes listed in this book have been tested extensively, as always in such cases, please make sure that your equipment is not faulty and that you will be running all the following scripts at your own risk. Getting R and RStudio ready Depending on your operating system (whether Mac OS X, Windows, or Linux) you can download and install specific base R files directly from https://cran.r-project.org/. If you prefer to use RStudio IDE you still need to install R core available from CRAN website first and then download and run installers of the most recent version of RStudio IDE specific for your platform from https://www.rstudio.com/products/rstudio/download/. Personally I prefer to use RStudio, owing to its practical add-ons such as code highlighting and more user-friendly GUI, however, there is no particular reason why you can't use just the simple R core installation if you want to. Having said that, in this book we will be using RStudio in most of the examples. All code snippets have been executed and run on a MacBook Pro laptop with Mac OS X (Yosemite) operating system, 2.3 GHz Intel Core i5 processor, 1TB solid-state hard drive and 16GB of RAM memory, but you should also be fine with a much weaker configuration. In this article we won't be using any large data, and even in the remaining parts of this book the data sets used are limited to approximately 100MB to 130MB in size each. You are also provided with links and references to full Big Data whenever possible. If you would like to follow the practical parts of this book you are advised to download and unzip the R code and data for each article from the web page created for this book by Packt Publishing. If you use this book in PDF format it is not advisable to copy the code and paste it into the R console. When printed, some characters (like quotation marks " ") may be encoded differently than in R and the execution of such commands may result in errors being returned by the R console. Once you have downloaded both R core and RStudio installation files, follow the on-screen instructions for each installer. When you have finished installing them, open your RStudio software. Upon initialization of the RStudio you should see its GUI with a number of windows distributed on the screen. The largest one is the console in which you input and execute the code, line by line. You can also invoke the editor panel (it is recommended) by clicking on the white empty file icon in the top left corner of the RStudio software or alternatively by navigating to File | New File | R Script. If you have downloaded the R code from the book page of the Packt Publishing website, you may also just click on the Open an existing file (Ctrl + O) (a yellow open folder icon) and locate the downloaded R code on your computer's hard drive (or navigate to File | Open File…). Now your RStudio session is open and we can adjust some most essential settings. First, you need to set your working directory to the location on your hard drive where your data files are. If you know the specific location you can just type the setwd() command with a full and exact path to the location of your data as follows: > setwd("/Users/simonwalkowiak/Desktop/data") Of course your actual path will differ from mine, shown in the preceding code, however please mind that if you copy the path from the Windows Explorer address bar you will need to change the backslashes to forward slashes / (or to double backslashes \). Also, the path needs to be kept within the quotation marks "…". Alternatively you can set your working directory by navigating to Session | Set Working Directory | Choose Directory… to manually select the folder in which you store the data for this session. Apart from the ones we have already described, there are other ways to set your working directory correctly. In fact most of the operations, and even more complex data analysis and processing activities, can be achieved in R in numerous ways. For obvious reasons, we won't be presenting all of them, but we will just focus on the frequently used methods and some tips and hints applicable to special or difficult scenarios. You can check whether your working directory has been set correctly by invoking the following line: > getwd() [1] "/Users/simonwalkowiak/Desktop/data" From what you can see, the getwd() function returned the correct destination for my previously defined working directory. Setting the URLs to R repositories It is always good practice to check whether your R repositories are set correctly. R repositories are servers located at various institutes and organizations around the world, which store recent updates and new versions of third-party R packages. It is recommended that you set the URL of your default repository to the CRAN server and choose a mirror that is located relatively close to you. To set the repositories you may use the following code: > setRepositories(addURLs = c(CRAN = "https://cran.r-project.org/")) You can check your current, or default, repository URLs by invoking the following function: > getOption("repos") The output will confirm your URL selection:               CRAN "https://cran.r-project.org/" You will be able to choose specific mirrors when you install a new package for the first time during the session, or you may navigate to Tools | Global Options… | Packages. In the Package management section of the window you can alter the default CRAN mirror location—click on Change… button to adjust. Once your repository URLs and working directory are set, you can go on to create data structures that are typical for R programming language. R data structures The concept of data structures in various programming languages is extremely important and cannot be overlooked. Similarly in R, available data structures allow you to hold any type of data and use them for further processing and analysis. The kind of data structure which you use, puts certain constraints on how you can access and process data stored in this structure, and what manipulation techniques you can use. This section will briefly guide you through a number of basic data structures available in R language. Vectors Whenever I teach statistical computing courses, I always start by introducing R learners to vectors as the first data structure they should get familiar with. Vectors are one-dimensional structures that can hold any type of data that is numeric, character, or logical. In simple terms, a vector is a sequence of some sort of values (for example numeric, character, logical, and many more) of specified length. The most important thing that you need to remember is that an atomic vector may contain only one type of data. Let's then create a vector with 10 random deviates from a standard normal distribution, and store all its elements in an object which we will call vector1. In your RStudio console (or its editor) type the following: > vector1 <- rnorm(10) Let's now see the contents of our newly created vector1: > vector1 [1] -0.37758383 -2.30857701 2.97803059 -0.03848892 1.38250714 [6] 0.13337065 -0.51647388 -0.81756661 0.75457226 -0.01954176 As we drew random values, your vector most likely contains different elements to the ones shown in the preceding example. Let's then make sure that my new vector (vector2) is the same as yours. In order to do this we need to set a seed from which we will be drawing the values: > set.seed(123) > vector2 <- rnorm(10, mean=3, sd=2) > vector2 [1] 1.8790487 2.5396450 6.1174166 3.1410168 3.2585755 6.4301300 [7] 3.9218324 0.4698775 1.6262943 2.1086761 In the preceding code we've set the seed to an arbitrary number (123) in order to allow you to replicate the values of elements stored in vector2 and we've also used some optional parameters of the rnorm() function, which enabled us to specify two characteristics of our data, that is the arithmetic mean (set to 3) and standard deviation (set to 2). If you wish to inspect all available arguments of the rnorm() function, its default settings, and examples of how to use it in practice, type ?rnorm to view help and information on that specific function. However, probably the most common way in which you will be creating a vector of data is by using the c() function (c stands for concatenate) and then explicitly passing the values of each element of the vector: > vector3 <- c(6, 8, 7, 1, 2, 3, 9, 6, 7, 6) > vector3 [1] 6 8 7 1 2 3 9 6 7 6 In the preceding example we've created vector3 with 10 numeric elements. You can use the length() function of any data structure to inspect the number of elements: > length(vector3) [1] 10 The class() and mode() functions allow you to determine how to handle the elements of vector3 and how the data are stored in vector3 respectively. > class(vector3) [1] "numeric" > mode(vector3) [1] "numeric" The subtle difference between both functions becomes clearer if we create a vector that holds levels of categorical variable (known as a factor in R) with character values: > vector4 <- c("poor", "good", "good", "average", "average", "good", "poor", "good", "average", "good") > vector4 [1] "poor" "good" "good" "average" "average" "good" "poor" [8] "good" "average" "good" > class(vector4) [1] "character" > mode(vector4) [1] "character" > levels(vector4) NULL In the preceding example, both the class() and mode() outputs of our character vector are the same, as we still haven't set it to be treated as a categorical variable, and we haven't defined its levels (the contents of the levels() function is empty—NULL). In the following code we will explicitly set the vector to be recognized as categorical with three levels: > vector4 <- factor(vector4, levels = c("poor", "average", "good")) > vector4 [1] poor good good average average good poor good [8] average good Levels: poor average good The sequence of levels doesn't imply that our vector is ordered. We can order the levels of factors in R using the ordered() command. For example, you may want to arrange the levels of vector4 in reverse order, starting from "good": > vector4.ord <- ordered(vector4, levels = c("good", "average", "poor")) > vector4.ord [1] poor good good average average good poor good [8] average good Levels: good < average < poor You can see from the output that R has now properly recognized the order of our levels, which we had defined. We can now apply class() and mode() functions on the vector4.ord object: > class(vector4.ord) [1] "ordered" "factor" > mode(vector4.ord) [1] "numeric" You may very likely be wondering why the mode() function returned "numeric" type instead of "character". The answer is simple. By setting the levels of our factor, R has assigned values 1, 2, and 3 to "good", "average" and "poor" respectively, exactly in the same order as we had defined them in the ordered() function. You can check this using levels() and str() functions: > levels(vector4.ord) [1] "good" "average" "poor" > str(vector4.ord) Ord.factor w/ 3 levels "good"<"average"<..: 3 1 1 2 2 1 3 1 2 1 Just to finalize the subject of vectors, let's create a logical vector, which contains only TRUE and FALSE values: > vector5 <- c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE) > vector5 [1] TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE Similarly, for all other vectors already presented, feel free to check their structure, class, mode, and length using appropriate functions shown in this section. What outputs did those commands return? Scalars The reason why I always start from vectors is that scalars just seem trivial when they follow vectors. To simplify things even more, think of scalars as one-element vectors which are traditionally used to hold some constant values for example: > a1 <- 5 > a1 [1] 5 Of course you may use scalars in computations and also assign any one-element outputs of mathematical or statistical operations to another, arbitrary named scalar for example: > a2 <- 4 > a3 <- a1 + a2 > a3 [1] 9 In order to complete this short subsection on scalars, create two separate scalars which will hold a character and a logical value. Matrices A matrix is a two-dimensional R data structure in which each of its elements must be of the same type; that is numeric, character, or logical. As matrices consist of rows and columns, their shape resembles tables. In fact, when creating a matrix, you can specify how you want to distribute values across its rows and columns for example: > y <- matrix(1:20, nrow=5, ncol=4) > y [,1] [,2] [,3] [,4] [1,] 1 6 11 16 [2,] 2 7 12 17 [3,] 3 8 13 18 [4,] 4 9 14 19 [5,] 5 10 15 20 In the preceding example we have allocated a sequence of 20 values (from 1 to 20) into five rows and four columns, and by default they have been distributed by column. We may now create another matrix in which we will distribute the values by rows and give names to rows and columns using the dimnames argument (dimnames stands for names of dimensions) in the matrix() function: > rows <- c("R1", "R2", "R3", "R4", "R5") > columns <- c("C1", "C2", "C3", "C4") > z <- matrix(1:20, nrow=5, ncol=4, byrow=TRUE, dimnames=list(rows, columns)) > z C1 C2 C3 C4 R1 1 2 3 4 R2 5 6 7 8 R3 9 10 11 12 R4 13 14 15 16 R5 17 18 19 20 As we are talking about matrices it's hard not to mention anything about how to extract specific elements stored in a matrix. This skill will actually turn out to be very useful when we get to subsetting real data sets. Looking at the matrix y, for which we didn't define any names of its rows and columns, notice how R denotes them. The row numbers come in the format [r, ], where r is a consecutive number of a row, whereas the column are identified by [ ,c], where c is a consecutive number of a column. If you then wished to extract a value stored in the fourth row of the second column of our matrix y, you could use the following code to do so: > y[4,2] [1] 9 In case you wanted to extract the whole column number three from our matrix y, you could type the following: > y[,3] [1] 11 12 13 14 15 As you can see, we don't even need to allow an empty space before the comma in order for this short script to work. Let's now imagine you would like to extract three values stored in the second, third and fifth rows of the first column in our vector z with named rows and columns. In this case, you may still want to use the previously shown notation, you do not need to refer explicitly to the names of dimensions of our matrix z. Additionally, notice that for several values to extract we have to specify their row locations as a vector—hence we will put their row coordinates inside the c() function which we had previously used to create vectors: > z[c(2, 3, 5), 1] R2 R3 R5 5 9 17 Similar rules of extracting data will apply to other data structures in R such as arrays, lists, and data frames, which we are going to present next. Arrays Arrays are very similar to matrices with only one exception: they contain more dimensions. However, just like matrices or vectors, they may only hold one type of data. In R language, arrays are created using the array() function: > array1 <- array(1:20, dim=c(2,2,5)) > array1 , , 1 [,1] [,2] [1,] 1 3 [2,] 2 4 , , 2 [,1] [,2] [1,] 5 7 [2,] 6 8 , , 3 [,1] [,2] [1,] 9 11 [2,] 10 12 , , 4 [,1] [,2] [1,] 13 15 [2,] 14 16 , , 5 [,1] [,2] [1,] 17 19 [2,] 18 20 The dim argument, which was used within the array() function, specifies how many dimensions you want to distribute your data across. As we had 20 values (from 1 to 20) we had to make sure that our array can hold all 20 elements, therefore we decided to assign them into two rows, two columns, and five dimensions (2 x 2 x 5 = 20). You can check dimensionality of your multi-dimensional R objects with dim() command: > dim(array1) [1] 2 2 5 As with matrices, you can use standard rules for extracting specific elements from your arrays. The only difference is that now you have additional dimensions to take care of. Let's assume you would want to extract a specific value located in the second row of the first column in the fourth dimension of our array1: > array1[2, 1, 4] [1] 14 Also, if you need to find a location of a specific value, for example 11, within the array, you can simply type the following line: > which(array1==11, arr.ind=TRUE) dim1 dim2 dim3 [1,] 1 2 3 Here, the which() function returns indices of the array (arr.ind=TRUE), where the sought value equals 11 (hence ==). As we had only one instance of value 11 in our array, there is only one row specifying its location in the output. If we had more instances of 11, additional rows would be returned indicating indices for each element equal to 11. Data frames The following two short subsections concern two of probably the most widely used R data structures. Data frames are very similar to matrices, but they may contain different types of data. Here you might have suddenly thought of a typical rectangular data set with rows and columns or observations and variables. In fact you are correct. Most of the data sets are indeed imported into R as data frames. You can also create a simple data frame manually with the data.frame() function, but as each column in the data frame may be of a different type, we must first create vectors which will hold data for specific columns: > subjectID <- c(1:10) > age <- c(37,23,42,25,22,25,48,19,22,38) > gender <- c("male", "male", "male", "male", "male", "female", "female", "female", "female", "female") > lifesat <- c(9,7,8,10,4,10,8,7,8,9) > health <- c("good", "average", "average", "good", "poor", "average", "good", "poor", "average", "good") > paid <- c(T, F, F, T, T, T, F, F, F, T) > dataset <- data.frame(subjectID, age, gender, lifesat, health, paid) > dataset subjectID age gender lifesat health paid 1 1 37 male 9 good TRUE 2 2 23 male 7 average FALSE 3 3 42 male 8 average FALSE 4 4 25 male 10 good TRUE 5 5 22 male 4 poor TRUE 6 6 25 female 10 average TRUE 7 7 48 female 8 good FALSE 8 8 19 female 7 poor FALSE 9 9 22 female 8 average FALSE 10 10 38 female 9 good TRUE The preceding example presents a simple data frame which contains some dummy imaginary data, possibly a sample from a basic psychological experiment, which measured subjects' life satisfaction (lifesat) and their health status (health) and also collected other socio-demographic information such as age and gender, and whether the participant was a paid subject or a volunteer. As we deal with various types of data, the elements for each column had to be amalgamated into a single structure of a data frame using the data.frame() command, and specifying the names of objects (vectors) in which we stored all values. You can inspect the structure of this data frame with the previously mentioned str() function: > str(dataset) 'data.frame': 10 obs. of 6 variables: $ subjectID: int 1 2 3 4 5 6 7 8 9 10 $ age : num 37 23 42 25 22 25 48 19 22 38 $ gender : Factor w/ 2 levels "female","male": 2 2 2 2 2 1 1 1 1 1 $ lifesat : num 9 7 8 10 4 10 8 7 8 9 $ health : Factor w/ 3 levels "average","good",..: 2 1 1 2 3 1 2 3 1 2 $ paid : logi TRUE FALSE FALSE TRUE TRUE TRUE ... The output of str() gives you some basic insights into the shape and format of your data in the dataset object, for example, number of observations and variables, names of variables, types of data they hold, and examples of values for each variable. While discussing data frames, it may also be useful to introduce you to another way of creating subsets. As presented earlier, you may apply standard extraction rules to subset data of your interest. For example, suppose you want to print only those columns which contain age, gender, and life satisfaction information from our dataset data frame. You may use the following two alternatives (the output not shown to save space, but feel free to run it): > dataset[,2:4] #or > dataset[, c("age", "gender", "lifesat")] Both lines of code will produce exactly the same results. The subset() function however gives you additional capabilities of defining conditional statements which will filter the data, based on the output of logical operators. You can replicate the preceding output using subset() in the following way: > subset(dataset[c("age", "gender", "lifesat")]) Assume now that you want to create a subset with all subjects who are over 30 years old, and with a score of greater than or equal to eight on the life satisfaction scale (lifesat). The subset() function comes very handy: > subset(dataset, age > 30 & lifesat >= 8) subjectID age gender lifesat health paid 1 1 37 male 9 good TRUE 3 3 42 male 8 average FALSE 7 7 48 female 8 good FALSE 10 10 38 female 9 good TRUE Or you want to produce an output with two socio-demographic variables of age and gender, of only these subjects who were paid to participate in this experiment: > subset(dataset, paid==TRUE, select=c("age", "gender")) age gender 1 37 male 4 25 male 5 22 male 6 25 female 10 38 female We will perform much more thorough and complex data transformations on real data frames in the second part of this article. Lists A list in R is a data structure, which is a collection of other objects. For example, in the list you can store vectors, scalars, matrices, arrays, data frames, and even other lists. In fact, lists in R are vectors, but they differ from atomic vectors, which we introduced earlier in this section as lists that can hold many different types of data. In the following example, we will construct a simple list (using list() function) which will include a variety of other data structures: > simple.vector1 <- c(1, 29, 21, 3, 4, 55) > simple.matrix <- matrix(1:24, nrow=4, ncol=6, byrow=TRUE) > simple.scalar1 <- 5 > simple.scalar2 <- "The List" > simple.vector2 <- c("easy", "moderate", "difficult") > simple.list <- list(name=simple.scalar2, matrix=simple.matrix, vector=simple.vector1, scalar=simple.scalar1, difficulty=simple.vector2) >simple.list $name [1] "The List" $matrix [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 2 3 4 5 6 [2,] 7 8 9 10 11 12 [3,] 13 14 15 16 17 18 [4,] 19 20 21 22 23 24 $vector [1] 1 29 21 3 4 55 $scalar [1] 5 $difficulty [1] "easy" "moderate" "difficult" > str(simple.list) List of 5 $ name : chr "The List" $ matrix : int [1:4, 1:6] 1 7 13 19 2 8 14 20 3 9 ... $ vector : num [1:6] 1 29 21 3 4 55 $ scalar : num 5 $ difficulty: chr [1:3] "easy" "moderate" "difficult" Looking at the preceding output, you can see that we have assigned names to each component in our list and the str() function prints them as if they were variables of a standard rectangular data set. In order to extract specific elements from a list, you first need to use a double square bracket notation [[x]] to identify a component x within the list. For example, assuming you want to print an element stored in its first row and the third column of the second component you may use the following line in R: > simple.list[[2]][1,3] [1] 3 Owing to their flexibility, lists are commonly used as preferred data structures in the outputs of statistical functions. It is then important for you to know how you can deal with lists and what sort of methods you can apply to extract and process data stored in them. Once you are familiar with the basic features of data structures available in R, you may wish to visit Hadley Wickham's online book at http://adv-r.had.co.nz/ in which he explains various more advanced concepts related to each native data structure in R language, and different techniques of subsetting data, depending on the way they are stored. Exporting R data objects In the previous section we created numerous objects, which you can inspect in the Environment tab window in RStudio. Alternatively, you may use the ls() function to list all objects stored in your global environment: > ls() If you've followed the article along, and run the script for this book line-by-line, the output of the ls() function should hopefully return 27 objects: [1] "a1" "a2" "a3" [4] "age" "array1" "columns" [7] "dataset" "gender" "health" [10] "lifesat" "paid" "rows" [13] "simple.list" "simple.matrix" "simple.scalar1" [16] "simple.scalar2" "simple.vector1" "simple.vector2" [19] "subjectID" "vector1" "vector2" [22] "vector3" "vector4" "vector4.ord" [25] "vector5" "y" "z" In this section we will present various methods of saving the created objects to your local drive and exporting their contents to a number of the most commonly used file formats. Sometimes, for various reasons, it may happen that you need to leave your project and exit RStudio or shut your PC down. If you do not save your created objects, you will lose all of them, the moment you close RStudio. Remember that R stores created data objects in the RAM of your machine, and whenever these objects are not in use any longer, R frees them from the memory, which simply means that they get deleted. Of course this might turn out to be quite costly, especially if you had not saved your original R script, which would have enabled you to replicate all the steps of your data processing activities when you start a new session in R. In order to prevent the objects from being deleted, you can save all or selected ones as .RData files on your hard drive. In the first case, you may use the save.image() function which saves your whole current workspace with all objects to your current working directory: > save.image(file = "workspace.RData") If you are dealing with large objects, first make sure you have enough storage space available on your drive (this is normally not a problem any longer), or alternatively you can reduce the size of the saved objects using one of the compression methods available. For example, the above workspace.RData file was 3,751 bytes in size without compression, but when xz compression was applied the size of the resulting file decreased to 3,568 bytes. > save.image(file = "workspace2.RData", compress = "xz") Of course, the difference in sizes in the presented example is minuscule, as we are dealing with very small objects, however it gets much more significant for bigger data structures. The trade-off of applying one of the compression methods is the time it takes for R to save and load .RData files. If you prefer to save only chosen objects (for example dataset data frame and simple.list list) you can achieve this with the save() function: > save(dataset, simple.list, file = "two_objects.RData") You may now test whether the above solutions worked by cleaning your global environment of all objects, and then loading one of the created files, for example: > rm(list=ls()) > load("workspace2.RData") As an additional exercise, feel free to explore other functions which allow you to write text representations of R objects, for example dump() or dput(). More specifically, run the following commands and compare the returned outputs: > dump(ls(), file = "dump.R", append = FALSE) > dput(dataset, file = "dput.txt") The save.image() and save() functions only create images of your workspace or selected objects on the hard drive. It is an entirely different story if you want to export some of the objects to data files of specified formats, for example, comma-separated, tab-delimited, or proprietary formats like Microsoft Excel, SPSS, or Stata. The easiest way to export R objects to generic file formats like CSV, TXT, or TAB is through the cat() function, but it only works on atomic vectors: > cat(age, file="age.txt", sep=",", fill=TRUE, labels=NULL, append=TRUE) > cat(age, file="age.csv", sep=",", fill=TRUE, labels=NULL, append=TRUE) The preceding code creates two files, one as a text file and another one as a comma-separated format, both of which contain values from the age vector that we had previously created for the dataset data frame. The sep argument is a character vector of strings to append after each element, the fill option is a logical argument which controls whether the output is automatically broken into lines (if set to TRUE), the labels parameter allows you to add a character vector of labels for each printed line of data in the file, and the append logical argument enables you to append the output of the call to the already existing file with the same name. In order to export vectors and matrices to TXT, CSV, or TAB formats you can use the write() function, which writes out a matrix or a vector in a specified number of columns for example: > write(age, file="agedata.csv", ncolumns=2, append=TRUE, sep=",") > write(y, file="matrix_y.tab", ncolumns=2, append=FALSE, sep="t") Another method of exporting matrices provides the MASS package (make sure to install it with the install.packages("MASS") function) through the write.matrix() command: > library(MASS) > write.matrix(y, file="ymatrix.txt", sep=",") For large matrices, the write.matrix() function allows users to specify the size of blocks in which the data are written through the blocksize argument. Probably the most common R data structure that you are going to export to different file formats will be a data frame. The generic write.table() function gives you an option to save your processed data frame objects to standard data formats for example TAB, TXT, or CSV: > write.table(dataset, file="dataset1.txt", append=TRUE, sep=",", na="NA", col.names=TRUE, row.names=FALSE, dec=".") The append and sep arguments should already be clear to you as they were explained earlier. In the na option you may specify an arbitrary string to use for missing values in the data. The logical parameter col.names allows users to append the names of columns to the output file, and the dec parameter sets the string used for decimal points and must be a single character. In the example, we used row.names set to FALSE, as the names of the rows in the data are the same as the values of the subjectID column. However, it is very likely that in other data sets the ID variable may differ from the names (or numbers) of rows, so you may want to control it depending on the characteristics of your data. Two similar functions write.csv() and write.csv2() are just convenience wrappers for saving CSV files, and they only differ from the generic write.table() function by default settings of some of their parameters, for example sep and dec. Feel free to explore these subtle differences at your leisure. To complete this section of the article we need to present how to export your R data frames to third-party formats. Amongst several frequently used methods, at least four of them are worth mentioning here. First, if you wish to write a data frame to a proprietary Microsoft Excel format, such as XLS or XLSX, you should probably use the WriteXLS package (please use install.packages("WriteXLS") if you have not done it yet) and its WriteXLS() function: > library(WriteXLS) > WriteXLS("dataset", "dataset1.xlsx", SheetNames=NULL, row.names=FALSE, col.names=TRUE, AdjWidth=TRUE, envir=parent.frame()) The WriteXLS() command offers users a number of interesting options, for instance you can set the names of the worksheets (SheetNames argument), adjust the widths of columns depending on the number of characters of the longest value (AdjWidth), or even freeze rows and columns just as you do it in Excel (FreezeRow and FreezeCol parameters). Please note that in order for the WriteXLS package to work, you need to have Perl installed on your machine. The package creates Excel files using Perl scripts called WriteXLS.pl for Excel 2003 (XLS) files, and WriteXLSX.pl for Excel 2007 and later version (XLSX) files. If Perl is not present on your system, please make sure to download and install it from https://www.perl.org/get.html. After the Perl installation, you may have to restart your R session and load the WriteXLS package again to apply the changes. For solutions to common Perl issues please visit the following websites: https://www.perl.org/docs.html, http://www.ahinea.com/en/tech/perl-unicode-struggle.html, and http://www.perl.com/pub/2012/04/perlunicook-standard-preamble.html or search StackOverflow and similar websites for R and Perl related specific problems. Another very useful way of writing R objects to the XLSX format is provided by the openxlsx package through the write.xlsx() function, which, apart from data frames, also allows lists to be easily written to Excel spreadsheets. Please note that Windows users may need to install the Rtools package in order to use openxlsx functionalities. The write.xlsx() function gives you a large choice of possible options to set, including a custom style to apply to column names (through headerStyle argument), the color of cell borders (borderColour), or even its line style (borderStyle). The following example utilizes only the most common and minimal arguments required to write a list to the XLSX file, but be encouraged to explore other options offered by this very flexible function: > write.xlsx(simple.list, file = "simple_list.xlsx") A third-party package called foreign makes it possible to write data frames to other formats used by well-known statistical tools such as SPSS, Stata, or SAS. When creating files, the write.foreign() function requires users to specify the names of both the data and code files. Data files hold raw data, whereas code files contain scripts with the data structure and metadata (value and variable labels, variable formats, and so on) written in the proprietary syntax. In the following example, the code writes the dataset data frame to the SPSS format: > library(foreign) > write.foreign(dataset, "datafile.txt", "codefile.txt", package="SPSS") Finally, another package called rio contains only three functions, allowing users to quickly import(), export() and convert() data between a large array of file formats, (for example TSV, CSV, RDS, RData, JSON, DTA, SAV, and many more). The package, in fact, is dependent on a number of other R libraries, some of which, for example foreign and openxlsx, have already been presented in this article. The rio package does not introduce any new functionalities apart from the default arguments characteristic for underlying export functions, so you still need to be familiar with the original functions and their parameters if you require more advanced exporting capabilities. But, if you are only looking for a no-fuss general export function, the rio package is definitely a good shortcut to take: > export(dataset, format = "stata") > export(dataset, "dataset1.csv", col.names = TRUE, na = "NA") Summary In this article, we have provided you with quite a bit of theory, and hopefully a lot of practical examples of data structures available to R users. You've created several objects of different types, and you've become familiar with a variety of data and file formats to offer. We then showed you how to save R objects held in your R workspace to external files on your hard drive, or to export them to various standard and proprietary file formats. Resources for Article: Further resources on this subject: Fast Data Manipulation with R [article] The Data Science Venn Diagram [article] Deployment and DevOps [article]
Read more
  • 0
  • 0
  • 11956

article-image-wordpress-management-wp-cli
Marcel Reschke
09 Nov 2016
6 min read
Save for later

Wordpress Management with WP-CLI

Marcel Reschke
09 Nov 2016
6 min read
Managing WordPress via the web interface can be a struggle sometimes. Even Common tasks can take a lot of effort to complete, especially when they are recurring. Luckily, with the WP-CLI (WordPress Command-Line Interface), you have a command line tool to manage your WordPress installations. In this post, I will show you how to install WP-CLI, get a WordPressinstallation up and running, and manage WordPress themes and plugins. Install WP-CLI When it comes to installing WP-CLI, all you have to do is check the system requirements, download the WP-CLI archive, and make it executable. Checking system requirements To use a command-line tool like WP-CLI, you need shell access to your web server’s filesystem. This can be achieved through your terminal application on your local machine or via ssh on a remote host. Most serious hosting plans, and every server plan, will provide you with at least one SSH account. The other requirements are very basic: UNIX-like environment (OS X, Linux, FreeBSD, Cygwin) PHP 5.3.29 or later WordPress 3.7 or later Downloading the WP-CLI PHP archive WP-CLI is distributed in a PHP Archive file (.phar). So at your shell’s prompt, download the wp-cli.phar using wget or curl: $ curl -O https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar To verify the download, execute the archive with the php binary: $ phpwp-cli.phar --info If the download issuccessful, you will see an output like this: PHP binary: /usr/bin/php PHP version: 5.5.29 php.ini used: WP-CLI root dir: phar://wp-cli.phar WP-CLI packages dir: WP-CLI global config: WP-CLI project config: WP-CLI version: 0.24.0 Now you are ready to use WP-CLI to manage your WordPress installations, but to save time and typed characters, it’s a commonly used convention to call WP-CLI with the wp command. Set up the wp command for execution To avoid typing phpwp-cli.phar ... every time you want to use WP-CLI, you have to modify the archive’s file permissions and move it to somewhere in your PATH or add an alias to your shell. To make the archive executable, edit its permissions with chmod: $ chmod +x wp-cli.phar After this,move the archive to a directory somewhere within your PATH: $ sudo mv wp-cli.phar /usr/local/bin/wp If you are on shared hosting, you might have not sufficient rights to move an executable to /usr/local/bin/. In that case, you can add a shell alias to your shells configuration. Put the following line in the .bashrc or .profile file in your home directory, depending on your shell and your operating system: aliaswp='~/wp-cli.phar' To make the shell recognize the new alias, you have to reinitialize the configuration file. For example: $ source .bashrc Now you can use the wp command to manage WordPress with WP-CLI. Let’s start with some basic tasks. Managing WordPress with WP-CLI With WP-CLI, you can manage each and every aspect of your WordPress installation that can be managed through the WordPress admin area—and quite a few more. To get a glimpse of the tasks that can be done with WP-CLI, you simply type: $ wp help But let’s start with installing the WordPress core. Install a blog To geta fresh WordPress installation done, you just need your database information and a few commands. To download the latest WordPress core,simply type: $ wp core download You can also specify the version and language you want to download: $ wp core download --locale=de_DE --version=4.5.2 To create the configuration file (wp-config.php) use the core config command: $ wp core download --locale=de_DE --version=4.5.2 To finalize the installation process, you have to configure your blog’s information and the admin user data: $ wp core install --url="http://yourdomain.com" --title="My new blog" --admin_name="admin" --admin_email="admin@yourdomain.com" --admin_password="secretpassword" You can now point your browser to your WordPress admin login and start blogging. Managing WordPress themes and plugins After installing WordPress WP-CLI, you can start using it to manage your themes and plugins. Managing themes Managing themes with WP-CLI is much quicker than using the WordPress admin area. To install and activate a theme from wordpress.org, you type: $ wp theme install twentysixteen --activate You can also install a theme from a local .zip file: $ wp theme install ../mynewtheme.zip Or from aURL: $ wp theme install https://github.com/Automattic/_s/archive/master.zip To list all installed themes,type: $ wp theme list Managing plugins Managing plugins with WP-CLI is nearly the same. You use the wp plugin command instead of wp theme. When you don’t know the slug of a plugin, you can search the wordpress.org plugin repository: $ wp plugin search "yoastseo" You will get an output like this: Success: Showing 10 of 285 plugins. +-------------------------------------------+-------------------------------------+--------+ | name | slug | rating | +-------------------------------------------+-------------------------------------+--------+ | Yoast SEO | wordpress-seo | 80 | | Meta Box for Yoast SEO | meta-box-yoast-seo | 0 | | Remove Yoast SEO comments | remove-yoast-seo-comments | 0 | | ACF-Content Analysis for Yoast SEO | acf-content-analysis-for-yoast-seo | 100 | | qTranslate-X&#38; Yoast SEO | dennisridder-qtx-seo | 0 | | SEO Advanced Custom Fields Analyzer | seo-advanced-custom-fields-analyzer | 0 | | Integration: Yoast SEO &#38; qTranslate-X | wp-seo-qtranslate-x | 72 | | Uninstall Yoast SEO | uninstall-yoast-seo | 100 | | Remove Branding for Yoast SEO | remove-branding-for-yoast-seo | 84 | | All Meta Stats Yoast SEO Addon | all-meta-stats-yoast-seo-addon | 100 | +-------------------------------------------+-------------------------------------+--------+ Find the slug and install it afterward: $ wp plugin install wordpress-seo Then you can activate the plugin with: $ wp plugin activate wordpress-seo Deactivating a plugin is pretty much the same: $ wp plugin deactivate wordpress-seo Conclusion You should now have a basic understanding of how to manage WordPress sites with WP-CLI.You can also use WP-CLI to manage your database, do backups, keep things up to date, handle posts and comments, and maintain WordPress multi-sites. Its full strength is especially exploited when used in shell scripts, or in a more development-style app environment with version control, unit testing, and multi-stage deployment. About the author Marcel Reschke is a developer based out of Germany. He can be found on GitHub .
Read more
  • 0
  • 0
  • 10544

article-image-machine-learning-technique-supervised-learning
Packt
09 Nov 2016
7 min read
Save for later

Machine Learning Technique: Supervised Learning

Packt
09 Nov 2016
7 min read
In this article by Andrea Isoni author of the book Machine Learning for the Web, we will the most relevant regression and classification techniques are discussed. All of these algorithms share the same background procedure, and usually the name of the algorithm refers to both a classification and a regression method. The linear regression algorithms, Naive Bayes, decision tree, and support vector machine are going to be discussed in the following sections. To understand how to employ the techniques, a classification and a regression problem will be solved using the mentioned methods. Essentially, a labeled train dataset will be used to train the models, which means to find the values of the parameters, as we discussed in the introduction. As usual, the code is available in the my GitHub folder at https://github.com/ai2010/machine_learning_for_the_web/tree/master/chapter_3/. (For more resources related to this topic, see here.) We will conclude the article with an extra algorithm that may be used for classification, although it is not specifically designed for this purpose (hidden Markov model). We will now begin to explain the general causes of error in the methods when predicting the true labels associated with a dataset. Model error estimation We said that the trained model is used to predict the labels of new data, and the quality of the prediction depends on the ability of the model to generalize, that is, the correct prediction of cases not present in the trained data. This is a well-known problem in literature and related to two concepts: bias and variance of the outputs. The bias is the error due to a wrong assumption in the algorithm. Given a point x(t) with label yt, the model is biased if it is trained with different training sets, and the predicted label ytpred will always be different from yt. The variance error instead refers to the different wrongly predicted labels of the given point x(t). A classic example to explain the concepts is to consider a circle with the true value at the center (true label), as shown in the following figure. The closer the predicted labels are to the center, the more unbiased the model and the lower the variance (top left in the following figure). The other three cases are also shown here: Variance and bias example. A model with low variance and low bias errors will have the predicted labels that is blue dots (as show in the preceding figure) concentrated on the red center (true label). The high bias error occurs when the predictions are far away from the true label, while high variance appears when the predictions are in a wide range of values. We have already seen that labels can be continuous or discrete, corresponding to regression classification problems respectively. Most of the models are suitable for solving both problems, and we are going to use word regression and classification referring to the same model. More formally, given a set of N data points and corresponding labels, a model with a set of parameters with the true parameter values will have the mean square error (MSE), equal to: We will use the MSE as a measure to evaluate the methods discussed in this article. Now we will start describing the generalized linear methods. Generalized linear models The generalized linear model is a group of models that try to find the M parameters that form a linear relationship between the labels yi and the feature vector x(i) that is as follows: Here, are the errors of the model. The algorithm for finding the parameters tries to minimize the total error of the model defined by the cost function J: The minimization of J is achieved using an iterative algorithm called batch gradient descent: Here, a is called learning rate, and it is a trade-off between convergence speed and convergence precision. An alternative algorithm that is called stochastic gradient descent, that is loop for : The qj is updated for each training example i instead of waiting to sum over the entire training set. The last algorithm converges near the minimum of J, typically faster than batch gradient descent, but the final solution may oscillate around the real values of the parameters. The following paragraphs describe the most common model and the corresponding cost function, J. Linear regression Linear regression is the simplest algorithm and is based on the model: The cost function and update rule are: Ridge regression Ridge regression, also known as Tikhonov regularization, adds a term to the cost function J such that: , where l is the regularization parameter. The additional term has the function needed to prefer a certain set of parameters over all the possible solutions penalizing all the parameters qj different from 0. The final set of qj shrank around 0, lowering the variance of the parameters but introducing a bias error. Indicating with the superscript l the parameters from the linear regression, the ridge regression parameters are related by the following formula: This clearly shows that the larger the l value, the more the ridge parameters are shrunk around 0. Lasso regression Lasso regression is an algorithm similar to ridge regression, the only difference being that the regularization term is the sum of the absolute values of the parameters: Logistic regression Despite the name, this algorithm is used for (binary) classification problems, so we define the labels. The model is given the so-called logistic function expressed by: In this case, the cost function is defined as follows: From this, the update rule is formally the same as linear regression (but the model definition,  , is different): Note that the prediction for a point p,  , is a continuous value between 0 and 1. So usually, to estimate the class label, we have a threshold at =0.5 such that: The logistic regression algorithm is applicable to multiple label problems using the techniques one versus all or one versus one. Using the first method, a problem with K classes is solved by training K logistic regression models, each one assuming the labels of the considered class j as +1 and all the rest as 0. The second approach consists of training a model for each pair of labels (  trained models). Probabilistic interpretation of generalized linear models Now that we have seen the generalized linear model, let’s find the parameters qj that satisfy the relationship: In the case of linear regression, we can assume  as normally distributed with mean 0 and variance s2 such that the probability  is  equivalent to: Therefore, the total likelihood of the system can be expressed as follows: In the case of the logistic regression algorithm, we are assuming that the logistic function itself is the probability: Then the likelihood can be expressed by: In both cases, it can be shown that maximizing the likelihood is equivalent to minimizing the cost function, so the gradient descent will be the same. k-nearest neighbours (KNN) This is a very simple classification (or regression) method in which given a set of feature vectors  with corresponding labels yi, a test point x(t) is assigned to the label value with the majority of the label occurrences in the K nearest neighbors  found, using a distance measure such as the following: Euclidean: Manhattan: Minkowski:  (if q=2, this reduces to the Euclidean distance) In the case of regression, the value yt is calculated by replacing the majority of occurrences by the average of the labels .  The simplest average (or the majority of occurrences) has uniform weights, so each point has the same importance regardless of their actual distance from x(t). However, a weighted average with weights equal to the inverse distance from x(t) may be used. Summary In this article, the major classification and regression algorithms, together with the techniques to implement them, were discussed. You should now be able to understand in which situation each method can be used and how to implement it using Python and its libraries (sklearn and pandas). Resources for Article: Further resources on this subject: Supervised Machine Learning [article] Unsupervised Learning [article] Specialized Machine Learning Topics [article]
Read more
  • 0
  • 0
  • 1739
article-image-breaking-microservices-architecture
Packt
08 Nov 2016
15 min read
Save for later

Breaking into Microservices Architecture

Packt
08 Nov 2016
15 min read
In this article by Narayan Prusty, the author of the book Modern JavaScript Applications, we will see the architecture of server side application development for complex and large applications (applications with huge number of users and large volume of data) shouldn't just involve faster response and providing web services for wide variety of platforms. It should be easy to scale, upgrade, update, test, and deploy. It should also be highly available, allowing the developers write components of the server side application in different programming languages and use different databases. Therefore, this leads the developers who build large and complex applications to switch from the common monolithic architecture to microservices architecture that allows us to do all this easily. As microservices architecture is being widely used in enterprises that build large and complex applications, it's really important to learn how to design and create server side applications using this architecture. In this chapter, we will discuss how to create applications based on microservices architecture with Node.js using the Seneca toolkit. (For more resources related to this topic, see here.) What is monolithic architecture? To understand microservices architecture, it's important to first understand monolithic architecture, which is its opposite. In monolithic architecture, different functional components of the server side application, such as payment processing, account management, push notifications, and other components, all blend together in a single unit. For example, applications are usually divided into three parts. The parts are HTML pages or native UI that run on the user's machine, server side application that runs on the server, and database that also runs on the server. The server side application is responsible for handling HTTP requests, retrieving and storing data in a database, executing algorithms, and so on. If the server side application is a single executable (that is running is a single process) that does all these task, than we say that the server side application is monolithic. This is a common way of building server side applications. Almost every major CMS, web servers, server side frameworks, and so on are built using monolithic architecture. This architecture may seem successful, but problems are likely to arise when your application is large and complex. Demerits of monolithic architecture The following are some of the issues caused by server side applications built using the monolithic architecture. Scaling monolithic architecture As traffic to your server side application increases, you will need to scale your server side application to handle the traffic. In case of monolithic architecture, you can scale the server side application by running the same executable on multiple servers and place the servers behind a load balancer or you can use round robin DNS to distribute the traffic among the servers: In the preceding diagram, all the servers will be running the same server side application. Although scaling is easy, scaling monolithic server side application ends up with scaling all the components rather than the components that require greater resource. Thus, causing unbalanced utilization of resources sometimes, depending on the quantity and types of resources the components need. Let's consider some examples to understand the issues caused while scaling monolithic server side applications: Suppose there is a component of server side application that requires a more powerful or special kind of hardware, we cannot simply scale this particular component as all the components are packed together, therefore everything needs to be scaled together. So, to make sure that the component gets enough resources, you need to run the server side application on some more servers with powerful or special hardware, leading to consumption of more resources than actually required. Suppose we have a component that requires to be executed on a specific server operating system that is not free of charge, we cannot simply run this particular component in a non-free operating system as all the components are packed together and therefore, just to execute this specific component, we need to install the non-free operating system in all servers, increasing the cost largely. These are just some examples. There are many more issues that you are likely to come across while scaling a monolithic server side application. So, when we scale monolithic server side applications, the components that don't need more powerful or special kind of resource starts receiving them, therefore deceasing resources for the component that needs them. We can say that scaling monolithic server side application involves scaling all components that are forcing to duplicate everything in the new servers. Writing monolithic server side applications Monolithic server side applications are written in a particular programming language using a particular framework. Enterprises usually have developers who are experts in different programming languages and frameworks to build server side applications; therefore, if they are asked to build a monolithic server side application, then it will be difficult for them to work together. The components of a monolithic server side application can be reused only in the same framework using, which it's built. So, you cannot reuse them for some other kind of project that's built using different technologies. Other issues of monolithic architecture Here are some other issues that developers might face. Depending on the technology that is used to build the monolithic server side application: It may need to be completely rebuild and redeployed for every small change made to it. This is a time-consuming task and makes your application inaccessible for a long time. It may completely fail if any one of the component fails. It's difficult to build a monolithic application to handle failure of specific components and degrade application features accordingly. It may be difficult to find how much resources are each components consuming. It may be difficult to test and debug individual components separately. Microservices architecture to the rescue We saw the problems caused by monolithic architecture. These problems lead developers to switch from monolithic architecture to microservices architecture. In microservices architecture, the server side application is divided into services. A service (or microservice) is a small and independent process that constitutes a particular functionality of the complete server side application. For example, you can have a service for payment processing, another service for account management, and so on; the services need to communicate with each other via network. What do you mean by "small" service? You must be wondering how small a service needs to be and how to tell whether a service is small or not? Well, it actually depends on many factors such as the type of application, team management, availability of resources, size of application, and how small you think is small? However, a small service doesn't have to be the one that is written is less lines of code or provides a very basic functionality. A small service can be the one on which a team of developers can work independently, which can be scaled independently to other services, scaling it doesn't cause unbalanced utilization of recourses, and overall they are highly decoupled (independent and unaware) of other services. You don't have to run each service in a different server, that is, you can run multiple services in a single computer. The ratio of server to services depends on different factors. A common factor is the amount and type of resources and technologies required. For example, if a service needs a lot of RAM and CPU time, then it would be better to run it individually on a server. If there are some services that don't need much resources, then you can run them all in a single server together. The following diagram shows an example of the microservices architecture: Here, you can think of Service 1 as the web server with which a browser communicates and other services providing APIs for various functionalities. The web services communicate with other services to get data. Merits of microservices architecture Due to the fact that services are small and independent and communicate via network, it solves many problems that monolithic architecture had. Here are some of the benefits of microservices architecture: As the services communicate via network, they can be written in different programming languages using different frameworks Making a change to a service only requires that particular service to be redeployed instead of all the services, which is a faster procedure It becomes easier to measure how much resources are consumed by each service as each service runs in a different process It becomes easier to test and debug, as you can analyze each service separately Services can be reused by other applications as they interact via network calls Scaling services Apart from the preceding benefits, one of the major benefits of microservices architecture is that you can scale individual services that require scaling instead of all the services, therefore preventing duplication of resources and unbalanced utilization of resources. Suppose we want to scale Service 1 in the preceding diagram. Here is a diagram that shows how it can be scaled: Here, we are running two instances of Service 1 on two different servers kept behind a load balancer, which distributes the traffic between them. All other services run the same way as scaling them wasn't required. If you wanted to scale Service 3, then you can run multiple instances of Service 3 on multiple servers and place them behind a load balancer. Demerits of microservices architecture Although there are a lot of merits of using microservices architecture compared to monolithic architecture, there are some demerits of microservices architecture as well: As the server side application is divided into services, deploying, and optionally, configuring each service separately is cumbersome and a time-consuming task. Note that developers often use some sort automation technology (such as AWS, Docker, and so on) to make deployment somewhat easier; however, to use it, you still need a good level of experience and expertise of that technology. Communication between services is likely to lag as it's done via network. This sort of server side applications is more prone to network security vulnerabilities as services communicate via network. Writing code for communicating with other services can be harder, that is, you need to make network calls and then parse the data to read it. This also requires more processing. Note that although there are frameworks to build server side applications using microservices that make fetching and parsing of data easier, it still doesn't deduct the processing and network wait time. You will surely need some sort of monitoring tool to monitor services as they may go down due to network, hardware, or software failure. Although you may use the monitoring tool only when your application suddenly stops, to build the monitoring software or use some sort of service, monitoring software needs some level of extra experience and expertise. Microservices-based server side applications are slower than monolithic-based server side applications as communication via networks is slower compared to memory. When to use microservices architecture? It may seem like its difficult to choose between monolithic and microservices architecture, but it's actually not so hard to decide between them. If you are building a server side application using monolithic architecture and you feel that you are unlikely to face any monolithic issues that we discussed earlier, then you can stick to monolithic architecture. In future, if you are facing issues that can be solved using microservices architecture, then you should switch to microservices architecture. If you are switching from a monolithic architecture to microservices architecture, then you don't have to rewrite the complete application, instead you can only convert the components that are causing issues to services by doing some code refactoring. This sort of server side applications where the main application logic is monolithic but some specific functionality is exposed via services is called microservices architecture with monolithic core. As issues increase further, you can start converting more components of the monolithic core to services. If you are building a server side application using monolithic architecture and you feel that you are likely to face any of the monolithic issues that we discussed earlier, then you should immediately switch to microservices architecture or microservices architecture with monolithic core, depending on what suits you the best. Data management In microservices architecture, each service can have its own database to store data and can also use a centralized database to store. Some developers don't use a centralized database at all, instead all services have their own database to store the data. To synchronize the data between the services, the services omit events when their data is changed and other services subscribe to the event and update the data. The problem with this mechanism is that if a service is down, then it may miss some events. There is also going to be a lot of duplicate data, and finally, it is difficult to code this kind of system. Therefore, it's a good idea to have a centralized database and also let each service to maintain their own database if they want to store something that they don't want to share with others. Services should not connect to the centralized database directly, instead there should be another service called database service that provides APIs to work with the centralized database. This extra layer has many advantages, such as the underlying schema can be changed without updating and redeploying all the services that are dependent on the schema, we can add a caching layer without making changes to the services, you can change the type of database without making any changes to the services and there are many other benefits. We can also have multiple database services if there are multiple schemas, or if there are different types of databases, or due to some other reason that benefits the overall architecture and decouples the services. Implementing microservices using Seneca Seneca is a Node.js framework for creating server side applications using microservices architecture with monolithic core. Earlier, we discussed that in microservices architecture, we create a separate service for every component, so you must be wondering what's the point of using a framework for creating services that can be done by simply writing some code to listen to a port and reply to requests. Well, writing code to make requests, send responses, and parse data requires a lot of time and work, but a framework like Seneca make all this easy. Also converting components of monolithic core to services is also a cumbersome task as it requires a lot of code refactoring, but Seneca makes it easy by introducing a concept of actions and plugins. Finally, services written in any other programming language or framework will be able to communicate with Seneca services. In Seneca, an action represents a particular operation. An action is a function that's identified by an object literal or JSON string called as the action's pattern. In Seneca, these operations of a component of monolithic core are written using actions, which we may later want to move from monolithic core to a service and expose it to other services and monolithic core via network. Why actions? You might be wondering what is the benefit of using actions instead of functions to write operations and how actions make it easy to convert components of monolithic core to services? Suppose you want to move an operation of monolithic core that is written using a function to a separate service and expose the function via network then you cannot simply copy and paste the function to the new service, instead you need to define a route (if you are using Express). To call the function inside the monolithic core, you will need to write code to make an HTTP request to the service. To call this operation inside the service, you can simply call a function so that there are two different code snippets depending from where you are executing the operation. Therefore, moving operations requires a lot of code refactoring. However, if you would have written the preceding operation using the Seneca action, then it would have been really easy to move the operation to a separate service. In case the operation is written using action, and you want to move the operation to a separate service and expose the operation via network, then you can simply copy and paste the action to the new service. That's it. Obviously, we also need to tell the service to expose the action via network and tell the monolithic core where to find the action, but all these require just couple of lines of code. A Seneca service exposes actions to other services and monolithic core. While making request to a service, we need to provide a pattern matching an action's pattern to be called in the service. Why patterns? Patterns make it easy to map a URL to action, patterns can overwrite other patterns for specific conditions, therefore it prevents editing of the existing code, as editing of the existing code in a production site is not safe and have many other disadvantages. Seneca also has a concept of plugins. A seneca plugin is actually a set of actions that can be easily distributed and plugged in to a service or monolithic core. As our monolithic core becomes larger and complex, we can convert components to services. That is, move actions of certain components to services. Summary In this chapter, we saw the difference between monolithic and microservices architecture. Then we discussed what microservices architecture with monolithic core means and its benefits. Finally, we jumped into the Seneca framework for implementing microservices architecture with monolithic core and discussed how to create a basic login and registration functionality to demonstrate various features of the Seneca framework and how to use it. In the next chapter, we will create a fully functional e-commerce website using Seneca and Express frameworks Resources for Article: Further resources on this subject: Microservices – Brave New World [article] Patterns for Data Processing [article] Domain-Driven Design [article]
Read more
  • 0
  • 0
  • 12965

article-image-building-voice-technology-iot-projects
Packt
08 Nov 2016
10 min read
Save for later

Building Voice Technology on IoT Projects

Packt
08 Nov 2016
10 min read
In this article by Agus Kurniawan, authors of Smart Internet of Things Projects, we will explore how to make your IoT board speak something. Various sound and speech modules will be explored as project journey. (For more resources related to this topic, see here.) We explore the following topics Introduce a speech technology Introduce sound sensor and actuator Introduce pattern recognition for speech technology Review speech and sound modules Build your own voice commands for IoT projects Make your IoT board speak Make Raspberry Pi speak Introduce a speech technology Speech is the primary means of communication among people. A speech technology is a technology which is built by speech recognition research. A machine such as a computer can understand what human said even the machine can recognize each speech model so the machine can differentiate each human's speech. A speech technology covers speech-to-text and text-to-speech topics. Some researchers already define several speech model for some languages, for instance, English, German, China, French. A general of speech research topics can be seen in the following figure: To convert speech to text, we should understand about speech recognition. Otherwise, if we want to generate speech sounds from text, we should learn about speech synthesis. This article doesn't cover about speech recognition and speech synthesis in heavy mathematics and statistics approach. I recommend you read textbook related to those topics. In this article, we will learn how to work sound and speech processing on IoT platform environment. Introduce sound sensors and actuators Sound sources can come from human, animal, car, and etc. To process sound data, we should capture the sound source from physical to digital form. This happens if we use sensor devices which capture the physical sound source. A simple sound sensor is microphone. This sensor can record any source via microphone. We use a microphone module which is connected to your IoT board, for instance, Arduino and Raspberry Pi. One of them is Electret Microphone Breakout, https://www.sparkfun.com/products/12758. This is a breakout module which exposes three pin outs: AUD, GND, and VCC. You can see it in the following figure. Furthermore, we can generate sound using an actuator. A simple sound actuator is passive buzzer. This component can generate simple sounds with limited frequency. You can generate sound by sending on signal pin through analog output or PWM pin. Some manufacturers also provide a breakout module for buzzer. Buzzer actuator form is shown in the following figure. Buzzer usually is passive actuator. If you want to work with active sound actuator, you can use a speaker. This component is easy to find on your local or online store. I also found it on Sparkfun, https://www.sparkfun.com/products/11089 which you can see it in the following figure. To get experiences how to work sound sensor/actuator, we build a demo to capture sound source by getting sound intensity. In this demo, I show how to detect a sound intensity level using sound sensor, an Electret microphone. The sound source can come from sounds of voice, claps, door knocks or any sounds loud enough to be picked up by a sensor device. The output of sensor device is analog value so MCU should convert it via a microcontroller's analog-to-digital converter. The following is a list of peripheral for our demo. Arduino board. Resistor 330 Ohm. Electret Microphone Breakout, https://www.sparkfun.com/products/12758. 10 Segment LED Bar Graph - Red, https://www.sparkfun.com/products/9935. You can use any color for LED bar. You can also use Adafruit Electret Microphone Breakout to be attached into Arduino board. You can review it on https://www.adafruit.com/product/1063. To build our demo, you wire those components as follows Connect Electret Microphone AUD pin to Arduino A0 pin Connect Electret Microphone GND pin to Arduino GND pin Connect Electret Microphone VCC pin to Arduino 3.3V pin Connect 10 Segment LED Bar Graph pins to Arduino digital pins: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 which already connected to resistor 330 Ohm You can see the final wiring of our demo in the following figure: 10 segment led bar graph module is used to represent of sound intensity level. In Arduino we can use analogRead() to read analog input from external sensor. Output of analogRead() returns value 0 - 1023. Total output in voltage is 3.3V because we connect Electret Microphone Breakout with 3.3V on VCC. From this situation, we can set 3.3/10 = 0.33 voltage for each segment led bar. The first segment led bar is connected to Arduino digital pin 3. Now we can implement to build our sketch program to read sound intensity and then convert measurement value into 10 segment led bar graph. To obtain a sound intensity, we try to read sound input from analog input pin. We read it during a certain time, called sample window time, for instance, 250 ms. During that time, we should get the peak value or maximum value of analog input. The peak value will be set as sound intensity value. Let's start to implement our program. Open Arduino IDE and write the following sketch program. // Sample window width in mS (250 mS = 4Hz) const int sampleWindow = 250; unsigned int sound; int led = 13; void setup() { Serial.begin(9600); pinMode(led, OUTPUT); pinMode(3, OUTPUT); pinMode(4, OUTPUT); pinMode(5, OUTPUT); pinMode(6, OUTPUT); pinMode(7, OUTPUT); pinMode(8, OUTPUT); pinMode(9, OUTPUT); pinMode(10, OUTPUT); pinMode(11, OUTPUT); pinMode(12, OUTPUT); } void loop() { unsigned long start= millis(); unsigned int peakToPeak = 0; unsigned int signalMax = 0; unsigned int signalMin = 1024; // collect data for 250 milliseconds while (millis() - start < sampleWindow) { sound = analogRead(0); if (sound < 1024) { if (sound > signalMax) { signalMax = sound; } else if (sound < signalMin) { signalMin = sound; } } } peakToPeak = signalMax - signalMin; double volts = (peakToPeak * 3.3) / 1024; Serial.println(volts); display_bar_led(volts); } void display_bar_led(double volts) { display_bar_led_off(); int index = round(volts/0.33); switch(index){ case 1: digitalWrite(3, HIGH); break; case 2: digitalWrite(3, HIGH); digitalWrite(3, HIGH); break; case 3: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); break; case 4: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); break; case 5: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); break; case 6: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); digitalWrite(8, HIGH); break; case 7: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); digitalWrite(8, HIGH); digitalWrite(9, HIGH); break; case 8: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); digitalWrite(8, HIGH); digitalWrite(9, HIGH); digitalWrite(10, HIGH); break; case 9: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); digitalWrite(8, HIGH); digitalWrite(9, HIGH); digitalWrite(10, HIGH); digitalWrite(11, HIGH); break; case 10: digitalWrite(3, HIGH); digitalWrite(4, HIGH); digitalWrite(5, HIGH); digitalWrite(6, HIGH); digitalWrite(7, HIGH); digitalWrite(8, HIGH); digitalWrite(9, HIGH); digitalWrite(10, HIGH); digitalWrite(11, HIGH); digitalWrite(12, HIGH); break; } } void display_bar_led_off() { digitalWrite(3, LOW); digitalWrite(4, LOW); digitalWrite(5, LOW); digitalWrite(6, LOW); digitalWrite(7, LOW); digitalWrite(8, LOW); digitalWrite(9, LOW); digitalWrite(10, LOW); digitalWrite(11, LOW); digitalWrite(12, LOW); } Save this sketch program as ch05_01. Compile and deploy this program into Arduino board. After deployed the program, you can open Serial Plotter tool. You can find this tool from Arduino menu Tools -| Serial Plotter. Set the baud rate as 9600 baud on the Serial Plotter tool. Try to make noise on a sound sensor device. You can see changing values on graphs from Serial Plotter tool. A sample of Serial Plotter can be seen in the following figure: How to work? The idea to obtain a sound intensity is easy. We get a value among sound signal peaks. Firstly, we define a sample width, for instance, 250 ms for 4Hz. // Sample window width in mS (250 mS = 4Hz) const int sampleWindow = 250; unsigned int sound; int led = 13; On the setup() function, we initialize serial port and our 10 segment led bar graph. void setup() { Serial.begin(9600); pinMode(led, OUTPUT); pinMode(3, OUTPUT); pinMode(4, OUTPUT); pinMode(5, OUTPUT); pinMode(6, OUTPUT); pinMode(7, OUTPUT); pinMode(8, OUTPUT); pinMode(9, OUTPUT); pinMode(10, OUTPUT); pinMode(11, OUTPUT); pinMode(12, OUTPUT); } On the loop() function, we perform to calculate a sound intensity related to a sample width. After obtained a peak-to-peak value, we convert it into voltage form. unsigned long start= millis(); unsigned int peakToPeak = 0; unsigned int signalMax = 0; unsigned int signalMin = 1024; // collect data for 250 milliseconds while (millis() - start < sampleWindow) { sound = analogRead(0); if (sound < 1024) { if (sound > signalMax) { signalMax = sound; } else if (sound < signalMin) { signalMin = sound; } } } peakToPeak = signalMax - signalMin; double volts = (peakToPeak * 3.3) / 1024; Then, we show a sound intensity in volt form in serial port and 10 segment led by calling display_bar_led(). Serial.println(volts); display_bar_led(volts); Inside the display_bar_led() function, we turn off all LEDs on 10 segment led bar graph by calling display_bar_led_off() which sends LOW on all LEDs using digitalWrite(). After that, we calculate a range value from volts. This value will be converted as total showing LEDs. display_bar_led_off(); int index = round(volts/0.33); Introduce pattern recognition for speech technology Pattern recognition is one of topic in machine learning and as baseline for speech recognition. In general, we can construct speech recognition system in the following figure: From human speech, we should convert it into digital form, called discrete data. Some signal processing methods are applied to handle pre-processing such as removing noise from data. Now in pattern recognition we do perform speech recognition method. Researchers did some approaches such as computing using Hidden Markov Model (HMM) to identity sound related to word. Performing feature extraction in speech digital data is a part of pattern recognition activities. The output will be used as input in pattern recognition input. The output of pattern recognition can be applied as Speech-to-Text and Speech command on our IoT projects. Reviewing speech and sound modules for IoT devices In this section, we review various speech and sound modules which can be integrated into our MCU board. There are a lot of modules related to speech and sound processing. Each module has unique features which fits with your works. One of speech and sound modules is EasyVR 3 & EasyVR Shield 3 from VeeaR. You can review this module on http://www.veear.eu/introducing-easyvr-3-easyvr-shield-3/. Several languages already have been supported such as English (US), Italian, German, French, Spanish, and Japanese. You can see EasyVR 3 module in the following figure: EasyVR 3 board also is available as a shield for Arduino. If you buy an EasyVR Shield 3, you will obtain EasyVR board and its Arduino shield. You can see the form of EasyVR Shield 3 on the following figure: The second module is Emic 2. It was designed by Parallax in conjunction with Grand Idea Studio, http:// www.grandideastudio.com/, to make voice synthesis a total no-brainer. You can send texts to the module to generate human speech through serial protocol. This module is useful if you want to make boards speak. Further information about this module, you can visit and buy this module on https://www.parallax.com/product/30016. The following is a form of Emic-2 module: Summary We have learned some basic sound and voice processing. We also explore several sound and speech modules to integrate into your IoT project. We built a program to read sound intensity level at the first. Resources for Article: Further resources on this subject: Introducing IoT with Particle's Photon and Electron [article] Web Typography [article] WebRTC in FreeSWITCH [article]
Read more
  • 0
  • 0
  • 25394
Modal Close icon
Modal Close icon