Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials - Languages

135 Articles
article-image-python-governance-vote-results-are-here-the-steering-council-model-is-the-winner
Prasad Ramesh
18 Dec 2018
3 min read
Save for later

Python governance vote results are here: The steering council model is the winner

Prasad Ramesh
18 Dec 2018
3 min read
The election to select the governance model for Python following the stepping down of Guido van Rossum as the BDFL earlier this year has ended and PEP 8016 was selected as the winner. PEP 8016 is the steering council model that has a focus on providing a minimal and solid foundation for governance decisions. The vote has chosen a governance PEP that will be implemented on the Python project. The winner: PEP 8016 the steering council model Authored by Nathaniel J. Smith, and Donald Stufft, this proposal involves a model for Python governance based on a steering council. The council has vast authority, which they intend to use as rarely as possible, instead, they plan to use this power to establish standard processes. The steering council committee consists of five people. A general philosophy is followed—it's better to split up large changes into a series of small changes to be reviewed independently. As opposed to trying to do everything in one PEP, the focus is on providing a minimal and solid foundation for future governance decisions. This PEP was accepted on December 17, 2018. Goals of the steering council model The main goals of this proposal are: Sticking to the basics aka ‘be boring’. The authors don't think Python is a good place to experiment with new and untested governance models. Hence, this proposal sticks to mature, well-known, processes that have been tested previously. A high-level approach where the council does not involve much very common in large successful open source projects. The low-level details are directly derived from Django's governance. Being simple enough for minimum viable governance. The proposal attempts to slim things down to the minimum required, just enough to make it workable. The trimming includes the council, the core team, and the process for changing documentation. A goal is to ‘be comprehensive’. The things that need to be defined are covered well for future use. Having a clear set of rules will also help minimize confusion. To ‘be flexible and light-weight’. The authors are aware that to find the best processes for working together, it will take time and experimentation. Hence, they keep the document as minimal as possible, for maximal flexibility to adjust things later. The need for heavy-weight processes like whole-project votes is also minimized. The council will work towards maintaining the quality of and stability of the Python language and the CPython interpreter. Make contribution process easy, maintain relations with the core team, establish a decision-making process for PEPs, and so on. They have powers to make decisions on PEPs, enforce project code of conduct, etc. To know more about the election to the committee visit the Python website. NumPy drops Python 2 support. Now you need Python 3.5 or later. NYU and AWS introduce Deep Graph Library (DGL), a python package to build neural network graphs Python 3.7.2rc1 and 3.6.8rc1 released
Read more
  • 0
  • 0
  • 20456

article-image-awk-programming-langauge
Pavan Ramchandani
17 May 2018
9 min read
Save for later

That '70s language: AWK programming

Pavan Ramchandani
17 May 2018
9 min read
AWK is an interpreted programming language designed for text processing and report generation. It is typically used for data manipulation, such as searching for items within data, performing arithmetic operations, and restructuring raw data for generating reports in most Unix-like operating systems. Today, we will explore the AWK philosophy and different types of AWK that exist, starting from its original implementation in 1977 at AT&T's Laboratories, Inc. We will also look at the various implementation areas of AWK in data science today. Using AWK programs, one can handle repetitive text-editing problems with very simple and short programs. It is a pattern-action language; it searches for patterns in a given input and, when a match is found, it performs the corresponding action. The pattern can be made of strings, regular expressions, comparison operations on numbers, fields, variables, and so on. It reads the input files and splits each input line of the file into fields automatically. AWK has most of the well-designed features that every programming language should contain. Its syntax particularly resembles that of the C programming language. It is named after its original three authors: Alfred V. Aho Peter J. Weinberger Brian W. Kernighan AWK is a very powerful, elegant, and simple that every person dealing with text processing should be familiar with. This article is an excerpt from a book written by Shiwang Kalkhanda, titled Learning AWK Programming. This book will introduce you to AWK programming language and get you hands-on working with practical implementation of AWK. Types of AWK The AWK language was originally implemented as an AWK utility on Unix. Today, most Linux distributions provide GNU implementation of AWK (GAWK), and a symlink for AWK is created from the original GAWK binary. The AWK utility can be categorized into the following three types, depending upon the type of interpreter it uses for executing AWK programs: AWK: This is the original AWK interpreter available from AT&T Laboratories. However, it is not used much nowadays and hence it might not be well-maintained. Its limitation is that it splits a line into a maximum 99 fields. It was updated and replaced in the mid-1980s with an enhanced version called New AWK (NAWK). NAWK: This is AT&T's latest development on the AWK interpreter. It is well-maintained by one of the original authors of AWK - Dr. Brian W. Kernighan. GAWK: This is the GNU project's implementation of the AWK programming language. All GNU/Linux distributions are shipped with GAWK by default and hence it is the most popular version of AWK. GAWK interpreter is fully compatible with AWK and NAWK. Beyond these, we also have other, less popular, AWK interpreters and translators, mentioned as follows. These variants are useful in operations when you want to translate your AWK program to C, C++, or Perl: MAWK: Michael Brennan interpreter for AWK. TAWK: Thompson Automation interpreter/compiler/Microsoft Windows DLL for AWK. MKSAWK: Mortice Kern Systems interpreter/compiler/for AWK. AWKCC: An AWK translator to C (might not be well-maintained). AWKC++: Brian Kernighan's AWK translator to C++ (experimental). It can be downloaded from: https://9p.io/cm/cs/who/bwk/awkc++.ps. AWK2C: An AWK translator to C. It uses GNU AWK libraries extensively. A2P: An AWK translator to Perl. It comes with Perl. AWKA: Yet another AWK translator to C (comes with the library), based on MAWK. It can be downloaded from: http://awka.sourceforge.net/download.html. When and where to use AWK AWK is simpler than any other utility for text processing and is available as the default on Unix-like operating systems. However, some people might say Perl is a superior choice for text processing, as AWK is functionally a subset of Perl, but the learning curve for Perl is steeper than that of AWK; AWK is simpler than Perl. AWK programs are smaller and hence quicker to execute. Anybody who knows the Linux command line can start writing AWK programs in no time. Here are a few use cases of AWK: Text processing Producing formatted text reports/labels Performing arithmetic operations on fields of a file Performing string operations on different fields of a file Programs written in AWK are smaller than they would be in other higher-level languages for similar text processing operations. AWK programs are interpreted on a GNU/Linux Terminal and thus avoid the compiling, debugging phase of software development in other languages. Getting started with installation This section describes how to set up the AWK environment on your GNU/Linux system, and we'll also discuss the workflow of AWK. Then, we'll look at different methods for executing AWK programs. Installation on Linux Generally, AWK is installed by default on most GNU/Linux distributions. Using the which command, you can check whether it is installed on your system or not. In case AWK is not installed on your system, you can do so in one of two ways: Using the package manager of the corresponding GNU/Linux system Compiling from the source code Let's take a look at each method in detail in the following sections. Using the package manager Different flavors of GNU/Linux distribution have different package-management utilities. If you are using a Debian-based GNU/Linux distribution, such as Ubuntu, Mint, or Debian, then you can install it using the Advance Package Tool (APT) package manager, as follows: [ shiwang@linux ~ ] $ sudo apt-get update -y [ shiwang@linux ~ ] $ sudo apt-get install gawk -y Similarly, to install AWK on an RPM-based GNU/Linux distribution, such as Fedora, CentOS, or RHEL, you can use the Yellowdog Updator Modified (YUM) package manager, as follows: [ root@linux ~ ] # yum update -y [ root@linux ~ ] # yum install gawk -y For installation of AWK on openSUSE, you can use the zypper (zypper command line) package-management utility, as follows: [ root@linux ~ ] # zypper update -y [ root@linux ~ ] # zypper install gawk -y Once the installation is finished, make sure AWK is accessible through the command line. We can check that using the which command, which will return the absolute path of AWK on our system: [ root@linux ~ ] # which awk /usr/bin/awk You can also use awk --version to find the AWK version on our system: [ root@linux ~ ] # awk --version Compiling from the source code Like every other open source utility, the GNU AWK source code is freely available for download as part of the GNU project. Previously, you saw how to install AWK using the package manager; now, you will see how to install AWK by compiling from its source code on the GNU/Linux distribution. The following steps are applicable to most of the GNU/Linux software for installation: Download the source code from a GNU project ftp site. Here, we will use the wget command line utility to download it, however you are free to choose any other program, such as curl, you feel comfortable with: [ shiwang@linux ~ ] $ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.3.tar.xz Extract the downloaded source code: [ shiwang@linux ~ ] $ tar xvf gawk-4.1.3.tar.xz Change your working directory and execute the configure file to configure the GAWK as per the working environment of your system: [ shiwang@linux ~ ] $ cd gawk-4.1.3 && ./configure Once the configure command completes its execution successfully, it will generate the make file. Now, compile the source code by executing the make command: [ shiwang@linux ~ ] $ make Type make install to install the programs and any data files and documentation. When installing into a prefix owned by root, it is recommended that the package be configured and built as a regular user, and only the make install phase is executed with root privileges: [ shiwang@linux ~ ] $ sudo make install Upon successful execution of these five steps, you have compiled and installed AWK on your GNU/Linux distribution. You can verify this by executing the which awk command in the Terminal or awk --version: [ root@linux ~ ] # which awk /usr/bin/awk Now you have a working AWK/GAWK installation and we are ready to begin AWK programming, but before that, our next section describes the workflow of the AWK interpreter. If you are running on macOS X, AWK, and not GAWK, would be installed as a default on it. For GAWK installation on macOS X, please refer to MacPorts for GAWK. Workflow of AWK Having a basic knowledge of the AWK interpreter workflow will help you to better understand AWK and will result in more efficient AWK program development. Hence, before getting your hands dirty with AWK programming, you need to understand its internals. The AWK workflow can be summarized as shown in the following figure: Let's take a look at each operation: READ OPERATION: AWK reads a line from the input stream (file, pipe, or stdin) and stores it in memory. It works on text input, which can be a file, the standard input stream, or from a pipe, which it further splits into records and fields: Records: An AWK record is a single, continuous data input that AWK works on. Records are bounded by a record separator, whose value is stored in the RS variable. The default value of RS is set to a newline character. So, the lines of input are considered records for the AWK interpreter. Records are read continuously until the end of the input is reached. Figure 1.2  shows how input data is broken into records and then goes further into how it is split into fields: Fields: Each record can further be broken down into individual chunks called fields. Like records, fields are bounded. The default field separator is any amount of whitespace, including tab and space characters. So by default, lines of input are further broken down into individual words separated by whitespace. You can refer to the fields of a record by a field number, beginning with 1. The last field in each record can be accessed by its number or with the NF special variable, which contains the number of fields in the current record, as shown in Figure 1.3: EXECUTE OPERATION: All AWK commands are applied sequentially on the input (records and fields). By default, AWK executes commands on each record/line. This behavior of AWK can be restricted by the use of patterns. REPEAT OPERATION: The process of read and execute is repeated until the end of the file is reached. The following flowchart depicts the workflow:   We introduced you to the AWK programming language and got ourselves a quick primer to get started with application development. If you found this post is useful, do check out the book Learning AWK Programming to learn more about the intricacies of AWK programming language for text processing. The oldest programming languages in use today What is the difference between functional and object oriented programming? Systems programming with Go in UNIX and Linux  
Read more
  • 0
  • 0
  • 20184

Packt
15 Mar 2017
13 min read
Save for later

About Java Virtual Machine – JVM Languages

Packt
15 Mar 2017
13 min read
In this article by Vincent van der Leun, the author of the book, Introduction to JVM Languages, you will learn the history of the JVM and five important languages that run on the JVM. (For more resources related to this topic, see here.) While many other programming languages have come in and gone out of the spotlight, Java always managed to return to impressive spots, either near to, and lately even on, the top of the list of the most used languages in the world. It didn't take language designers long to realize that they as well could run their languages on the JVM—the virtual machine that powers Java applications—and take advantage of its performance, features, and extensive class library. In this article, we will take a look at common JVM use cases and various JVM languages. The JVM was designed from the ground up to run anywhere. Its initial goal was to run on set-top boxes, but when Sun Microsystems found out the market was not ready in the mid '90s, they decided to bring the platform to desktop computers as well. To make all those use cases possible, Sun invented their own binary executable format and called it Java bytecode. To run programs compiled to Java bytecode, a Java Virtual Machine implementation must be installed on the system. The most popular JVM implementations nowadays are Oracle's free but partially proprietary implementation and the fully open source OpenJDK project (Oracle's Java runtime is largely based on OpenJDK). This article covers the following subjects: Popular JVM use cases Java language Scala language Clojure language Kotlin language Groovy The Java platform as published by Google on Android phones and tablets is not covered in this article. One of the reasons is that the Java version used on Android is still based on the Java 6 SE platform from 2006. However, some of the languages covered in this article can be used with Android. Kotlin, in particular, is a very popular choice for modern Android development. Popular use cases Since the JVM platform was designed with a lot of different use cases in mind, it will be no surprise that the JVM can be a very viable choice for very different scenarios. We will briefly look at the following use cases: Web applications Big data Internet of Things (IoT) Web applications With its focus on performance, the JVM is a very popular choice for web applications. When built correctly, applications can scale really well if needed across many different servers. The JVM is a well-understood platform, meaning that it is predictable and many tools are available to debug and profile problematic applications. Because of its open nature, the monitoring of JVM internals is also very well possible. For web applications that have to serve thousands of users concurrently, this is an important advantage. The JVM already plays a huge role in the cloud. Popular examples of companies that use the JVM for core parts of their cloud-based services include Twitter (famously using Scala), Amazon, Spotify, and Netflix. But the actual list is much larger. Big data Big data is a hot topic. When data is regarded too big for traditional databases to be analyzed, one can set up multiple clusters of servers that will process the data. Analyzing the data in this context can, for example, be searching for something specific, looking for patterns, and calculating statistics. This data could have been obtained from data collected from web servers (that, for example, logged visitor's clicks), output obtained from external sensors at a manufacturer plant, legacy servers that have been producing log files over many years, and so forth. Data sizes can vary wildly as well, but often, will take up multiple terabytes in total. Two popular technologies in the big data arena are: Apache Hadoop (provides storage of data and takes care of data distribution to other servers) Apache Spark (uses Hadoop to stream data and makes it possible to analyze the incoming data) Both Hadoop and Spark are for the most part written in Java. While both offer interfaces for a lot of programming languages and platforms, it will not be a surprise that the JVM is among them. The functional programming paradigm focuses on creating code that can run safely on multiple CPU cores, so languages that are fully specialized in this style, such as Scala or Clojure, are very appropriate candidates to be used with either Spark or Hadoop. Internet of Things - IoT Portable devices that feature internet connectivity are very common these days. Since Java was created with the idea of running on embedded devices from the beginning, the JVM is, yet again, at an advantage here. For memory constrained systems, Oracle offers Java Micro Edition Embedded platform. It is meant for commercial IoT devices that do not require a standard graphical or console-based user interface. For devices that can spare more memory, the Java SE Embedded edition is available. The Java SE Embedded version is very close to the Java Standard Edition discussed in this article. When running a full Linux environment, it can be used to provide desktop GUIs for full user interaction. Java SE Embedded is installed by default on Raspbian, the standard Linux distribution of the popular Raspberry Pi low-cost, credit card-sized computers. Both Java ME Embedded and Java SE Embedded can access the General Purpose input/output (GPIO) pins on the Raspberry Pi, which means that sensors and other peripherals connected to these ports can be accessed by Java code. Java Java is the language that started it all. Source code written in Java is generally easy to read and comprehend. It started out as a relatively simple language to learn. As more and more features were added to the language over the years, its complexity increased somewhat. The good news is that beginners don't have to worry about the more advanced topics too much, until they are ready to learn them. Programmers that want to choose a different JVM language from Java can still benefit from learning the Java syntax, especially once they start using libraries or frameworks that provide Javadocs as API documentation. Javadocs is a tool that generates HTML documentation based on special comments in the source code. Many libraries and frameworks provide the HTML documents generated by Javadocs as part of their documentation. While Java is not considered a pure Object Orientated Programming (OOP) language because of its support for primitive types, it is still a serious OOP language. Java is known for its verbosity, it has strict requirements for its syntax. A typical Java class looks like this: package com.example; import java.util.Date; public class JavaDemo { private Date dueDate = new Date(); public void getDueDate(Date dueDate) { this.dueDate = dueDate; } public Date getValue() { return this.dueDate; } } A real-world example would implement some other important additional methods that were omitted for readability. Note that declaring the dueDate variable, the Date class name has to be specified twice; first, when declaring the variable type and the second time, when instantiating an object of this class. Scala Scala is a rather unique language. It has a strong support for functional programming, while also being a pure object orientated programming language at the same time. While a lot more can be said about functional programming, in a nutshell, functional programming is about writing code in such a way that existing variables are not modified while the program is running. Values are specified as function parameters and output is generated based on their parameters. Functions are required to return the same output when specifying the same parameters on each call. A class is supposed to not hold internal states that can change over time. When data changes, a new copy of the object must be returned and all existing copies of the data must be left alone. When following the rules of functional programming, which requires a specific mindset of programmers, the code is safe to be executed on multiple threads on different CPU cores simultaneously. The Scala installation offers two ways of running Scala code. It offers an interactive shell where code can directly be entered and is run right away. This program can also be used to run Scala source code directly without manually first compiling it. Also offered is scalac, a traditional compiler that compiles Scala source code to Java bytecode and compiles to files with the .class extension. Scala comes with its own Scala Standard Library. It complements the Java Class Library that is bundled with the Java Runtime Environment (JRE) and installed as part of the Java Developers Kit (JDK). It contains classes that are optimized to work with Scala's language features. Among many other things, it implements its own collection classes, while still offering compatibility with Java's collections. Scala's equivalent of the code shown in the Java section would be something like the following: package com.example import java.util.Date class ScalaDemo(var dueDate: Date) { } Scala will generate the getter and setter methods automatically. Note that this class does not follow the rules of functional programming as the dueDate variable is mutable (it can be changed at any time). It would be better to define the class like this: class ScalaDemo(val dueDate: Date) { } By defining dueDate with the val keyword instead of the var keyword, the variable has become immutable. Now Scala only generates a getter method and the dueDate can only be set when creating an instance of ScalaDemo class. It will never change during the lifetime of the object. Clojure Clojure is a language that is rather different from the other languages covered in this article. It is a language largely inspired by the Lisp programming language, which originally dates from the late 1950s. Lisp stayed relevant by keeping up to date with technology and times. Today, Common Lisp and Scheme are arguably the two most popular Lisp dialects in use today and Clojure is influenced by both. Unlike Java and Scala, Clojure is a dynamic language. Variables do not have fixed types and when compiling, no type checking is performed by the compiler. When a variable is passed to a function that it is not compatible with the code in the function, an exception will be thrown at run time. Also noteworthy is that Clojure is not an object orientated language, unlike all other languages in this article. Clojure still offers interoperability with Java and the JVM as it can create instances of objects and can also generate class files that other languages on the JVM can use to run bytecode compiled by Clojure. Instead of demonstrating how to generate a class in Clojure, let's write a function in Clojure that would consume a javademo instance and print its dueDate: (defn consume-javademo-instance [d] (println (.getDueDate d))) This looks rather different from the other source code in this article. Code in Clojure is written by adding code to a list. Each open parenthesis and the corresponding closing parenthesis in the preceding code starts and ends a new list. The first entry in the list is the function that will be called, while the other entries of that list are its parameters. By nesting the lists, complex evaluations can be written. The defn macro defines a new function that will be called consume-javademo-instance. It takes one parameter, called d. This parameter should be the javademo instance. The list that follows is the body of the function, which prints the value of the getDueDate function of the passed javademo instance in the variable, d. Kotlin Like Java and Scala, Kotlin, is a statically typed language. Kotlin is mainly focused on object orientated programming but supports procedural programming as well, so the usage of classes and objects is not required. Kotlin's syntax is not compatible with Java; the code in Kotlin is much less verbose. It still offers a very strong compatibility with Java and the JVM platform. The Kotlin equivalent of the Java code would be as follows: import java.util.Date data class KotlinDemo(var dueDate: Date) One of the more noticeable features of Kotlin is its type system, especially its handling of null references. In many programming languages, a reference type variable can hold a null reference, which means that a reference literally points to nothing. When accessing members of such null reference on the JVM, the dreaded NullPointerException is thrown. When declaring variables in the normal way, Kotlin does not allow references to be assigned to null. If you want a variable that can be null, you'll have to add the question mark (?)to its definition: var thisDateCanBeNull: Date? = Date() When you now access the variable, you'll have to let the compiler know that you are aware that the variable can be null: if (thisDateCanBeNull != null) println("${thisDateCanBeNull.toString()}") Without the if check, the code would refuse to compile. Groovy Groovy was an early alternative language for the JVM. It offers, for a large degree, Java syntax compatibility, but the code in Groovy can be much more compact because many source code elements that are required in Java are optional in Groovy. Like Clojure and mainstream languages such as Python, Groovy is a dynamic language (with a twist, as we will discuss next). Unusually, while Groovy is a dynamic language (types do not have to be specified when defining variables), it still offers optional static compilation of classes. Since statically compiled code usually performs better than dynamic code, this can be used when the performance is important for a particular class. You'll give up some convenience when switching to static compilation, though. Some other differences with Java is that Groovy supports operator overloading. Because Groovy is a dynamic language, it offers some tricks that would be very hard to implement with Java. It comes with a huge library of support classes, including many wrapper classes that make working with the Java Class Library a much more enjoyable experience. A JavaDemo equivalent in Groovy would look as follows: @Canonical class GroovyDemo { Date dueDate } The @Canonical annotation is not necessary but recommended because it will generate some support methods automatically that are used often and required in many use cases. Even without it, Groovy will automatically generate the getter and setter methods that we had to define manually in Java. Summary We started by looking at the history of the Java Virtual Machine and studied some important use cases of the Java Virtual Machine: web applications, big data, and IoT (Internet of Things). We then looked at five important languages that run on the JVM: Java (a very readable, but also very verbose statically typed language), Scala (both a strong functional and OOP programming language), Clojure (a non-OOP functional programming language inspired by Lisp and Haskell), Kotlin (a statically typed language, that protects the programmer from very common NullPointerException errors), and Groovy (a dynamic language with static compiler support that offers a ton of features). Resources for Article: Further resources on this subject: Using Spring JMX within Java Applications [article] Tuning Solr JVM and Container [article] So, what is Play? [article]
Read more
  • 0
  • 0
  • 19452

article-image-8-recipes-to-master-promises-in-ecmascript-2018
Richa Tripathi
08 May 2018
17 min read
Save for later

8 recipes to master Promises in ECMAScript 2018

Richa Tripathi
08 May 2018
17 min read
What are Promises in ECMAScript? In earlier versions of JavaScript, the callback pattern was the most common way to organize asynchronous code. It got the job done, but it didn't scale well. With callbacks, as more asynchronous functions are added, the code becomes more deeply nested, and it becomes more difficult to add to, refactor, and understand the code. This situation is commonly known as callback hell. Promises were introduced to improve on this situation. Promises allow the relationships of asynchronous operations to be rearranged and organized with more freedom and flexibility. In this context, today we will learn about Promises and how to use it to create and organize asynchronous functions. We will also explore how to handle error conditions. Creating and waiting for Promises Promises provide a way to compose and combine asynchronous functions in an organized and easier to read way. This recipe demonstrates a very basic usage of promises. This recipe assumes that you already have a workspace that allows you to create and run ES modules in your browser for all the recipes given below: How to do it... Open your command-line application and navigate to your workspace. Create a new folder named 03-01-creating-and-waiting-for-promises. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise and logs messages before and after the promise is created, as well as while the promise is executing and after it has been resolved: // main.js export function main () { console.log('Before promise created'); new Promise(function (resolve) { console.log('Executing promise'); resolve(); }).then(function () { console.log('Finished promise'); }); console.log('After promise created'); } Start your Python web server and open the following link in your browser: http://localhost:8000/. You will see the following output: How it works... By looking at the order of the log messages, you can clearly see the order of operations. First, the initial log is executed. Next, the promise is created with an executor method. The executor method takes resolve as an argument. The resolve function fulfills the promise. Promises adhere to an interface named thenable. This means that we can chain then callbacks. The callback we attached with this method is executed after the resolve function is called. This function executes asynchronously (not immediately after the Promise has been resolved). Finally, there is a log after the promise has been created. The order the logs messages appear reveals the asynchronous nature of the code. All of the logs are seen in the order they appear in the code, except the Finished promise message. That function is executed asynchronously after the main function has exited! Resolving Promise results In the previous recipe, we saw how to use promises to execute asynchronous code. However, this code is pretty basic. It just logs a message and then calls resolve. Often, we want to use asynchronous code to perform some long-running operation, then return that value. This recipe demonstrates how to use resolve in order to return the result of a long-running operation. How to do it... Open your command-line application and navigate to your workspace.  Create a new folder named 3-02-resolving-promise-results. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise and logs messages before and after the promise is created: // main.js export function main () { console.log('Before promise created'); new Promise(function (resolve) { }); console.log('After promise created'); } Within the promise, resolve a random number after a 5-second timeout: new Promise(function (resolve) { setTimeout(function () { resolve(Math.random()); }, 5000); }) Chain a then call off the promise. Pass a function that logs out the value of its only argument: new Promise(function (resolve) { setTimeout(function () { resolve(Math.random()); }, 5000); }).then(function (result) { console.log('Long running job returned: %s', result); }); Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output: How it works... Just as in the previous recipe, the promise was not fulfilled until resolve was executed (this time after 5 seconds). This time however, we passed the called  resolve immediately with a random number for an argument. When this happens, the argument is provided to the callback for the subsequent then function. We'll see in future recipes how this can be continued to create promise chains. Rejecting Promise errors In the previous recipe, we saw how to use resolve to provide a result from a successfully fulfilled promise. Unfortunately, the code doesn't always run as expected. Network connections can be down, data can be corrupted, and uncountable other errors can occur. We need to be able to handle those situations as well. This recipe demonstrates how to use reject when errors arise. How to do it... Open your command-line application and navigate to your workspace. Create a new folder named 3-03-rejecting-promise-errors. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise, and logs messages before and after the promise is created and when the promise is fulfilled: new Promise(function (resolve) { resolve(); }).then(function (result) { console.log('Promise Completed'); }); Add a second argument to the promise callback named reject, and call reject with a new error: new Promise(function (resolve, reject) { reject(new Error('Something went wrong'); }).then(function (result) { console.log('Promise Completed'); }); Chain a catch call off the promise. Pass a function that logs out its only argument: new Promise(function (resolve, reject) { reject(new Error('Something went wrong'); }).then(function (result) { console.log('Promise Completed'); }).catch(function (error) { console.error(error); }); Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output: How it works... Previously we saw how to use resolve to return a value in the case of a successful fulfillment of a promise. In this case, we called reject before resolve. This means that the Promise finished with an error before it could resolve. When the Promise completes in an error state, the then callbacks are not executed. Instead we have to use catch in order to receive the error that the Promise rejects. You'll also notice that the catch callback is only executed after the main function has returned. Like successful fulfillment, listeners to unsuccessful ones execute asynchronously. See also Handle errors with Promise.catch Simulating finally with Promise.then Chaining Promises So far in this article, we've seen how to use promises to run single asynchronous tasks. This is helpful but doesn't provide a significant improvement over the callback pattern. The real advantage that promises offer comes when they are composed. In this recipe, we'll use promises to combine asynchronous functions in series. How to do it... Open your command-line application and navigate to your workspace. Create a new folder named 3-04-chaining-promises. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise. Resolve a random number from the promise: new Promise(function (resolve) { resolve(Math.random()); }); ); Chain a then call off of the promise. Return true from the callback if the random value is greater than or equal to 0.5: new Promise(function (resolve, reject) { resolve(Math.random()); }).then(function(value) { return value >= 0.5; }); Chain a final then call after the previous one. Log out a different message if the argument is true or false: new Promise(function (resolve, reject) { resolve(Math.random()); }).then(function (value) { return value >= 0.5; }).then(function (isReadyForLaunch) { if (isReadyForLaunch) { console.log('Start the countdown! '); } else { console.log('Abort the mission. '); } }); Start your Python web server and open the following link in your browser: http://localhost:8000/. If you are lucky, you'll see the following output: If you are unlucky, we'll see the following output: How it works... We've already seen how to use then to wait for the result of a promise. Here, we are doing the same thing multiple times in a row. This is called a promise chain. After the promise chain is started with the new promise, all of the subsequent links in the promise chain return promises as well. That is, the callback of each then function is resolve like another promise. See also Using Promise.all to resolve multiple Promises Handle errors with Promise.catch Simulating finally with a final Promise.then call Starting a Promise chain with Promise.resolve In this article's preceding recipes, we've been creating new promise objects with the constructor. This gets the jobs done, but it creates a problem. The first callback in the promise chain has a different shape than the subsequent callbacks. In the first callback, the arguments are the resolve and reject functions that trigger the subsequent then or catch callbacks. In subsequent callbacks, the returned value is propagated down the chain, and thrown errors are captured by catch callbacks. This difference adds mental overhead. It would be nice to have all of the functions in the chain behave in the same way. In this recipe, we'll see how to use Promise.resolve to start a promise chain. How to do it... Open your command-line application and navigate to your workspace. Create a new folder named 3-05-starting-with-resolve. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that calls Promise.resolve with an empty object as the first argument: export function main () { Promise.resolve({}) } Chain a then call off of resolve, and attach rocket boosters to the passed object: export function main () { Promise.resolve({}).then(function (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; }) } Add a final then call to the chain that lets you know when the boosters have been added: export function main () { Promise.resolve({}) .then(function (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; }) .then(function (rocket) { console.log('boosters attached'); console.log(rocket); }) } Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output: How it works... Promise.resolve creates a new promise that resolves the value passed to it. The subsequent then method will receive that resolved value as it's argument. This method can seem a little roundabout but can be very helpful for composing asynchronous functions. In effect, the constituents of the promise chain don't need to be aware that they are in the chain (including the first step). This makes transitioning from code that doesn't use promises to code that does much easier. Using Promise.all to resolve multiple promises So far, we've seen how to use promises to perform asynchronous operations in sequence. This is useful when the individual steps are long-running operations. However, this might not always be the more efficient configuration. Quite often, we can perform multiple asynchronous operations at the same time. In this recipe, we'll see how to use Promise.all to start multiple asynchronous operations, without waiting for the previous one to complete. How to do it... Open your command-line application and navigate to your workspace. Create a new folder named 3-06-using-promise-all. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates an object named rocket, and calls Promise.all with an empty array as the first argument: export function main() { console.log('Before promise created'); const rocket = {}; Promise.all([]) console.log('After promise created'); } Create a function named addBoosters that creates an object with boosters to an object: function addBoosters (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; } Create a function named performGuidanceDiagnostic that returns a promise of a successfully completed task: function performGuidanceDiagnostic (rocket) { console.log('performing guidance diagnostic'); return new Promise(function (resolve) { setTimeout(function () { console.log('guidance diagnostic complete'); rocket.guidanceDiagnostic = 'Completed'; resolve(rocket); }, 2000); }); } Create a function named loadCargo that adds a payload to the cargoBay: function loadCargo (rocket) { console.log('loading satellite'); rocket.cargoBay = [{ name: 'Communication Satellite' }] return rocket; } Use Promise.resolve to pass the rocket object to these functions within Promise.all: export function main() { console.log('Before promise created'); const rocket = {}; Promise.all([ Promise.resolve(rocket).then(addBoosters), Promise.resolve(rocket).then(performGuidanceDiagnostic), Promise.resolve(rocket).then(loadCargo) ]); console.log('After promise created'); } Attach a then call to the chain and log that the rocket is ready for launch: const rocket = {}; Promise.all([ Promise.resolve(rocket).then(addBoosters), Promise.resolve(rocket).then(performGuidanceDiagnostic), Promise.resolve(rocket).then(loadCargo) ]).then(function (results) { console.log('Rocket ready for launch'); console.log(results); }); Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output: How it works... Promise.all is similar to Promise.resolve; the arguments are resolved as promises. The difference is that instead of a single result, Promise.all accepts an iterable argument, each member of which is resolved individually. In the preceding example, you can see that each of the promises is initiated immediately. Two of them are able to complete while performGuidanceDiagnostic continues. The promise returned by Promise.all is fulfilled when all the constituent promises have been resolved. The results of the promises are combined into an array and propagated down the chain. You can see that three references to rocket are packed into the results argument. And you can see that the operations of each promise have been performed on the resulting object. There's more As you may have guessed, the results of the constituent promises don't have to return the same value. This can be useful, for example, when performing multiple independent network requests. The index of the result for each promise corresponds to the index of the operation within the argument to Promise.all. In these cases, it can be useful to use array destructuring to name the argument of the then callback: Promise.all([ findAstronomers, findAvailableTechnicians, findAvailableEquipment ]).then(function ([astronomers, technicians, equipment]) { // use results for astronomers, technicians, and equipment }); Handling errors with Promise.catch In a previous recipe, we saw how to fulfill a promise with an error state using reject, and we saw that this triggers the next catch callback in the promise chain. Because promises are relatively easy to compose, we need to be able to handle errors that are reported in different ways. Luckily promises are able to handle this seamlessly. In this recipe, we'll see how Promises.catch can handle errors that are reported by being thrown or through rejection. How to do it... Open your command-line application and navigate to your workspace.  Create a new folder named 3-07-handle-errors-promise-catch. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file with a main function that creates an object named rocket: export function main() { console.log('Before promise created'); const rocket = {}; console.log('After promise created'); } Create a function addBoosters that throws an error: function addBoosters (rocket) { throw new Error('Unable to add Boosters'); } Create a function performGuidanceDiagnostic that returns a promise that rejects an error: function performGuidanceDiagnostic (rocket) { return new Promise(function (resolve, reject) { reject(new Error('Unable to finish guidance diagnostic')); }); } Use Promise.resolve to pass the rocket object to these functions, and chain a catch off each of them: export function main() { console.log('Before promise created'); const rocket = {}; Promise.resolve(rocket).then(addBoosters) .catch(console.error); Promise.resolve(rocket).then(performGuidanceDiagnostic) .catch(console.error); console.log('After promise created'); } Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output: How it works... As we saw before, when a promise is fulfilled in a rejected state, the callback of the catch functions is triggered. In the preceding recipe, we see that this can happen when the reject method is called (as with performGuidanceDiagnostic). It also happens when a function in the chain throws an error (as will addBoosters). This has similar benefit to how Promise.resolve can normalize asynchronous functions. This handling allows asynchronous functions to not know about the promise chain, and announce error states in a way that is familiar to developers who are new to promises. This makes expanding the use of promises much easier. Simulating finally with the promise API In a previous recipe, we saw how catch can be used to handle errors, whether a promise has rejected, or a callback has thrown an error. Sometimes, it is desirable to execute code whether or not an error state has been detected. In the context of try/catch blocks, the finally block can be used for this purpose. We have to do a little more work to get the same behavior when working with promises In this recipe, we'll see how a final then call to execute some code in both successful and failing fulfillment states. How to do it... Open your command-line application and navigate to your workspace.  Create a new folder named 3-08-simulating-finally. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file with a main function that logs out messages for before and after promise creation: export function main() { console.log('Before promise created'); console.log('After promise created'); } Create a function named addBoosters that throws an error if its first parameter is false: function addBoosters(shouldFail) { if (shouldFail) { throw new Error('Unable to add Boosters'); } return { boosters: [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }] }; } Use Promise.resolve to pass a Boolean value that is true if a random number is greater than 0.5 to addBoosters: export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) console.log('After promise created'); } Add a then function to the chain that logs a success message: export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) console.log('After promise created'); } Add a catch to the chain and log out the error if thrown: export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) .catch(console.error) console.log('After promise created'); } Add a then after the catch, and log out that we need to make an announcement: export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) .catch(console.error) .then(() => console.log('Time to inform the press.')); console.log('After promise created'); } Start your Python web server and open the following link in your browser: http://localhost:8000/. If you are lucky and the boosters are added successfully, you'll see the following output: 12. If you are unlucky, you'll see an error message like the following: How it works... We can see in the preceding output that whether or not the asynchronous function completes in an error state, the last then callback is executed. This is possible because the catch method doesn't stop the promise chain. It simply catches any error states from the previous links in the chain, and then propagates a new value forward. The final then is then protected from being bypassed by an error state by this catch. And so, regardless of the fulfillment state of prior links in the chain, we can be sure that the callback of this final then will be executed. To summarize, we learned how to use the Promise API to organize asynchronous programs. We also looked at how to propagate results through promise chains and handle errors.  You read an excerpt from a book written by Ross Harrison, titled ECMAScript Cookbook. It’s a complete guide on how to become a better web programmer by writing efficient and modular code using ES6 and ES8. What’s new in ECMAScript 2018 (ES9)? ECMAScript 7 – What to expect? Modular Programming in ECMAScript 6  
Read more
  • 0
  • 0
  • 19228

Packt
23 Dec 2013
5 min read
Save for later

GLSL – How to Set up the Shaders from the Host Application Side

Packt
23 Dec 2013
5 min read
(For more resources related to this topic, see here.) Setting up geometry Let's say that our mesh (a quad formed by two triangles) has the following information: vertex positions and texture coordinates. Also, we will arrange the data interleaved in the array. struct MyVertex { float x, y, z; float s, t; }; MyVertex geometry[] =  {{0,1,0,0,1}, {0,0,0,0,0}, {1,1,0,1,1},{1,0,0,1,0}}; // Let's create the objects that will encapsulate our geometry data GLuint vaoID, vboID; glGenVertexArray(1, &vaoID); glBindVertexArray(vaoID); glGenBuffers(1, &vboID); glBindBuffer(GL_ARRAY_BUFFER, vboID); // Attach our data to the OpenGL objects glBufferData(GL_ARRAY_BUFFER, 4 * sizeof(MyVertex), &geometry[0].x,GL_DYNAMIC_DRAW); // Specify the format of each vertex attribute glEnableVertexAttribArray(0); glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(MyVertex), NULL); glEnableVertexAttribArray(1); glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(MyVertex),(void*)(sizeof(float)*3)); At this point, we created the OpenGL objects, set up correctly each vertex attribute format and uploaded the data to GPU. Setting up textures Setting up textures follows the same pattern. First create the OpenGL objects, then fill the data and the format in which it is provided. const int width = 512; const int height = 512; const int bpp = 32; struct RGBColor { unsigned char R,G,B,A; }; RGBColor textureData[width * height]; for(size_t y = 0; y < height; ++y) for(size_t x = 0; x < width; ++x)                textureData[y*height+x] = …; //fill your texture here // Create GL object GLuint texID; glGenTextures(1, &texID); glBindTexture(GL_TEXTURE_2D, texID); // Fill up the data and set the texel format glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA,GL_UNSIGNED_BYTE, &textureData[0].R); // Set texture format data: interpolation and clamping modes glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT); And that's all about textures. Later we will use the texture ID to place the texture into a slot, and that slot number is the information that we will pass to the shader to tell it where the texture is placed in order to locate it. Setting up shaders In order to setup the shaders, we have to carry out some steps:  load up the source code, compile it and associate to a shader object, and link all shaders together into a program object. char* vs[1]; vs[0] = "#version 430\nlayout (location = 0) in vec3 PosIn;layout (location = 1) in vec2 TexCoordIn;smooth out vec2 TexCoordOut;uniform mat4 MVP; void main() { TexCoordOut = TexCoordIn;gl_Position = MVP * vec4(PosIn, 1.0);}"; char fs[1]; fs = "#version 430\n uniform sampler2D Image; smooth in vec2 TexCoord;out vec4 FBColor;void main() {FBColor = texture(Image, TexCoord);}"; // Upload source code and compile it GLuint pId, vsId, fsId; vsId = glCreateShader(GL_VERTEX_SHADER); glShaderSource(vsId, 1, (const char**)&vs, NULL); glCompileShader(vsId); // Check for compilation errors GLint status = 0, bufferLength = 0; glGetShaderiv(vsId, GL_COMPILE_STATUS, &status); if(!status) { char* infolog = new char[bufferLength + 1]; glGetShaderiv(vsId, GL_INFO_LOG_LENGTH, &bufferLength); glGetShaderInfoLog(vsId, bufferLength, NULL, infolog); infolog[bufferLength] = 0; printf("Shader compile errors / warnings: %s\n", infolog); delete [] infolog; } The process for the fragment shader is exactly the same. The only change is that the shader object must be created as fsId = glCreateShader(GL_FRAGMENT_SHADER); // Now let's proceed to link the shaders into the program object pId = glCreateProgram(); glAttachShader(pId, vsId); glAttachShader(pId, fsId); glLinkProgram(pId); glGetProgramiv(pId, GL_LINK_STATUS, &status); if(!status) { char* infolog = new char[bufferLength + 1]; glGetProgramiv(pId, GL_INFO_LOG_LENGTH, &bufferLength); infolog[bufferLength] = 0; printf("Shader linking errors / warnings: %s\n", infolog); delete [] infolog; } // We do not need the vs and fs anymore, so it is same mark them for deletion. glDeleteShader(vsId); glDeleteShader(fsId); The last things to upload to the shader are two uniform variables: the one that corresponds with the view-projection matrix and the one that represents the texture. Those are uniform variables and are set in the following way: // First bind the program object where we want to upload the variables glUseProgram(pId); // Obtain the "slot number" where the uniform is located in GLint location = glGetUniformLocation(pId, "MVP"); float mvp[16] = {1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1}; // Set the data into the uniform's location glUniformMatrix4fv(location, 1, GL_FALSE, mvp); // Active the texture slot 0 glActiveTexture(GL_TEXTURE0); // Bind the texture to the active slot glBindTexture(GL_TEXTURE_2D, texID); location = glGetUniformLocation(pId, "Image"); // Upload the texture's slot number to the uniform variable int imageSlot = 0; glUniform1i(location, imageSlot); And that's all. For the other types of shaders, process is all the same: Create shader object, upload source code, compile and link, but using the proper OpenGL types such as GL_GEOMETRY_SHADER or GL_COMPUTE_SHADER. A last step, to draw all these things, is to establish them as active and issue the draw call: glBindVertexArray(vaoID); glBindTexture(GL_TEXTURE_2D, texID); glUseProgram(pId); glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); Resources for Article: Further resources on this subject: The Basics of GLSL 4.0 Shaders [Article] GLSL 4.0: Using Subroutines to Select Shader Functionality [Article] Getting Started with GLSL [Article]
Read more
  • 0
  • 0
  • 19166

article-image-new-functionality-opencv-30
Packt
25 Aug 2014
5 min read
Save for later

New functionality in OpenCV 3.0

Packt
25 Aug 2014
5 min read
In this article by Oscar Deniz Suarez, coauthor of the book OpenCV Essentials, we will cover the forthcoming Version 3.0, which represents a major evolution of the OpenCV library for Computer Vision. Currently, OpenCV already includes several new techniques that are not available in the latest official release (2.4.9). The new functionality can be already used by downloading and compiling the latest development version from the official repository. This article provides an overview of some of the new techniques implemented. Other numerous lower-level changes in the forthcoming Version 3.0 (updated module structure, C++ API changes, transparent API for GPU acceleration, and so on) are not discussed. (For more resources related to this topic, see here.) Line Segment Detector OpenCV users have had the Hough transform-based straight line detector available in the previous versions. An improved method called Line Segment Detector (LSD) is now available. LSD is based on the algorithm described at http://dx.doi.org/10.5201/ipol.2012.gjmr-lsd. This method has been shown to be more robust and faster than the best previous Hough-based detector (the Progressive Probabilistic Hough Transform). The detector is now part of the imgproc module. OpenCV provides a short sample code ([opencv_source_code]/samples/cpp/lsd_lines.cpp), which shows how to use the LineSegmentDetector class. The following table shows the main components of the class: Method Function <constructor> The constructor allows to enter parameters of the algorithm; particularly; the level of refinements we want in the result detect This method detects line segments in the image drawSegments This method draws the segments in a given image compareSegments This method draws two sets of segments in a given image. The two sets are drawn with blue and red color lines Connected components The previous versions of OpenCV have included functions for working with image contours. Contours are the external limits of connected components (that is, regions of connected pixels in a binary image). The new functions, connectedComponents and connectedComponentsWithStats retrieve connected components as such. The connected components are retrieved as a labeled image with the same dimensions as the input image. This allows drawing the components on the original image easily. The connectedComponentsWithStats function retrieves useful statistics about each component shown in the following table: CC_STAT_LEFT  The leftmost (x) coordinate, which is the inclusive start of the bounding box in the horizontal direction CC_STAT_TOP  The topmost (y) coordinate, which is the inclusive start of the bounding box in the vertical direction CC_STAT_WIDTH  The horizontal size of the bounding box CC_STAT_HEIGHT  The vertical size of the bounding box CC_STAT_AREA  The total area (in pixels) of the connected component Scene text detection Text recognition is a classic problem in Computer Vision. Thus, Optical Character Recognition (OCR) is now routinely used in our society. In OCR, the input image is expected to depict typewriter black text over white background. In the last years, researchers aim at the more challenging problem of recognizing text "in the wild" on street signs, indoor signs, with diverse backgrounds and fonts, colors, and so on. The following figure shows and example of the difference between the two scenarios. In this scenario, OCR cannot be applied to the input images. Consequently, text recognition is actually accomplished in two steps. The text is first localized in the image and then character or word recognition is performed on the cropped region. OpenCV now provides a scene text detector based on the algorithm described in Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012 (Providence, Rhode Island, USA). The implementation of OpenCV makes use of additional improvements found at http://158.109.8.37/files/GoK2013.pdf. OpenCV includes an example ([opencv_source_code]/samples/cpp/textdetection.cpp) that detects and draws text regions in an input image. The KAZE and AKAZE features Several 2D features have been proposed in the computer vision literature. Generally, the two most important aspects in feature extraction algorithms are computational efficiency and robustness. One of the latest contenders is the KAZE (Japanese word meaning "Wind") and Accelerated-KAZE (AKAZE) detector. There are reports that show that KAZE features are both robust and efficient, as compared with other widely-known features (BRISK, FREAK, and so on). The underlying algorithm is described in KAZE Features, Pablo F. Alcantarilla, Adrien Bartoli, and Andrew J. Davison, in European Conference on Computer Vision (ECCV), Florence, Italy, October 2012. As with other keypoint detectors in OpenCV, the KAZE implementation allows retrieving both keypoints and descriptors (that is, a feature vector computed around the keypoint neighborhood). The detector follows the same framework used in OpenCV for other detectors, so drawing methods are also available. Computational photography One of the modules with most improvements in the forthcoming Version 3.0 is the computational photography module (photo). The new techniques include the functionalities mentioned in the following table: Functionality Description HDR imaging Functions for handling High-Dynamic Range images (tonemapping, exposure alignment, camera calibration with multiple exposures, and exposure fusion) Seamless cloning Functions for realistically inserting one image into other image with an arbitrary-shape region of interest. Non-photorealistic rendering This technique includes non-photorealistic filters (such as pencil-like drawing effect) and edge-preserving smoothing filters (those are similar to the bilateral filter). New modules Finally, we provide a list with the new modules in development for version 3.0: Module name Description videostab Global motion estimation, Fast Marching method softcascade Implements a stageless variant of the cascade detector, which is considered more accurate shape Shape matching and retrieval. Shape context descriptor and matching algorithm, Hausdorff distance and Thin-Plate Splines cuda<X> Several modules with CUDA-accelerated implementations of other functions in the library Summary In this article, we learned about the different functionalities in OpenCV 3.0 and its different components. Resources for Article: Further resources on this subject: Wrapping OpenCV [article] A quick start – OpenCV fundamentals [article] Linking OpenCV to an iOS project [article]
Read more
  • 0
  • 0
  • 18965
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-basics-programming-julia
Packt
03 Mar 2015
17 min read
Save for later

Basics of Programming in Julia

Packt
03 Mar 2015
17 min read
 In this article by Ivo Balbaert, author of the book Getting Started with Julia Programming, we will explore how Julia interacts with the outside world, reading from standard input and writing to standard output, files, networks, and databases. Julia provides asynchronous networking I/O using the libuv library. We will see how to handle data in Julia. We will also discover the parallel processing model of Julia. In this article, the following topics are covered: Working with files (including the CSV files) Using DataFrames (For more resources related to this topic, see here.) Working with files To work with files, we need the IOStream type. IOStream is a type with the supertype IO and has the following characteristics: The fields are given by names(IOStream) 4-element Array{Symbol,1}:  :handle   :ios    :name   :mark The types are given by IOStream.types (Ptr{None}, Array{Uint8,1}, String, Int64) The file handle is a pointer of the type Ptr, which is a reference to the file object. Opening and reading a line-oriented file with the name example.dat is very easy: // code in Chapter 8io.jl fname = "example.dat"                                 f1 = open(fname) fname is a string that contains the path to the file, using escaping of special characters with when necessary; for example, in Windows, when the file is in the test folder on the D: drive, this would become d:\test\example.dat. The f1 variable is now an IOStream(<file example.dat>) object. To read all lines one after the other in an array, use data = readlines(f1), which returns 3-element Array{Union(ASCIIString,UTF8String),1}: "this is line 1.rn" "this is line 2.rn" "this is line 3." For processing line by line, now only a simple loop is needed: for line in data   println(line) # or process line end close(f1) Always close the IOStream object to clean and save resources. If you want to read the file into one string, use readall. Use this only for relatively small files because of the memory consumption; this can also be a potential problem when using readlines. There is a convenient shorthand with the do syntax for opening a file, applying a function process, and closing it automatically. This goes as follows (file is the IOStream object in this code): open(fname) do file     process(file) end The do command creates an anonymous function, and passes it to open. Thus, the previous code example would have been equivalent to open(process, fname). Use the same syntax for processing a file fname line by line without the memory overhead of the previous methods, for example: open(fname) do file     for line in eachline(file)         print(line) # or process line     end end Writing a file requires first opening it with a "w" flag, then writing strings to it with write, print, or println, and then closing the file handle that flushes the IOStream object to the disk: fname =   "example2.dat" f2 = open(fname, "w") write(f2, "I write myself to a filen") # returns 24 (bytes written) println(f2, "even with println!") close(f2) Opening a file with the "w" option will clear the file if it exists. To append to an existing file, use "a". To process all the files in the current folder (or a given folder as an argument to readdir()), use this for loop: for file in readdir()   # process file end Reading and writing CSV files A CSV file is a comma-separated file. The data fields in each line are separated by commas "," or another delimiter such as semicolons ";". These files are the de-facto standard for exchanging small and medium amounts of tabular data. Such files are structured so that one line contains data about one data object, so we need a way to read and process the file line by line. As an example, we will use the data file Chapter 8winequality.csv that contains 1,599 sample measurements, 12 data columns, such as pH and alcohol per sample, separated by a semicolon. In the following screenshot, you can see the top 20 rows:   In general, the readdlm function is used to read in the data from the CSV files: # code in Chapter 8csv_files.jl: fname = "winequality.csv" data = readdlm(fname, ';') The second argument is the delimiter character (here, it is ;). The resulting data is a 1600x12 Array{Any,2} array of the type Any because no common type could be found:     "fixed acidity"   "volatile acidity"      "alcohol"   "quality"      7.4                        0.7                                9.4              5.0      7.8                        0.88                              9.8              5.0      7.8                        0.76                              9.8              5.0   … If the data file is comma separated, reading it is even simpler with the following command: data2 = readcsv(fname) The problem with what we have done until now is that the headers (the column titles) were read as part of the data. Fortunately, we can pass the argument header=true to let Julia put the first line in a separate array. It then naturally gets the correct datatype, Float64, for the data array. We can also specify the type explicitly, such as this: data3 = readdlm(fname, ';', Float64, 'n', header=true) The third argument here is the type of data, which is a numeric type, String or Any. The next argument is the line separator character, and the fifth indicates whether or not there is a header line with the field (column) names. If so, then data3 is a tuple with the data as the first element and the header as the second, in our case, (1599x12 Array{Float64,2}, 1x12 Array{String,2}) (There are other optional arguments to define readdlm, see the help option). In this case, the actual data is given by data3[1] and the header by data3[2]. Let's continue working with the variable data. The data forms a matrix, and we can get the rows and columns of data using the normal array-matrix syntax). For example, the third row is given by row3 = data[3, :] with data:  7.8  0.88  0.0  2.6  0.098  25.0  67.0  0.9968  3.2  0.68  9.8  5.0, representing the measurements for all the characteristics of a certain wine. The measurements of a certain characteristic for all wines are given by a data column, for example, col3 = data[ :, 3] represents the measurements of citric acid and returns a column vector 1600-element Array{Any,1}:   "citric acid" 0.0  0.0  0.04  0.56  0.0  0.0 …  0.08  0.08  0.1  0.13  0.12  0.47. If we need columns 2-4 (volatile acidity to residual sugar) for all wines, extract the data with x = data[:, 2:4]. If we need these measurements only for the wines on rows 70-75, get these with y = data[70:75, 2:4], returning a 6 x 3 Array{Any,2} outputas follows: 0.32   0.57  2.0 0.705  0.05  1.9 … 0.675  0.26  2.1 To get a matrix with the data from columns 3, 6, and 11, execute the following command: z = [data[:,3] data[:,6] data[:,11]] It would be useful to create a type Wine in the code. For example, if the data is to be passed around functions, it will improve the code quality to encapsulate all the data in a single data type, like this: type Wine     fixed_acidity::Array{Float64}     volatile_acidity::Array{Float64}     citric_acid::Array{Float64}     # other fields     quality::Array{Float64} end Then, we can create objects of this type to work with them, like in any other object-oriented language, for example, wine1 = Wine(data[1, :]...), where the elements of the row are splatted with the ... operator into the Wine constructor. To write to a CSV file, the simplest way is to use the writecsv function for a comma separator, or the writedlm function if you want to specify another separator. For example, to write an array data to a file partial.dat, you need to execute the following command: writedlm("partial.dat", data, ';') If more control is necessary, you can easily combine the more basic functions from the previous section. For example, the following code snippet writes 10 tuples of three numbers each to a file: // code in Chapter 8tuple_csv.jl fname = "savetuple.csv" csvfile = open(fname,"w") # writing headers: write(csvfile, "ColName A, ColName B, ColName Cn") for i = 1:10   tup(i) = tuple(rand(Float64,3)...)   write(csvfile, join(tup(i),","), "n") end close(csvfile) Using DataFrames If you measure n variables (each of a different type) of a single object of observation, then you get a table with n columns for each object row. If there are m observations, then we have m rows of data. For example, given the student grades as data, you might want to know "compute the average grade for each socioeconomic group", where grade and socioeconomic group are both columns in the table, and there is one row per student. The DataFrame is the most natural representation to work with such a (m x n) table of data. They are similar to pandas DataFrames in Python or data.frame in R. A DataFrame is a more specialized tool than a normal array for working with tabular and statistical data, and it is defined in the DataFrames package, a popular Julia library for statistical work. Install it in your environment by typing in Pkg.add("DataFrames") in the REPL. Then, import it into your current workspace with using DataFrames. Do the same for the packages DataArrays and RDatasets (which contains a collection of example datasets mostly used in the R literature). A common case in statistical data is that data values can be missing (the information is not known). The DataArrays package provides us with the unique value NA, which represents a missing value, and has the type NAtype. The result of the computations that contain the NA values mostly cannot be determined, for example, 42 + NA returns NA. (Julia v0.4 also has a new Nullable{T} type, which allows you to specify the type of a missing value). A DataArray{T} array is a data structure that can be n-dimensional, behaves like a standard Julia array, and can contain values of the type T, but it can also contain the missing (Not Available) values NA and can work efficiently with them. To construct them, use the @data macro: // code in Chapter 8dataarrays.jl using DataArrays using DataFrames dv = @data([7, 3, NA, 5, 42]) This returns 5-element DataArray{Int64,1}: 7  3   NA  5 42. The sum of these numbers is given by sum(dv) and returns NA. One can also assign the NA values to the array with dv[5] = NA; then, dv becomes [7, 3, NA, 5, NA]). Converting this data structure to a normal array fails: convert(Array, dv) returns ERROR: NAException. How to get rid of these NA values, supposing we can do so safely? We can use the dropna function, for example, sum(dropna(dv)) returns 15. If you know that you can replace them with a value v, use the array function: repl = -1 sum(array(dv, repl)) # returns 13 A DataFrame is a kind of an in-memory database, versatile in the ways you can work with the data. It consists of columns with names such as Col1, Col2, Col3, and so on. Each of these columns are DataArrays that have their own type, and the data they contain can be referred to by the column names as well, so we have substantially more forms of indexing. Unlike two-dimensional arrays, columns in a DataFrame can be of different types. One column might, for instance, contain the names of students and should therefore be a string. Another column could contain their age and should be an integer. We construct a DataFrame from the program data as follows: // code in Chapter 8dataframes.jl using DataFrames # constructing a DataFrame: df = DataFrame() df[:Col1] = 1:4 df[:Col2] = [e, pi, sqrt(2), 42] df[:Col3] = [true, false, true, false] show(df) Notice that the column headers are used as symbols. This returns the following 4 x 3 DataFrame object: We could also have used the full constructor as follows: df = DataFrame(Col1 = 1:4, Col2 = [e, pi, sqrt(2), 42],    Col3 = [true, false, true, false]) You can refer to the columns either by an index (the column number) or by a name, both of the following expressions return the same output: show(df[2]) show(df[:Col2]) This gives the following output: [2.718281828459045, 3.141592653589793, 1.4142135623730951,42.0] To show the rows or subsets of rows and columns, use the familiar splice (:) syntax, for example: To get the first row, execute df[1, :]. This returns 1x3 DataFrame.  | Row | Col1 | Col2    | Col3 |  |-----|------|---------|------|  | 1   | 1    | 2.71828 | true | To get the second and third row, execute df [2:3, :] To get only the second column from the previous result, execute df[2:3, :Col2]. This returns [3.141592653589793, 1.4142135623730951]. To get the second and third column from the second and third row, execute df[2:3, [:Col2, :Col3]], which returns the following output: 2x2 DataFrame  | Row | Col2    | Col3  |  |---- |-----   -|-------|  | 1   | 3.14159 | false |  | 2   | 1.41421 | true  | The following functions are very useful when working with DataFrames: The head(df) and tail(df) functions show you the first six and the last six lines of data respectively. The names function gives the names of the columns names(df). It returns 3-element Array{Symbol,1}:  :Col1  :Col2  :Col3. The eltypes function gives the data types of the columns eltypes(df). It gives the output as 3-element Array{Type{T<:Top},1}:  Int64  Float64  Bool. The describe function tries to give some useful summary information about the data in the columns, depending on the type, for example, describe(df) gives for column 2 (which is numeric) the min, max, median, mean, number, and percentage of NAs: Col2 Min      1.4142135623730951 1st Qu.  2.392264761937558  Median   2.929937241024419 Mean     12.318522011105483  3rd Qu.  12.856194490192344  Max      42.0  NAs      0  NA%      0.0% To load in data from a local CSV file, use the method readtable. The returned object is of type DataFrame: // code in Chapter 8dataframes.jl using DataFrames fname = "winequality.csv" data = readtable(fname, separator = ';') typeof(data) # DataFrame size(data) # (1599,12) Here is a fraction of the output: The readtable method also supports reading in gzipped CSV files. Writing a DataFrame to a file can be done with the writetable function, which takes the filename and the DataFrame as arguments, for example, writetable("dataframe1.csv", df). By default, writetable will use the delimiter specified by the filename extension and write the column names as headers. Both readtable and writetable support numerous options for special cases. Refer to the docs for more information (refer to http://dataframesjl.readthedocs.org/en/latest/). To demonstrate some of the power of DataFrames, here are some queries you can do: Make a vector with only the quality information data[:quality] Give the wines with alcohol percentage equal to 9.5, for example, data[ data[:alcohol] .== 9.5, :] Here, we use the .== operator, which does element-wise comparison. data[:alcohol] .== 9.5 returns an array of Boolean values (true for datapoints, where :alcohol is 9.5, and false otherwise). data[boolean_array, : ] selects those rows where boolean_array is true. Count the number of wines grouped by quality with by(data, :quality, data -> size(data, 1)), which returns the following: 6x2 DataFrame | Row | quality | x1  | |-----|---------|-----| | 1    | 3      | 10  | | 2    | 4      | 53  | | 3    | 5      | 681 | | 4    | 6      | 638 | | 5    | 7      | 199 | | 6    | 8      | 18  | The DataFrames package contains the by function, which takes in three arguments: A DataFrame, here it takes data A column to split the DataFrame on, here it takes quality A function or an expression to apply to each subset of the DataFrame, here data -> size(data, 1), which gives us the number of wines for each quality value Another easy way to get the distribution among quality is to execute the histogram hist function hist(data[:quality]) that gives the counts over the range of quality (2.0:1.0:8.0,[10,53,681,638,199,18]). More precisely, this is a tuple with the first element corresponding to the edges of the histogram bins, and the second denoting the number of items in each bin. So there are, for example, 10 wines with quality between 2 and 3, and so on. To extract the counts as a variable count of type Vector, we can execute _, count = hist(data[:quality]); the _ means that we neglect the first element of the tuple. To obtain the quality classes as a DataArray class, we will execute the following: class = sort(unique(data[:quality])) We can now construct a df_quality DataFrame with the class and count columns as df_quality = DataFrame(qual=class, no=count). This gives the following output: 6x2 DataFrame | Row | qual | no  | |-----|------|-----| | 1   | 3    | 10  | | 2   | 4    | 53  | | 3   | 5    | 681 | | 4   | 6    | 638 | | 5   | 7    | 199 | | 6   | 8    | 18  | To deepen your understanding and learn about the other features of Julia DataFrames (such as joining, reshaping, and sorting), refer to the documentation available at http://dataframesjl.readthedocs.org/en/latest/. Other file formats Julia can work with other human-readable file formats through specialized packages: For JSON, use the JSON package. The parse method converts the JSON strings into Dictionaries, and the json method turns any Julia object into a JSON string. For XML, use the LightXML package For YAML, use the YAML package For HDF5 (a common format for scientific data), use the HDF5 package For working with Windows INI files, use the IniFile package Summary In this article we discussed the basics of network programming in Julia. Resources for Article: Further resources on this subject: Getting Started with Electronic Projects? [article] Getting Started with Selenium Webdriver and Python [article] Handling The Dom In Dart [article]
Read more
  • 0
  • 0
  • 18945

article-image-command-line-tools
Packt
06 Jul 2017
9 min read
Save for later

Command-Line Tools

Packt
06 Jul 2017
9 min read
In this article by Aaron Torres, author of the book, Go Cookbook, we will cover the following recipes: Using command-line arguments Working with Unix pipes An ANSI coloring application (For more resources related to this topic, see here.) Using command-line arguments This article will expand on other uses for these arguments by constructing a command that supports nested subcommands. This will demonstrate Flagsets and also using positional arguments passed into your application. This recipe requires a main function to run. There are a number of third-party packages for dealing with complex nested arguments and flags, but we'll again investigate doing so using only the standard library. Getting ready You need to perform the following steps for the installation: Download and install Go on your operating system at https://golang.org/doc/install and configure your GOPATH. Open a terminal/console application. Navigate to your GOPATH/src and create a project directory, for example, $GOPATH/src/github.com/yourusername/customrepo. All code will be run and modified from this directory. Optionally, install the latest tested version of the code using the go get github.com/agtorre/go-cookbook/ command. How to do it... From your terminal/console application, create and navigate to the chapter2/cmdargs directory. Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/cmdargs or use this as an exercise to write some of your own. Create a file called cmdargs.go with the following content: package main import ( "flag" "fmt" "os" ) const version = "1.0.0" const usage = `Usage: %s [command] Commands: Greet Version ` const greetUsage = `Usage: %s greet name [flag] Positional Arguments: name the name to greet Flags: ` // MenuConf holds all the levels // for a nested cmd line argument type MenuConf struct { Goodbye bool } // SetupMenu initializes the base flags func (m *MenuConf) SetupMenu() *flag.FlagSet { menu := flag.NewFlagSet("menu", flag.ExitOnError) menu.Usage = func() { fmt.Printf(usage, os.Args[0]) menu.PrintDefaults() } return menu } // GetSubMenu return a flag set for a submenu func (m *MenuConf) GetSubMenu() *flag.FlagSet { submenu := flag.NewFlagSet("submenu", flag.ExitOnError) submenu.BoolVar(&m.Goodbye, "goodbye", false, "Say goodbye instead of hello") submenu.Usage = func() { fmt.Printf(greetUsage, os.Args[0]) submenu.PrintDefaults() } return submenu } // Greet will be invoked by the greet command func (m *MenuConf) Greet(name string) { if m.Goodbye { fmt.Println("Goodbye " + name + "!") } else { fmt.Println("Hello " + name + "!") } } // Version prints the current version that is // stored as a const func (m *MenuConf) Version() { fmt.Println("Version: " + version) } Create a file called main.go with the following content: package main import ( "fmt" "os" "strings" ) func main() { c := MenuConf{} menu := c.SetupMenu() menu.Parse(os.Args[1:]) // we use arguments to switch between commands // flags are also an argument if len(os.Args) > 1 { // we don't care about case switch strings.ToLower(os.Args[1]) { case "version": c.Version() case "greet": f := c.GetSubMenu() if len(os.Args) < 3 { f.Usage() return } if len(os.Args) > 3 { if.Parse(os.Args[3:]) } c.Greet(os.Args[2]) default: fmt.Println("Invalid command") menu.Usage() return } } else { menu.Usage() return } } Run the go build command. Run the following commands and try a few other combinations of arguments: $./cmdargs -h Usage: ./cmdargs [command] Commands: Greet Version $./cmdargs version Version: 1.0.0 $./cmdargs greet Usage: ./cmdargs greet name [flag] Positional Arguments: name the name to greet Flags: -goodbye Say goodbye instead of hello $./cmdargs greet reader Hello reader! $./cmdargs greet reader -goodbye Goodbye reader! If you copied or wrote your own tests go up one directory and run go test, and ensure all tests pass. How it works... Flagsets can be used to set up independent lists of expected arguments, usage strings, and more. The developer is required to do validation on a number of arguments, parsing in the right subset of arguments to commands, and defining usage strings. This can be error prone and requires a lot of iteration to get it completely correct. The flag package makes parsing arguments much easier and includes convenience methods to get the number of flags, arguments, and more. This recipe demonstrates basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed. Working with Unix pipes Unix pipes are useful when passing the output of one program to the input of another. Consider the following example: $ echo "test case" | wc -l 1 In a Go application, the left-hand side of the pipe can be read in using os.Stdin and acts like a file descriptor. To demonstrate this, this recipe will take an input on the left-hand side of a pipe and return a list of words and their number of occurrences. These words will be tokenized on white space. Getting ready Refer to the Getting Ready section of the Using command-line arguments recipe. How to do it... From your terminal/console application, create a new directory, chapter2/pipes. Navigate to that directory and copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/pipes or use this as an exercise to write some of your own. Create a file called pipes.go with the following content: package main import ( "bufio" "fmt" "os" ) // WordCount takes a file and returns a map // with each word as a key and it's number of // appearances as a value func WordCount(f *os.File) map[string]int { result := make(map[string]int) // make a scanner to work on the file // io.Reader interface scanner := bufio.NewScanner(f) scanner.Split(bufio.ScanWords) for scanner.Scan() { result[scanner.Text()]++ } if err := scanner.Err(); err != nil { fmt.Fprintln(os.Stderr, "reading input:", err) } return result } func main() { fmt.Printf("string: number_of_occurrencesnn") for key, value := range WordCount(os.Stdin) { fmt.Printf("%s: %dn", key, value) } }   Run echo "some string" | go run pipes.go. You may also run: go build echo "some string" | ./pipes You should see the following output: $ echo "test case" | go run pipes.go string: number_of_occurrences test: 1 case: 1 $ echo "test case test" | go run pipes.go string: number_of_occurrences test: 2 case: 1 If you copied or wrote your own tests, go up one directory and run go test, and ensure that all tests pass. How it works... Working with pipes in go is pretty simple, especially if you're familiar with working with files. This recipe uses a scanner to tokenize the io.Reader interface of the os.Stdin file object. You can see how you must check for errors after completing all of the reads. An ANSI coloring application Coloring an ANSI terminal application is handled by a variety of code before and after a section of text that you want colored. This recipe will explore a basic coloring mechanism to color the text red or keep it plain. For a more complete application, take a look at https://github.com/agtorre/gocolorize, which supports many more colors and text types implements the fmt.Formatter interface for ease of printing. Getting ready Refer to the Getting Ready section of the Using command line arguments recipe. How to do it... From your terminal/console application, create and navigate to the chapter2/ansicolor directory. Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/ansicolor or use this as an exercise to write some of your own. Create a file called color.go with the following content: package ansicolor import "fmt" //Color of text type Color int const ( // ColorNone is default ColorNone = iota // Red colored text Red // Green colored text Green // Yellow colored text Yellow // Blue colored text Blue // Magenta colored text Magenta // Cyan colored text Cyan // White colored text White // Black colored text Black Color = -1 ) // ColorText holds a string and its color type ColorText struct { TextColor Color Text string } func (r *ColorText) String() string { if r.TextColor == ColorNone { return r.Text } value := 30 if r.TextColor != Black { value += int(r.TextColor) } return fmt.Sprintf("33[0;%dm%s33[0m", value, r.Text) } Create a new directory named example. Navigate to example and then create a file named main.go with the following content. Ensure that you modify the ansicolor import to use the path you set up in step 1: package main import ( "fmt" "github.com/agtorre/go-cookbook/chapter2/ansicolor" ) func main() { r := ansicolor.ColorText{ansicolor.Red, "I'm red!"} fmt.Println(r.String()) r.TextColor = ansicolor.Green r.Text = "Now I'm green!" fmt.Println(r.String()) r.TextColor = ansicolor.ColorNone r.Text = "Back to normal..." fmt.Println(r.String()) } Run go run main.go. Alternatively, you may also run the following: go build ./example You should see the following with the text colored if your terminal supports the ANSI coloring format: $ go run main.go I'm red! Now I'm green! Back to normal... If you copied or wrote your own tests, go up one directory and run go test, and ensure that all the tests pass. How it works... This application makes use of a struct keyword to maintain state of the colored text. In this case, it stores the color of the text and the value of the text. The final string is rendered when you call the String() method, which will either return colored text or plain text depending on the values stored in the struct. By default, the text will be plain. Summary In this article, we demonstrated basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed. We saw how to work with Unix pipes and explored a basic coloring mechanism to color text red or keep it plain. Resources for Article: Further resources on this subject: Building a Command-line Tool [article] A Command-line Companion Called Artisan [article] Scaffolding with the command-line tool [article]
Read more
  • 0
  • 0
  • 18487

article-image-no-new-peps-will-be-approved-for-python-until-a-new-bdfl-is-chosen-next-year
Natasha Mathur
24 Jul 2018
3 min read
Save for later

No new PEPS will be approved for Python in 2018, with BDFL election pending.

Natasha Mathur
24 Jul 2018
3 min read
According to an email exchange, Python will not approve of any proposals or PEPs until January 2019 as the team is planning to choose a new leader, with a deadline set for January 1, 2019. The news comes after Python founder and “Benevolent Dictator for Life (BDFL)” Guido van Rossum, announced earlier this month that he was standing down from the decision-making process.   Here’s a quick look at Why Guido may have quit as the Python chief (BDFL). A PEP - or Python Enhancement Proposal - is a design document consisting of information important to the Python community. If a new feature is to be added to the language, it is first detailed in a PEP. It will include technical specifications as well as the rationale for the feature. A passionate discussion regarding PEP 576, 579 and 580 started last week on the python-dev community with Jeroen Demeyer, a Cython contributor, stating how he finally came up with real-life benchmarks proving why a faster C calling protocol is required in Python. He implemented Cython compilation of SageMath on Python 2.7.15 (SageMath is not yet ported to Python 3). But, he mentioned that the conclusions of his implementation should be valid for newer Python versions as well. Guido’s, in his capacity as a core developer, responded back to this implementation request. In the email thread, he writes that “Jeroen was asked to provide benchmarks but only provided them for Python 2. The reasoning that not much has changed could affect the benchmarks feels a bit optimistic, that's all. The new BDFL may be less demanding though.” Stefan Behnel, a core Cython developer, was disappointed with Rossum’s response.. He writes that because Demeyer has already done “the work to clean up the current implementation… it would be very sad if this chance was wasted, and we would have to remain with the current implementation” But it seems that Guido is not convinced. According to Guido, no PEPS will be approved until the Python core dev team have elected a “new BDFL or come up with some other process for accepting PEPs, no action will be taken on this PEP.” The mail says “right now, there is no-one who can approve this PEP, and you will have to wait until 2019 until there is”. Many Python users are worried about this decision:                    Reddit All that’s left to do now is wait for more updates from the Python-dev community. Behnel states “I just hope that Python development won't stall completely. Even if no formal action can be taken on this PEP (or any other), I hope that there will still be informal discussion”. For more coverage on this news, check out the official mail posts on python-dev. Top 7 Python programming books you need to read Python, Tensorflow, Excel and more – Data professionals reveal their top tools  
Read more
  • 0
  • 0
  • 18382

article-image-understanding-python-regex-engine
Packt
18 Feb 2014
8 min read
Save for later

Understanding the Python regex engine

Packt
18 Feb 2014
8 min read
(For more resources related to this topic, see here.) These are the most common characteristics of the algorithm: It supports "lazy quantifiers" such as *?, +?, and ??. It matches the first coincidence, even though there are longer ones in the string. >>>re.search("engineer | engineering", "engineering").group()'engineer' This also means that order is important. The algorithm tracks only one transition at one step, which means that the engine checks one character at a time. Backreferences and capturing parentheses are supported. Backtracking is the ability to remember the last successful position so that it can go back and retry if needed In the worst case, complexity is exponential O(Cn). We'll see this later in Backtracking. Backtracking Backtracking allows going back and repeating the different paths of the regular expression. It does so by remembering the last successful position, this applies to alternation and quantifiers, let’s see an example: Backtracking As we see in the image the regex engine tries to match one character at a time until it fails and starts again with the following path it can retry. The regex used in the image is the perfect example of the importance of how the regex is built, in this case the expression can be rebuild as spa (in | niard), so that the regex engine doesn’t have to go back up to the start of the string in order to retry the second alternative. This leads us to what is called catastrophic backtracking a well-known problem with backtracking that can give you several problems, ranging from slow regex to a crash with a stack overflow. In the previous example, you can see that the behavior grows not only with the input but also with the different paths in the regex, that’s why the algorithm is exponential O(Cn), with this in mind it’s easy to understand why we can end up with a stack overflow. The problem arises when the regex fails to match the String. Let’s benchmark a regex with technique we’ve seen previously, so we can understand the problem better. First let’s try a simple regex: >>> def catastrophic(n): print "Testing with %d characters" %n pat = re.compile('(a+)+c') text = "%s" %('a' * n) pat.search(text) As you can see the text we’re trying to match it’s always going to fail due to there is no c at the end. Let’s test it with different inputs. >>> for n in range(20, 30): test(catastrophic, n) Testing with 20 characters The function catastrophic lasted: 0.130457 Testing with 21 characters The function catastrophic lasted: 0.245125 …… The function catastrophic lasted: 14.828221 Testing with 28 characters The function catastrophic lasted: 29.830929 Testing with 29 characters The function catastrophic lasted: 61.110949 The behavior of this regex looks like quadratic. But why? what’s happening here? The problem is that (a+) starts greedy, so it tries to get as many a’s as possible and after that it fails to match the (a+)+, that is, it backtracks to the second a, and continue consuming a’s until it fails to match c, when it tries again (backtrack) the whole process starting with the second a. Let’s see another example, in this case with an exponential behavior: >>> def catastrophic(n): print "Testing with %d characters" %n pat = re.compile('(x+)+(b+)+c') text = 'x' * n text += 'b' * n pat.search(text) >>> for n in range(12, 18): test(catastrophic, n) Testing with 12 characters The function catastrophic lasted: 1.035162 Testing with 13 characters The function catastrophic lasted: 4.084714 Testing with 14 characters The function catastrophic lasted: 16.319145 Testing with 15 characters The function catastrophic lasted: 65.855182 Testing with 16 characters The function catastrophic lasted: 276.941307 As you can see the behavior is exponential, which can lead to a catastrophic scenarios. And finally let’s see what happen when regex has a match. >>> def non_catastrophic(n): print "Testing with %d characters" %n pat = re.compile('(x+)+(b+)+c') text = 'x' * n text += 'b' * n text += 'c' pat.search(text) >>> for n in range(12, 18): test(non_catastrophic, n) Testing with 10 characters The function catastrophic lasted: 0.000029 …… Testing with 19 characters The function catastrophic lasted: 0.000012 Optimization recommendations In the following sections we will find a number of recommendations that could be used to apply to improve regular expressions. The best tool will always be the common sense, and even following these recommendations common sense will need to be used. It has to be understood when the recommendation is applicable and when not. For instance the recommendation don’t be greedy cannot be used in the 100% of the cases. Reuse compiled patterns To use a regular expression we have to convert it from the string representation to a compiled form as RegexObject. This compilation takes some time. If instead of using the compile function we are using the rest of the methods to avoid the creation of the RegexObject, we should understand that the compilation is executed anyway and a number of compiled RegexObject are cached automatically. However, when we are compiling that cache won’t back us. Every single compile execution will consume an amount of time that perhaps could be negligible for a single execution, but it’s definitely relevant if many executions are performed. Extract common parts in alternation Alternation is always a performance risk point in regular expressions. When using them in Python, and therefore in a sort of NFA implementation, we should extract any common part outside of the alternation. For instance if we have /(Hello⇢World|Hello⇢Continent|Hello⇢Country,)/, we could easily extract Hello⇢ having the following expression /Hello⇢(World|Continent|Country)/. This will make our engine to just check Hello⇢ once, and not going back and recheck for each possibility. Shortcut the alternation Ordering in alternation is relevant; each of the different options present in the alternation will be checked one by one, from the left to the right. This can be used in favor of performance. If we place the more likely options at the beginning of the alternation, more checks will mark the alternation as matched sooner. For instance, we know that the more common colors of cars are white and black. If we are writing a regular expression accepting some colors, we should put white and black first, as those are the more likely to appear. This is: /(white|black|red|blue|green)/. For the rest of the elements, if they have the very same odds of appearing, if could be favorable to put the shortest ones before the longer ones. Use non capturing groups when appropriate Capturing groups will consume some time per each group defined in an expression. This time is not very important but is still relevant if we are executing a regular expression many times. Sometimes we are using groups but we might not be interested in the result. For instance when using alternation. If that is the case we can save some execution time to the engine by marking that group as non-capturing. This is: (?:person|company).. Be specific When the patterns we define are very specific, the engine can help us performing quick integrity checks before the actual pattern matching is executed. For instance, if we pass to the engine the expression /w{15}/ to be matched against the text hello, the engine could decide to check if the input string is actually at least 15 characters long instead of matching the expression. Don’t be greedy The quantifiers and we learnt the difference between greedy and reluctant quantifiers. We also found that the quantifiers are greedy by default. What does this mean to performance? It means that the engine will always try to catch as many characters as possible and then reducing the scope step by step until it's done. This could potentially make the regular expression slow if the match is typically short. Keep in mind, however, this is only applicable if the match is usually short. Summary In this article, we understood how to see the engine working behind the scenes. We learned some theory of the engine design and how it's easy to fall in a common pitfall—the catastrophic backtracking. Finally, we reviewed different general recommendations to improve the performance of our regular expressions. Resources for Article: Further resources on this subject: Python LDAP Applications: Part 1 - Installing and Configuring the Python-LDAP Library and Binding to an LDAP Directory [Article] Python Data Persistence using MySQL Part III: Building Python Data Structures Upon the Underlying Database Data [Article] Python Testing: Installing the Robot Framework [Article]
Read more
  • 0
  • 0
  • 18289
article-image-modular-programming-ecmascript-6
Packt
15 Feb 2016
18 min read
Save for later

Modular Programming in ECMAScript 6

Packt
15 Feb 2016
18 min read
Modular programming is one of the most important and frequently used software design techniques. Unfortunately, JavaScript didn't support modules natively that lead JavaScript programmers to use alternative techniques to achieve modular programming in JavaScript. But now, ES6 brings modules into JavaScript officially. This article is all about how to create and import JavaScript modules. In this article, we will first learn how the modules were created earlier, and then we will jump to the new built-in module system that was introduced in ES6, known as the ES6 modules. In this article, we'll cover: What is modular programming? The benefits of modular programming The basics of IIFE modules, AMD, UMD, and CommonJS Creating and importing the ES6 modules The basics of the Modular Loader Creating a basic JavaScript library using modules (For more resources related to this topic, see here.) The JavaScript modules in a nutshell The practice of breaking down programs and libraries into modules is called modular programming. In JavaScript, a module is a collection of related objects, functions, and other components of a program or library that are wrapped together and isolated from the scope of the rest of the program or library. A module exports some variables to the outside program to let it access the components wrapped by the module. To use a module, a program needs to import the module and the variables exported by the module. A module can also be split into further modules called as its submodules, thus creating a module hierarchy. Modular programming has many benefits. Some benefits are: It keeps our code both cleanly separated and organized by splitting into multiple modules Modular programming leads to fewer global variables, that is, it eliminates the problem of global variables, because modules don't interface via the global scope, and each module has its own scope Makes code reusability easier as importing and using the same modules in different projects is easier It allows many programmers to collaborate on the same program or library, by making each programmer to work on a particular module with a particular functionality Bugs in an application can easily be easily identified as they are localized to a particular module Implementing modules – the old way Before ES6, JavaScript had never supported modules natively. Developers used other techniques and third-party libraries to implement modules in JavaScript. Using Immediately-invoked function expression (IIFE), Asynchronous Module Definition (AMD), CommonJS, and Universal Module Definition (UMD) are various popular ways of implementing modules in ES5. As these ways were not native to JavaScript, they had several problems. Let's see an overview of each of these old ways of implementing modules. The Immediately-Invoked Function Expression The IIFE is used to create an anonymous function that invokes itself. Creating modules using IIFE is the most popular way of creating modules. Let's see an example of how to create a module using IIFE: //Module Starts (function(window){   var sum = function(x, y){     return x + y;   }     var sub = function(x, y){     return x - y;   }   var math = {     findSum: function(a, b){       return sum(a,b);     },     findSub: function(a, b){       return sub(a, b);     }   }   window.math = math; })(window) //Module Ends console.log(math.findSum(1, 2)); //Output "3" console.log(math.findSub(1, 2)); //Output "-1" Here, we created a module using IIFE. The sum and sub variables are global to the module, but not visible outside of the module. The math variable is exported by the module to the main program to expose the functionalities that it provides. This module works completely independent of the program, and can be imported by any other program by simply copying it into the source code, or importing it as a separate file. A library using IIFE, such as jQuery, wraps its all of its APIs in a single IIFE module. When a program uses a jQuery library, it automatically imports the module. Asynchronous Module Definition AMD is a specification for implementing modules in browser. AMD is designed by keeping the browser limitations in mind, that is, it imports modules asynchronously to prevent blocking the loading of a webpage. As AMD is not a native browser specification, we need to use an AMD library. RequireJS is the most popular AMD library. Let's see an example on how to create and import modules using RequireJS. According to the AMD specification, every module needs to be represented by a separate file. So first, create a file named math.js that represents a module. Here is the sample code that will be inside the module: define(function(){   var sum = function(x, y){     return x + y;   }   var sub = function(x, y){     return x - y;   }   var math = {     findSum: function(a, b){       return sum(a,b);     },     findSub: function(a, b){       return sub(a, b);     }   }   return math; }); Here, the module exports the math variable to expose its functionality. Now, let's create a file named index.js, which acts like the main program that imports the module and the exported variables. Here is the code that will be inside the index.js file: require(["math"], function(math){   console.log(math.findSum(1, 2)); //Output "3"   console.log(math.findSub(1, 2)); //Output "-1" }) Here, math variable in the first parameter is the name of the file that is treated as the AMD module. The .js extension to the file name is added automatically by RequireJS. The math variable, which is in the second parameter, references the exported variable. Here, the module is imported asynchronously, and the callback is also executed asynchronously. CommonJS CommonJS is a specification for implementing modules in Node.js. According to the CommonJS specification, every module needs to be represented by a separate file. The CommonJS modules are imported synchronously. Let's see an example on how to create and import modules using CommonJS. First, we will create a file named math.js that represents a module. Here is a sample code that will be inside the module: var sum = function(x, y){   return x + y; } var sub = function(x, y){   return x - y; } var math = {   findSum: function(a, b){     return sum(a,b);   },   findSub: function(a, b){     return sub(a, b);   } } exports.math = math; Here, the module exports the math variable to expose its functionality. Now, let's create a file named index.js, which acts like the main program that imports the module. Here is the code that will be inside the index.js file: var math = require("./math").math; console.log(math.findSum(1, 2)); //Output "3" console.log(math.findSub(1, 2)); //Output "-1" Here, the math variable is the name of the file that is treated as module. The .js extension to the file name is added automatically by CommonJS. Universal Module Definition We saw three different specifications of implementing modules. These three specifications have their own respective ways of creating and importing modules. Wouldn't it have been great if we can create modules that can be imported as an IIFE, AMD, or CommonJS module? UMD is a set of techniques that is used to create modules that can be imported as an IIFE, CommonJS, or AMD module. Therefore now, a program can import third-party modules, irrespective of what module specification it is using. The most popular UMD technique is returnExports. According to the returnExports technique, every module needs to be represented by a separate file. So, let's create a file named math.js that represents a module. Here is the sample code that will be inside the module: (function (root, factory) {   //Environment Detection   if (typeof define === 'function' && define.amd) {     define([], factory);   } else if (typeof exports === 'object') {     module.exports = factory();   } else {     root.returnExports = factory();   } }(this, function () {   //Module Definition   var sum = function(x, y){     return x + y;   }   var sub = function(x, y){     return x - y;   }   var math = {     findSum: function(a, b){       return sum(a,b);     },     findSub: function(a, b){       return sub(a, b);     }   }   return math; })); Now, you can successfully import the math.js module any way that you wish, for instance, by using CommonJS, RequireJS, or IIFE. Implementing modules – the new way ES6 introduced a new module system called ES6 modules. The ES6 modules are supported natively and therefore, they can be referred as the standard JavaScript modules. You should consider using ES6 modules instead of the old ways, because they have neater syntax, better performance, and many new APIs that are likely to be packed as the ES6 modules. Let's have a look at the ES6 modules in detail. Creating the ES6 modules Every ES6 module needs to be represented by a separate .js file. An ES6 module can contain any JavaScript code, and it can export any number of variables. A module can export a variable, function, class, or any other entity. We need to use the export statement in a module to export variables. The export statement comes in many different formats. Here are the formats: export {variableName}; export {variableName1, variableName2, variableName3}; export {variableName as myVariableName}; export {variableName1 as myVariableName1, variableName2 as myVariableName2}; export {variableName as default}; export {variableName as default, variableName1 as myVariableName1, variableName2}; export default function(){}; export {variableName1, variableName2} from "myAnotherModule"; export * from "myAnotherModule"; Here are the differences in these formats: The first format exports a variable. The second format is used to export multiple variables. The third format is used to export a variable with another name, that is, an alias. The fourth format is used to export multiple variables with different names. The fifth format uses default as the alias. We will find out the use of this later in this article. The sixth format is similar to fourth format, but it also has the default alias. The seventh format works similar to fifth format, but here you can place an expression instead of a variable name. The eighth format is used to export the exported variables of a submodule. The ninth format is used to export all the exported variables of a submodule. Here are some important things that you need to know about the export statement: An export statement can be used anywhere in a module. It's not compulsory to use it at the end of the module. There can be any number of export statements in a module. You cannot export variables on demand. For example, placing the export statement in the if…else condition throws an error. Therefore, we can say that the module structure needs to be static, that is, exports can be determined on compile time. You cannot export the same variable name or alias multiple times. But you can export a variable multiple times with a different alias. All the code inside a module is executed in the strict mode by default. The values of the exported variables can be changed inside the module that exported them. Importing the ES6 modules To import a module, we need to use the import statement. The import statement comes in many different formats. Here are the formats: import x from "module-relative-path"; import {x} from "module-relative-path"; import {x1 as x2} from "module-relative-path"; import {x1, x2} from "module-relative-path"; import {x1, x2 as x3} from "module-relative-path"; import x, {x1, x2} from "module-relative-path"; import "module-relative-path"; import * as x from "module-relative-path"; import x1, * as x2 from "module-relative-path"; An import statement consists of two parts: the variable names we want to import and the relative path of the module. Here are the differences in these formats: In the first format, the default alias is imported. The x is alias of the default alias. In the second format, the x variable is imported. The third format is the same as the second format. It's just that x2 is an alias of x1. In the fourth format, we import the x1 and x2 variables. In the fifth format, we import the x1 and x2 variables. The x3 is an alias of the x2 variable. In the sixth format, we import the x1 and x2 variable, and the default alias. The x is an alias of the default alias. In the seventh format, we just import the module. We do not import any of the variables exported by the module. In the eighth format, we import all the variables, and wrap them in an object called x. Even the default alias is imported. The ninth format is the same as the eighth format. Here, we give another alias to the default alias.[RR1]  Here are some important things that you need to know about the import statement: While importing a variable, if we import it with an alias, then to refer to that variable, we have to use the alias and not the actual variable name, that is, the actual variable name will not be visible, only the alias will be visible. The import statement doesn't import a copy of the exported variables; rather, it makes the variables available in the scope of the program that imports it. Therefore, if you make a change to an exported variable inside the module, then the change is visible to the program that imports it. The imported variables are read-only, that is, you cannot reassign them to something else outside of the scope of the module that exports them. A module can only be imported once in a single instance of a JavaScript engine. If we try to import it again, then the already imported instance of the module will be used. We cannot import modules on demand. For example, placing the import statement in the if…else condition throws an error. Therefore, we can say that the imports should be able to be determined on compile time. The ES6 imports are faster than the AMD and CommonJS imports, because the ES6 imports are supported natively and also as importing modules and exporting variables are not decided on demand. Therefore, it makes JavaScript engine easier to optimize performance. The module loader A module loader is a component of a JavaScript engine that is responsible for importing modules. The import statement uses the build-in module loader to import modules. The built-in module loaders of the different JavaScript environments use different module loading mechanisms. For example, when we import a module in JavaScript running in the browsers, then the module is loaded from the server. On the other hand, when we import a module in Node.js, then the module is loaded from filesystem. The module loader loads modules in a different manner, in different environments, to optimize the performance. For example, in the browsers, the module loader loads and executes modules asynchronously in order to prevent the importing of the modules that block the loading of a webpage. You can programmatically interact with the built-in module loader using the module loader API to customize its behavior, intercept module loading, and fetch the modules on demand. We can also use this API to create our own custom module loaders. The specifications of the module loader are not specified in ES6. It is a separate standard, controlled by the WHATWG browser standard group. You can find the specifications of the module loader at http://whatwg.github.io/loader/. The ES6 specifications only specify the import and export statements. Using modules in browsers The code inside the <script> tag doesn't support the import statement, because the tag's synchronous nature is incompatible with the asynchronicity of the modules in browsers. Instead, you need to use the new <module> tag to import modules. Using the new <module> tag, we can define a script as a module. Now, this module can import other modules using the import statement. If you want to import a module using the <script> tag, then you have to use the Module Loader API. The specifications of the <module> tag are not specified in ES6. Using modules in the eval() function You cannot use the import and export statements in the eval() function. To import modules in the eval() function, you need to use the Module Loader API. The default exports vs. the named exports When we export a variable with the default alias, then it's called as a default export. Obviously, there can only be one default export in a module, as an alias can be used only once. All the other exports except the default export are called as named exports. It's recommended that a module should either use default export or named exports. It's not a good practice to use both together. The default export is used when we want to export only one variable. On the other hand, the named exports are used when we want to export the multiple variables. Diving into an example Let's create a basic JavaScript library using the ES6 modules. This will help us understand how to use the import and export statements. We will also learn how a module can import other modules. The library that we will create is going to be a math library, which provides basic logarithmic and trigonometric functions. Let's get started with creating our library: Create a file named math.js, and a directory named math_modules. Inside the math_modules directory, create two files named logarithm.js and trigonometry.js, respectively. Here, the math.js file is the root module, whereas the logarithm.js and the trigonometry.js files are its submodules. Place this code inside the logarithm.js file: var LN2 = Math.LN2; var N10 = Math.LN10;   function getLN2() {   return LN2; }   function getLN10() {   return LN10; }   export {getLN2, getLN10}; Here, the module is exporting the functions named as exports. It's preferred that the low-level modules in a module hierarchy should export all the variables separately, because it may be possible that a program may need just one exported variable of a library. In this case, a program can import this module and a particular function directly. Loading all the modules when you need just one module is a bad idea in terms of performance. Similarly, place this code in the trigonometry.js file: var cos = Math.cos; var sin = Math.sin; function getSin(value) {   return sin(value); } function getCos(value) {   return cos(value); } export {getCos, getSin}; Here we do something similar. Place this code inside the math.js file, which acts as the root module: import * as logarithm from "math_modules/logarithm"; import * as trigonometry from "math_modules/trigonometry"; export default {   logarithm: logarithm,   trigonometry: trigonometry } It doesn't contain any library functions. Instead, it makes easy for a program to import the complete library. It imports its submodules, and then exports their exported variables to the main program. Here, in case the logarithm.js and trigonometry.js scripts depends on other submodules, then the math.js module shouldn't import those submodules, because logarithm.js and trigonometry.js are already importing them. Here is the code using which a program can import the complete library: import math from "math"; console.log(math.trigonometry.getSin(3)); console.log(math.logarithm.getLN2(3)); Summary In this article, we saw what modular programming is and learned different modular programming specifications. We also saw different ways to create modules using JavaScript. Technologies such as the IIFE, CommonJS, AMD, UMD, and ES6 modules are covered. Finally, we created a basic library using the modular programming design technique. Now, you should be confident enough to build the JavaScript apps using the ES6 modules. To learn more about ECMAScript and JavaScript, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: JavaScript at Scale (https://www.packtpub.com/web-development/javascript-scale) Google Apps Script for Beginners (https://www.packtpub.com/web-development/google-apps-script-beginners) Learning TypeScript (https://www.packtpub.com/web-development/learning-typescript) JavaScript Concurrency (https://www.packtpub.com/web-development/javascript-concurrency) You can also watch out for an upcoming title, Mastering JavaScript Object-Oriented Programming, on this technology on Packt Publishing's website at https://www.packtpub.com/web-development/mastering-javascript-object-oriented-programming. Resources for Article:   Further resources on this subject: Concurrency Principles [article] Using Client Methods [article] HTML5 APIs [article]
Read more
  • 0
  • 0
  • 17160

article-image-r-statistical-package-interfacing-python
Janu Verma
17 Nov 2016
8 min read
Save for later

The R Statistical Package Interfacing with Python

Janu Verma
17 Nov 2016
8 min read
One of my coding hobbies is to explore different Python packages and libraries. In this post, I'll talk about the package rpy2, which is used to call R inside python. Being an avid user of R and a huge supporter of R graphical packages, I had always desired to call R inside my Python code to be able to produce beautiful visualizations. The R framework offers machinery for a variety of statistical and data mining tasks. Let's review the basics of R before we delve into R-Python interfacing. R is a statistical language which is free, is open source, and has comprehensive support for various statistical, data mining, and visualization tasks. Quick-R describes it as: "R is an elegant and comprehensive statistical and graphical programming language." R is one of the fastest growing languages, mainly due to the surge in interest in statistical learning and data science. The Data Science Specialization on Coursera has all courses taught in R. There are R packages for machine learning, graphics, text mining, bioinformatics, topics modeling, interactive visualizations, markdown, and many others. In this post, I'll give a quick introduction to R. The motivation is to acquire some knowledge of R to be able to follow the discussion on R-Python interfacing. Installing R R can be downloaded from one of the Comprehensive R Archive Network (CRAN) mirror sites. Running R To run R interactively on the command line, type r. Launch the standard GUI (which should have been included in the download) and type R code in it. RStudio is the most popular IDE for R. It is recommended, though not required, to install RStudio and run R on it. To write a file with R code, create a file with the .r extension (for example, myFirstCode.r). And run the code by typing the following on the terminal: Rscript file.r Basics of R The most fundamental data structure in R is a vector; actually everything in R is a vector (even numbers are 1-dimensional vectors). This is one of the strangest things about R. Vectors contain elements of the same type. A vector is created by using the c() function. a = c(1,2,5,9,11) a [1] 1 2 5 9 11 strings = c("aa", "apple", "beta", "down") strings [1] "aa" "apple" "beta" "down" The elements in a vector are indexed, but the indexing starts at 1 instead of 0, as in most major languages (for example, python). strings[1] [1] "aa" The fact that everything in R is a vector and that the indexing starts at 1 are the main reasons for people's initial frustration with R (I forget this all the time). Data Frames A lot of R packages expect data as a data frame, which are essentially matrices but the columns can be accessed by names. The columns can be of different types. Data frames are useful outside of R also. The Python package Pandas was written primarily to implement data frames and to do analysis on them. In R, data frames are created (from vectors) as follows: students = c("Anne", "Bret", "Carl", "Daron", "Emily") scores = c(7,3,4,9,8) grades = c('B', 'D', 'C', 'A', 'A') results = data.frame(students, scores, grades) results students scores grades 1 Anne 7 B 2 Bret 3 D 3 Carl 4 C 4 Daron 9 A 5 Emily 8 A The elements of a data frame can be accessed as: results$students [1] Anne Bret Carl Daron Emily Levels: Anne Bret Carl Daron Emily This gives a vector, the elements of which can be called by indexing. results$students[1] [1] Anne Levels: Anne Bret Carl Daron Emily Reading Files Most of the times the data is given as a comma-separated values (csv) file or a tab-separated values (tsv) file. We will see how to read a csv/tsv file in R and create a data frame from it. (Aside: The datasets in most Kaggle competitions are given as csv files and we are required to do machine learning on them. In Python, one creates a pandas data frame or a numpy array from this csv file.) In R, we use a read.csv or read.table command to load a csv file into memory, for example, for the Titanic competition on Kaggle: training_data <- read.csv("train.csv", header=TRUE) train <- data.frame(survived=train_all$Survived, age=train_all$Age, fare=train_all$Fare, pclass=train_all$Pclass) Similarly, a tsv file can be loaded as: data <- read.csv("file.tsv";, header=TRUE, delimiter="t") Thus given a csv/tsv file with or without headers, we can read it using the read.csv function and create a data frame using: data.frame(vector_1, vector_2, ... vector_n). This should be enough to start exploring R packages. Another command that is very useful in R is head(), which is similar to the less command on Unix. rpy2 First things first, we need to have both Python and R installed. Then install rpy2 from the Python package index (Pypi). To do this, simply type the following on the command line: pip install rpy2 We will use the high-level interface to R, the robjects subpackage of rpy2. import rpy2.robjects as ro We can pass commands to the R session by putting the R commands in the ro.r() method as strings. Recall that everything in R is a vector. Let's create a vector using robjects: ro.r('x=c(2,4,6,8)') print(ro.r('x')) [1] 2 4 6 8 Keep in mind that though x is an R object (vector), ro.r('x') is a Python object (rpy2 object). This can be checked as follows: type(ro.r('x')) <class 'rpy2.robjects.vectors.FloatVector'> The most important data types in R are data frames, which are essentially matrices. We can create a data frame using rpy2: ro.r('x=c(2,4,6,8)') ro.r('y=c(4,8,12,16)') ro.r('rdf=data.frame(x,y)') This created an R data frame, rdf. If we want to manipulate this data frame using Python, we need to convert it to a python object. We will convert the R data frame to a pandas data frame. The Python package pandas contains efficient implementations of data frame objects in python. import pandas.rpy.common as com df = com.load_data('rdf') print type(df) <class 'pandas.core.frame.DataFrame'> df.x = 2*df.x Here we have doubled each of the elements of the x vector in the data frame df. But df is a Python object, which we can convert back to an R data frame using pandas as: rdf = com.convert_to_r_dataframe(df) print type(rdf) <class 'rpy2.robjects.vectors.DataFrame'> Let's use the plotting machinery of R, which is the main purpose of studying rpy2: ro.r('plot(x,y)') Not only R data types, but rpy2 lets us import R packages as well (given that these packages are installed on R) and use them for analysis. Here we will build a linear model on x and y using the R package stats: from rpy2.robjects.packages import importr stats = importr('stats') base = importr('base') fit = stats.lm('y ~ x', data=rdf) print(base.summary(fit)) We get the following results: Residuals: 1 2 3 4 0 0 0 0 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0 0 NA NA x 2 0 Inf <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0 on 2 degrees of freedom Multiple R-squared: 1, Adjusted R-squared: 1 F-statistic: Inf on 1 and 2 DF, p-value: < 2.2e-16 R programmers will immediately recognize the output as coming from applying linear model function lm() on data. I'll end this discussion with an example using my favorite R package ggplot2. I have written a lot of posts on data visualization using ggplot2. The following example is borrowed from the official documentation of rpy2. import math, datetime import rpy2.robjects.lib.ggplot2 as ggplot2 import rpy2.robjects as ro from rpy2.robjects.packages import importr base = importr('base') datasets = importr('datasets') mtcars = datasets.data.fetch('mtcars')['mtcars'] pp = ggplot2.ggplot(mtcars) + ggplot2.aes_string(x='wt', y='mpg', col='factor(cyl)') + ggplot2.geom_point() + ggplot2.geom_smooth(ggplot2.aes_string(group = 'cyl'), method = 'lm') pp.plot() Author: Janu Verma is a researcher in the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational biology, and healthcare analytics. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and the Indian Statistical Institute. He has written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journals and so on. His current focus is on the development of visual analytics systems for prediction and understanding. He advises start-ups and other companies on data science and machine learning in the Delhi-NCR area. He can be found at Here.
Read more
  • 0
  • 0
  • 16961

article-image-delphi-cookbook
Packt
07 Jul 2016
6 min read
Save for later

Delphi Cookbook

Packt
07 Jul 2016
6 min read
In this article by Daniele Teti author of the book Delphi Cookbook - Second Edition we will study about Multithreading. Multithreading can be your biggest problem if you cannot handle it with care. One of the fathers of the Delphi compiler used to say: "New programmers are drawn to multithreading like moths to flame, with similar results." – Danny Thorpe (For more resources related to this topic, see here.) In this chapter, we will discuss some of the main techniques to handle single or multiple background threads. We'll talk about shared resource synchronization and thread-safe queues and events. The last three recipes will talk about the Parallel Programming Library introduced in Delphi XE7, and I hope that you will love it as much as I love it. Multithreaded programming is a huge topic. So, after reading this chapter, although you will not become a master of it, you will surely be able to approach the concept of multithreaded programming with confidence and will have the basics to jump on to more specific stuff when (and if) you require them. Talking with the main thread using a thread-safe queue Using a background thread and working with its private data is not difficult, but safely bringing information retrieved or elaborated by the thread back to the main thread to show them to the user (as you know, only the main thread can handle the GUI in VCL as well as in FireMonkey) can be a daunting task. An even more complex task would be establishing a generic communication between two or more background threads. In this recipe, you'll see how a background thread can talk to the main thread in a safe manner using the TThreadedQueue<T> class. The same concepts are valid for a communication between two or more background threads. Getting ready Let's talk about a scenario. You have to show data generated from some sort of device or subsystem, let's say a serial, a USB device, a query polling on the database data, or a TCP socket. You cannot simply wait for data using TTimer because this would freeze your GUI during the wait, and the wait can be long. You have tried it, but your interface became sluggish… you need another solution! In the Delphi RTL, there is a very useful class called TThreadedQueue<T> that is, as the name suggests, a particular parametric queue (a FIFO data structure) that can be safely used from different threads. How to use it? In the programming field, there is mostly no single solution valid for all situations, but the following one is very popular. Feel free to change your approach if necessary. However, this is the approach used in the recipe code: Create the queue within the main form. Create a thread and inject the form queue to it. In the thread Execute method, append all generated data to the queue. In the main form, use a timer or some other mechanism to periodically read from the queue and display data on the form. How to do it… Open the recipe project called ThreadingQueueSample.dproj. This project contains the main form with all the GUI-related code and another unit with the thread code. The FormCreate event creates the shared queue with the following parameters that will influence the behavior of the queue: QueueDepth = 100: This is the maximum queue size. If the queue reaches this limit, all the push operations will be blocked for a maximum of PushTimeout, then the Push call will fail with a timeout. PushTimeout = 1000: This is the timeout in milliseconds that will affect the thread, that in this recipe is the producer of a producer/consumer pattern. PopTimeout = 1: This is the timeout in milliseconds that will affect the timer when the queue is empty. This timeout must be very short because the pop call is blocking in nature, and you are in the main thread that should never be blocked for a long time. The button labeled Start Thread creates a TReaderThread instance passing the already created queue to its constructor (this is a particular type of dependency injection called constructor injection). The thread declaration is really simple and is as follows: type TReaderThread = class(TThread) private FQueue: TThreadedQueue<Byte>; protected procedure Execute; override; public constructor Create(AQueue: TThreadedQueue<Byte>); end; While the Execute method simply appends randomly generated data to the queue, note that the Terminated property must be checked often so the application can terminate the thread and wait a reasonable time for its actual termination. In the following example, if the queue is not empty, check the termination at least every 700 msec ca: procedure TReaderThread.Execute; begin while not Terminated do begin TThread.Sleep(200 + Trunc(Random(500))); // e.g. reading from an actual device FQueue.PushItem(Random(256)); end; end; So far, you've filled the queue. Now, you have to read from the queue and do something useful with the read data. This is the job of a timer. The following is the code of the timer event on the main form: procedure TMainForm.Timer1Timer(Sender: TObject); var Value: Byte; begin while FQueue.PopItem(Value) = TWaitResult.wrSignaled do begin ListBox1.Items.Add(Format('[%3.3d]', [Value])); end; ListBox1.ItemIndex := ListBox1.Count - 1; end; That's it! Run the application and see how we are reading the data coming from the threads and showing the main form. The following is a screenshot: The main form showing data generated by the background thread There's more… The TThreadedQueue<T> is very powerful and can be used to communicate between two or more background threads in a consumer/producer schema as well. You can use multiple producers, multiple consumers, or both. The following screenshot shows a popular schema used when the speed at which the data generated is faster than the speed at which the same data is handled. In this case, usually you can gain speed on the processing side using multiple consumers. Single producer, multiple consumers Summary In this article we had a look at how to talk to the main thread using a thread-safe queue. Resources for Article: Further resources on this subject: Exploring the Usages of Delphi[article] Adding Graphics to the Map[article] Application Performance[article]
Read more
  • 0
  • 0
  • 16833
article-image-single-responsibility-principle-solid-net-core
Aaron Lazar
07 May 2018
7 min read
Save for later

Applying Single Responsibility principle from SOLID in .NET Core

Aaron Lazar
07 May 2018
7 min read
In today's tutorial, we'll learn how to apply the Single Responsibility principle from the SOLID principles, to .NET Core Applications. This brings us to another interesting concept in OOP called SOLID design principles. These design principles are applied to any OOP design and are intended to make software easier to understand, more flexible, and easily maintainable. [box type="shadow" align="" class="" width=""]This article is an extract from the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. The book is a step-by-step guide that will teach you the essential .NET Core and C# concepts with the help of real-world projects.[/box] The term SOLID is a mnemonic for: Single responsibility principle Open/closed principle Liskov substitution principle Interface segregation principle Dependency inversion principle In this article, we will take a look at the first of the principles—the single responsibility principle. Let's look at the single responsibility principle now. Single responsibility principle Simply put, a module or class should have the following characteristics only: It should do one single thing and only have a single reason to change It should do its one single thing well The functionality provided needs to be entirely encapsulated by that class or module What is meant when saying that a module must be responsible for a single thing? The Google definition of a module is: "Each of a set of standardized parts or independent units that can be used to construct a more complex structure, such as an item of furniture or a building." From this, we can understand that a module is a simple building block. It can be used or reused to create something bigger and more complex when used with other modules. In C# therefore, the module does closely resemble a class, but I will go so far as to say that a module can also be extended to be a method. The function that the class or module performs can only be one thing. That is to say that it has a narrow responsibility. It is not concerned with anything else other than doing that one thing it was designed to do. If we had to apply the single responsibility principle to a person, then that person would be only a software developer, for example. But what if a software developer also was a doctor and a mechanic and a school teacher? Would that person be effective in any of those roles? That would contravene the single responsibility principle. The same is true for code. Having a look at our AllRounder and Batsman classes, you will notice that in AllRounder, we have the following code: private double CalculateStrikeRate(StrikeRate strikeRateType) { switch (strikeRateType) { case StrikeRate.Bowling: return (BowlerBallsBowled / BowlerWickets); case StrikeRate.Batting: return (BatsmanRuns * 100) / BatsmanBallsFaced; default: throw new Exception("Invalid enum"); } } public override int CalculatePlayerRank() { return 0; } In Batsman, we have the following code: public double BatsmanBattingStrikeRate => (BatsmanRuns * 100) / BatsmanBallsFaced; public override int CalculatePlayerRank() { return 0; } Using what we have learned about the single responsibility principle, we notice that there is an issue here. To illustrate the problem, let's compare the code side by side: We are essentially repeating code in the Batsman and AllRounder classes. This doesn't really bode well for single responsibility, does it? I mean, the one principle is that a class must only have a single function to perform. At the moment, both the Batsman and AllRounder classes are taking care of calculating strike rates. They also both take care of calculating the player rank. They even both have exactly the same code for calculating the strike rate of a batsman! The problem comes in when the strike rate calculation changes (not that it easily would, but let's assume it does). We now know that we have to change the calculation in both places. As soon as the developer changes one calculation and not the other, a bug is introduced into our application. Let's simplify our classes. In the BaseClasses folder, create a new abstract class called Statistics. The code should look as follows: namespace cricketScoreTrack.BaseClasses { public abstract class Statistics { public abstract double CalculateStrikeRate(Player player); public abstract int CalculatePlayerRank(Player player); } } In the Classes folder, create a new derived class called PlayerStatistics (that is to say it inherits from the Statistics abstract class). The code should look as follows: using cricketScoreTrack.BaseClasses; using System; namespace cricketScoreTrack.Classes { public class PlayerStatistics : Statistics { public override int CalculatePlayerRank(Player player) { return 1; } public override double CalculateStrikeRate(Player player) { switch (player) { case AllRounder allrounder: return (allrounder.BowlerBallsBowled / allrounder.BowlerWickets); case Batsman batsman: return (batsman.BatsmanRuns * 100) / batsman.BatsmanBallsFaced; default: throw new ArgumentException("Incorrect argument supplied"); } } } } You will see that the PlayerStatistics class is now solely responsible for calculating player statistics for the player's rank and the player's strike rate. You will see that I have not included much of an implementation for calculating the player's rank. I briefly commented the code on GitHub for this method on how a player's rank is determined. It is quite a complicated calculation and differs for batsmen and bowlers. I have therefore omitted it for the purposes of this chapter on OOP. Your Solution should now look as follows: Swing back over to your Player abstract class and remove abstract public int CalculatePlayerRank(); from the class. In the IBowler interface, remove the double BowlerStrikeRate { get; } property. In the IBatter interface, remove the double BatsmanBattingStrikeRate { get; } property. In the Batsman class, remove public double BatsmanBattingStrikeRate and public override int CalculatePlayerRank() from the class. The code in the Batsman class will now look as follows: using cricketScoreTrack.BaseClasses; using cricketScoreTrack.Interfaces; namespace cricketScoreTrack.Classes { public class Batsman : Player, IBatter { #region Player public override string FirstName { get; set; } public override string LastName { get; set; } public override int Age { get; set; } public override string Bio { get; set; } #endregion #region IBatsman public int BatsmanRuns { get; set; } public int BatsmanBallsFaced { get; set; } public int BatsmanMatch4s { get; set; } public int BatsmanMatch6s { get; set; } #endregion } } Looking at the AllRounder class, remove the public enum StrikeRate { Bowling = 0, Batting = 1 } enum as well as the public double BatsmanBattingStrikeRate and public double BowlerStrikeRate properties. Lastly, remove the private double CalculateStrikeRate(StrikeRate strikeRateType) and public override int CalculatePlayerRank() methods. The code for the AllRounder class now looks as follows: using cricketScoreTrack.BaseClasses; using cricketScoreTrack.Interfaces; using System; namespace cricketScoreTrack.Classes { public class AllRounder : Player, IBatter, IBowler { #region Player public override string FirstName { get; set; } public override string LastName { get; set; } public override int Age { get; set; } public override string Bio { get; set; } #endregion #region IBatsman public int BatsmanRuns { get; set; } public int BatsmanBallsFaced { get; set; } public int BatsmanMatch4s { get; set; } public int BatsmanMatch6s { get; set; } #endregion #region IBowler public double BowlerSpeed { get; set; } public string BowlerType { get; set; } public int BowlerBallsBowled { get; set; } public int BowlerMaidens { get; set; } public int BowlerWickets { get; set; } public double BowlerEconomy => BowlerRunsConceded / BowlerOversBowled; public int BowlerRunsConceded { get; set; } public int BowlerOversBowled { get; set; } #endregion } } Looking back at our AllRounder and Batsman classes, the code is clearly simplified. It is definitely more flexible and is starting to look like a well-constructed set of classes. Give your solution a rebuild and make sure that it is all working. So now you know how to simplify your .NET Core applications by applying the Single Responsibility principle. If you found this tutorial helpful and you'd like to learn more, go ahead and pick up the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. What is ASP.NET Core? How to call an Azure function from an ASP.NET Core MVC application How to dockerize an ASP.NET Core application  
Read more
  • 0
  • 0
  • 16646

article-image-ruby-strings
Packt
06 Jul 2017
9 min read
Save for later

Ruby Strings

Packt
06 Jul 2017
9 min read
In this article by Jordan Hudgens, the author of the book Comprehensive Ruby Programming, you'll learn about the Ruby String data type and walk through how to integrate string data into a Ruby program. Working with words, sentences, and paragraphs are common requirements in many applications. Additionally you learn how to: Employ string manipulation techniques using core Ruby methods Demonstrate how to work with the string data type in Ruby (For more resources related to this topic, see here.) Using strings in Ruby A string is a data type in Ruby and contains set of characters, typically normal English text (or whatever natural language you're building your program for), that you would write. A key point for the syntax of strings is that they have to be enclosed in single or double quotes if you want to use them in a program. The program will throw an error if they are not wrapped inside quotation marks. Let's walk through three scenarios. Missing quotation marks In this code I tried to simply declare a string without wrapping it in quotation marks. As you can see, this results in an error. This error is because Ruby thinks that the values are classes and methods. Printing strings In this code snippet we're printing out a string that we have properly wrapped in quotation marks. Please note that both single and double quotation marks work properly. It's also important that you do not mix the quotation mark types. For example, if you attempted to run the code: puts "Name an animal' You would get an error, because you need to ensure that every quotation mark is matched with a closing (and matching) quotation mark. If you start a string with double quotation marks, the Ruby parser requires that you end the string with the matching double quotation marks. Storing strings in variables Lastly in this code snippet we're storing a string inside of a variable and then printing the value out to the console. We'll talk more about strings and string interpolation in subsequent sections. String interpolation guide for Ruby In this section, we are going to talk about string interpolation in Ruby. What is string interpolation? So what exactly is string interpolation? Good question. String interpolation is the process of being able to seamlessly integrate dynamic values into a string. Let's assume we want to slip dynamic words into a string. We can get input from the console and store that input into variables. From there we can call the variables inside of a pre-existing string. For example, let's give a sentence the ability to change based on a user's input. puts "Name an animal" animal = gets.chomp puts "Name a noun" noun= gets.chomp p "The quick brown #{animal} jumped over the lazy #{noun} " Note the way I insert variables inside the string? They are enclosed in curly brackets and are preceded by a # sign. If I run this code, this is what my output will look: So, this is how you insert values dynamically in your sentences. If you see sites like Twitter, it sometimes displays personalized messages such as: Good morning Jordan or Good evening Tiffany. This type of behavior is made possible by inserting a dynamic value in a fixed part of a string and leverages string interpolation. Now, let's use single quotes instead of double quotes, to see what happens. As you'll see, the string was printed as it is without inserting the values for animal and noun. This is exactly what happens when you try using single quotes—it prints the entire string as it is without any interpolation. Therefore it's important to remember the difference. Another interesting aspect is that anything inside the curly brackets can be a Ruby script. So, technically you can type your entire algorithm inside these curly brackets, and Ruby will run it perfectly for you. However, it is not recommended for practical programming purposes. For example, I can insert a math equation, and as you'll see it prints the value out. String manipulation guide In this section we are going to learn about string manipulation along with a number of examples of how to integrate string manipulation methods in a Ruby program. What is string manipulation? So what exactly is string manipulation? It's the process of altering the format or value of a string, usually by leveraging string methods. String manipulation code examples Let's start with an example. Let's say I want my application to always display the word Astros in capital letters. To do that, I simply write: "Astros".upcase Now if I always a string to be in lower case letters I can use the downcase method, like so: "Astros".downcase Those are both methods I use quite often. However there are other string methods available that we also have at our disposal. For the rare times when you want to literally swap the case of the letters you can leverage the swapcase method: "Astros".swapcase And lastly if you want to reverse the order of the letters in the string we can call the reverse method: "Astros".reverse These methods are built into the String data class and we can call them on any string values in Ruby. Method chaining Another neat thing we can do is join different methods together to get custom output. For example, I can run: "Astros".reverse.upcase The preceding code displays the value SORTSA. This practice of combining different methods with a dot is called method chaining. Split, strip, and join guides for strings In this section, we are going to walk through how to use the split and strip methods in Ruby. These methods will help us clean up strings and convert a string to an array so we can access each word as its own value. Using the strip method Let's start off by analyzing the strip method. Imagine that the input you get from the user or from the database is poorly formatted and contains white space before and after the value. To clean the data up we can use the strip method. For example: str = " The quick brown fox jumped over the quick dog " p str.strip When you run this code, the output is just the sentence without the white space before and after the words. Using the split method Now let's walk through the split method. The split method is a powerful tool that allows you to split a sentence into an array of words or characters. For example, when you type the following code: str = "The quick brown fox jumped over the quick dog" p str.split You'll see that it converts the sentence into an array of words. This method can be particularly useful for long paragraphs, especially when you want to know the number of words in the paragraph. Since the split method converts the string into an array, you can use all the array methods like size to see how many words were in the string. We can leverage method chaining to find out how many words are in the string, like so: str = "The quick brown fox jumped over the quick dog" p str.split.size This should return a value of 9, which is the number of words in the sentence. To know the number of letters, we can pass an optional argument to the split method and use the format: str = "The quick brown fox jumped over the quick dog" p str.split(//).size And if you want to see all of the individual letters, we can remove the size method call, like this: p str.split(//) And your output should look like this: Notice, that it also included spaces as individual characters which may or may not be what you want a program to return. This method can be quite handy while developing real-world applications. A good practical example of this method is Twitter. Since this social media site restricts users to 140 characters, this method is sure to be a part of the validation code that counts the number of characters in a Tweet. Using the join method We've walked through the split method, which allows you to convert a string into a collection of characters. Thankfully, Ruby also has a method that does the opposite, which is to allow you to convert an array of characters into a single string, and that method is called join. Let's imagine a situation where we're asked to reverse the words in a string. This is a common Ruby coding interview question, so it's an important concept to understand since it tests your knowledge of how string work in Ruby. Let's imagine that we have a string, such as: str = "backwards am I" And we're asked to reverse the words in the string. The pseudocode for the algorithm would be: Split the string into words Reverse the order of the words Merge all of the split words back into a single string We can actually accomplish each of these requirements in a single line of Ruby code. The following code snippet will perform the task: str.split.reverse.join(' ') This code will convert the single string into an array of strings, for the example it will equal ["backwards", "am", "I"]. From there it will reverse the order of the array elements, so the array will equal: ["I", "am", "backwards"]. With the words reversed, now we simply need to merge the words into a single string, which is where the join method comes in. Running the join method will convert all of the words in the array into one string. Summary In this article, we were introduced to the string data type and how it can be utilized in Ruby. We analyzed how to pass strings into Ruby processes by leveraging string interpolation. We also learned the methods of basic string manipulation and how to find and replace string data. We analyzed how to break strings into smaller components, along with how to clean up string based data. We even introduced the Array class in this article. Resources for Article: Further resources on this subject: Ruby and Metasploit Modules [article] Find closest mashup plugin with Ruby on Rails [article] Building tiny Web-applications in Ruby using Sinatra [article]
Read more
  • 0
  • 0
  • 16622
Modal Close icon
Modal Close icon