
How-To Tutorials


How to construct your own network models on Chainer

Masayuki Takagi
19 Sep 2016
6 min read
In this post, I will introduce you to how to construct your own network model on Chainer. First, I will introduce Chainer's basic concepts as its building blocks. Then, you will see how to construct network models on Chainer. Finally, I will define a multi-class classifier bound to the network model.

Concepts

Let's start with the basic concepts Chainer has as its building blocks:

- Procedural abstractions: chains, links, and functions
- Data abstraction: variables

Chainer has three procedural abstractions, from the highest level down: chains, links, and functions. Chains are at the highest level and represent entire network models. They consist of links and/or other chains. Links are like layers in a network model; they have learnable parameters to be optimized through training. Functions are the most fundamental of Chainer's procedural abstractions. They take inputs and return outputs. Links use functions, applying them to their inputs and parameters to produce their outputs.

At the data abstraction level, Chainer has Variables. They represent the inputs and outputs of chains, links, and functions. As their actual data representation, they wrap NumPy's and CuPy's n-dimensional arrays.

In the following sections, we will construct our own network models with these concepts.

Construct network models on Chainer

In this section, I describe how to construct a multi-layer perceptron (MLP) with three layers on top of the basic concepts shown in the previous section. It is very simple.

    from chainer import Chain
    import chainer.functions as F
    import chainer.links as L

    class MLP(Chain):
        def __init__(self):
            super(MLP, self).__init__(
                l1=L.Linear(784, 100),
                l2=L.Linear(100, 100),
                l3=L.Linear(100, 10),
            )

        def __call__(self, x):
            h1 = F.relu(self.l1(x))
            h2 = F.relu(self.l2(h1))
            y = self.l3(h2)
            return y

We define the MLP class derived from the Chain class. Then we implement two methods, __init__ and __call__.

The __init__ method initializes the links that the chain has. It has three fully connected layers: chainer.links.Linear, or L.Linear above. The first layer, named l1, takes 784-dimension input vectors, which corresponds to a 28x28-pixel grayscale handwritten digit image. The last layer, named l3, returns 10-dimension output vectors, which correspond to the ten digits from 0 to 9. Between them is a middle layer, named l2, whose inputs and outputs are both 100-dimensional.

The __call__ method is called on forward propagation. It takes an input x as a Chainer variable, applies it to the three Linear layers with ReLU activation functions after the l1 and l2 layers, and returns a Chainer variable y that is the output of the l3 layer. Because the two ReLU activations are Chainer functions, they have no learnable parameters internally, so you do not have to initialize them in the __init__ method.

This code does forward propagation as well as network construction behind the scenes. That is Chainer's magic: backward propagation is automatically computed, based on the network constructed here, when we optimize it.

Of course, you may also write the __call__ method as follows; the local variables h1 and h2 are just for clear code.

    def __call__(self, x):
        return self.l3(F.relu(self.l2(F.relu(self.l1(x)))))
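Before moving on to the classifier, it can help to sanity-check the model's shapes. The following snippet is not part of the original post; it is a minimal sketch that assumes Chainer 1.x with NumPy installed, and it simply feeds a random batch through the MLP defined above.

    import numpy as np
    from chainer import Variable

    model = MLP()
    # a fake batch of 32 flattened 28x28 "images"
    x = Variable(np.random.rand(32, 784).astype(np.float32))
    y = model(x)
    print(y.data.shape)  # expected: (32, 10), one 10-dimension output vector per input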
Define a multi-class classifier

Another chain is Classifier, which will be bound to the MLP chain defined in the previous section. Classifier is for general multi-class classification, and it computes the loss value and accuracy of a network model for a given input vector and its ground truth.

    import chainer.functions as F

    class Classifier(Chain):
        def __init__(self, predictor):
            super(Classifier, self).__init__(predictor=predictor)

        def __call__(self, x, t):
            y = self.predictor(x)
            self.loss = F.softmax_cross_entropy(y, t)
            self.accuracy = F.accuracy(y, t)
            return self.loss

We define the Classifier class derived from the Chain class, just like the MLP class, because it takes a chain as its predictor. We similarly implement the __init__ and __call__ methods.

The __init__ method takes a predictor parameter, which is a chain (a link is also acceptable), to initialize this class. The __call__ method takes two inputs, x and t. These are the input vectors and their ground truth, respectively. Chains that are passed to Chainer optimizers should comply with this protocol. First, we give the input vector x to the predictor, which is the MLP model in this post, to get the result y of forward propagation. Then, the loss value and accuracy are computed with the softmax_cross_entropy and accuracy functions for the given ground truth t. Finally, it returns the loss value. The accuracy can be accessed as an attribute at any time.

We initialize this classifier bound to the MLP model as follows; model is what gets passed to a Chainer optimizer.

    model = Classifier(MLP())

Conclusion

Now we have our own network model. In this post I introduced the basic concepts of Chainer. Then we implemented the MLP class, which models a multi-layer perceptron, and the Classifier class, which computes the loss value and accuracy to be optimized for a given ground truth.

    from chainer import Chain
    import chainer.functions as F
    import chainer.links as L

    class MLP(Chain):
        def __init__(self):
            super(MLP, self).__init__(
                l1=L.Linear(784, 100),
                l2=L.Linear(100, 100),
                l3=L.Linear(100, 10),
            )

        def __call__(self, x):
            h1 = F.relu(self.l1(x))
            h2 = F.relu(self.l2(h1))
            y = self.l3(h2)
            return y

    class Classifier(Chain):
        def __init__(self, predictor):
            super(Classifier, self).__init__(predictor=predictor)

        def __call__(self, x, t):
            y = self.predictor(x)
            self.loss = F.softmax_cross_entropy(y, t)
            self.accuracy = F.accuracy(y, t)
            return self.loss

    model = Classifier(MLP())

As some next steps, you may want to learn:

- How to optimize the model.
- How to train and test the model on an MNIST dataset.
- How to accelerate training the model using a GPU.

Chainer provides an MNIST example program in the chainer/examples/mnist directory, which will help you.

About the author

Masayuki Takagi is an entrepreneur and software engineer from Japan. His professional experience domains are advertising and deep learning, serving big Japanese corporations. His personal interests are fluid simulation, GPU computing, FPGA, compilers, and programming language design. Common Lisp is his most beloved programming language and he is the author of the cl-cuda library. Masayuki is a graduate of the University of Tokyo and lives in Tokyo with his buddy Plum, a long coat Chihuahua, and aco.


Setting Up a Network Backup Server with Bacula

Packt
19 Sep 2016
12 min read
In this article by Timothy Boronczyk, the author of the book CentOS 7 Server Management Cookbook, we'll discuss how to set up a network backup server with Bacula. The fact of the matter is that we are living in a world that is becoming increasingly dependent on data. Also, from accidental deletion to a catastrophic hard drive failure, there are many threats to the safety of your data. The more important your data is, and the more difficult it is to recreate if it were lost, the more important it is to have backups. So, this article shows you how you can set up a backup server using Bacula and how to configure other systems on your network to back up their data to it.

Getting ready

This article requires at least two CentOS systems with working network connections. The first system is the local system, which we'll assume has the hostname benito and the IP address 192.168.56.41. The second system is the backup server. You'll need administrative access on both systems, either by logging in with the root account or through the use of sudo.

How to do it…

Perform the following steps on your local system to install and configure the Bacula file daemon:

1. Install the bacula-client package.

    yum install bacula-client

2. Open the file daemon's configuration file with your text editor.

    vi /etc/bacula/bacula-fd.conf

3. In the FileDaemon resource, update the value of the Name directive to reflect the system's hostname with the suffix -fd.

    FileDaemon {
      Name = benito-fd
      ...
    }

4. Save the changes and close the file.

5. Start the file daemon and enable it to start when the system reboots.

    systemctl start bacula-fd.service
    systemctl enable bacula-fd.service

6. Open the firewall to allow TCP traffic through to port 9102.

    firewall-cmd --zone=public --permanent --add-port=9102/tcp
    firewall-cmd --reload

7. Repeat steps 1-6 on each system that will be backed up.

Then perform the following steps on the backup server:

1. Install the bacula-console, bacula-director, bacula-storage, and bacula-client packages.

    yum install bacula-console bacula-director bacula-storage bacula-client

2. Re-link the catalog library to use SQLite database storage. Type 2 when asked to provide the selection number.

    alternatives --config libbaccats.so

3. Create the SQLite database file and import the table schema.

    /usr/libexec/bacula/create_sqlite3_database
    /usr/libexec/bacula/make_sqlite3_tables

4. Open the director's configuration file with your text editor.

    vi /etc/bacula/bacula-dir.conf

5. In the Job resource where Name has the value BackupClient1, change the value of the Name directive to reflect one of the local systems. Then add a Client directive with a value that matches that system's FileDaemon Name.

    Job {
      Name = "BackupBenito"
      Client = benito-fd
      JobDefs = "DefaultJob"
    }

6. Duplicate the Job resource and update its directive values as necessary so that there is a Job resource defined for each system to be backed up.

7. For each system that will be backed up, duplicate the Client resource where the Name directive is set to bacula-fd. In the copied resource, update the Name and Address directives to identify that system.

    Client {
      Name = bacula-fd
      Address = localhost
      ...
    }
    Client {
      Name = benito-fd
      Address = 192.168.56.41
      ...
    }
    Client {
      Name = javier-fd
      Address = 192.168.56.42
      ...
    }

8. Save your changes and close the file.

9. Open the storage daemon's configuration file.

    vi /etc/bacula/bacula-sd.conf

10. In the Device resource where Name has the value FileStorage, change the value of the Archive Device directive to /bacula.
    Device {
      Name = FileStorage
      Media Type = File
      Archive Device = /bacula
      ...
    }

11. Save the update and close the file.

12. Create the /bacula directory and assign it the proper ownership.

    mkdir /bacula
    chown bacula:bacula /bacula

13. If you have SELinux enabled, reset the security context on the new directory.

    restorecon -Rv /bacula

14. Start the director and storage daemons and enable them to start when the system reboots.

    systemctl start bacula-dir.service bacula-sd.service bacula-fd.service
    systemctl enable bacula-dir.service bacula-sd.service bacula-fd.service

15. Open the firewall to allow TCP traffic through to ports 9101-9103.

    firewall-cmd --zone=public --permanent --add-port=9101-9103/tcp
    firewall-cmd --reload

16. Launch Bacula's console interface.

    bconsole

17. Enter label to create a destination for the backup. When prompted for the volume name, use Volume0001 or a similar value. When prompted for the pool, select the File pool.

18. Enter quit to leave the console interface.

How it works…

Configuring Bacula can be a daunting task because of the suite's distributed architecture and the amount of flexibility it offers us. However, once you have everything up and running, you'll be able to rest easy knowing that your data is safe from disasters and accidents.

Bacula is broken up into several components. In this article, our efforts centered on the following three daemons: the director, the file daemon, and the storage daemon. The file daemon is installed on each local system to be backed up and listens for connections from the director. The director connects to each file daemon as scheduled and tells it which files to back up and where to copy them to (the storage daemon). This allows us to perform all scheduling at a central location. The storage daemon then receives the data and writes it to the backup medium, for example, a disk or tape drive.

On the local system, we installed the file daemon with the bacula-client package and edited the file daemon's configuration file at /etc/bacula/bacula-fd.conf to specify the name of the process. The convention is to add the suffix -fd to the system's hostname.

    FileDaemon {
      Name = benito-fd
      FDPort = 9102
      WorkingDirectory = /var/spool/bacula
      Pid Directory = /var/run
      Maximum Concurrent Jobs = 20
    }

On the backup server, we installed the bacula-console, bacula-director, bacula-storage, and bacula-client packages. This gives us the director and storage daemon and another file daemon. This file daemon's purpose is to back up Bacula's catalog.

Bacula maintains a database of metadata about previous backup jobs called the catalog, which can be managed by MySQL, PostgreSQL, or SQLite. To support multiple databases, Bacula is written so that all of its database access routines are contained in shared libraries, with a different library for each database. When Bacula wants to interact with a database, it does so through libbaccats.so, a fake library that is nothing more than a symbolic link pointing to one of the specific database libraries. This lets Bacula support different databases without requiring us to recompile its source code. To create the symbolic link, we used alternatives and selected the real library that we want to use. I chose SQLite since it's an embedded database library and doesn't require additional services.

Next, we needed to initialize the database schema using scripts that come with Bacula. If you want to use MySQL, you'll need to create a dedicated MySQL user for Bacula to use and then initialize the schema with the following scripts instead.
You'll also need to review Bacula's configuration files to provide Bacula with the required MySQL credentials.

    /usr/libexec/bacula/grant_mysql_privileges
    /usr/libexec/bacula/create_mysql_database
    /usr/libexec/bacula/make_mysql_tables

Different resources are defined in the director's configuration file at /etc/bacula/bacula-dir.conf, many of which consist not only of their own values but also reference other resources. For example, the FileSet resource specifies which files are included or excluded in backups and restores, while a Schedule resource specifies when backups should be made. A JobDef resource can contain various configuration directives that are common to multiple backup jobs and also reference particular FileSet and Schedule resources. Client resources identify the names and addresses of systems running file daemons, and a Job resource pulls together a JobDef and a Client resource to define the backup or restore task for a particular system. Some resources define things at a more granular level and are used as building blocks to define other resources. This allows us to create complex definitions in a flexible manner.

The default resource definitions outline basic backup and restore jobs that are sufficient for this article (you'll want to study the configuration and see how the different resources fit together so that you can tweak them to better suit your needs). We customized the existing backup Job resource by changing its name and client. Then, we customized the Client resource by changing its name and address to point to a specific system running a file daemon. A pair of Job and Client resources can be duplicated for each additional system you want to back up. However, notice that I left the default Client resource that defines bacula-fd for the localhost. This is for the file daemon that's local to the backup server and will be the target for things such as restore jobs and catalog backups.

    Job {
      Name = "BackupBenito"
      Client = benito-fd
      JobDefs = "DefaultJob"
    }

    Job {
      Name = "BackupJavier"
      Client = javier-fd
      JobDefs = "DefaultJob"
    }

    Client {
      Name = bacula-fd
      Address = localhost
      ...
    }

    Client {
      Name = benito-fd
      Address = 192.168.56.41
      ...
    }

    Client {
      Name = javier-fd
      Address = 192.168.56.42
      ...
    }

To complete the setup, we labeled a backup volume. This task, as with most others, is performed through bconsole, a console interface to the Bacula director. We used the label command to specify a label for the backup volume and, when prompted for the pool, we assigned the labeled volume to the File pool. In a way very similar to how LVM works, an individual device or storage unit is allocated as a volume and the volumes are grouped into storage pools. If a pool contains two volumes backed by tape drives, for example, and one of the drives is full, the storage daemon will write the data to the tape that has space available. Even though in our configuration we're storing the backup to disk, we still need to create a volume as the destination for data to be written to.
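To make the scheduling side of this more concrete, a Schedule resource in bacula-dir.conf looks roughly like the following. This sketch is not taken from the recipe above; it mirrors the kind of default WeeklyCycle definition Bacula ships with, so treat the names and times as illustrative and check your own bacula-dir.conf for the exact directives.

    Schedule {
      Name = "WeeklyCycle"
      Run = Full 1st sun at 23:05
      Run = Differential 2nd-5th sun at 23:05
      Run = Incremental mon-sat at 23:05
    }

A JobDefs resource (such as DefaultJob) can then reference this Schedule by name, which is how the weekly full plus daily incremental strategy described below is typically wired up.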
There's more…

At this point, you should consider which backup strategy works best for you. A full backup is a complete copy of your data, a differential backup captures only the files that have changed since the last full backup, and an incremental backup copies the files that have changed since the last backup (regardless of the type of that backup).

Commonly, administrators employ a combination of these, perhaps making a full backup at the start of the week and then differential or incremental backups each day thereafter. This saves storage space, because the differential and incremental backups are smaller, and it is also convenient when the need to restore a file arises, because only a limited number of backups need to be searched for the file.

Another consideration is the expected size of each backup and how long it will take for the backup to run to completion. Full backups obviously take longer to run, and in an office with 9-5 working hours Monday through Friday, it may not be possible to run a full backup during the evenings. Performing the full backup on Fridays gives it the weekend to run, and smaller incremental backups can be performed on the other days when time is more limited.

Yet another important point in your backup strategy is how long the backups will be kept and where they will be kept. A year's worth of backups is of no use if your office burns down and they were sitting in the office's IT closet. At one employer, we kept the last full backup and the last day's incremental on site; they were then duplicated to tape and stored off site.

Regardless of the strategy you choose to implement, your backups are only as good as your ability to restore data from them. You should periodically test your backups to make sure you can restore your files.

To run a backup job on demand, enter run in bconsole. You'll be prompted with a menu to select one of the currently configured jobs. You'll then be presented with the job's options, such as what level of backup will be performed (full, incremental, or differential), its priority, and when it will run. You can type yes or no to accept or cancel it, or mod to modify a parameter. Once accepted, the job will be queued and assigned a job ID.

To restore files from a backup, use the restore command. You'll be presented with a list of options allowing you to specify which backup the desired files will be retrieved from. Depending on your selection, the prompts will be different. Bacula's prompts are rather clear, so read them carefully and they will guide you through the process.

Apart from the run and restore commands, another useful command is status. It allows you to see the current status of the Bacula components, whether there are any jobs currently running, and which jobs have completed. A full list of commands can be retrieved by typing help in bconsole.

See also

For more information on working with Bacula, refer to the following resources:

- Bacula documentation (http://blog.bacula.org/documentation/)
- How to use Bacula on CentOS 7 (http://www.digitalocean.com/community/tutorial_series/how-to-use-bacula-on-centos-7)
- Bacula Web, a web-based reporting and monitoring tool for Bacula (http://www.bacula-web.org/)

Summary

In this article, we discussed how to set up a backup server using Bacula and how to configure other systems on our network to back up their data to it.


Making History with Event Sourcing

Packt
15 Sep 2016
18 min read
In this article by Christian Baxter, author of the book Mastering Akka, we will see that when it comes to the persistence needs of an application, the most common, tried, and true approach is to model the data in a relational database. Following this approach has been the de facto way to store data until recently, when NoSQL (and to a lesser extent NewSQL) started to chip away at the footholds of relational database dominance. There's nothing wrong with storing your application's data this way; it's how we initially chose to do so for the bookstore application, using PostgreSQL as the storage engine. This article deals with event sourcing and how to implement that approach using Akka Persistence.

Akka persistence for event sourcing

Akka persistence is a relatively newer module within the Akka toolkit. It became available as experimental in the 2.3.x series. Throughout that series, it went through quite a few changes as the team worked on getting the API and functionality right. When Akka 2.4.2 was released, the experimental label was removed, signifying that persistence was stable and ready to be leveraged in production code.

Akka persistence allows stateful actors to persist their internal state. It does this not by persisting the state itself, but by persisting the changes to that state. It uses an append-only model to persist these state changes, allowing you to later reconstitute the state by replaying the changes to that state. It also allows you to take periodic snapshots and use those to reestablish an actor's state, as a performance optimization for long-lived entities with lots of state changes.

Akka persistence's approach should certainly sound familiar, as it's almost a direct overlay to the features of event sourcing. In fact, it was inspired by the eventsourced Scala library, so that overlay is no coincidence. Because of this alignment with event sourcing, Akka persistence will be the perfect tool for us to switch over to an event sourced model. Before getting into the details of the refactor, I want to describe some of the high-level concepts in the framework.

The PersistentActor trait

The PersistentActor trait is the core building block for creating event sourced entities. This actor is able to persist its events to a pluggable journal. When a persistent actor is restarted (reloaded), it will replay its journaled events to reestablish its current internal state. These two behaviors perfectly fit what we need to do for our event sourced entities, so this will be our core building block.

The PersistentActor trait has a lot of features, more than I will cover in the next few sections. I'll cover the things that we will use in the bookstore refactoring, which I consider to be the most useful features in PersistentActor. If you want to learn more, then I suggest you take a look at the Akka documentation, as it pretty much covers everything else that you can do with PersistentActor.

Persistent actor state handling

A PersistentActor implementation has two basic states that it can be in: Recovering and Receiving Commands. When Recovering, it's in the process of reloading its event stream from the journal to rebuild its internal state. Any external messages that come in during this time will be stashed until the recovery process is complete. Once the recovery process completes, the persistent actor transitions into the Receiving Commands state, where it can start to handle commands.
These commands can then generate new events that can further modify the state of this entity. Both of these states are represented by custom actor receive handling partial functions. You must provide implementations for both of the following vals in order to properly implement these two states for your persistent actor:

    val receiveRecover: Receive = { . . . }
    val receiveCommand: Receive = { . . . }

While in the recovering state, there are two possible kinds of message that you need to be able to handle. The first is one of the event types that you previously persisted for this entity type. When you get that type of message, you have to reapply the change implied by that event to the internal state of the actor. For example, if we had a SalesOrderFO fields object as the internal state, and we received a replayed event indicating that the order was approved, the handling might look something like this:

    var state: SalesOrderFO = ...

    val receiveRecover: Receive = {
      case OrderApproved(id) =>
        state = state.copy(status = SalesOrderStatus.Approved)
    }

We'd, of course, need to handle a lot more than that one event. This code sample is just to show you how you can modify the internal state of a persistent actor when it's being recovered.

Once the actor has completed the recovery process, it can transition into the state where it starts to handle incoming command requests. Event sourcing is all about action (command) and reaction (events). When the persistent actor receives a command, it has the option to generate zero to many events as a result of that command. These events represent a happening on that entity that will affect its current state. Events you receive while in the Recovering state will have been previously generated while in the Receiving Commands state. So, the preceding example, where we receive OrderApproved, must have previously come from some command that we handled earlier. The handling of that command could have looked something like this:

    val receiveCommand: Receive = {
      case ApproveOrder(id) =>
        persist(OrderApproved(id)){ event =>
          state = state.copy(status = SalesOrderStatus.Approved)
          sender() ! FullResult(state)
        }
    }

After receiving the command request to change the order status to approved, the code makes a call to persist, which will asynchronously write an event into the journal. The full signature for persist is:

    persist[A](event: A)(handler: (A) ⇒ Unit): Unit

The first argument represents the event that you want to write to the journal. The second argument is a callback function that will be executed after the event has been successfully persisted (and won't be called at all if the persistence fails). For our example, we use that callback function to mutate the internal state, updating the status field to match the requested action.

One thing to note is that writing to the journal is asynchronous. One may then think that it's possible to be closing over that internal state in an unsafe way when the callback function is executed. If you persisted two events in rapid succession, couldn't it be possible for both callbacks to access that internal state at the same time in separate threads, kind of like when using Futures in an actor? Thankfully, this is not the case. The completion of a persistence action is sent back as a new message to the actor. The hidden receive handling for this message will then invoke the callback associated with that persistence action.
By using the mailbox again, we know these post-persistence actions will be executed one at a time, in a safe manner. As an added bonus, the sender associated with those post-persistence messages will be the original sender of the command, so you can safely use sender() in a persistence callback to reply to the original requestor, as shown in my example.

Another guarantee that the persistence framework makes when persisting events is that no other commands will be processed between the persistence action and the associated callback. Any commands that come in during that time will be stashed until all of the post-persistence actions have been completed. This makes the persist/callback sequence atomic and isolated, in that nothing else can interfere with it while it's happening. Allowing additional commands to be executed during this process could lead to an inconsistent state and response to the caller who sent the commands.

If, for some reason, persisting to the journal fails, there is an onPersistFailure callback that will be invoked. If you want to implement custom handling for this, you can override this method. No matter what, when persistence fails, the actor will be stopped after making this callback. At this point, it's possible that the actor is in an inconsistent state, so it's safer to stop it than to allow it to continue on in this state. Persistence failures probably mean something is failing with the journal anyway, so restarting, as opposed to stopping, will more than likely lead to even more failures.

There's one more callback that you can implement in your persistent actors, and that's onPersistRejected. This will happen if the serialization framework rejects the serialization of the event to store. When this happens, the persist callback does not get invoked, so no internal state update will happen. In this case, the actor does not stop or restart, because it's not in an inconsistent state and the journal itself is not failing.

The PersistenceId

Another important concept that you need to understand with PersistentActor is the persistenceId method. This abstract method must be defined for every type of PersistentActor you define, returning a String that is unique across different entity types and also between actor instances within the same type. Let's say I create the Book entity as a PersistentActor and define the persistenceId method as follows:

    override def persistenceId = "book"

If I do that, then I will have a problem with this entity, in that every instance will share the same event stream with every other Book instance. If I want each instance of the Book entity to have its own separate event stream (and trust me, you will), then I will do something like this when defining the Book PersistentActor:

    class Book(id: Int) extends PersistentActor {
      override def persistenceId = s"book-$id"
    }

If I follow an approach like this, then I can be assured that each of my entity instances will have its own separate event stream, as the persistenceId will be unique for every Int keyed book we have.

In the current model, when creating a new instance of an entity, we pass in the special ID of 0 to indicate that this entity does not yet exist and needs to be persisted. We defer ID creation to the database, and once we have an ID (after persistence), we stop that actor instance, as it is not properly associated with that newly generated ID.
With the persistenceId model of associating the event stream to an entity, we will need the ID as soon as we create the actor instance. This means we will need a way to have a unique identifier even before persisting the initial entity state. This is something to think about before we get to the upcoming refactor.

Taking snapshots for faster recovery

I've mentioned the concept of taking a snapshot of the current state of an entity to speed up the process of recovering its state. If you have a long-lived entity that has generated a large number of events, it will take progressively more and more time to recover its state. Akka's PersistentActor supports the snapshot concept, putting it in your hands as to when to take the snapshot. Once you have taken snapshots, the latest one will be offered to the entity during the recovery phase instead of all of the events that led up to it. This reduces the total number of events to process to recover state, thus speeding up that process.

This is a two-part process, with the first part being taking snapshots periodically and the second being handling them during the recovery phase. Let's take a look at the snapshot taking process first. Let's say that you coded a particular entity to save a new snapshot for every one hundred events received. To make this happen, your command handling block may look something like this:

    var eventTotal = ...

    val receiveCommand: Receive = {
      case UpdateStatus(status) =>
        persist(StatusUpdated(status)){ event =>
          state = state.copy(status = event.status)
          eventTotal += 1
          if (eventTotal % 100 == 0)
            saveSnapshot(state)
        }
      case SaveSnapshotSuccess(metadata) => . . .
      case SaveSnapshotFailure(metadata, reason) => . . .
    }

You can see in the post-persist logic that when we make the call to saveSnapshot, we pass the latest version of the actor's internal state. You're not limited to doing this in the post-persist logic in reaction to a new event; you can also set up the actor to save a snapshot at regular intervals. You can leverage Akka's scheduler to send a special message to the entity to instruct it to save the snapshot periodically (see the sketch at the end of this section).

If you start saving snapshots, then you will have to start handling the two new messages that will be sent to the entity indicating the status of the saved snapshot. These two new message types are SaveSnapshotSuccess and SaveSnapshotFailure. The metadata that appears on both messages will tell you things such as the persistence ID where the failure occurred, the sequence number of the snapshot that failed, and the timestamp of the failure. You can see these two new messages in the command handling block shown in the preceding code.

Once you have saved a snapshot, you will need to start handling it in the recovery phase. The logic to handle a snapshot during recovery will look like the following code block:

    val receiveRecover: Receive = {
      case SnapshotOffer(metadata, offeredSnapshot) =>
        state = offeredSnapshot
      case event => . . .
    }

Here, you can see that if we get a snapshot during recovery, instead of just making an incremental change, as we do with real replayed events, we set the entire state to whatever the offered snapshot is. There may be hundreds of events that led up to that snapshot, but all we need to handle here is one message in order to wind the state forward to when we took that snapshot. This process will certainly pay dividends if we have lots of events for this entity and we continue to take periodic snapshots.
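The scheduler-driven variant mentioned above could look something like the following sketch. It is not from the book: the SaveSnapshotNow trigger message, the one-hour interval, and the String-typed state are all made up for illustration, and the sketch assumes the standard Akka scheduler and persistence APIs.

    import akka.persistence.{PersistentActor, SnapshotOffer, SaveSnapshotSuccess, SaveSnapshotFailure}
    import scala.concurrent.duration._

    case object SaveSnapshotNow  // hypothetical trigger message

    class MyEntity extends PersistentActor {
      override def persistenceId = "my-entity"
      private var state: String = ""  // placeholder state for the sketch

      import context.dispatcher
      // every hour, remind ourselves to save a snapshot (interval chosen arbitrarily)
      context.system.scheduler.schedule(1.hour, 1.hour, self, SaveSnapshotNow)

      val receiveRecover: Receive = {
        case SnapshotOffer(_, snapshot: String) => state = snapshot
        case _ => // replayed events would be handled here
      }

      val receiveCommand: Receive = {
        case SaveSnapshotNow => saveSnapshot(state)
        case SaveSnapshotSuccess(_) | SaveSnapshotFailure(_, _) => // log or ignore
      }
    }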
One thing to note about snapshots is that you will only ever be offered the latest snapshot (per persistence id) during the recovery process. Even though I'm taking a new snapshot every 100 events, I will only ever be offered one, the latest one, during the recovery phase.

Another thing to note is that there is no real harm in losing a snapshot. If your snapshot storage was wiped out for some reason, the only negative side effect is that you'll be stuck processing all of the events for an entity when recovering it. When you take snapshots, you don't lose any of the event history. Snapshots are completely supplemental and only benefit the performance of the recovery phase. You don't need to take them, and you can live without them if something happens to the ones you had taken.

Serialization of events and snapshots

In both the persistence and snapshot examples, you can see I was passing objects into the persist and saveSnapshot calls. So, how are these objects marshaled to and from a format that can actually be written to those stores? The answer is via Akka serialization.

Akka persistence depends on Akka serialization to convert event and snapshot objects to and from a binary format that can be saved into a data store. If you don't make any changes to the default serialization configuration, then your objects will be converted into binary via Java serialization. Java serialization is both slow and inefficient in terms of the size of the serialized object. It's also not flexible when the object definition changes after the binary was produced and you try to read it back in. It's not a good choice for our needs with our event sourced app.

Luckily, Akka serialization allows you to provide your own custom serializers. If you, perhaps, wanted to use JSON as your serialized object representation, then you can pretty easily build a custom serializer to do that. There is also a built-in Google Protobuf serializer that can convert your Protobuf binding classes into their binary format. We'll explore both custom serializers and the Protobuf serializer when we get into the sections dealing with the refactors.

The AsyncWriteJournal

Another important component in Akka persistence, which I've mentioned a few times already, is the AsyncWriteJournal. This component is an append-only data store that stores the sequence of events (per persistence id) a PersistentActor generates via calls to persist. The journal also stores the highestSequenceNr per persistence id, which tracks the total number of persisted events for that persistence id.

The journal is a pluggable component. You have the ability to configure the default journal and also to override it on a per-entity basis. The default configuration for Akka does not provide a value for the journal to use, so you must either configure this setting or add a per-entity override (more on that in a moment) in order to start using persistence. If you want to set the default journal, then it can be set in your config with the following property:

    akka.persistence.journal.plugin = "akka.persistence.journal.leveldb"

The value in the preceding code must be the fully qualified path to another configuration section of the same name where the journal plugin's config lives. For this example, I set it to the already provided leveldb config section (from Akka's reference.conf).
If you want to override the journal plugin for a particular entity instance only, then you can do so by overriding the journalPluginId method on that entity actor, as follows:

    class MyEntity extends PersistentActor {
      override def journalPluginId = "my-other-journal"
      . . .
    }

The same rules apply here: my-other-journal must be the fully qualified name of a config section where the config for that plugin lives.

My example config showed the use of the leveldb plugin, which writes to the local file system. If you actually want to play around with this simple plugin, then you will also need to add the following dependencies to your sbt file:

    "org.iq80.leveldb" % "leveldb" % "0.7"
    "org.fusesource.leveldbjni" % "leveldbjni-all" % "1.8"

If you want to use something different, then you can check the community plugins page on the Akka site to find one that suits your needs. For our app, we will use the Cassandra journal plugin. I'll show you how to set up the config for that in the section dealing with the installation of Cassandra (a brief sketch of what that configuration looks like appears at the end of this section).

The SnapshotStore

The last thing I want to cover before we start the refactoring process is the SnapshotStore. Like the AsyncWriteJournal, the SnapshotStore is a pluggable and configurable storage system, but this one stores just snapshots, as opposed to the entire event stream for a persistence id. As I mentioned earlier, you don't need snapshots, and you can survive if the storage system you used for them gets wiped out for some reason. Because of this, you may consider using a separate storage plugin for them.

When selecting the storage system for your events, you need something that is robust, distributed, highly available, fault tolerant, and backup capable. If you lose these events, you lose the entire data set for your application. The same is not true for snapshots, so take that into consideration when selecting the storage. You may decide to use the same system for both, but you certainly don't have to. Also, not every journal plugin can act as a snapshot plugin; so, if you decide to use the same for both, make sure that the journal plugin you select can handle snapshots.

If you want to configure the snapshot store, then the config setting to do that is as follows:

    akka.persistence.snapshot-store.plugin = "my-snapshot-plugin"

The setting here follows the same rules as the write journal; the value must be the fully qualified name of a config section where the plugin's config lives. If you want to override the default setting on a per-entity basis, then you can do so by overriding the snapshotPluginId method on your actor like this:

    class MyEntity extends PersistentActor {
      override def snapshotPluginId = "my-other-snap-plugin"
      . . .
    }

The same rules apply here as well: the value must be a fully qualified path to a config section where the plugin's config lives. Also, there are no out-of-the-box default settings for the snapshot store, so if you want to use snapshots, you must either set the appropriate setting in your config or provide the earlier mentioned override on a per-entity basis.

For our needs, we will use the same storage mechanism, Cassandra, for both the write journal and the snapshot storage. We have a multi-node system currently, so using something that writes to the local file system, or a simple in-memory plugin, won't work for us.
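For reference, pointing both stores at the Cassandra plugin mentioned above usually comes down to two configuration lines like the ones below. This is a sketch rather than the book's exact setup: the section names cassandra-journal and cassandra-snapshot-store are the ones used by the 0.x releases of the akka-persistence-cassandra plugin, so verify them against the documentation of the plugin version you actually use.

    # application.conf
    akka.persistence.journal.plugin = "cassandra-journal"
    akka.persistence.snapshot-store.plugin = "cassandra-snapshot-store"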
Summary

In this article, you learned about Akka persistence for event sourcing, and how taking a snapshot of the current state of an entity can speed up the process of recovering its state.


It's All About Data

Packt
14 Sep 2016
12 min read
In this article by Samuli Thomasson, the author of the book Haskell High Performance Programming, we will learn how to choose and design optimal data structures in applications. You will be able to drop the level of abstraction in slow parts of code, all the way down to mutable data structures if necessary.

Annotating strictness and unpacking datatype fields

We used the BangPatterns extension to make function arguments strict:

    {-# LANGUAGE BangPatterns #-}

    f !s (x:xs) = f (s + 1) xs
    f !s _      = s

Using bangs for annotating strictness in fact predates the BangPatterns extension (and the older compiler flag -fbang-patterns in GHC 6.x). With just plain Haskell98, we are allowed to use bangs to make datatype fields strict:

    > data T = T !Int

A bang in front of a field ensures that whenever the outer constructor (T) is in WHNF, the inner field is in WHNF as well. We can check this:

    > T undefined `seq` ()
    *** Exception: Prelude.undefined

There are no restrictions on which fields can be strict, be they recursive or polymorphic, although it rarely makes sense to make recursive fields strict. Consider the fully strict linked list:

    data List a = List !a !(List a) | ListEnd

With this much strictness, you cannot represent parts of infinite lists without always requiring infinite space. Moreover, before accessing the head of a finite strict list you must evaluate the list all the way to the last element. Strict lists don't have the streaming property of lazy lists.

By default, all data constructor fields are pointers to other data constructors or primitives, regardless of their strictness. This applies to basic data types such as Int, Double, Char, and so on, which are not primitive in Haskell. They are data constructors over their primitive counterparts Int#, Double#, and Char#:

    > :info Int
    data Int = GHC.Types.I# GHC.Prim.Int#

There is a performance overhead, the size of a pointer dereference, between types such as Int and Int#, but an Int can represent lazy values (called thunks), whereas primitives cannot. Without thunks, we couldn't have lazy evaluation. Luckily, GHC is intelligent enough to unroll wrapper types as primitives in many situations, completely eliminating indirect references.

The hash suffix is specific to GHC and always denotes a primitive type. The GHC modules do expose the primitive interface. Programming with primitives, you can further micro-optimize code and get C-like performance. However, several limitations and drawbacks apply.

Using anonymous tuples

Tuples may seem harmless at first; they just lump a bunch of values together. But note that the fields in a tuple aren't strict, so a two-tuple corresponds to the slowest PairP data type from our previous benchmark. If you need a strict Tuple type, you need to define one yourself. This is also one more reason to prefer custom types over nameless tuples in many situations. These two structurally similar tuple types have widely different performance semantics:

    data Tuple  = Tuple  {-# UNPACK #-} !Int {-# UNPACK #-} !Int
    data Tuple2 = Tuple2 {-# UNPACK #-} !(Int, Int)

If you really want unboxed anonymous tuples, you can enable the UnboxedTuples extension and write things with types like (# Int#, Char# #). But note that a number of restrictions apply to unboxed tuples, as they do to all primitives. The most important restriction is that unboxed types may not occur where polymorphic types or values are expected, because polymorphic values are always considered as pointers.
Representing bit arrays

One way to define a bit array in Haskell that still retains the convenience of Bool is:

    import Data.Array.Unboxed

    type BitArray = UArray Int Bool

This representation packs the Bools as bits, 8 per byte, so it's space efficient. See the following section on arrays in general to learn about time efficiency; for now we only note that BitArray is an immutable data structure, like BitStruct, and that copying small BitStructs is cheaper than copying BitArrays due to overheads in UArray.

Consider a program that processes a list of integers and tells whether the counts of numbers divisible by 2, 3, and 5 are even or odd. We can implement this with simple recursion and a three-bit accumulator. Here are three alternative representations for the accumulator:

    type BitTuple = (Bool, Bool, Bool)
    data BitStruct = BitStruct !Bool !Bool !Bool deriving Show
    type BitArray = UArray Int Bool

And the program itself is defined along these lines:

    go :: acc -> [Int] -> acc
    go acc [] = acc
    go (two three five) (x:xs) =
        go ((test 2 x `xor` two) (test 3 x `xor` three) (test 5 x `xor` five)) xs

    test n x = x `mod` n == 0

I've omitted the details here. They can be found in the bitstore.hs file.

The fastest variant is BitStruct, then comes BitTuple (30% slower), and BitArray is the slowest (130% slower than BitStruct). Although BitArray is the slowest (due to making a copy of the array on every iteration), it would be easy to scale the array in size or make it dynamic. Note also that this benchmark is really on the extreme side; normally programs do a bunch of other stuff besides updating an array in a tight loop. If you need fast array updates, you can resort to the mutable arrays discussed later on.

It might also be tempting to use Data.Vector.Unboxed.Vector Bool from the vector package, due to its nice interface. But beware that that representation uses one byte for every Bool, wasting 7 bits per element.

Mutable references are slow

Data.IORef and Data.STRef are the smallest bits of mutable state: references to mutable variables, one for IO and the other for ST. There is also a Data.STRef.Lazy module, which provides a wrapper over strict STRef for lazy ST.

However, because IORef and STRef are references, they imply a level of indirection. GHC intentionally does not optimize it away, as that would cause problems in concurrent settings. For this reason, IORef or STRef shouldn't be used like variables in C, for example. Performance will surely be very bad. Let's verify the performance hit by considering the following ST-based sum-of-range implementation:

    -- file: sum_mutable.hs
    import Control.Monad.ST
    import Data.STRef

    count_st :: Int -> Int
    count_st n = runST $ do
        ref <- newSTRef 0
        let go 0 = readSTRef ref
            go i = modifySTRef' ref (+ i) >> go (i - 1)
        go n

And compare it to this pure recursive implementation:

    count_pure :: Int -> Int
    count_pure n = go n 0
      where
        go 0 s = s
        go i s = go (i - 1) $! (s + i)

The ST implementation is many times slower when at least -O is enabled. Without optimizations, the two functions are more or less equivalent in performance; there is a similar amount of indirection from not unboxing arguments in the latter version. This is one example of the wonders that can be done to optimize referentially transparent code.
Bubble sort with vectors

Bubble sort is not an efficient sort algorithm, but because it's an in-place algorithm and simple, we will implement it as a demonstration of mutable vectors:

    -- file: bubblesort.hs
    import Control.Monad.ST
    import Data.Vector as V
    import Data.Vector.Mutable as MV
    import System.Random (randomIO) -- for testing

The (naive) bubble sort compares values at all adjacent indices in order, and swaps the values if necessary. After reaching the last element, it starts from the beginning or, if no swaps were made, the list is sorted and the algorithm is done:

    bubblesortM :: (Ord a, PrimMonad m) => MVector (PrimState m) a -> m ()
    bubblesortM v = loop
      where
        indices = V.fromList [1 .. MV.length v - 1]
        loop = do
            swapped <- V.foldM' f False indices        -- (1)
            if swapped then loop else return ()        -- (2)
        f swapped i = do                               -- (3)
            a <- MV.read v (i-1)
            b <- MV.read v i
            if a > b
                then MV.swap v (i-1) i >> return True
                else return swapped

At (1), we fold monadically over all but the last index, keeping state about whether or not we have performed a swap in this iteration. If we had, at (2) we rerun the fold or, if not, we can return. At (3) we compare an index and possibly swap values. We can write a pure function that wraps the stateful algorithm:

    bubblesort :: Ord a => Vector a -> Vector a
    bubblesort v = runST $ do
        mv <- V.thaw v
        bubblesortM mv
        V.freeze mv

V.thaw and V.freeze (both O(n)) can be used to go back and forth between mutable and immutable vectors. Now, there are multiple code optimization opportunities in our implementation of bubble sort. But before tackling those, let's see how well our straightforward implementation fares using the following main:

    main = do
        v <- V.generateM 10000 $ \_ -> randomIO :: IO Double
        let v_sorted = bubblesort v
            median = v_sorted ! 5000
        print median

We should remember to compile with -O2. On my machine, this program takes about 1.55s, and the Runtime System reports 99.9% productivity, 18.7 megabytes of allocated heap, and 570 kilobytes copied during GC. So now, with a baseline, let's see if we can squeeze out more performance from vectors. This is a non-exhaustive list (a rough sketch of the first two changes appears at the end of this section):

- Use unboxed vectors instead. This restricts the types of elements we can store, but it saves us a level of indirection. Down to 960ms and approximately halved GC traffic.
- Large lists are inefficient, and they don't compose with vector stream fusion. We should change indices so that it uses V.enumFromTo instead (alternatively, turn on the OverloadedLists extension and drop V.fromList). Down to 360ms and 94% less GC traffic.
- The conversion functions V.thaw and V.freeze are O(n), that is, they modify copies. Using the in-place V.unsafeThaw and V.unsafeFreeze instead is sometimes useful. V.unsafeFreeze in the bubblesort wrapper is completely safe, but V.unsafeThaw is not. In our example, however, with -O2, the program is optimized into a single loop and all those conversions get eliminated.
- The vector operations (V.read, V.swap) in bubblesortM are guaranteed to never be out of bounds, so it's perfectly safe to replace them with unsafe variants (V.unsafeRead, V.unsafeSwap) that don't check bounds. Speed-up of about 25 milliseconds, or 5%.

To summarize, by applying good practices and safe usage of unsafe functions, our bubble sort just got 80% faster. These optimizations are applied in the bubblesort-optimized.hs file (omitted here). We noticed that almost all GC traffic came from a linked list, which was constructed and immediately consumed. Lists are bad for performance in that they don't fuse like vectors.
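To make the first two optimizations above concrete, here is a rough sketch of what they look like in code. This is not the book's bubblesort-optimized.hs; it only shows the switch to unboxed vectors and the fused V.enumFromTo index range, and it assumes the vector and primitive packages are available.

    import Control.Monad.Primitive (PrimMonad, PrimState)
    import qualified Data.Vector.Unboxed as V          -- unboxed instead of boxed
    import qualified Data.Vector.Unboxed.Mutable as MV

    bubblesortM :: (Ord a, V.Unbox a, PrimMonad m) => MV.MVector (PrimState m) a -> m ()
    bubblesortM v = loop
      where
        indices = V.enumFromTo 1 (MV.length v - 1)     -- fuses; no intermediate list
        loop = do
            swapped <- V.foldM' f False indices
            if swapped then loop else return ()
        f swapped i = do
            a <- MV.read v (i - 1)
            b <- MV.read v i
            if a > b
                then MV.swap v (i - 1) i >> return True
                else return swapped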
To ensure good vector performance, make sure that the fusion framework can work effectively: anything that can be done with a vector should be done with a vector. As a final note, when working with vectors (and other libraries) it's a good idea to keep the Haddock documentation handy. There are several big and small performance choices to be made; often the difference is between O(n) and O(1).

Speedup via continuation-passing style

Implementing monads in continuation-passing style (CPS) can have very good results. Unfortunately, no widely used or supported library I'm aware of provides drop-in replacements for the ubiquitous Maybe, List, Reader, Writer, and State monads. It's not that hard to implement the standard monads in CPS from scratch, though. For example, the State monad can be implemented using the Cont monad from mtl as follows:

    -- file: cont-state-writer.hs
    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    {-# LANGUAGE MultiParamTypeClasses #-}
    {-# LANGUAGE FlexibleInstances #-}
    {-# LANGUAGE FlexibleContexts #-}

    import Control.Monad.State.Strict
    import Control.Monad.Cont

    newtype StateCPS s r a = StateCPS (Cont (s -> r) a)
        deriving (Functor, Applicative, Monad, MonadCont)

    instance MonadState s (StateCPS s r) where
        get          = StateCPS $ cont $ \next curState -> next curState curState
        put newState = StateCPS $ cont $ \next curState -> next () newState

    runStateCPS :: StateCPS s s () -> s -> s
    runStateCPS (StateCPS m) = runCont m (\_ -> id)

In case you're not familiar with continuation-passing style and the Cont monad, the details might not make much sense: instead of just returning results from a function, a function in CPS applies its results to a continuation. So, in short, to "get" the state in continuation-passing style, we pass the current state to the "next" continuation (first argument) and don't change the state (second argument). To "put", we call the continuation with unit (no return value) and change the state to the new state (second argument to next).

StateCPS is used just like the State monad:

    action :: MonadState Int m => m ()
    action = replicateM_ 1000000 $ do
        i <- get
        put $! i + 1

    main = do
        print $ (runStateCPS action 0 :: Int)
        print $ (snd $ runState action 0 :: Int)

That action operation is, in the CPS version of the state monad, about 5% faster and performs 30% less heap allocation than the state monad from mtl. This program is limited pretty much only by the speed of monadic composition, so these numbers are at least very close to the maximum speedup we can get from CPSing the state monad. Speedups for the writer monad are probably near these results.

Other standard monads can be implemented similarly to StateCPS. The definitions can also be generalized to monad transformers over an arbitrary monad (a la ContT). For extra speed, you might wish to combine many monads in a single CPS monad, similarly to what RWST does.

Summary

We witnessed the performance of the bytestring, text, and vector libraries, all of which get their speed from fusion optimizations, in contrast to linked lists, which have a huge overhead despite also being subject to fusion to some degree. However, linked lists give rise to simple difference lists and zippers. The builder patterns for lists, bytestring, and text were introduced. We discovered that the array package is low-level and clumsy compared to the superior vector package, unless you must support Haskell 98. We also saw how to implement bubble sort using vectors and how to speed things up via continuation-passing style.


Hello TDD!

Packt
14 Sep 2016
6 min read
In this article, Gaurav Sood, the author of the book Scala Test-Driven Development, covers the basics of Test-Driven Development. We will explore:

- What is Test-Driven Development?
- What is the need for Test-Driven Development?
- A brief introduction to Scala and SBT

What is Test-Driven Development?

Test-Driven Development, or TDD as it is commonly referred to, is the practice of writing your tests before writing any application code. It consists of an iterative cycle: write a failing test (red), write just enough application code to make it pass (green), refactor, and repeat. This process is also referred to as Red-Green-Refactor-Repeat.

TDD became more prevalent with the use of the agile software development process, though it can be used just as easily with any of agile's predecessors, such as waterfall. Though TDD is not specifically mentioned in the agile manifesto (http://agilemanifesto.org), it has become a standard methodology used with agile. That said, you can still use agile without using TDD.

Why TDD?

The need for TDD arises from the fact that there can be constant changes to the application code. This becomes more of a problem when we are using an agile development process, as it is inherently an iterative development process. Here are some of the advantages which underpin the need for TDD:

- Code quality: Tests in themselves make the programmer more confident of their code. Programmers can be sure of the syntactic and semantic correctness of their code.
- Evolving architecture: Purely test-driven application code gives way to an evolving architecture. This means that we do not have to predefine our architectural boundaries and design patterns. As the application grows, so does the architecture. This results in an application that is flexible towards future changes.
- Avoids over-engineering: Tests that are written before the application code define and document the boundaries. These tests also document the requirements and the application code. Agile purists normally regard comments inside the code as a smell; according to them, your tests should document your code. Since all the boundaries are predefined in the tests, it is hard to write application code that breaches these boundaries. This, however, assumes that TDD is followed religiously.
- Paradigm shift: When I started with TDD, I noticed that the first question I asked myself after looking at a problem was "How can I solve it?" This, however, is counterproductive. TDD forces the programmer to think about the testability of the solution before its implementation. To understand how to test a problem means gaining a better understanding of the problem and its edge cases. This in turn can result in refinement of the requirements or the discovery of some new requirements. It has now become impossible for me not to think about testability before the solution; the first question I ask myself now is "How can I test it?"
- Maintainable code: I have always found it easier to work on an application that has historically been test-driven rather than on one that has not. Why? Simply because when I make a change to the existing code, the existing tests make sure that I do not break any existing functionality. This results in highly maintainable code, where many programmers can collaborate simultaneously.

Brief introduction to Scala and SBT

Let us look at Scala and SBT briefly. It is assumed that the reader is familiar with Scala, so we will not go into much depth.

What is Scala?

Scala is a general-purpose programming language. Scala is an acronym for Scalable Language.
Brief introduction to Scala and SBT

Let us look at Scala and SBT briefly. It is assumed that the reader is already familiar with Scala, so we will not go into much depth.

What is Scala

Scala is a general-purpose programming language. Its name is an acronym for Scalable Language, which reflects the vision of its creators of making Scala a language that grows with the programmer's experience of it. The fact that Scala and Java objects can be freely mixed makes the transition from Java to Scala quite easy. Scala is also a full-blown functional language. Unlike Haskell, which is a pure functional language, Scala allows interoperability with Java and support for object-oriented programming. Scala also allows the use of both pure and impure functions. Impure functions have side effects such as mutation, I/O, and exceptions; the purist approach to Scala programming encourages the use of pure functions only. Scala is a type-safe JVM language that incorporates both object-oriented and functional programming into an extremely concise, logical, and extraordinarily powerful language.

Why Scala?

Here are some advantages of using Scala:

- A functional solution to a problem is always better: This is my personal view and open to contention. Eliminating mutation from application code allows the application to be run in parallel across hosts and cores without any deadlocks.
- Better concurrency model: Scala has an actor model that is better than Java's model of locks on threads.
- Concise code: Scala code is more concise than that of its more verbose cousin, Java.
- Type safety/static typing: Scala does type checking at compile time.
- Pattern matching: Case statements in Scala are super powerful.
- Inheritance: Mixin traits are great and they definitely reduce code repetition.

There are other features of Scala, such as closures and monads, which require more understanding of functional language concepts to learn.

Scala Build Tool

Scala Build Tool (SBT) is a build tool that allows compiling, running, testing, packaging, and deployment of your code. SBT is mostly used with Scala projects, but it can just as easily be used for projects in other languages. Here, we will be using SBT as a build tool for managing our project and running our tests. SBT is written in Scala and can use many of the features of the Scala language. Build definitions for SBT are also written in Scala. These definitions are both flexible and powerful. SBT also allows the use of plugins and dependency management. If you have used a build tool like Maven or Gradle in any of your previous incarnations, you will find SBT a breeze.

Why SBT?

- Better dependency management
- Ivy-based dependency management
- Only-update-on-request model
- Can launch a REPL in the project context
- Continuous command execution
- Scala language support for creating tasks

Resources for learning Scala

Here are a few of the resources for learning Scala:

- http://www.scala-lang.org/
- https://www.coursera.org/course/progfun
- https://www.manning.com/books/functional-programming-in-scala
- http://www.tutorialspoint.com/scala/index.htm

Resources for SBT

Here are a few of the resources for learning SBT:

- http://www.scala-sbt.org/
- https://twitter.github.io/scala_school/sbt.html

Summary

In this article we learned what TDD is and why to use it. We also learned about Scala and SBT.

Resources for Article:

Further resources on this subject:
- Overview of TDD [article]
- Understanding TDD [article]
- Android Application Testing: TDD and the Temperature Converter [article]


Decoding Why "Good PHP Developer" Isn't an Oxymoron

Packt
14 Sep 2016
20 min read
In this article by Junade Ali, author of the book Mastering PHP Design Patterns, we will be revisiting object-oriented programming.

Back in 2010, MailChimp published a post on their blog entitled Ewww, You Use PHP? In this blog post they described the horror they faced when they explained their choice of PHP to developers who consider the phrase "good PHP programmer" an oxymoron. In their rebuttal they argued that their PHP wasn't your grandfather's PHP and that they use a sophisticated framework. I tend to judge the quality of PHP on the basis of not only how it functions, but how secure it is and how it is architected. This book focuses on ideas of how you should architect your code. Good software design allows developers to extend code beyond its original purpose in a bug-free and elegant fashion.

(For more resources related to this topic, see here.)

As Martin Fowler put it:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

This isn't just limited to code style, but to how developers architect and structure their code. I've encountered many developers with their noses constantly stuck in documentation, copying and pasting bits of code until it works; hacking snippets together until it works. Moreover, I far too often see the software development process rapidly deteriorate as developers ever more tightly couple their classes with functions of ever-increasing length.

Software engineers mustn't just code software; they must know how to design it. Indeed, often a good software engineer, when interviewing other software engineers, will ask questions surrounding the design of the code itself. It is trivial to get a piece of code that will execute, and it is also benign to question a developer as to whether strtolower or str2lower is the correct name of a function (for the record, it's strtolower). Knowing the difference between a class and an object doesn't make you a competent developer; a better interview question would, for example, be how one could apply subtype polymorphism to a real software development challenge. Failure to assess software design skills dumbs down an interview and results in there being no way to differentiate between those who are good at it and those who aren't. These advanced topics will be discussed throughout this book; by learning these tactics you will better understand what the right questions to ask are when discussing software architecture.

Moxie Marlinspike once tweeted:

As a software developer, I envy writers, musicians, and filmmakers. Unlike software, when they create something it is really done, forever.

When developing software we mustn't forget that we are authors, not just of instructions for a machine, but also of something that we later expect others to extend. Therefore, our code mustn't just be targeted at machines, but at humans also. Code isn't just poetry for a machine; it should be poetry for humans too. This is, of course, easier said than done. In PHP this may prove especially difficult, given the freedom PHP offers developers in how they architect and structure their code. By the very nature of freedom, it can be both used and abused, and so it is with the freedom offered in PHP.
Therefore, it is increasingly important that developers understand proper software design practices to ensure their code remains maintainable in the long term. Indeed, another key skill lies in refactoring code: improving the design of existing code to make it easier to extend in the longer term.

Technical debt, the eventual consequence of poor system design, is something that I've found comes with the career of a PHP developer. This has been true for me whether dealing with systems that provide advanced functionality or with simple websites. It usually arises because a developer elects to implement a bad design for a variety of reasons, whether when adding functionality to an existing codebase or when taking poor design decisions during the initial construction of software. Refactoring can help us address these issues.

SensioLabs (the creators of the Symfony framework) have a tool called Insight that allows developers to calculate the technical debt in their own code. In 2011 they did an evaluation of technical debt in various projects using this tool; rather unsurprisingly, they found that WordPress 4.1 topped the chart of all the platforms they evaluated, claiming it would take 20.1 years to resolve the technical debt that the project contains.

Those familiar with the WordPress core may not be surprised by this, but this issue is of course not only associated with WordPress. In my career of working with PHP, from security-critical cryptography systems to mission-critical embedded systems, dealing with technical debt comes with the job. Dealing with technical debt is not something to be ashamed of for a PHP developer; indeed, some may consider it courageous. Dealing with technical debt is no easy task, especially in the face of an ever more demanding user base, client, or project manager constantly demanding more functionality without being familiar with the technical debt the project carries.

I recently emailed the PHP Internals group as to whether they should consider deprecating the error suppression operator @. When any PHP function is prepended by an @ symbol, the function will suppress an error returned by it. This can be brutal, especially where that function renders a fatal error that stops the execution of the script, making debugging a tough task. If the error is suppressed, the script may fail to execute without providing developers a reason why. No one objected to the fact that there are better ways of handling errors (try/catch, proper validation) than abusing the error suppression operator, or that deprecation should be an eventual aim of PHP; however, some functions return needless warnings even though they already have a success/failure value. This means that, due to technical debt in the PHP core itself, this operator cannot be deprecated until a lot of other prerequisite work is done. In the meantime, it is down to developers to decide the best methodologies for handling errors. Until the inherent problem of unnecessary error reporting is addressed, this operator cannot be deprecated; therefore, it is down to developers to be educated as to the proper methodologies that should be used to handle errors and not to constantly resort to the @ symbol.

Fundamentally, technical debt slows down development of a project and often leads to broken code being deployed as developers try to work on a fragile project.
When starting a new project, never be afraid to discuss architecture; architecture meetings are vital to developer collaboration. As one scrum master I've worked with said, in the face of the criticism that "meetings are a great alternative to work": "Meetings are work… how much work would you be doing without meetings?"

Coding style - the PSR standards

When it comes to coding style, I would like to introduce you to the PSR standards created by the PHP Framework Interop Group. Namely, the two standards that apply to coding style are PSR-1 (Basic Coding Standard) and PSR-2 (Coding Style Guide). In addition to these, there are PSR standards that cover additional areas; for example, as of today, the PSR-4 standard is the most up-to-date autoloading standard published by the group. You can find out more about the standards at http://www.php-fig.org/.

Using coding style to enforce consistency throughout a codebase is something I strongly believe in; it does make a difference to the readability of your code throughout a project. It is especially important when you are starting a project (chances are you may be reading this book to find out how to do that right), as the coding style you choose determines the style that the developers following you on the project will adopt. Using a global standard such as PSR-1 or PSR-2 means that developers can easily switch between projects without having to reconfigure their code style in their IDE. Good code style can make formatting errors easier to spot. Needless to say, coding styles will develop as time progresses; to date, I elect to work with the PSR standards.

I am a strong believer in the phrase:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

It isn't known who wrote this phrase originally, but it's widely thought that it could have been John Woods or potentially Martin Golding. I would strongly recommend familiarizing yourself with these standards before proceeding with this book.

Revising object-oriented programming

Object-oriented programming is more than just classes and objects; it's a whole programming paradigm based around objects (data structures) that contain data fields and methods. It is essential to understand this; using classes to organize a bunch of unrelated methods together is not object orientation. Assuming you're aware of classes (and how to instantiate them), allow me to remind you of a few different bits and pieces.

Polymorphism

Polymorphism is a fairly long word for a fairly simple concept. Essentially, polymorphism means that the same interface is used with different underlying code. So multiple classes could have a draw function, each accepting the same arguments, but at an underlying level the code is implemented differently. In this article, I would also like to talk about subtype polymorphism in particular (also known as subtyping or inclusion polymorphism). Let's say we have animals as our supertype; our subtypes may well be cats, dogs, and sheep. In PHP, interfaces allow you to define a set of functionality that a class implementing them must contain; as of PHP 7 you can also use scalar type hints to define the return types we expect.
So, for example, suppose we defined the following interface:

interface Animal
{
    public function eat(string $food) : bool;
    public function talk(bool $shout) : string;
}

We could then implement this interface in our own class, as follows:

class Cat implements Animal
{
}

If we were to run this code without defining the methods, we would get an error message as follows:

Class Cat contains 2 abstract methods and must therefore be declared abstract or implement the remaining methods (Animal::eat, Animal::talk)

Essentially, we are required to implement the methods we defined in our interface, so now let's go ahead and create a class that implements these methods:

class Cat implements Animal
{
    public function eat(string $food): bool
    {
        if ($food === "tuna") {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "MEOW!";
        } else {
            return "Meow.";
        }
    }
}

Now that we've implemented these methods, we can just instantiate the class we are after and use the functions contained in it:

$felix = new Cat();
echo $felix->talk(false);

So where does polymorphism come into this? Suppose we had another class for a dog:

class Dog implements Animal
{
    public function eat(string $food): bool
    {
        if (($food === "dog food") || ($food === "meat")) {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "WOOF!";
        } else {
            return "Woof woof.";
        }
    }
}

Now let's suppose we have multiple different types of animals in a pets array:

$pets = array(
    'felix' => new Cat(),
    'oscar' => new Dog(),
    'snowflake' => new Cat()
);

We can now go ahead and loop through all these pets individually in order to run the talk function. We don't care about the type of pet, because the talk method is implemented in every class we encounter by virtue of it implementing the Animal interface. So, if we wanted to have all our animals run the talk method, we could just use the following code:

foreach ($pets as $pet) {
    echo $pet->talk(false);
}

No need for unnecessary switch/case blocks to wrap around our classes; we just use software design to make things easier for us in the long term.

Abstract classes work in a similar way, except that abstract classes can contain functionality where interfaces cannot. It is important to note that any class that defines one or more abstract methods must itself be declared abstract. You cannot have a normal class defining abstract methods, but you can have normal methods in abstract classes. Let's start off by refactoring our interface to be an abstract class:

abstract class Animal
{
    abstract public function eat(string $food) : bool;
    abstract public function talk(bool $shout) : string;

    public function walk(int $speed): bool
    {
        if ($speed > 0) {
            return true;
        } else {
            return false;
        }
    }
}

You might have noticed that I have also added a walk method as an ordinary, non-abstract method; this is a standard method that can be used or extended by any class that inherits the parent abstract class, as it already has an implementation. Note that it is impossible to instantiate an abstract class (much as it is not possible to instantiate an interface); instead we must extend it. So, in our Cat class, let's substitute:

class Cat implements Animal

with the following code:

class Cat extends Animal

That's all we need to refactor in order to get classes to extend the Animal abstract class.
We must implement the abstract functions in the classes as we outlined for the interfaces, plus we can use the ordinary functions without needing to implement them:

$whiskers = new Cat();
$whiskers->walk(1);

As of PHP 5.4 it has also become possible to instantiate a class and access a property of it in a single statement. PHP.net advertised it as:

Class member access on instantiation has been added, e.g. (new Foo)->bar().

You can also do it with individual properties, for example, (new Cat)->legs. In our example, we can use it as follows:

(new \IcyApril\ChapterOne\Cat())->walk(1);

Just to recap a few other points about how PHP implements OOP: the final keyword before a class declaration, or indeed a function declaration, means that you cannot override such classes or functions after they've been defined. So, if we were to try extending a class we have marked as final:

final class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
}

This results in the following output:

Fatal error: Class Cat may not inherit from final class (Animal)

Similarly, if we were to do the same except at a function level:

class Animal
{
    final public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}

This results in the following output:

Fatal error: Cannot override final method Animal::walk()

Traits (multiple inheritance)

Traits were introduced into PHP as a mechanism for introducing horizontal reuse. PHP conventionally acts as a single-inheritance language, namely because you can't inherit more than one class into a script. Traditional multiple inheritance is a controversial process that is often looked down upon by software engineers. Let me give you an example of using traits first hand; let's define an Animal class which we want to extend into another class:

class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}

Now let's suppose we have functions to name our class instances, but we don't want them to apply to all the classes that extend the Animal class; we want them to apply to certain classes, irrespective of whether they inherit the properties of the Animal class or not. So we've defined our functions like so:

function setFirstName(string $name): bool
{
    $this->firstName = $name;
    return true;
}

function setLastName(string $name): bool
{
    $this->lastName = $name;
    return true;
}

The problem now is that there is no place we can put them without using horizontal reuse, apart from copying and pasting different bits of code or resorting to conditional inheritance. This is where traits come to the rescue; let's start off by wrapping these methods in a trait called Name:

trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}

So now that we've defined our trait, we can just tell PHP to use it in our Cat class:

class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}

Notice the use Name; statement? That's where the magic happens.
Now you can call the functions in that trait without any problems:

$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;

All put together, the new code block looks as follows:

trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}

class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}

$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;

Scalar type hints

Let me take this opportunity to introduce you to a PHP 7 concept known as scalar type hinting; it allows you to define the types of a function's arguments and its return type (yes, I know this isn't strictly within the scope of OOP; deal with it). Let's define a function, as follows:

function addNumbers(int $a, int $b): int
{
    return $a + $b;
}

Let's take a look at this function. Firstly, you will notice that before each of the arguments we define the type of variable we want to receive, in this case int, or integer. Next up, you'll notice there's a bit of code after the function definition, : int, which defines our return type, so our function can only return an integer.

If you don't provide the right type of variable as a function argument or don't return the right type of variable from the function, you will get a TypeError exception. PHP will also throw a TypeError exception in the event that strict mode is enabled and you provide the incorrect number of arguments.

It is also possible in PHP to define strict_types; let me explain why you might want to do this. Without strict_types, PHP will attempt to automatically convert a variable to the defined type in very limited circumstances. For example, if you pass a string containing solely numbers, it will be converted to an integer; a string that's non-numeric, however, will result in a TypeError exception. Once you enable strict_types this all changes: you can no longer have this automatic casting behavior. Taking our previous example, without strict_types you could do the following:

echo addNumbers(5, "5.0");

Trying it again after enabling strict_types, you will find that PHP throws a TypeError exception.

This configuration only applies on an individual file basis; putting it before you include other files will not result in this configuration being inherited by those files. There are multiple benefits to why PHP chose to go down this route; they are listed very clearly in version 0.5.3 of the RFC that implemented scalar type hints, called PHP RFC: Scalar Type Declarations. You can read about it by going to http://www.wiki.php.net (the wiki, not the main PHP website) and searching for scalar_type_hints_v5.

In order to enable it, make sure you put this as the very first statement in your PHP script:

declare(strict_types=1);

This will not work unless you define strict_types as the very first statement in a PHP script; no other usages of this definition are permitted. Indeed, if you try to define it later on, your PHP script will throw a fatal error.

Of course, in the interests of the rage-induced PHP core fanatic reading this book in its coffee-stained form, I should mention that there are other valid types that can be used in type hinting. For example, PHP 5.1.0 introduced this with arrays and PHP 5.0.0 introduced the ability for a developer to do this with their own classes.
Let me give you a quick example of how this would work in practice. Suppose we had an Address class:

class Address
{
    public $firstLine;
    public $postcode;
    public $country;

    public function __construct(string $firstLine, string $postcode, string $country)
    {
        $this->firstLine = $firstLine;
        $this->postcode = $postcode;
        $this->country = $country;
    }
}

We can then type hint the Address class as a dependency that we inject into a Customer class:

class Customer
{
    public $name;
    public $address;

    public function __construct($name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }
}

And just to show how it all comes together:

$address = new Address('10 Downing Street', 'SW1A2AA', 'UK');
$customer = new Customer('Davey Cameron', $address);
var_dump($customer);

Limiting debug access to private/protected properties

If you define a class which contains private or protected variables, you will notice an odd behavior if you var_dump an object of that class: when you wrap the object in a var_dump, it reveals all variables, be they protected, private, or public. PHP treats var_dump as an internal debugging function, meaning all data becomes visible.

Fortunately, there is a workaround for this. PHP 5.6 introduced the __debugInfo magic method. Functions in classes preceded by a double underscore represent magic methods and have special functionality associated with them. Every time you try to var_dump an object that has the __debugInfo magic method set, the var_dump will be overridden with the result of that function call instead. Let me show you how this works in practice; let's start by defining a class:

class Bear
{
    private $hasPaws = true;
}

Let's instantiate this class:

$richard = new Bear();

Now, if we were to try and access the private variable hasPaws, we would get a fatal error; so this call:

echo $richard->hasPaws;

would result in the following fatal error being thrown:

Fatal error: Cannot access private property Bear::$hasPaws

That is the expected output; we don't want a private property to be visible outside its object. That being said, if we wrap the object with a var_dump as follows:

var_dump($richard);

we would then get the following output:

object(Bear)#1 (1) {
  ["hasPaws":"Bear":private]=>
  bool(true)
}

As you can see, our private property is marked as private, but it is nevertheless visible. So how would we go about preventing this? Let's redefine our class as follows:

class Bear
{
    private $hasPaws = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}

Now, after we instantiate our class and var_dump the resulting object, we get the following output:

object(Bear)#1 (0) {
}

The script all put together looks like this now; you will notice I've added an extra public property called growls, which I have set to true:

<?php

class Bear
{
    private $hasPaws = true;
    public $growls = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}

$richard = new Bear();
var_dump($richard);

If we were to var_dump this script (with both a public and a private property to play with), we would get the following output:

object(Bear)#1 (1) {
  ["growls"]=>
  bool(true)
}

As you can see, only the public property is visible. So what is the moral of the story from this little experiment? Firstly, that var_dump exposes private and protected properties inside objects, and secondly, that this behavior can be overridden.

Summary

In this article, we revised some PHP principles, including OOP principles.
We also revised some PHP syntax basics.

Resources for Article:

Further resources on this subject:
- Running Simpletest and PHPUnit [article]
- Data Tables and DataTables Plugin in jQuery 1.3 with PHP [article]
- Understanding PHP basics [article]

Getting Started with a Cloud-Only Scenario

Packt
13 Sep 2016
23 min read
In this article by Jochen Nickel, from the book Mastering Identity and Access Management with Microsoft Azure, we will start with a business view to identify the important business needs and challenges of a cloud-only environment and scenario. Throughout this article, we will also discuss the main features of and licensing information for such an approach. Finally, we will round up with the challenges surrounding security and legal requirements. The topics we will cover in this article are as follows:

- Identifying the business needs and challenges
- An overview of feature and licensing decisions
- Defining the benefits and costs
- Principles of security and legal requirements

(For more resources related to this topic, see here.)

Identifying business needs and challenges

Oh! Don't worry, we don't intend to bore you with a lesson on typical IAM stories – we're sure you've been in touch with a lot of information in this area. However, you do need an independent view of the actual business needs and challenges in the cloud area, so that you can get the most out of your own situation.

Common Identity and Access Management needs

Identity and Access Management (IAM) is the discipline that plays an important role in the current cloud era of your organization. It is also of value to small and medium-sized companies, so that they can enable the right individuals to access the right resources from any location and device, at the right time and for the right reasons, to empower and enable the desired business outcomes. IAM addresses the mission-critical need of ensuring appropriate and secure access to resources inside and across company borders, such as cloud or partner applications. The old security strategy of only securing your environment with an intelligent firewall concept and access control lists will take on a more and more subordinate role. It is recommended that you review and rework this strategy in order to meet higher compliance, operational, and business requirements. To adopt a mature security and risk management practice, it is very important that your IAM strategy is business-aligned and that the required business skills and stakeholders are committed to this topic. Without clearly defined business processes, you can't implement a successful IAM functionality in the planned timeframe. Companies that follow this strategy can become more agile in supporting new business initiatives and can reduce their IAM costs.

The following three groups show the typical indicators for missing IAM capabilities, on premises and in cloud services.

Your employees/partners:

- The same password is used across multiple applications without periodic changes (also in social media accounts)
- Multiple identities and logins
- Passwords are written down in sticky notes, Excel sheets, and so on
- Application and data access is still allowed after termination
- Forgotten usernames and passwords
- Poor usability of application access inside and outside the company (multiple logins, VPN connection required, incompatible devices, and so on)
Your IT department:

- High workload on password reset support
- Missing automated identity lifecycles with integrity (data duplication and data quality problems)
- No insights into application usage and security
- Missing reporting tools for compliance management
- Complex integration of central access to Software as a Service (SaaS), partner, and on-premises applications (missing central access/authentication/authorization platform)
- No policy enforcement in cloud services usage
- Accumulation of access rights (missing processes)

Your developers:

- Limited knowledge of all the different security standards, protocols, and APIs
- Constantly changing requirements and rapid developments
- Complex changes of the identity provider

Implications of Shadow IT

On top of that, the IT department will often hear the question: When can we expect the new application for our business unit? Sorry, but the answer will always take too long. Why should I wait? All I need is a valid credit card that allows me to buy my required business application. And suddenly another popular phenomenon pops up: shadow IT! Most of the time, this introduces another problem: uncontrolled information leakage. The following figure shows the typical flow of information, and that what you don't know can hurt!

The previous figure should not give you the impression that cloud services are inherently dangerous, but rather that before using them you should first be aware of which services are being used, and in which manner. Simply migrating or ordering a new service in the cloud won't solve common IAM needs. This figure should help you to imagine that, if not planned, the introduction of a new or migrated service brings with it a new identity and credential set for the users, and therefore multiple credentials and logins to remember! You should also be sure which information can be stored and processed in a regulatory area other than your own organization's.

The following table shows the responsibilities involved when using the different cloud service models. In particular, you should note that you are responsible for data classification, IAM, and endpoint security in every model:

Responsibility (IaaS / PaaS / SaaS):
- Data classification: Customer / Customer / Customer
- Endpoint security: Customer / Customer / Customer
- Identity and Access Management: Customer / Customer and Provider / Customer and Provider
- Application security: Customer / Customer and Provider / Provider
- Network controls: Customer / Customer and Provider / Provider
- Host security: Provider / Provider / Provider
- Physical security: Provider / Provider / Provider

The mobile workforce and cloud-first strategy

Many organizations are facing the challenge of meeting the expectations of a mobile workforce, with employees bringing their own device preferences, a mix of private and professional commitments, and the request to use social media as an additional channel for business communication. Let's dive into a short, practical, but extreme example. The AzureID company employs approximately 80 employees. They work with a SaaS landscape of eight services to drive all their business processes. On premises, they use Network-Attached Storage (NAS) to store some corporate data and provide network printers to all of the employees. Some of the printers are directly attached to the C-level of the company. The main issues today are that the employees need to remember the usernames and passwords of all the business applications, and that if they want to share some information with partners they cannot give them partial access to the necessary information in a secure and flexible way.
Another point is that if employees want to access corporate data from their mobile devices, it is always a burden to provide every single login for the applications necessary to fulfil their job. The small IT department, with one Full-Time Equivalent (FTE), is overloaded with having to create and manage every identity in each different service. In addition, users forget their passwords periodically, and most of the time outside normal business hours. The following figure shows the actual infrastructure:

Let's analyze this extreme example to reveal some typical problems, so that you can match some of the ideas to your own IT infrastructure:

- Provisioning, managing, and de-provisioning identities can be a time-consuming task
- There is no single identity and set of credentials for the employees
- There is no collaboration support for partner and consumer communication
- There is no self-service password reset functionality
- Sensitive information leaves the corporation over email
- There are no usage or security reports about the accessed applications/services
- There is no central way to enable multi-factor authentication for sensitive applications
- There is no secure strategy for accessing social media
- There is no usable, secure, and central remote access portal

Remember, shifting applications and services to the cloud just introduces more implications and challenges, not solutions. First of all, you need your IAM functionality accurately in place. You also always need to handle on-premises resources, even if it is only minimal printer management.

An overview of feature and licensing decisions

With the cloud-first strategy of Microsoft, the Azure platform and its number of services grow constantly, and we have seen a lot of customers lost in a paradise of wonderful services and functionality. This brings us to the point of figuring out which services are relevant for IAM and giving them the space for explanation. Obviously, there are more services available that touch on this field with a small subset of functionality, but due to the limited page count of this book and the rapid pace of development, we will focus on the most important ones and will reference any other interesting content.

The primary service for IAM is the Azure Active Directory service, which has also been the core directory service for Office 365 since 2011. Every other SaaS offering from Microsoft is also based on this core service, including Microsoft Intune, Dynamics, and Visual Studio Online. So, if you are already an Office 365 customer, you will have your own instance of Azure Active Directory in place. For sustained access management and the protection of your information assets, the Azure Rights Management services are in place. There is also an option for Office 365 customers to use the included Azure Rights Management services. You can find further information about this by visiting the following link: http://bit.ly/1KrXUxz.

Let's get started with the feature sets that can provide a solution, as shown in the following screenshot. Including Azure Active Directory and Rights Management helps you to provide a secure solution with a central access portal for all of your applications, with just one identity and login for your employees, partners, and the customers that you want to share your information with. With a few clicks you can also add multi-factor authentication to your sensitive applications and administrative accounts. Furthermore, you can directly add a self-service password reset functionality that your users can use to reset their passwords themselves.
As the administrator, you will receive predefined security and usage reports to control your complete application ecosystem. To protect your sensitive content, you get digital rights management capabilities with the Azure Rights Management services, giving you granular access rights on every device on which your information is used. Doesn't that sound great? Let's take a deeper look into the functionality and usage of the different Microsoft Azure IAM services.

Azure Active Directory

Azure Active Directory is a fully managed multi-tenant service that provides IAM capabilities as a service. This service is not just an instance of the Windows Server Domain Controller you already know from your actual Active Directory infrastructure, and Azure AD is not a replacement for Windows Server Active Directory either. If you already use a local Active Directory infrastructure, you can extend it to the cloud by integrating Azure AD to authenticate your users in the same way for on-premises and cloud services.

Staying with the business view, we want to discuss some of the main features of Azure Active Directory. Firstly, we want to start with the Access Panel, which gives the user a central place to access all their applications from any device and any location with single sign-on. The combination of the Azure Active Directory Access Panel and the Windows Server 2012 R2/2016 Web Application Proxy / ADFS capabilities provides an efficient way to securely publish web applications and services to your employees, partners, and customers. It is a good replacement for your retired Forefront TMG/UAG infrastructure. Over this portal, your users can do the following:

- User and group management
- Access their business-relevant applications (on-premises, partner, and SaaS) with single sign-on or single logon
- Delegation of access control to the data, process, or project owner
- Self-service profile editing for correcting or adding information
- Self-service password change and reset
- Management of registered devices

With the self-service password reset functionality, a user gets a straightforward way to reset their password and prove their identity, for example through a phone call, an email, or by answering security questions. The different portals can be customized with your own corporate identity branding. To try the different portals, just use the following links: https://myapps.microsoft.com and https://passwordreset.microsoftonline.com.

To complete our short introduction to the main features of Azure Active Directory, we will take a look at the reporting capabilities. With this feature you get predefined reports providing the following information; by viewing and acting on these reports, you are able to control your whole application ecosystem published over the Azure AD Access Panel:

- Anomaly reports
- Integrated application reports
- Error reports
- User-specific reports
- Activity logs

From our discussions with customers, we recognize that, a lot of the time, the differences between the Azure Active Directory editions are unclear. For that reason, we will include and explain the feature tables provided by Microsoft. We will start with the common features and then go through the premium features of Azure Active Directory.

Common features

First of all, we want to discuss the Access Panel portal so we can clear up some open questions. With the Azure AD Free and Basic editions, you can provide a maximum of 10 applications to every user. However, this doesn't mean that you are limited to 10 applications in total.
Next, the portal link: right now it cannot be changed to your own company-owned domain, such as https://myapps.inovit.ch. The only thing you can do is provide an alias in your DNS configuration; the accessed link remains https://myapps.microsoft.com. Company branding leads us on to the next discussion point, where we are often asked how much corporate identity branding is possible. The following link provides you with all the necessary information for branding your solution: http://bit.ly/1Jjf2nw. Rounding up this short Q&A on the different feature sets is Application Proxy usage, one of the important differentiators between the Azure AD Free and Basic editions. The short answer is that with Azure AD Free, you cannot publish on-premises applications and services over the Azure AD Access Panel portal.

The common features break down across the editions as follows:

Available in all editions (Free, Basic, and Premium):
- Directory as a service: 500k objects in Free, unlimited in Basic and Premium
- User/group management (UI or PowerShell)
- Access Panel portal for SSO: 10 apps per user in Free and Basic, unlimited in Premium
- User-based application access management/provisioning
- Self-service password change (cloud users)
- Directory synchronization tool
- Standard security reports

Available in Basic and Premium only:
- High availability SLA (99.9%)
- Group-based application access management and provisioning
- Company branding
- Self-service password reset for cloud users
- Application Proxy
- Self-service group management for cloud users

Premium features

The Azure Active Directory Premium edition provides you with the entire set of IAM capabilities, including the usage licenses for the on-premises Microsoft Identity Manager. From a technical perspective, you need to use the Azure AD Connect utility to connect your on-premises Active Directory with the cloud, and Microsoft Identity Manager to manage your on-premises identities and prepare them for your cloud integration. To acquire Azure AD Premium, you can also use the Enterprise Mobility Suite (EMS) bundle, which contains Azure AD Premium, Azure Rights Management, Microsoft Intune, and Advanced Threat Analytics (ATA) licensing. You can find more information about EMS by visiting http://bit.ly/1cJLPcM and http://bit.ly/29rupF4.

The following features are available in Azure AD Premium only:
- Self-service password reset with on-premises write-back
- Microsoft Identity Manager server licenses
- Advanced anomaly security reports
- Advanced usage reporting
- Multi-Factor Authentication (cloud users)
- Multi-Factor Authentication (on-premises users)

Azure AD Premium reference: http://bit.ly/1gyDRoN

Multi-Factor Authentication for cloud users is also included in Office 365. The main difference is that you cannot use it for on-premises users and services such as VPN or web servers.

Azure Active Directory Business to Business

One of the newest features based on Azure Active Directory is the Business to Business (B2B) capability. It solves the problem of collaboration between business partners, allowing users to share business applications between partners without going through inter-company federation relationships and internally managed partner identities. With Azure AD B2B, you can create cross-company relationships by inviting and authorizing users from partner companies to access your resources. With this process, each company federates once with Azure AD, and each user is then represented by a single Azure AD account. This option also provides a higher security level, because if a user leaves the partner organization, access is automatically disallowed.
Inside Azure AD, the user will be handled as a guest and will not be able to traverse other users in the directory. Real permissions will be provided through the correct associated group membership.

Azure Active Directory Business to Consumer

Azure Active Directory Business to Consumer (B2C) is another brand new feature based on Azure Active Directory. This functionality supports signing in to your application using social networks like Facebook, Google, or LinkedIn, as well as creating accounts with usernames and passwords specifically for your company-owned application. Self-service password management and profile management are also provided with this scenario. Additionally, Multi-Factor Authentication introduces a higher grade of security to the solution. Principally, this feature allows small and medium-sized companies to hold their customers in a separate Azure Active Directory with all the capabilities, and more, in a similar way to the corporate-managed Azure Active Directory. With different verification options, you are also able to provide the identity assurance required for more sensitive transactions.

Azure Active Directory Privileged Identity Management

Azure AD Privileged Identity Management provides you with the functionality to manage, control, and monitor your privileged identities. With this option, you can build an RBAC solution over your Azure AD and other Microsoft online services, such as Office 365 or Microsoft Intune. The following activities can be carried out with this functionality:

- You can discover the currently configured Azure AD administrators
- You can provide just-in-time administrative access
- You can get reports about administrator access history and assignment changes
- You can receive alerts about access to a privileged role

The following built-in roles can be managed with the current version:

- Global Administrator
- Billing Administrator
- Service Administrator
- User Administrator
- Password Administrator

Azure Multi-Factor Authentication

Protecting sensitive information or application access with additional authentication is an important task, and not just in the on-premises world; in particular, it needs to be extended to every sensitive cloud service in use. There are many ways of providing this level of security and additional authentication, such as certificates, smart cards, or biometric options. Smart cards, for example, depend on special reader hardware and cannot be used in every scenario without limiting access to a special device or piece of hardware.

The following overview pairs typical attacks with the security solutions that can mitigate them in a well-designed and well-implemented security solution:

- Password brute force: strong password policies
- Shoulder surfing, key or screen logging: one-time password solution
- Phishing or pharming: server authentication (HTTPS)
- Man-in-the-middle, whaling (social engineering): two-factor authentication, certificate or one-time password solution
- Certificate authority corruption, cross-channel attacks (CSRF): transaction signature and verification, non-repudiation
- Man-in-the-browser, key loggers: secure PIN entry, secure messaging, read-only browser, push-button token, three-factor authentication

The Azure Multi-Factor Authentication functionality has been included in the Azure Active Directory Premium capabilities to address exactly the attacks described above.
With a one-time password solution, you can build a very capable security solution for accessing information or applications from devices that cannot use smart cards as the additional authentication method. Also, for small or medium-sized organizations, a smart card deployment, including the appropriate management solution, will often be too cost-intensive, and the Azure MFA solution can be a good alternative for reaching the expected higher security level. In discussions with our customers, we have noticed that many don't realize that Azure MFA is already included in several Office 365 plans. They would be able to protect their Office 365 environment with multi-factor authentication out of the box, but they don't know it! This brings us to the following comparison from Microsoft of the functionality in Office 365 MFA and Azure MFA.

Included in both Office 365 MFA and Azure MFA:
- Administrators can enable/enforce MFA for end users
- Mobile app (online and OTP) as the second authentication factor
- Phone call as the second authentication factor
- SMS as the second authentication factor
- App passwords for non-browser clients (for example, Outlook, Lync)
- Default Microsoft greetings during authentication phone calls
- Remember Me

Included in Azure MFA only:
- IP whitelist
- Custom greetings during authentication phone calls
- Fraud alert
- Event confirmation
- Security reports
- Block/unblock users
- One-time bypass
- Customizable caller ID for authentication phone calls
- MFA Server: MFA for on-premises applications
- MFA SDK: MFA for custom apps

With the MFA capabilities of Office 365, administrators are able to use basic functionality to protect their sensitive information. For integrating on-premises users and services in particular, the Azure MFA solution is needed. Note that Azure MFA and the on-premises installation of the MFA server cannot be used to protect your Windows Server DirectAccess implementation. Furthermore, you will find the customizable caller ID limited to certain regions.

Azure Rights Management

More and more organizations find themselves needing to provide a continuous and integrated information protection solution to protect sensitive assets and information. On one side stands the business department, which carries out its business activities, generates data, and processes it; it uses the data inside and outside the functional areas, passes it on, and runs a lively exchange of information. On the other side, auditing is required by legal regulations that prescribe measures to ensure that information is handled properly and that dangers such as industrial espionage and data loss are avoided. So this is a big concern when safeguarding sensitive information. While the staff appreciate the many channels of communication and data exchange, this development starts to stress the IT security officers and worry the managers. The fear is that critical corporate data remains uncontrolled and leaves the company or moves to competitors. The routes are varied, but data is often lost through inadvertent delivery via email. In addition, sensitive data can leave the company on a USB stick or smartphone, or IT media can be lost or stolen. On top of this, new risks arise, such as employees posting information on social media platforms. IT must ensure the protection of data in all phases, and traditional IT security solutions are not always sufficient. The following figure illustrates this situation and leads us to the Azure Rights Management services.
As with the other additional features, the base functionality is included in different Office 365 plans. The main difference between the two editions is that only Azure RMS can be integrated into an on-premises file server environment in order to use the File Classification Infrastructure (FCI) feature of the Windows Server file server role. The Azure RMS capability allows you to protect your sensitive information based on classification information, with a granular access control system. The following comparison, provided by Microsoft, shows the differences between the Office 365 RMS and Azure RMS functionality. Azure RMS is included with the E3, E4, A3, and A4 plans.

Included in both Office 365 RMS and Azure RMS:
- Users can create and consume protected content by using Windows clients and Office applications
- Users can create and consume protected content by using mobile devices
- Integration with Exchange Online, SharePoint Online, and OneDrive for Business
- Integration with Exchange Server 2013/2010 and SharePoint Server 2013/2010 on-premises via the RMS connector
- Administrators can create departmental templates
- Organizations can create and manage their own RMS tenant key in a hardware security module (the Bring Your Own Key solution)
- Support for non-Office file formats: text and image files are natively protected; other files are generically protected
- RMS SDK for all platforms: Windows, Windows Phone, iOS, Mac OS X, and Android

Included in Azure RMS only:
- Integration with Windows file servers for automatic protection with FCI via the RMS connector
- Users can track usage of their documents
- Users can revoke access to their documents

In particular, the tracking feature helps users to find out where their documents have been distributed and allows them to revoke access to a single protected document.

Microsoft Azure security services in combination

Now that we have discussed the relevant Microsoft Azure IAM capabilities, you can see that Microsoft provides more than just single features or subsets of functionality. Rather, it brings a whole solution to the market that provides functionality for every facet of IAM. Microsoft Azure also combines clear service management with IAM, making it a rich solution for your organization. You can work with this toolset in a native cloud-first scenario, a hybrid scenario, or a complex hybrid scenario, and you can extend your solution to every possible use case or environment. The following figure illustrates all the different topics that are covered by Microsoft Azure security solutions:

Defining the benefits and costs

The Microsoft Azure IAM capabilities help you to empower your users with a flexible and rich solution that enables better business outcomes in a more productive way. You help your organization to improve regulatory compliance overall and to reduce information security risk. Additionally, it can be possible to reduce IT operating and development costs by providing higher operating efficiency and transparency. Last but not least, it leads to improved user satisfaction and better support from the business for further investments. The following toolset gives you very good instruments for calculating the costs of your specific environment.
- Azure Active Directory Pricing Calculator: http://bit.ly/1fspdhz
- Enterprise Mobility Suite Pricing: http://bit.ly/1V42RSk
- Microsoft Azure Pricing Calculator: http://bit.ly/1JojUfA

Principles of security and legal requirements

The classification of data, such as business information or personal data, is not only necessary for an on-premises infrastructure. It is a basis for the assurance of business-related information and is reflected in compliance with official regulations. These requirements are of even greater significance when using cloud services or solutions outside your own company and regulatory borders. They are clearly needed for a controlled shift of data into an area in which responsibilities must be regulated by contracts. Security boundaries do not stop at the private cloud, and you remain responsible for the technical and organizational implementation and control of security settings. The subsequent objectives are as follows:

- Construction, extension, or adaptation of the data classification for cloud integration
- Data classification as a basis for encryption or isolated security silos
- Data classification as a basis for authentication and authorization

Microsoft itself has strict controls that restrict access to Azure by Microsoft employees. Microsoft also enables customers to control access to their Azure environments, data, and applications, as well as allowing them to penetration test and audit services with special auditors and regulations on request. A statement from Microsoft:

Customers will only use cloud providers in which they have great trust. They must trust that the privacy of their information will be protected, and that their data will be used in a way that is consistent with their expectations. We build privacy protections into Azure through Privacy by Design.

You can get all the necessary information about security, compliance, and privacy by visiting the following link: http://bit.ly/1uJTLAT.

Summary

Now that you are fully clued up with information about typical needs and challenges, as well as feature and licensing information, you should be able to apply the right technology and licensing model to your cloud-only scenario. You should also be aware of the benefits and of the cost calculators that will help you calculate a basic price model for your required services. Furthermore, you can also decide which security and legal requirements are relevant for your cloud-only environments.

Resources for Article:

Further resources on this subject:
- Creating Multitenant Applications in Azure [article]
- Building A Recommendation System with Azure [article]
- Installing and Configuring Windows Azure Pack [article]
How to Build your own Futuristic Robot

Packt
13 Sep 2016
5 min read
In this article by Richard Grimmett, author of the book Raspberry Pi Robotic Projects - Third Edition, we will start with a simple but impressive project where you'll take a toy robot and give it much more functionality. You'll start with an R2D2 toy robot and modify it to add a webcam, voice recognition, and motors so that it can get around. Creating your own R2D2 will require a bit of mechanical work; you'll need a drill and perhaps a Dremel tool, but most of the mechanical work will be removing the parts you don't need so you can add some exciting new capabilities. (For more resources related to this topic, see here.)

Modifying the R2D2
There are several R2D2 toys that can provide the basis for this project, and they are available from online retailers. This project will use one that is inexpensive and also provides such interesting features as a top that turns and a wonderful place to put a webcam. It is the Imperial Toy R2D2 bubble machine. Here is a picture of the unit:

The unit can be purchased at amazon.com, toysrus.com, and a number of other retailers. It is normally used as a bubble machine that uses a canister of soap bubbles to produce bubbles, but you'll take all of that capability out to make your R2D2 much more like the original robot.

Adding wheels and motors
In order to make your R2D2 a reality, the first thing you'll want to do is add wheels to the robot. In order to do this you'll need to take the robot apart, separating the two main plastic pieces that make up the lower body of the robot. Once you have done this, both the right and left arms can be removed from the body. You'll need to add two wheels that are controlled by DC motors to these arms. Perhaps the best way to do this is to purchase a simple, two-wheeled car that is available at many online electronics stores like amazon.com, ebay.com, or banggood.com. Here is a picture of the parts that come with the car:

You'll be using these pieces to add mobility to your robot. The two yellow pieces are DC motors, so let's start with those. To add these to the two arms on either side of the robot, you'll need to separate the two halves of each arm, and then remove material from one of the halves, like this:

You can use a Dremel tool to do this, or any kind of device that can cut plastic. This will leave a place for your wheel. Now you'll want to cut up the plastic kit of your car to provide a platform to connect to your R2D2 arm. You'll cut your plastic car up using this as a pattern; you'll want to end up with the two pieces that have the + sign cutouts, as this is where you'll mount your wheels and also the piece you'll attach to the R2D2 arm. The image below will help you understand this better.

On the cut-out side that has not been removed, mark and drill two holes to fix the clear plastic to the bottom of the arm. Then fix the wheel to the plastic, then the plastic to the bottom of the arm as shown in the picture. You'll connect two wires, one to each of the polarities on the motor, and then run the wires up to the top of the arm and out the small holes. These wires will eventually go into the body of the robot through small holes that you will drill where the arms connect to the body, like this:

You'll repeat this process for the other arm. For the third, center arm, you'll want to connect the small, spinning white wheel to the bottom of the arm. Here is a picture:

Now that you have motors and wheels connected to the bottom of the arms, you'll need to connect these to the Raspberry Pi.
There are several different ways to connect and drive these two DC motors, but perhaps the easiest is to add a shield that can directly drive a DC motor. This motor shield is an additional piece of hardware that installs on top of the Raspberry Pi and can source the voltage and current to power both motors. The RaspiRobot Board V3 is available online and can provide these signals. The specifics of the board can be found at http://www.monkmakes.com/rrb3/. Here is a picture of the board:

The board will provide the drive signals for the motors on each of the wheels. The following are the steps to connect the Raspberry Pi to the board:

First, connect the battery power connector to the power connector on the side of the board.
Next, connect the two wires from one of the motors to the L motor connectors on the board.
Connect the other two wires from the other motor to the R motor connectors on the board.

Once completed, your connections should look like this:

The red and black wires go to the battery, the green and yellow to the left motor, and the blue and white to the right motor. Now you will be able to control both the speed and the direction of the motors through the motor control board; a short Python sketch showing one way to do this follows at the end of this article.

Summary
Thus we have covered some aspects of building your first project, your own R2D2. You can now move it around, program it to respond to voice commands, or run it remotely from a computer, tablet, or phone. Following in this theme, your next robot will look and act like WALL-E.

Resources for Article:
Further resources on this subject:
The Raspberry Pi and Raspbian [article]
Building Our First Poky Image for the Raspberry Pi [article]
Raspberry Pi LED Blueprints [article]
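As promised above, here is a minimal Python sketch showing how the two motors might be driven once the RaspiRobot Board V3 is mounted. It is not taken from the book: it assumes the rrb3 Python library published for the board is installed on the Pi, and that the RRB3 class, its constructor arguments, and its set_motors method behave as described in the board's documentation, so check the names against your installed version before relying on it.

from time import sleep
# Assumption: the rrb3 library for the RaspiRobot Board V3 is installed on the Pi.
from rrb3 import RRB3

# The two arguments are the battery voltage and the motor voltage; adjust for your hardware.
rr = RRB3(9.0, 6.0)

try:
    # set_motors(left_speed, left_direction, right_speed, right_direction)
    # speeds range from 0 to 1, directions are 0 (forward) or 1 (reverse).
    rr.set_motors(0.5, 0, 0.5, 0)   # drive both wheels forward at half speed
    sleep(2)
    rr.set_motors(0.5, 0, 0.5, 1)   # run the wheels in opposite directions to spin in place
    sleep(1)
finally:
    rr.set_motors(0, 0, 0, 0)       # always stop the motors when finished

If the sketch works, R2D2 should roll forward for two seconds and then spin briefly; from there, the same calls can be wired up to voice commands or a remote control interface, as suggested in the summary.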
Creating a Slack Progress Bar

Bradley Cicenas
13 Sep 2016
4 min read
There are many ways you can customize your Slack program, but with a little bit of programming you can create things that provide you with all sorts of productivity-enhancing options. In this post, you are going to learn how to use Python to create a custom progress bar that will run in your very own Slack channel. You can use this progress bar in many different ways. We won't get into the specifics beyond the progress bar, but one idea would be to use it to broadcast the progress of a to-do list in Trello. That is something that is easy to set up in Slack. Let's get started.

Real-time notifications are a great way to follow along with the events of the day. For ever-changing and long-running events, though, a stream of update messages quickly becomes spammy and overwhelming. If you make a typo in a sent message, or want to append additional details, you know that you can go back and edit it without sending a new chat (and subsequently a new notification for those in the channel); so why can't your bot friends do the same? As you may have guessed by now, they can! This program will make use of Slack's chat.update method to create a dynamic progress bar with just one single-line message.

Save the below script to a file named progress.py, updating the SLACK_TOKEN and SLACK_CHANNEL variables at the top with your configured token and desired channel name.

from time import sleep
from random import randint

from slacker import Slacker

SLACK_TOKEN = '<token>'
SLACK_CHANNEL = '#general'


class SlackSimpleProgress(object):
    def __init__(self, slack_token, slack_channel):
        self.slack = Slacker(slack_token)
        self.channel = slack_channel
        res = self.slack.chat.post_message(self.channel, self.make_bar(0))
        self.msg_ts = res.body['ts']
        self.channel_id = res.body['channel']

    def update(self, percent_done):
        self.slack.chat.update(self.channel_id, self.msg_ts, self.make_bar(percent_done))

    @staticmethod
    def make_bar(percent_done):
        # chr(9608) is the solid block character used to draw the bar
        return '%s%s%%' % (int(round(percent_done / 5)) * chr(9608), percent_done)


if __name__ == '__main__':
    progress_bar = SlackSimpleProgress(SLACK_TOKEN, SLACK_CHANNEL)  # initialize the progress bar
    percent_complete = 1
    while percent_complete < 100:
        progress_bar.update(percent_complete)
        percent_complete += randint(1, 10)
        sleep(1)
    progress_bar.update(100)

Run the script with a simple python progress.py and, like magic, your progress bar will appear in the channel, updating at regular intervals until completion:

How It Works
In the last six lines of the script:

A progress bar is initially created with SlackSimpleProgress.
The update method is used to refresh the current position of the bar once a second.
percent_complete is slowly incremented by a random amount using Python's random.randint.
We set the progress bar to 100% when completed.

Extending
With the basic usage down, the same script can be updated to serve a real purpose.
Say we're copying a large number of files between directories for a backup. We'll replace the __main__ part of the script with:

import os
import shutil
import sys

src_path = sys.argv[1]
dest_path = sys.argv[2]

all_files = next(os.walk(src_path))[2]  # list the files in the source directory

progress_bar = SlackSimpleProgress(SLACK_TOKEN, SLACK_CHANNEL)  # initialize the progress bar

files_copied = 0
for file in all_files:
    src_file = '%s/%s' % (src_path, file)
    dest_file = '%s/%s' % (dest_path, file)
    print('copying %s to %s' % (src_file, dest_file))  # print the file being copied
    shutil.copy2(src_file, dest_file)
    files_copied += 1  # increment files copied
    percent_done = files_copied / float(len(all_files)) * 100  # calculate percent done
    progress_bar.update(percent_done)

The script can now be run with two arguments: python progress.py /path/to/files /path/to/destination. You'll see the name of each file copied in the output of the script, and your teammates will be able to follow along with the status of the copy as it progresses in your Slack channel!

I hope you find many uses for this progress bar in Slack!

About the author
Bradley Cicenas is a New York City-based infrastructure engineer with an affinity for microservices, systems design, data science, and stoops.
Computer Vision with Keras, Part 2

Sasank Chilamkurthy
13 Sep 2016
6 min read
If you were following along in Part 1, you will have seen how we used Keras to create our model for tackling the German Traffic Sign Recognition Benchmark (GTSRB). Now, in Part 2, you will see how we achieve performance close to human-level performance. You will also see how to improve the accuracy of the model using augmentation of the training data.

Training
Now, our model is ready to train. During training, our model will iterate over batches of the training set, each of size batch_size. For each batch, gradients will be computed and updates will be made to the weights of the network automatically. One iteration over all of the training set is referred to as an epoch. Training is usually run until the loss converges to a constant.

We will add a couple of features to our training:

Learning rate scheduler: Decaying the learning rate over the epochs usually helps the model learn better.
Model checkpoint: We will save the model with the best validation accuracy. This is useful because our network might start overfitting after a certain number of epochs, but we want the best model.

These are not necessary, but they improve the model accuracy. They are implemented via the callback feature of Keras. Callbacks are a set of functions that are applied at given stages of the training procedure, such as the end of an epoch of training. Keras provides built-in functions for both learning rate scheduling and model checkpointing.

from keras.callbacks import LearningRateScheduler, ModelCheckpoint

def lr_schedule(epoch):
    # lr is the base learning rate used when compiling the model in Part 1
    return lr * (0.1 ** int(epoch / 10))

batch_size = 32
nb_epoch = 30

model.fit(X, Y,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          validation_split=0.2,
          callbacks=[LearningRateScheduler(lr_schedule),
                     ModelCheckpoint('model.h5', save_best_only=True)]
          )

You'll see that the model starts training and logs the losses and accuracies:

Train on 31367 samples, validate on 7842 samples
Epoch 1/30
31367/31367 [==============================] - 30s - loss: 1.1502 - acc: 0.6723 - val_loss: 0.1262 - val_acc: 0.9616
Epoch 2/30
31367/31367 [==============================] - 32s - loss: 0.2143 - acc: 0.9359 - val_loss: 0.0653 - val_acc: 0.9809
Epoch 3/30
31367/31367 [==============================] - 31s - loss: 0.1342 - acc: 0.9604 - val_loss: 0.0590 - val_acc: 0.9825
...

Now this might take a bit of time, especially if you are running on a CPU. If you have an Nvidia GPU, you should install CUDA. It speeds up the training dramatically. For example, on my MacBook Air it takes 10 minutes per epoch, while on a machine with an Nvidia Titan X GPU it takes 30 seconds. Even modest GPUs offer an impressive speedup because of the inherent parallelizability of neural networks. This makes GPUs necessary for deep learning if anything big has to be done. Grab a coffee while you wait for training to complete ;).

Congratulations! You have just trained your first deep learning model.
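The fit call above relies on the model, X, Y, lr, and sgd objects that were built in Part 1. If you don't have Part 1 open, the following is a rough sketch of what the preprocess_img() and cnn_model() helpers used there (and again in the evaluation code below) might look like. The IMG_SIZE and NUM_CLASSES values, the layer sizes, and the preprocessing steps are illustrative assumptions rather than the article's exact definitions, and the code assumes channels-first ("th") image dim ordering in Keras 1.x.

import numpy as np
from skimage import color, exposure, transform, io  # io is used later for reading the test images
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout

IMG_SIZE = 48      # assumed input resolution
NUM_CLASSES = 43   # the GTSRB dataset contains 43 traffic sign classes

def preprocess_img(img):
    # histogram normalization in the value channel of HSV
    hsv = color.rgb2hsv(img)
    hsv[:, :, 2] = exposure.equalize_hist(hsv[:, :, 2])
    img = color.hsv2rgb(hsv)
    # central crop to a square, then resize to the network input size
    min_side = min(img.shape[:2])
    cy, cx = img.shape[0] // 2, img.shape[1] // 2
    img = img[cy - min_side // 2:cy + min_side // 2,
              cx - min_side // 2:cx + min_side // 2, :]
    img = transform.resize(img, (IMG_SIZE, IMG_SIZE))
    # roll the color axis to the front to match the (3, IMG_SIZE, IMG_SIZE) input shape
    return np.rollaxis(img, -1)

def cnn_model():
    model = Sequential()
    model.add(Convolution2D(32, 3, 3, border_mode='same', activation='relu',
                            input_shape=(3, IMG_SIZE, IMG_SIZE)))
    model.add(Convolution2D(32, 3, 3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(NUM_CLASSES, activation='softmax'))
    return model

Treated as a stand-in for the Part 1 definitions, a sketch like this is enough to make the training, evaluation, and augmentation snippets in this part runnable end to end.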
Evaluation
Let's quickly load the test data and evaluate our model on it:

import pandas as pd

test = pd.read_csv('GT-final_test.csv', sep=';')  # load the test dataset

X_test = []
y_test = []

for file_name, class_id in zip(list(test['Filename']), list(test['ClassId'])):
    img_path = os.path.join('GTSRB/Final_Test/Images/', file_name)
    X_test.append(preprocess_img(io.imread(img_path)))
    y_test.append(class_id)

X_test = np.array(X_test)
y_test = np.array(y_test)

# predict and evaluate
y_pred = model.predict_classes(X_test)
acc = np.sum(y_pred == y_test) / np.size(y_pred)
print("Test accuracy = {}".format(acc))

Which outputs on my system (results may change a bit because the weights of the neural network are randomly initialized):

12630/12630 [==============================] - 2s
Test accuracy = 0.9792557403008709

97.92%! That's great! It's not far from average human performance (98.84%)[1].

A lot of things can be done to squeeze out extra performance from the neural net. I'll implement one such improvement in the next section.

Data Augmentation
You might think 40000 images is a lot of images. Think about it again. Our model has 1358155 parameters (try model.count_params() or model.summary()). That's 4X the number of training images. If we can generate new images for training from the existing images, that will be a great way to increase the size of the dataset. This can be done by slightly:

Translating the image
Rotating the image
Shearing the image
Zooming in/out of the image

Rather than generating and saving such images to hard disk, we will generate them on the fly during training. This can be done directly using the built-in functionality of Keras.

from keras.preprocessing.image import ImageDataGenerator
from sklearn.cross_validation import train_test_split

X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)

datagen = ImageDataGenerator(featurewise_center=False,
                             featurewise_std_normalization=False,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.2,
                             shear_range=0.1,
                             rotation_range=10.)
datagen.fit(X_train)

# Reinitialize model and compile
model = cnn_model()
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

# Train again
nb_epoch = 30
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch,
                    validation_data=(X_val, Y_val),
                    callbacks=[LearningRateScheduler(lr_schedule),
                               ModelCheckpoint('model.h5', save_best_only=True)]
                    )

With this model, I get 98.29% accuracy on the test set. Frankly, I haven't done much parameter tuning. I'll make a small list of things which can be tried to improve the model:

Try different network architectures: deeper and shallower networks
Try adding BatchNormalization layers to the network
Experiment with different weight initializations
Try different learning rates and schedules
Make an ensemble of models
Try normalization of input images
More aggressive data augmentation

This is but a model for beginners. For state-of-the-art solutions to the problem, you can have a look at this, where the authors achieve 99.61% accuracy with a specialized layer called the Spatial Transformer layer.

Conclusion
In this two-part post, you have learned how to use convolutional networks to solve a computer vision problem. We used the Keras deep learning framework to implement CNNs in Python. We have achieved performance close to human-level performance.
We have also seen a way to improve the accuracy of the model using augmentation of the training data.

References:
Stallkamp, Johannes, et al. "Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition." Neural Networks 32 (2012): 323-332.

About the author
Sasank Chilamkurthy works at Qure.ai. His work involves deep learning on medical images obtained from radiology and pathology. He completed his undergraduate degree at the Indian Institute of Technology Bombay, in Mumbai. He can be found on GitHub here.
Gearing Up for Bootstrap 4

Packt
12 Sep 2016
28 min read
In this article by Benjamin Jakobus and Jason Marah, the authors of the book Mastering Bootstrap 4, we will be discussing the key points about Bootstrap as a web development framework that helps developers build web interfaces. Originally conceived at Twitter in 2011 by Mark Otto and Jacob Thornton, the framework is now open source and has grown to be one of the most popular web development frameworks to date. Being freely available for private, educational, and commercial use meant that Bootstrap quickly grew in popularity. Today, thousands of organizations rely on Bootstrap, including NASA, Walmart, and Bloomberg. According to BuiltWith.com, over 10% of the world's top 1 million websites are built using Bootstrap (http://trends.builtwith.com/docinfo/Twitter-Bootstrap). As such, knowing how to use Bootstrap will be an important skill and serve as a powerful addition to any web developer’s tool belt. (For more resources related to this topic, see here.) The framework itself consists of a mixture of JavaScript and CSS, and provides developers with all the essential components required to develop a fully functioning web user interface. Over the course of the book, we will be introducing you to all of the most essential features that Bootstrap has to offer by teaching you how to use the framework to build a complete website from scratch. As CSS and HTML alone are already the subject of entire books in themselves, we assume that you, the reader, has at least a basic knowledge of HTML, CSS, and JavaScript. We begin this article by introducing you to our demo website—MyPhoto. This website will accompany us throughout the book, and serve as a practical point of reference. Therefore, all lessons learned will be taught within the context of MyPhoto. We will then discuss the Bootstrap framework, listing its features and contrasting the current release to the last major release (Bootstrap 3). Last but not least, this article will help you set up your development environment. To ensure equal footing, we will guide you towards installing the right build tools, and precisely detail the various ways in which you can integrate Bootstrap into a project. To summarize, this article will do the following: Introduce you to what exactly we will be doing Explain what is new in the latest version of Bootstrap, and how the latest version differs to the previous major release Show you how to include Bootstrap in our web project Introducing our demo project The book will teach you how to build a complete Bootstrap website from scratch. We will build and improve the website's various sections as we progress through the book. The concept behind our website is simple. To develop a landing page for photographers. Using this landing page, (hypothetical) users will be able to exhibit their wares and services. While building our website, we will be making use of the same third-party tools and libraries that you would if you were working as a professional software developer. We chose these tools and plugins specifically because of their widespread use. Learning how to use and integrate them will save you a lot of work when developing websites in the future. Specifically, the tools that we will use to assist us throughout the development of MyPhoto are Bower, node package manager (npm) and Grunt. From a development perspective, the construction of MyPhoto will teach you how to use and apply all of the essential user interface concepts and components required to build a fully functioning website. 
Among other things, you will learn how to do the following: Use the Bootstrap grid system to structure the information presented on your website. Create a fixed, branded, navigation bar with animated scroll effects. Use an image carousel for displaying different photographs, implemented using Bootstrap's carousel.js and jumbotron (jumbotron is a design principle for displaying important content). It should be noted that carousels are becoming an increasingly unpopular design choice, however, they are still heavily used and are an important feature of Bootstrap. As such, we do not argue for or against the use of carousels as their effectiveness depends very much on how they are used, rather than on whether they are used. Build custom tabs that allow users to navigate across different contents. Use and apply Bootstrap's modal dialogs. Apply a fixed page footer. Create forms for data entry using Bootstrap's input controls (text fields, text areas, and buttons) and apply Bootstrap's input validation styles. Make best use of Bootstrap's context classes. Create alert messages and learn how to customize them. Rapidly develop interactive data tables for displaying product information. How to use drop-down menus, custom fonts, and icons. In addition to learning how to use Bootstrap 4, the development of MyPhoto will introduce you to a range of third-party libraries such as Scrollspy (for scroll animations), SalvattoreJS (a library for complementing our Bootstrap grid), Animate.css (for beautiful CSS animations, such as fade-in effects at https://daneden.github.io/animate.css/) and Bootstrap DataTables (for rapidly displaying data in tabular form). The website itself will consist of different sections: A Welcome section An About section A Services section A Gallery section A Contact Us section The development of each section is intended to teach you how to use a distinct set of features found in third-party libraries. For example, by developing the Welcome section, you will learn how to use Bootstrap's jumbotron and alert dialogs along with different font and text styles, while the About section will show you how to use cards. The Services section of our project introduces you to Bootstrap's custom tabs. That is, you will learn how to use Bootstrap's tabs to display a range of different services offered by our website. Following on from the Services section, you will need to use rich imagery to really show off the website's sample services. You will achieve this by really mastering Bootstrap's responsive core along with Bootstrap's carousel and third-party jQuery plugins. Last but not least, the Contact Us section will demonstrate how to use Bootstrap's form elements and helper functions. That is, you will learn how to use Bootstrap to create stylish HTML forms, how to use form fields and input groups, and how to perform data validation. Finally, toward the end of the book, you will learn how to optimize your website, and integrate it with the popular JavaScript frameworks AngularJS (https://angularjs.org/) and React (http://facebook.github.io/react/). As entire books have been written on AngularJS alone, we will only cover the essentials required for the integration itself. Now that you have glimpsed a brief overview of MyPhoto, let’s examine Bootstrap 4 in more detail, and discuss what makes it so different to its predecessor. Take a look at the following screenshot: Figure 1.1: A taste of what is to come: the MyPhoto landing page. 
What Bootstrap 4 Alpha 4 has to offer Much has changed since Twitter’s Bootstrap was first released on August 19th, 2011. In essence, Bootstrap 1 was a collection of CSS rules offering developers the ability to lay out their website, create forms, buttons, and help with general appearance and site navigation. With respect to these core features, Bootstrap 4 Alpha 4 is still much the same as its predecessors. In other words, the framework's focus is still on allowing developers to create layouts, and helping to develop a consistent appearance by providing stylings for buttons, forms, and other user interface elements. How it helps developers achieve and use these features however, has changed entirely. Bootstrap 4 is a complete rewrite of the entire project, and, as such, ships with many fundamental differences to its predecessors. Along with Bootstrap's major features, we will be discussing the most striking differences between Bootstrap 3 and Bootstrap 4 in the sub sections below. Layout Possibly the most important and widely used feature is Bootstrap's ability to lay out and organize your page. Specifically, Bootstrap offers the following: Responsive containers. Responsive breakpoints for adjusting page layout in response to differing screen sizes. A 12 column grid layout for flexibly arranging various elements on your page. Media objects that act as building blocks and allow you to build your own structural components. Utility classes that allow you to manipulate elements in a responsive manner. For example, you can use the layout utility classes to hide elements, depending on screen size. Content styling Just like its predecessor, Bootstrap 4 overrides the default browser styles. This means that many elements, such as lists or headings, are padded and spaced differently. The majority of overridden styles only affect spacing and positioning, however, some elements may also have their border removed. The reason behind this is simple. To provide users with a clean slate upon which they can build their site. Building on this clean slate, Bootstrap 4 provides styles for almost every aspect of your webpage such as buttons (Figure 1.2), input fields, headings, paragraphs, special inline texts, such as keyboard input (Figure 1.3), figures, tables, and navigation controls. Aside from this, Bootstrap offers state styles for all input controls, for example, styles for disabled buttons or toggled buttons. Take a look at the following screenshot: Figure 1.2: The six button styles that come with Bootstrap 4 are btn-primary,btn-secondary, btn-success,btn-danger, btn-link,btn-info, and btn-warning. Take a look at the following screenshot: Figure 1.3: Bootstrap's content styles. In the preceding example, we see inline styling for denoting keyboard input. Components Aside from layout and content styling, Bootstrap offers a large variety of reusable components that allow you to quickly construct your website's most fundamental features. Bootstrap's UI components encompass all of the fundamental building blocks that you would expect a web development toolkit to offer: Modal dialogs, progress bars, navigation bars, tooltips, popovers, a carousel, alerts, drop-down menu, input groups, tabs, pagination, and components for emphasizing certain contents. Let's have a look at the following modal dialog screenshot: Figure 1.4: Various Bootstrap 4 components in action. In the screenshot above we see a sample modal dialog, containing an info alert, some sample text, and an animated progress bar. 
Mobile support Similar to its predecessor, Bootstrap 4 allows you to create mobile friendly websites without too much additional development work. By default, Bootstrap is designed to work across all resolutions and screen sizes, from mobile, to tablet, to desktop. In fact, Bootstrap's mobile first design philosophy implies that its components must display and function correctly at the smallest screen size possible. The reasoning behind this is simple. Think about developing a website without consideration for small mobile screens. In this case, you are likely to pack your website full of buttons, labels, and tables. You will probably only discover any usability issues when a user attempts to visit your website using a mobile device only to find a small webpage that is crowded with buttons and forms. At this stage, you will be required to rework the entire user interface to allow it to render on smaller screens. For precisely this reason, Bootstrap promotes a bottom-up approach, forcing developers to get the user interface to render correctly on the smallest possible screen size, before expanding upwards. Utility classes Aside from ready-to-go components, Bootstrap offers a large selection of utility classes that encapsulate the most commonly needed style rules. For example, rules for aligning text, hiding an element, or providing contextual colors for warning text. Cross-browser compatibility Bootstrap 4 supports the vast majority of modern browsers, including Chrome, Firefox, Opera, Safari, Internet Explorer (version 9 and onwards; Internet Explorer 8 and below are not supported), and Microsoft Edge. Sass instead of Less Both Less and Sass (Syntactically Awesome Stylesheets) are CSS extension languages. That is, they are languages that extend the CSS vocabulary with the objective of making the development of many, large, and complex style sheets easier. Although Less and Sass are fundamentally different languages, the general manner in which they extend CSS is the same—both rely on a preprocessor. As you produce your build, the preprocessor is run, parsing the Less/Sass script and turning your Less or Sass instructions into plain CSS. Less is the official Bootstrap 3 build, while Bootstrap 4 has been developed from scratch, and is written entirely in Sass. Both Less and Sass are compiled into CSS to produce a single file, bootstrap.css. It is this CSS file that we will be primarily referencing throughout this book (with the exception of Chapter 3, Building the Layout). Consequently, you will not be required to know Sass in order to follow this book. However, we do recommend that you take a 20 minute introductory course on Sass if you are completely new to the language. Rest assured, if you already know CSS, you will not need more time than this. The language's syntax is very close to normal CSS, and its elementary concepts are similar to those contained within any other programming language. From pixel to root em Unlike its predecessor, Bootstrap 4 no longer uses pixel (px) as its unit of typographic measurement. Instead, it primarily uses root em (rem). The reasoning behind choosing rem is based on a well known problem with px, that is websites using px may render incorrectly, or not as originally intended, as users change the size of the browser's base font. Using a unit of measurement that is relative to the page's root element helps address this problem, as the root element will be scaled relative to the browser's base font. 
In turn, a page will be scaled relative to this root element. Typographic units of measurement Simply put, typographic units of measurement determine the size of your font and elements. The most commonly used units of measurement are px and em. The former is an abbreviation for pixel, and uses a reference pixel to determine a font's exact size. This means that, for displays of 96 dots per inch (dpi), 1 px will equal an actual pixel on the screen. For higher resolution displays, the reference pixel will result in the px being scaled to match the display's resolution. For example, specifying a font size of 100 px will mean that the font is exactly 100 pixels in size (on a display with 96 dpi), irrespective of any other element on the page. Em is a unit of measurement that is relative to the parent of the element to which it is applied. So, for example, if we were to have two nested div elements, the outer element with a font size of 100 px and the inner element with a font size of 2 em, then the inner element's font size would translate to 200 px (as in this case 1 em = 100 px). The problem with using a unit of measurement that is relative to parent elements is that it increases your code's complexity, as the nesting of elements makes size calculations more difficult. The recently introduced rem measurement aims to address both em's and px's shortcomings by combining their two strengths—instead of being relative to a parent element, rem is relative to the page's root element. No more support for Internet Explorer 8 As was already implicit in the feature summary above, the latest version of Bootstrap no longer supports Internet Explorer 8. As such, the decision to only support newer versions of Internet Explorer was a reasonable one, as not even Microsoft itself provides technical support and updates for Internet Explorer 8 anymore (as of January 2016). Furthermore, Internet Explorer 8 does not support rem, meaning that Bootstrap 4 would have been required to provide a workaround. This in turn would most likely have implied a large amount of additional development work, with the potential for inconsistencies. Lastly, responsive website development for Internet Explorer 8 is difficult, as the browser does not support CSS media queries. Given these three factors, dropping support for this version of Internet Explorer was the most sensible path of action. A new grid tier Bootstrap's grid system consists of a series of CSS classes and media queries that help you lay out your page. Specifically, the grid system helps alleviate the pain points associated with horizontal and vertical positioning of a page's contents and the structure of the page across multiple displays. With Bootstrap 4, the grid system has been completely overhauled, and a new grid tier has been added with a breakpoint of 480 px and below. We will be talking about tiers, breakpoints, and Bootstrap's grid system extensively in this book. Bye-bye GLYPHICONS Bootstrap 3 shipped with a nice collection of over 250 font icons, free of use. In an effort to make the framework more lightweight (and because font icons are considered bad practice), the GLYPHICON set is no longer available in Bootstrap 4. Bigger text – No more panels, wells, and thumbnails The default font size in Bootstrap 4 is 2 px bigger than in its predecessor, increasing from 14 px to 16 px. Furthermore, Bootstrap 4 replaced panels, wells, and thumbnails with a new concept—cards. 
To readers unfamiliar with the concept of wells, a well is a UI component that allows developers to highlight text content by applying an inset shadow effect to the element to which it is applied. A panel too serves to highlight information, but by applying padding and rounded borders. Cards serve the same purpose as their predecessors, but are less restrictive as they are flexible enough to support different types of content, such as images, lists, or text. They can also be customized to use footers and headers. Take a look at the following screenshot: Figure 1.5: The Bootstrap 4 card component replaces existing wells, thumbnails, and panels. New and improved form input controls Bootstrap 4 introduces new form input controls—a color chooser, a date picker, and a time picker. In addition, new classes have been introduced, improving the existing form input controls. For example, Bootstrap 4 now allows for input control sizing, as well as classes for denoting block and inline level input controls. However, one of the most anticipated new additions is Bootstrap's input validation styles, which used to require third-party libraries or a manual implementation, but are now shipped with Bootstrap 4 (see Figure 1.6 below). Take a look at the following screenshot: Figure 1.6: The new Bootstrap 4 input validation styles, indicating the successful processing of input. Last but not least, Bootstrap 4 also offers custom forms in order to provide even more cross-browser UI consistency across input elements (Figure 1.7). As noted in the Bootstrap 4 Alpha 4 documentation, the input controls are: "built on top of semantic and accessible markup, so they're solid replacements for any default form control" – Source: http://v4-alpha.getbootstrap.com/components/forms/ Take a look at the following screenshot: Figure 1.7: Custom Bootstrap input controls that replace the browser defaults in order to ensure cross-browser UI consistency. Customization The developers behind Bootstrap 4 have put specific emphasis on customization throughout the development of Bootstrap 4. As such, many new variables have been introduced that allow for the easy customization of Bootstrap. Using the $enabled-*- Sass variables, one can now enable or disable specific global CSS preferences. Setting up our project Now that we know what Bootstrap has to offer, let us set up our project: Create a new project directory named MyPhoto. This will become our project root directory. Create a blank index.html file and insert the following HTML code: <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta http-equiv="x-ua-compatible" content="ie=edge"> <title>MyPhoto</title> </head> <body> <div class="alert alert-success"> Hello World! </div> </body> </html> Note the three meta tags: The first tag tells the browser that the document in question is utf-8 encoded. Since Bootstrap optimizes its content for mobile devices, the subsequent meta tag is required to help with viewport scaling. The last meta tag forces the document to be rendered using the latest document rendering mode available if viewed in Internet Explorer. Open the index.html in your browser. You should see just a blank page with the words Hello World. Now it is time to include Bootstrap. At its core, Bootstrap is a glorified CSS style sheet. Within that style sheet, Bootstrap exposes very powerful features of CSS with an easy-to-use syntax. 
It being a style sheet, you include it in your project as you would with any other style sheet that you might develop yourself. That is, open the index.html and directly link to the style sheet. Viewport scaling The term viewport refers to the available display size to render the contents of a page. The viewport meta tag allows you to define this available size. Viewport scaling using meta tags was first introduced by Apple and, at the time of writing, is supported by all major browsers. Using the width parameter, we can define the exact width of the user's viewport. For example, <meta name="viewport" content="width=320px"> will instruct the browser to set the viewport's width to 320 px. The ability to control the viewport's width is useful when developing mobile-friendly websites; by default, mobile browsers will attempt to fit the entire page onto their viewports by zooming out as far as possible. This allows users to view and interact with websites that have not been designed to be viewed on mobile devices. However, as Bootstrap embraces a mobile-first design philosophy, a zoom out will, in fact, result in undesired side-effects. For example, breakpoints will no longer work as intended, as they now deal with the zoomed out equivalent of the page in question. This is why explicitly setting the viewport width is so important. By writing content="width=device-width, initial-scale=1, shrink-to-fit=no", we are telling the browser the following: To set the viewport's width equal to whatever the actual device's screen width is. We do not want any zoom, initially. We do not wish to shrink the content to fit the viewport. For now, we will use the Bootstrap builds hosted on Bootstrap's official Content Delivery Network (CDN). This is done by including the following HTML tag into the head of your HTML document (the head of your HTML document refers to the contents between the <head> opening tag and the </head> closing tag): <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.4/css/bootstrap.min.css"> Bootstrap relies on jQuery, a JavaScript framework that provides a layer of abstraction in an effort to simplify the most common JavaScript operations (such as element selection and event handling). Therefore, before we include the Bootstrap JavaScript file, we must first include jQuery. Both inclusions should occur just before the </body> closing tag: <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"> </script> <script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.4/js/bootstrap.min.js"></script> Note that, while these scripts could, of course, be loaded at the top of the page, loading scripts at the end of the document is considered best practice to speed up page loading times and to avoid JavaScript issues preventing the page from being rendered. The reason behind this is that browsers do not download all dependencies in parallel (although a certain number of requests are made asynchronously, depending on the browser and the domain). Consequently, forcing the browser to download dependencies early on will block page rendering until these assets have been downloaded. Furthermore, ensuring that your scripts are loaded last will ensure that once you invoke Document Object Model (DOM) operations in your scripts, you can be sure that your page's elements have already been rendered. As a result, you can avoid checks that ensure the existence of given elements. What is a Content Delivery Network? 
The objective behind any Content Delivery Network (CDN) is to provide users with content that is highly available. This means that a CDN aims to provide you with content, without this content ever (or rarely) becoming unavailable. To this end, the content is often hosted using a large, distributed set of servers. The BootstrapCDN basically allows you to link to the Bootstrap style sheet so that you do not have to host it yourself. Save your changes and reload the index.html in your browser. The Hello World string should now contain a green background: Figure 1.5: Our "Hello World" styled using Bootstrap 4. Now that the Bootstrap framework has been included in our project, open your browser's developer console (if using Chrome on Microsoft Windows, press Ctrl + Shift + I. On Mac OS X you can press cmd + alt + I). As Bootstrap requires another third-party library, Tether for displaying popovers and tooltips, the developer console will display an error (Figure 1.6). Take a look at the following screenshot: Figure 1.6: Chrome's Developer Tools can be opened by going to View, selecting Developer and then clicking on Developer Tools. At the bottom of the page, a new view will appear. Under the Console tab, an error will indicate an unmet dependency. Tether is available via the CloudFare CDN, and consists of both a CSS file and a JavaScript file. As before, we should include the JavaScript file at the bottom of our document while we reference Tether's style sheet from inside our document head: <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta http-equiv="x-ua-compatible" content="ie=edge"> <title>MyPhoto</title> <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.4/css/bootstrap.min.css"> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/tether/1.3.1/css/tether.min.css"> </head> <body> <div class="alert alert-success"> Hello World! </div> <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/tether/1.3.1/js/tether.min.js"></script> <script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.4/js/bootstrap.min.js"></script> </body> </html> While CDNs are an important resource, there are several reasons why, at times, using a third party CDN may not be desirable: CDNs introduce an additional point of failure, as you now rely on third-party servers. The privacy and security of users may be compromised, as there is no guarantee that the CDN provider does not inject malicious code into the libraries that are being hosted. Nor can one be certain that the CDN does not attempt to track its users. Certain CDNs may be blocked by the Internet Service Providers of users in different geographical locations. Offline development will not be possible when relying on a remote CDN. You will not be able to optimize the files hosted by your CDN. This loss of control may affect your website's performance (although typically you are more often than not offered an optimized version of the library through the CDN). Instead of relying on a CDN, we could manually download the jQuery, Tether, and Bootstrap project files. We could then copy these builds into our project root and link them to the distribution files. 
The disadvantage of this approach is the fact that maintaining a manual collection of dependencies can quickly become very cumbersome, and next to impossible as your website grows in size and complexity. As such, we will not manually download the Bootstrap build. Instead, we will let Bower do it for us. Bower is a package management system, that is, a tool that you can use to manage your website's dependencies. It automatically downloads, organizes, and (upon command) updates your website's dependencies. To install Bower, head over to http://bower.io/. How do I install Bower? Before you can install Bower, you will need two other tools: Node.js and Git. The latter is a version control tool—in essence, it allows you to manage different versions of your software. To install Git, head over to http://git-scm.com/and select the installer appropriate for your operating system. NodeJS is a JavaScript runtime environment needed for Bower to run. To install it, simply download the installer from the official NodeJS website: https://nodejs.org/ Once you have successfully installed Git and NodeJS, you are ready to install Bower. Simply type the following command into your terminal: npm install -g bower This will install Bower for you, using the JavaScript package manager npm, which happens to be used by, and is installed with, NodeJS. Once Bower has been installed, open up your terminal, navigate to the project root folder you created earlier, and fetch the bootstrap build: bower install bootstrap#v4.0.0-alpha.4 This will create a new folder structure in our project root: bower_components bootstrap Gruntfile.js LICENSE README.md bower.json dist fonts grunt js less package.js package.json We will explain all of these various files and directories later on in this book. For now, you can safely ignore everything except for the dist directory inside bower_components/bootstrap/. Go ahead and open the dist directory. You should see three sub directories: css fonts js The name dist stands for distribution. Typically, the distribution directory contains the production-ready code that users can deploy. As its name implies, the css directory inside dist includes the ready-for-use style sheets. Likewise, the js directory contains the JavaScript files that compose Bootstrap. Lastly, the fonts directory holds the font assets that come with Bootstrap. To reference the local Bootstrap CSS file in our index.html, modify the href attribute of the link tag that points to the bootstrap.min.css: <link rel="stylesheet" href="bower_components/bootstrap/dist/css/bootstrap.min.css"> Let's do the same for the Bootstrap JavaScript file: <script src="bower_components/bootstrap/dist/js/bootstrap.min.js"></script> Repeat this process for both jQuery and Tether. To install jQuery using Bower, use the following command: bower install jquery Just as before, a new directory will be created inside the bower_components directory: bower_components jquery AUTHORS.txt LICENSE.txt bower.json dist sizzle src Again, we are only interested in the contents of the dist directory, which, among other files, will contain the compressed jQuery build jquery.min.js. 
Reference this file by modifying the src attribute of the script tag that currently points to Google's jquery.min.js by replacing the URL with the path to our local copy of jQuery: <script src="bower_components/jquery/dist/jquery.min.js"></script> Last but not least, repeat the steps already outlined above for Tether: bower install tether Once the installation completes, a similar folder structure than the ones for Bootstrap and jQuery will have been created. Verify the contents of bower_components/tether/dist and replace the CDN Tether references in our document with their local equivalent. The final index.html should now look as follows: <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta http-equiv="x-ua-compatible" content="ie=edge"> <title>MyPhoto</title> <link rel="stylesheet" href="bower_components/bootstrap/dist/css/bootstrap.min.css"> <link rel="stylesheet" href="bower_components/tether/dist/css/tether.min.css"> </head> <body> <div class="alert alert-success"> Hello World! </div> <script src="bower_components/jquery/dist/jquery.min.js"></script> <script src="bower_components/tether/dist/js/tether.min.js"></script> <script src="bower_components/bootstrap/dist/js/bootstrap.min.js"></script> </body> </html> Refresh the index.html in your browser to make sure that everything works. What IDE and browser should I be using when following the examples in this book? While we recommend a JetBrains IDE or Sublime Text along with Google Chrome, you are free to use whatever tools and browser you like. Our taste in IDE and browser is subjective on this matter. However, keep in mind that Bootstrap 4 does not support Internet Explorer 8 or below. As such, if you do happen to use Internet Explorer 8, you should upgrade it to the latest version. Summary Aside from introducing you to our sample project MyPhoto, this article was concerned with outlining Bootstrap 4, highlighting its features, and discussing how this new version of Bootstrap differs to the last major release (Bootstrap 3). The article provided an overview of how Bootstrap can assist developers in the layout, structuring, and styling of pages. Furthermore, we noted how Bootstrap provides access to the most important and widely used user interface controls through the form of components that can be integrated into a page with minimal effort. By providing an outline of Bootstrap, we hope that the framework's intrinsic value in assisting in the development of modern websites has become apparent to the reader. Furthermore, during the course of the wider discussion, we highlighted and explained some important concepts in web development, such as typographic units of measurement or the definition, purpose and justification of the use of Content Delivery Networks. Last but not least, we detailed how to include Bootstrap and its dependencies inside an HTML document.
Intro to the Swift REPL and Playgrounds

Dov Frankel
11 Sep 2016
6 min read
When Apple introduced Swift at WWDC (its annual Worldwide Developers Conference) in 2014, it had a few goals for the new language. Among them was being easy to learn, especially compared to other compiled languages. The following is quoted from Apple's The Swift Programming Language: Swift is friendly to new programmers. It is the first industrial-quality systems programming language that is as expressive and enjoyable as a scripting language. The REPL Swift's playgrounds embody that friendliness by modernizing the concept of a REPL (Read-Eval-Print Loop, pronounced "repple"). Introduced by the LISP language, and now a common feature of many modern scripting languages, REPLs allow you to quickly and interactively build up your code, one line at a time. A post on the Swift Blog gives an overview of the Swift REPL's features, but this is what using it looks like (to launch it, enter swift in Terminal, if you have Xcode already installed): Welcome to Apple Swift version 2.2 (swiftlang-703.0.18.1 clang-703.0.29). Type :help for assistance.   1> "Hello world" $R0: String = "Hello World"   2> let a = 1 a: Int = 1   3> let b = 2 b: Int = 2   4> a + b $R1: Int = 3   5> func aPlusB() {   6.     print("(a + b)")   7. }   8> aPlusB() 3 If you look at what's there, each line containing input you give has a small indentation. A line that starts a new command begins with the line number, followed by > (1, 2, 3, 4, 5, and 8), and each subsequent line for a given command begins with the line number, followed by  (6 and 7). These help to keep you oriented as you enter your code one line at a time. While it's certainly possible to work on more complicated code this way, it requires the programmer to keep more state about the code in his head, and it limits the output to data types that can be represented in text only. Playgrounds Playgrounds take the concept of a REPL to the next level by allowing you to see all of your code in a single editable code window, and giving you richer options to visualize your data. To get started with a Swift playground, launch Xcode, and select File > New > Playground… (⌥⇧⌘N) to create a new playground. In the following, you can see a new playground with the same code entered into the previous  REPL:   The results are shown in the gray area on the right-hand side of the window update as you type, which allows for rapid iteration. You can write standalone functions, classes, or whatever level of abstraction you wish to work in for the task at hand, removing barriers that prevent the expression of your ideas, or experimentation with the language and APIs. So, what types of goals can you accomplish? Experimentation with the language and APIs Playgrounds are an excellent place to learn about Swift, whether you're new to the language, or new to programming. You don't need to worry about main(), app bundles, simulators, or any of the other things that go into making a full-blown app. Or, if you hear about a new framework and would like to try it out, you can import it into your playground and get your hands dirty with minimal effort. Crucially, it also blows away the typical code-build-run-quit-repeat cycle that can often take up so much development time. Providing living documentation or code samples Playgrounds provide a rich experience to allow users to try out concepts they're learning, whether they're reading a technical article, or learning to use a framework. 
Aside from interactivity, playgrounds provide a whole other type of richness: built-in formatting via Markdown, which you can sprinkle in your playground as easily as writing comments. This allows some interesting options such as describing exercises for students to complete or providing sample code that runs without any action required of the user. Swift blog posts have included playgrounds to experiment with, as does the Swift Programming Language's A Swift Tour chapter. To author Markdown in your playground, start a comment block with /*:. Then, to see the comment rendered, click on Editor > Show Rendered Markup. There are some unique features available, such as marking page boundaries and adding fields that populate the built-in help Xcode shows. You can learn more at Apple's Markup Formatting Reference page. Designing code or UI You can also use playgrounds to interactively visualize how your code functions. There are a few ways to see what your code is doing: Individual results are shown in the gray side panel and can be Quick-Looked (including your own custom objects that implement debugQuickLookObject()). Individual or looped values can be pinned to show inline with your code. A line inside a loop will read "(10 times)," with a little circle you can toggle to pin it. For instance, you can show how a value changes with each iteration, or how a view looks: Using some custom APIs provided in the XCPlayground module, you can show live UIViews and captured values. Just import XCPlayground and set the XCPlayground.currentPage.liveView to a UIView instance, or call XCPlayground.currentPage.captureValue(someValue, withIdentifier: "My label") to fill the Assistant view. You also still have Xcode's console available to you for when debugging is best served by printing values and keeping scrolled to the bottom. As with any Swift code, you can write to the console with NSLog and print. Working with resources Playgrounds can also include resources for your code to use, such as images: A .playground file is an OS X package (a folder presented to the user as a file), which contains two subfolders: Sources and Resources. To view these folders in Xcode, show the Project Navigator in Xcode's left sidebar, the same as for a regular project. You can then drag in any resources to the Resources folder, and they'll be exposed to your playground code the same as resources are in an app bundle. You can refer to them like so: let spriteImage = UIImage(named:"sprite.png") Xcode versions starting with 7.1 even support image literals, meaning you can drag a Resources image into your source code and treat it as a UIImage instance. It's a neat idea, but makes for some strange-looking code. It's more useful for UIColors, which allow you to use a color-picker. The Swift blog post goes into more detail on how image, color, and file literals work in playgrounds: Wrap-up Hopefully this has opened your eyes to the opportunities afforded by playgrounds. They can be useful in different ways to developers of various skill and experience levels (in Swift, or in general) when learning new things or working on solving a given problem. They allow you to iterate more quickly, and visualize how your code operates more easily than other debugging methods. About the author Dov Frankel (pronounced as in "he dove swiftly into the shallow pool") works on Windows in-house software at a Connecticut hedge fund by day, and independent Mac and iOS apps by night. 
Find his digital comic reader on the Mac App Store, and his iPhone app Afterglo in the iOS App Store. He blogs when the mood strikes him, at dovfrankel.com; he's @DovFrankel on Twitter, and @abbeycode on GitHub.

Simple Slack Websocket Integrations in <10 lines of Python

Bradley Cicenas
09 Sep 2016
3 min read
If you use Slack, you've probably added a handful of integrations for your team from the ever-growing App Directory, and maybe even had an idea for your own Slack app. While the Slack API is featureful and robust, writing your own integration can be exceptionally easy. Through the Slack RTM (Real Time Messaging) API, you can write your own basic integrations in just a few lines of Python using the SlackSocket library. Want an accessible introduction to Python that's comprehensive enough to give you the confidence you need to dive deeper? This week, follow our Python Fundamentals course inside Mapt. It's completely free - so what have you got to lose?

Structure

Our integration will be structured with the following basic components:
Listener
Integration/bot logic
Response
The listener watches for one or more pre-defined "trigger" words, while the response posts the result of our intended task.

Basic Integration

We'll start by setting up SlackSocket with our API token:

from slacksocket import SlackSocket
slack = SlackSocket('<slack-token>', event_filter=['message'])

By default, SlackSocket will listen for all Slack events. There are a lot of different events sent via RTM, but we're only concerned with 'message' events for our integration, so we've set an event_filter for only this type. Using the SlackSocket events() generator, we'll read each 'message' event that comes in and can act on various conditions:

for e in slack.events():
    if e.event['text'] == '!hello':
        slack.send_msg('it works!', channel_name=e.event['channel'])

If our message text matches the string '!hello', we'll respond to the source channel of the event with a given message ('it works!'). At this point, we've created a complete integration that can connect to Slack as a bot user (or regular user), follow messages, and respond accordingly. Let's build something a bit more useful, like a password generator for throwaway accounts.

Expanding Functionality

For this integration command, we'll write a simple function to generate a random alphanumeric string 15 characters long:

import random
import string

def randomstr():
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(15))

Now we're ready to provide our random string generator to the rest of the team using the same chat logic as before, responding to the source channel with our generated password:

for e in slack.events():
    if e.event['text'].startswith('!random'):
        slack.send_msg(randomstr(), channel_name=e.event['channel'])

Altogether:

import random
import string

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filter=['message'])

def randomstr():
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(15))

for e in slack.events():
    if e.event['text'].startswith('!random'):
        slack.send_msg(randomstr(), channel_name=e.event['channel'])

And the results:

A complete integration in 10 lines of Python. Not bad! Beyond simplicity, SlackSocket provides a great deal of flexibility for writing apps, bots, or integrations. In the case of massive Slack groups with several thousand users, messages are buffered locally to ensure that none are missed. Dropped websocket connections are automatically re-connected as well, making it an ideal base for a chat client. The code for SlackSocket is available on GitHub, and as always, we welcome any contributions or feature requests!
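If you want to grow the bot beyond a single trigger word, one low-effort pattern is to map trigger words to handler functions in a plain dictionary and keep the event loop generic. The sketch below is only one possible way to organize things, not part of the SlackSocket API itself; the only SlackSocket calls it relies on are the ones already shown above (events(), e.event['text'], e.event['channel'], and send_msg()), and the !help command is just an illustrative addition.

import random
import string

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filter=['message'])

def randomstr():
    # Same throwaway-password helper as above
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(15))

# Map each trigger word to a function that returns the reply text
commands = {
    '!hello': lambda: 'it works!',
    '!random': randomstr,
    '!help': lambda: 'available commands: ' + ', '.join(sorted(commands)),
}

for e in slack.events():
    words = e.event['text'].split()
    if words and words[0] in commands:
        slack.send_msg(commands[words[0]](), channel_name=e.event['channel'])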
About the author Bradley Cicenas is a New York City-based infrastructure engineer with an affinity for microservices, systems design, data science, and stoops.

Using Web API to Extend Your Application

Packt
08 Sep 2016
14 min read
In this article by Shahed Chowdhuri, author of the book ASP.NET Core Essentials, we will work through a working sample of a web API project. During this lesson, we will cover the following:
Web API
Web API configuration
Web API routes
Consuming Web API applications
(For more resources related to this topic, see here.)

Understanding a web API

Building web applications can be a rewarding experience. The satisfaction of reaching a broad set of potential users can trump the frustrating nights spent fine-tuning an application and fixing bugs. But some mobile users demand a more streamlined experience that only a native mobile app can provide. Mobile browsers may experience performance issues in low-bandwidth situations, where HTML5 applications can only go so far with a heavy server-side back-end. Enter web API, with its RESTful endpoints, built with mobile-friendly server-side code.

The case for web APIs

In order to create a piece of software, years of wisdom tell us that we should build software with users in mind. Without use cases, its features are literally useless. By designing features around user stories, it makes sense to reveal public endpoints that relate directly to user actions. As a result, you will end up with a leaner web application that works for more users. If you need more convincing, here's a recap of features and benefits:
It lets you build modern lightweight web services, which are a great choice for your application, as long as you don't need SOAP
It's easier to work with than any past work you may have done with ASP.NET Windows Communication Foundation (WCF) services
It supports RESTful endpoints
It's great for a variety of clients, both mobile and web
It's unified with ASP.NET MVC and can be included with/without your web application

Creating a new web API project from scratch

Let's build a sample web application named Patient Records. In this application, we will create a web API from scratch to allow the following tasks:
Add a new patient
Edit an existing patient
Delete an existing patient
View a specific patient or a list of patients
These four actions make up the so-called CRUD operations of our system: to Create, Read, Update or Delete patient records. Following the steps below, we will create a new project in Visual Studio 2015:
Create a new web API project.
Add an API controller.
Add methods for CRUD operations.
The preceding steps have been expanded into detailed instructions with the following screenshots:
In Visual Studio 2015, click File | New | Project. You can also press Ctrl+Shift+N on your keyboard.
On the left panel, locate the Web node below Visual C#, then select ASP.NET Core Web Application (.NET Core), as shown in the following screenshot:
With this project template selected, type in a name for your project, for example PatientRecordsApi, and choose a location on your computer, as shown in the following screenshot:
Optionally, you may select the checkboxes on the lower right to create a directory for your solution file and/or add your new project to source control. Click OK to proceed.
In the dialog that follows, select Empty from the list of the ASP.NET Core Templates, then click OK, as shown in the following screenshot:
Optionally, you can check the checkbox for Microsoft Azure to host your project in the cloud. Click OK to proceed.

Building your web API project

In the Solution Explorer, you may observe that your References are being restored.
This occurs every time you create a new project or add new references to your project that have to be restored through NuGet, as shown in the following screenshot: Follow these steps to fix your references, and build your Web API project:
Right-click on your project, and click Add | New Folder to add a new folder, as shown in the following screenshot:
Perform the preceding step three times to create new folders for your Controllers, Models, and Views, as shown in the following screenshot:
Right-click on your Controllers folder, then click Add | New Item to create a new API controller for patient records on your system, as shown in the following screenshot:
In the dialog box that appears, choose Web API Controller Class from the list of options under .NET Core, as shown in the following screenshot:
Name your new API controller, for example PatientController.cs, then click Add to proceed.
In your new PatientController, you will most likely have several areas highlighted with red squiggly lines due to a lack of necessary dependencies, as shown in the following screenshot. As a result, you won't be able to build your project/solution at this time:
In the next section, we will learn about how to configure your web API so that it has the proper references and dependencies in its configuration files.

Configuring the web API in your web application

How does the web server know what to send to the browser when a specific URL is requested? The answer lies in the configuration of your web API project.

Setting up dependencies

In this section, we will learn how to set up your dependencies automatically using the IDE, or manually by editing your project's configuration file. To pull in the necessary dependencies, you may right-click on the using statement for Microsoft.AspNet.Mvc and select Quick Actions and Refactorings…. This can also be triggered by pressing Ctrl + . (period) on your keyboard or simply by hovering over the underlined term, as shown in the following screenshot: Visual Studio should offer you several possible options, from which you can select the one that adds the package Microsoft.AspNetCore.Mvc.Core for the namespace Microsoft.AspNetCore.Mvc. For the Controller class, add a reference for the Microsoft.AspNetCore.Mvc.ViewFeatures package, as shown in the following screenshot:
Fig 12: Adding the Microsoft.AspNetCore.Mvc.Core 1.0.0 package
If you select the latest version that's available, this should update your references and remove the red squiggly lines, as shown in the following screenshot:
Fig 13: Updating your references and removing the red squiggly lines
The preceding step should automatically update your project.json file with the correct dependencies for the Microsoft.AspNetCore.Mvc.Core and Microsoft.AspNetCore.Mvc.ViewFeatures packages, as shown in the following screenshot: The "frameworks" section of the project.json file identifies the type and version of the .NET Framework that your web app is using, for example netcoreapp1.0 for the 1.0 version of .NET Core. You will see something similar in your project, as shown in the following screenshot: Click the Build Solution button from the top menu/toolbar. Depending on how you have your shortcuts set up, you may press Ctrl+Shift+B or press F6 on your keyboard to build the solution.
You should now be able to build your project/solution without errors, as shown in the following screenshot: Before running the web API project, open the Startup.cs class file, and replace the app.Run() statement/block (along with its contents) with a call to app.UseMvc() in the Configure() method. To add MVC to the project, add a call to services.AddMvcCore() in the ConfigureServices() method. To allow this code to compile, add a reference to Microsoft.AspNetCore.Mvc.

Parts of a web API project

Let's take a closer look at the PatientController class. The auto-generated class has the following methods:

public IEnumerable<string> Get()
public string Get(int id)
public void Post([FromBody]string value)
public void Put(int id, [FromBody]string value)
public void Delete(int id)

The Get() method simply returns a JSON object as an enumerable string of values, while the Get(int id) method is an overloaded variant that gets a particular value for a specified ID. The Post() and Put() methods can be used for creating and updating entities. Note that the Put() method takes in an ID value as the first parameter so that it knows which entity to update. Finally, we have the Delete() method, which can be used to delete an entity using the specified ID.

Running the web API project

You may run the web API project in a web browser that can display JSON data. If you use Google Chrome, I would suggest using the JSONView Extension (or other similar extension) to properly display JSON data. The aforementioned extension is also available on GitHub at the following URL: https://github.com/gildas-lormeau/JSONView-for-Chrome If you use Microsoft Edge, you can view the raw JSON data directly in the browser. Once your browser is ready, you can select your browser of choice from the top toolbar of Visual Studio. Click on the tiny triangle icon next to the Debug button, then select a browser, as shown in the following screenshot: In the preceding screenshot, you can see that multiple installed browsers are available, including Firefox, Google Chrome, Internet Explorer, and Edge. To choose a different browser, simply click on Browse With… in the menu to select a different one. Now, click the Debug button (that is, the green play button) to see the web API project in action in your web browser, as shown in the following screenshot. If you don't have a web application set up, you won't be able to browse the site from the root URL: Don't worry if you see this error; you can update the URL to include a path to your API controller, for example, see http://localhost:12345/api/Patient. Note that your port number may vary. Now, you should be able to see a list of values that are being spat out by your API controller, as shown in the following screenshot:

Adding routes to handle anticipated URL paths

Back in the days of classic ASP, application URL paths typically reflected physical file paths. This continued with ASP.NET web forms, even though the concept of custom URL routing was introduced. With ASP.NET MVC, routes were designed to cater to functionality rather than physical paths. ASP.NET web API continues this newer tradition, with the ability to set up custom routes from within your code. You can create routes for your application using fluent configuration in your startup code or with declarative attributes surrounded by square brackets.

Understanding routes

To understand the purpose of having routes, let's focus on the features and benefits of routes in your application.
This applies to both ASP.NET MVC and ASP.NET web API:
By defining routes, you can introduce predictable patterns for URL access
This gives you more control over how URLs are mapped to your controllers
Human-readable route paths are also SEO-friendly, which is great for Search Engine Optimization
It provides some level of obscurity when it comes to revealing the underlying web technology and physical file names in your system

Setting up routes

Let's start with this simple class-level attribute that specifies a route for your API controller, as follows:

[Route("api/[controller]")]
public class PatientController : Controller
{
    // ...
}

Here, we can dissect the attribute (seen in square brackets, used to affect the class below it) and its parameter to understand what's going on: The Route attribute indicates that we are going to define a route for this controller. Within the parentheses that follow, the route path is defined in double quotes. The first part of this path is the string literal api/, which declares that the path to an API method call will begin with the term api followed by a forward slash. The rest of the path is the word controller in square brackets, which refers to the controller name. By convention, the controller's name is part of the controller's class name that precedes the term Controller. For a class PatientController, the controller name is just the word Patient. This means that all API methods for this controller can be accessed using the following syntax, where MyApplicationServer should be replaced with your own server or domain name:

http://MyApplicationServer/api/Patient

For method calls, you can define a route with or without parameters. The following two examples illustrate both types of route definitions:

[HttpGet]
public IEnumerable<string> Get()
{
    return new string[] { "value1", "value2" };
}

In this example, the Get() method performs an action related to the HTTP verb HttpGet, which is declared in the attribute directly above the method. This identifies the default method for accessing the controller through a browser without any parameters, which means that this API method can be accessed using the following syntax:

http://MyApplicationServer/api/Patient

To include parameters, we can use the following syntax:

[HttpGet("{id}")]
public string Get(int id)
{
    return "value";
}

Here, the HttpGet attribute is coupled with an "{id}" parameter, enclosed in curly braces within double quotes. The overloaded version of the Get() method also includes an integer value named id to correspond with the expected parameter. If no parameter is specified, the value of id is equal to default(int), which is zero. This can be called without any parameters with the following syntax:

http://MyApplicationServer/api/Patient/Get

In order to pass parameters, you can add any integer value right after the controller name, with the following syntax:

http://MyApplicationServer/api/Patient/1

This will assign the number 1 to the integer variable id.

Testing routes

To test the aforementioned routes, simply run the application from Visual Studio and access the specified URLs without parameters. The preceding screenshot shows the results of accessing the following path:

http://MyApplicationServer/api/Patient/1

Consuming a web API from a client application

If a web API exposes public endpoints, but there is no client application there to consume it, does it really exist? Without getting too philosophical, let's go over the possible ways a client application can consume a web API. You can do any of the following:
Consume the Web API using external tools
Consume the Web API with a mobile app
Consume the Web API with a web client

Testing with external tools

If you don't have a client application set up, you can use an external tool such as Fiddler. Fiddler is a free tool, now available from Telerik at http://www.telerik.com/download/fiddler, as shown in the following screenshot: You can use Fiddler to inspect URLs that are being retrieved and submitted on your machine. You can also use it to trigger any URL, and change the request type (Get, Post, and others).

Consuming a web API from a mobile app

Since this article is primarily about the ASP.NET Core web API, we won't go into detail about mobile application development. However, it's important to note that a web API can provide a backend for your mobile app projects. Mobile apps may include Windows Mobile apps, iOS apps, Android apps, and any modern app that you can build for today's smartphones and tablets. You may consult the documentation for your particular platform of choice, to determine what is needed to call a RESTful API.

Consuming a web API from a web client

A web client, in this case, refers to any HTML/JavaScript application that has the ability to call a RESTful API. At the least, you can build a complete client-side solution with straight JavaScript to perform the necessary actions. For a better experience, you may use jQuery and also one of many popular JavaScript frameworks. A web client can also be a part of a larger ASP.NET MVC application or a Single-Page Application (SPA). As long as your application is spitting out JavaScript that is contained in HTML pages, you can build a frontend that works with your backend web API.
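If you just want to poke at the endpoints without installing Fiddler or writing a full client, a few lines in any language with an HTTP library will do. The following sketch uses Python with the third-party requests package; it is not part of the ASP.NET project itself, the localhost URL and port are placeholders for whatever Visual Studio assigns to your application (as noted earlier, your port number may vary), and the POST body is only an example payload, since the scaffolded Post() method does not persist anything yet.

import requests

BASE_URL = 'http://localhost:12345/api/Patient'  # replace the port with your project's port

# GET api/Patient: the scaffolded Get() returns ["value1", "value2"]
print(requests.get(BASE_URL).json())

# GET api/Patient/1: the scaffolded Get(int id) returns "value"
print(requests.get(BASE_URL + '/1').json())

# POST api/Patient with a JSON string for the [FromBody] string parameter;
# only the status code is interesting here
response = requests.post(BASE_URL, json='new patient record')
print(response.status_code)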
Summary

In this article, we've taken a look at the basic structure of an ASP.NET web API project, and observed the unification of web API with MVC in ASP.NET Core. We also learned how to use a web API as our backend to provide support for various frontend applications.

Resources for Article:

Further resources on this subject:
Introducing IoT with Particle's Photon and Electron [article]
Schema Validation with Oracle JDeveloper - XDK 11g [article]
Getting Started with Spring Security [article]

Customizing Xtext Components

Packt
08 Sep 2016
30 min read
In this article written by Lorenzo Bettini, author of the book Implementing Domain Specific Languages Using Xtend and Xtext, Second Edition, the author describes the main mechanism for customizing Xtext components—Google Guice, a Dependency Injection framework. With Google Guice, we can easily and consistently inject custom implementations of specific components into Xtext. In the first section, we will briefly show some Java examples that use Google Guice. Then, we will show how Xtext uses this dependency injection framework. In particular, you will learn how to customize both the runtime and the UI aspects. This article will cover the following topics:
An introduction to the Google Guice dependency injection framework
How Xtext uses Google Guice
How to customize several aspects of an Xtext DSL
(For more resources related to this topic, see here.)

Dependency injection

The Dependency Injection pattern (see the article Fowler, 2004) allows you to inject implementation objects into a class hierarchy in a consistent way. This is useful when classes delegate specific tasks to objects referenced in fields. These fields have abstract types (that is, interfaces or abstract classes) so that the dependency on actual implementation classes is removed. In this first section, we will briefly show some Java examples that use Google Guice. Of course, all the injection principles naturally apply to Xtend as well. If you want to try the following examples yourself, you need to create a new Plug-in Project, for example, org.example.guice and add com.google.inject and javax.inject as dependencies in the MANIFEST.MF. Let's consider a possible scenario: a Service class that abstracts from the actual implementation of a Processor class and a Logger class. The following is a possible implementation:

public class Service {
  private Logger logger;
  private Processor processor;

  public void execute(String command) {
    logger.log("executing " + command);
    processor.process(command);
    logger.log("executed " + command);
  }
}

public class Logger {
  public void log(String message) {
    System.out.println("LOG: " + message);
  }
}

public interface Processor {
  public void process(Object o);
}

public class ProcessorImpl implements Processor {
  private Logger logger;

  public void process(Object o) {
    logger.log("processing");
    System.out.println("processing " + o + "...");
  }
}

These classes correctly abstract from the implementation details, but the problem of initializing the fields correctly still persists. If we initialize the fields in the constructor, then the user still needs to hardcode the actual implementation classnames. Also, note that Logger is used in two independent classes; thus, if we have a custom logger, we must make sure that all the instances use the correct one. These issues can be dealt with using dependency injection. With dependency injection, hardcoded dependencies will be removed. Moreover, we will be able to easily and consistently switch the implementation classes throughout the code. Although the same goal can be achieved manually by implementing factory method or abstract factory patterns (see the book Gamma et al, 1995), with a dependency injection framework it is easier to keep the desired consistency and the programmer needs to write less code. Xtext uses the dependency injection framework Google Guice, https://github.com/google/guice. We refer to the Google Guice documentation for all the features provided by this framework. In this section, we just briefly describe its main features.
You annotate the fields you want Guice to inject with the @Inject annotation (com.google.inject.Inject):

public class Service {
  @Inject private Logger logger;
  @Inject private Processor processor;

  public void execute(String command) {
    logger.log("executing " + command);
    processor.process(command);
    logger.log("executed " + command);
  }
}

public class ProcessorImpl implements Processor {
  @Inject private Logger logger;

  public void process(Object o) {
    logger.log("processing");
    System.out.println("processing " + o + "...");
  }
}

The mapping from injection requests to instances is specified in a Guice Module, a class that is derived from com.google.inject.AbstractModule. The method configure is implemented to specify the bindings using a simple and intuitive API. You only need to specify the bindings for interfaces, abstract classes, and for custom classes. This means that you do not need to specify a binding for Logger since it is a concrete class. On the contrary, you need to specify a binding for the interface Processor. The following is an example of a Guice module for our scenario:

public class StandardModule extends AbstractModule {
  @Override
  protected void configure() {
    bind(Processor.class).to(ProcessorImpl.class);
  }
}

You create an Injector using the static method Guice.createInjector by passing a module. You then use the injector to create instances:

Injector injector = Guice.createInjector(new StandardModule());
Service service = injector.getInstance(Service.class);
service.execute("First command");

The initialization of injected fields will be done automatically by Google Guice. It is worth noting that the framework is also able to initialize (inject) private fields, like in our example. Instances of classes that use dependency injection must be created only through an injector. Creating instances with new will not trigger injection, thus all the fields annotated with @Inject will be null. When implementing a DSL with Xtext you will never have to create a new injector manually. In fact, Xtext generates utility classes to easily obtain an injector, for example, when testing your DSL with JUnit. We also refer to the article Köhnlein, 2012 for more details. The example shown in this section only aims at presenting the main features of Google Guice. If we need a different configuration of the bindings, all we need to do is define another module. For example, let's assume that we defined additional derived implementations for logging and processing. Here is an example where Logger and Processor are bound to custom implementations:

public class CustomModule extends AbstractModule {
  @Override
  protected void configure() {
    bind(Logger.class).to(CustomLogger.class);
    bind(Processor.class).to(AdvancedProcessor.class);
  }
}

Creating instances with an injector obtained using this module will ensure that the right classes are used consistently. For example, the CustomLogger class will be used both by Service and Processor.
You can create instances from different injectors in the same application, for example:

executeService(Guice.createInjector(new StandardModule()));
executeService(Guice.createInjector(new CustomModule()));

void executeService(Injector injector) {
  Service service = injector.getInstance(Service.class);
  service.execute("First command");
  service.execute("Second command");
}

It is possible to request injection in many different ways, such as injection of parameters to constructors, using named instances, specification of default implementation of an interface, setter methods, and much more. In this book, we will mainly use injected fields. Injected fields are instantiated only once when the class is instantiated. Each injection will create a new instance, unless the type to inject is marked as @Singleton (com.google.inject.Singleton). The annotation @Singleton indicates that only one instance per injector will be used. We will see an example of Singleton injection. If you want to decide when you need an element to be instantiated from within method bodies, you can use a provider. Instead of injecting an instance of the wanted type C, you inject a com.google.inject.Provider<C> instance, which has a get method that produces an instance of C. For example:

public class Logger {
  @Inject
  private Provider<Utility> utilityProvider;

  public void log(String message) {
    System.out.println("LOG: " + message + " - " + utilityProvider.get().m());
  }
}

Each time, we create a new instance of Utility using the injected Provider class. Even in this case, if the type of the created instance is annotated with @Singleton, then the same instance will always be returned for the same injector. The nice thing is that to inject a custom implementation of Utility, you do not need to provide a custom Provider: you just bind the Utility class in the Guice module and everything will work as expected:

public class CustomModule extends AbstractModule {
  @Override
  protected void configure() {
    bind(Logger.class).to(CustomLogger.class);
    bind(Processor.class).to(AdvancedProcessor.class);
    bind(Utility.class).to(CustomUtility.class);
  }
}

It is crucial to keep in mind that once classes rely on injection, their instances must be created only through an injector; otherwise, all the injected elements will be null. In general, once dependency injection is used in a framework, all classes of the framework must rely on injection.

Google Guice in Xtext

All Xtext components rely on Google Guice dependency injection, even the classes that Xtext generates for your DSL. This means that in your classes, if you need to use a class from Xtext, you just have to declare a field of such type with the @Inject annotation. The injection mechanism allows a DSL developer to customize basically every component of the Xtext framework. This boils down to another property of dependency injection, which, in fact, inverts dependencies. The Xtext runtime can use your classes without having a dependency to its implementer. Instead, the implementer has a dependency on the interface defined by the Xtext runtime. For this reason, dependency injection is said to implement inversion of control and the dependency inversion principle. When running the MWE2 workflow, Xtext generates both a fully configured module and an empty module that inherits from the generated one. This allows you to override generated or default bindings. Customizations are added to the empty stub module. The generated module should not be touched.
Xtext generates one runtime module that defines the non-user interface-related parts of the configuration and one specific for usage in the Eclipse IDE. Guice provides a mechanism for composing modules that is used by Xtext—the module in the UI project uses the module in the runtime project and overrides some bindings. Let's consider the Entities DSL example. You can find in the src directory of the runtime project the Xtend class EntitiesRuntimeModule, which inherits from AbstractEntitiesRuntimeModule in the src-gen directory. Similarly, in the UI project, you can find in the src directory the Xtend class EntitiesUiModule, which inherits from AbstractEntitiesUiModule in the src-gen directory. The Guice modules in src-gen are already configured with the bindings for the stub classes generated during the MWE2 workflow. Thus, if you want to customize an aspect using a stub class, then you do not have to specify any specific binding. The generated stub classes concern typical aspects that the programmer usually wants to customize, for example, validation and generation in the runtime project, and labels and outline in the UI project (as we will see in the next sections). If you need to customize an aspect which is not covered by any of the generated stub classes, then you will need to write a class yourself and then specify the binding for your class in the Guice module in the src folder. We will see an example of this scenario in the Other customizations section. Bindings in these Guice module classes can be specified as we saw in the previous section, by implementing the configure method. However, Xtext provides an enhanced API for defining bindings; Xtext reflectively searches for methods with a specific signature in order to find Guice bindings. Thus, assuming you want to bind a BaseClass class to your derived CustomClass, you can simply define a method in your module with a specific signature, as follows:

def Class<? extends BaseClass> bindBaseClass() {
  return CustomClass
}

Remember that in Xtend, you must explicitly specify that you are overriding a method of the base class; thus, in case the bind method is already defined in the base class, you need to use override instead of def. These methods are invoked reflectively, thus their signature must follow the expected convention. We refer to the official Xtext documentation for the complete description of the module API. Typically, the binding methods that you will see in this book will have the preceding shape, in particular, the name of the method must start with bind followed by the name of the class or interface we want to provide a binding for. It is important to understand that these bind methods do not necessarily have to override a method in the module base class. You can also make your own classes, which are not related to Xtext framework classes at all, participants of this injection mechanism, as long as you follow the preceding convention on method signatures. In the rest of this article, we will show examples of customizations of both IDE and runtime concepts. For most of these customizations, we will modify the corresponding Xtend stub class that Xtext generated when running the MWE2 workflow. As hinted before, in these cases, we will not need to write a custom Guice binding. We will also show an example of a customization, which does not have an automatically generated stub class. Xtext uses injection to inject services and not to inject state (apart from EMF Singleton registries).
Thus, the things that are injected are interfaces consisting of functions that take state as arguments (for example, the document, the resource, and so on). This leads to a service-oriented architecture, which is different from an object-oriented architecture where state is encapsulated with operations. An advantage of this approach is that there are far fewer problems with synchronization of multiple threads.

Customizations of IDE concepts

In this section, we show typical concepts of the IDE for your DSL that you may want to customize. Xtext shows its usability in this context as well, since, as you will see, it reduces the customization effort.

Labels

Xtext UI classes make use of an ILabelProvider interface to obtain textual labels and icons through its methods getText and getImage, respectively. ILabelProvider is a standard component of Eclipse JFace-based viewers. You can see the label provider in action in the Outline view and in content assist proposal popups (as well as in various other places). Xtext provides a default implementation of a label provider for all DSLs, which does its best to produce a sensible representation of the EMF model objects using the name feature, if it is found in the corresponding object class, and a default image. You can see that in the Outline view when editing an entities file; refer to the following screenshot: However, you surely want to customize the representation of some elements of your DSL. The label provider Xtend stub class for your DSL can be found in the UI plug-in project in the subpackage ui.labeling. This stub class extends the base class DefaultEObjectLabelProvider. In the Entities DSL, the class is called EntitiesLabelProvider. This class employs a Polymorphic Dispatcher mechanism, which is also used in many other places in Xtext. Thus, instead of implementing the getText and getImage methods, you can simply define several versions of methods text and image taking as parameter an EObject object of the type you want to provide a representation for. Xtext will then search for such methods according to the runtime type of the elements to represent. For example, for our Entities DSL, we can change the textual representation of attributes in order to show their names and a better representation of types (for example, name : type). We then define a method text taking Attribute as a parameter and returning a string:

class EntitiesLabelProvider extends ... {

  @Inject extension TypeRepresentation

  def text(Attribute a) {
    a.name +
      if (a.type != null)
        " : " + a.type.representation
      else ""
  }
}

To get a representation of the AttributeType element, we use an injected extension, TypeRepresentation, in particular its method representation:

class TypeRepresentation {
  def representation(AttributeType t) {
    val elementType = t.elementType
    val elementTypeRepr =
      switch (elementType) {
        BasicType : elementType.typeName
        EntityType : elementType?.entity.name
      }
    elementTypeRepr + if (t.array) "[]" else ""
  }
}

Remember that the label provider is used, for example, for the Outline view, which is refreshed when the editor contents change, and its contents might contain errors. Thus, you must be ready to deal with an incomplete model, and some features might still be null. That is why you should always check that the features are not null before accessing them. Note that we inject an extension field of type TypeRepresentation instead of creating an instance with new in the field declaration.
Although it is not necessary to use injection for this class, we decided to rely on that because in the future we might want to be able to provide a different implementation for that class. Another point for using injection instead of new is that the other class may rely on injection in the future. Using injection leaves the door open for future and unanticipated customizations. The Outline view now shows as in the following screenshot: We can further enrich the labels for entities and attributes using images for them. To do this, we create a directory in the org.example.entities.ui project where we place the image files of the icons we want to use. In order to benefit from Xtext's default handling of images, we call the directory icons, and we place two gif images there, Entity.gif and Attribute.gif (for entities and attributes, respectively). You will find the icon files in the accompanying source code in the org.example.entities.ui/icons folder. We then define two image methods in EntitiesLabelProvider where we only need to return the name of the image files and Xtext will do the rest for us:

class EntitiesLabelProvider extends DefaultEObjectLabelProvider {
  ... as before

  def image(Entity e) { "Entity.gif" }

  def image(Attribute a) { "Attribute.gif" }
}

You can see the result by relaunching Eclipse, as seen in the following screenshot: Now, the entities and attributes labels look nicer. If you plan to export the plugins for your DSL so that others can install them in their Eclipse, you must make sure that the icons directory is added to the build.properties file, otherwise that directory will not be exported. The bin.includes section of the build.properties file of your UI plugin should look like the following:

bin.includes = META-INF/,
               .,
               plugin.xml,
               icons/

The Outline view

The default Outline view comes with nice features. In particular, it provides toolbar buttons to keep the Outline view selection synchronized with the element currently selected in the editor. Moreover, it provides a button to sort the elements of the tree alphabetically. By default, the tree structure is built using the containment relations of the metamodel of the DSL. This strategy is not optimal in some cases. For example, an Attribute definition also contains the AttributeType element, which is a structured definition with children (for example, elementType, array, and length). This is reflected in the Outline view (refer to the previous screenshot) if you expand the Attribute elements. This shows unnecessary elements, such as BasicType names, which are now redundant since they are shown in the label of the attribute, and additional elements which are not representable with a name, such as the array feature. We can influence the structure of the Outline tree using the generated stub class EntitiesOutlineTreeProvider in the src folder org.example.entities.ui.outline. Also in this class, customizations are specified in a declarative way using the polymorphic dispatch mechanism. The official documentation, https://www.eclipse.org/Xtext/documentation/, details all the features that can be customized. In our example, we just want to make sure that the nodes for attributes are leaf nodes, that is, they cannot be further expanded and they have no children. In order to achieve this, we just need to define a method named _isLeaf (note the underscore) with a parameter of the type of the element, returning true.
Thus, in our case we write the following code:

class EntitiesOutlineTreeProvider extends DefaultOutlineTreeProvider {
  def _isLeaf(Attribute a) { true }
}

Let's relaunch Eclipse, and now see that the attribute nodes do not expose children anymore. Besides defining leaf nodes, you can also specify the children in the tree for a specific node by defining a _createChildren method taking as parameters the type of outline node and the type of the model element. This can be useful to define the actual root elements of the Outline tree. By default, the tree is rooted with a single node for the source file. In this example, it might be better to have a tree with many root nodes, each one representing an entity. The root of the Outline tree is always represented by a node of type DefaultRootNode. The root node is actually not visible, it is just the container of all nodes that will be displayed as roots in the tree. Thus, we define the following method (our Entities model is rooted by a Model element):

public class EntitiesOutlineTreeProvider ... {
  ... as before

  def void _createChildren(DocumentRootNode outlineNode,
                           Model model) {
    model.entities.forEach[
      entity | createNode(outlineNode, entity);
    ]
  }
}

This way, when the Outline tree is built, we create a root node for each entity instead of having a single root for the source file. The createNode method is part of the Xtext base class. The result can be seen in the following screenshot:

Customizing other aspects

We will show how to customize the content assistant. There is no need to do this for the simple Entities DSL since the default implementation already does a fine job.

Custom formatting

An editor for a DSL should provide a mechanism for rearranging the text of the program in order to improve its readability, without changing its semantics. For example, nested regions inside blocks should be indented, and the user should be able to achieve that with a menu. Besides that, implementing a custom formatter has also other benefits, since the formatter is automatically used by Xtext when you change the EMF model of the AST. If you tried to apply the quickfixes, you might have noticed that after the EMF model has changed, the editor immediately reflects this change. However, the resulting textual representation is not well formatted, especially for the quickfix that adds the missing referred entity. In fact, the EMF model representing the AST does not contain any information about the textual representation, that is, all white space characters are not part of the EMF model (after all, the AST is an abstraction of the actual program). Xtext keeps track of such information in another in-memory model called the node model. The node model carries the syntactical information, that is, offset and length in the textual document. However, when we manually change the EMF model, we do not provide any formatting directives, and Xtext uses the default formatter to get a textual representation of the modified or added model parts. Xtext already generates the menu for formatting your DSL source programs in the Eclipse editor. As it is standard in Eclipse editors (for example, the JDT editor), you can access the Format menu from the context menu of the editor or using the Ctrl + Shift + F key combination. The default formatter is OneWhitespaceFormatter and you can test this in the Entities DSL editor; this formatter simply separates all tokens of your program with a space. Typically, you will want to change this default behavior.
If you provide a custom formatter, this will be used not only when the Format menu is invoked, but also when Xtext needs to update the editor contents after a manual modification of the AST model, for example, a quickfix performing a semantic modification. The easiest way to customize the formatting is to have the Xtext generator create a stub class. To achieve this, you need to add the following formatter specification in the StandardLanguage block in the MWE2 workflow file, requesting to generate an Xtend stub class:

language = StandardLanguage {
    name = "org.example.entities.Entities"
    fileExtensions = "entities"
    ...
    formatter = {
        generateStub = true
        generateXtendStub = true
    }
}

If you now run the workflow, you will find the formatter Xtend stub class in the main plugin project in the formatting2 package. For our Entities DSL, the class is org.example.entities.formatting2.EntitiesFormatter. This stub class extends the Xtext class AbstractFormatter2. Note that the name of the package ends with 2. That is because Xtext recently completely changed the customization of the formatter to enhance its mechanisms. The old formatter is still available, though deprecated, so the new formatter classes have the 2 in the package in order not to be mixed with the old formatter classes. In the generated stub class, you will get lots of warnings of the shape Discouraged access: the type AbstractFormatter2 is not accessible due to restriction on required project org.example.entities. That is because the new formatting API is still provisional, and it may change in future releases in a non-backward compatible way. Once you are aware of that, you can decide to ignore the warnings. In order to make the warnings disappear from the Eclipse project, you configure the specific project settings to ignore such warnings, as shown in the following screenshot: The Xtend stub class already implements a few dispatch methods, taking as parameters the AST element to format and an IFormattableDocument object. The latter is used to specify the formatting requests. A formatting request will result in a textual replacement in the program text. Since it is an extension parameter, you can use its methods as extension methods (see the discussion of extension methods for more details). The IFormattableDocument interface provides a Java API for specifying formatting requests. Xtend features such as extension methods and lambdas will allow you to specify formatting requests in an easy and readable way. The typical formatting requests are line wraps, indentations, space addition and removal, and so on. These will be applied on the textual regions of AST elements. As we will show in this section, the textual regions can be specified by the EObject of the AST or by its keywords and features. For our Entities DSL, we decide to perform formatting as follows:
Insert two newlines after each entity so that entities will be separated by an empty line; after the last entity, we want a single empty line.
Indent attributes between an entity's curly brackets.
Insert one line-wrap after each attribute declaration.
Make sure that the entity name, the super entity, and the extends keyword are surrounded by a single space.
Remove possible white spaces around the ; of an attribute declaration.
To achieve the empty lines among entities, we modify the stub method for the Entities Model element:

def dispatch void format(Model model,
                         extension IFormattableDocument document) {
  val lastEntity = model.entities.last
  for (entity : model.entities) {
    entity.format
    if (entity === lastEntity)
      entity.append[setNewLines(1)]
    else
      entity.append[setNewLines(2)]
  }
}

We append two newlines after each entity. This way, each entity will be separated by an empty line, since each entity, except for the first one, will start on the second added newline. We append only one newline after the last entity. Now start a new Eclipse instance and manually test the formatter with some entities, by pressing Ctrl + Shift + F. We modify the format stub method for the Entity elements. In order to separate each attribute, we follow a logic similar to the previous format method. For the sake of the example, we use a different version of setNewLines, that is setNewLines(int minNewLines, int defaultNewLines, int maxNewLines), whose signature is self-explanatory:

for (attribute : entity.attributes) {
  attribute.append[setNewLines(1, 1, 2)]
}

Up to now, we referred to a textual region of the AST by specifying the EObject. Now, we need to specify the textual regions of keywords and features of a given AST element. In order to specify that the "extends" keyword is surrounded by one single space, we write the following:

entity.regionFor.keyword("extends").surround[oneSpace]

We also want to have no space around the terminating semicolon of attributes, so we write the following:

attribute.regionFor.keyword(";").surround[noSpace]

In order to specify that the entity's name and the super entity are surrounded by one single space, we write the following:

entity.regionFor.feature(ENTITY__NAME).surround[oneSpace]
entity.regionFor.feature(ENTITY__SUPER_TYPE).surround[oneSpace]

After having imported statically all the EntitiesPackage.Literals members, as follows:

import static org.example.entities.entities.EntitiesPackage.Literals.*

Finally, we want to handle the indentation inside the curly brackets of an entity and to have a newline after the opening curly bracket. This is achieved with the following lines:

val open = entity.regionFor.keyword("{")
val close = entity.regionFor.keyword("}")
open.append[newLine]
interior(open, close)[indent]

Summarizing, the format method for an Entity is the following one:

def dispatch void format(Entity entity,
                         extension IFormattableDocument document) {
  entity.regionFor.keyword("extends").surround[oneSpace]
  entity.regionFor.feature(ENTITY__NAME).surround[oneSpace]
  entity.regionFor.feature(ENTITY__SUPER_TYPE).surround[oneSpace]

  val open = entity.regionFor.keyword("{")
  val close = entity.regionFor.keyword("}")
  open.append[newLine]

  interior(open, close)[indent]

  for (attribute : entity.attributes) {
    attribute.regionFor.keyword(";").surround[noSpace]
    attribute.append[setNewLines(1, 1, 2)]
  }
}

Now, start a new Eclipse instance and manually test the formatter with some attributes and entities, by pressing Ctrl + Shift + F. In the generated Xtend stub class, you also find an injected extension for accessing programmatically the elements of your grammar.
In this DSL it is the following:

@Inject extension EntitiesGrammarAccess

For example, to specify the left curly bracket of an entity, we could have written this alternative line:

val open = entity.regionFor.keyword(entityAccess.leftCurlyBracketKeyword_3)

Similarly, to specify the terminating semicolon of an attribute, we could have written this alternative line:

attribute.regionFor.keyword(attributeAccess.semicolonKeyword_2)
  .surround[noSpace]

Eclipse content assist will help you in selecting the right method to use. Note that the method names are suffixed with numbers that relate to the position of the keyword in the grammar's rule. Changing a rule in the DSL's grammar with additional elements or by removing some parts will make such method invocations invalid since the method names will change. On the other hand, if you change a keyword in your grammar, for example, you use square brackets instead of curly brackets, then referring to keywords with string literals as we did in the original implementation of the format methods will issue no compilation errors, but the formatting will not work anymore as expected. Thus, you need to choose your preferred strategy according to the likeliness of your DSL's grammar evolution. You can also try and apply our quickfixes for missing entities and you will see that the added entity is nicely formatted, according to the logic we implemented. What is left to be done is to format the attribute type nicely, including the array specification. This is left as an exercise. The EntitiesFormatter you find in the accompanying sources of this example DSL also contains this formatting logic for attribute types. You should specify formatting requests avoiding conflicting requests on the same textual region. In case of conflicts, the formatter will throw an exception with the details of the conflict.

Other customizations

All the customizations you have seen so far were based on modification of a generated stub class with accompanying generated Guice bindings in the module under the src-gen directory. However, since Xtext relies on injection everywhere, it is possible to inject a custom implementation for any mechanism, even if no stub class has been generated. If you installed the Xtext SDK in your Eclipse, the sources of Xtext are available for you to inspect. You should learn to inspect these sources by navigating to them and seeing what gets injected and how it is used. Then, you are ready to provide a custom implementation and inject it. You can use the Eclipse Navigate menu. In particular, to quickly open a Java file (even from a library if it comes with sources), use Ctrl + Shift + T (Open Type…). This works both for Java classes and Xtend classes. If you want to quickly open another source file (for example, an Xtext grammar file) use Ctrl + Shift + R (Open Resource…). Both dialogs have a text field where, if you start typing, the available elements soon show up. Eclipse supports CamelCase everywhere, so you can just type the capital letters of a compound name to quickly get to the desired element. For example, to open the EntitiesRuntimeModule Java class, use the Open Type… menu and just type ERM to see the filtered results. As an example, we show how to customize the output directory where the generated files will be stored (the default is src-gen).
Of course, this output directory can be modified by the user using the Properties dialog that Xtext generated for your DSL, but we want to customize the default output directory for the Entities DSL so that it becomes entities-gen. The default output directory is retrieved internally by Xtext using an injected IOutputConfigurationProvider instance. If you take a look at this class (see the preceding tip), you will see the following:

import com.google.inject.ImplementedBy;

@ImplementedBy(OutputConfigurationProvider.class)
public interface IOutputConfigurationProvider {
  Set<OutputConfiguration> getOutputConfigurations();
  ...

The @ImplementedBy Guice annotation tells the injection mechanism the default implementation of the interface. Thus, what we need to do is create a subclass of the default implementation (that is, OutputConfigurationProvider) and provide a custom binding for the IOutputConfigurationProvider interface. The method we need to override is getOutputConfigurations; if we take a look at its default implementation, we see the following:

public Set<OutputConfiguration> getOutputConfigurations() {
  OutputConfiguration defaultOutput = new OutputConfiguration(IFileSystemAccess.DEFAULT_OUTPUT);
  defaultOutput.setDescription("Output Folder");
  defaultOutput.setOutputDirectory("./src-gen");
  defaultOutput.setOverrideExistingResources(true);
  defaultOutput.setCreateOutputDirectory(true);
  defaultOutput.setCleanUpDerivedResources(true);
  defaultOutput.setSetDerivedProperty(true);
  defaultOutput.setKeepLocalHistory(true);

  return newHashSet(defaultOutput);
}

Of course, the interesting part is the call to setOutputDirectory. We define an Xtend subclass as follows:

class EntitiesOutputConfigurationProvider extends OutputConfigurationProvider {

  public static val ENTITIES_GEN = "./entities-gen"

  override getOutputConfigurations() {
    super.getOutputConfigurations() => [
      head.outputDirectory = ENTITIES_GEN
    ]
  }
}

Note that we use a public constant for the output directory since we might need it later in other classes. We use several Xtend features: the with operator, the implicit static extension method head, which returns the first element of a collection, and the syntactic sugar for setter methods. We create this class in the main plug-in project, since this concept is not just a UI concept and it is used also in other parts of the framework. Since it deals with generation, we create it in the generator subpackage. Now, we must bind our implementation in the EntitiesRuntimeModule class:

class EntitiesRuntimeModule extends AbstractEntitiesRuntimeModule {

  def Class<? extends IOutputConfigurationProvider> bindIOutputConfigurationProvider() {
    return EntitiesOutputConfigurationProvider
  }
}

If we now relaunch Eclipse, we can verify that the Java code is generated into entities-gen instead of src-gen. If you previously used the same project, the src-gen directory might still be there from previous generations; you need to manually remove it and set the new entities-gen as a source folder.

Summary

In this article, we introduced the Google Guice dependency injection framework on which Xtext relies. You should now be aware of how easy it is to inject custom implementations consistently throughout the framework. You also learned how to customize some basic runtime and IDE concepts for a DSL.

Resources for Article:

Further resources on this subject:
Testing with Xtext and Xtend [article]
Clojure for Domain-specific Languages - Design Concepts with Clojure [article]
Java Development [article]