How-To Tutorials

Packt
27 Dec 2016
28 min read

Orchestration with Docker Swarm

In this article by Randall Smith, the author of the book Docker Orchestration, we will see how to use Docker Swarm to orchestrate a Docker cluster.

Docker Swarm is the native orchestration tool for Docker. It was rolled into the core Docker suite with version 1.12. At its simplest, using Docker Swarm is just like using Docker. All of the tools that have been covered still work. Swarm adds a couple of features that make deploying and updating services much easier.

Setting up a Swarm

The first step in running a Swarm is to have a number of hosts ready with Docker installed. It does not matter if you use the install script from get.docker.com or if you use Docker Machine. You also need to be sure that a few ports are open between the servers:

    2377 tcp: cluster management
    7946 tcp and udp: node communication
    4789 tcp and udp: overlay network

Take a moment to get or assign a static IP address for each host that will be a swarm manager. Workers need a stable address at which to reach the managers, so worker addresses can be dynamic but manager addresses must be static.

As you plan your swarm, decide how many managers you are going to need. The minimum number to run and still maintain fault tolerance is three. For larger clusters, you may need as many as five or seven. Very rarely will you need more than that. In any case, the number of managers should be odd. Docker Swarm can maintain a quorum as long as a majority of the managers (more than half) are running. Having two or four managers provides no more fault tolerance than having one or three.

If possible, you should spread your managers out so that a single failure will not take down your swarm. For example, make sure that they are not all running on the same VM host or connected to the same switch. The whole point of having multiple managers is to keep the swarm running in the event of a manager failure.
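The quorum arithmetic behind the "odd number of managers" advice can be checked with a few lines of shell. This is only an illustrative sketch: the function names quorum and tolerated_failures are made up for this example and are not part of Docker.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; they do not call Docker.

# quorum N: managers that must be running for a swarm with N managers
# to keep a majority, that is floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: managers that can be lost while keeping quorum.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3 4 5 6 7; do
  printf '%d manager(s): quorum=%d, tolerated failures=%d\n' \
    "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

The loop shows that moving from three managers to four raises the quorum from two to three but still tolerates only one failure, which is why an even count adds no fault tolerance.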
Do not undermine your efforts by allowing a single point of failure somewhere else to take down your swarm.

Initializing a Swarm

If you are installing from your desktop, it may be better to use Docker Machine to connect to the host, as this will set up the necessary TLS keys.

Run the docker swarm init command to initialize the swarm. The --advertise-addr flag is optional. By default, Docker will guess an address on the host, and the guess may not be correct. To be sure, set it to the static IP address you have for the manager host:

    $ docker swarm init --advertise-addr 172.31.26.152
    Swarm initialized: current node (6d6e6kxlyjuo9vb9w1uug95zh) is now a manager.

    To add a worker to this swarm, run the following command:
        docker swarm join --token SWMTKN-1-0061dt2qjsdr4gabxryrksqs0b8fnhwg6bjhs8cxzen7tmarbi-89mmok3pf6dsa5n33fb60tx0m 172.31.26.152:2377

    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Included in the output of the init command is the command to run on each worker host. It is also possible to run it from your desktop if you have set your environment to use the worker host. You can save the command somewhere, but you can always get it again by running docker swarm join-token worker on a manager. Running docker swarm join-token manager prints the equivalent command for joining additional managers.

As part of the join process, Docker Swarm will create TLS keys, which will be used to encrypt communication between the managers and the hosts. It will not, however, encrypt network traffic between containers. The certificates are renewed every three months. The time period can be changed by running docker swarm update --cert-expiry duration. The duration is specified as a number of hours and minutes. For example, setting the duration to 1000h will tell Docker Swarm to renew the certificates every 1,000 hours.

Managing a Swarm

One of the easiest pieces to overlook when it comes to orchestrating Docker is how to manage the cluster itself.
In this section, you will see how to manage your swarm, including adding and removing nodes, changing their availability, and backup and recovery.

Adding a node

New worker nodes can be added at any time. Install Docker on the new host, then run docker swarm join-token worker on a manager to get the docker swarm join command that joins the host to the swarm. Once added, the worker will be available to run tasks. Note that the join command is the same for every worker. This makes it easy to add to a host configuration script and join the swarm on boot.

You can get a list of all of the nodes in the swarm by running docker node ls:

    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1i3wtacdjz5p509bu3l3qfbei    worker1   Ready   Active        Reachable
    3f2t4wwwthgcahs9089b8przv    worker2   Ready   Active        Reachable
    6d6e6kxlyjuo9vb9w1uug95zh *  manager   Ready   Active        Leader

The list shows not only the machines but also whether they are active in the swarm and whether they are managers. The node marked as Leader is the master of the swarm. It coordinates the managers.

Promoting and demoting nodes

In Docker Swarm, just like in real life, managers are also workers. This means that managers can run tasks. It also means that workers can be promoted to become managers. This can be useful to quickly replace a failed manager or to seamlessly increase the number of managers. The following command promotes the node named worker1 to be a manager:

    $ docker node promote worker1
    Node worker1 promoted to a manager in the swarm.

Managers can also be demoted to become plain workers. This should be done before a manager node is decommissioned. The following command demotes worker1 back to a plain worker:

    $ docker node demote worker1
    Manager worker1 demoted in the swarm.

Whatever your reasons for promoting or demoting a node, make sure that when you are done, there is an odd number of managers.

Changing node availability

Docker Swarm nodes have a concept of availability.
The availability of a node determines whether or not tasks can be scheduled on that node. Use the docker node update --availability <state> <node-id> command to set the availability state. There are three availability states that can be set: pause, drain, and active.

Pausing a node

Setting a node's availability to pause will prevent the scheduler from assigning new tasks to the node. Existing tasks will continue to run. This can be useful for troubleshooting load issues on a node or for preventing new tasks from being assigned to an already overloaded node. The following command pauses worker2:

    $ docker node update --availability pause worker2
    worker2
    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1i3wtacdjz5p509bu3l3qfbei    worker1   Ready   Active        Reachable
    3f2t4wwwthgcahs9089b8przv    worker2   Ready   Pause         Reachable
    6d6e6kxlyjuo9vb9w1uug95zh *  manager   Ready   Active        Leader

Do not rely on using pause to deal with overload issues. It is better to place reasonable resource limits on your services and let the scheduler figure things out for you. You can use pause to help determine what the resource limits should be. For example, you can start a task on a node, then pause the node to prevent new tasks from running while you monitor resource usage.

Draining a node

Like pause, setting a node's availability to drain will stop the scheduler from assigning new tasks to the node. In addition, drain will stop any running tasks and reschedule them to run elsewhere in the swarm.

The drain mode has two common purposes. First, it is useful for preparing a node for an upgrade. Containers will be stopped and rescheduled in an orderly fashion. Updates can then be applied and the node rebooted, if necessary, without further disruption. The node can be set to active again once the updates are complete.

Remember that, when draining a node, running containers have to be stopped and restarted elsewhere. This can cause disruption if your applications are not built to handle failure.
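The drain, upgrade, reactivate cycle described above could be wrapped in a small script. The following is a minimal sketch, not an established tool: the function names set_availability and maintain and the DOCKER override variable are assumptions made for this example. Only docker node update --availability comes from the article. Note that the sketch does not wait for tasks to finish moving off the node before running the maintenance command.

```shell
#!/bin/bash
# DOCKER is a hypothetical override; set DOCKER="echo docker" for a dry run.
DOCKER=${DOCKER:-docker}

# set_availability NODE STATE: STATE must be active, pause, or drain.
set_availability() {
  local node=$1 state=$2
  case "$state" in
    active|pause|drain) ;;
    *) echo "unknown availability: $state" >&2; return 1 ;;
  esac
  $DOCKER node update --availability "$state" "$node"
}

# maintain NODE CMD...: drain NODE, run CMD, then set NODE back to active.
# If CMD fails, the node is deliberately left drained for inspection.
maintain() {
  local node=$1
  shift
  set_availability "$node" drain || return 1
  "$@" || { echo "maintenance failed; $node left drained" >&2; return 1; }
  set_availability "$node" active
}

# Example (hypothetical upgrade command run over ssh):
#   maintain worker2 ssh worker2 'sudo apt-get -y upgrade'
```

Leaving the node drained on failure is a design choice for this sketch: it keeps tasks away from a host whose upgrade may be half applied.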
The great thing about containers is that they start quickly, but some services, such as MySQL, take a few seconds to initialize.

The second use of drain is to prevent services from being scheduled on manager nodes. Manager processing relies heavily on messages being passed in a timely manner. An unconstrained task running on a manager node can cause a denial-of-service outage for the node, causing problems for your cluster. It is not uncommon to leave manager nodes in a drain state permanently. The following command will drain the node named manager:

    $ docker node update --availability drain manager
    manager
    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1i3wtacdjz5p509bu3l3qfbei    worker1   Ready   Active        Reachable
    3f2t4wwwthgcahs9089b8przv    worker2   Ready   Active        Reachable
    6d6e6kxlyjuo9vb9w1uug95zh *  manager   Ready   Drain         Leader

Activating a node

When a node is ready to accept tasks again, set the state to active. Do not be concerned if the node does not immediately fill up with containers. Tasks are only assigned when a new scheduling event happens, such as starting a service. The following command will reactivate the worker2 node:

    $ docker node update --availability active worker2
    worker2
    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1i3wtacdjz5p509bu3l3qfbei    worker1   Ready   Active        Reachable
    3f2t4wwwthgcahs9089b8przv    worker2   Ready   Active        Reachable
    6d6e6kxlyjuo9vb9w1uug95zh *  manager   Ready   Active        Leader

Removing nodes

Nodes may need to be removed for a variety of reasons, including upgrades, failures, or simply eliminating capacity that is no longer needed. For example, it may be easier to upgrade nodes by building new ones rather than performing updates on the old nodes. The new node will be added and the old one removed.

Step one for a healthy node is to set the availability to drain. This will ensure that all scheduled tasks have been stopped and moved to other nodes in the swarm.
Step two is to run docker swarm leave from the node that will be leaving the swarm. This will assign the node a Down status. In the following example, worker2 has left the swarm:

    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1i3wtacdjz5p509bu3l3qfbei    worker1   Ready   Active        Reachable
    3f2t4wwwthgcahs9089b8przv    worker2   Down    Active
    6d6e6kxlyjuo9vb9w1uug95zh *  manager   Ready   Active        Leader

If the node that is being removed is a manager, it must first be demoted to a worker, as described earlier, before running docker swarm leave. When removing managers, take care that you do not lose quorum or your cluster will stop working.

Once the node has been marked Down, it can be removed from the swarm. From a manager node, use docker node rm to remove the node from the swarm:

    $ docker node rm worker2

In some cases, the node that you want to remove is unreachable, so it is not possible to run docker swarm leave on it. When that happens, use the --force option:

    $ docker node rm --force worker2

Nodes that have been removed can be re-added with docker swarm join.

Backup and recovery

Just because Docker Swarm is resilient to failure does not mean that you should ignore backups. Good backups may be the difference between restoring services and a résumé-altering event. There are two major components to back up: the Swarm state data and your application data.

Backing up the Swarm

Each manager keeps the cluster information in /var/lib/docker/swarm/raft. If, for some reason, you need to completely rebuild your cluster, you will need this data. Make sure you have at least one good backup of the raft data. It does not matter which of the managers the backups are pulled from. It might be wise to pull backups from a couple of managers, just in case one is corrupted.

Recovering a Swarm

In most cases, losing a manager is an easy fix. Restart the failed manager and everything should be good. It may be necessary to build a new manager or promote an existing worker to a manager.
In most circumstances, this will bring your cluster back into a healthy state.

If you lose enough managers to lose quorum, recovery gets more complex. The first step is to start enough managers to restore quorum. The data should be synced to the new managers, and once quorum is recovered, so is the cluster. If that does not work, you will have to rebuild the swarm. This is where your backups come in.

If the manager node you choose to rebuild on has an otherwise healthy raft database, you can start there. If not, or if you are rebuilding on a brand new node, stop Docker and copy the raft data back to /var/lib/docker/swarm/raft. After the raft data is in place, ensure that Docker is running and run the following command:

    $ docker swarm init --force-new-cluster --advertise-addr manager

The address set with --advertise-addr has the same meaning as the one used to create the swarm initially. The magic here is the --force-new-cluster option. This option will ignore the swarm membership data that is in the raft database, but will remember things such as the worker node list, running services, and tasks.

Backing up services

Service information is backed up as part of the raft database, but you should have a plan to rebuild services in case the database becomes corrupted. Backing up the output of docker service ls is a start. Application information, including networks and volumes, may be sourced from Docker Compose files, which should be backed up. The Dockerfiles for your containers and your application code should be in version control and backed up.

Most importantly, do not forget your data. If you have your Dockerfiles and the application code, the applications can be rebuilt even if the registry is lost. In some cases, it is a valid choice not to back up the registry, since the images can, potentially, be rebuilt. The data, however, usually cannot be. Have a strategy in place for backing up the data that works for your environment.
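For the swarm state itself, the raft backup described earlier could be as simple as a timestamped archive of the raft directory. The sketch below is an assumption-laden illustration: the function name backup_raft, the archive naming scheme, and the /backup destination are all made up for this example. Only the /var/lib/docker/swarm/raft path comes from the article. It is generally safer to take the copy while Docker is stopped on that manager so that the files are not changing underneath the archive.

```shell
#!/bin/bash
# backup_raft SRC_DIR DEST_DIR: archive SRC_DIR into a timestamped tarball
# under DEST_DIR and print the path of the archive that was written.
# Hypothetical helper for illustration only.
backup_raft() {
  local src=$1 dest=$2
  local archive
  archive="$dest/swarm-raft-$(date +%Y%m%d-%H%M%S).tar.gz"
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # -C makes the paths inside the archive relative to the parent of SRC_DIR.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  echo "$archive"
}

# Example (run as root on a manager; /backup is a hypothetical destination):
#   backup_raft /var/lib/docker/swarm/raft /backup
```

Running the same function on two managers gives the "couple of backups" the article recommends in case one copy turns out to be corrupted.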
I suggest creating a container for each deployed application that can properly back up the data for that application. Docker Swarm does not have a native scheduled task option, but you can configure cron on a worker node to run the various backup containers on a schedule.

The pause availability option can be helpful here. Configure a worker node that will be your designated host to pull backups. Set the availability to pause so that other containers are not started on the node and resources are available to perform the backups. Using pause means that containers can be started in the background and will continue to run after the node is paused, allowing them to finish normally. Then cron can run a script that looks something like the following one. The contents of run-backup-containers are left as an exercise for the reader:

    #!/bin/bash
    docker node update --availability active backupworker
    run-backup-containers
    docker node update --availability pause backupworker

You can also use labels to designate multiple nodes for backups and schedule services to ignore those nodes or, in the case of backup containers, to run on them.

Managing services

Now that the swarm is up and running, it is time to look at services. A service is a collection of one or more tasks that do something. Each task is a container running somewhere in the swarm. Since services are potentially composed of multiple tasks, there are different tools to manage them. In most cases, these commands will be subcommands of docker service.

Running services

Running tasks with Docker Swarm is a little bit different from running them under plain Docker. Instead of using docker run, the command is docker service create:

    $ docker service create --name web nginx

When a service starts, a swarm manager schedules the tasks to run on active workers in the swarm. By default, Swarm will spread running containers across all of the active hosts in the swarm.

Offering services to the Internet requires publishing ports with the -p flag.
Multiple ports can be opened by specifying -p multiple times. When using the swarm overlay network, you also get ingress mesh routing. The mesh will route connections from any host in the cluster to the service no matter where it is running. Port publishing will also load balance across multiple containers:

    $ docker service create --name web -p 80 -p 443 nginx

Creating replicas

A service can be started with multiple containers using the --replicas option. The value is the number of desired replicas. It may take a moment to start all the desired replicas:

    $ docker service create --replicas 2 --name web -p 80 -p 443 nginx

This example starts two copies of the nginx container under the service name web. The great news is that you can change the number of replicas at any time:

    $ docker service update --replicas 3 web

A service can be scaled up or down in this way. This example scales the service up to three containers. It is possible to scale the number down later once the extra replicas are no longer needed.

Use docker service ls to see a summary of running services and the number of replicas for each:

    $ docker service ls
    ID            NAME  REPLICAS  IMAGE  COMMAND
    4i3jsbsohkxj  web   3/3       nginx

If you need to see the details of a service, including where tasks are running, use the docker service ps command. It takes the name of the service as an argument. This example shows three nginx tasks that are part of the service web:

    $ docker service ps web
    ID                         NAME   IMAGE  NODE     DESIRED STATE  CURRENT STATE            ERROR
    d993z8o6ex6wz00xtbrv647uq  web.1  nginx  worker2  Running        Running 31 minutes ago
    eueui4hw33eonsin9hfqgcvd7  web.2  nginx  worker1  Running        Preparing 4 seconds ago
    djg5542upa1vq4z0ycz8blgfo  web.3  nginx  worker2  Running        Running 2 seconds ago

Take note of the name of the tasks.
If you were to connect to worker1 and try to use docker exec to access the web.2 container, you would get an error:

    $ docker exec -it web.2 bash
    Error response from daemon: No such container: web.2

Containers for tasks started with docker service are named with the name of the service, a number, and the ID of the task, separated by dots. Using docker ps on worker1, you can see the actual name of the web.2 container:

    $ docker ps
    CONTAINER ID  IMAGE         COMMAND                 CREATED     STATUS     PORTS            NAMES
    b71ad831eb09  nginx:latest  "nginx -g 'daemon off"  2 days ago  Up 2 days  80/tcp, 443/tcp  web.2.eueui4hw33eonsin9hfqgcvd7

Running global services

There may be times when you want to run a task on every active node in the swarm. This is useful for monitoring tools:

    $ docker service create --mode global --name monitor nginx
    $ docker service ps monitor
    ID                         NAME         IMAGE  NODE     DESIRED STATE  CURRENT STATE          ERROR
    daxkqywp0y8bhip0f4ocpl5v1  monitor      nginx  worker2  Running        Running 6 seconds ago
    a45opnrj3dcvz4skgwd8vamx8   \_ monitor  nginx  worker1  Running        Running 4 seconds ago

It is important to reiterate that global services only run on active nodes. The task will not start on nodes that have the availability set to pause or drain. If a paused or drained node is set to active, the global service will be started on that node immediately:

    $ docker node update --availability active manager
    manager
    $ docker service ps monitor
    ID                         NAME         IMAGE  NODE     DESIRED STATE  CURRENT STATE            ERROR
    0mpe2zb0mn3z6fa2ioybjhqr3  monitor      nginx  manager  Running        Preparing 3 seconds ago
    daxkqywp0y8bhip0f4ocpl5v1   \_ monitor  nginx  worker2  Running        Running 3 minutes ago
    a45opnrj3dcvz4skgwd8vamx8   \_ monitor  nginx  worker1  Running        Running 3 minutes ago

Setting constraints

It is often useful to limit which nodes a service can run on. For example, a service that depends on fast disk might be limited to nodes that have SSDs. Constraints are added to the docker service create command with the --constraint flag. Multiple constraints can be added.
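As a rough sketch of passing more than one constraint, the helper below turns each expression it is given into its own --constraint flag. The function name create_constrained, the DOCKER override variable, and the node.labels.disk label are assumptions made for this example. The --constraint flag and the node.labels.env label are the ones used in this article.

```shell
#!/bin/bash
# DOCKER is a hypothetical override; set DOCKER="echo docker" for a dry run.
DOCKER=${DOCKER:-docker}

# create_constrained NAME IMAGE EXPR...: create a service, passing every
# EXPR as a separate --constraint flag.
create_constrained() {
  local name=$1 image=$2
  shift 2
  local args=()
  local expr
  for expr in "$@"; do
    args+=(--constraint "$expr")
  done
  $DOCKER service create "${args[@]}" --name "$name" "$image"
}

# Example: only nodes labeled both env=prod and disk=ssd qualify
# (disk=ssd is a hypothetical label).
#   create_constrained web-fast nginx \
#     "node.labels.env == prod" "node.labels.disk == ssd"
```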
The result will be the intersection of all of the constraints.

For this example, assume that there exists a swarm with three nodes: manager, worker1, and worker2. The worker1 and worker2 nodes have the env=prod label while manager has the env=dev label. If a service is started with the constraint that env is dev, it will only run service tasks on the manager node:

    $ docker service create --constraint "node.labels.env == dev" --name web-dev --replicas 2 nginx
    913jm3v2ytrpxejpvtdkzrfjz
    $ docker service ps web-dev
    ID                         NAME       IMAGE  NODE     DESIRED STATE  CURRENT STATE          ERROR
    5e93fl10k9x6kq013ffotd1wf  web-dev.1  nginx  manager  Running        Running 6 seconds ago
    5skcigjackl6b8snpgtcjbu12  web-dev.2  nginx  manager  Running        Running 5 seconds ago

Even though there are two other nodes in the swarm, the service is only running on the manager because it is the only node with the env=dev label. If another service is started with the constraint that env is prod, the tasks will start on the worker nodes:

    $ docker service create --constraint "node.labels.env == prod" --name web-prod --replicas 2 nginx
    88kfmfbwksklkhg92f4fkcpwx
    $ docker service ps web-prod
    ID                         NAME        IMAGE  NODE     DESIRED STATE  CURRENT STATE          ERROR
    5f4s2536g0bmm99j7wc02s963  web-prod.1  nginx  worker2  Running        Running 3 seconds ago
    5ogcsmv2bquwpbu1ndn4i9q65  web-prod.2  nginx  worker1  Running        Running 3 seconds ago

The constraints will be honored if the services are scaled.
No matter how many replicas are requested, the containers will only be run on nodes that match the constraints:

    $ docker service update --replicas 3 web-prod
    web-prod
    $ docker service ps web-prod
    ID                         NAME        IMAGE  NODE     DESIRED STATE  CURRENT STATE               ERROR
    5f4s2536g0bmm99j7wc02s963  web-prod.1  nginx  worker2  Running        Running about a minute ago
    5ogcsmv2bquwpbu1ndn4i9q65  web-prod.2  nginx  worker1  Running        Running about a minute ago
    en15vh1d7819hag4xp1qkerae  web-prod.3  nginx  worker1  Running        Running 2 seconds ago

As you can see from the example, the containers are all running on worker1 and worker2. This leads to an important point. If the constraints cannot be satisfied, the service will be created but no containers will actually be running:

    $ docker service create --constraint "node.labels.env == testing" --name web-test --replicas 2 nginx
    6tfeocf8g4rwk8p5erno8nyia
    $ docker service ls
    ID            NAME      REPLICAS  IMAGE  COMMAND
    6tfeocf8g4rw  web-test  0/2       nginx
    88kfmfbwkskl  web-prod  3/3       nginx
    913jm3v2ytrp  web-dev   2/2       nginx

Notice that the number of replicas requested is two but the number of containers running is zero. Swarm cannot find a suitable node, so it does not start the containers. If a node with the env=testing label were to be added, or if that label were to be added to an existing node, swarm would immediately schedule the tasks:

    $ docker node update --label-add env=testing worker1
    worker1
    $ docker service ps web-test
    ID                         NAME        IMAGE  NODE     DESIRED STATE  CURRENT STATE           ERROR
    7hsjs0q0pqlb6x68qos19o1b0  web-test.1  nginx  worker1  Running        Running 18 seconds ago
    dqajwyqrah6zv83dsfqene3qa  web-test.2  nginx  worker1  Running        Running 18 seconds ago

In this example, the env label was changed to testing from prod on worker1. Since a node is now available that meets the constraints for the web-test service, swarm started the containers on worker1. However, the constraints are only checked when tasks are scheduled.
Even though worker1 no longer has the env label set to prod, the existing containers for the web-prod service are still running:

    $ docker node ps worker1
    ID                         NAME        IMAGE  NODE     DESIRED STATE  CURRENT STATE           ERROR
    7hsjs0q0pqlb6x68qos19o1b0  web-test.1  nginx  worker1  Running        Running 5 minutes ago
    dqajwyqrah6zv83dsfqene3qa  web-test.2  nginx  worker1  Running        Running 5 minutes ago
    5ogcsmv2bquwpbu1ndn4i9q65  web-prod.2  nginx  worker1  Running        Running 21 minutes ago
    en15vh1d7819hag4xp1qkerae  web-prod.3  nginx  worker1  Running        Running 20 minutes ago

Stopping services

All good things come to an end, and this includes services running in a swarm. When a service is no longer needed, it can be removed with docker service rm. When a service is removed, all tasks associated with that service are stopped and the containers removed from the nodes they were running on. The following example removes a service named web:

    $ docker service rm web

Docker makes the assumption that the only time services are stopped is when they are no longer needed. Because of this, there is no docker service analog to the docker stop command. This might not be an issue since services are so easily recreated. That said, I have run into situations where I have needed to stop a service for a short time for testing and did not have the command at my fingertips to recreate it.

The solution is very easy but not necessarily obvious. Rather than removing the service and recreating it, set the number of replicas to zero. This will stop all running tasks, and they will be ready to start up again when needed:

    $ docker service update --replicas 0 web
    web
    $ docker service ls
    ID            NAME  REPLICAS  IMAGE  COMMAND
    5gdgmb7afupd  web   0/0       nginx

The containers for the tasks are stopped, but remain on the nodes. The docker service ps command will show that the tasks for the service are all in the Shutdown state.
If needed, one can inspect the containers on the nodes:

    $ docker service ps web
    ID                         NAME   IMAGE  NODE     DESIRED STATE  CURRENT STATE           ERROR
    68v6trav6cf2qj8gp8wgcbmhu  web.1  nginx  worker1  Shutdown       Shutdown 3 seconds ago
    060wax426mtqdx79g0ulwu25r  web.2  nginx  manager  Shutdown       Shutdown 2 seconds ago
    79uidx4t5rz4o7an9wtya1456  web.3  nginx  worker2  Shutdown       Shutdown 3 seconds ago

When it is time to bring the service back, use docker service update to set the number of replicas back to what is needed. Swarm will start the containers across the swarm just as if you had used docker service create.

Upgrading a service with rolling updates

It is likely that services running in a swarm will need to be upgraded. Traditionally, upgrades involved stopping a service, performing the upgrade, then restarting the service. If everything goes well, the service starts and works as expected and the downtime is minimized. If not, there can be an extended outage as the administrator and developers debug what went wrong and restore the service.

Docker makes it easy to test new images before they are deployed, and one can be confident that the service will work in production just as it did during testing. The question is, how does one deploy the upgraded service without a noticeable downtime for the users? For busy services, even a few seconds of downtime can be problematic. Docker Swarm provides a way to update services in the background with zero downtime.

There are three options which are passed to docker service create that control how rolling updates are applied. These options can also be changed after a service has been created with docker service update. The options are as follows:

--update-delay: This option sets the delay between each container upgrade. The delay is defined by a number of hours, minutes, and seconds indicated by a number followed by h, m, or s, respectively. For example, a 30 second delay will be written as 30s.
A delay of 1 hour, 30 minutes, and 12 seconds will be written as 1h30m12s.

--update-failure-action: This tells swarm how to handle an upgrade failure. By default, Docker Swarm will pause the upgrade if a container fails to upgrade. You can configure swarm to continue even if a task fails to upgrade. The allowed values are pause and continue.

--update-parallelism: This tells swarm how many tasks to upgrade at one time. By default, Docker Swarm will only upgrade one task at a time. If this is set to 0 (zero), all running containers will be upgraded at once.

For this example, a service named web is started with six replicas using the nginx:1.10 image. The service is configured to update two tasks at a time and wait 30 seconds between updates. The list of tasks from docker service ps shows that all six tasks are running and that they are all running the nginx:1.10 image:

    $ docker service create --name web --update-delay 30s --update-parallelism 2 --replicas 6 nginx:1.10
    $ docker service ps web
    ID                         NAME   IMAGE       NODE     DESIRED STATE  CURRENT STATE           ERROR
    83p4vi4ryw9x6kbevmplgrjp4  web.1  nginx:1.10  worker2  Running        Running 20 seconds ago
    2yb1tchas244tmnrpfyzik5jw  web.2  nginx:1.10  worker1  Running        Running 20 seconds ago
    f4g3nayyx5y6k65x8n31x8klk  web.3  nginx:1.10  worker2  Running        Running 20 seconds ago
    6axpogx5rqlg96bqt9qn822rx  web.4  nginx:1.10  worker1  Running        Running 20 seconds ago
    2d7n5nhja0efka7qy2boke8l3  web.5  nginx:1.10  manager  Running        Running 16 seconds ago
    5sprz723zv3z779o3zcyj28p1  web.6  nginx:1.10  manager  Running        Running 16 seconds ago

Updates are started using the docker service update command and specifying a new image. In this case, the service will be upgraded from nginx:1.10 to nginx:1.11. Rolling updates work by stopping a number of tasks defined by --update-parallelism and starting new tasks based on the new image.
Swarm will then wait until the delay set by --update-delay elapses before upgrading the next tasks:

    $ docker service update --image nginx:1.11 web

When the updates begin, two tasks will be updated at a time. If the image is not found on the node, it will be pulled from the registry, slowing down the update. You can speed up the process by writing a script to pull the new image on each node before you run the update.

The update process can be monitored by running docker service inspect or docker service ps:

    $ docker service inspect --pretty web
    ID:             4a60v04ux70qdf0fzyf3s93er
    Name:           web
    Mode:           Replicated
     Replicas:      6
    Update status:
     State:         updating
     Started:       about a minute ago
     Message:       update in progress
    Placement:
    UpdateConfig:
     Parallelism:   2
     Delay:         30s
     On failure:    pause
    ContainerSpec:
     Image:         nginx:1.11
    Resources:
    $ docker service ps web
    ID                         NAME       IMAGE       NODE     DESIRED STATE  CURRENT STATE                    ERROR
    83p4vi4ryw9x6kbevmplgrjp4  web.1      nginx:1.10  worker2  Running        Running 35 seconds ago
    29qqk95xdrb0whcdy7abvji2p  web.2      nginx:1.11  manager  Running        Preparing 2 seconds ago
    2yb1tchas244tmnrpfyzik5jw   \_ web.2  nginx:1.10  worker1  Shutdown       Shutdown 2 seconds ago
    f4g3nayyx5y6k65x8n31x8klk  web.3      nginx:1.10  worker2  Running        Running 35 seconds ago
    6axpogx5rqlg96bqt9qn822rx  web.4      nginx:1.10  worker1  Running        Running 35 seconds ago
    3z6ees2748tqsoacy114ol183  web.5      nginx:1.11  worker1  Running        Running less than a second ago
    2d7n5nhja0efka7qy2boke8l3   \_ web.5  nginx:1.10  manager  Shutdown       Shutdown 1 seconds ago
    5sprz723zv3z779o3zcyj28p1  web.6      nginx:1.10  manager  Running        Running 30 seconds ago

As the upgrade starts, the web.2 and web.5 tasks have been updated and are now running nginx:1.11. The others are still running the old version. Every 30 seconds, two more tasks will be upgraded until the entire service is running nginx:1.11.

Docker Swarm does not care about the version that is used on the image tag. This means that you can just as easily downgrade the service.
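The batch schedule used in this rolling update (six replicas, two tasks at a time, 30 seconds apart) can be worked out with a little shell arithmetic. This is only a back-of-the-envelope sketch: the function names batches and min_delay_seconds are made up for this example, and the result covers only the configured waits between batches, not the time needed to pull images or start containers.

```shell
#!/bin/bash
# batches REPLICAS PARALLELISM: number of update batches (ceiling division).
# A parallelism of 0 means every task is updated at once, so one batch.
batches() {
  local replicas=$1 parallelism=$2
  if [ "$parallelism" -eq 0 ]; then
    echo 1
  else
    echo $(( (replicas + parallelism - 1) / parallelism ))
  fi
}

# min_delay_seconds REPLICAS PARALLELISM DELAY_SECONDS: total time spent
# waiting between batches; there is one wait between consecutive batches.
min_delay_seconds() {
  local b
  b=$(batches "$1" "$2")
  echo $(( (b - 1) * $3 ))
}

# The example service: 6 replicas, parallelism 2, delay 30s.
echo "batches: $(batches 6 2), waiting: $(min_delay_seconds 6 2 30)s"
```

For the example service this gives three batches separated by two 30 second waits, so the rollout spends at least a minute waiting between batches.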
For example, if the docker service update --image nginx:1.10 web command were run after the upgrade, Swarm would go through the same update process until all tasks are running nginx:1.10. This can be very helpful if an upgrade does not work as it was supposed to.

It is also important to note that nothing says the new image has to be built from the same base as the old one. You can decide to run Apache instead of Nginx by running docker service update --image httpd web. Swarm will happily replace all the web tasks which were running an nginx image with ones running the httpd image.

Rolling updates require that your service is able to run multiple containers in parallel. This may not work for some services, such as SQL databases. Some updates may require a schema change that is incompatible with the older version. In either case, rolling updates may not work for the service. In those cases, you can set --update-parallelism 0 to force all tasks to update at once or manually recreate the service. If you are running a lot of replicas, you should pre-pull your image to ease the load on your registry.

Summary

In this article, you have seen how to use Docker Swarm to orchestrate a Docker cluster. The same tools that are used with a single node can be used with a swarm. Additional tools available through swarm allow for easy scale-out of services and rolling updates with little to no downtime.

Packt
14 Jan 2016
10 min read

Exploring Themes

Drupal developers and interface engineers do not always create custom themes from scratch. Sometimes, we are asked to create starter themes that we begin any project from or subthemes that extend the functionality of a base theme. Having the knowledge of how to handle each of these situations is important. In this article by Chaz Chumley, the author of Drupal 8 Theming with Twig, we will be discussing some starter themes and how to work around the various libraries available to us. (For more resources related to this topic, see here.) Starter themes Any time we begin developing in Drupal, it is preferable to have a collection of commonly used functions and libraries that we can reuse. Being able to have a consistent starting point when creating multiple themes means we don't have to rethink much from design to design. The concept of a starter theme makes this possible, and we will walk through the steps involved in creating one. Before we begin, take a moment to use the drupal8.sql file that we already have with us to restore our current Drupal instance. This file will add the additional content and configuration required while creating a starter theme. Once the restore is complete, our home page should look like the following screenshot: This is a pretty bland-looking home page with no real styling or layout. So, one thing to keep in mind when first creating a starter theme is how we want our content to look. Do we want our starter theme to include another CSS framework, or do we want to create our own from scratch? Since this is our first starter theme, we should not be worried about reinventing the wheel but instead leverage an existing CSS framework, such as Twitter Bootstrap. Creating a Bootstrap starter Having an example or mockup that we can refer to while creating a starter theme is always helpful. 
So, to get the most out of our Twitter Bootstrap starter, let's go over to http://getbootstrap.com/examples/jumbotron/, where we will see an example of a home page layout: If we take a look at the mockup, we can see that the layout consists of two rows of content, with the first row containing a large callout known as a Jumbotron. The second row contains three featured blocks of content. The remaining typography and components take advantage of the Twitter Bootstrap CSS framework to display content. One advantage of integrating the Twitter Bootstrap framework into our starter theme is that our markup will be responsive in nature. This means that as the browser window is resized, the content will scale down accordingly. At smaller resolutions, the three columns will stack on top of one another, enabling the user to view the content more easily on smaller devices. We will be recreating this home page for our starter theme, so let's take a moment and familiarize ourselves with some basic Bootstrap layout terminology before creating our theme. Understanding grids and columns Bootstrap uses a 12-column grid system to structure content using rows and columns. The page layout begins with a parent container that wraps all child elements and allows you to maintain a specific page width. Each row and column then has CSS classes identifying how the content should appear. So, for example, if we want to have a row with two equal-width columns, we would build our page using the following markup: <div class="container">     <div class="row">         <div class="col-md-6"></div>         <div class="col-md-6"></div>     </div> </div> The two columns within a row must combine to a value of 12, since Bootstrap uses a 12-column grid system. Using this simple math, we can have variously sized columns and multiple columns, as long as their total is 12. 
We should also take note of the following column classes, as they give us great flexibility in targeting different breakpoints:

  Extra small (col-xs-x)
  Small (col-sm-x)
  Medium (col-md-x)
  Large (col-lg-x)

Each breakpoint targets a range of device widths, from smartphones all the way up to television-size monitors. We can use multiple classes like class="col-sm-6 col-md-4" to manipulate our layout, which gives us a two-column row on small devices and a three-column row on medium devices when certain breakpoints are reached. For more detail, you can go to the Twitter Bootstrap documentation at http://getbootstrap.com/getting-started/ any time. For now, it's time we begin creating our starter theme.

Setting up a theme folder

The initial step in our process of creating a starter theme is fairly simple: we need to open up Finder or Windows Explorer, navigate to the themes folder, and create a folder for our theme. We will name our theme tweet, as shown in the following screenshot:

Adding a screenshot

Every theme deserves a screenshot, and in Drupal 8, all we need to do is have a file named screenshot.png, and the Appearance screen will use it to display an image above our theme.

Configuring our theme

Next, we will need to create our theme configuration file, which will allow our theme to be discoverable. We will only worry about general configuration information to start with and then add library and region information in the next couple of steps. Begin by creating a new file called tweet.info.yml in your themes/tweet folder, and add the following metadata to your file:

    name: Tweet
    type: theme
    description: 'A Twitter Bootstrap starter theme'
    core: 8.x
    base theme: false

Notice that we are setting the base theme configuration to false. Setting this value to false lets Drupal know that our theme will not rely on any other theme files. This allows us to have full control of our theme's assets and Twig templates.
We will save our changes at this juncture and clear the Drupal cache. Now we can take a look to see whether our theme is available to install.

Installing our theme

Navigate to /admin/appearance in your browser and you should see your new theme located in the Uninstalled themes section. Go ahead and install the theme by clicking on the Install and set as default link. If we navigate to the home page, we should see an unstyled home page: This clean slate is perfect while creating a starter theme, as it allows us to begin theming without worrying about overriding any existing markup that a base theme might include.

Working with libraries

While Drupal 8 ships with some improvements to its default CSS and JavaScript libraries, we will generally find ourselves wanting to add third-party libraries that can enhance the function and feel of our website. In our case, we have decided to add Twitter Bootstrap (http://getbootstrap.com), which provides us with a responsive CSS framework and JavaScript library that utilize a component-based approach to theming. The process involves three steps. The first is downloading or installing the assets that make up the framework or library. The second is creating a *.libraries.yml file and adding library entries that point to our assets. Finally, we will need to add a libraries reference to our *.info.yml file.

Adding assets

We can easily add the Twitter Bootstrap framework assets by following these steps:

  1. Navigate to http://getbootstrap.com/getting-started/#download
  2. Click on the Download Bootstrap button
  3. Extract the zip file
  4. Copy the contents of the bootstrap folder to our themes/tweet folder

Once we are done, our themes/tweet folder content should look like the following screenshot: Now that we have the Twitter Bootstrap assets added to our theme, we need to create a *.libraries.yml file that we can use to reference our assets.
Creating a library reference

Any time we want to add CSS or JS files to our theme, we will need to either create a *.libraries.yml file or modify an existing one; this file allows us to organize our assets. Each library entry can include one or more pointers to files and their locations within our theme structure. Remember that the filename of our *.libraries.yml file should follow the same naming convention as our theme. We can begin by following these steps:

  1. Create a new file called tweet.libraries.yml.
  2. Add a library entry called bootstrap.
  3. Add a version that reflects the current version of Bootstrap that we are using.
  4. Add the CSS entries for bootstrap.min.css and bootstrap-theme.min.css.
  5. Add the JS entry for bootstrap.min.js.
  6. Add a dependency on jQuery, located in Drupal's core:

    bootstrap:
      version: 3.3.6
      css:
        theme:
          css/bootstrap.min.css: {}
          css/bootstrap-theme.min.css: {}
      js:
        js/bootstrap.min.js: {}
      dependencies:
        - core/jquery

  7. Save tweet.libraries.yml.

In the preceding library entry, we have added both CSS and JS files as well as introduced dependencies. Note that css, js, and dependencies sit at the same indentation level under the library name, and that each file path is a key followed by its options ({} when there are none). Declaring a dependency makes sure that the library our JS file relies on is loaded before our JS file. Twitter Bootstrap relies on jQuery, and since Drupal 8 has it as part of its core.libraries.yml file, we can reference it by pointing to that library and its entry.

Including our library

Just because we added a library to our theme doesn't mean it will automatically be added to our website. In order for us to add Bootstrap to our theme, we need to include it in our tweet.info.yml configuration file. We can add Bootstrap by following these steps:

  1. Open tweet.info.yml.
  2. Add a libraries reference to bootstrap at the bottom of our configuration:

    libraries:
      - tweet/bootstrap

  3. Save tweet.info.yml.
Make sure to clear Drupal's cache to allow our changes to be added to the theme registry. Finally, navigate to our home page and refresh the browser so that we can preview our changes: If we inspect the HTML using Chrome's developer tools, we should see that the Twitter Bootstrap library has been included along with the rest of our files. Both the CSS and JS files are being loaded into the proper flow of our document.

Summary

Whether we build a starter theme or a subtheme, both are variations of the same techniques. The level of effort required to create each type of theme may vary, but as we saw, there is a lot of repetition. We began with a discussion around starter themes and learned what steps were involved in working with libraries.

Packt
20 Oct 2009
11 min read

ER Diagrams, Domain Model, and N-Layer Architecture with ASP.NET 3.5 (part1)

Let us start with a 1-tier ASP.NET application configuration. Strictly speaking, the application as a whole is three tier once we count the client browser and the database (if used). For the rest of this article we will ignore the database and browser as separate tiers so that we can focus on how to divide the main ASP.NET application into logical layers, putting the n-layer pattern to its best use. We will first try to separate the data access and logical code into their own separate layers and see how we can introduce flexibility and re-usability into our solution. We will understand this with a sample project. Before we go ahead into the technical details and code, we will first learn about two important terms, ER Diagram and Domain Model, and how they help us in getting a good understanding of the application we need to develop.

Entity-Relationship Diagram

Entity-Relationship diagrams, or ER diagrams for short, are graphical representations depicting relationships between different entities in a system. We humans understand and remember pictures or images more easily than textual information. When we first start to understand a project we need to see how different entities in the project relate to each other. ER diagrams help us achieve that goal by graphically describing the relationships. An entity can be thought of as an object in a system that can be identified uniquely. An entity can have attributes; an attribute is simply a property we can associate with an entity. For example, a Car entity can have the following attributes: EngineCapacity, NumberofGears, SeatingCapacity, Mileage, and so on. So attributes are basically fields holding data that describe and help identify an entity. Attributes cannot exist without an entity. Let us understand ER diagrams in detail with a simple e-commerce example: a very basic Order Management System.
We will be building a simple web-based system to track customers' orders, and manage customers and products. To start with, let us list the basic entities for our simplified Order Management System (OMS):

  Customer: A person who can place Orders to buy Products.
  Order: An order placed by a Customer. There can be multiple Products bought by a Customer in one Order.
  Product: A Product is an object that can be purchased by a Customer.
  Category: Category of a Product. A Category can have multiple Products, and a Product can belong to many Categories. For example, a mixer-grinder can be under the Electronic Gadgets category as well as in Home Appliances.
  OrderLineItem: An Order can be for multiple Products. Each individual Product in an order will be encapsulated by an OrderLineItem. So an Order can have multiple OrderLineItems.

Now, let us picture how the relationships between the core business entities are defined, using an Entity-Relationship diagram. Our ER diagram will show the relational associations between the entities from a database's perspective. So it is more of a relational model and will not show any of the object-oriented associations (for which we will use the Domain Model in the later sections of this article). In an ER diagram, we show entities using rectangular boxes, the relationships between entities using diamond boxes, and attributes using oval boxes, as shown below: The purpose of using such shapes is to make the ER diagram clear and concise, depicting the relational model as closely as possible without using long sentences or text. So the Customer entity with some of the basic attributes can be depicted in an ER diagram as follows: Now, let us create an ER diagram for our Order Management System. For the sake of simplicity, we will not list the attributes of the entities involved.
Here is how the ER diagram looks: The above ER diagram depicts the relationships between the OMS entities but is still incomplete as the relationships do not show how the entities are quantitatively related to each other. We will now look at how to quantify relationships using degree and cardinality. Degree and Cardinality of a Relationship The relationships in an ER diagram can also have a degree. A degree specifies the multiplicity of a relationship. In simpler terms, it refers to the number of entities involved in a relationship. All relationships in an OMS ER diagram have a degree of two, also called binary relationships. For example, in Customer-Order relationships only two entities are involved—Customer and Order; so it's a two degree relationship. Most relationships you come across would be binary. Another term associated with a relationship is cardinality. The cardinality of a relationship identifies the number of instances of entities involved in that particular relationship. For example, an Order can have multiple OrderLineItems, which means the cardinality of the relationship between Order and OrderLineItem is one-to-many. 
The three commonly used cardinalities of a relationship are:

  One-to-one: Depicted as 1:1. Example: One OrderLineItem can have only one Product, so the OrderLineItem and Product entities share a one-to-one relationship.
  One-to-many: Depicted as 1:n. Example: One customer can place multiple orders, so the Customer and Order entities share a one-to-many relationship.
  Many-to-many: Depicted as n:m. Example: One Product can be included in multiple Categories and one Category can contain multiple Products; therefore the Product and Category entities share a many-to-many relationship.

After adding the cardinality of the relationships to our ER diagram, here is how it will look: This basic ER diagram tells us a lot about how the different entities in the system are related to each other, and can help new programmers to quickly understand the logic and the relationships of the system they are working on. Each entity will be a unique table in the database.

OMS Project using 2-Layer

We know that the default coding style in ASP.NET 2.0 already supports the 1-tier 1-layer style, with two sub-layers in the main UI layer as follows:

  Designer code files: ASPX markup files
  Code-behind files: Files containing C# or VB.NET code

Because both of these sub-layers contain UI code, we can include them as a part of the UI layer. They help us to separate the markup and the code from each other. However, it is still not advisable to have logical code, such as data access or business logic, directly in these code-behind files. Now, one way to create an ASP.NET web application for our Order Management System (OMS) in just one layer is by using a DataSet (or DataReader) to fill the front-end UI elements directly in the code-behind classes. This will involve writing data access code in the UI layer (code-behind), and will tightly bind this UI layer with the data access logic, making the application rigid (inflexible), harder to maintain, and less scalable.
In order to have greater flexibility, and to keep the UI layer completely independent of the data access and business logic code, we need to put these elements in separate files. So we will now try to introduce some loose coupling by following a 2-layer approach this time. We will write all data access code in separate class files instead of using the code-behind files of the UI layer. This will make the UI layer independent of the data access code. We are assuming that we do not have any specific business logic code at this point, or else we would have put that under another layer with its own namespace, making it a 3-layered architecture. We will examine this in the upcoming sections of this article.

Sample Project

Let us see how we can move from this 1-tier 1-layer style to a 1-tier 2-layer style. Using the ER diagram above as reference, we can create a 2-layer architecture for our OMS with these layers:

  UI layer with ASPX and code-behind classes
  Data access classes under a different namespace but in the same project

So let's start with a new VS 2008 project. We will create a new ASP.NET Web Project in C#, and add a new web form, ProductList.aspx, which will simply display a list of all the products using a Repeater control. The purpose of this project is to show how we can logically break up the UI layer further by separating the data access code into another class file. The following is the ASPX markup of the ProductList page (unnecessary elements and tags have been removed to keep things simple):

    <asp:Repeater ID="prodRepeater" runat="server">
        <ItemTemplate>
            Product Code: <%# Eval("Code")%> <br>
            Name: <%# Eval("Name")%> <br>
            Unit Price: $<%# Eval("UnitPrice")%> <br>
        </ItemTemplate>
    </asp:Repeater>

In this ASPX file, we only have a Repeater control, which we will bind with the data in the code-behind file.
Here is the code in the ProductList.aspx.cs code-behind file:

    using System;
    using System.Data;
    using OMS.Code;   // namespace that contains the DAL class

    namespace OMS
    {
        public partial class _Default : System.Web.UI.Page
        {
            /// <summary>
            /// Page Load method
            /// </summary>
            /// <param name="sender"></param>
            /// <param name="e"></param>
            protected void Page_Load(object sender, EventArgs e)
            {
                // The UI layer only asks the DAL for data and binds it
                DataTable dt = DAL.GetAllProducts();
                prodRepeater.DataSource = dt;
                prodRepeater.DataBind();
            }
        } // end class
    } // end namespace

Note that we don't have any data access code in the code-behind sample above. We are just calling the GetAllProducts() method, which has all of the data access code wrapped in a different class named DAL. We can logically separate out the code by using different namespaces to achieve code re-use and greater architectural flexibility. So we created a new class named DAL under a different namespace from the UI layer code files. Here is the DAL code:

    using System.Configuration;
    using System.Data;
    using System.Data.SqlClient;

    namespace OMS.Code
    {
        public class DAL
        {
            /// <summary>
            /// Load all products from the database
            /// </summary>
            public static DataTable GetAllProducts()
            {
                string sCon = ConfigurationManager.ConnectionStrings[0].ConnectionString;
                using (SqlConnection cn = new SqlConnection(sCon))
                {
                    string sQuery = @"SELECT * FROM OMS_Product";
                    SqlCommand cmd = new SqlCommand(sQuery, cn);
                    SqlDataAdapter da = new SqlDataAdapter(cmd);
                    DataSet ds = new DataSet();
                    cn.Open();
                    da.Fill(ds);
                    return ds.Tables[0];
                }
            }
        } // end class
    } // end namespace

So we have separated the data access code into a new logical layer, using a separate namespace, OMS.Code, and a new class. Now, if we want to, we can re-use the same code in the other pages as well. Furthermore, methods to add and edit a product can be defined in this class and then used in the UI layer. This allows multiple developers to work on the DAL and UI layers simultaneously. Even though we have a logical separation of the code in this 2-layer sample architecture, we are still not using real Object Oriented Programming (OOP).
All of the Object-Oriented Programming we have used so far has been the default structure the .NET framework has provided, such as the Page class, and so on. When a project grows big in size as well as complexity, using the 2-layer model discussed above can become cumbersome and cause scalability and flexibility issues. If the project grows in complexity, then we will be putting all of the business logic code in either the DAL or the UI layer. This business logic code includes business rules. For example, if the customer orders a certain number of products in one order, he gets a certain level of discount. If we code such business rules in the UI layer, then if the rules change we need to change the UI as well, which is not ideal, especially in cases where we can have multiple UIs for the same code, for example one normal web browser UI and another mobile-based UI. We also cannot put business logic code in the DAL layer because the DAL layer should only contain data access code which should not be mixed with any kind of business processing logic. In fact the DAL layer should be quite "dumb"–there should be no "logic" inside it because it is mostly a utility layer which only needs to put data in and pull data out from a data store. To make our applications more scalable and to reap the benefit of OOP, we need to create objects, and wrap business behavior in their methods. This is where the Domain Model comes into the picture.

Sugandha Lahoti
19 Feb 2018
13 min read

How to Classify Digits using Keras and TensorFlow

This article is an excerpt from the book Ensemble Machine Learning by Ankit Dixit. The book provides a practical approach to building efficient machine learning models using ensemble techniques with real-world use cases.

Today we will look at how we can create, train, and test a neural network to perform digit classification using Keras and TensorFlow. This article uses the MNIST dataset of handwritten digits. It contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset. There have been a number of scientific papers on attempts to achieve the lowest error rate. One paper, by using a hierarchical system of CNNs, manages to get an error rate on the MNIST database of 0.23 percent. The original creators of the database keep a list of some of the methods tested on it. In their original paper, they used a support vector machine to get an error rate of 0.8 percent. Images in the dataset look like this: So let's not waste our time and start implementing our very first neural network in Python. Let's start the code by importing the supporting libraries.

    # Imports for array-handling and plotting
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt

Keras already ships a loader for the MNIST dataset, so we can import it as it is. The first time it is used, it downloads the data over the internet and caches it on disk, so an internet connection is required if your system does not already have the dataset:

    # Keras imports for the dataset and building our neural network
    from keras.datasets import mnist

Now, we will import the Sequential class and the load_model function from the keras.models module. We are working with a sequential network, as all layers will be in forward sequence only.
We are not using any split in the layers. The Sequential class will create a sequential model by combining the layers sequentially. The load_model function will help us to load the trained model for testing and evaluation purposes:

    # Import Sequential and load_model for creating and loading the model
    from keras.models import Sequential, load_model

In the next lines, we will import three types of layers from the Keras library. A Dense layer is a fully connected layer; that is, each neuron of the current layer is connected to every neuron of the previous layer and of the next layer. The Dropout layer reduces overfitting in our model. On each training iteration, it randomly selects some neurons and leaves them out, so there is less chance that two different neurons of the same layer learn the same features from the input. This discourages redundancy and correlation between neurons in the network, which eventually helps prevent overfitting. The Activation layer applies an activation function to the output of the neurons. We will use rectified linear units (ReLU) and the softmax function as activations. We will discuss their operation when we use them in network creation:

    # We will use Dense, Dropout, and Activation layers
    from keras.layers.core import Dense, Dropout, Activation
    from keras.utils import np_utils

We will start by loading our dataset with mnist.load_data(). It gives us training and testing input and output instances. Then, we will visualize some instances so that we know what kind of data we are dealing with. We will use matplotlib to plot them.
As the images have gray values, we can easily plot a histogram of an image, which gives us the pixel intensity distribution:

    # Let's start by loading our dataset
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # Plot the digits to verify
    plt.figure()
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.tight_layout()
        plt.imshow(X_train[i], cmap='gray', interpolation='none')
        plt.title("Digit: {}".format(y_train[i]))
        plt.xticks([])
        plt.yticks([])
    plt.show()

When we execute the preceding code block, we will get the output as:

    # Let's analyze the histogram of the image
    plt.figure()
    plt.subplot(2, 1, 1)
    plt.imshow(X_train[0], cmap='gray', interpolation='none')
    plt.title("Digit: {}".format(y_train[0]))
    plt.xticks([])
    plt.yticks([])
    plt.subplot(2, 1, 2)
    plt.hist(X_train[0].reshape(784))
    plt.title("Pixel Value Distribution")
    plt.show()

The histogram of an image will look like this:

    # Print the shape before we reshape and normalize
    print("X_train shape", X_train.shape)
    print("y_train shape", y_train.shape)
    print("X_test shape", X_test.shape)
    print("y_test shape", y_test.shape)

Currently, this is the shape of the dataset we have:

    X_train shape (60000, 28, 28)
    y_train shape (60000,)
    X_test shape (10000, 28, 28)
    y_test shape (10000,)

As we are working with 2D images, we cannot feed them directly to our fully connected network. There are other types of neural networks designed for 2D images; we will discuss those in the future. To remove this data compatibility issue, we will reshape the input images into 1D vectors of 784 values (as the images have size 28 x 28). We have 60000 such images in the training data and 10000 in the testing data:

    # As we have data in image form, convert it to row vectors
    X_train = X_train.reshape(60000, 784)
    X_test = X_test.reshape(10000, 784)
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')

Normalize the input data into the range of 0 to 1 so that it leads to a faster convergence of the network.
The purpose of normalizing data is to transform our dataset into a bounded range while preserving the relative differences between the pixel values. There are various kinds of normalizing techniques available, such as mean normalization, min-max normalization, and so on:

    # Normalizing the data to between 0 and 1 to help with the training
    X_train /= 255
    X_test /= 255

    # Print the final input shape ready for training
    print("Train matrix shape", X_train.shape)
    print("Test matrix shape", X_test.shape)

Let's print the shape of the data:

    Train matrix shape (60000, 784)
    Test matrix shape (10000, 784)

Now, our training set contains output variables as discrete class values; say, for an image of the number eight, the output class value is eight. But our output neurons will be able to give an output only in the range of zero to one. So, we need to convert the discrete output values to categorical values so that eight can be represented as a vector of zeros and a single one, with the length equal to the number of classes. For example, for the number eight, the output class vector should have ten entries, with the one at index 8:

    8 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

    # One-hot encoding using keras' numpy-related utilities
    n_classes = 10
    print("Shape before one-hot encoding: ", y_train.shape)
    Y_train = np_utils.to_categorical(y_train, n_classes)
    Y_test = np_utils.to_categorical(y_test, n_classes)
    print("Shape after one-hot encoding: ", Y_train.shape)

After one-hot encoding of our output, the variable's shape will be modified as:

    Shape before one-hot encoding:  (60000,)
    Shape after one-hot encoding:   (60000, 10)

So, you can see that now we have an output variable of 10 dimensions instead of 1. Now, we are ready to define our network parameters and layer architecture. We will start creating our network by creating a Sequential class object, model. We can add different layers to this model as we have done in the following code block. We will create a network of an input layer, two hidden layers, and one output layer.
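The one-hot encoding described above can be sketched with plain NumPy, which may help to see what the Keras utility produces. This is a minimal illustration and not the Keras implementation: the one_hot helper and the three sample labels are hypothetical names and values chosen for this sketch.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) matrix with a single 1 per row.

    Hypothetical helper for illustration; it mirrors the idea of one-hot
    encoding, not the internals of any Keras function.
    """
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

y = [5, 0, 8]                      # made-up digit labels
Y = one_hot(y, n_classes=10)

print(Y.shape)                     # one row per label, one column per class
print(Y[2])                        # the row for the digit eight
print(np.argmax(Y, axis=1))        # argmax recovers the original labels
```

Taking the argmax of each row recovers the original class value, which is also how a predicted probability vector is turned back into a digit.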
As the input layer is always our data layer, it doesn't have any learning parameters. For hidden layers, we will use 512 neurons in each. At the end, for a 10-dimensional output, we will use 10 neurons in the final layer:

    # Here, we will create the model of our ANN
    # Create a linear stack of layers with the sequential model
    model = Sequential()
    # First hidden layer with 512 neurons; input_shape describes the 784-value input
    model.add(Dense(512, input_shape=(784,)))
    # We will use relu as the activation
    model.add(Activation('relu'))
    # Add dropout to prevent overfitting
    model.add(Dropout(0.2))
    # Add a second hidden layer with 512 neurons and relu activation
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))
    # This is our output layer with 10 neurons
    model.add(Dense(10))
    model.add(Activation('softmax'))

After defining the preceding structure, our neural network will look something like this: The Shape field in each layer shows the shape of the matrix in that layer, and it is quite intuitive. As the input of 784 values is multiplied into 512 neurons, the weight matrix at Hidden-1 has shape 784 x 512. It is calculated similarly for the other two layers. We have used two different kinds of activation functions here. The first one is ReLU and the second one is softmax, which produces class probabilities. We will take some time to discuss these two. ReLU prevents the output of the neuron from becoming negative. The expression for the ReLU function is f(x) = max(0, x). So if any neuron produces an output less than 0, ReLU converts it to 0. We can write it in conditional form as: f(x) = x if x > 0, and f(x) = 0 otherwise. You just need to know that ReLU is usually a better activation function than sigmoid for deeper networks. If we plot a sigmoid function, it will look like: If you look closer, the sigmoid function starts getting saturated before reaching its minimum (0) or maximum (1) values. So at the time of gradient calculation, values in the saturated region result in a very small gradient.
That causes a very small change in the weight values, which is not sufficient to optimize the cost function. As we go further backward during backpropagation, that small change becomes smaller still and almost reaches zero. This problem is known as the problem of vanishing gradients. So, in practical cases, we avoid sigmoid activation when our network has many stacked layers.

In contrast, the expression of the ReLU activation is more like a straight line:

[Figure: plot of the ReLU function]

So, the gradient of the preceding function is always non-zero (a constant 1) unless the output itself is zero. This mitigates the problem of vanishing gradients.

We discussed the significance of the dropout layer earlier, so we will not repeat it here. We are using 20% neuron dropout during training. We will not use the dropout layer at test time.

Now, we are all set to train our very first ANN, but before starting training, we have to define the values of the network hyperparameters. We will use the Adam optimizer, a variant of SGD based on adaptive moment estimation. There are many algorithms that improve on the plain SGD algorithm. You just need to know that Adam is usually a better choice than simple gradient descent because it adapts the learning rate using the history of previous gradients. So, there is less chance of getting trapped in a local minimum or overshooting the global minimum. We are using Adam with its default parameters.

Here, we use a batch_size of 128 samples. That means we will update the weights after calculating the error on each batch of 128 samples. It is a sufficient batch size for our total data population. We are going to train our network for 20 epochs for the time being. Here, one epoch means one complete training cycle over all mini-batches.
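To make the vanishing-gradient argument above concrete, here is a small sketch that multiplies the local gradient of an activation across several stacked layers. It is a deliberate simplification: it ignores the weights and assumes every layer sees the same pre-activation value (4.0, chosen arbitrarily to sit in the sigmoid's saturated region). The function names are hypothetical:

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation, n_layers):
    """Product of the same local gradient over n_layers (weights ignored)."""
    return grad_fn(pre_activation) ** n_layers


for depth in (1, 5, 10):
    print(depth,
          chained_gradient(sigmoid_grad, 4.0, depth),
          chained_gradient(relu_grad, 4.0, depth))
```

The sigmoid column shrinks rapidly toward zero as the depth grows, while the ReLU column stays at 1.0 for a positive input, which is the behaviour the text describes.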
Now, let's start training our network:

    # Here we will be compiling the sequential model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

    # Start training the model and saving metrics in history
    history = model.fit(X_train, Y_train,
                        batch_size=128, epochs=20,
                        verbose=2,
                        validation_data=(X_test, Y_test))

We will save our trained model on disk so that we can use it for further fine-tuning whenever required. We will store the model in the HDF5 file format:

    # Saving the model on disk
    path2save = 'E:/PyDevWorkSpaceTest/Ensembles/Chapter_10/keras_mnist.h5'
    model.save(path2save)
    print('Saved trained model at %s ' % path2save)

    # Plotting the metrics
    # (newer Keras versions name these keys 'accuracy' and 'val_accuracy')
    fig = plt.figure()
    plt.subplot(2,1,1)
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='lower right')

    plt.subplot(2,1,2)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper right')

    plt.tight_layout()
    plt.show()

Let's analyze the loss at each epoch during the training of our neural network; we will also plot the accuracies for the training and validation sets. You should always monitor the validation and training loss, as they can tell you whether your model is underfitting or overfitting:

[Figure: "model accuracy" and "model loss" plotted per epoch for the train and test sets]

    Test Loss 0.0824991761778
    Test Accuracy 0.9813

As you can see, we are getting very similar performance for our training and validation sets in terms of loss and accuracy. You can see how accuracy increases as the number of epochs increases. This shows that our network is learning.

Now, we have trained and stored our model.
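The advice above about monitoring training and validation loss can also be checked numerically rather than only by eye. The following is a rough sketch that assumes a dictionary shaped like `history.history`, with `'loss'` and `'val_loss'` lists holding one value per epoch. The function name `summarize_history` and the toy numbers are invented for illustration and are not results from the model above:

```python
def summarize_history(history):
    """Summarize per-epoch losses from a dict with 'loss' and 'val_loss' lists."""
    train, val = history["loss"], history["val_loss"]
    # Index of the epoch with the lowest validation loss
    best = min(range(len(val)), key=val.__getitem__)
    return {
        "best_epoch": best + 1,                      # 1-based epoch number
        "best_val_loss": val[best],
        "final_gap": val[-1] - train[-1],            # large positive gap hints at overfitting
        "val_loss_rose_after_best": best < len(val) - 1,
    }


# Made-up losses: training loss keeps falling while validation loss turns upward
toy_history = {
    "loss": [0.9, 0.5, 0.3, 0.2],
    "val_loss": [0.8, 0.5, 0.45, 0.5],
}
print(summarize_history(toy_history))
```

A validation loss that starts rising while the training loss keeps falling is the usual sign of overfitting; both losses staying high suggests underfitting.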
It's time to reload it and test it with the 10000 test instances:

    # Let's load the model for testing data
    # (the path must point to the file written by model.save earlier)
    path2save = 'D:/PyDevWorkspace/EnsembleMachineLearning/Chapter_10/keras_mnist.h5'
    mnist_model = load_model(path2save)

    # We will use the evaluate function
    loss_and_metrics = mnist_model.evaluate(X_test, Y_test, verbose=2)
    print("Test Loss", loss_and_metrics[0])
    print("Test Accuracy", loss_and_metrics[1])

    # Create predictions on the test set
    # (predict_classes was removed in TensorFlow 2.6; on newer versions use
    #  np.argmax(mnist_model.predict(X_test), axis=1) instead)
    predicted_classes = mnist_model.predict_classes(X_test)

    # See which we predicted correctly and which not
    correct_indices = np.nonzero(predicted_classes == y_test)[0]
    incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
    print(len(correct_indices), " classified correctly")
    print(len(incorrect_indices), " classified incorrectly")

So, here is the performance of our model on the test set:

    9813  classified correctly
    187  classified incorrectly

As you can see, we have misclassified 187 instances out of 10000, which I think is very good accuracy for this dataset. In the next code block, we will look at some of the cases where the predicted label is wrong:

    # Adapt figure size to accommodate 18 subplots
    plt.rcParams['figure.figsize'] = (7,14)
    plt.figure()

    # Plot 9 correct predictions
    for i, correct in enumerate(correct_indices[:9]):
        plt.subplot(6,3,i+1)
        plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')
        plt.title(
            "Predicted: {}, Truth: {}".format(predicted_classes[correct],
                                              y_test[correct]))
        plt.xticks([])
        plt.yticks([])

    # Plot 9 incorrect predictions
    for i, incorrect in enumerate(incorrect_indices[:9]):
        plt.subplot(6,3,i+10)
        plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')
        plt.title(
            "Predicted: {}, Truth: {}".format(predicted_classes[incorrect],
                                              y_test[incorrect]))
        plt.xticks([])
        plt.yticks([])

    plt.show()

If you look closely, our network fails on cases that are very difficult even for a human to identify.
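The step from softmax outputs to predicted classes, and then to correct and incorrect indices, is simple enough to spell out. Here is a dependency-free sketch of that logic; the helper names and the three-class toy probabilities are made up for illustration. With NumPy, the first helper corresponds to taking `np.argmax` over each row of the model's predicted probabilities:

```python
def argmax_classes(probabilities):
    """For each row of class probabilities, return the index of the largest one."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


def split_indices(predicted, truth):
    """Return (correct, incorrect) lists of sample indices."""
    correct = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p == t]
    incorrect = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p != t]
    return correct, incorrect


# Toy softmax outputs for three samples and three classes
toy_probabilities = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
toy_truth = [1, 2, 2]

predicted = argmax_classes(toy_probabilities)
correct, incorrect = split_indices(predicted, toy_truth)
print(predicted)
print(len(correct), "classified correctly")
print(len(incorrect), "classified incorrectly")
```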
So, we can say that we are getting quite good accuracy from a very simple model. We saw how to create, train, and test a neural network to perform digit classification using Keras and TensorFlow. If you found our post useful, do check out the book Ensemble Machine Learning to build ensemble models using TensorFlow and Python libraries such as scikit-learn and NumPy.

Packt
24 May 2010
6 min read

Encode your password with Spring Security 3

This article by Peter Mularien is an excerpt from the book Spring Security 3. In this article, we will:

● Examine different methods of configuring password encoding
● Understand the password salting technique of providing additional security to stored passwords

In any secured system, password security is a critical aspect of trust and authoritativeness of an authenticated principal. Designers of a fully secured system must ensure that passwords are stored in a way in which malicious users would have an impractically difficult time compromising them.

The following general rules should be applied to passwords stored in a database:

● Passwords must not be stored in cleartext (plain text)
● Passwords supplied by the user must be compared to recorded passwords in the database
● A user's password should not be supplied to the user upon demand (even if the user forgets it)

For the purposes of most applications, the best fit for these requirements involves one-way encoding or encryption of passwords as well as some type of randomization of the encrypted passwords. One-way encoding provides the security and uniqueness properties that are important to properly authenticate users with the added bonus that once encrypted, the password cannot be decrypted.

In most secure application designs, it is neither required nor desirable to ever retrieve the user's actual password upon request, as providing the user's password to them without proper additional credentials could present a major security risk. Most applications instead provide the user the ability to reset their password, either by presenting additional credentials (such as their social security number, date of birth, tax ID, or other personal information), or through an email-based system.
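The combination of one-way encoding and randomization described above can be sketched in a few lines. Note that this is an illustration in Python using the standard library's PBKDF2 function, not Spring Security's PasswordEncoder; the function names, the salt length, and the iteration count are assumptions chosen for the example:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest) for a password using salted PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-hash the presented password with the stored salt and compare digests."""
    _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, stored))
print(verify_password("wrong-guess", salt, stored))
```

The stored value never needs to be decoded: a login attempt is checked by hashing the presented password with the same salt and comparing the results, which is why a forgotten password can only be reset, not retrieved.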
Storing other types of sensitive information

Many of the guidelines listed that apply to passwords apply equally to other types of sensitive information, including social security numbers and credit card information (although, depending on the application, some of these may require the ability to decrypt). It's quite common for databases storing this type of information to represent it in multiple ways; for example, a customer's full 16-digit credit card number would be stored in a highly encrypted form, but the last four digits might be stored in cleartext (for reference, think of any internet commerce site that displays XXXX XXXX XXXX 1234 to help you identify your stored credit cards).

You may already be thinking ahead and wondering, given our (admittedly unrealistic) approach of using SQL to populate our HSQL database with users, how do we encode the passwords? HSQL, or most other databases for that matter, don't offer encryption methods as built-in database functions.

Typically, the bootstrap process (populating a system with initial users and data) is handled through some combination of SQL loads and Java code. Depending on the complexity of your application, this process can get very complicated.

For the JBCP Pets application, we'll retain the embedded-database declaration and the corresponding SQL, and then add a small bit of Java to fire after the initial load to encrypt all the passwords in the database.

For password encryption to work properly, two actors must use password encryption in synchronization, ensuring that the passwords are treated and validated consistently.

Password encryption in Spring Security is encapsulated and defined by implementations of the o.s.s.authentication.encoding.PasswordEncoder interface.
Simple configuration of a password encoder is possible through the <password-encoder> declaration within the <authentication-provider> element as follows:

    <authentication-manager alias="authenticationManager">
      <authentication-provider user-service-ref="jdbcUserService">
        <password-encoder hash="sha"/>
      </authentication-provider>
    </authentication-manager>

You'll be happy to learn that Spring Security ships with a number of implementations of PasswordEncoder, which are applicable for different needs and security requirements. The implementation used can be specified using the hash attribute of the <password-encoder> declaration.

The following table provides a list of the out-of-the-box implementation classes and their benefits. Note that all implementations reside in the o.s.s.authentication.encoding package.

| Implementation class | Description | hash value |
| PlaintextPasswordEncoder | Encodes the password as plaintext. Default DaoAuthenticationProvider password encoder. | plaintext |
| Md4PasswordEncoder | PasswordEncoder utilizing the MD4 hash algorithm. MD4 is not a secure algorithm; use of this encoder is not recommended. | md4 |
| Md5PasswordEncoder | PasswordEncoder utilizing the MD5 one-way encoding algorithm. | md5 |
| ShaPasswordEncoder | PasswordEncoder utilizing the SHA one-way encoding algorithm. This encoder can support configurable levels of encoding strength. | sha, sha-256 |
| LdapShaPasswordEncoder | Implementation of LDAP SHA and LDAP SSHA algorithms used in integration with LDAP authentication stores. | {sha}, {ssha} |

As with many other areas of Spring Security, it's also possible to reference a bean definition implementing PasswordEncoder to provide more precise configuration and allow the PasswordEncoder to be wired into other beans through dependency injection. For JBCP Pets, we'll need to use this bean reference method in order to encode the bootstrapped user data.

Let's walk through the process of configuring basic password encoding for the JBCP Pets application.
Configuring password encoding

Configuring basic password encoding involves two pieces: encrypting the passwords we load into the database after the SQL script executes, and ensuring that the DaoAuthenticationProvider is configured to work with a PasswordEncoder.

Configuring the PasswordEncoder

First, we'll declare an instance of a PasswordEncoder as a normal Spring bean:

    <bean class="org.springframework.security.authentication.encoding.ShaPasswordEncoder"
          id="passwordEncoder"/>

You'll note that we're using the SHA-1 PasswordEncoder implementation. This is an efficient one-way encryption algorithm, commonly used for password storage.

Configuring the AuthenticationProvider

We'll need to configure the DaoAuthenticationProvider to have a reference to the PasswordEncoder, so that it can encode and compare the presented password during user login. Simply add a <password-encoder> declaration and refer to the bean ID we defined in the previous step:

    <authentication-manager alias="authenticationManager">
      <authentication-provider user-service-ref="jdbcUserService">
        <password-encoder ref="passwordEncoder"/>
      </authentication-provider>
    </authentication-manager>

Try to start the application at this point, and then try to log in. You'll notice that what were previously valid login credentials are now being rejected. This is because the passwords stored in the database (loaded with the bootstrap test-users-groupsdata.sql script) are not stored in an encrypted form that matches the password encoder. We'll need to post-process the bootstrap data with some simple Java code.

Writing the database bootstrap password encoder

The approach we'll take for encoding the passwords loaded via SQL is to have a Spring bean that executes an init method after the embedded-database bean is instantiated. The code for this bean, com.packtpub.springsecurity.security.DatabasePasswordSecurerBean, is fairly simple.
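    import java.sql.ResultSet;
    import java.sql.SQLException;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.jdbc.core.RowCallbackHandler;
    import org.springframework.jdbc.core.support.JdbcDaoSupport;
    import org.springframework.security.authentication.encoding.PasswordEncoder;

    public class DatabasePasswordSecurerBean extends JdbcDaoSupport {
        @Autowired
        private PasswordEncoder passwordEncoder;

        public void secureDatabase() {
            // Visit every user row and replace the cleartext password with its encoded form
            getJdbcTemplate().query("select username, password from users",
                    new RowCallbackHandler() {
                @Override
                public void processRow(ResultSet rs) throws SQLException {
                    String username = rs.getString(1);
                    String password = rs.getString(2);
                    // A null salt means the password is encoded without salting
                    String encodedPassword = passwordEncoder.encodePassword(password, null);
                    getJdbcTemplate().update(
                            "update users set password = ? where username = ?",
                            encodedPassword, username);
                    logger.debug("Updating password for username: " + username
                            + " to: " + encodedPassword);
                }
            });
        }
    }

The code uses the Spring JdbcTemplate functionality to loop through all the users in the database and encode the password using the injected PasswordEncoder reference. Each password is updated individually.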
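The same bootstrap idea (load cleartext users, then rewrite each password in encoded form) can be tried outside Spring. The sketch below is a Python analogue using an in-memory SQLite database and an unsalted SHA-1 hex digest, which is an assumption chosen to mirror the null salt in the Java code. The table layout, the sample users, and the function names are illustrative only:

```python
import hashlib
import sqlite3


def sha1_hex(password):
    """Unsalted SHA-1 hex digest (for illustration; prefer a salted, slow hash)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def secure_database(conn):
    """Replace every cleartext password in the users table with its hash."""
    rows = conn.execute("SELECT username, password FROM users").fetchall()
    for username, password in rows:
        # Each password is updated individually, using bound parameters
        conn.execute(
            "UPDATE users SET password = ? WHERE username = ?",
            (sha1_hex(password), username),
        )
    conn.commit()
    return len(rows)


# Simulate the SQL bootstrap load with two cleartext users
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("admin", "admin"), ("guest", "guest")],
)

print(secure_database(conn), "passwords encoded")
print(conn.execute("SELECT username, password FROM users").fetchall())
```

Running the securing step twice would hash the hashes, so in practice a routine like this should run exactly once after the initial load.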

Luis Sobrecueva
22 Jun 2023
8 min read

Create an AI-Powered Coding Project Generator.

Overview

Making a smart coding project generator can be a game-changer for developers. With the help of large language models (LLMs), we can generate entire code projects from a user-provided prompt.

In this article, we are developing a Python program that utilizes OpenAI's GPT-3.5 to generate code projects and slide presentations based on user-provided prompts. The program is designed as a command-line interface (CLI) tool, which makes it easy to use and integrate into various workflows.

Image 1: Weather App

Features

Our project generator will have the following features:

● Generates entire code projects based on user-provided prompts
● Generates entire slide presentations based on user-provided prompts
● Uses OpenAI's GPT-3.5 for code generation
● Outputs to a local project directory

Example Usage

Our tool will be able to generate a code project from a user-provided prompt. For example, this line will create a snake game:

    maiker "a snake game using just html and js"

We can then open the generated project in our browser:

    open maiker-generated-project/index.html

Image 2: Generated Project

Implementation

To ensure a comprehensive understanding of the project, let's break down the process of creating the AI-powered coding project generator step by step:

1. Load environment variables: We use the `dotenv` package to load environment variables from a `.env` file. This file should contain your OpenAI API key.

    from dotenv import load_dotenv
    load_dotenv()

2. Set up OpenAI API client: We set up the OpenAI API client using the API key loaded from the environment variables.

    import os
    import openai
    openai.api_key = os.getenv("OPENAI_API_KEY")

3. Define the `generate_project` function: This function is responsible for generating code projects or slide presentations based on the user-provided prompt.
Let's break down the function in more detail.

    def generate_project(prompt: str, previous_response: str = "", type: str = "code") -> Dict[str, str]:

The function takes three arguments:

● prompt: The user-provided prompt describing the project to be generated.
● previous_response: A string containing the previously generated files, if any. This is used to avoid generating the same files again when the tool runs more than one generation loop.
● type: The type of project to generate, either "code" or "presentation".

Inside the function, we first create the system and user prompts based on the input type (code or presentation).

    if type == "presentation":
        # ... (presentation-related prompts)
    else:
        # ... (code-related prompts)

For code projects, we create a system prompt that describes the role of the API as a code generator and a user prompt that includes the project description and any previously generated files. For presentations, we create a system prompt that describes the role of the API as a reveal.js presentation generator and a user prompt that includes the presentation description.

Next, we call the OpenAI API to generate the code or presentation using the created system and user prompts.

    # Note: openai.ChatCompletion is the interface of openai package versions before 1.0
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": system_prompt,
            },
            {
                "role": "user",
                "content": user_prompt,
            },
        ],
        temperature=0,
    )

We use the openai.ChatCompletion.create method to send a request to the GPT-3.5 model. The `messages` parameter contains an array of two messages: the system message and the user message. The `temperature` parameter is set to 0 to encourage deterministic output.

Once we receive the response from the API, we extract the generated code from the response.

    generated_code = completion.choices[0].message.content

Generating the files to disk: We then attempt to parse the generated code as a JSON object.
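As a rough sketch of this parse-and-save step, and not the code from the maiker-cli repository, the following shows one way to turn a JSON reply that maps filenames to contents into files on disk. The function names `parse_generated_files` and `save_files`, and the sample reply, are assumptions made for illustration:

```python
import json
import os
import tempfile


def parse_generated_files(raw_reply):
    """Parse the model's reply into a {filename: content} dictionary."""
    try:
        files = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc
    if not isinstance(files, dict):
        raise ValueError("Expected a JSON object mapping filenames to contents")
    return {str(name): str(content) for name, content in files.items()}


def save_files(files, output_dir):
    """Write each generated file under output_dir, creating subdirectories as needed."""
    written = []
    for name, content in files.items():
        path = os.path.join(output_dir, name)
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
        written.append(path)
    return written


# A made-up reply in the expected shape
reply = '{"index.html": "<h1>Snake</h1>", "js/game.js": "console.log(1);"}'
with tempfile.TemporaryDirectory() as out_dir:
    for path in save_files(parse_generated_files(reply), out_dir):
        print("wrote", path)
```

A real tool would also want to reject filenames that escape the output directory (for example, names containing `..`), since the filenames come from model output.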
If the parsing is successful, we return the parsed JSON object, which is a dictionary containing the generated files and their content. If the parsing fails, we raise an exception with an error message.

    try:
        if generated_code:
            generated_code = json.loads(generated_code)
    except json.JSONDecodeError as e:
        raise click.ClickException(
            f"Code generation failed. Please check your prompt and try again. Error: {str(e)}, generated_code: {generated_code}"
        )
    return generated_code

This dictionary is then used by the `main` function to save the generated files to the specified output directory.

4. Define the `main` function: This function is the entry point of our CLI tool. It takes a project prompt, an output directory, and the type of project (code or presentation) as input. It then calls the `generate_project` function to generate the project and saves the generated files to the specified output directory.

    def main(prompt: str, output_dir: str, type: str):
        # ... (rest of the code)

Inside the main function, we ensure the output directory exists, generate the project, and save the generated files.

    # ... (inside main function)
    os.makedirs(output_dir, exist_ok=True)
    for _loop in range(max_loops):
        generated_code = generate_project(prompt, ",".join(generated_files), type)
        for filename, contents in generated_code.items():
            # ... (rest of the code)

5. Create a Click command: We use the `click` package to create a command-line interface for our tool.
We define the command, its arguments, and options using the `click.command`, `click.argument`, and `click.option` decorators.

    import click

    @click.command()
    @click.argument("prompt")
    @click.option(
        "--output-dir",
        "-o",
        default="./maiker-generated-project",
        help="The directory where the generated code files will be saved.",
    )
    @click.option('-t', '--type', required=False, type=click.Choice(['code', 'presentation']), default='code')
    def main(prompt: str, output_dir: str, type: str):
        # ... (rest of the code)

6. Run the CLI tool: Finally, we run the CLI tool by calling the `main` function when the script is executed.

    if __name__ == "__main__":
        main()

In this article, we have used `... (rest of the code)` as a placeholder to keep the explanations concise and focused on specific parts of the code. The complete code for the AI-powered coding project generator can be found in the GitHub repository at the following link: https://github.com/lusob/maiker-cli

By visiting the repository, you can access the full source code, which includes all the necessary components and functions to create the CLI tool. You can clone or download the repository to your local machine, install the required dependencies, and start using the tool to generate code projects and slide presentations based on user-provided prompts.

Conclusion

With the current AI-powered coding project generator, you can quickly generate code projects and slide presentations based on user-provided prompts. By leveraging the power of OpenAI's GPT-3.5, you can save time and effort in creating projects and focus on other important aspects of your work. However, it is important to note that the complexity of the generated projects is currently limited due to the model's token limitations. GPT-3.5 has a maximum token limit, which restricts the amount of information it can process and generate in a single API call.
As a result, the generated projects might not be as comprehensive or sophisticated as desired for more complex applications. The good news is that with the continuous advancements in AI research and the development of new models with larger context windows (e.g., models with more than 100k context tokens), we can expect significant improvements in the capabilities of AI-powered code generators. These advancements will enable the generation of more complex and sophisticated projects, opening up new possibilities for developers and businesses alike.

Author Bio

Luis Sobrecueva is a software engineer with many years of experience working with a wide range of different technologies in various operating systems, databases, and frameworks. He began his professional career developing software as a research fellow in the engineering projects area at the University of Oviedo. He continued in a private company developing low-level (C / C++) database engines and visual development environments, and later jumped into the world of web development, where he met Python and discovered his passion for Machine Learning, applying it to various large-scale projects, such as creating and deploying a recommender for a job board with several million users. It was also at that time that he began to contribute to open source deep learning projects and to participate in machine learning competitions, and that he took several ML courses, obtaining various certifications, most notably a MicroMasters Program in Statistics and Data Science at MIT and a Udacity Deep Learning nanodegree. He currently works as a Data Engineer at a ride-hailing company called Cabify, but continues to develop his career as an ML engineer by consulting and contributing to open-source projects such as OpenAI and Autokeras.

Author of the book: Automated Machine Learning with AutoKeras
Packt
23 Oct 2009
6 min read

Preventing SQL Injection Attacks on your Joomla Websites

Introduction

There is an old saying, usually credited to Benjamin Franklin, that the only certainties in life are death and taxes. Web security has its own certainties: the question is not whether you will be attacked, but when and how your site will be taken advantage of.

There are several types of attacks that your Joomla! site may be vulnerable to, such as CSRF, Buffer Overflows, Blind SQL Injection, Denial of Service, and others that are yet to be found.

The top issues in PHP-based websites are:

● Incorrect or invalid (intentional or unintentional) input
● Access control vulnerabilities
● Session hijacks and attempts on session IDs
● SQL Injection and Blind SQL Injection
● Incorrect or ignored PHP configuration settings
● Divulging too much in error messages and poor error handling
● Cross Site Scripting (XSS)
● Cross Site Request Forgery, that is CSRF (one-click attack)

SQL Injections

SQL databases are the heart of Joomla! CMS. The database holds the content, the users' IDs, the settings, and more. Gaining access to this valuable resource is the ultimate prize for the hacker. It can give him/her administrative access, which can be used to gather private information such as usernames and passwords, and can allow any number of bad things to happen.

When you request a page on Joomla!, it forms a "query", or a question, for the database. The database does not suspect that you may be asking a malformed question and will attempt to process whatever the query is.

Often, the developers do not construct their code to watch for this type of attack. In fact, in the month of February 2008, twenty-one new SQL Injection vulnerabilities were discovered in Joomla! land.

The following are some examples presented for your edification.
Using any of these for any purpose is solely your responsibility and not mine:

Example 1

    index.php?option=com_****&Itemid=name&cmd=section&section=-
    000/**/union+select/**/000,111,222,
    concat(username,0x3a,password),0,
    concat(username,0x3a,password)/**/from/**/jos_users/*

Example 2

    index.php?option=com_****&task=****&Itemid=name&catid=97&aid=-
    9988%2F%2A%2A%2Funion%2F%2A%2A%2Fselect/**/
    concat(username,0x3a,password),0x3a,password,
    0x3a,username,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0/**/
    from/**/jos_users/*

Both of these will reveal, under the right set of circumstances, the usernames and passwords in your system. There is a measure of protection in Joomla! 1.0.13, with an encryption scheme that will render the passwords useless. However, it does not make sense to allow vulnerable extensions to remain. Yielding ANY kind of information like this is unacceptable.

The following screenshot displays the results of the second example running on a test system with the vulnerable extension. The two pieces of information are the username, which is listed as Author, and the hex string (partially blurred), which is the hashed password:

[Screenshot]

You can see that not all MD5 hashes can be broken easily. Though it won't be shown here, there is a website available where you enter your hash and it attempts to crack it. It supports several popular hashes. When I entered this hash (of a password) into the tool, I found the password to be Anthony. It's worth noting that this hash and its password are the result of a website getting broken into, prompting the user to search for the "hash" left behind, thus yielding the password.

The important news, however, is that if you are using Joomla! 1.0.13 or greater, the password's hash is now calculated with a "salt", making it nearly impossible to break. However, the standard MD5 could still be broken with enough effort in many cases.

For more information about salting and MD5, see: http://www.php.net/md5.
For an interesting read on salting, you may wish to read this link: www.governmentsecurity.org/forum/lofiversion/index.php/t19179.htm

SQL Injection is a query put to an SQL database where data input was expected AND the application does not correctly filter the input. It allows hijacking of database information such as usernames and passwords, as we saw in the earlier example.

Most of these attacks are based on two things. First, the developers have coding errors in their code, or they potentially reused the code from another application, thus spreading the error. The other issue is the inadequate validation of input. In essence, it means trusting the users to put in the RIGHT stuff, and not put in queries meant to harm the system. User input is rarely to be trusted for this reason. It should always be checked for proper format, length, and range.

There are many ways to test for vulnerability to an SQL Injection, but one of the most common ones is as follows:

[Figure: login box]

In some cases, this may be enough to trigger a database to divulge details. This very simplistic example would not work in the login box that is shown. However, if it were presented to a vulnerable extension in a manner such as the following, it might work:

    <FORM action=http://www.vulnerablesite.com/Search.php method=post>
    <input type=hidden name=A value="me' or 1=1--">
    </FORM>

This "posting" method (presented as a very generic exploit and not meant to work per se in Joomla!) will attempt to break into the database by putting forward queries that would not necessarily be noticed.

But why 1=1--? According to PHP.NET, "It is a common technique to force the SQL parser to ignore the rest of the query written by the developer with -- which is the comment sign in SQL."

You might be thinking, "So what if my passwords are hashed? They can get them but they cannot break them!"
This is true, but if they wanted it badly, nothing keeps them from doing something such as this:

    INSERT INTO jos_mydb_users
    ('email','password','login_id','full_name')
    VALUES ('johndoe@email.com','default','Jdoe','John Doe');--';

This code has potential if inserted into a query such as this:

    http://www.yourdomain/vulnerable_extension//index.php?option=com_vulext
    INSERT INTO jos_mydb_users
    ('email','password','login_id','full_name')
    VALUES ('johndoe@email.com','default','Jdoe','John Doe');--';

Again, this is a completely bogus example and is not likely to work. But if you can get an SQL DB to divulge its information, you can get it to "accept" (insert) information it should not as well.
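To see why the `' or 1=1--` trick works, and why bound parameters defeat it, here is a small self-contained sketch. It uses Python and an in-memory SQLite database rather than PHP and Joomla!, so the table, the sample rows, and the function names are assumptions made purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "hash-a"), ("bob", "hash-b")],
)


def find_user_unsafe(conn, name):
    """Vulnerable: the input is pasted straight into the SQL text."""
    query = "SELECT username FROM users WHERE username = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    """Safer: the input is passed as a bound parameter, never parsed as SQL."""
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (name,)
    ).fetchall()


payload = "me' or 1=1--"
print(find_user_unsafe(conn, payload))  # every row comes back
print(find_user_safe(conn, payload))    # no user has this literal name
```

In the unsafe version, the payload closes the string literal, adds a condition that is always true, and comments out the leftover quote. In the safe version, the whole payload is compared as an ordinary value, so nothing matches.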

Louis Owen
05 Oct 2023
13 min read

Build a Clone of Yourself with Large Language Models (LLMs)

Introduction

"White Christmas," a standout sci-fi episode from the Black Mirror series, serves as a major source of inspiration for this article. In this episode, we witness a captivating glimpse into a potential future application of Artificial Intelligence (AI), particularly considering the timeframe when the show was released. The episode introduces us to "Cookies," digital replicas of individuals that piqued the author's interest.

A "Cookie" is a device surgically implanted beneath a person's skull, meticulously replicating their consciousness over the span of a week. Subsequently, this replicated consciousness is extracted and transferred into a larger, egg-shaped device, which can be connected to a computer or tablet for various purposes.

Back when this episode was made available to the public in approximately 2014, the concept seemed far-fetched, squarely in the realm of science fiction. However, what if I were to tell you that we now have the potential to create our own clones akin to the "Cookies" using Large Language Models (LLMs)? You might wonder how this is possible, given that LLMs primarily operate with text. Fortunately, we can bridge this gap by extending the capabilities of LLMs through the integration of a Text-to-Speech module.

There are two primary approaches to harnessing LLMs for this endeavor: fine-tuning your own LLM and utilizing a general-purpose LLM (whether open-source or closed-source). Fine-tuning, though effective, demands a considerable investment of time and resources. It involves tasks such as gathering and preparing training data, fine-tuning the model through multiple iterations until it meets our criteria, and ultimately deploying the final model into production.
Conversely, general LLMs have limitations on the length of input prompts (unless you are using an exceptionally long-context model like Anthropic's Claude). Moreover, to fully leverage the capabilities of general LLMs, effective prompt engineering is essential. However, when we compare these two approaches, utilizing general LLMs emerges as the easier path for creating a Proof of Concept (POC). If the aim is to develop a highly refined model capable of replicating ourselves convincingly, then fine-tuning becomes the preferred route.

In the course of this article, we will explore how to harness one of the general LLMs provided by AI21Labs and delve into the art of creating a digital clone of oneself through prompt engineering. While we will touch upon the basics of fine-tuning, we will not delve deeply into this process, as it warrants a separate article of its own.

Without further ado, take a deep breath, make yourself comfortable, and get ready to learn how to build a clone of yourself with an LLM!

Glimpse of Fine-tuning your LLM

As mentioned earlier, we won't delve into the intricate details of fine-tuning a Large Language Model (LLM) to achieve our objective of building a digital clone of ourselves. Nevertheless, in this section, we'll provide a high-level overview of the steps involved in creating such a clone through fine-tuning an LLM.

1. Data Collection

The journey begins with gathering all the relevant data needed for fine-tuning the LLM. This dataset should ideally comprise our historical conversational data, which can be sourced from various platforms like WhatsApp, Telegram, LINE, email, and more. It's essential to cast a wide net and collect as much pertinent data as possible to ensure the model's accuracy in replicating our conversational style and nuances.

2. Data Preparation

Once the dataset is amassed, the next crucial step is data preparation.
This phase involves several tasks:

●  Data Formatting: Converting the collected data into the format required by the fine-tuning process.
●  Noise Removal: Cleaning the dataset by eliminating any irrelevant or noisy information that could negatively impact the model's training.
●  Resampling: In some cases, it may be necessary to resample the data to ensure a balanced and representative dataset for training.

3. Model Training

With the data prepared and in order, it's time to proceed to the model training phase. Techniques such as QLoRA have made it possible to fine-tune LLMs on consumer-grade GPUs, which makes this step far more accessible and affordable. During this stage, the LLM learns from the provided dataset, adapting its language generation capabilities to mimic our conversational style and patterns.

4. Iterative Refinement

Fine-tuning an LLM is an iterative process. After training the initial model, we need to evaluate its performance. This evaluation may reveal areas for improvement. It's common to iterate between model training and evaluation, making incremental adjustments to enhance the model's accuracy and fluency.

5. Model Evaluation

The evaluation phase is critical in assessing the model's ability to replicate our conversational style and content accurately. Evaluations may include measuring the model's response coherence, relevance, and similarity to our past conversations.

6. Deployment

Once we've achieved a satisfactory level of performance through multiple iterations, the next step is deploying the fine-tuned model. Deploying an LLM is a complex task that involves setting up infrastructure to host the model and handle user requests. An example of a robust inference server suitable for this purpose is Text Generation Inference. You can refer to my other article for this.
Deploying the model effectively ensures that it can be accessed and used in various applications.

Building the Clone of Yourself with General LLM

Let's learn how to build a clone of yourself with a general LLM through prompt engineering! In this article, we'll use j2-ultra, the biggest and most powerful model provided by AI21Labs. Note that AI21Labs gives us a free trial for 3 months with $90 credits. These free credits are very useful for building a POC for this project.

The first thing we need to do is create the prompt and test it in the playground. To do this, log in with your AI21Labs account and go to the AI21Studio. If you don't have an account yet, you can create one by following the steps provided on the website; it's very straightforward. Once you're on the Studio page, go to the Foundation Models page and choose the j2-ultra model. AI21Labs provides three foundation models; in this article, we'll use j2-ultra, the most capable of them.

Once we're in the playground, we can experiment with the prompt that we want to try. Here is an example prompt that you can start with. All you need to do is adjust the prompt with your own information.

    Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends.
    Important information to follow:
    - His hobbies are writing articles and watching movies
    - He has 3 main strengths: strong-willed, fast-learner, and effective.
    - He is currently based in Bandung, Indonesia.
    - He prefers to Work From Home (WFH) compared to Work From Office
    - He is currently working as an NLP Engineer at Yellow.ai.
    - He pursued a Mathematics major in Bandung Institute of Technology.
    - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
    - He learns Data Science through online courses, competitions, internship, and side-projects.
    - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
    - He is a Vegan since 2007! He loves all vegan foods except tomatoes.
    User: Hi, what's up?
    Louis: Hey, doing good here! How are u?
    User: All's good. Just wondering, I knew that you're into NLP, can you please give me some recommendation on how to learn?
    Louis: Sure thing man! I personally learned Data Science through online courses, competitions, internship, and side-projects. However, my top recommendation is to create your own personal projects and joining competitions. You can learn a lot from those!
    User: Nice. What personal projects to start?
    Louis: You can start with the topic that you're really interested at. For example, if you're interested at soccer, you can maybe create a data analysis on how one soccer team strategy can gives a better chance for them to win their matches.
    User: Awesome! thanks man, will ping you again if I have any doubts. Is it okay?
    Louis: Absolutely! Feel free, good day!
    ##
    Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends.
    Important information to follow:
    - His hobbies are writing articles and watching movies
    - He has 3 main strengths: strong-willed, fast-learner, and effective.
    - He is currently based in Bandung, Indonesia.
    - He prefers to Work From Home (WFH) compared to Work From Office
    - He is currently working as an NLP Engineer at Yellow.ai.
    - He pursued a Mathematics major in Bandung Institute of Technology.
    - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
    - He learns Data Science through online courses, competitions, internship, and side-projects.
    - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
    - He is a Vegan since 2007! He loves all vegan foods except tomatoes.
    User: Hey, I stumbled upon your IG and realized that you're a Vegan?!
    Louis: Oh yeah man. I'm a Vegan since 2007!
    User: Awesome! Out of curiosity, what made you to decide become a Vegan?
    Louis: Oh mostly it's because of my family beliefs and also to help save the planet.
    User: Got it. Cool! Anyway, what are you up to lately?
    Louis: Lately I spend my time to work on my full-time job and also writes articles in my spare time.
    User: Cool man, keep up the good work!
    ##
    Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends.
    Important information to follow:
    - His hobbies are writing articles and watching movies
    - He has 3 main strengths: strong-willed, fast-learner, and effective.
    - He is currently based in Bandung, Indonesia.
    - He prefers to Work From Home (WFH) compared to Work From Office
    - He is currently working as an NLP Engineer at Yellow.ai.
    - He pursued a Mathematics major in Bandung Institute of Technology.
    - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
    - He learns Data Science through online courses, competitions, internship, and side-projects.
    - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
    - He is a Vegan since 2007! He loves all vegan foods except tomatoes.
    User: Hey!
    Louis:

This prompt works by showing the model a few example conversations before the real one, a technique commonly called few-shot prompting.
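Because the persona block is repeated before every example conversation, it can be convenient to assemble the prompt programmatically instead of editing it by hand. The following is a minimal sketch of that idea, not part of the AI21Labs tooling: the function name `build_few_shot_prompt` and its parameters are hypothetical, and it assumes the same layout as the prompt above (persona, bulleted facts, example turns, and `##` between examples).

```python
def build_few_shot_prompt(persona, facts, examples, separator="##"):
    """Assemble a few-shot prompt in the layout used above (hypothetical helper).

    persona:  one-paragraph description of the person to imitate
    facts:    list of strings, rendered as "- fact" bullets
    examples: list of conversations, each a list of (speaker, text) pairs
    """
    header_lines = [persona, "Important information to follow:"]
    header_lines += [f"- {fact}" for fact in facts]
    header = "\n".join(header_lines)

    blocks = []
    for conversation in examples:
        turns = [f"{speaker}: {text}" for speaker, text in conversation]
        blocks.append(header + "\n" + "\n".join(turns))

    # The last block repeats the persona only, leaving room for the live conversation.
    blocks.append(header)
    return f"\n{separator}\n".join(blocks) + "\n"


prompt = build_few_shot_prompt(
    persona="Louis is an AI Research Engineer/Data Scientist from Indonesia.",
    facts=["His hobbies are writing articles and watching movies"],
    examples=[[("User", "Hi, what's up?"), ("Louis", "Hey, doing good here! How are u?")]],
)
print(prompt)
```

The returned string ends with the persona block, so new "User:" turns can be appended directly to it during a conversation.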
Using this prompt is straightforward: we append the user's message to the end of the prompt, and the model generates a reply in your voice. Once the reply is generated, we add it back to the prompt and wait for the user's next message, which is appended in turn. Since this is a loop, it's better to wrap it in a function. The following is an example of a function that handles the conversation, along with the code to call the AI21Labs model from Python. Note that it assumes the response layout of the legacy ai21 Python SDK, where the generated text sits under completions[0].data.text.

    import ai21

    ai21.api_key = 'YOUR_API_KEY'

    def talk_to_your_clone(prompt):
        # `prompt` holds the few-shot examples and should end right after
        # the last persona block (without the final "User: Hey!" turn).
        while True:
            user_message = input()
            # Add the user's turn and cue the model to answer as Louis.
            prompt += "User: " + user_message + "\nLouis:"
            response = ai21.Completion.execute(
                model="j2-ultra",
                prompt=prompt,
                numResults=1,
                maxTokens=100,
                temperature=0.5,
                topKReturn=0,
                topP=0.9,
                stopSequences=["##", "User:"],
            )
            # Assumption: legacy SDK response layout; extract the generated text.
            answer = response["completions"][0]["data"]["text"].strip()
            # Keep the reply in the prompt so the next turn sees the full history.
            prompt += " " + answer + "\n"
            print(answer)

Conclusion

Congratulations on making it to this point! Throughout this article, you have learned the main ways to create a clone of yourself, the detailed steps to build one with a general LLM provided by AI21Labs, and example code that you can customize for your own needs. Best of luck with your experiments in creating a clone of yourself, and see you in the next article!

Author Bio

Louis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech.
Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.

Packt
18 Feb 2016
22 min read

Understanding Material Design

Material can be thought of as something like smart paper. Like paper, it has surfaces and edges that reflect light and cast shadows, but unlike paper, material has properties that real paper does not, such as its ability to move, change its shape and size, and merge with other material. Despite this seemingly magical behavior, material should be treated like a physical object with a physicality of its own. Material can be seen as existing in a three-dimensional space, and it is this that gives its interfaces a reassuring sense of depth and structure. Hierarchies become obvious when it is instantly clear whether an object is above or below another. Based largely on age-old principles taken from color theory, animation, traditional print design, and physics, material design provides a virtual space where developers can use surface and light to create meaningful interfaces and movement to design intuitive user interactions. (For more resources related to this topic, see here.) Material properties As mentioned in the introduction, material can be thought of as being bound by physical laws. There are things it can do and things it cannot. It can split apart and heal again, and change color and shape, but it cannot occupy the same space as another sheet of material or rotate around two of its axes. We will be dealing with these properties throughout the book, but it is a good idea to begin with a quick look at the things material can and can't do. The third dimension is fundamental when it comes to material. This is what gives the user the illusion that they are interacting with something more tangible than a rectangle of light. The illusion is generated by the widening and softening of shadows beneath material that is closer to the user. Material exists in virtual space, but a space that, nevertheless, represents the real dimensions of a phone or tablet. 
The x axis can be thought of as running between the left and right edges of the screen, the y axis between the top and bottom, and the z axis as confined to the space between the back of the handset and the glass of the screen. It is for this reason that material should not rotate around the x or y axes, as this would break the illusion of a space inside the phone.

The basic laws of the physics of material are outlined as follows:

● All material is 1 dp thick (along the z axis).
● Material is solid: only one sheet can exist in one place at a time, and material cannot pass through other material. For example, if a card needs to move past another, it must move over it.
● Elevation, or position along the z axis, is portrayed by shadow, with higher objects having wider, softer shadows.
● The z axis should be used to prompt interaction. For example, an action button rising up toward the user to demonstrate that it can be used to perform some action.
● Material does not fold or bend.
● Material cannot appear to rise higher than the screen surface.
● Material can grow and shrink along both x and y axes.
● Material can move along any axis.
● Material can be spontaneously created and destroyed, but never without movement. The arrivals and departures of material components must be animated. For example, a card growing from the point that it was summoned from or sliding off the screen when dismissed.
● A sheet of material can split apart anywhere along the x or y axes, and join together again with its original partner or with other material.

This covers the basic rules of material behavior, but we have said nothing of its content. If material can be thought of as smart paper, then its content can only be described as smart ink.
The rules governing how ink behaves are a little simpler:

● Material content can be text, imagery, or any other form of visual digital content
● Content can be of any shape or color and behaves independently from its container material
● It cannot be displayed beyond the edges of its material container
● It adds nothing to the thickness (z axis) of the material it is displayed on

Setting up a development environment

The Android development environment consists mainly of two distinct components: the SDK, which provides the code libraries behind Android, and Android Studio, a powerful code editor that is used for constructing and testing applications for Android phones and tablets, Wear, TV, Auto, Glass, and Cardboard. Both components can be downloaded as a single package from http://developer.android.com/sdk/index.html.

Installing Android Studio

The installation is very straightforward. Run the Android Studio bundle and follow the on-screen instructions, installing HAXM hardware acceleration if prompted, and selecting all SDK components.

Android Studio is dependent on the Java JDK. If you have not previously installed it, this will be detected while you are installing Android Studio, and you will be prompted to download and install it. If for some reason it does not, it can be found at http://www.oracle.com/technetwork/java/javase/downloads/index.html, from where you should download the latest version.

This is not quite the end of the installation process. There are still some SDK components that we will need to download manually before we can build our first app. As we will see next, this is done using the Android SDK Manager.

Configuring the Android SDK

People often refer to Android versions by name, such as Lollipop, or an identity number, such as 5.1.1. As developers, it makes more sense to use the API level, which in the case of Android 5.1.1 would be API level 22.
The SDK provides a platform for every API level since API level 8 (Android 2.2). In this section, we will use the SDK Manager to take a closer look at Android platforms, along with the other tools included in the SDK. Start a new Android Studio project or open an existing one with the minimum SDK at 21 or higher. You can then open the SDK manager from the menu via Tools | Android | SDK Manager or the matching icon on the main toolbar. The Android SDK Manager can also be started as a stand alone program. It can be found in the /Android/sdk directory, as can the Android Virtual Device (AVD) manager. As can be seen in the preceding screenshot, there are really three main sections in the SDK: A Tools folder A collection of platforms An Extras folder All these require a closer look. The Tools directory contains exactly what it says, that is, tools. There are a handful of these but the ones that will concern us are the SDK manager that we are using now, and the AVD manager that we will be using shortly to create a virtual device. Open the Tools folder. You should find the latest revisions of the SDK tools and the SDK Platform-tools already installed. If not, select these items, along with the latest Build-tools, that is, if they too have not been installed. These tools are often revised, and it is well worth it to regularly check the SDK manager for updates. When it comes to the platforms themselves, it is usually enough to simply install the latest one. This does not mean that these apps will not work on or be available to devices running older versions, as we can set a minimum SDK level when setting up a project, and along with the use of support libraries, we can bring material design to almost any Android device out there. If you open up the folder for the latest platform, you will see that some items have already been installed. Strictly speaking, the only things you need to install are the SDK platform itself and at least one system image. 
System images are copies of the hard drives of actual Android devices and are used with the AVD to create emulators. Which images you use will depend on your system and the form factors that you are developing for. In this book, we will be building apps for phones and tablets, so make sure you use one of these at least. Although they are not required to develop apps, the documentation and samples packages can be extremely useful. At the bottom of each platform folder are the Google APIs and corresponding system images. Install these if you are going to include Google services, such as Maps and Cloud, in your apps. You will also need to install the Google support libraries from the Extras directory, and this is what we will cover next. The Extras folder contains various miscellaneous packages with a range of functions. The ones you are most likely to want to download are listed as follows: Android support libraries are invaluable extensions to the SDK that provide APIs that not only facilitate backwards compatibility, but also provide a lot of extra components and functions, and most importantly for us, the design library. As we are developing on Android Studio, we need only install the Android Support Repository, as this contains the Android Support Library and is designed for use with Android. The Google Play services and Google Repository packages are required, along with the Google APIs mentioned a moment ago, to incorporate Google Services into an application. You will most likely need the Google USB Driver if you are intending to test your apps on a real device. How to do this will be explained later in this chapter. The HAXM installer is invaluable if you have a recent Intel processor. Android emulators can be notoriously slow, and this hardware acceleration can make a noticeable difference. 
Once you have downloaded your selected SDK components, depending on your system and/or project plans, you should have a list of installed packages similar to the one shown next: The SDK is finally ready, and we can start developing material interfaces. All that is required now is a device to test it on. This can, of course, be done on an actual device, but generally speaking, we will need to test our apps on as many devices as possible. Being able to emulate Android devices allows us to do this. Emulating Android devices The AVD allows us to test our designs across the entire range of form factors. There are an enormous number of screen sizes, shapes, and densities around. It is vital that we get to test our apps on as many device configurations as possible. This is actually more important for design than it is for functionality. An app might operate perfectly well on an exceptionally small or narrow screen, but not look as good as we had wanted, making the AVD one of the most useful tools available to us. This section covers how to create a virtual device using the AVD Manager. The AVD Manager can be opened from within Android Studio by navigating to Tools | Android | AVD Manager from the menu or the corresponding icon on the toolbar. Here, you should click on the Create Virtual Device... button. The easiest way to create an emulator is to simply pick a device definition from the list of hardware images and keep clicking on Next until you reach Finish. However, it is much more fun and instructive to either clone and edit an existing profile, or create one from scratch. Click on the New Hardware Profile button. This takes you to the Configure Hardware Profile window where you will be able to create a virtual device from scratch, configuring everything from cameras and sensors, to storage and screen resolution. 
When you are done, click on Finish and you will be returned to the hardware selection screen, where your new device will have been added.

As you will have seen from the Import Hardware Profiles button, it is possible to download system images for many devices not included with the SDK. Check the developer sections of device vendors' websites to see which models are available.

So far, we have only configured the hardware for our virtual device. We must now select all the software it will use. To do this, select the hardware profile you just created and press Next. In the following window, select one of the system images you installed earlier and press Next again. This takes us to the Verify Configuration screen where the emulator can be fine-tuned. Most of these configurations can be safely left as they are, but you will certainly need to play with the scale when developing for high density devices. It can also be very useful to be able to use a real SD card. Once you click on Finish, the emulator will be ready to run.

An emulator can be rotated through 90 degrees with left Ctrl + F12. The menu can be called with F2, and the back button with ESC. There are keyboard commands to emulate most physical buttons, such as call, power, and volume; a complete list can be found at http://developer.android.com/tools/help/emulator.html.

Android emulators are notoriously slow, during both loading and operating, even on quite powerful machines. The Intel hardware accelerator we encountered earlier can make a significant difference. Between the two choices offered, the one that you use should depend on how often you need to open and close a particular emulator. More often than not, taking advantage of your GPU is the more helpful of the two.

Apart from this built-in assistance, there are a few other things you can do to improve performance, such as setting lower pixel densities, increasing the device's memory, and building the app for lower API levels.
If you are comfortable doing so, set up exclusions in your anti-virus software for the Android Studio and SDK directories.

There are several third-party emulators, such as Genymotion, that are not only faster, but also behave more like real devices.

The slowness of Android emulators is not necessarily a big problem, as most early development needs only one device, and real devices suffer none of the performance issues found on emulators. As we shall see next, real devices can be connected to our development environment with very little effort.

Connecting a real device

Using an actual physical device to run and test applications does not have the flexibility that emulators provide, but it does have one or two advantages of its own. Real devices are faster than any emulator, and you can test features unavailable to a virtual device, such as accessing sensors, and making and receiving calls.

There are two steps involved in setting up a real phone or tablet. We need to set developer options on the handset and configure the USB connection with our development computer:

1. To enable developer options on your handset, navigate to Settings | About phone.
2. Tap on Build number 7 times to enable Developer options, which will now be available from the previous screen.
3. Open this to enable USB debugging and Allow mock locations.
4. Connect the device to your computer and check that it is connected as a Media device (MTP).

Your handset can now be used as a test device. Connect the device to your computer with a USB cable, start Android Studio, and open a project. Depending on your setup, it is quite possible that you are already connected. If not, you can install the Google USB driver by following these steps:

1. From the Windows start menu, open the device manager.
2. Your handset can be found under Other Devices or Portable Devices. Open its Properties window and select the Driver tab.
3. Update the driver with the Google version, which can be found in the sdk\extras\google\usb_driver directory.

An application can be compiled and run from Android Studio by selecting Run 'app' from the Run menu, pressing Shift + F10, or clicking on the green play icon on the toolbar. Once the project has finished building, you will be asked to confirm your choice of device before the app loads and then opens on your handset.

With a fully set up development environment and devices to test on, we can now start taking a look at material design, beginning with the material theme that is included as the default in all SDKs with API level 21 or higher.

The material theme

Since API level 21 (Android 5.0), the material theme has been the built-in user interface. It can be utilized and customized, simplifying the building of material interfaces. However, it is more than just a new look; the material theme also provides the automatic touch feedback and transition animations that we associate with material design.

To better understand Android themes and how to apply them, we need to understand how Android styles work, and a little about how screen components, such as buttons and text boxes, are defined. Most individual screen components are referred to as widgets or views. Views that contain other views are called view groups, and they generally take the form of a layout, such as the relative layout we will use in a moment.

An Android style is a set of graphical properties defining the appearance of a particular screen component. Styles allow us to define everything from font size and background color to padding, elevation, and much more. An Android theme is simply a style applied across a whole screen or application. The best way to understand how this works is to put it into action and apply a style to a working project. This will also provide a great opportunity to become more familiar with Android Studio.
Applying styles Styles are defined as XML files and are stored in the resources (res) directory of Android Studio projects. So that we can apply different styles to a variety of platforms and devices, they are kept separate from the layout code. To see how this is done, start a new project, selecting a minimum SDK of 21 or higher, and using the blank activity template. To the left of the editor is the project explorer pane. This is your access point to every branch of your project. Take a look at the activity_main.xml file, which would have been opened in the editor pane when the project was created. At the bottom of the pane, you will see a Text tab and a Design tab. It should be quite clear, from examining these, how the XML code defines a text box (TextView) nested inside a window (RelativeLayout). Layouts can be created in two ways: textually and graphically. Usually, they are built using a combination of both techniques. In the design view, widgets can be dragged and dropped to form layout designs. Any changes made using the graphical interface are immediately reflected in the code, and experimenting with this is a fantastic way to learn how various widgets and layouts are put together. We will return to both these subjects in detail later on in the book, but for now, we will continue with styles and themes by defining a custom style for the text view in our Hello world app. Open the res node in the project explorer; you can then right-click on the values node and select the New | Values resource file from the menu. 
Call this file my_style and fill it out as follows:

    <?xml version="1.0" encoding="utf-8"?>
    <resources>
        <style name="myStyle">
            <item name="android:layout_width">match_parent</item>
            <item name="android:layout_height">wrap_content</item>
            <item name="android:elevation">4dp</item>
            <item name="android:gravity">center_horizontal</item>
            <item name="android:padding">8dp</item>
            <item name="android:background">#e6e6e6</item>
            <item name="android:textSize">32sp</item>
            <item name="android:textColor">#727272</item>
        </style>
    </resources>

This style defines several graphical properties, most of which should be self-explanatory with the possible exception of gravity, which here refers to how content is justified within the view. We will cover measurements and units later in the book, but for now, it is useful to understand dp and sp:

● Density-independent pixel (dp): Android runs on an enormous number of devices, with screen densities ranging from 120 dpi to 480 dpi and more. To simplify the process of developing for such a wide variety, Android uses a virtual pixel unit based on a 160 dpi screen. This allows us to develop for a particular screen size without having to worry about screen density.
● Scale-independent pixel (sp): This unit is designed to be applied to text. The reason it is scale-independent is that the actual text size on a user's device will depend on their font size settings.

To apply the style we just defined, open the activity_main.xml file (from res/layout, if you have closed it) and edit the TextView node so that it matches this:

    <TextView
        style="@style/myStyle"
        android:text="@string/hello_world" />

The effects of applying this style can be seen immediately from the design tab or preview pane, and having seen how styles are applied, we can now go ahead and create a style to customize the material theme palette.
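To make the dp unit described above more concrete, here is a small worked sketch of the arithmetic behind it. It only illustrates the 160 dpi baseline (px = dp × dpi / 160); the function name `dp_to_px` is hypothetical, and the rounding used here is an assumption rather than a description of how Android itself rounds.

```python
def dp_to_px(dp, dpi):
    """Convert density-independent pixels to physical pixels.

    Uses the 160 dpi baseline: one dp equals one physical pixel at 160 dpi.
    """
    return round(dp * dpi / 160)


# The 8dp padding from myStyle on three illustrative screen densities.
for dpi in (160, 320, 480):
    print(dpi, "dpi ->", dp_to_px(8, dpi), "px")
```

The same 8dp padding therefore covers more physical pixels on denser screens, which is what keeps its apparent size roughly constant across devices.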
Customizing the material theme One of the most useful features of the material theme is the way it can take a small palette made of only a handful of colors and incorporate these colors into every aspect of a UI. Text and cursor colors, the way things are highlighted, and even system features such as the status and navigation bars can be customized to give our apps brand colors and an easily recognizable look. The use of color in material design is a topic in itself, and there are strict guidelines regarding color, shade, and text, and these will be covered in detail later in the book. For now, we will just look at how we can use a style to apply our own colors to a material theme. So as to keep our resources separate, and therefore easier to manage, we will define our palette in its own XML file. As we did earlier with the my_style.xml file, create a new values resource file in the values directory and call it colors. Complete the code as shown next: <?xml version="1.0" encoding="utf-8"?> <resources>     <color name="primary">#FFC107</color>     <color name="primary_dark">#FFA000</color>     <color name="primary_light">#FFECB3</color>     <color name="accent">#03A9F4</color>     <color name="text_primary">#212121</color>     <color name="text_secondary">#727272</color>     <color name="icons">#212121</color>     <color name="divider">#B6B6B6</color> </resources> In the gutter to the left of the code, you will see small, colored squares. Clicking on these will take you to a dialog with a color wheel and other color selection tools for quick color editing. We are going to apply our style to the entire app, so rather than creating a separate file, we will include our style in the theme that was set up by the project template wizard when we started the project. This theme is called AppTheme, as can be seen by opening the res/values/styles/styles.xml (v21) file. 
Edit the code in this file so that it looks like the following: <?xml version="1.0" encoding="utf-8"?> <resources>     <style name="AppTheme" parent="android:Theme.Material.Light">         <item name="android:colorPrimary">@color/primary</item>         <item name="android:colorPrimaryDark">@color/primary_dark</item>         <item name="android:colorAccent">@color/accent</item>         <item name="android:textColorPrimary">@color/text_primary</item>         <item name="android:textColor">@color/text_secondary</item>     </style> </resources> Being able to set key colors, such as colorPrimary and colorAccent, allows us to incorporate our brand colors throughout the app, although the project template only shows us how we have changed the color of the status bar and app bar. Try adding radio buttons or text edit boxes to see how the accent color is applied. In the following figure, a time picker replaces the original text view: The XML for this looks like the following lines: <TimePicker     android:layout_width="wrap_content"     android:layout_height="wrap_content"     android:layout_alignParentBottom="true"     android:layout_centerHorizontal="true" /> For now, it is not necessary to know all the color guidelines. Until we get to them, there is an online material color palette generator at http://www.materialpalette.com/ that lets you try out different palette combinations and download color XML files that can simply be copied and pasted into the editor. With a complete and up-to-date development environment constructed, and a way to customize and adapt the material theme, we are now ready to look into how material-specific widgets, such as card views, are implemented. Summary The Android SDK, Android Studio, and AVD comprise a sophisticated development toolkit, and even setting them up is no simple task. But, with our tools in place, we were able to take a first look at one of material design's major components: the material theme. 
We have seen how themes and styles relate, and how to create and edit styles in XML. Finally, we have touched on material palettes, and how to customize a theme to utilize our own brand colors across an app. With these basics covered, we can move on to explore material design further, and in the next chapter, we will look at layouts and material components in greater detail. To learn more about material design, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Instant Responsive Web Design (https://www.packtpub.com/web-development/instant-responsive-web-design-instant) Mobile Game Design Essentials (https://www.packtpub.com/game-development/mobile-game-design-essentials) Resources for Article: Further resources on this subject: Speaking Java – Your First Game [article] Metal API: Get closer to the bare metal with Metal API [article] Looking Good – The Graphical Interface [article]

Packt
19 Nov 2013
6 min read

Setting Up WooCommerce

So, you're already familiar with WordPress and know how to use plugins, widgets, and themes? Is your next step to expand your existing WordPress website or blog with an online store? In that case, you've come to the right place! WooCommerce is a versatile plugin for WordPress that lets anyone with a little WordPress knowledge start their own online store. If you are not familiar with WordPress at all, this book is not the first one you should read. No worries though: WordPress isn't that hard to learn, and there are plenty of online resources for learning it quickly. Alternatively, turn to one of the many printed books on WordPress that are available. In this article, we'll cover installing and activating WooCommerce, and everything you need to know to set it up correctly. Preparing for takeoff Before we start, remember that it's only possible to install your own plugins if you're working in your own WordPress installation. This means that users running a website on WordPress.com will not be able to follow along, because that environment does not let you install plugins yourself. Although installing WooCommerce on top of WordPress isn't difficult, we highly recommend that you set up a test environment first. Without going into too much depth, this is what you need to do: Create a backup copy of your complete WordPress environment using FTP. Alternatively, use a plugin to store a copy in your Dropbox folder automatically. There are many solutions available; just pick your favorite. UpDraftPlus is one of the possibilities and delivers a complete backup solution: http://wordpress.org/plugins/updraftplus/. Don't forget to back up your WordPress database as well. You can do this with a tool such as phpMyAdmin by creating an export from there. Here too, there are plugins that make life easier. 
The UpDraftPlus plugin mentioned previously can perform this task as well. Once your backups are complete, install XAMPP, which can be downloaded from http://www.apachefriends.org, on a local (Windows) machine. Although XAMPP is available for Mac users, MAMP is a widely used alternative for this group. MAMP can be downloaded from http://www.mamp.info/en/index.html. Restore your WordPress backup on your test server and follow the remaining part of this book in your new test environment. Alternatively, install a copy of your WordPress website as a temporary subdomain at your hosting provider. For instance, if my website is http://www.example.com, I could easily create a copy of my site at http://test.example.com. Possibilities may vary, depending on the package you have with your hosting provider. If you don't need to add WooCommerce to an existing WordPress site, you may of course start from scratch. Just install WordPress on a local test server or at your hosting provider. To keep the instructions in this book as clear as possible, we did just that: we created a fresh installation of WordPress Version 3.6. Next, you see a screenshot of our fresh WordPress installation: Are these short instructions just too much for you at this moment? Do you need a more detailed step-by-step guide to create a test environment for your WordPress website? Look at the following tutorials: For Mac OS X users: http://wpmu.org/local-wordpresstest-environment-mamp-osx/ For Windows users: http://www.thegeekscope.com/howto-copy-a-live-wordpress-website-to-local-indowsenvironment/ More tutorials will also be available on our website: http://www.joomblocks.com Don't forget to sign up for the free newsletter, which will bring you even more news and tutorials on WordPress, WooCommerce, and other open source software solutions! Once ready, we'll be able to take the next step and install the WooCommerce plugin. Let's take a look at our WordPress backend. 
In our situation we can open this by browsing to http://localhost/wp36/wp-admin. Depending on the choices you made previously for your test environment, your URL could be different. Well, this should all be pretty familiar to you already. Again, your situation might look different, depending on your theme or the number of plugins already active for your website. Installing WooCommerce Installing a plugin is a fairly simple task: Click on Plugins in the menu on the left and click on Add New. Next, simply enter woocommerce in the Search field and click on Search Plugins. Verify that the correct plugin is shown on top and click on Install Now. Confirm the warning message that appears by clicking on OK. Click on Activate Plugin. Note that in the following screenshot, we're installing Version 2.0.13 of WooCommerce. New versions will follow rather quickly, so you might already see a higher version number. WooCommerce needs a number of specific WordPress pages, which it will automatically set up for you. Just click on the Install WooCommerce Pages button and make sure not to forget this step! In our example project, we're installing the English version of WooCommerce, but you might need a different language. By default, WooCommerce is already delivered in a number of languages, and the installation automatically follows the language of your WordPress installation. If you need something else, just browse through the plugin directory on WordPress.org to find any additional translations. Once we have created the necessary pages, the WooCommerce welcome screen will appear and you will see that a new menu item has been added to the main menu on the left. Meanwhile, the plugin has created the necessary pages, which you can access by clicking on Pages in the menu on the left. Note that if you open a page that was automatically created by WooCommerce, you'll only see a shortcode, which is used to call the needed functionality. 
Do not delete the shortcodes, or WooCommerce might stop working. However, it's still possible to add your own content before or after the shortcode on these pages. WooCommerce also added some widgets to your WordPress dashboard, giving an overview of the latest product and sales statistics. At this moment this is all still empty of course. Summary In this article, we learned about the basics of WooCommerce and installing the same. We also learned that WooCommerce is a free but versatile plugin for WordPress, that you may use to easily set up your own online store. Resources for Article: Further resources on this subject: Django 1.2 E-commerce: Generating PDF Reports from Python using ReportLab [Article] Increasing sales with Brainshark slideshows/documents [Article] Implementing OpenCart Modules [Article]
Packt
14 Jan 2016
9 min read

Unleashing Creativity with MIT App Inventor 2

This article by Felicia Kamriani and Dr. Krishnendu Roy, authors of the book App Inventor 2 Essentials, covers what MIT App Inventor 2 is and why you should learn to use it. There are apps for just about everything: entertainment, socializing, dining, traveling, philanthropy, shopping, education, navigation, and so on. And just about everyone with a smartphone or tablet is using them to make their lives easier or better. But you have decided to move from just using mobile apps to creating mobile apps. Congratulations! Thanks to MIT App Inventor 2, mobile app development is no longer exclusively the realm of experienced software programmers. It empowers anyone with an idea to create technology. This book offers people of all ages a step-by-step guide to creating mobile apps with App Inventor. While this visual programming language is an ideal tool for people who have little or no coding experience, don’t be fooled into thinking that the capabilities of MIT App Inventor 2 are basic! The simple drag-and-drop blocks format is actually a powerful software language capable of creating complex and sophisticated mobile apps. The purpose of this article is to provide an overview of MIT App Inventor 2, and of your new role as a mobile app developer. You are in for more skill building than you ever imagined! Of course, you will learn to code mobile apps, but there are countless other valuable skills woven into the mobile app building process. Most significantly, you will learn to think differently, master the design-thinking process, become a problem solver, and be resourceful. This article also offers tips on brainstorming app ideas and design principles. Lastly, it reveals the potential of MIT App Inventor 2 and showcases an array of mobile apps so that you, a budding app designer, can begin thinking about the full spectrum of possibilities. What is MIT App Inventor 2? 
MIT App Inventor 2 is a free, drag-and-drop, blocks-based visual programming language that enables people, regardless of coding experience, to create mobile apps for Android devices. In 2008, iPhones and Android phones had just hit the market and MIT professor Hal Abelson had the idea to create an easy-to-use programming language to make mobile apps that would harness the power of the emerging smartphone technology. Equipped with fast processors, large memory storage, and sensors, smartphones were enabling people to monitor and interact with their environment like never before. Abelson’s goal was to democratize the mobile app development process by making it easy for anyone to create mobile apps that were meaningful and important to them. While on sabbatical at Google in Mountain View, CA, Abelson worked with engineer Mark Friedman to create App Inventor (yes, it was originally called Google App Inventor!). In 2011, Abelson brought App Inventor to MIT and together with the Media Lab and CSAIL (the Computer Science and Artificial Intelligence Lab) created the Center for Mobile Learning. In December 2013, Abelson and his team of developers launched MIT App Inventor 2, (from here on referred to as MIT App Inventor) an even easier to use web-based application version with an Integrated Development Environment (IDE). This means that you can see your app come to life on your smartphone as you are building it. All you need is a computer (Mac or PC), an internet connection (or a USB connection), a Google Gmail account and an Android device (phone or tablet). But, if you don’t have an Android device, don’t worry! You can still create apps with the on-screen Emulator. The MIT App Inventor (http://appinventor.mit.edu/) browser includes a Designer Screen, a graphical user interface (GUI) where you create the look and feel of the app (choosing which components you want it to include) and the Blocks Editor, where you add behavior to the app by coding with colorful blocks. 
Users build apps by dragging components and blocks from menu bars onto a workspace, and a connected Android device (or Emulator) displays progress in real time. All the apps are saved on the MIT server and, once completed, they can be shared on the MIT App Inventor Gallery, submitted to app contests (such as MIT App of the Month), or uploaded to the Google Play store (or other app marketplaces) for sharing or selling. To date, MIT App Inventor has empowered millions of people to become creators of technology by learning to be mobile app developers. And now, you will become one of them! Understanding your role as a mobile app developer Since you are reading this book, it is safe to assume that not only do you regularly use mobile apps, but on occasion, you have also had the thought, “I wish there were an app for that!” Now, with the help of MIT App Inventor and this guidebook to mobile app development, you will soon be able to say, “I can create an app for that!” In embracing your new role as a mobile app developer, you will not just be learning how to code, but you will also learn an array of other valuable skills. You will learn to think differently. Every time you open an app, you will start looking at it from the developer’s perspective rather than just as a user. You will start noticing which functions are logical and smooth and which are choppy and unintuitive. You will learn to get inspiration from your environment. What type of app could make the attendance process at my club/class/meeting more streamlined or efficient? What app idea could help solve the problem of inaccurate inventory at the gym? You will learn to become a data gatherer without even realizing it. When people make comments about apps, your ears will perk up and you will take note. You will start asking questions such as: why do you prefer Waze to Google Maps? You will learn to think logically so that you can tell the computer in a step-by-step manner how to perform an operation. 
You will learn to become a problem solver. Any coder will confirm that programming is an iterative process. It’s a continual cycle of coding, troubleshooting, and debugging. Trial and error will become second nature, as will taking a step back to figure out why something that worked a minute ago now seems broken. And, you will learn to assume the role of a designer. It is no longer accurate to depict programmers as holed up by themselves at a computer, creating white text-based code on black screens. Coders of mobile apps are also designers who think about and create attractive and intuitive user interfaces (UIs). Much of the design work happens away from the computer; it includes conversations with potential users, involves pens, paper, and post-it notes, and uses storyboards or sketches. Only once you have your app designed on paper do you sit down at the computer to begin coding. And then, you will not find the traditional black and white interface, as the MIT App Inventor platform is interactive and full of colorful blocks that snap together. Brainstorming app ideas Chances are you already have an idea for a mobile app. But if not, how can you think of one? Sometimes, you have so many ideas that it’s hard to narrow them down to just one. The best way to start brainstorming app ideas is by starting with what you know. What’s an app you wish existed? What’s an app you and your friends, co-workers, or family members would use, need, or like? What’s a problem in your community, network, or circle of friends that could be solved with a digital solution? Maybe you loan out books to friends, but don’t have a system to keep track of who borrowed what. Maybe you want to do a clothing swap with people who are your size, so you want to post pictures of the items that you have available for trade and view listed items in your size. Maybe you have a favorite app that you use all the time but wish it had just one other feature. 
Maybe, when you meet your friends in a public place, it’s hard to know if they’re nearby without a lot of texting back and forth, so you want to create an app that shows everyone’s location on one screen. The possibilities are endless! The key to successful brainstorming is to write down all of your ideas no matter how wild they are and talk to people about them to get feedback. Input from others is an essential part of the research needed to ensure that your app idea becomes a successful app that people will want, use, and/or buy. On a recent business trip, we had an idea for a travel app because we always seemed to forget at least one essential item. Over breakfast at the hotel, we discussed the app idea with a couple of colleagues and received amazing insight that we hadn’t thought of like a reminder notification to fill any prescriptions well before the trip and a weather component so we could be sure to pack appropriate clothes for each destination. The more people you talk to, the more market research you will conduct and the more defined the overall app concept will be. Summary This article highlights the many other learning outcomes from engaging in the mobile app development process. Taking an app concept and building it out into an actual mobile app is both a concrete and a creative process. Attention to detail and iteration is vital for both code and design to work effectively and synergistically. Whether you’re creating a game to play with your friend, an app to promote philanthropy involvement on campus, or an app to kickstart a recycling program in your neighborhood, the design thinking process is as much a part of app development as coding. Skills such as brainstorming, research, interviewing, synthesizing, ideating, storyboarding, designing, troubleshooting, problem solving, and testing are not only integral to app building, they are also transferrable to other disciplines, helping to unlock creativity and flow in any endeavor. 
Resources for Article: Further resources on this subject: Google Apps: Surfing the Web [article] Introduction to IT Inventory and Resource Management [article] How to Expand your Knowledge [article]

Packt
27 Dec 2016
24 min read

Planning for Failure (and Success)

In this article by Michael Solberg and Ben Silverman, the authors of the book OpenStack for Architects, we will walk through how to architect your cloud to withstand hardware and software failures. The OpenStack control plane comprises web services, application services, database services, and a message bus. Each of these tiers requires a different approach to make it highly available, and some organizations will already have defined architectures for each of the services. We've seen that customers either reuse those existing patterns or adopt new ones which are specific to the OpenStack platform. Both of these approaches make sense, depending on the scale of the deployment. Many successful deployments actually implement a blend of these. For example, if your organization already has a supported pattern for highly available MySQL databases, you might choose that pattern instead of the one outlined in this article. If your organization doesn't have a pattern for highly available MongoDB, you might have to architect a new one. Building a highly available control plane Back in the Folsom and Grizzly days, coming up with a high availability (H/A) design for the OpenStack control plane was something of a black art. Many of the technologies recommended in the first iterations of the OpenStack High Availability Guide were specific to the Ubuntu distribution of Linux and were unavailable on the Red Hat Enterprise Linux-derived distributions. The now-standard cluster resource manager (Pacemaker) was unsupported by Red Hat at that time. As such, architects using Ubuntu might use one set of software, those using CentOS or RHEL might use another set of software, and those using a Rackspace or Mirantis distribution might use yet another set of software. However, these days, the technology stack has converged and the H/A pattern is largely consistent regardless of the distribution used. 
About failure and success When we design a highly available OpenStack control plane, we're looking to mitigate two different scenarios. The first is failure: when a physical piece of hardware dies, we want to make sure that we recover without human interaction and continue to provide service to our users. The second and perhaps more important scenario is success. Software systems always work as designed and tested until humans start using them. While our automated test suites will try to launch a reasonable number of virtual objects, humans are guaranteed to attempt to launch an unreasonable number. Also, many of the OpenStack projects we've worked on have grown far past their expected size and need to be expanded on the fly. There are a few different types of success scenarios that we need to plan for when architecting an OpenStack cloud. First, we need to plan for growth in the number of instances. This is relatively straightforward. Each additional instance grows the size of the database, it grows the amount of metering data in Ceilometer, and, most importantly, it will grow the number of compute nodes. Adding compute nodes and reporting puts strain on the message bus, which is typically the limiting factor in the size of OpenStack regions or cells. We'll talk more about this when we talk about dividing up OpenStack clouds into regions, cells, and Availability Zones. The second type of growth we need to plan for is an increase in the number of API calls. Deployments which support Continuous Integration (CI) development environments might have (relatively) small compute requirements, but CI typically brings up and tears down environments rapidly. This will generate a large amount of API traffic, which in turn generates a large amount of database and message traffic. In hosting environments, end users might also manually generate a lot of API traffic as they bring up and down instances, or manually check the status of deployments they've already launched. 
While a service catalog might check the status of instances it has launched on a regular basis, humans tend to hit refresh on their browsers in an erratic fashion. Automated testing of the platform has a tendency to grossly underestimate this kind of behavior. With that in mind, any pattern that we adopt will need to provide for the following requirements: API services must continue to be available during a hardware failure in the control plane; the systems which provide API services must be horizontally scalable (and ideally elastic) to respond to unanticipated demands; the database services must be vertically or horizontally scalable to respond to unanticipated growth of the platform; and the message bus can be either vertically or horizontally scaled, depending on the technology chosen. Finally, every system has its limits. These limits should be defined in the architecture documentation so that capacity planning can account for them. At some point, the control plane has scaled as far as it can and a second control plane should be deployed to provide additional capacity. Although OpenStack is designed to be massively scalable, it isn't designed to be infinitely scalable. High availability patterns for the control plane There are three approaches commonly used in OpenStack deployments these days for achieving high availability of the control plane. The first is the simplest. Take the single-node cloud controller, virtualize it, and then make the virtual machine highly available using either VMware clustering or Linux clustering. While this option is simple and it provides for failure scenarios, it scales vertically (not horizontally) and doesn't provide for success scenarios. As such, it should only be used in regions with a limited number of compute nodes and a limited number of API calls. In practice, this method isn't used frequently and we won't spend any more time on it here. The second pattern provides for H/A, but not horizontal scalability. 
This is the "Active/Passive" scenario described in the OpenStack High Availability Guide. At Red Hat, we used this a lot with our Folsom and Grizzly deployments, but moved away from it starting with Havana. It's similar to the virtualization solution described earlier, but instead of relying on VMware clustering or Linux clustering to restart a failed virtual machine, it relies on Linux clustering to restart failed services on a second cloud controller node, which also runs the same subset of services. This pattern doesn't provide for success scenarios in the web tier, but can still be used in the database and messaging tiers. Some networking services may still need to be provided as Active/Passive as well. The third H/A pattern available to OpenStack architectures is the Active/Active pattern. In this pattern, services are horizontally scaled out behind a load balancing service or appliance, which is itself Active/Passive. As a general rule, most OpenStack services should be enabled as Active/Active where possible to allow for success scenarios while mitigating failure scenarios. Ideally, Active/Active services can be scaled out elastically without service disruption by simply adding additional control plane nodes. Both the Active/Passive and Active/Active designs require clustering software to determine the health of services and the hosts on which they run. In this article, we'll be using Pacemaker as the cluster manager. Some architects may choose to use Keepalived instead of Pacemaker. Active/Passive service configuration In the Active/Passive service configuration, the service is configured and deployed to two or more physical systems. The service is associated with a Virtual IP (VIP) address. A cluster resource manager (normally Pacemaker) is used to ensure that the service and its VIP are enabled on only one of the two systems at any point in time. The resource manager may be configured to favor one of the machines over the other. 
When the machine that the service is running on fails, the resource manager first ensures that the failed machine is no longer running and then it starts the service on the second machine. Ensuring that the failed machine is no longer running is accomplished through a process known as fencing. Fencing usually entails powering off the machine using the management interface on the BIOS. The fence agent may also talk to a power supply connected to the failed server to ensure that the system is down. Some services (such as the Glance image registry) require shared storage to operate. If the storage is network-based, such as NFS, the storage may be mounted on both the active and the passive nodes simultaneously. If the storage is block-based, such as iSCSI, the storage will only be mounted on the active node and the resource manager will ensure that the storage migrates with the service and the VIP. Active/Active service configuration Most of the OpenStack API services are designed to be run on more than one system simultaneously. This configuration, the Active/Active configuration, requires a load balancer to spread traffic across each of the active services. The load balancer manages the VIP for the service and ensures that the backend systems are listening before forwarding traffic to them. The cluster manager ensures that the VIP is only active on one node at a time. The backend services may or may not be managed by the cluster manager in the Active/Active configuration. Service or system failure is detected by the load balancer and failed services are brought out of rotation. There are a few different advantages to the Active/Active service configuration, which are as follows: The first advantage is that it allows for horizontal scalability. If additional capacity is needed for a given service, a new system can be brought up which is running the service and it can be added into rotation behind the load balancer without any downtime. 
The control plane may also be scaled down without downtime in the event that it was over-provisioned. The second advantage is that Active/Active services have a much shorter mean time to recovery. Fencing operations often take up to 2 minutes and fencing is required before the cluster resource manager will move a service from a failed system to a healthy one. Load balancers can immediately detect system failure and stop sending requests to unresponsive nodes while the cluster manager fences them in the background. Whenever possible, architects should employ the Active/Active pattern for the control plane services. OpenStack service specifics In this section, we'll walk through each of the OpenStack services and outline the H/A strategy for them. While most of the services can be configured as Active/Active behind a load balancer, some of them must be configured as Active/Passive and others may be configured as Active/Passive. Some of the configuration is dependent on a particular version of OpenStack as well, especially, Ceilometer, Heat, and Neutron. The following details are current as of the Liberty release of OpenStack. The OpenStack web services As a general rule, all of the web services and the Horizon dashboard may be run Active/Active. These include the API services for Keystone, Glance, Nova, Cinder, Neutron, Heat, and Ceilometer. The scheduling services for Nova, Cinder, Neutron, Heat, and Ceilometer may also be deployed Active/Active. These services do not require a load balancer, as they respond to requests on the message bus. The only web service which must be run Active/Passive is the Ceilometer Central agent. This service can be configured to split its workload among multiple instances, however, to support scaling horizontally. The database services All state for the OpenStack web services is stored in a central database—usually a MySQL database. 
MySQL is usually deployed in an Active/Passive configuration, but can be made Active/Active with the Galera replication extension. Galera is clustering software for MySQL (MariaDB in OpenStack) that uses synchronous replication to achieve H/A. However, even with Galera, we still recommend directing writes to only one of the replicas, because some queries used by the OpenStack services may deadlock when writing to more than one master. With Galera, a load balancer is typically deployed in front of the cluster and is configured to deliver traffic to only one replica at a time. This configuration reduces the mean time to recovery of the service while ensuring that the data is consistent. In practice, many organizations will defer to the database architects for their preference regarding highly available MySQL deployments. After all, it is typically the database administration team that is responsible for responding to failures of that component. Deployments which use the Ceilometer service also require a MongoDB database to store telemetry data. MongoDB is horizontally scalable by design and is typically deployed Active/Active with at least three replicas. The message bus All OpenStack services communicate through the message bus. Most OpenStack deployments these days use the RabbitMQ service as the message bus. RabbitMQ can be configured to be Active/Active through a facility known as "mirrored queues". The RabbitMQ service is not load balanced; instead, each service is given a list of potential nodes, and the client is responsible for determining which nodes are active and which ones have failed. Other messaging services used with OpenStack such as ZeroMQ, ActiveMQ, or Qpid may have different strategies and configurations for achieving H/A and horizontal scalability. For these services, refer to the documentation to determine the optimal architecture. 
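The client-side failover described for the message bus can be sketched in a few lines. This is only an illustration of the idea, not how any OpenStack messaging library is implemented: the host names, the RABBIT_NODES list, and the tcp_probe and pick_active_node helpers are all hypothetical, and the probe checks nothing more than TCP reachability.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def tcp_probe(node: Node, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def pick_active_node(
    nodes: Iterable[Node],
    probe: Callable[[Node], bool] = tcp_probe,
) -> Optional[Node]:
    """Walk the configured node list in order and return the first node
    that answers the probe, or None if every node appears to be down."""
    for node in nodes:
        if probe(node):
            return node
    return None


# Hypothetical node list handed to each service; 5672 is the default AMQP port.
RABBIT_NODES = [
    ("rabbit1.example.com", 5672),
    ("rabbit2.example.com", 5672),
    ("rabbit3.example.com", 5672),
]
```

Because the probe is passed in as a parameter, the selection logic can be exercised without a running broker, for example by substituting a function that treats one host as failed.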
Compute, storage, and network agents The compute, storage, and network components in OpenStack each have a set of services which perform the work scheduled by the API services. These services register themselves with the schedulers on startup over the message bus. The schedulers are responsible for determining the health of the services and scheduling work to active services. The compute and storage services are all designed to be run Active/Active but the network services need some extra consideration. Each hypervisor in an OpenStack deployment runs the nova-compute service. When this service starts up, it registers itself with the nova-scheduler service. A list of currently available nova services is available via the nova service-list command. If a compute node is unavailable, its state is listed as down and the scheduler skips it when performing instance actions. When the node becomes available, the scheduler includes it in the list of available hosts. For KVM or Xen-based deployments, the nova-compute service runs once per hypervisor and is not made highly available. For VMware-based deployments though, a single nova-compute service is run for every vSphere cluster. As such, this service should be made highly available in an Active/Passive configuration. This is typically done by virtualizing the service within a vSphere cluster and configuring the virtual machine to be highly available. Cinder includes a service known as the volume service or cinder-volume. The volume service registers itself with the Cinder scheduler on startup and is responsible for creating, modifying, or deleting LUNs on block storage devices. For backends which support multiple writers, multiple copies of this service may be run in Active/Active configuration. The LVM backend (which is the reference backend) is not highly available, though, and may only have one cinder-volume service for each block device. 
This is because the LVM backend is responsible for providing iSCSI access to a locally attached storage device. For this reason, highly available deployments of OpenStack should avoid the LVM Cinder backend and instead use a backend that supports multiple cinder-volume services. Finally, the Neutron component of OpenStack has a number of agents, which all require some special consideration for highly available deployments. The DHCP agent can be configured as highly available, and the number of agents which will respond to DHCP requests for each subnet is governed by a parameter in the neutron.conf file, dhcp_agents_per_network. This is typically set to 2, regardless of the number of DHCP agents which are configured to run in a control plane. For most of the history of OpenStack, the L3 routing agent in Neutron has been a single point of failure. It could be made highly available in Active/Passive configuration, but its failover meant the interruption of network connections in the tenant space. Many of the third-party Neutron plugins have addressed this in different ways and the reference Open vSwitch plugin has a highly available L3 agent as of the Juno release. For details on implementing a solution to the single routing point of failure using OpenStack's Distributed Virtual Routers (DVR), refer to the OpenStack Foundation's Neutron documentation at http://docs.openstack.org/liberty/networking-guide/scenario-dvr-ovs.html. Regions, cells, and availability Zones As we mentioned before, OpenStack is designed to be scalable, but not infinitely scalable. There are three different techniques architects can use to segregate an OpenStack cloud—regions, cells, and Availability Zones. In this section, we'll walk through how each of these concepts maps to hypervisor topologies. Regions From an end user's perspective, OpenStack regions are equivalent to regions in Amazon Web Services. Regions live in separate data centers and are often named after their geographical location. 
If your organization has a data center in Phoenix and one in Raleigh (like ours does) you'll have at least a PHX and an RDU region. Users who want to geographically disperse their workloads will place some of them in PHX and some of them in RDU. Regions have separate API endpoints, and although the Horizon UI has some support for multiple regions, they are essentially entirely separate deployments. From an architectural standpoint, there are two main design choices for implementing regions, which are as follows: The first is around authorization. Users will want to have the same credentials for accessing each of the OpenStack regions. There are a few ways to accomplish this. The simplest way is to use a common backing store (usually LDAP) for the Keystone service in each region. In this scenario, the user has to authenticate separately to each region to get a token, but the credentials are the same. In Juno and later, Keystone also supports federation across regions. In this scenario, a Keystone token granted by one region can be presented to another region to authenticate a user. While this currently isn't widely used, it is a major focus area for the OpenStack Foundation and will probably see broader adoption in the future. The second major consideration for regional architectures is whether or not to present a single set of Glance images to each region. While work is currently being done to replicate Glance images across federated clouds, most organizations are manually ensuring that the shared images are consistent. This typically involves building a workflow around image publishing and deprecation which is mindful of the regional layout. Another option for ensuring consistent images across regions is to implement a central image repository using Swift. This also requires shared Keystone and Glance services which span multiple data centers. Details on how to design multiple regions with shared services are in the OpenStack Architecture Design Guide. 
Cells The Nova compute service has a concept of cells, which can be used to segregate large pools of hypervisors within a single region. This technique is primarily used to mitigate the scalability limits of the OpenStack message bus. The deployment at CERN makes wide use of cells to achieve massive scalability within single regions. Support for cells varies from service to service and as such cells are infrequently used outside a few very large cloud deployments. The CERN deployment is well-documented and should be used as a reference for these types of deployments. In our experience, it's much simpler to deploy multiple regions within a single data center than to implement cells to achieve large scale. The added inconvenience of presenting your users with multiple API endpoints within a geographic location is typically outweighed by the benefits of having a more robust platform. If multiple control planes are available in a geographic region, the failure of a single control plane becomes less dramatic. The cells architecture has its own set of challenges with regard to networking and scheduling of instance placement. Some very large companies that support the OpenStack effort have been working for years to overcome these hurdles. However, many different OpenStack distributions are currently working on a new control plane design. These new designs would begin to split the OpenStack control plane into containers running the OpenStack services in a microservice type architecture. This way the services themselves can be placed anywhere and be scaled horizontally based on the load. One architecture that has garnered a lot of attention lately is the Kolla project that promotes Docker containers and Ansible playbooks to provide production-ready containers and deployment tools for operating OpenStack clouds. To see more, go to https://wiki.openstack.org/wiki/Kolla. Availability Zones Availability Zones are used to group hypervisors within a single OpenStack region. 
Availability Zones are exposed to the end user and should be used to provide the user with an indication of the underlying topology of the cloud. The most common use case for Availability Zones is to expose failure zones to the user. To ensure the H/A of a service deployed on OpenStack, a user will typically want to deploy the various components of their service onto hypervisors within different racks. This way, the failure of a top-of-rack switch or a PDU will only bring down a portion of the instances which provide the service. Racks form a natural boundary for Availability Zones for this reason. There are a few other interesting uses of Availability Zones apart from exposing failure zones to the end user. One financial services customer we work with had a requirement for the instances of each line of business to run on dedicated hardware. A combination of Availability Zones and the AggregateMultiTenancyIsolation Nova Scheduler filter was used to ensure that each tenant had access to dedicated compute nodes. Availability Zones can also be used to expose hardware classes to end users. For example, hosts with faster processors might be placed in one Availability Zone and hosts with slower processors might be placed in different Availability Zones. This allows end users to decide where to place their workloads based upon compute requirements. Updating the design document In this article, we walked through the different approaches and considerations for achieving H/A and scalability in OpenStack deployments. As Cloud Architects, we need to decide on the correct approach for our deployment and then document it thoroughly so that it can be evaluated by the larger team in our organization. Each of the major OpenStack vendors has a reference architecture for highly available deployments and those should be used as a starting point for the design. 
The design should then be integrated with existing Enterprise Architecture and modified to ensure that best practices established by the various stakeholders within an organization are followed. The system administrators within an organization may be more comfortable supporting Pacemaker than Keepalived. The design document presents the choices made for each of these key technologies and gives the stakeholders an opportunity to comment on them before the deployment. Planning the physical architecture The simplest way to achieve H/A is to add additional cloud controllers to the deployment and cluster them. Other deployments may choose to segregate services into different host classes, which can then be clustered. This may include separating the database services into database nodes, separating the messaging services into messaging nodes, and separating the memcached service into memcache nodes. Load balancing services might live on their own nodes as well. The primary considerations for mapping scalable services to physical (or virtual) hosts are the following: Does the service scale horizontally or vertically? Will vertically scaling the service impede the performance of other co-located services? Does the service have particular hardware or network requirements that other services don't have? For example, some OpenStack deployments which use the HAProxy load balancing service choose to place the load balancing nodes on separate hardware. The VIPs which the load balancing nodes host must live on a public, routed network, while the internal IPs of services that they route to don't have that requirement. Putting the HAProxy service on separate hosts allows the rest of the control plane to only have private addressing. Grouping all of the API services on dedicated hosts may ease horizontal scalability. 
These services don't need to be managed by a cluster resource manager and can be scaled by adding additional nodes to the load balancers without having to update cluster definitions. Database services have high I/O requirements. Segregating these services onto machines which have access to high-performance Fibre Channel storage may make sense. Finally, you should consider whether or not to virtualize the control plane. If the control plane will be virtualized, creating additional host groups to host dedicated services becomes very attractive. Having eight or nine virtual machines dedicated to the control plane is a very different proposition than having eight or nine physical machines dedicated to the control plane. Most highly available control planes require at least three nodes to ensure that quorum is easily determined by the cluster resource manager. While dedicating three physical nodes to the control function of a hundred-node OpenStack deployment makes a lot of sense, dedicating nine physical nodes may not. Many of the organizations that we've worked with will already have a VMware-based cluster available for hosting management appliances and the control plane can be deployed within that existing footprint. Organizations which are deploying a KVM-only cloud may not want to incur the additional operational complexity of managing the additional virtual machines outside OpenStack. Updating the physical architecture design Once the mapping of services to physical (or virtual) machines has been determined, the design document should be updated to include the definition of the host groups and their associated functions. 
A simple example is provided as follows: Load balancer: These systems provide the load balancing services in an Active/Passive configuration Cloud controller: These systems provide the API services, the scheduling services, and the Horizon dashboard services in an Active/Active configuration Database node: These systems provide the MySQL database services in an Active/Passive configuration Messaging node: These systems provide the RabbitMQ messaging services in an Active/Active configuration Compute node: These systems act as KVM hypervisors and run the nova-compute and openvswitch-agent services Deployments which will be using only the cloud controller host group might use the following definitions: Cloud controller: These systems provide the load balancing services in an Active/Passive configuration and the API services, MySQL database services, and RabbitMQ messaging services in an Active/Active configuration Compute node: These systems act as KVM hypervisors and run the nova-compute and openvswitch-agent services After defining the host groups, the physical architecture diagram should be updated to reflect the mapping of host groups to physical machines in the deployment. This should also include considerations for network connectivity. The following is an example architecture diagram for inclusion in the design document: Summary A complete guide to implementing H/A of the OpenStack services is probably worth a book to itself. In this article we started out by covering the main strategies for making OpenStack services highly available and which strategies apply well to each service. Then we covered how OpenStack deployments are typically segmented across physical regions. Finally, we updated our documentation and implemented a few of the technologies we discussed in the lab. 
While walking through the main considerations for highly available deployments in this article, we've tried to emphasize a few key points: Scalability is at least as important as H/A in cluster design. Ensure that your design is flexible in case of unexpected growth. OpenStack doesn't scale forever. Plan for multiple regions. Also, it's important to make sure that the strategy and architecture that you adopt for H/A is supportable by your organization. Consider reusing existing architectures for H/A in the message bus and database layers.

Alan Bernardo Palacio
31 Aug 2023
10 min read

Building an API for Language Model Inference using Rust and Hyper - Part 2

Introduction

In our previous exploration, we delved deep into the world of Large Language Models (LLMs) in Rust. Through the lens of the llm crate and the transformative potential of LLMs, we painted a picture of the current state of AI integrations within the Rust ecosystem. But knowledge, they say, is only as valuable as its application. Thus, we transition from understanding the 'how' of LLMs to applying this knowledge in real-world scenarios.

Welcome to the second part of our Rust LLM series. In this article, we roll up our sleeves to architect and deploy an inference server using Rust. Leveraging the fast and efficient Hyper HTTP library, our server will accept a prompt over HTTP, run language model inference on it, and return the generated text. We'll guide you through the step-by-step process of setting up, routing, and serving inferences right from the server, all the while keeping our base anchored to the foundational insights from our last discussion.

For developers eager to witness the integration of Rust, Hyper, and LLMs, this guide promises to be a rewarding endeavor. By the end, you'll be equipped with the tools to set up a server that can accept prompts and return model-generated responses.
So, as we progress from the intricacies of the llm crate to building a real-world application, join us in taking a monumental step toward making AI-powered interactions an everyday reality.

Imports and Data Structures

Let's start by looking at the import statements and data structures used in the code:

    use hyper::service::{make_service_fn, service_fn};
    use hyper::{Body, Request, Response, Server};
    use std::net::SocketAddr;
    use serde::{Deserialize, Serialize};
    use std::{convert::Infallible, io::Write, path::PathBuf};

hyper: Hyper is a fast and efficient HTTP library for Rust.
SocketAddr: This is used to specify the socket address (IP and port) for the server.
serde: Serde is a powerful serialization/deserialization framework in Rust.
Deserialize, Serialize: Serde traits for automatic serialization and deserialization.

Next, we have the data structures that will be used for deserializing JSON request data and serializing response data:

    #[derive(Debug, Deserialize)]
    struct ChatRequest {
        prompt: String,
    }

    #[derive(Debug, Serialize)]
    struct ChatResponse {
        response: String,
    }

1. ChatRequest: A struct to represent the incoming JSON request containing a prompt field.
2. ChatResponse: A struct to represent the JSON response containing a response field.

Inference Function

The infer function is responsible for performing language model inference:

    fn infer(prompt: String) -> String {
        let tokenizer_source = llm::TokenizerSource::Embedded;
        let model_architecture = llm::ModelArchitecture::Llama;
        // Placeholder: replace with the path to your model file
        let model_path = PathBuf::from("/path/to/model");
        let prompt = prompt.to_string();
        let now = std::time::Instant::now();

        let model = llm::load_dynamic(
            Some(model_architecture),
            &model_path,
            tokenizer_source,
            Default::default(),
            llm::load_progress_callback_stdout,
        )
        .unwrap_or_else(|err| {
            panic!("Failed to load {} model from {:?}: {}", model_architecture, model_path, err);
        });

        println!(
            "Model fully loaded! Elapsed: {}ms",
            now.elapsed().as_millis()
        );

        let mut session = model.start_session(Default::default());
        let mut generated_tokens = String::new(); // Accumulate generated tokens here

        let res = session.infer::<Infallible>(
            model.as_ref(),
            &mut rand::thread_rng(),
            &llm::InferenceRequest {
                prompt: (&prompt).into(),
                parameters: &llm::InferenceParameters::default(),
                play_back_previous_tokens: false,
                maximum_token_count: Some(140),
            },
            // OutputRequest
            &mut Default::default(),
            |r| match r {
                llm::InferenceResponse::PromptToken(t) | llm::InferenceResponse::InferredToken(t) => {
                    print!("{t}");
                    std::io::stdout().flush().unwrap();
                    // Accumulate generated tokens
                    generated_tokens.push_str(&t);
                    Ok(llm::InferenceFeedback::Continue)
                }
                _ => Ok(llm::InferenceFeedback::Continue),
            },
        );

        // Return the accumulated generated tokens
        match res {
            Ok(_) => generated_tokens,
            Err(err) => format!("Error: {}", err),
        }
    }

The infer function takes a prompt as input and returns a string containing generated tokens.
It loads a language model, sets up an inference session, and accumulates generated tokens. Note that the model is loaded on every call, so each request pays the full load time.
The res variable holds the result of the inference, and a closure handles each inference response.
The function returns the accumulated generated tokens or an error message.

Request Handler

The chat_handler function handles incoming HTTP requests:

    async fn chat_handler(req: Request<Body>) -> Result<Response<Body>, Infallible> {
        let body_bytes = hyper::body::to_bytes(req.into_body()).await.unwrap();
        let chat_request: ChatRequest = serde_json::from_slice(&body_bytes).unwrap();

        // Call the `infer` function with the received prompt
        let inference_result = infer(chat_request.prompt);

        // Prepare the response message
        let response_message = format!("Inference result: {}", inference_result);
        let chat_response = ChatResponse {
            response: response_message,
        };

        // Serialize the response and send it back
        let response = Response::new(Body::from(serde_json::to_string(&chat_response).unwrap()));
        Ok(response)
    }

chat_handler asynchronously handles incoming requests by deserializing the JSON payload.
It calls the infer function with the received prompt and constructs a response message.
The response is serialized as JSON and sent back in the HTTP response.
Because the handler calls unwrap on the request body, a malformed JSON payload will cause the handler to panic rather than return an error response.

Router and Not Found Handler

The router function maps incoming requests to the appropriate handlers:

    async fn router(req: Request<Body>) -> Result<Response<Body>, Infallible> {
        match (req.uri().path(), req.method()) {
            ("/api/chat", &hyper::Method::POST) => chat_handler(req).await,
            _ => not_found(),
        }
    }

router matches incoming requests based on the path and HTTP method.
If the path is "/api/chat" and the method is POST, it calls the chat_handler.
If no match is found, it calls the not_found function.

The listing does not show not_found itself. A minimal version, assumed here rather than taken from the original code, returns a 404 status with the body "Not Found":

    // Assumed helper: not part of the original listing
    fn not_found() -> Result<Response<Body>, Infallible> {
        let mut response = Response::new(Body::from("Not Found"));
        *response.status_mut() = hyper::StatusCode::NOT_FOUND;
        Ok(response)
    }

Main Function

The main function initializes the server and starts listening for incoming connections:

    #[tokio::main]
    async fn main() {
        println!("Server listening on port 8083...");
        let addr = SocketAddr::from(([0, 0, 0, 0], 8083));
        let make_svc = make_service_fn(|_conn| {
            async { Ok::<_, Infallible>(service_fn(router)) }
        });
        let server = Server::bind(&addr).serve(make_svc);
        if let Err(e) = server.await {
            eprintln!("server error: {}", e);
        }
    }

In this section, we'll walk through the steps to build and run the server that performs language model inference using Rust and the Hyper framework. We'll also demonstrate how to make a POST request to the server using Postman.

1. Install Rust: If you haven't already, you need to install Rust on your machine. You can download Rust from the official website: https://www.rust-lang.org/tools/install

2. Create a New Rust Project: Create a new directory for your project and navigate to it in the terminal. Run the following command to create a new Rust project:

    cargo new language_model_server

This command will create a new directory named language_model_server containing the basic structure of a Rust project.

3. Add Dependencies: Open the Cargo.toml file in the language_model_server directory and add the required dependencies for Hyper and other libraries. Your Cargo.toml file should look something like this:

    [package]
    name = "llm_handler"
    version = "0.1.0"
    edition = "2018"

    [dependencies]
    hyper = { version = "0.13" }
    tokio = { version = "0.2", features = ["macros", "rt-threaded"] }
    serde = { version = "1.0", features = ["derive"] }
    serde_json = "1.0"
    llm = { git = "https://github.com/rustformers/llm.git" }
    rand = "0.8.5"

Make sure to adjust the version numbers according to the latest versions available.

4. Replace Code: Replace the content of the src/main.rs file in your project directory with the code you've been provided in the earlier sections.

5. Building the Server: In the terminal, navigate to your project directory and run the following command to build the server:

    cargo build --release

This will compile your code and produce an executable binary in the target/release directory.

Running the Server

1. Running the Server: After building the server, you can run it using the following command:

    cargo run --release

Your server will start listening on port 8083.

2. Accessing the Server: Open a web browser and navigate to http://localhost:8083. You should see the message "Not Found", indicating that the server is up and running.

Making a POST Request Using Postman

1. Install Postman: If you don't have Postman installed, you can download it from the official website: https://www.postman.com/downloads/

2. Create a POST Request:
   - Open Postman and create a new request.
   - Set the request type to "POST".
   - Enter the URL: http://localhost:8083/api/chat
   - In the "Body" tab, select "raw" and set the content type to "JSON (application/json)".
   - Enter the following JSON request body:

    { "prompt": "Rust is an amazing programming language because" }

3. Send the Request: Click the "Send" button to make the POST request to your server.

4. View the Response: You should receive a response from the server, indicating the inference result generated by the language model.

Conclusion

In the previous article, we introduced the foundational concepts, setting the stage for the hands-on application we delved into this time. In this article, our main goal was to bridge theory with practice. Using the llm crate alongside the Hyper library, we embarked on a mission to create a server capable of understanding and executing language model inference. But our work was more than just setting up a server; it was about illustrating the synergy between Rust, a language famed for its safety and concurrency features, and the vast world of AI.

What's especially encouraging is how this project can serve as a springboard for many more innovations. With the foundation laid out, there are numerous avenues to explore, from refining the server's performance to integrating more advanced features or scaling it for larger audiences.

If there's one key takeaway from our journey, it's the importance of continuous learning and experimentation. The tech landscape is ever-evolving, and the confluence of AI and programming offers a fertile ground for innovation.

As we conclude this series, our hope is that the knowledge shared acts as both a source of inspiration and a practical guide. Whether you're a seasoned developer or a curious enthusiast, the tools and techniques we've discussed can pave the way for your own unique creations. So, as you move forward, keep experimenting, iterating, and pushing the boundaries of what's possible. Here's to many more coding adventures ahead!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries.
He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.
Savia Lobo
08 Oct 2018
13 min read

How to deploy Splunk binary and set up its configuration [Tutorial]

Splunk provides binary distributions for Windows and a variety of Unix operating systems. For all Unix operating systems, a compressed .tar file is provided. For some platforms, packages are also provided.

This article is an excerpt taken from the book Implementing Splunk 7 - Third Edition written by James Miller. This book covers the new modules of Splunk: Splunk Cloud and the Machine Learning Toolkit to ease data usage and more.

In this tutorial, you will learn how to deploy the Splunk binary effectively within your system. It also covers how to set up configuration distribution in Splunk.

If your organization uses packages, such as deb or rpm, you should be able to use the provided packages in your normal deployment process. Otherwise, installation starts by unpacking the provided tar to the location of your choice. The process is the same, whether you are installing the full version of Splunk or the Splunk universal forwarder. The typical installation process involves the following steps:

1. Installing the binary
2. Adding a base configuration
3. Configuring Splunk to launch at boot
4. Restarting Splunk

Having worked with many different companies over the years, I can honestly say that none of them used the same product or even methodology for deploying software. Splunk takes a hands-off approach to fit in as easily as possible into customer workflows.

Deploying from a tar file

To deploy from a tar file, the command depends on your version of tar. With a modern version of tar, you can run the following command:

    tar xvzf splunk-7.0.x-xxx-Linux-xxx.tgz

Older versions may not handle gzip files directly, so you may have to run the following command:

    gunzip -c splunk-7.0.x-xxx-Linux-xxx.tgz | tar xvf -

This will expand into the current directory.
To expand into a specific directory, you can usually add -C, depending on the version of tar, as follows:

    tar -C /opt/ -xvzf splunk-7.0.x-xxx-Linux-xxx.tgz

Deploying using msiexec

In Windows, it is possible to deploy Splunk using msiexec. This makes it much easier to automate deployment on a large number of machines.

To install silently, you can use the combination of AGREETOLICENSE and /quiet, as follows:

    msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes /quiet

If you plan to use a deployment server, you can specify the following value:

    msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes DEPLOYMENT_SERVER="deployment_server_name:8089" /quiet

Or, if you plan to overlay an app that contains deploymentclient.conf, you can forgo starting Splunk until that app has been copied into place, as follows:

    msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes LAUNCHSPLUNK=0 /quiet

There are options available to start reading data immediately, but I would advise deploying input configurations to your servers, instead of enabling inputs via installation arguments.

Adding a base configuration

If you are using the Splunk deployment server, this is the time to set up deploymentclient.conf. This can be accomplished in several ways, as follows:

  • On the command line, by running the following code:

        $SPLUNK_HOME/bin/splunk set deploy-poll deployment_server_name:8089

  • By placing a deploymentclient.conf in:

        $SPLUNK_HOME/etc/system/local/

  • By placing an app containing deploymentclient.conf in:

        $SPLUNK_HOME/etc/apps/

The third option is what I would recommend because it allows overriding this configuration, via a deployment server, at a later time. We will work through an example later in the Using Splunk deployment server section.

If you are deploying configurations in some other way, for instance with Puppet, be sure to restart the Splunk forwarder processes after deploying the new configuration.
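Because deploymentclient.conf can end up in more than one of the locations listed above, it can help to check where a given installation actually has it. The following is a rough sketch, not a Splunk tool: find_deploymentclient_conf is a hypothetical name, and it assumes that apps keep the file under their local/ directory, as in the example later in this article.

```shell
# find_deploymentclient_conf SPLUNK_HOME
# Print every deploymentclient.conf found in etc/system/local/ and in
# etc/apps/<app>/local/ under the given Splunk installation.
# (Hypothetical helper; apps that ship the file in default/ are not checked.)
find_deploymentclient_conf() {
    for f in "$1"/etc/system/local/deploymentclient.conf \
             "$1"/etc/apps/*/local/deploymentclient.conf; do
        [ -f "$f" ] && echo "$f"
    done
    return 0
}
```

Splunk's own btool can report the merged result as well, for example with $SPLUNK_HOME/bin/splunk btool deploymentclient list --debug, which is usually the better choice on a machine where Splunk is already installed.
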
Configuring Splunk to launch at boot

On Windows machines, Splunk is installed as a service that will start after installation and on reboot. On Unix hosts, the Splunk command line provides a way to create startup scripts appropriate for the operating system that you are using. The command looks like this:

    $SPLUNK_HOME/bin/splunk enable boot-start

To run Splunk as another user, provide the flag -user, as follows:

    $SPLUNK_HOME/bin/splunk enable boot-start -user splunkuser

This command must still be run as root, but the startup script will be modified to run as the user provided.

If you do not run Splunk as root, and you shouldn't if you can avoid it, be sure that the Splunk installation and data directories are owned by the user specified in the enable boot-start command. You can ensure this by using chown, as in:

    chown -R splunkuser $SPLUNK_HOME

On Linux, you could then start Splunk using service splunk start.

Configuration distribution in Splunk

As we have covered in some depth, configurations in Splunk are simply directories of plain text files. Distribution essentially consists of copying these configurations to the appropriate machines and restarting the instances. You can either use your own system for distribution, such as Puppet or simply a set of scripts, or use the deployment server included with Splunk.

Using your own deployment system

The advantage of using your own system is that you already know how to use it. Assuming that you have normalized your apps, as described in the section Using apps to organize configuration, deploying apps to a forwarder or indexer consists of the following steps:

  • Set aside the existing apps at $SPLUNK_HOME/etc/apps/.
  • Copy the apps into $SPLUNK_HOME/etc/apps/.
  • Restart the Splunk forwarder. Note that this needs to be done as the user that is running Splunk, either by calling the service script or calling su. In Windows, restart the splunkd service.
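For a script-based approach, the first two steps above might be sketched as follows. The function name deploy_apps and the separate backup directory argument are assumptions made for illustration; an existing configuration management tool would normally handle this instead.

```shell
# deploy_apps SOURCE_DIR SPLUNK_HOME BACKUP_DIR
# For each app directory in SOURCE_DIR: set aside any existing copy from
# SPLUNK_HOME/etc/apps into BACKUP_DIR, then copy the new version into place.
# (Hypothetical helper; restarting Splunk is left to the caller.)
deploy_apps() {
    src=$1
    apps_dir=$2/etc/apps
    backup_dir=$3

    mkdir -p "$apps_dir" || return 1
    for app in "$src"/*/; do
        [ -d "$app" ] || continue        # nothing to do if SOURCE_DIR has no apps
        app=${app%/}                     # drop the trailing slash from the glob
        name=$(basename "$app")
        if [ -e "$apps_dir/$name" ]; then
            mkdir -p "$backup_dir" || return 1
            mv "$apps_dir/$name" "$backup_dir/" || return 1
        fi
        cp -R "$app" "$apps_dir/$name" || return 1
    done
}
```

After a successful run, the third step still applies: restart Splunk as the user that runs it, for instance with $SPLUNK_HOME/bin/splunk restart. Using a fresh BACKUP_DIR for each run avoids collisions with apps set aside earlier.
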
Assuming that you already have a system for managing configurations, that's it.

If you are deploying configurations to indexers, be sure to only deploy the configurations when downtime is acceptable, as you will need to restart the indexers to load the new configurations, ideally in a rolling manner. Do not deploy configurations until you are ready to restart, as some (but not all) configurations will take effect immediately.

Using the Splunk deployment server

If you do not have a system for managing configurations, you can use the deployment server included with Splunk. Some advantages of the included deployment server are as follows:

  • Everything you need is included in your Splunk installation
  • It will restart forwarder instances properly when new app versions are deployed
  • It is intelligent enough not to restart when unnecessary
  • It will remove apps that should no longer be installed on a machine
  • It will ignore apps that are not managed
  • The logs for the deployment client and server are accessible in Splunk itself

Some disadvantages of the included deployment server are:

  • As of Splunk 4.3, there are issues with scale beyond a few hundred deployment clients, at which point tuning is required (although one option is to run multiple deployment server instances)
  • The configuration is complicated and prone to typos

With these caveats out of the way, let's set up a deployment server for the apps that we laid out before.

Step 1 - deciding where your deployment server will run

For a small installation with fewer than a few dozen forwarders, your main Splunk instance can run the deployment server without any issue. For more than a few dozen forwarders, a separate instance of Splunk makes sense.

Ideally, this instance would run on its own machine. The requirements for this machine are not large, perhaps 4 gigabytes of RAM and two processors, or possibly less. A virtual machine would be fine.

Define a DNS entry for your deployment server, if at all possible.
This will make moving your deployment server later much simpler.

If you do not have access to another machine, you could run another copy of Splunk on the same machine that is running some other part of your Splunk deployment. To accomplish this, follow these steps:

  • Install Splunk in another directory, perhaps /opt/splunk-deploy/splunk/.
  • Start this instance of Splunk by using /opt/splunk-deploy/splunk/bin/splunk start. When prompted, choose port numbers different from the defaults and note what they are. I would suggest one number higher: 8090 and 8001.
  • Unfortunately, if you run splunk enable boot-start in this new instance, the existing startup script will be overwritten. To accommodate both instances, you will need to either edit the existing startup script, or rename the existing script so that it is not overwritten.

Step 2 - defining your deploymentclient.conf configuration

Using the address of our new deployment server, ideally a DNS entry, we will build an app named deploymentclient-yourcompanyname. This app will have to be installed manually on forwarders but can then be managed by the deployment server. This app should look somewhat like this:

    deploymentclient-yourcompanyname
        local/deploymentclient.conf

    [deployment-client]

    [target-broker:deploymentServer]
    targetUri=deploymentserver.foo.com:8089

Step 3 - defining our machine types and locations

Starting with what we defined in the Separate configurations by purpose section, we have, in the locations west and east, the following machine types:

  • Splunk indexers
  • DB servers
  • Web servers
  • App servers

Step 4 - normalizing our configurations into apps appropriately

Let's use the apps that we defined in the section Separate configurations by purpose plus the deployment client app that we created in the Step 2 - defining your deploymentclient.conf configuration section. These apps will live in $SPLUNK_HOME/etc/deployment-apps/ on your deployment server.
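The deployment client app from Step 2 is small enough to generate with a few lines of shell. This is a sketch only: make_deploymentclient_app is a hypothetical helper name, and the app name and target URI passed to it are the placeholders used in this article.

```shell
# make_deploymentclient_app APPS_DIR APP_NAME TARGET_URI
# Create APPS_DIR/APP_NAME/local/deploymentclient.conf pointing at the
# deployment server given as TARGET_URI (host:port).
# (Hypothetical helper; the stanza names are the ones shown in Step 2.)
make_deploymentclient_app() {
    app_dir=$1/$2
    mkdir -p "$app_dir/local" || return 1
    cat > "$app_dir/local/deploymentclient.conf" <<EOF
[deployment-client]

[target-broker:deploymentServer]
targetUri=$3
EOF
}
```

For example, make_deploymentclient_app "$SPLUNK_HOME/etc/deployment-apps" deploymentclient-yourcompanyname deploymentserver.foo.com:8089 would place the app where the deployment server expects to find it. The same directory can then be copied by hand to each forwarder's etc/apps/ for the initial installation.
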
Step 5 - mapping these apps to deployment clients in serverclass.conf

To get started, I always start with example 2 from $SPLUNK_HOME/etc/system/README/serverclass.conf.example:

    [global]

    [serverClass:AppsForOps]
    whitelist.0=*.ops.yourcompany.com

    [serverClass:AppsForOps:app:unix]

    [serverClass:AppsForOps:app:SplunkLightForwarder]

Let's assume that we have the machines mentioned next. It is very rare for an organization of any size to have consistently named hosts, so I threw in a couple of rogue hosts at the bottom, as follows:

    spl-idx-west01
    spl-idx-west02
    spl-idx-east01
    spl-idx-east02
    app-east01
    app-east02
    app-west01
    app-west02
    web-east01
    web-east02
    web-west01
    web-west02
    db-east01
    db-east02
    db-west01
    db-west02
    qa01
    homer-simpson

The structure of serverclass.conf is essentially as follows:

    [serverClass:<className>]
    #options that should be applied to all apps in this class
    [serverClass:<className>:app:<appName>]
    #options that should be applied only to this app in this serverclass

Please note that:

  • <className> is an arbitrary name of your choosing.
  • <appName> is the name of a directory in $SPLUNK_HOME/etc/deployment-apps/.
  • The order of stanzas does not matter. Be sure to update <className> if you copy an :app: stanza. This is, by far, the easiest mistake to make.

It is important that configuration changes do not trigger a restart of indexers.
Let's apply this to our hosts, as follows:

    [global]
    restartSplunkd = True
    #by default trigger a splunk restart on configuration change

    ####INDEXERS
    ##handle indexers specially, making sure they do not restart
    [serverClass:indexers]
    whitelist.0=spl-idx-*
    restartSplunkd = False
    [serverClass:indexers:app:indexerbase]
    [serverClass:indexers:app:deploymentclient-yourcompanyname]
    [serverClass:indexers:app:props-web]
    [serverClass:indexers:app:props-app]
    [serverClass:indexers:app:props-db]

    #send props-west only to west indexers
    [serverClass:indexers-west]
    whitelist.0=spl-idx-west*
    restartSplunkd = False
    [serverClass:indexers-west:app:props-west]

    #send props-east only to east indexers
    [serverClass:indexers-east]
    whitelist.0=spl-idx-east*
    restartSplunkd = False
    [serverClass:indexers-east:app:props-east]

    ####FORWARDERS
    #send event parsing props apps everywhere
    #blacklist indexers to prevent unintended restart
    [serverClass:props]
    whitelist.0=*
    blacklist.0=spl-idx-*
    [serverClass:props:app:props-web]
    [serverClass:props:app:props-app]
    [serverClass:props:app:props-db]

    #send props-west only to west datacenter servers
    #blacklist indexers to prevent unintended restart
    [serverClass:west]
    whitelist.0=*-west*
    whitelist.1=qa01
    blacklist.0=spl-idx-*
    [serverClass:west:app:props-west]
    [serverClass:west:app:deploymentclient-yourcompanyname]

    #send props-east only to east datacenter servers
    #blacklist indexers to prevent unintended restart
    [serverClass:east]
    whitelist.0=*-east*
    whitelist.1=homer-simpson
    blacklist.0=spl-idx-*
    [serverClass:east:app:props-east]
    [serverClass:east:app:deploymentclient-yourcompanyname]

    #define our appserver inputs
    [serverClass:appservers]
    whitelist.0=app-*
    whitelist.1=qa01
    whitelist.2=homer-simpson
    [serverClass:appservers:app:inputs-app]

    #define our webserver inputs
    [serverClass:webservers]
    whitelist.0=web-*
    whitelist.1=qa01
    whitelist.2=homer-simpson
    [serverClass:webservers:app:inputs-web]

    #define our dbserver inputs
    [serverClass:dbservers]
    whitelist.0=db-*
    whitelist.1=qa01
    [serverClass:dbservers:app:inputs-db]

    #define our west coast forwarders
    [serverClass:fwd-west]
    whitelist.0=app-west*
    whitelist.1=web-west*
    whitelist.2=db-west*
    whitelist.3=qa01
    [serverClass:fwd-west:app:outputs-west]

    #define our east coast forwarders
    [serverClass:fwd-east]
    whitelist.0=app-east*
    whitelist.1=web-east*
    whitelist.2=db-east*
    whitelist.3=homer-simpson
    [serverClass:fwd-east:app:outputs-east]

You should organize the patterns and classes in a way that makes sense to your organization and data centers, but I would encourage you to keep it as simple as possible. I would strongly suggest opting for more lines rather than more complicated logic.

A few more things to note about the format of serverclass.conf:

The number following whitelist and blacklist must be sequential, starting with zero. For instance, in the following example, whitelist.3 will not be processed, since whitelist.2 is commented:

    [serverClass:foo]
    whitelist.0=a*
    whitelist.1=b*
    # whitelist.2=c*
    whitelist.3=d*

whitelist.x and blacklist.x are tested against these values in the following order:

  • clientName as defined in deploymentclient.conf: This is not commonly used but is useful when running multiple Splunk instances on the same machine, or when the DNS is completely unreliable.
  • IP address: There is no CIDR matching, but you can use string patterns.
  • Reverse DNS: This is the value returned by the DNS for an IP address. If your reverse DNS is not up to date, this can cause you problems, as this value is tested before the value of hostname, as provided by the host itself. If you suspect this, try ping <ip of machine> or something similar to see what the DNS is reporting.
  • Hostname as provided by forwarder: This is always tested after reverse DNS, so be sure your reverse DNS is up to date.

When copying :app: lines, be very careful to update the <className> appropriately! This really is the most common mistake made in serverclass.conf.
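Both pitfalls described above, a gap in the whitelist or blacklist numbering and an :app: stanza whose <className> was not updated, can be caught with a small check before reloading. The following is an unofficial sketch using awk, not a Splunk feature; lint_serverclass is a hypothetical name. It can only flag an :app: stanza whose class has no [serverClass:<className>] stanza at all, so a stanza copied under another existing class will still slip through.

```shell
# lint_serverclass FILE
# Report (1) gaps in whitelist.N / blacklist.N numbering within a stanza and
# (2) :app: stanzas that name a class with no [serverClass:<className>] stanza.
# Exits non-zero if anything was reported. (Hypothetical helper.)
lint_serverclass() {
    awk '
        /^[ \t]*#/ { next }                      # ignore commented-out lines
        /^[ \t]*\[/ {                            # stanza header
            stanza = $0
            sub(/^[ \t]*\[/, "", stanza)
            sub(/\].*$/, "", stanza)
            n = split(stanza, part, ":")
            if (part[1] == "serverClass" && n == 2)
                defined[part[2]] = 1
            if (part[1] == "serverClass" && n == 4 && part[3] == "app")
                used[part[2]] = stanza
            next
        }
        /^[ \t]*(whitelist|blacklist)\.[0-9]+[ \t]*=/ {
            key = $0
            sub(/^[ \t]*/, "", key)
            sub(/[ \t]*=.*$/, "", key)           # e.g. "whitelist.3"
            split(key, kv, ".")
            id = stanza SUBSEP kv[1]
            seen[id, kv[2] + 0] = 1
            if (!(id in max) || kv[2] + 0 > max[id])
                max[id] = kv[2] + 0
        }
        END {
            for (id in max) {
                split(id, p, SUBSEP)
                for (i = 0; i <= max[id]; i++)
                    if (!((id, i) in seen)) {
                        printf "[%s] %s.%d is missing; higher-numbered entries will be ignored\n", p[1], p[2], i
                        problems++
                        break
                    }
            }
            for (c in used)
                if (!(c in defined)) {
                    printf "[%s] refers to class \"%s\", which has no [serverClass:%s] stanza\n", used[c], c, c
                    problems++
                }
            exit (problems > 0)
        }
    ' "$1"
}
```

Run against the [serverClass:foo] example above, this would report that whitelist.2 is missing. A typical call would be lint_serverclass "$SPLUNK_HOME/etc/system/local/serverclass.conf", assuming that is where your serverclass.conf lives.
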
Step 6 - restarting the deployment server

If serverclass.conf did not previously exist, the Splunk instance running the deployment server must be restarted to activate the deployment server. After the deployment server is loaded, you can use the following command:

    $SPLUNK_HOME/bin/splunk reload deploy-server

This command should be enough to pick up any changes in serverclass.conf and in etc/deployment-apps.

Step 7 - installing deploymentclient.conf

Now that we have a running deployment server, we need to set up the clients to call home. On each machine that will be running the deployment client, the procedure is essentially as follows:

  • Copy the deploymentclient-yourcompanyname app to $SPLUNK_HOME/etc/apps/
  • Restart Splunk

If everything is configured correctly, you should see the appropriate apps appear in $SPLUNK_HOME/etc/apps/ within a few minutes. To see what is happening, look at the log $SPLUNK_HOME/var/log/splunk/splunkd.log.

If you have problems, enable debugging on either the client or the server by editing $SPLUNK_HOME/etc/log.cfg. Look for the following lines:

    category.DeploymentServer=WARN
    category.DeploymentClient=WARN

Once found, change them to the following lines and restart Splunk:

    category.DeploymentServer=DEBUG
    category.DeploymentClient=DEBUG

After restarting Splunk, you will see the complete conversation in $SPLUNK_HOME/var/log/splunk/splunkd.log. Be sure to change the setting back once you no longer need the verbose logging!

To summarize, we learned how to deploy a binary and set up configuration distribution in Splunk. If you've enjoyed this excerpt, head over to the book, Implementing Splunk 7 - Third Edition, to learn how to use the Machine Learning Toolkit, along with best practices and tips to help you implement Splunk services effectively and efficiently.

Packt
30 Jan 2013
4 min read

Downloading and setting up Bootstrap

Getting ready

Twitter Bootstrap is more than a set of code. It is an online community. To get started, you will do well to familiarize yourself with Twitter Bootstrap's home base: http://twitter.github.com/bootstrap/

Here you'll find the following:

  • The documentation: If this is your first visit, grab a cup of coffee and spend some time perusing the pages, scanning the components, reading the details, and soaking it in. (You'll see this is going to be fun.)
  • The download button: You can get the latest and greatest versions of Twitter Bootstrap's CSS, JavaScript plugins, and icons, compiled and ready for action, coming to you in a convenient ZIP folder. This is where we'll start.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

How to do it…

Whatever your experience level, as promised, I'll walk you through all the necessary steps. Here goes!

  • Go to the Bootstrap homepage: http://twitter.github.com/bootstrap/
  • Click on the large Download Bootstrap button.
  • Locate the download file and unzip or extract it. You should get a folder named simply bootstrap. Inside this folder you should find the folders and files shown in the following screenshot:
  • From the homepage, click on the main navigation item: Get started.
  • Scroll down, or use the secondary navigation, to navigate to the heading: Examples. The direct link is: http://twitter.github.com/bootstrap/getting-started.html#examples
  • Right-click and download the leftmost example, labeled Basic Marketing Site. You'll see that it is an HTML file, named hero.html.
  • Save (or move) it to your main bootstrap folder, right alongside the folders named css, img, and js.
  • Rename the file index.html (a standard name for what will become our homepage). You should now see something similar to the following screenshot:
  • Next, we need to update the links to the stylesheets. Why? When you downloaded the starter template file, you changed the relationship between the file and its stylesheets. We need to let it know where to find the stylesheets in this new file structure.
  • Open index.html (formerly, hero.html) in your code editor. Need a code editor? Windows users: You might try Notepad++ (http://notepad-plus-plus.org/download/). Mac users: Consider TextWrangler (http://www.barebones.com/products/textwrangler/).
  • Find these lines near the top of the file (lines 11-18 in version 2.0.2):
  • Update the href attributes in both link tags to read as follows:
  • Save your changes!
  • You're set to go! Open it up in your browser! (Double-click on index.html.) You should see something like this:

Congratulations! Your first Bootstrap site is underway.

Problems? Don't worry. If your page doesn't look like this yet, let me help you spot the problem. Revisit the steps above and double-check a couple of things:

  • Are your folders and files in the right relationship? (See step 3 as detailed previously.)
  • In your index.html, did you update the href attributes in both stylesheet links? (These should be lines 11 and 18 as of Twitter Bootstrap version 2.1.0.)

There's more…

Of course, this is not the only way you could organize your files. Some developers prefer to place stylesheets, images, and JavaScript files all within a larger folder named assets or library. The organization method I've presented is recommended by the developers who contribute to the HTML5 Boilerplate. One advantage of this approach is that it reduces the length of the paths to our site assets.
Thus, whereas others might have a path to a background image such as this:

    url('assets/img/bg.jpg');

In the organization scheme I've recommended, it will be shorter:

    url('img/bg.jpg');

This is not a big deal for a single line of code. However, when you consider that there will be many links to stylesheets, JavaScript files, and images running throughout your site files, trimming a few characters from each path can add up. And in a world where speed matters, every bit counts. Shorter paths save characters, reduce file size, and help support faster web browsing.

Summary

This article gave us a quick introduction to Twitter Bootstrap. We've got a fair idea as to how to download and set up Bootstrap. By following these simple steps, we can easily create our first Bootstrap site.