
How-To Tutorials

Merlyn Shelley
01 Mar 2024
9 min read

AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

“People say AI is overhyped, but I think it's not hyped enough. The next generation who will use this in the next few years will have a much higher bar on what technology can do for them. So how you build it for that generation, how you build it for that future will be really interesting to see.”
- Puneet Chandok, Microsoft India and South Asia president

Speaking at a panel discussion on AI at the Mumbai Tech Week, Chandok believes AI is not hyped enough considering its potential for disruptive transformation. He encourages more training on AI to realize its full potential.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across the AI sector:

  • OpenAI unveils Sora, an AI model generating videos from text
  • Google's latest conversational AI model Gemini 1.5 has a million-token context window
  • New AI news reader app tackles clickbait headlines, provides summaries
  • Slack is rolling out new AI features for enterprise users including thread summaries
  • LangChain announced raising $25 million to launch new platform for building LLM apps
  • AI helps improve medical imaging to benefit patients globally
  • Researchers develop AI model that determines a person's sex from brain scans

We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:

  • Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows
  • Advanced Techniques For More Relevant AI Responses
  • Reinforcement Learning Explained
  • Bridging the Gap Between AI and App Development

Finally, don’t forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:

  • Creating Custom Models Without the Hassle of Data Collection
  • Code Your Own AI Coding Buddy
  • Evaluating Code Quality with AI Assistants
  • Easily Deploy Language Models Locally

Looking for some inspiration? Here are some GitHub repositories to get your projects going!

  • gptscript-ai/gptscript
  • karpathy/minbpe
  • AAAI-DISIM-UnivAQ/DALI
  • QwenLM/Qwen

Writer’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.

Cheers,
Kartikey Pandey
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

OpenAI unveiled Sora, an AI model generating videos from text at up to a minute in length. Sora demonstrates an understanding of language and the physical world and photorealism across styles, though human subjects appear game-like.

Google's latest conversational AI model Gemini 1.5 analyzes more information than before, thanks to a million-token context window. This allows for summarizing the Apollo 11 mission transcript or analyzing a 44-minute silent film in full. Early results show the system maintains performance as context grows into the millions.

Bulletin, a new AI-powered news reader app, tackles clickbait headlines and provides summaries of news articles with customizable news sources.

Slack is rolling out new AI features for enterprise users including thread summaries, channel recaps, and answering workplace questions. The tools provide highlights from missed messages and help catch up.

LangChain announced raising $25 million to launch their new platform LangSmith for building and monitoring LLM apps. LangSmith allows developers to accelerate workflows across development, testing, deployment, and monitoring. It has already seen significant adoption with over 70,000 signups and 5000 monthly active companies.

Courtesy: Bulletin/Shihab Mehboob

AI is helping improve medical imaging to benefit patients globally. ML can quickly analyze large datasets to find issues doctors may miss and flag urgent cases. Cloud solutions also enable sharing scans and remote expert assistance anywhere.
Companies are applying these methods to speed diagnoses, reduce wait times, and bring ultrasounds directly to homes.

Researchers have also developed an AI model that can determine a person's sex from brain scans with over 90% accuracy. The model analyzed dynamic MRI scans and identified the default mode, striatum, and limbic networks as key in distinguishing male and female brains. This breakthrough furthers our understanding of brain organization and could help address sex-specific health issues.

🔮 Expert Insights from Packt Community

Generative AI with LangChain - By Dr. Ben Auffarth

ChatGPT and the GPT models by OpenAI have brought about a revolution not only in how we write and research but also in how we can process information.

This book discusses the functioning, capabilities, and limitations of LLMs underlying chat systems, including ChatGPT and Bard. It also demonstrates, in a series of practical examples, how to use the LangChain framework to build production-ready and responsive LLM applications for tasks ranging from customer support to software development assistance and data analysis.

Key Takeaways

  • Explore the expansive utility of LLMs in real-world applications.
  • Guidance on fine-tuning, prompt engineering, and best practices.
  • Learn how to use the LangChain framework to build production-ready LLM applications.

By the end of this book, you'll be equipped with the practical knowledge and skills to leverage the transformative power of generative AI with confidence and creativity.

🌟 Secret Knowledge: AI/LLM Resources

🌀 Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows: Google DeepMind's latest AI model Gemini 1.5 has significantly improved how much information it can process at once, thanks to advances in "long context windows." The team discovered their model could understand over 1 million pieces of information in a single sitting, far surpassing earlier limits.
This opens up new possibilities for tasks like summarizing lengthy documents, analyzing large codebases, and even comprehending full movies. Developers are excited to explore creative uses of this expanded recall.

🌀 Advanced Techniques For More Relevant AI Responses: This article discusses how to improve AI conversation models like RAG by enhancing how information is stored, found and used. Methods covered include indexing sentences individually while keeping their surrounding context, combining keyword search with semantic search, and re-scoring results based on the question. The author demonstrates implementing these "advanced RAG" techniques in Python using tools like LlamaIndex and Weaviate. With these optimizations, AI systems can provide more helpful responses by accessing knowledge in a targeted manner.

🌀 Reinforcement Learning Explained: This article breaks down the key concepts of reinforcement learning in an easy-to-understand way. It covers states, actions, rewards, and how agents interact with environments to learn policies. RL agents try different strategies to maximize long-term rewards through trial and error. Episodes provide a framework to evaluate policies. Deterministic policies pick set actions while stochastic policies use probabilities. Whether you're new to RL or a veteran, this primer is worth a read to get acquainted with the basics.

🌀 Bridging the Gap Between AI and App Development: As AI becomes more advanced, developers need easier ways to integrate cutting-edge features into their work. However, directly using AI code frameworks can be challenging and limit scalability. The solution? AI gateways. By handling tasks like routing, caching, and monitoring behind the scenes, gateways act as a bridge between complex AI systems and traditional development workflows. They streamline the integration process while ensuring high performance. Are gateways the future of intelligent applications?

Partnering with Notion

Ever tried Notion?
It's a workspace that helps you do things better and faster. You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge. Give it a try!

🔛 Masterclass: AI/LLM Tutorials

🌀 Creating Custom Models Without the Hassle of Data Collection: Tired of spending big bucks to use proprietary AI APIs or going through the tedious process of collecting your training data? This page shows how you can train customized models more efficiently. By using an open-source LLM to generate synthetic annotations for a small sample of your data, you can then fine-tune a smaller model tailored exactly to your needs. The process takes just a few steps and allows you to analyze large datasets for a fraction of the cost. Best of all, you avoid sending sensitive data to third parties.

🌀 Code Your Own AI Coding Buddy: This guide shows you how to build an AI assistant that lives right on your computer. Using tools like HuggingFace and Streamlit, you can create a chatbot trained on Code Llama. Simply ask it questions and it will respond with examples in languages like Python, Java, and C++. Better yet, the models are free and open-source. This is a neural net sidekick to help automate repetitive tasks and speed up your workflow.

🌀 Evaluating Code Quality with AI Assistants: This article explores using AI to improve code quality by testing Python scripts with SonarQube and getting feedback from LLMs. The author ran tests on ChatGPT and open-source models like Code Llama to see if they could identify issues flagged by SonarQube. While the models struggled to pinpoint errors solely from descriptions, some provided insightful summaries. Continued development of coding-focused LLMs may help automate part of the review process.

🌀 Easily Deploy Language Models Locally: With a simple four-step process, you can get powerful language models like ChatGPT running on your hardware.
First, choose a model from HuggingFace and quantize it for faster performance. Then build an Ollama image to serve the model. For a slick interface, deploy a ChatGPT-style React app talking to Ollama via Docker. The whole setup only takes around 15 minutes. Now you've got a custom language assistant without internet dependence.

🚀 HackHub: Trending AI Tools

🌀 gptscript-ai/gptscript: Open source NLP tool that allows developers to automate tasks by writing scripts in plain English.

🌀 karpathy/minbpe: Minimal and clean Python code for the byte pair encoding algorithm commonly used in NLP and language model tokenization.

🌀 AAAI-DISIM-UnivAQ/DALI: Framework allowing developers to build multi-agent systems in Prolog for applications like robotics, event processing, and more.

🌀 QwenLM/Qwen: Open source code, models, and documentation for the Qwen series of LLMs, including Qwen, Qwen-Chat, and their various sizes.

Packt
08 Jun 2017
11 min read

Backpropagation Algorithm

In this article by Gianmario Spacagna, Daniel Slater, Phuong Vo.T.H, and Valentino Zocca, the authors of the book Python Deep Learning, we will learn the Backpropagation algorithm, as it is one of the most important topics for multi-layer feed-forward neural networks.

The Backpropagation algorithm

We have seen how neural networks can map inputs onto determined outputs, depending on fixed weights. Once the architecture of the neural network has been defined (feed-forward, number of hidden layers, number of neurons per layer), and once the activity function for each neuron has been chosen, we will need to set the weights that in turn will define the internal states for each neuron in the network. We will see how to do that for a 1-layer network and then how to extend it to a deep feed-forward network. For a deep neural network the algorithm to set the weights is called the Backpropagation algorithm, and we will discuss and explain this algorithm for most of this section, as it is one of the most important topics for multi-layer feed-forward neural networks. First, however, we will quickly discuss this for 1-layer neural networks.

The general concept we need to understand is the following: every neural network is an approximation of a function, so it will not be exactly equal to the desired function; instead, it will differ from it by some value. This value is called the error, and the aim is to minimize this error. Since the error is a function of the weights in the neural network, we want to minimize the error with respect to the weights.
The error function is a function of many weights; it is therefore a function of many variables. Mathematically, the set of points where this function is zero therefore represents a hypersurface, and to find a minimum on this surface we want to pick a point and then follow a curve in the direction of the minimum.

Linear regression

To simplify things we are going to introduce matrix notation. Let x be the input; we can think of x as a vector. In the case of linear regression we are going to consider a single output neuron y; the set of weights w is therefore a vector of the same dimension as x. The activation value is then defined as the inner product <x, w>.

Let's say that for each input value x we want to output a target value t, while for each x the neural network will output a value y defined by the activity function chosen. In this case the absolute value of the difference (y - t) represents the difference between the predicted value and the actual value for the specific input example x. If we have m input values $x^i$, each of them will have a target value $t^i$. In this case we calculate the error using the mean squared error

$$J(w) = \frac{1}{m} \sum_{i=1}^{m} \left( y^i - t^i \right)^2,$$

where each $y^i$ is a function of w. The error is therefore a function of w and it is usually denoted with J(w).

As mentioned above, this represents a hypersurface of dimension equal to the dimension of w (we are implicitly also considering the bias), and for each $w_j$ we need to find a curve that will lead towards the minimum of the surface. The rate at which the surface increases along a given direction is given by its derivative with respect to that direction, in this case by:

$$\frac{\partial J}{\partial w_j}$$

In order to move towards the minimum we need to move in the direction opposite to the one set by $\frac{\partial J}{\partial w_j}$ for each $w_j$. Let's calculate:

$$\frac{\partial J}{\partial w_j} = \frac{2}{m} \sum_{i=1}^{m} \left( y^i - t^i \right) \frac{\partial y^i}{\partial w_j}$$

If $y^i = \langle x^i, w \rangle$, then $\frac{\partial y^i}{\partial w_j} = x^i_j$ and therefore:

$$\frac{\partial J}{\partial w_j} = \frac{2}{m} \sum_{i=1}^{m} x^i_j \left( y^i - t^i \right)$$

The notation can sometimes be confusing, especially the first time one sees it. The input is given by vectors $x^i$, where the superscript indicates the ith example.
Since x and w are vectors, the subscript indicates the jth coordinate of the vector. $y^i$ then represents the output of the neural network given the input $x^i$, while $t^i$ represents the target, that is, the desired value corresponding to the input $x^i$.

In order to move towards the minimum, we need to move each weight in the direction opposite to its derivative by a small amount λ, called the learning rate, typically much smaller than 1 (say 0.1 or smaller). We can therefore drop the constant factor 2/m from the derivative and incorporate it in the learning rate, to get the update rule:

$$w_j \rightarrow w_j - \lambda \sum_{i} x^i_j \left( y^i - t^i \right)$$

or, more generally, we can write the update rule in matrix form as:

$$w \rightarrow w - \lambda \nabla J(w)$$

where ∇ represents the vector of partial derivatives. This process is what is often called gradient descent. One last note: the update can be done after having processed all the input vectors; however, in some cases, the weights could be updated after each example or after a preset number of examples.

Logistic regression

In logistic regression, the output is not continuous; rather, it is defined as a set of classes. In this case, the activation function is not going to be the identity function like before; instead, we are going to use the logistic sigmoid function. The logistic sigmoid function, as we have seen before, outputs a real value in (0,1) and therefore it can be interpreted as a probability function, which is why it can work so well in a 2-class classification problem.
In this case, the target can be one of two classes, and the output represents the probability that it be one of those two classes (say t=1). Let's denote with σ(a), with a the activation value, the logistic sigmoid function. Then, for each example x, the probability that the output be the class t, given the weights w, is:

$$P(t \mid x, w) = \begin{cases} \sigma(a) & \text{if } t = 1 \\ 1 - \sigma(a) & \text{if } t = 0 \end{cases}$$

We can write that equation more succinctly as:

$$P(t \mid x, w) = \sigma(a)^{t} \left( 1 - \sigma(a) \right)^{1 - t}$$

and, since for each sample $x^i$ the probabilities are independent, we have that the global probability is:

$$P(t \mid x, w) = \prod_{i} \sigma(a^i)^{t^i} \left( 1 - \sigma(a^i) \right)^{1 - t^i}$$

If we take the natural log of the above equation (to turn products into sums), we get:

$$\log P(t \mid x, w) = \sum_{i} \left[ t^i \log \sigma(a^i) + \left( 1 - t^i \right) \log \left( 1 - \sigma(a^i) \right) \right]$$

The object is now to maximize this log to obtain the highest probability of predicting the correct results. Usually, this is obtained, as in the previous case, by using gradient descent to minimize the cost function defined by $J(w) = -\log P(t \mid x, w)$. As before, we calculate the derivative of the cost function with respect to the weights $w_j$ to obtain:

$$\frac{\partial J}{\partial w_j} = \sum_{i} \left( \sigma(a^i) - t^i \right) x^i_j$$

In general, in the case of a multi-class output t, with t a vector $(t_1, \ldots, t_n)$, we can generalize this equation using

$$J(w) = -\log P(t \mid x, w) = -\sum_{i,j} t^i_j \log \sigma(a^i_j)$$

which leads to the update equation for the weights, with λ the learning rate:

$$w_j \rightarrow w_j - \lambda \sum_{i} \left( \sigma(a^i) - t^i \right) x^i_j$$

This is similar to the update rule we have seen for linear regression.

Backpropagation

In the case of 1-layer networks, weight adjustment was easy, as we could use linear or logistic regression and adjust the weights simultaneously to get a smaller error (minimizing the cost function). For multi-layer neural networks we can use a similar argument for the weights used to connect the last hidden layer to the output layer, as we know what we would like the output layer to be, but we cannot do the same for the hidden layers, as, a priori, we do not know what the values for the neurons in the hidden layers ought to be. What we do, instead, is calculate the error in the last hidden layer and estimate what it would be in the previous layer, propagating the error back from last to first layer, hence the name Backpropagation.
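Before moving on to the multi-layer case, it may help to see the two 1-layer update rules side by side. The following is a minimal NumPy sketch, not the book's code. The function names, the toy data, the learning rate, and the step count are all illustrative assumptions. The gradient is averaged over the examples, which only rescales the learning rate. The same loop covers both cases because the gradient has the same form for the identity activity function (linear regression) and for the logistic sigmoid with the log-likelihood cost.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, mapping any real activation value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def gradient_descent(X, t, activity, lr=0.1, steps=2000):
    """Batch gradient descent for a 1-layer network (illustrative sketch).

    X: (m, n) array of inputs, one example per row.
    t: (m,) array of targets.
    activity: function mapping the activation value a = <x, w> to the output y.
    """
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        y = activity(X @ w)                # outputs y^i for every example
        grad = X.T @ (y - t) / len(t)      # sum_i (y^i - t^i) x^i_j, averaged over m
        w -= lr * grad                     # move against the gradient
    return w

# Toy data (assumed for illustration): one feature plus a constant column for the bias.
x = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([x, np.ones_like(x)])

# Linear regression: the activity function is the identity and the targets are 2x + 1.
w_linear = gradient_descent(X, 2.0 * x + 1.0, activity=lambda a: a)

# Logistic regression: the targets are the class labels 0 or 1.
w_logistic = gradient_descent(X, (x > 0).astype(float), activity=sigmoid)
```

On this toy data the linear fit should approach the weights (2, 1), while the logistic fit only needs to put the decision boundary between the two classes.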
Backpropagation is one of the most difficult algorithms to understand at first, but all that is needed is some knowledge of basic differential calculus and the chain rule. Let's introduce some notation first. We denote with J the cost (error), and with y the activity function that is defined on the activation value a (for example, y could be the logistic sigmoid), which is a function of the weights w and the input x. Let's also define $w_{i,j}$ as the weight between the ith input value and the jth output. Here we define input and output more generically than for a 1-layer network: if $w_{i,j}$ connects a pair of successive layers in a feed-forward network, we denote as input the neurons on the first of the two successive layers, and as output the neurons on the second of the two successive layers. In order not to make the notation too heavy by having to denote which layer each neuron is on, we assume that the ith input $y_i$ is always in the layer preceding the layer containing the jth output $y_j$.

The letter y is used to denote both an input and the activity function, and we can easily infer which one we mean from the context. We also use subscripts i and j, where we always have the element with subscript i belonging to the layer preceding the layer containing the element with subscript j.

(Figure: in this example, layer 1 represents the input, and layer 2 the output.)

Using this notation, and the chain rule for derivatives, for the last layer of our neural network we can write:

$$\frac{\partial J}{\partial w_{i,j}} = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j} \frac{\partial a_j}{\partial w_{i,j}}$$

Since we know that $\frac{\partial a_j}{\partial w_{i,j}} = y_i$, we have:

$$\frac{\partial J}{\partial w_{i,j}} = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j} \, y_i$$

If y is the logistic sigmoid defined above, we get the same result we have already calculated at the end of the previous section, since we know the cost function and we can calculate all derivatives.
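For the previous layers the same formula holds:

$$\frac{\partial J}{\partial w_{i,j}} = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j} \frac{\partial a_j}{\partial w_{i,j}}$$

Since we know that $\frac{\partial a_j}{\partial w_{i,j}} = y_i$, and we know that $\frac{\partial y_j}{\partial a_j}$ is the derivative of the activity function, which we can calculate, all we need to calculate is the derivative $\frac{\partial J}{\partial y_j}$. Let's notice that this is the derivative of the error with respect to the activity function in the second layer. If we can calculate this derivative for the last layer, and have a formula that allows us to calculate the derivative for one layer assuming we can calculate the derivative for the next, we can calculate all the derivatives starting from the last layer and moving backwards.

Let us notice that, as we defined the $y_j$, they are the activation values for the neurons in the second layer, but they are also the activity functions, and therefore functions of the activation values in the first layer. Therefore, applying the chain rule, we have:

$$\frac{\partial J}{\partial y_i} = \sum_{j} \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial y_i} = \sum_{j} \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j} \frac{\partial a_j}{\partial y_i}$$

Once again we can calculate both $\frac{\partial y_j}{\partial a_j}$ and $\frac{\partial a_j}{\partial y_i} = w_{i,j}$, so once we know $\frac{\partial J}{\partial y_j}$ we can calculate $\frac{\partial J}{\partial y_i}$. Since we can calculate $\frac{\partial J}{\partial y_j}$ for the last layer, we can move backward and calculate $\frac{\partial J}{\partial y_i}$ for any layer, and therefore $\frac{\partial J}{\partial w_{i,j}}$ for any layer.

Summarizing, if we have a sequence of layers where:

$$y_i \rightarrow y_j \rightarrow y_k$$

we then have these two fundamental equations, where the summation in the second equation should read as the sum over all the outgoing connections from $y_j$ to any neuron $y_k$ in the successive layer:

$$\frac{\partial J}{\partial w_{i,j}} = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j} \frac{\partial a_j}{\partial w_{i,j}}$$

$$\frac{\partial J}{\partial y_j} = \sum_{k} \frac{\partial J}{\partial y_k} \frac{\partial y_k}{\partial y_j}$$

By using these two equations we can calculate the derivatives of the cost with respect to each layer. If we set $\delta_j = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j}$, then $\delta_j$ represents the variation of the cost with respect to the activation value, and we can think of $\delta_j$ as the error at the neuron $y_j$. We can then rewrite the second equation as:

$$\frac{\partial J}{\partial y_j} = \sum_{k} \frac{\partial J}{\partial y_k} \frac{\partial y_k}{\partial a_k} \frac{\partial a_k}{\partial y_j} = \sum_{k} \delta_k \, w_{j,k}$$

which implies that $\delta_j = \left( \sum_{k} \delta_k \, w_{j,k} \right) \frac{\partial y_j}{\partial a_j}$.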
These two equations give an alternate way of seeing Backpropagation, as the variation of the cost with respect to the activation value, and provide a formula to calculate this variation for any layer once we know the variation for the following layer:

$$\delta_j = \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial a_j}$$

$$\delta_j = \left( \sum_{k} \delta_k \, w_{j,k} \right) \frac{\partial y_j}{\partial a_j}$$

We can also combine these equations and show that:

$$\frac{\partial J}{\partial w_{i,j}} = \delta_j \frac{\partial a_j}{\partial w_{i,j}} = \delta_j \, y_i$$

The Backpropagation algorithm for updating the weights is then given on each layer by the following rule, where λ is the learning rate:

$$w_{i,j} \rightarrow w_{i,j} - \lambda \, \delta_j \, y_i$$

In the last section we will provide a code example that will help understand and apply these concepts and formulas.

Summary

In this article we looked at what comes after the architecture of a neural network has been defined, namely setting its weights, and at the use of the Backpropagation algorithm to do so. We saw how we can stack many layers to create and use deep feed-forward neural networks, how a neural network can have many layers, and why inner (hidden) layers are important.
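The δ recursion derived in this article can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code example from the book. It assumes logistic sigmoid activity functions on every layer, a cost J = 0.5 * sum((y - t)^2) for a single example, and no bias terms. The names `forward`, `backprop`, `cost`, and `update` are hypothetical helpers introduced here.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights):
    """Return the outputs y of every layer, starting with the input itself."""
    ys = [x]
    for W in weights:
        # a_j = sum_i y_i * w_ij, then y_j = sigmoid(a_j)
        ys.append(sigmoid(ys[-1] @ W))
    return ys

def cost(x, t, weights):
    """Assumed cost for this sketch: J = 0.5 * sum((y - t)^2)."""
    return 0.5 * np.sum((forward(x, weights)[-1] - t) ** 2)

def backprop(x, t, weights):
    """Return dJ/dw_ij for every layer, for a single example (x, t)."""
    ys = forward(x, weights)
    y_out = ys[-1]
    # Last layer: delta_j = dJ/dy_j * dy_j/da_j, with dy/da = y * (1 - y) for the sigmoid.
    delta = (y_out - t) * y_out * (1.0 - y_out)
    grads = [None] * len(weights)
    for layer in reversed(range(len(weights))):
        # dJ/dw_ij = delta_j * y_i
        grads[layer] = np.outer(ys[layer], delta)
        if layer > 0:
            y = ys[layer]
            # delta_j = (sum_k delta_k * w_jk) * dy_j/da_j
            delta = (weights[layer] @ delta) * y * (1.0 - y)
    return grads

def update(weights, grads, lr=0.5):
    """Apply w_ij -> w_ij - lr * delta_j * y_i on every layer, in place."""
    for W, g in zip(weights, grads):
        W -= lr * g
```

A quick way to gain confidence in a sketch like this is to compare `backprop` against a finite-difference estimate of the gradient of `cost`, and to check that repeated calls to `update` lower the cost on a fixed example.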

Packt
25 Apr 2011
12 min read

Squid Proxy Server: Fine Tuning to Achieve Better Performance

Whether you only run one site or are in charge of a whole network, Squid is an invaluable tool that can improve performance considerably. Caching and performance optimization usually require a lot of work on the developer's part, but Squid does much of that for you. In this article we will learn to fine-tune our cache to achieve a better HIT ratio, to save bandwidth and reduce the average page load time.

In this article by Kulbir Saini, author of Squid Proxy Server 3.1: Beginner's Guide, we will take a look at the following:

  • Cache peers or neighbors
  • Caching the web documents in the main memory and hard disk
  • Tuning Squid to enhance bandwidth savings and reduce latency

Cache peers or neighbors

Cache peers or neighbors are the other proxy servers which our Squid proxy server can:

  • Share its cache with, to reduce bandwidth usage and access time
  • Use as a parent or sibling proxy server to satisfy its clients' requests

We normally deploy more than one proxy server in the same network to share the load of a single server for better performance. The proxy servers can use each other's cache to retrieve the cached web documents locally to improve performance. Let's have a brief look at the directives provided by Squid for communication among different cache peers.

Declaring cache peers

The directive cache_peer is used to tell Squid about proxy servers in our neighborhood. Let's have a quick look at the syntax for this directive:

    cache_peer HOSTNAME_OR_IP_ADDRESS TYPE PROXY_PORT ICP_PORT [OPTIONS]

In this code, HOSTNAME_OR_IP_ADDRESS is the hostname or IP address of the target proxy server or cache peer.
TYPE specifies the type of the proxy server, which in turn determines how that proxy server will be used by our proxy server. The other proxy servers can be used as a parent, sibling, or a member of a multicast group.

Time for action – adding a cache peer

Let's add a proxy server (parent.example.com) that will act as a parent proxy to our proxy server:

    cache_peer parent.example.com parent 3128 3130 default proxy-only

3130 is the standard ICP port. If the other proxy server is not using the standard ICP port, we should change the code accordingly. This code will direct Squid to use parent.example.com as a proxy server to satisfy client requests in case it's not able to do so itself.

The option default specifies that this cache peer should be used as a last resort in the scenario where other peers can't be contacted. The option proxy-only specifies that the content fetched using this peer should not be cached locally. This is helpful when we don't want to replicate cached web documents, especially when the two peers are connected with a high bandwidth backbone.

What just happened?

We added parent.example.com as a cache peer or parent proxy to our Squid proxy server. We also used the option proxy-only, which means the responses fetched using this cache peer will not be cached on our proxy server.

There are several other options with which you can add cache peers for various purposes, such as building a hierarchy.

Quickly restricting access to domains using peers

If we have added a few proxy servers as cache peers to our Squid server, we may want some control over the requests being forwarded to the peers. The directive cache_peer_domain is a quick way to achieve the desired control. The syntax of this directive is quite simple:

    cache_peer_domain CACHE_PEER_HOSTNAME [!]DOMAIN1 [[!]DOMAIN2 ...]
In the code, CACHE_PEER_HOSTNAME is the hostname or IP address of the cache peer, as used when declaring it with the cache_peer directive. We can specify any number of domains which may be fetched through this cache peer. Adding a bang (!) as a prefix to the domain name will prevent the use of this cache peer for that particular domain.

Let's say we want to use the videoproxy.example.com cache peer for browsing video portals like Youtube, Netflix, Metacafe, and so on.

    cache_peer_domain videoproxy.example.com .youtube.com .netflix.com
    cache_peer_domain videoproxy.example.com .metacafe.com

These two lines will configure Squid to use the videoproxy.example.com cache peer for requests to the domains youtube.com, netflix.com, and metacafe.com only. Requests to other domains will not be forwarded using this peer.

Advanced control on access using peers

We just learned about cache_peer_domain, which provides a way to control access using cache peers. However, it's not really flexible in granting or revoking access. That's where cache_peer_access comes into the picture: it provides a very flexible way to control access to cache peers using ACLs. The syntax and implications are similar to other access directives such as http_access.

    cache_peer_access CACHE_PEER_HOSTNAME allow|deny [!]ACL_NAME

Let's write the following configuration lines, which will allow only the clients on the network 192.0.2.0/24 to use the cache peer acadproxy.example.com for accessing Youtube, Netflix, and Metacafe.

    acl my_network src 192.0.2.0/24
    acl video_sites dstdomain .youtube.com .netflix.com .metacafe.com
    cache_peer_access acadproxy.example.com allow my_network video_sites
    cache_peer_access acadproxy.example.com deny all

In the same way, we can use other ACL types to achieve better control over access to various websites using cache peers.
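To make the effect of the domain lists above more concrete, here is a small Python model of how a list of cache_peer_domain entries could be evaluated for a request. It is a simplified sketch and not Squid's actual code. The function names are hypothetical, and the rule that the first matching entry decides (with the fallback for unmatched hosts taken from the sense of the last entry) is an assumption of this model.

```python
def domain_matches(host, pattern):
    """Match a host against a domain pattern.

    A pattern with a leading dot, such as '.youtube.com', is taken to match
    the domain itself and any of its subdomains (an assumption of this model).
    """
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern

def peer_allowed(host, rules):
    """Decide whether a peer with the given domain rules may be used for host.

    rules: entries as written after the peer name, for example
    ['.youtube.com', '!.example.org']. Assumed semantics: the first matching
    entry decides. If nothing matches, a list of plain domains rejects the
    host, while a list ending in a negated domain accepts it.
    """
    allowed = True  # no rules at all: the peer is usable for any host
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if domain_matches(host, pattern):
            return not negated
        allowed = negated
    return allowed

# The rules from the videoproxy.example.com example in the text.
video_rules = [".youtube.com", ".netflix.com", ".metacafe.com"]
print(peer_allowed("www.youtube.com", video_rules))
print(peer_allowed("www.example.org", video_rules))
```

Under these assumptions, the first call reports that the peer may be used and the second that it may not, which mirrors the statement that requests to other domains are not forwarded through this peer.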
Caching web documents

All this time, we have been talking about the caching of web documents and how it helps in saving bandwidth and improving the end user experience. Now it's time to learn how and where Squid actually keeps these cached documents so that they can be served on demand. Squid uses main memory (RAM) and hard disks for storing or caching the web documents. Caching is a complex process, but Squid handles it well and exposes directives in squid.conf, so that we can control how much should be cached and what should be given the highest priority while caching. Let's have a brief look at the caching-related directives provided by Squid.

Using main memory (RAM) for caching

The web documents cached in the main memory or RAM can be served very quickly, as data read/write speeds of RAM are very high compared to hard disks with mechanical parts. However, as the amount of space available in RAM for caching is very low compared to the cache space available on hard disks, only very popular objects, or the documents with a very high probability of being requested again, are stored in the cache space available in RAM. As the cache space in memory is precious, the documents are stored on a priority basis. Let's have a look at the different types of objects which can be cached.

In-transit objects or current requests

These are the objects related to the current requests, and they have the highest priority to be kept in the cache space in RAM. These objects must be kept in RAM, and if the incoming request rate is quite high and we are about to overflow the cache space in RAM, Squid will try to keep the served part (the part which has already been sent to the client) on the disk to create free space in RAM.

Hot or popular objects

These objects or web documents are popular and are requested quite frequently compared to others.
These are stored in the cache space left after storing the in-transit objects, as they have a lower priority than in-transit objects. These objects are generally pushed to disk when there is a need to free up more cache space in RAM for storing the in-transit objects.

Negatively cached objects

Negatively cached objects are error messages which Squid has encountered while fetching a page or web document on behalf of a client. For example, if a request to a web page has resulted in an HTTP error 404 (page not found), and Squid receives a subsequent request for the same web page, then Squid will check if the response is still fresh and will return a reply from the cache itself. If there is a request for the same page after the negatively cached object corresponding to that page has expired, Squid will check again if the page is available.

Negatively cached objects have the same priority as hot or popular objects, and they can be pushed to disk at any time in favor of in-transit objects.

Specifying cache space in RAM

So far we have learned how the available cache space is utilized for storing or caching different types of objects with different priorities. Now, it's time to learn about specifying the amount of RAM space we want to dedicate for caching. While deciding the RAM space for caching, we should be neither greedy nor paranoid. If we specify a large percentage of RAM for caching, the overall system performance will suffer, as the system will start swapping processes when there is no free RAM left for other processes. If we use a very low percentage of RAM for caching, then we'll not be able to take full advantage of Squid's caching mechanism. The default size of the memory cache is 256 MB.

Time for action – specifying space for memory caching

We can use the extra RAM space available on a running system, after setting aside a chunk of memory that can be utilized by the running processes under heavy load.
To find out the amount of free RAM available on our system, we can use either the top or free command. To find out the free RAM in megabytes, we can use the free command as follows:

    $ free -m

For more details, please check the top(1) and free(1) man pages. Now, let's say we have 4 GB of total RAM on the server and all the processes are running comfortably in 1 GB of RAM space. After securing another 512 MB for emergency situations where running processes may take extra memory, we can safely allocate 2.5 GB of RAM for caching. To specify the cache size in the main memory, we use the directive cache_mem. It has a very simple format. As we have learned before, we can specify the memory size in bytes, KB, MB, or GB. Let's specify the cache memory size for the previous example:

    cache_mem 2500 MB

The previous value specified with cache_mem is in megabytes.

What just happened?

We learned about calculating the approximate space in the main memory which can be used to cache web documents and therefore enhance the performance of the Squid server by a significant margin.

Have a go hero – calculating cache_mem for your machine

Note down the total RAM on your machine and calculate the approximate space in megabytes that you can allocate for memory caching.

Maximum object size in memory

As we have limited space in memory available for caching objects, we need to use the space in an optimized way. We should plan to set the maximum size of an object held in memory a bit low, as setting it too large will mean that fewer objects fit in the memory cache and the HIT (being found in cache) rate will suffer significantly. The default maximum size used by Squid is 512 KB, but we can change it depending on our value for cache_mem.
So, if we want to set it to 1 MB, as we have a lot of RAM available for caching (as in the previous example), we can use the maximum_object_size_in_memory directive as follows:

    maximum_object_size_in_memory 1 MB

This directive will set the allowed maximum object size in the memory cache to 1 MB.

Memory cache mode

With the newer versions of Squid, we can control which objects we want to keep in the memory cache for optimizing the performance. Squid offers the directive memory_cache_mode to set the mode that Squid should use to utilize the space available in the memory cache. There are three different modes available:

    Mode      Description
    always    Keeps all the most recently fetched objects that can fit in the available space. This is the default mode used by Squid.
    disk      Only the objects which are already cached on a hard disk and have received a HIT (meaning they were requested again after being cached) are stored in the memory cache.
    network   Only the objects which have been fetched from the network (including neighbors) are kept in the memory cache.

Setting the mode is easy and can be done using the memory_cache_mode directive as shown:

    memory_cache_mode always

This configuration line will set the memory cache mode to always; this means that the most recently fetched objects will be kept in the memory.
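The sizing arithmetic behind the cache_mem example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name suggest_cache_mem_mb and its parameters are made up for this sketch and are not part of Squid, and the figures are the ones from the 4 GB example.

```python
def suggest_cache_mem_mb(total_mb, used_by_processes_mb, emergency_reserve_mb):
    """Return the RAM (in MB) left over for Squid's memory cache.

    Hypothetical helper, not a Squid API: it only subtracts what the running
    processes need and an emergency reserve from the total RAM.
    """
    leftover = total_mb - used_by_processes_mb - emergency_reserve_mb
    if leftover <= 0:
        raise ValueError("no RAM left over for caching")
    return leftover


# Figures from the example: 4 GB total, 1 GB in use, 512 MB kept in reserve.
cache_mem_mb = suggest_cache_mem_mb(4096, 1024, 512)

# Build the matching squid.conf line.
config_line = "cache_mem {} MB".format(cache_mem_mb)
print(config_line)
```

The exact leftover is 2560 MB (2.5 GB); the example in the text rounds this to 2500 MB, which leaves a little extra headroom.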

Jakub Mandula
28 Oct 2015
7 min read

Making a simple Web based SSH client using Node.js and Socket.io

If you are reading this post, you probably know what SSH stands for. But just for the sake of formality, here we go: SSH stands for Secure Shell. It is a network protocol for secure access to the shell on a remote computer. You can do much more over SSH besides commanding your computer. Here you can find further information: http://en.wikipedia.org/wiki/Secure_Shell. In this post, we are going to create a very simple web terminal. And when I say simple, I mean it! However much you like colors, it will not support them because the parsing is just beyond the scope of this post. If you want a good client-side terminal library use term.js. It is made by the same guy who wrote pty.js, which we will be using. It is able to handle pretty much all key events and COLORS!!!! Installation I am going to assume you already have your node and npm installed. First we will install all of the npm packages we will be using: npm install express pty.js socket.io Express is a super cool web framework for Node. We are going to use it to serve our static files. I know it is a bit overkill, but I like Express. pty.js is where the magic will be happening. It forks processes into virtual pseudo terminals and provides bindings for communication. Socket.io is what we will use to transmit the data from the web browser to the server and back. It uses modern WebSockets, but provides fallbacks for backward compatibility. Anytime you want to create a real-time application, Socket.io is the way to go. Planning First things first, we need to think what we want the program to do. We want the program to create an instance of a shell on the server (remote machine) and send all of the text to the browser. Back in the browser, we want to capture any user events and send them back to the server shell. The WebSSH server This is the code that will power the terminal forwarding. 
Open a new file named server.js and start by importing all of the libraries:

    var express = require('express');
    var https = require('https');
    var http = require('http');
    var fs = require('fs');
    var pty = require('pty.js');

Set up express:

    // Setup the express app
    var app = express();
    // Static file serving
    app.use("/", express.static("./"));

Next we are going to create the server.

    // Creating an HTTP server
    var server = http.createServer(app).listen(8080)

If you want to use HTTPS, which you probably will, you need to generate a key and certificate and import them as shown.

    var options = {
      key: fs.readFileSync('keys/key.pem'),
      cert: fs.readFileSync('keys/cert.pem')
    };

Then use the options object to create the actual server. Notice that this time we are using the https package.

    // Create an HTTPS server
    var server = https.createServer(options, app).listen(8080)

CAUTION: Even if you use HTTPS, do not use this example program on the Internet. You are not authenticating the client in any way and thus providing a free open gate to your computer. Please make sure you only use this on your private network protected by a firewall!

Now bind the socket.io instance to the server:

    var io = require('socket.io')(server);

After this, we can set up the place where the magic happens.

    // When a new socket connects
    io.on('connection', function(socket){
      // Create terminal
      var term = pty.spawn('sh', [], {
        name: 'xterm-color',
        cols: 80,
        rows: 30,
        cwd: process.env.HOME,
        env: process.env
      });
      // Listen on the terminal for output and send it to the client
      term.on('data', function(data){
        socket.emit('output', data);
      });
      // Listen on the client and send any input to the terminal
      socket.on('input', function(data){
        term.write(data);
      });
      // When socket disconnects, destroy the terminal
      socket.on("disconnect", function(){
        term.destroy();
        console.log("bye");
      });
    });

In this block, all we do is wait for new connections.
When we get one, we spawn a new virtual terminal and start to pump the data from the terminal to the socket and vice versa. After the socket disconnects, we make sure to destroy the terminal. If you have noticed, I am using the simple sh shell. I did this mainly because I don't have a fancy prompt on it. Because we are not adding any parsing logic, my bash prompt would show up like this: ]0;piman@mothership: ~ _[01;32m✓ [33mpiman_[0m ↣ _[1;34m[~]_[37m$[0m - Eww! But you may use any shell you like. This is all that we need on the server side. Save the file and close it. Client side The client side is going to be just a very simple HTML file. Start with a very simple HTML markup: <!doctype html> <html> <head> <title>SSH Client</title> <script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/socket.io/1.3.5/socket.io.min.js"></script> <script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script> <style> body { margin: 0; padding: 0; } .terminal { font-family: monospace; color: white; background: black; } </style> </head> <body> <h1>SSH</h1> <div class="terminal"> </div> <script> </script> </body> </html> I am downloading the client side libraries jquery and socket.io from cdnjs. All of the client code will be written in the script tag below the terminal div. 
Surprisingly the code is very simple:

    // Connect to the socket.io server
    var socket = io.connect('http://localhost:8080');
    // Wait for data from the server
    socket.on('output', function (data) {
      // Insert some line breaks where they belong
      // (global regexes, so every line ending in the chunk is replaced)
      data = data.replace(/\n/g, "<br>");
      data = data.replace(/\r/g, "<br>");
      // Append the data to our terminal
      $('.terminal').append(data);
    });
    // Listen for user input and pass it to the server
    $(document).on("keypress", function(e){
      var char = String.fromCharCode(e.which);
      socket.emit("input", char);
    });

Notice that we do not have to explicitly append the text the client types to the terminal, mainly because the server echoes it back anyway. Now we are done! Run the server and open up the URL in your browser.

    node server.js

You should see a small prompt and be able to start typing commands. You can now explore your machine from the browser! Remember that our Web Terminal does not support Tab, Ctrl, Backspace or Esc characters. Implementing this is your homework.

Conclusion

I hope you found this tutorial useful. You can apply the knowledge in any real-time application where communication with the server is critical. All the code is available here. Please note that if you'd like to use a browser terminal, I strongly recommend term.js. It supports colors and styles and all the basic keys including Tab, Backspace, etc. I use it in my PiDashboard project. It is much cleaner and less tedious than the example I have here. I can't wait to see what amazing apps you will invent based on this.

About the Author

Jakub Mandula is a student interested in anything to do with technology, computers, mathematics or science.
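As a footnote to the garbled bash prompt shown earlier: one way to make such output readable without a full terminal emulator is to strip the escape sequences before the text is appended to the page. The sketch below illustrates the idea in Python rather than in the tutorial's JavaScript; it is not part of the original server, the name strip_ansi is made up for this example, and the regular expression only covers CSI (color and cursor) and OSC (window title) sequences.

```python
import re

# ESC ] ... BEL (or ESC \) is an OSC sequence, e.g. setting the window title.
# ESC [ params intermediates final-byte is a CSI sequence, e.g. colors.
ANSI_RE = re.compile(
    r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
    r"|\x1b\[[0-?]*[ -/]*[@-~]"
)


def strip_ansi(text):
    """Remove CSI and OSC escape sequences, keeping the printable text."""
    return ANSI_RE.sub("", text)


# A prompt similar to the one in the article: a title sequence, then colors.
raw_prompt = "\x1b]0;piman@mothership: ~\x07\x1b[01;32m\u2713 \x1b[33mpiman\x1b[0m $ "
clean_prompt = strip_ansi(raw_prompt)
print(repr(clean_prompt))
```

The same pattern should port to a JavaScript regular expression on the server, applied to the data before socket.emit; that port is left as an assumption here rather than shown.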

Packt Editorial Staff
20 Sep 2019
9 min read

How to handle categorical data for machine learning algorithms

The quality of data and the amount of useful information are key factors that determine how well a machine learning algorithm can learn. Therefore, it is absolutely critical that we make sure to encode categorical variables correctly before we feed data into a machine learning algorithm. In this article, with simple yet effective examples, we will explain how to deal with categorical data in machine learning algorithms and how to map ordinal and nominal feature values to integer representations.

The article is an excerpt from the book Python Machine Learning - Third Edition by Sebastian Raschka and Vahid Mirjalili. This book is a comprehensive guide to machine learning and deep learning with Python. It acts as both a clear step-by-step tutorial and a reference you'll keep coming back to as you build your machine learning systems.

It is not uncommon that real-world datasets contain one or more categorical feature columns. When we are talking about categorical data, we have to further distinguish between nominal and ordinal features. Ordinal features can be understood as categorical values that can be sorted or ordered. For example, t-shirt size would be an ordinal feature, because we can define an order XL > L > M. In contrast, nominal features don't imply any order and, to continue with the previous example, we could think of t-shirt color as a nominal feature since it typically doesn't make sense to say that, for example, red is larger than blue.

Categorical data encoding with pandas

Before we explore different techniques to handle such categorical data, let's create a new DataFrame to illustrate the problem:

    >>> import pandas as pd
    >>> df = pd.DataFrame([
    ...            ['green', 'M', 10.1, 'class1'],
    ...            ['red', 'L', 13.5, 'class2'],
    ...            ['blue', 'XL', 15.3, 'class1']])
    >>> df.columns = ['color', 'size', 'price', 'classlabel']
    >>> df
       color size  price classlabel
    0  green    M   10.1     class1
    1    red    L   13.5     class2
    2   blue   XL   15.3     class1

As we can see in the preceding output, the newly created DataFrame contains a nominal feature (color), an ordinal feature (size), and a numerical feature (price) column. The class labels (assuming that we created a dataset for a supervised learning task) are stored in the last column.

Mapping ordinal features

To make sure that the learning algorithm interprets the ordinal features correctly, we need to convert the categorical string values into integers. Unfortunately, there is no convenient function that can automatically derive the correct order of the labels of our size feature, so we have to define the mapping manually. In the following simple example, let's assume that we know the numerical difference between features, for example, XL = L + 1 = M + 2:

    >>> size_mapping = {
    ...                 'XL': 3,
    ...                 'L': 2,
    ...                 'M': 1}
    >>> df['size'] = df['size'].map(size_mapping)
    >>> df
       color  size  price classlabel
    0  green     1   10.1     class1
    1    red     2   13.5     class2
    2   blue     3   15.3     class1

If we want to transform the integer values back to the original string representation at a later stage, we can simply define a reverse-mapping dictionary, inv_size_mapping = {v: k for k, v in size_mapping.items()}, that can then be used via the pandas map method on the transformed feature column, similar to the size_mapping dictionary that we used previously. We can use it as follows:

    >>> inv_size_mapping = {v: k for k, v in size_mapping.items()}
    >>> df['size'].map(inv_size_mapping)
    0     M
    1     L
    2    XL
    Name: size, dtype: object

Encoding class labels

Many machine learning libraries require that class labels are encoded as integer values.
Although most estimators for classification in scikit-learn convert class labels to integers internally, it is considered good practice to provide class labels as integer arrays to avoid technical glitches. To encode the class labels, we can use an approach similar to the mapping of ordinal features discussed previously. We need to remember that class labels are not ordinal, and it doesn't matter which integer number we assign to a particular string label. Thus, we can simply enumerate the class labels, starting at 0:

    >>> import numpy as np
    >>> class_mapping = {label: idx for idx, label in
    ...                  enumerate(np.unique(df['classlabel']))}
    >>> class_mapping
    {'class1': 0, 'class2': 1}

Next, we can use the mapping dictionary to transform the class labels into integers:

    >>> df['classlabel'] = df['classlabel'].map(class_mapping)
    >>> df
       color  size  price  classlabel
    0  green     1   10.1           0
    1    red     2   13.5           1
    2   blue     3   15.3           0

We can reverse the key-value pairs in the mapping dictionary as follows to map the converted class labels back to the original string representation:

    >>> inv_class_mapping = {v: k for k, v in class_mapping.items()}
    >>> df['classlabel'] = df['classlabel'].map(inv_class_mapping)
    >>> df
       color  size  price classlabel
    0  green     1   10.1     class1
    1    red     2   13.5     class2
    2   blue     3   15.3     class1

Alternatively, there is a convenient LabelEncoder class directly implemented in scikit-learn to achieve this:

    >>> from sklearn.preprocessing import LabelEncoder
    >>> class_le = LabelEncoder()
    >>> y = class_le.fit_transform(df['classlabel'].values)
    >>> y
    array([0, 1, 0])

Note that the fit_transform method is just a shortcut for calling fit and transform separately, and we can use the inverse_transform method to transform the integer class labels back into their original string representation:

    >>> class_le.inverse_transform(y)
    array(['class1', 'class2', 'class1'], dtype=object)

Performing one-hot encoding on nominal features

In the Mapping ordinal features section, we used a simple dictionary-mapping approach to convert the ordinal size feature into integers. Since scikit-learn's estimators for classification treat class labels as categorical data that does not imply any order (nominal), we used the convenient LabelEncoder to encode the string labels into integers. It may appear that we could use a similar approach to transform the nominal color column of our dataset, as follows:

    >>> X = df[['color', 'size', 'price']].values
    >>> color_le = LabelEncoder()
    >>> X[:, 0] = color_le.fit_transform(X[:, 0])
    >>> X
    array([[1, 1, 10.1],
           [2, 2, 13.5],
           [0, 3, 15.3]], dtype=object)

After executing the preceding code, the first column of the NumPy array X now holds the new color values, which are encoded as follows:

    blue = 0
    green = 1
    red = 2

If we stop at this point and feed the array to our classifier, we will make one of the most common mistakes in dealing with categorical data. Can you spot the problem? Although the color values don't come in any particular order, a learning algorithm will now assume that green is larger than blue, and red is larger than green. Although this assumption is incorrect, the algorithm could still produce useful results. However, those results would not be optimal.

A common workaround for this problem is to use a technique called one-hot encoding. The idea behind this approach is to create a new dummy feature for each unique value in the nominal feature column. Here, we would convert the color feature into three new features: blue, green, and red. Binary values can then be used to indicate the particular color of an example; for example, a blue example can be encoded as blue=1, green=0, red=0.
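Before turning to the library implementations, the idea can be sketched in plain Python. This is a minimal illustration, not how scikit-learn or pandas implement it; the function name one_hot is made up for this sketch, and it orders the categories alphabetically so that the columns come out as blue, green, red.

```python
def one_hot(values):
    """One-hot encode a list of nominal values.

    Returns (categories, rows): categories is the sorted list of unique
    values, and each row holds a 1 in the position of its category and
    0 everywhere else.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


colors = ['green', 'red', 'blue']  # the color column of the example DataFrame
categories, encoded = one_hot(colors)
print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```

Each row contains exactly one 1, so no ordering between the colors is implied.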
To perform this transformation, we can use the OneHotEncoder that is implemented in scikit-learn's preprocessing module:

    >>> from sklearn.preprocessing import OneHotEncoder
    >>> X = df[['color', 'size', 'price']].values
    >>> color_ohe = OneHotEncoder()
    >>> color_ohe.fit_transform(X[:, 0].reshape(-1, 1)).toarray()
    array([[0., 1., 0.],
           [0., 0., 1.],
           [1., 0., 0.]])

Note that we applied the OneHotEncoder to a single column only, (X[:, 0].reshape(-1, 1)), to avoid modifying the other two columns in the array as well. If we want to selectively transform columns in a multi-feature array, we can use the ColumnTransformer, which accepts a list of (name, transformer, column(s)) tuples as follows:

    >>> from sklearn.compose import ColumnTransformer
    >>> X = df[['color', 'size', 'price']].values
    >>> c_transf = ColumnTransformer([
    ...     ('onehot', OneHotEncoder(), [0]),
    ...     ('nothing', 'passthrough', [1, 2])
    ... ])
    >>> c_transf.fit_transform(X).astype(float)
    array([[0.0, 1.0, 0.0, 1, 10.1],
           [0.0, 0.0, 1.0, 2, 13.5],
           [1.0, 0.0, 0.0, 3, 15.3]])

In the preceding code example, we specified that we only want to modify the first column and leave the other two columns untouched via the 'passthrough' argument.

An even more convenient way to create those dummy features via one-hot encoding is to use the get_dummies function implemented in pandas. Applied to a DataFrame, get_dummies will only convert string columns and leave all other columns unchanged:

    >>> pd.get_dummies(df[['price', 'color', 'size']])
       price  size  color_blue  color_green  color_red
    0   10.1     1           0            1          0
    1   13.5     2           0            0          1
    2   15.3     3           1            0          0

When we use one-hot encoded datasets, we have to keep in mind that the encoding introduces multicollinearity, which can be an issue for certain methods (for instance, methods that require matrix inversion).
If features are highly correlated, matrices are computationally difficult to invert, which can lead to numerically unstable estimates. To reduce the correlation among variables, we can simply remove one feature column from the one-hot encoded array. Note that we do not lose any important information by removing a feature column; for example, if we remove the column color_blue, the feature information is still preserved, since observing color_green=0 and color_red=0 implies that the observation must be blue.

If we use the get_dummies function, we can drop the first column by passing a True argument to the drop_first parameter, as shown in the following code example:

    >>> pd.get_dummies(df[['price', 'color', 'size']],
    ...                drop_first=True)
       price  size  color_green  color_red
    0   10.1     1            1          0
    1   13.5     2            0          1
    2   15.3     3            0          0

In order to drop a redundant column via the OneHotEncoder, we need to set drop='first' and set categories='auto' as follows:

    >>> color_ohe = OneHotEncoder(categories='auto', drop='first')
    >>> c_transf = ColumnTransformer([
    ...            ('onehot', color_ohe, [0]),
    ...            ('nothing', 'passthrough', [1, 2])
    ... ])
    >>> c_transf.fit_transform(X).astype(float)
    array([[ 1. ,  0. ,  1. , 10.1],
           [ 0. ,  1. ,  2. , 13.5],
           [ 0. ,  0. ,  3. , 15.3]])

In this article, we have gone through some of the methods to deal with categorical data in datasets. We distinguished between nominal and ordinal features, and with examples we explained how they can be handled. To harness the power of the latest Python open source libraries in machine learning, check out the book Python Machine Learning - Third Edition, written by Sebastian Raschka and Vahid Mirjalili.

Richard Gall
11 Apr 2018
3 min read

What is LSTM?

What does LSTM stand for?

LSTM stands for long short-term memory. It is a model or architecture that extends the memory of recurrent neural networks. Typically, recurrent neural networks have 'short-term memory' in that they carry persistent previous information forward for use in the current step. Essentially, the previous information is used in the present task. That means we do not have a list of all of the previous information available for the neural node.

How does LSTM work?

LSTM introduces long-term memory into recurrent neural networks. It mitigates the vanishing gradient problem, which is where the neural network stops learning because the updates to the various weights within a given neural network become smaller and smaller. It does this by using a series of 'gates'. These are contained in memory blocks which are connected through layers. There are three types of gates within a unit:

    Input Gate: Scales input to cell (write)
    Output Gate: Scales output from cell (read)
    Forget Gate: Scales old cell value (reset)

Each gate is like a switch that controls the read/write, thus incorporating the long-term memory function into the model.

Applications of LSTM

There is a huge range of ways that LSTM can be used, including:

    Handwriting recognition
    Time series anomaly detection
    Speech recognition
    Learning grammar
    Composing music

The difference between LSTM and GRU

There are many similarities between LSTM and GRU (Gated Recurrent Units). However, there are some important differences that are worth remembering:

    A GRU has two gates, whereas an LSTM has three gates.
    GRUs don't possess any internal memory that is different from the exposed hidden state. They don't have the output gate, which is present in LSTMs.
    There is no second nonlinearity applied when computing the output in GRU.

GRU, as a concept, is a little newer than LSTM.
It is generally more efficient: it trains models at a quicker rate than LSTM. It is also easier to use. Any modifications you need to make to a model can be done fairly easily. However, LSTM should perform better than GRU where longer-term memory is required. Ultimately, comparing performance is going to depend on the data set you are using.
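To make the three gates described earlier concrete, here is a minimal sketch of a single LSTM step for a scalar input and a scalar state, using only the Python standard library. It follows the common textbook formulation (sigmoid gates, tanh candidate); the function name lstm_step and the weight layout are assumptions made for this illustration, and real implementations work on vectors and learn the weights.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a scalar LSTM cell.

    weights maps each of 'input', 'forget', 'output' and 'candidate'
    to a tuple (w_x, w_h, bias). Returns the new (hidden, cell) state.
    """
    def pre_activation(name):
        w_x, w_h, bias = weights[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))      # input gate: how much to write
    f = sigmoid(pre_activation("forget"))     # forget gate: how much old cell value to keep
    o = sigmoid(pre_activation("output"))     # output gate: how much of the cell to read out
    g = math.tanh(pre_activation("candidate"))  # candidate value to write

    c = f * c_prev + i * g                    # new cell state (the long-term memory)
    h = o * math.tanh(c)                      # new hidden state (the exposed output)
    return h, c


# Gates forced by large biases: keep the old cell value and write nothing new.
remember = {"input": (0, 0, -20), "forget": (0, 0, 20),
            "output": (0, 0, 20), "candidate": (1, 0, 0)}
h_keep, c_keep = lstm_step(1.0, 0.0, 0.7, remember)

# Opposite setting: drop the old cell value and write the candidate instead.
reset = {"input": (0, 0, 20), "forget": (0, 0, -20),
         "output": (0, 0, 20), "candidate": (1, 0, 0)}
h_new, c_new = lstm_step(1.0, 0.0, 0.7, reset)

print(round(c_keep, 4), round(c_new, 4))
```

With the forget gate held open and the input gate closed, the cell value passes through the step essentially unchanged, which is the mechanism that lets information survive over many steps.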
Savia Lobo
03 Oct 2018
12 min read

How to deploy Serverless Applications in Go using AWS Lambda [Tutorial]

Building a serverless application allows you to focus on your application code instead of managing and operating infrastructure. If you choose AWS for this purpose, you do not have to think about provisioning or configuring servers since AWS will handle all of this for you. This reduces your infrastructure management burden and helps you get faster time-to-market. This tutorial is an excerpt taken from the book Hands-On Serverless Applications with Go written by Mohamed Labouardy. In this book, you will learn how to design and build a production-ready application in Go using AWS serverless services with zero upfront infrastructure investment. This article will cover the following points: Build, deploy, and manage our Lambda functions going through some advanced AWS CLI commands Publish multiple versions of the API Learn how to separate multiple deployment environments (sandbox, staging, and production) with aliases Cover the usage of the API Gateway stage variables to change the method endpoint's behavior. Lambda CLI commands In this section, we will go through the various AWS Lambda commands that you might use while building your Lambda functions. We will also learn how you can use them to automate your deployment process. The list-functions command As its name implies, it lists all Lambda functions in the AWS region you provided. 
The following command will return all Lambda functions in the North Virginia region:

    aws lambda list-functions --region us-east-1

For each function, the response includes the function's configuration information (FunctionName, Resources usage, Environment variables, IAM Role, Runtime environment, and so on). To list only some attributes, such as the function name, you can use the query filter option, as follows:

    aws lambda list-functions --query Functions[].FunctionName[]

The create-function command

You should be familiar with this command, as it has been used multiple times to create a new Lambda function from scratch. In addition to the function's configuration, you can use the command to provide the deployment package (ZIP) in two ways:

ZIP file: It provides the path to the ZIP file of the code you are uploading with the --zip-file option:

    aws lambda create-function --function-name UpdateMovie \
        --description "Update an existing movie" \
        --runtime go1.x \
        --role arn:aws:iam::ACCOUNT_ID:role/UpdateMovieRole \
        --handler main \
        --environment Variables={TABLE_NAME=movies} \
        --zip-file fileb://./deployment.zip \
        --region us-east-1

S3 Bucket object: It provides the S3 bucket and object name with the --code option:

    aws lambda create-function --function-name UpdateMovie \
        --description "Update an existing movie" \
        --runtime go1.x \
        --role arn:aws:iam::ACCOUNT_ID:role/UpdateMovieRole \
        --handler main \
        --environment Variables={TABLE_NAME=movies} \
        --code S3Bucket=movies-api-deployment-package,S3Key=deployment.zip \
        --region us-east-1

The preceding commands will return a summary of the function's settings in a JSON format.

It's worth mentioning that while creating your Lambda function, you might override the compute usage and network settings based on your function's behavior with the following options:

--timeout: The default execution timeout is three seconds.
When the three seconds are reached, AWS Lambda terminates your function. The maximum timeout you can set is five minutes at the time of writing (AWS has since raised this limit to 15 minutes).
--memory-size: The amount of memory given to your function when executed. The default value is 128 MB and, at the time of writing, the maximum is 3,008 MB (increments of 64 MB).
--vpc-config: This deploys the Lambda function in a private VPC. While it might be useful if the function requires communication with internal resources, it should ideally be avoided, as it impacts the Lambda performance and scaling.

AWS doesn't allow you to set the CPU usage of your function, as it's calculated automatically based on the memory allocated for your function. CPU usage is proportional to the memory.

The update-function-code command

In addition to the AWS Management Console, you can update your Lambda function's code with the AWS CLI. The command requires the target Lambda function name and the new deployment package. Similarly to the previous command, you can provide the package as follows:

The path to the new .zip file:

    aws lambda update-function-code --function-name UpdateMovie \
        --zip-file fileb://./deployment-1.0.0.zip \
        --region us-east-1

The S3 bucket where the .zip file is stored:

    aws lambda update-function-code --function-name UpdateMovie \
        --s3-bucket movies-api-deployment-packages \
        --s3-key deployment-1.0.0.zip \
        --region us-east-1

This operation prints a new unique ID (called RevisionId) for each change in the Lambda function's code.

The get-function-configuration command

In order to retrieve the configuration information of a Lambda function, issue the following command:

    aws lambda get-function-configuration --function-name UpdateMovie --region us-east-1

The preceding command will provide the same information in the output that was displayed when the create-function command was used. To retrieve configuration information for a specific Lambda version or alias (see the following section), you can use the --qualifier option.
The invoke command

So far, we invoked our Lambda functions directly from the AWS Lambda Console and through HTTP events with API Gateway. In addition to that, Lambda can be invoked from the AWS CLI with the invoke command:

    aws lambda invoke --function-name UpdateMovie result.json

The preceding command will invoke the UpdateMovie function and save the function's output in the result.json file. The status code is 400, which is normal, as UpdateMovie is expecting a JSON input. Let's see how to provide a JSON to our function with the invoke command. Head back to the DynamoDB movies table, and pick a movie that you want to update. In this example, we will update the movie with the ID 13. Create a JSON file with a body attribute that contains the new movie item attributes, as the Lambda function is expecting the input to be in the API Gateway Proxy request format:

    { "body": "{\"id\":\"13\", \"name\":\"Deadpool 2\"}" }

Finally, run the invoke command again with the JSON file as the input parameter:

    aws lambda invoke --function-name UpdateMovie --payload file://input.json result.json

If you print the result.json content, the updated movie should be returned. You can verify that the movie's name is updated in the DynamoDB table by invoking the FindAllMovies function:

    aws lambda invoke --function-name FindAllMovies result.json

The body attribute should contain the newly updated movie. Head back to the DynamoDB Console; the movie with the ID of 13 should have a new name.

The delete-function command

To delete a Lambda function, you can use the following command:

    aws lambda delete-function --function-name UpdateMovie

By default, the command will delete all function versions and aliases. To delete a specific version or alias, you might want to use the --qualifier option.
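Returning to the input.json payload used with the invoke command: the escaping is easy to get wrong by hand, because the body attribute is itself a JSON document stored as a string. The following Python sketch builds the payload with the standard json module. It is only an illustration; the helper name build_proxy_event is made up, and it produces just the body attribute that the example relies on rather than a full API Gateway proxy event.

```python
import json


def build_proxy_event(movie):
    """Wrap a movie dict in a minimal proxy-style event.

    The movie is serialized to a JSON string first, because the function
    expects body to be a string, not a nested object.
    """
    return {"body": json.dumps(movie)}


event = build_proxy_event({"id": "13", "name": "Deadpool 2"})

# Serializing the event escapes the inner quotes automatically.
payload = json.dumps(event)
print(payload)
```

Redirecting the printed output to input.json gives a file that can be passed to the invoke command in the same way as the hand-written one.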
By now, you should be familiar with the AWS CLI commands you are likely to need while building your serverless applications in AWS Lambda. In the upcoming section, we will see how to create different versions of your Lambda functions and maintain multiple environments with aliases.

Versions and aliases

When you're building your serverless application, you must separate your deployment environments to test new changes without impacting production. Therefore, having multiple versions of your Lambda functions makes sense.

Versioning

A version represents a state of your function's code and configuration in time. By default, each Lambda function has the $LATEST version pointing to the latest changes of your function, as shown in the following screenshot:

In order to create a new version from the $LATEST version, click on Actions and Publish new version. Let's call it 1.0.0, as shown in the next screenshot:

The new version will be created with an ID=1 (incremental). Note the Lambda function's ARN at the top of the window in the following screenshot; it includes the version ID:

Once the version is created, you cannot update the function code, shown as follows:

Moreover, advanced settings, such as IAM roles, network configuration, and compute usage, cannot be changed, shown as follows:

Versions are immutable, which means they cannot be changed once they're published; only the $LATEST version is editable.

Now, we know how to publish a new version from the console. Let's publish a new version with the AWS CLI. But first, we need to update the FindAllMovies function, as we cannot publish a new version if no changes were made to $LATEST since publishing version 1.0.0. The new version will have a pagination system. The function will return only the number of items requested by the user.
The following code will read the Count header parameter, convert it to a number, and use the Scan operation with the Limit parameter to fetch the movies from DynamoDB:

    func findAll(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
        // Reject the request early if the Count header is missing or not numeric.
        size, err := strconv.Atoi(request.Headers["Count"])
        if err != nil {
            return events.APIGatewayProxyResponse{
                StatusCode: http.StatusBadRequest,
                Body:       "Count Header should be a number",
            }, nil
        }

        ...

        svc := dynamodb.New(cfg)
        req := svc.ScanRequest(&dynamodb.ScanInput{
            TableName: aws.String(os.Getenv("TABLE_NAME")),
            Limit:     aws.Int64(int64(size)),
        })

        ...
    }

Next, we update the FindAllMovies Lambda function's code with the update-function-code command:

    aws lambda update-function-code --function-name FindAllMovies \
        --zip-file fileb://./deployment.zip

Then, publish a new version, 1.1.0, based on the current configuration and code with the following command:

    aws lambda publish-version --function-name FindAllMovies --description 1.1.0

Go back to the AWS Lambda Console and navigate to your FindAllMovies function; a new version should be created with a new ID=2, as shown in the following screenshot:

Now that our versions are created, let's test them out by using the AWS CLI invoke command.

FindAllMovies v1.0.0

Invoke the FindAllMovies v1.0.0 version with its ID in the qualifier parameter with the following command:

    aws lambda invoke --function-name FindAllMovies --qualifier 1 result.json

result.json should have all the movies in the DynamoDB movies table, shown as follows:

The output showing all the movies in the DynamoDB movies table

To know more about the output in the FindAllMovies v1.1.0 and more about Semantic versioning, head over to the book.

Aliases

An alias is a pointer to a specific version. It allows you to promote a function from one environment to another (such as staging to production). Aliases are mutable, unlike versions, which are immutable.
To illustrate the concept of aliases, we will create two aliases, as illustrated in the following diagram: a Production alias pointing to the FindAllMovies Lambda function's 1.0.0 version, and a Staging alias that points to the function's 1.1.0 version. Then, we will configure API Gateway to use these aliases instead of the $LATEST version:

Head back to the FindAllMovies configuration page. If you click on the Qualifiers drop-down list, you should see a default alias called Unqualified pointing to your $LATEST version, as shown in the following screenshot:

To create a new alias, click on Actions and then Create a new alias called Staging. Select the 5 version as the target, shown as follows:

Once created, the new alias should be added to the list of Aliases, shown as follows:

Next, create a new alias for the Production environment that points to version 1.0.0 using the AWS command line:

    aws lambda create-alias --function-name FindAllMovies \
        --name Production --description "Production environment" \
        --function-version 1

Similarly, the new alias should be successfully created:

Now that our aliases have been created, let's configure API Gateway to use those aliases with stage variables.

Stage variables

Stage variables are environment variables that can be used to change the runtime behavior of the API Gateway methods for each deployment stage. The following section will illustrate how to use stage variables with API Gateway.
On the API Gateway Console, navigate to the Movies API, click on the GET method, and update the target Lambda function to use a stage variable instead of a hardcoded Lambda function name, as shown in the following screenshot:

When you save it, a new prompt will ask you to grant API Gateway the permissions to call your Lambda function aliases, as shown in the following screenshot:

Execute the following commands to allow API Gateway to invoke the Production and Staging aliases:

Production alias:

    aws lambda add-permission --function-name "arn:aws:lambda:us-east-1:ACCOUNT_ID:function:FindAllMovies:Production" \
        --source-arn "arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID/*/GET/movies" \
        --principal apigateway.amazonaws.com \
        --statement-id STATEMENT_ID \
        --action lambda:InvokeFunction

Staging alias:

    aws lambda add-permission --function-name "arn:aws:lambda:us-east-1:ACCOUNT_ID:function:FindAllMovies:Staging" \
        --source-arn "arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID/*/GET/movies" \
        --principal apigateway.amazonaws.com \
        --statement-id STATEMENT_ID \
        --action lambda:InvokeFunction

Then, create a new stage called production, as shown in the next screenshot:

Next, click on the Stage Variables tab, create a new stage variable called lambda, and set FindAllMovies:Production as its value, shown as follows:

Do the same for the staging environment, with the lambda variable pointing to the Lambda function's Staging alias, shown as follows:

To test the endpoint, use the cURL command or any REST client you're familiar with. I use Postman. A GET request to the invoke URL of the API Gateway's production stage should return all the movies in the database, shown as follows:

Do the same for the staging environment, this time adding a Count header set to 4; you should get only four movie items in return, shown as follows:

That's how you can maintain multiple environments of your Lambda functions.
You can now easily promote the 1.1.0 version into production by changing the Production pointer to point to 1.1.0 instead of 1.0.0, and roll back in case of failure to the previous working version without changing the API Gateway settings. To summarize, we learned about how to deploy serverless applications using the AWS Lambda functions. If you've enjoyed reading this and want to know more about how to set up a CI/CD pipeline from scratch to automate the process of deploying Lambda functions to production with Go programming language, check out our book, Hands-On Serverless Applications with Go.

Packt
14 Apr 2016
13 min read

Step Detector and Step Counters Sensors

In this article by Varun Nagpal, author of the book Android Sensor Programming By Example, we will focus on learning about the use of the step detector and step counter sensors. These sensors are very similar to each other and are used to count steps. Both sensors are based on a common hardware sensor, which internally uses the accelerometer, but Android still treats them as logically separate sensors. Both of these sensors are highly battery optimized and consume very little power. Now, let's look at each individual sensor in detail.

The step counter sensor

The step counter sensor is used to get the total number of steps taken by the user since the last reboot (power on) of the phone. When the phone is restarted, the value of the step counter sensor is reset to zero. In the onSensorChanged() method, the number of steps is given by event.values[0]; although it's a float value, the fractional part is always zero. The event timestamp represents the time at which the last step was taken. This sensor is especially useful for those applications that don't want to run in the background and maintain the history of steps themselves. This sensor works in batches and in continuous mode.
If we specify 0 or no latency in the SensorManager.registerListener() method, then it works in continuous mode; otherwise, if we specify any latency, then it groups the events in batches and reports them at the specified latency. For prolonged usage of this sensor, it's recommended to use the batch mode, as it saves power. The step counter uses the on-change reporting mode, which means it reports the event as soon as there is a change in the value.

The step detector sensor

The step detector sensor triggers an event each time a step is taken by the user. The value reported in the onSensorChanged() method is always 1.0, and the event timestamp is the time when the user's foot hit the ground. The step detector sensor has very low latency in reporting the steps, which is generally within 1 to 2 seconds. The step detector sensor has lower accuracy and produces more false positives than the step counter sensor. The step counter sensor is more accurate, but has more latency in reporting the steps, as it uses this extra time after each step to remove any false positive values. The step detector sensor is recommended for those applications that want to track the steps in real time and want to maintain their own history of each and every step with their timestamp.

Time for action – using the step counter sensor in activity

Now, you will learn how to use the step counter sensor with a simple example. The good thing about the step counter is that, unlike other sensors, your app doesn't need to tell the sensor when to start counting the steps and when to stop counting them. It automatically starts counting as soon as the phone is powered on. To use it, we just have to register the listener with the sensor manager and then unregister it when we are done. In the following example, we will show the total number of steps taken by the user since the last reboot (power on) of the phone in the Android activity.
We created a PedometerActivity and implemented it with the SensorEventListener interface, so that it can receive the sensor events. We initiated the SensorManager and Sensor object of the step counter and also checked the sensor availability in the onCreate() method of the activity. We registered the listener in the onResume() method and unregistered it in the onPause() method as a standard practice. We used a TextView to display the total number of steps taken and update its latest value in the onSensorChanged() method.

    import android.app.Activity;
    import android.content.Context;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;
    import android.os.Bundle;
    import android.widget.TextView;

    public class PedometerActivity extends Activity implements SensorEventListener {

        private SensorManager mSensorManager;
        private Sensor mSensor;
        private boolean isSensorPresent = false;
        private TextView mStepsSinceReboot;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_pedometer);
            mStepsSinceReboot = (TextView) findViewById(R.id.stepssincereboot);
            mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE);
            // Only use the step counter if the device actually provides one.
            if (mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER) != null) {
                mSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER);
                isSensorPresent = true;
            } else {
                isSensorPresent = false;
            }
        }

        @Override
        protected void onResume() {
            super.onResume();
            if (isSensorPresent) {
                mSensorManager.registerListener(this, mSensor, SensorManager.SENSOR_DELAY_NORMAL);
            }
        }

        @Override
        protected void onPause() {
            super.onPause();
            if (isSensorPresent) {
                mSensorManager.unregisterListener(this);
            }
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            // values[0] holds the total number of steps since the last reboot.
            mStepsSinceReboot.setText(String.valueOf(event.values[0]));
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) {
            // Required by SensorEventListener; not used in this example.
        }
    }

Time for action – maintaining step history with step detector sensor

The step counter sensor works well when we have to deal with the total number of steps taken by the user since the last reboot (power on) of the phone. It doesn't serve the purpose when we have to maintain a history of each and every step taken by the user.
The step counter sensor may combine some steps and process them together, and it will only update with an aggregated count instead of reporting individual step details. For such cases, the step detector sensor is the right choice. In our next example, we will use the step detector sensor to store the details of each step taken by the user, and we will show the total number of steps for each day since the application was installed.

Our next example will consist of three major components of Android, namely a service, an SQLite database, and an activity. The Android service will be used to listen to all the individual step details using the step detector sensor when the app is in the background. All the individual step details will be stored in the SQLite database, and finally the activity will be used to display the list of total number of steps along with dates. Let's look at each component in detail.

The first component of our example is PedometerListActivity. We created a ListView in the activity to display the step count along with dates. Inside the onCreate() method of PedometerListActivity, we initiated the ListView and ListAdapter required to populate the list. Another important task that we do in the onCreate() method is starting the service (StepsService.class), which will listen to all the individual steps' events. We also make a call to the getDataForList() method, which is responsible for fetching the data for ListView.
    import android.app.Activity;
    import android.content.Intent;
    import android.os.Bundle;
    import android.widget.ListView;

    public class PedometerListActivity extends Activity {

        private ListView mSensorListView;
        private ListAdapter mListAdapter;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            mSensorListView = (ListView) findViewById(R.id.steps_list);
            // Load the stored step counts before the adapter is attached.
            getDataForList();
            mListAdapter = new ListAdapter();
            mSensorListView.setAdapter(mListAdapter);
            // Start the background service that listens for individual steps.
            Intent mStepsIntent = new Intent(getApplicationContext(), StepsService.class);
            startService(mStepsIntent);
        }

In our example, the DateStepsModel class is used as a POJO (Plain Old Java Object) class, which is a handy way of grouping logical data together, to store the total number of steps and date. We also use the StepsDBHelper class to read and write the steps data in the database (discussed further in the next section). Inside the getDataForList() method, we initiated the object of the StepsDBHelper class and call the readStepsEntries() method of the StepsDBHelper class, which returns an ArrayList of the DateStepsModel objects containing the total number of steps along with dates after reading from the database. The ListAdapter class is used for populating the values for ListView, which internally uses the ArrayList of DateStepsModel as the data source. The individual list item is a string, which is the concatenation of the date and the total number of steps.
    // POJO holding the total step count for one date.
    class DateStepsModel {
        public String mDate;
        public int mStepCount;
    }

    // Remaining members of PedometerListActivity.
        private StepsDBHelper mStepsDBHelper;
        private ArrayList<DateStepsModel> mStepCountList;

        public void getDataForList() {
            mStepsDBHelper = new StepsDBHelper(this);
            mStepCountList = mStepsDBHelper.readStepsEntries();
        }

        private class ListAdapter extends BaseAdapter {

            private TextView mDateStepCountText;

            @Override
            public int getCount() {
                return mStepCountList.size();
            }

            @Override
            public Object getItem(int position) {
                return mStepCountList.get(position);
            }

            @Override
            public long getItemId(int position) {
                return position;
            }

            @Override
            public View getView(int position, View convertView, ViewGroup parent) {
                // Reuse the recycled row view when one is available.
                if (convertView == null) {
                    convertView = getLayoutInflater().inflate(R.layout.list_rows, parent, false);
                }
                mDateStepCountText = (TextView) convertView.findViewById(R.id.sensor_name);
                mDateStepCountText.setText(mStepCountList.get(position).mDate + " - Total Steps: "
                        + String.valueOf(mStepCountList.get(position).mStepCount));
                return convertView;
            }
        }
    } // end of PedometerListActivity

The second component of our example is StepsService, which runs in the background and listens to the step detector sensor until the app is uninstalled. We implemented this service with the SensorEventListener interface so that it can receive the sensor events. We also initiated the objects of StepsDBHelper, SensorManager, and the step detector sensor inside the onCreate() method of the service. We only register the listener when the step detector sensor is available on the device. A point to note here is that we never unregister the listener, because we expect our app to log the step information indefinitely until the app is uninstalled. Both the step detector and step counter sensors consume very little battery and are highly optimized at the hardware level, so if the app really requires it, it can use them for longer durations without affecting the battery consumption much.
We get a step detector sensor callback in the onSensorChanged() method whenever the operating system detects a step, and from there, we call the createStepsEntry() method of the StepsDBHelper class to store the step information in the database.

    import android.app.Service;
    import android.content.Context;
    import android.content.Intent;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;
    import android.os.IBinder;

    public class StepsService extends Service implements SensorEventListener {

        private SensorManager mSensorManager;
        private Sensor mStepDetectorSensor;
        private StepsDBHelper mStepsDBHelper;

        @Override
        public void onCreate() {
            super.onCreate();
            mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE);
            // Register the listener only if the device has a step detector.
            if (mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR) != null) {
                mStepDetectorSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR);
                mSensorManager.registerListener(this, mStepDetectorSensor, SensorManager.SENSOR_DELAY_NORMAL);
                mStepsDBHelper = new StepsDBHelper(this);
            }
        }

        @Override
        public int onStartCommand(Intent intent, int flags, int startId) {
            return Service.START_STICKY;
        }

        @Override
        public IBinder onBind(Intent intent) {
            // Required by Service; this is a started service, so binding is not supported.
            return null;
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            // One event per detected step: record it against today's date.
            mStepsDBHelper.createStepsEntry();
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) {
            // Required by SensorEventListener; not used in this example.
        }
    }

The last component of our example is the SQLite database. We created a StepsDBHelper class and extended it from the SQLiteOpenHelper abstract utility class provided by the Android framework to easily manage database operations. In the class, we created a database called StepsDatabase, which is automatically created on the first object creation of the StepsDBHelper class by the onCreate() method. This database has one table, StepsSummary, which consists of only three columns (id, stepscount, and creationdate). The first column, id, is the unique integer identifier for each row of the table and is incremented automatically on creation of every new row. The second column, stepscount, is used to store the total number of steps taken for each date. The third column, creationdate, is used to store the date in the mm/dd/yyyy string format.
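The mm/dd/yyyy key stored in creationdate can be produced in several ways. The following is a minimal sketch, assuming a hypothetical StepDateKey helper that is not part of the original example. It uses SimpleDateFormat so that the month comes out one-based and zero-padded (Calendar.MONTH on its own is zero-based, so January is 0), which means its keys look like 04/14/2016 rather than 4/14/2016.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

// Hypothetical helper (not part of the original example) that builds the
// date key used to group steps by day.
final class StepDateKey {

    private StepDateKey() {
        // Utility class; no instances.
    }

    // Formats the given calendar as a zero-padded MM/dd/yyyy string,
    // interpreted in the calendar's own time zone.
    static String format(Calendar calendar) {
        SimpleDateFormat formatter = new SimpleDateFormat("MM/dd/yyyy", Locale.US);
        formatter.setTimeZone(calendar.getTimeZone());
        return formatter.format(calendar.getTime());
    }

    // Convenience method returning the key for the current date.
    static String today() {
        return format(Calendar.getInstance());
    }
}
```

Calling StepDateKey.today() wherever a date key is needed would keep the format in one place. Whichever format is chosen, it has to be used consistently for both writing and querying rows, since the lookup is a plain string comparison.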
Inside the createStepsEntry() method, we first check whether there is an existing step count with the current date, and if we find one, then we read the existing step count of the current date and update the step count by incrementing it by 1. If there is no step count with the current date found, then we assume that it is the first step of the current date and we create a new entry in the table with the current date and step count value as 1. The createStepsEntry() method is called from onSensorChanged() of the StepsService class whenever a new step is detected by the step detector sensor.

    import android.content.ContentValues;
    import android.content.Context;
    import android.database.Cursor;
    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteOpenHelper;
    import java.util.ArrayList;
    import java.util.Calendar;

    public class StepsDBHelper extends SQLiteOpenHelper {

        private static final int DATABASE_VERSION = 1;
        private static final String DATABASE_NAME = "StepsDatabase";
        private static final String TABLE_STEPS_SUMMARY = "StepsSummary";
        private static final String ID = "id";
        private static final String STEPS_COUNT = "stepscount";
        private static final String CREATION_DATE = "creationdate"; // Date format is mm/dd/yyyy

        private static final String CREATE_TABLE_STEPS_SUMMARY = "CREATE TABLE " + TABLE_STEPS_SUMMARY
                + "(" + ID + " INTEGER PRIMARY KEY AUTOINCREMENT,"
                + CREATION_DATE + " TEXT,"
                + STEPS_COUNT + " INTEGER" + ")";

        StepsDBHelper(Context context) {
            super(context, DATABASE_NAME, null, DATABASE_VERSION);
        }

        @Override
        public void onCreate(SQLiteDatabase db) {
            db.execSQL(CREATE_TABLE_STEPS_SUMMARY);
        }

        @Override
        public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
            // Required by SQLiteOpenHelper; nothing to migrate while the schema is at version 1.
        }

        public boolean createStepsEntry() {
            boolean isDateAlreadyPresent = false;
            boolean createSuccessful = false;
            int currentDateStepCounts = 0;
            Calendar mCalendar = Calendar.getInstance();
            // Calendar.MONTH is zero-based (January is 0), so add 1 to store the calendar month.
            String todayDate = String.valueOf(mCalendar.get(Calendar.MONTH) + 1) + "/"
                    + String.valueOf(mCalendar.get(Calendar.DAY_OF_MONTH)) + "/"
                    + String.valueOf(mCalendar.get(Calendar.YEAR));
            String selectQuery = "SELECT " + STEPS_COUNT + " FROM " + TABLE_STEPS_SUMMARY
                    + " WHERE " + CREATION_DATE + " = '" + todayDate + "'";
            // First pass: look for an existing row for today's date.
            try {
                SQLiteDatabase db = this.getReadableDatabase();
                Cursor c = db.rawQuery(selectQuery, null);
                if (c.moveToFirst()) {
                    do {
                        isDateAlreadyPresent = true;
                        currentDateStepCounts = c.getInt((c.getColumnIndex(STEPS_COUNT)));
                    } while (c.moveToNext());
                }
                c.close();
                db.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
            // Second pass: increment today's row, or insert it with a count of 1.
            try {
                SQLiteDatabase db = this.getWritableDatabase();
                ContentValues values = new ContentValues();
                values.put(CREATION_DATE, todayDate);
                if (isDateAlreadyPresent) {
                    values.put(STEPS_COUNT, ++currentDateStepCounts);
                    int row = db.update(TABLE_STEPS_SUMMARY, values,
                            CREATION_DATE + " = '" + todayDate + "'", null);
                    if (row == 1) {
                        createSuccessful = true;
                    }
                    db.close();
                } else {
                    values.put(STEPS_COUNT, 1);
                    long row = db.insert(TABLE_STEPS_SUMMARY, null, values);
                    if (row != -1) {
                        createSuccessful = true;
                    }
                    db.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            return createSuccessful;
        }

The readStepsEntries() method is called from PedometerListActivity to display the total number of steps along with the date in the ListView. The readStepsEntries() method reads all the step counts along with their dates from the table and fills the ArrayList of DateStepsModel, which is used as a data source for populating the ListView in PedometerListActivity.

        public ArrayList<DateStepsModel> readStepsEntries() {
            ArrayList<DateStepsModel> mStepCountList = new ArrayList<DateStepsModel>();
            String selectQuery = "SELECT * FROM " + TABLE_STEPS_SUMMARY;
            try {
                SQLiteDatabase db = this.getReadableDatabase();
                Cursor c = db.rawQuery(selectQuery, null);
                if (c.moveToFirst()) {
                    do {
                        // Copy each row into a model object for the list.
                        DateStepsModel mDateStepsModel = new DateStepsModel();
                        mDateStepsModel.mDate = c.getString((c.getColumnIndex(CREATION_DATE)));
                        mDateStepsModel.mStepCount = c.getInt((c.getColumnIndex(STEPS_COUNT)));
                        mStepCountList.add(mDateStepsModel);
                    } while (c.moveToNext());
                }
                c.close();
                db.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
            return mStepCountList;
        }
    } // end of StepsDBHelper

What just happened?

We created a small pedometer utility app that maintains the step history along with dates using the step detector sensor.
We used PedometerListActivity to display the list of the total number of steps along with their dates. StepsService is used to listen to all the steps detected by the step detector sensor in the background. And finally, the StepsDBHelper class is used to create and update the total step count for each date and to read the total step counts along with dates from the database.
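The per-date bookkeeping that StepsDBHelper performs (increment today's count if a row exists, otherwise start at 1) can be illustrated without a database. The following is a minimal sketch, assuming a hypothetical in-memory DailyStepTally class that stands in for the StepsSummary table; it is not part of the original app and does not persist anything.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the StepsSummary table: one counter per date key.
final class DailyStepTally {

    // Insertion order is kept so that entries come back in the order dates were first seen.
    private final Map<String, Integer> stepsByDate = new LinkedHashMap<>();

    // Records one detected step for the given date key and returns the new total.
    // A date that has not been seen yet starts at 1, mirroring the insert branch;
    // an existing date is incremented by 1, mirroring the update branch.
    int recordStep(String dateKey) {
        return stepsByDate.merge(dateKey, 1, Integer::sum);
    }

    // Returns the total for a date, or 0 if no step was recorded on it.
    int stepsOn(String dateKey) {
        return stepsByDate.getOrDefault(dateKey, 0);
    }

    // Read-only view of all totals, comparable to what readStepsEntries() returns.
    Map<String, Integer> entries() {
        return Collections.unmodifiableMap(stepsByDate);
    }
}
```

A sketch like this can be handy for unit testing the counting logic on a plain JVM, since the real helper needs an Android Context and a device or emulator to run.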

Packt Editorial Staff
29 Mar 2019
9 min read

Brett Lantz on implementing a decision tree using C5.0 algorithm in R

Decision tree learners are powerful classifiers that utilize a tree structure to model the relationships among the features and the potential outcomes. This structure earned its name due to the fact that it mirrors the way a literal tree begins at a wide trunk and splits into narrower and narrower branches as it is followed upward. In much the same way, a decision tree classifier uses a structure of branching decisions that channel examples into a final predicted class value. In this article, we demonstrate the implementation of a decision tree using the C5.0 algorithm in R.

This article is taken from the book Machine Learning with R, Fourth Edition, written by Brett Lantz. This 10th Anniversary Edition of the classic R data science book is updated to R 4.0.0 with newer and better libraries. The book features several new chapters that reflect the progress of machine learning in the last few years and help you build your data science skills and tackle more challenging problems.

There are numerous implementations of decision trees, but the most well-known is the C5.0 algorithm. This algorithm was developed by computer scientist J. Ross Quinlan as an improved version of his prior algorithm, C4.5 (C4.5 itself is an improvement over his Iterative Dichotomiser 3 (ID3) algorithm). Although Quinlan markets C5.0 to commercial clients (see http://www.rulequest.com/ for details), the source code for a single-threaded version of the algorithm was made public, and has therefore been incorporated into programs such as R.

The C5.0 decision tree algorithm

The C5.0 algorithm has become the industry standard for producing decision trees because it does well for most types of problems directly out of the box. Compared to other advanced machine learning models, the decision trees built by C5.0 generally perform nearly as well but are much easier to understand and deploy.
Additionally, as shown in the following table, the algorithm's weaknesses are relatively minor and can be largely avoided.

Strengths
  • An all-purpose classifier that does well on many types of problems.
  • Highly automatic learning process, which can handle numeric or nominal features, as well as missing data.
  • Excludes unimportant features.
  • Can be used on both small and large datasets.
  • Results in a model that can be interpreted without a mathematical background (for relatively small trees).
  • More efficient than other complex models.

Weaknesses
  • Decision tree models are often biased toward splits on features having a large number of levels.
  • It is easy to overfit or underfit the model.
  • Can have trouble modeling some relationships due to reliance on axis-parallel splits.
  • Small changes in training data can result in large changes to decision logic.
  • Large trees can be difficult to interpret and the decisions they make may seem counterintuitive.

To keep things simple, our earlier decision tree example ignored the mathematics involved with how a machine would employ a divide and conquer strategy. Let's explore this in more detail to examine how this heuristic works in practice.

Choosing the best split

The first challenge that a decision tree will face is to identify which feature to split upon. In the previous example, we looked for a way to split the data such that the resulting partitions contained examples primarily of a single class. The degree to which a subset of examples contains only a single class is known as purity, and any subset composed of only a single class is called pure.

There are various measurements of purity that can be used to identify the best decision tree splitting candidate. C5.0 uses entropy, a concept borrowed from information theory that quantifies the randomness, or disorder, within a set of class values.
Sets with high entropy are very diverse and provide little information about other items that may also belong in the set, as there is no apparent commonality. The decision tree hopes to find splits that reduce entropy, ultimately increasing homogeneity within the groups.

Typically, entropy is measured in bits. If there are only two possible classes, entropy values can range from 0 to 1. For n classes, entropy ranges from 0 to log2(n). In each case, the minimum value indicates that the sample is completely homogeneous, while the maximum value indicates that the data are as diverse as possible, and no group has even a small plurality.

In mathematical notation, entropy is specified as:

\[ \text{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2(p_i) \]

In this formula, for a given segment of data (S), the term c refers to the number of class levels, and p_i refers to the proportion of values falling into class level i. For example, suppose we have a partition of data with two classes: red (60 percent) and white (40 percent). We can calculate the entropy as:

    > -0.60 * log2(0.60) - 0.40 * log2(0.40)
    [1] 0.9709506

We can visualize the entropy for all possible two-class arrangements. If we know the proportion of examples in one class is x, then the proportion in the other class is (1 - x). Using the curve() function, we can then plot the entropy for all possible values of x:

    > curve(-x * log2(x) - (1 - x) * log2(1 - x),
        col = "red", xlab = "x", ylab = "Entropy", lwd = 4)

This results in the following figure:

Figure: The total entropy as the proportion of one class varies in a two-class outcome

As illustrated by the peak in entropy at x = 0.50, a 50-50 split results in the maximum entropy. As one class increasingly dominates the other, the entropy reduces to zero.

To use entropy to determine the optimal feature to split upon, the algorithm calculates the change in homogeneity that would result from a split on each possible feature, a measure known as information gain.
The information gain for a feature F is calculated as the difference between the entropy in the segment before the split (S_1) and the partitions resulting from the split (S_2):

\[ \text{InfoGain}(F) = \text{Entropy}(S_1) - \text{Entropy}(S_2) \]

One complication is that after a split, the data is divided into more than one partition. Therefore, the function to calculate Entropy(S_2) needs to consider the total entropy across all of the partitions. It does this by weighting each partition's entropy according to the proportion of all records falling into that partition. This can be stated in a formula as:

\[ \text{Entropy}(S) = \sum_{i=1}^{n} w_i \, \text{Entropy}(P_i) \]

In simple terms, the total entropy resulting from a split is the sum of the entropy of each of the n partitions (P_i), weighted by the proportion of examples falling in the partition (w_i).

The higher the information gain, the better a feature is at creating homogeneous groups after a split on that feature. If the information gain is zero, there is no reduction in entropy for splitting on this feature. On the other hand, the maximum information gain is equal to the entropy prior to the split. This would imply the entropy after the split is zero, which means that the split results in completely homogeneous groups.

The previous formulas assume nominal features, but decision trees use information gain for splitting on numeric features as well. To do so, a common practice is to test various splits that divide the values into groups greater than or less than a threshold. This reduces the numeric feature into a two-level categorical feature that allows information gain to be calculated as usual. The numeric cut point yielding the largest information gain is chosen for the split.

Note: Though it is used by C5.0, information gain is not the only splitting criterion that can be used to build decision trees. Other commonly used criteria are Gini index, chi-squared statistic, and gain ratio.
For a review of these (and many more) criteria, refer to An Empirical Comparison of Selection Measures for Decision-Tree Induction, Mingers, J, Machine Learning, 1989, Vol. 3, pp. 319-342. Pruning the decision tree As mentioned earlier, a decision tree can continue to grow indefinitely, choosing splitting features and dividing into smaller and smaller partitions until each example is perfectly classified or the algorithm runs out of features to split on. However, if the tree grows overly large, many of the decisions it makes will be overly specific and the model will be overfitted to the training data. The process of pruning a decision tree involves reducing its size such that it generalizes better to unseen data. One solution to this problem is to stop the tree from growing once it reaches a certain number of decisions or when the decision nodes contain only a small number of examples. This is called early stopping or prepruning the decision tree. As the tree avoids doing needless work, this is an appealing strategy. However, one downside to this approach is that there is no way to know whether the tree will miss subtle but important patterns that it would have learned had it grown to a larger size. An alternative, called post-pruning, involves growing a tree that is intentionally too large and pruning leaf nodes to reduce the size of the tree to a more appropriate level. This is often a more effective approach than prepruning because it is quite difficult to determine the optimal depth of a decision tree without growing it first. Pruning the tree later on allows the algorithm to be certain that all of the important data structures were discovered. Note: The implementation details of pruning operations are very technical and beyond the scope of this book. 
For a comparison of some of the available methods, see A Comparative Analysis of Methods for Pruning Decision Trees, Esposito, F, Malerba, D, Semeraro, G, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, Vol. 19, pp. 476-491. One of the benefits of the C5.0 algorithm is that it is opinionated about pruning: it takes care of many of the decisions automatically using fairly reasonable defaults. Its overall strategy is to post-prune the tree. It first grows a large tree that overfits the training data. Later, the nodes and branches that have little effect on the classification errors are removed. In some cases, entire branches are moved further up the tree or replaced by simpler decisions. These processes of grafting branches are known as subtree raising and subtree replacement, respectively. Getting the right balance of overfitting and underfitting is a bit of an art, but if model accuracy is vital, it may be worth investing some time with various pruning options to see if it improves the test dataset performance. To summarize, decision trees are widely used due to their high accuracy and ability to formulate a statistical model in plain language. Here, we looked at C5.0, a highly popular and easily configurable decision tree algorithm. The major strength of the C5.0 algorithm over other decision tree implementations is that it is very easy to adjust the training options. Harness the power of R to build flexible, effective, and transparent machine learning models with Brett Lantz’s latest book Machine Learning with R, Fourth Edition.

Bhagyashree R
27 Aug 2019
10 min read

“Rust is the future of systems programming, C is the new Assembly”: Intel principal engineer, Josh Triplett

At Open Source Technology Summit (OSTS) 2019, Josh Triplett, a Principal Engineer at Intel gave an insight into what Intel is contributing to bring the most loved language, Rust to full parity with C. In his talk titled Intel and Rust: the Future of Systems Programming, he also spoke about the history of systems programming, how C became the “default” systems programming language, what features of Rust gives it an edge over C, and much more. Until now, OSTS was Intel's closed event where the company's business and tech leaders come together to discuss the various trends, technologies, and innovations that will help shape the open-source ecosystem. However, this year was different as the company welcomed non-Intel attendees including media, partners, and developers for the first time. The event hosts keynotes, more than 50 technical sessions, panels, demos covering all the open source technologies Intel is involved in. These include integrated software stacks (edge, AI, infrastructure), firmware, embedded and IoT projects, and cloud system software. This year the event happened from May 14-16 at Stevenson, Washington. What is systems programming Systems programming is the development and management of software that serves as a platform for other software to be built upon. The system software also directly or closely interfaces with computer hardware in order to gain necessary performance and expose abstractions. Unlike application programming where software is created to provide services to the user, it aims to produce software that provides services to the computer hardware. Triplett broadly defines systems programming as “anything that isn't an app.” It includes things like BIOS, firmware, boot loaders, operating systems kernels, embedded and similar types of low-level code, virtual machine implementations. Triplett also counts a web browser as a system software as it is more than “just an app,” they are actually “platforms for websites and web apps,” he says. 
How C became the “default” systems programming language Previously, most system software including BIOS, boot loaders, and firmware were written in Assembly. In the 1960s, experiments to bring hardware support in high-level languages started, which resulted in the creation of languages such as PL/S, BLISS, BCPL, and extended ALGOL. Then in the 1970s, Dennis Ritchie created the C programming language for the Unix operating system. Derived from the typeless B programming language, C was packed with powerful high-level functionalities and detailed features that were best suited for writing an operating system. Several UNIX components including its kernel were eventually rewritten in C. Many other system software including the Oracle database, a large portion of Windows source code, Linux operating system, were all written in C. C was seeing a huge adoption at this point. But, what exactly made developers comfortable moving to C? Triplett believes that in order to make this move from one language to another, developers have to be comfortable in terms of two things: features and parity. First, the language should offer “sufficiently compelling” features. “It can’t just be a little bit better. It has to be substantially better to warrant the effort and engineering time needed to move,” he adds. As compared to Assembly, C had a lot to offer. It had some degree of type safety, provided portability, better productivity with high-level constructs, and much more readable code. Second, the language has to provide parity, which means developers had to be confident that it is no less capable than Assembly. He states, “It can’t just be better, it also has to be no worse.” In addition to being faster and expressing any type of data that Assembly was able to, it also had what Triplett calls “escape hatch.”  This means you are allowed to make the move incrementally and also combine Assembly if required. Triplett believes that C is now becoming what Assembly was years ago. 
“C is the new Assembly,” he concludes. Developers are looking for a high-level language that not only addresses the problems in C that can’t be fixed but also leverage other exciting features that these languages provide. Such a language that aims to be compelling enough to make developers move from C should be memory safe, provide automatic memory management,  security, and much more. “Any language that wants to be better than C has to offer a lot more than just protection from buffer overflows if it's actually going to be a compelling alternative. People care about usability and productivity. They care about writing code that is self-explanatory, which accomplishes more work in less code. It also needs to address security issues. Usability and productivity go hand in hand with security. The less code you need to write to accomplish something, the less chance you have of introducing bugs security bugs or otherwise,” he explains. Comparing Rust with C Back in 2006, Graydon Hoare, a Mozilla employee started writing Rust as a personal project. Mozilla, in 2009, started sponsoring the project and also expanded the team to drive further development of the language. One of the reasons why Mozilla got interested is that Firefox was written in more than 4 million lines of C++ code and had quite a bit of highly critical vulnerabilities. Rust was built with safety and concurrency in mind making it the perfect choice for rewriting many components of Firefox under Project Quantum. It is also using Rust to develop Servo, an HTML rendering engine that will eventually replace Firefox’s rendering engine. Many other companies have also started using Rust for their projects including Microsoft, Google, Facebook, Amazon, Dropbox, Fastly, Chef, Baidu, and more. Rust addresses the memory management problem in C. It offers automatic memory management so that developers do not have to manually call free on every object. 
What sets it apart from other modern languages is that it does not have a garbage collector or runtime system of any kind. Rust instead has the concepts of ownership, borrowing, references, and lifetimes. “Rust has a system of declaring whether any given use of an object is the owner of that object or whether it's just borrowing that object temporarily. If you're just borrowing an object the compiler will keep track of that. It'll make sure that the original sticks around as long as you reference it. Rust makes sure that the owner of the object frees it when it's done and it inserts the call to free at compile time with no extra runtime overhead,” Triplett explains. Not having a runtime is also a plus for Rust. Triplett believes that languages that have a runtime are difficult to use as systems programming languages. He adds, “You have to initialize that runtime before you can call any code, you have to use that runtime to call functions, and the runtime itself might run extra code behind your back at unexpected times.” Rust also aims to provide safe concurrent programming. The same features that make it memory safe also keep track of things like which thread owns which object, which objects can be passed between threads, and which objects require acquiring locks. These features make Rust compelling enough for developers to choose for systems programming. However, on the second criterion, Rust does not yet have enough parity with C. “Achieving parity with C is exactly what got me involved in Rust,” says Triplett.

Teaching Rust about C compatible unions

Triplett's first contribution to the Rust programming language was RFC 1444, which was started in 2015 and accepted in 2016. This RFC proposed native support for C-compatible unions in Rust, defined via a new "contextual keyword" union.
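For readers unfamiliar with the term, a C-compatible union is a type whose fields all occupy the same memory. The sketch below illustrates that layout using Python's standard ctypes module rather than Rust syntax; the type name `IntOrFloat` and its fields are hypothetical and are not taken from the RFC or from the /dev/kvm interface.

```python
import ctypes

class IntOrFloat(ctypes.Union):
    # Both fields start at offset 0 and share the same four bytes,
    # as `union { uint32_t as_int; float as_float; }` would in C.
    _fields_ = [("as_int", ctypes.c_uint32),
                ("as_float", ctypes.c_float)]

value = IntOrFloat()
value.as_float = 1.0
bits = value.as_int               # the same bytes, read back as an integer
size = ctypes.sizeof(IntOrFloat)  # size of the largest field, not the sum

print(hex(bits), size)
```

Writing through one field and reading through another is exactly the kind of reinterpretation that a foreign function interface has to reproduce when a C API exposes a union.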
Triplett understood the need for this proposal when he wanted to build a virtual machine in Rust and the Linux kernel interface for that /dev/kvm required unions. "I worked with the Rust community and with the language team to get unions into Rust and because of that work I'm actually now part of the Rust language governance team helping to evaluate and guide other changes into the language," he adds. He talked about this RFC in much detail at the very first RustConf in 2016: https://www.youtube.com/watch?v=U8Gl3RTXf88 Support for unnamed struct and union types Another feature that Triplett worked on was the support for unnamed struct and union types in Rust. This has been a widespread C compiler extension for decades and was also included in the C11 standard. This allowed developers to group and layout fields in arbitrary ways to match C data structures used in the Foreign Function Interface (FFI). With this proposal implemented, Rust will be able to represent such types using the same names as the structures without interposing artificial field names that will confuse users of well-established interfaces from existing platforms. A stabilized support for inline Assembly in Rust Systems programming often involves low-level manipulations and requires low-level details of the processors such as privileged instructions. For this, Rust supports using inline Assembly via the ‘asm!’ macro. However, it is only present in the nightly compiler and not yet stabilized. Triplett in a collaboration with other Rust developers is writing a proposal to introduce more robust syntax for inline Assembly. To know more in detail about support for inline Assembly, check out this pre-RFC. BFLOAT16 support into Rust Many Intel processors including Xeon Scalable ‘Cooper Lake-SP’ now support BFLOAT16, a new floating-point format. This truncated 16-bit version of the 32-bit IEEE 754 single-precision floating-point format was mainly designed for deep learning. 
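To make the "truncated 16-bit version" of the float32 format concrete, here is a small Python sketch that keeps only the upper 16 bits of a 32-bit float. It assumes plain truncation for simplicity; hardware and library conversions may round instead, and the helper names are illustrative only.

```python
import struct

def float32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def to_bfloat16_bits(x):
    """Keep the sign, the 8 exponent bits and the top 7 mantissa bits."""
    return float32_bits(x) >> 16

def from_bfloat16_bits(b):
    """Widen a bfloat16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

one = to_bfloat16_bits(1.0)
pi_bits = to_bfloat16_bits(3.14159)
pi_back = from_bfloat16_bits(pi_bits)

print(hex(one), hex(pi_bits), pi_back)
```

Because the exponent field is unchanged, the range of representable values matches float32; only precision is lost, which is visible in how coarsely 3.14159 survives the round trip.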
This format is also used in machine learning libraries like Tensorflow that work with huge datasets. It also makes interoperating with existing systems, functions, and storage much easier. This is why Triplett is working on adding support for BFLOAT16 in Rust so that developers would be able to use the full capabilities of their hardware. FFI/C Parity Working Group This was one of the important announcements that Triplett made. He is starting a working group that will focus on achieving full parity with C. Under this group, he aims to collaborate with both the Rust community and other Intel developers to develop the specifications for the remaining features that need to be implemented in Rust for system programming. This group will also focus on bringing support for systems programming using the stable releases of Rust, not just experimental nightly releases of the compiler. In last week’s Reddit discussion, Triplett shared the current status of the working group, “To pre-answer, one question: the FFI / C Parity working group is in the process of being launched, and hasn't quite kicked off yet. I'll be posting about it here and elsewhere when it is, along with the initial goals.” Watch Josh Triplett’s full OSTS talk to know more about Intel’s contribution to Rust: https://www.youtube.com/watch?v=l9hM0h6IQDo [box type="shadow" align="" class="" width=""]Update: We have made the following corrections based on feedback from Josh Triplett: This year OSTS was open to Intel's partners and press. Previously, the article read 'escape patch', but it is 'escape hatch.' RFC 1444 wasn't last year, it was started in 2015 and accepted in 2016. 
'dev KVM' is now corrected to '/dev/kvm'[/box]
Packt Editorial Staff
25 Apr 2018
8 min read

Setting up Logistic Regression model using TensorFlow

TensorFlow is another open source library developed by the Google Brain Team to build numerical computation models using data flow graphs. The core of TensorFlow was developed in C++ with the wrapper in Python. The tensorflow package in R gives you access to the TensorFlow API composed of Python modules to execute computation models. TensorFlow supports both CPU- and GPU-based computations. In this article, we will cover the application of TensorFlow in setting up a logistic regression model. The example will use a similar dataset to that used in the H2O model setup. The tensorflow package in R calls the Python tensorflow API for execution, so TensorFlow must be installed in both R and Python for the R package to work. The following are the dependencies for tensorflow:

Python 2.7 / 3.x
R (>3.2)
devtools package in R for installing TensorFlow from GitHub
TensorFlow in Python
pip

Getting ready

The code for this section was created on Linux but can be run on any operating system. To start modeling, load the tensorflow package in the environment. R loads the default TensorFlow environment variable (tf) and also the NumPy library from Python in the np variable:

library("tensorflow") # Load TensorFlow
np <- import("numpy") # Load numpy library

How to do it...

The data is imported using a standard function from R, as shown in the following code. The data is read from the csv file and transformed into the matrix format, followed by selecting the features used to model as defined in xFeatures and yFeatures.
The next step in TensorFlow is to set up a graph to run optimization:

# Loading input and test data
xFeatures = c("Temperature", "Humidity", "Light", "CO2", "HumidityRatio")
yFeatures = "Occupancy"
occupancy_train <- as.matrix(read.csv("datatraining.txt", stringsAsFactors = T))
occupancy_test <- as.matrix(read.csv("datatest.txt", stringsAsFactors = T))
# subset features for modeling and transform to numeric values
occupancy_train <- apply(occupancy_train[, c(xFeatures, yFeatures)], 2, FUN=as.numeric)
occupancy_test <- apply(occupancy_test[, c(xFeatures, yFeatures)], 2, FUN=as.numeric)
# Data dimensions
nFeatures <- length(xFeatures)
nRow <- nrow(occupancy_train)

Before setting up the graph, let's reset the graph using the following command:

# Reset the graph
tf$reset_default_graph()

Additionally, let's start an interactive session, which installs itself as the default session so that tensors can be evaluated without passing the session object around explicitly:

# Starting session as interactive session
sess <- tf$InteractiveSession()

Define the logistic regression model in TensorFlow:

# Setting-up Logistic regression graph
x <- tf$constant(unlist(occupancy_train[, xFeatures]), shape=c(nRow, nFeatures), dtype=np$float32)
W <- tf$Variable(tf$random_uniform(shape(nFeatures, 1L)))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- tf$matmul(x, W) + b

The input feature x is defined as a constant as it will be an input to the system. The weight W and bias b are defined as variables that will be adjusted during the optimization process. The y is set up as a symbolic expression of x, W, and b. The weight W is initialized from a random uniform distribution and b is initialized to zero.
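To show what the graph above computes once a cost function and optimizer are attached, here is a plain-Python sketch of logistic regression trained by batch gradient descent on the sigmoid cross-entropy cost. It does not use TensorFlow; the toy data, the function names, and the seed are assumptions made for illustration, while the learning rate and step count mirror the values used in this recipe.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, learning_rate=0.15, steps=5000, seed=0):
    """Batch gradient descent on the mean sigmoid cross-entropy."""
    rng = random.Random(seed)
    n_features = len(xs[0])
    w = [rng.random() for _ in range(n_features)]  # uniform random start, like W
    b = 0.0                                        # zero start, like b
    n = len(xs)
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            logit = sum(wi * xi for wi, xi in zip(w, x)) + b  # y = xW + b
            err = sigmoid(logit) - y  # derivative of the cost w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - learning_rate * g for wi, g in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b

# Hypothetical one-feature data set: small values are class 0, large are class 1.
xs = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
predictions = [int(sigmoid(w[0] * x[0] + b) > 0.5) for x in xs]
```

The key design point carried over from the TensorFlow version is that the cost is defined on the raw logit xW + b, with the sigmoid applied only inside the loss and at prediction time.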
The next step is to set up the cost function for logistic regression:

# Setting-up cost function and optimizer
y_ <- tf$constant(unlist(occupancy_train[, yFeatures]), dtype="float32", shape=c(nRow, 1L))
cross_entropy <- tf$reduce_mean(tf$nn$sigmoid_cross_entropy_with_logits(labels=y_, logits=y, name="cross_entropy"))
optimizer <- tf$train$GradientDescentOptimizer(0.15)$minimize(cross_entropy)
# Initialize the variables in the session
init <- tf$global_variables_initializer()
sess$run(init)

Execute the gradient descent algorithm for the optimization of weights using cross entropy as the loss function:

# Running optimization
for (step in 1:5000) {
  sess$run(optimizer)
  if (step %% 20 == 0)
    cat(step, "-", sess$run(W), sess$run(b), "==>", sess$run(cross_entropy), "\n")
}

How it works...

The performance of the model can be evaluated using AUC:

# Performance on Train
library(pROC)
ypred <- sess$run(tf$nn$sigmoid(tf$matmul(x, W) + b))
roc_obj <- roc(occupancy_train[, yFeatures], as.numeric(ypred))
# Performance on test
nRowt <- nrow(occupancy_test)
xt <- tf$constant(unlist(occupancy_test[, xFeatures]), shape=c(nRowt, nFeatures), dtype=np$float32)
ypredt <- sess$run(tf$nn$sigmoid(tf$matmul(xt, W) + b))
roc_objt <- roc(occupancy_test[, yFeatures], as.numeric(ypredt))

The ROC curves, and with them the AUC, can be visualized using the plot.roc function from the pROC package, as shown in the screenshot following this command. The performance for training and testing (hold-out) is very similar.

plot.roc(roc_obj, col = "green", lty=2, lwd=2)
plot.roc(roc_objt, add=T, col="red", lty=4, lwd=2)

[Figure: Performance of logistic regression using TensorFlow]

Visualizing TensorFlow graphs

TensorFlow graphs can be visualized using TensorBoard. It is a service that utilizes TensorFlow event files to visualize TensorFlow models as graphs. Graph model visualization in TensorBoard is also used to debug TensorFlow models.
Getting ready

TensorBoard can be started using the following command in the terminal:

$ tensorboard --logdir home/log --port 6006

The following are the major parameters for TensorBoard:

--logdir: The directory from which to load TensorFlow events
--debug: To increase log verbosity
--host: The host to listen on; localhost (127.0.0.1) by default
--port: The port on which TensorBoard will serve

The preceding command will launch the TensorBoard service on localhost at port 6006, as shown in the following screenshot:

[Figure: TensorBoard]

The tabs on the TensorBoard capture relevant data generated during graph execution.

How to do it...

This section covers how to visualize TensorFlow models and output in TensorBoard. To visualize summaries and graphs, data from TensorFlow can be exported using the FileWriter command from the summary module. A default session graph can be added using the following command:

# Create Writer Obj for log
log_writer = tf$summary$FileWriter('c:/log', sess$graph)

The graph for logistic regression developed using the preceding code is shown in the following screenshot:

[Figure: Visualization of the logistic regression graph in TensorBoard]

Details about symbol descriptions on TensorBoard can be found at https://www.tensorflow.org/get_started/graph_viz. Similarly, other variable summaries can be added to the TensorBoard using the appropriate summary operations, as shown in the following code:

# Adding histogram summary to weight and bias variable
w_hist = tf$summary$histogram("weights", W)
b_hist = tf$summary$histogram("biases", b)

Create a cross entropy evaluation for test.
An example script to generate the cross entropy cost function for test and train is shown in the following command. Note that sigmoid_cross_entropy_with_logits expects the raw logits, so the test logits are kept separate from the sigmoid predictions:

# Set-up cross entropy for test
nRowt <- nrow(occupancy_test)
xt <- tf$constant(unlist(occupancy_test[, xFeatures]), shape=c(nRowt, nFeatures), dtype=np$float32)
logitst <- tf$matmul(xt, W) + b
ypredt <- tf$nn$sigmoid(logitst)
yt_ <- tf$constant(unlist(occupancy_test[, yFeatures]), dtype="float32", shape=c(nRowt, 1L))
cross_entropy_tst <- tf$reduce_mean(tf$nn$sigmoid_cross_entropy_with_logits(labels=yt_, logits=logitst, name="cross_entropy_tst"))

Add summary variables to be collected:

# Add summary ops to collect data
w_hist = tf$summary$histogram("weights", W)
b_hist = tf$summary$histogram("biases", b)
crossEntropySummary <- tf$summary$scalar("costFunction", cross_entropy)
crossEntropyTstSummary <- tf$summary$scalar("costFunction_test", cross_entropy_tst)

Open the writing object, log_writer. It writes the default graph to the location, c:/log:

# Create Writer Obj for log
log_writer = tf$summary$FileWriter('c:/log', sess$graph)

Run the optimization and collect the summaries:

for (step in 1:2500) {
  sess$run(optimizer)
  # Evaluate performance on training and test data after every 50 iterations
  if (step %% 50 == 0) {
    ### Performance on Train
    ypred <- sess$run(tf$nn$sigmoid(tf$matmul(x, W) + b))
    roc_obj <- roc(occupancy_train[, yFeatures], as.numeric(ypred))
    ### Performance on Test
    ypredt <- sess$run(tf$nn$sigmoid(tf$matmul(xt, W) + b))
    roc_objt <- roc(occupancy_test[, yFeatures], as.numeric(ypredt))
    cat("train AUC: ", auc(roc_obj), " Test AUC: ", auc(roc_objt), "\n")
    # Save summary of Bias and weights
    log_writer$add_summary(sess$run(b_hist), global_step=step)
    log_writer$add_summary(sess$run(w_hist), global_step=step)
    log_writer$add_summary(sess$run(crossEntropySummary), global_step=step)
    log_writer$add_summary(sess$run(crossEntropyTstSummary), global_step=step)
  }
}

Collect all the summaries to a single tensor using the merge_all command from the summary module:

summary = tf$summary$merge_all()

Write the summaries to the log file using the log_writer object:

log_writer = tf$summary$FileWriter('c:/log', sess$graph)
summary_str = sess$run(summary)
log_writer$add_summary(summary_str, step)
log_writer$close()

We have learned how to set up and train a logistic regression model using TensorFlow, and how to visualize the resulting graph and summaries in TensorBoard. [box type="shadow" align="" class="" width=""]This article is a book excerpt from R Deep Learning Cookbook, co-authored by PKS Prakash & Achyutuni Sri Krishna Rao. This book contains powerful and independent recipes to build deep learning models in different application areas using R libraries.[/box]
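The AUC values printed during training in this recipe have a simple interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The Python sketch below computes that quantity directly by comparing every positive/negative pair. It is an illustration of the metric, not of how the pROC package computes it, and it assumes that labels are 0/1 and that higher scores indicate the positive class.

```python
def auc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
value = auc(labels, scores)
print(value)
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; rank-based or curve-based implementations are the usual choice for larger data sets.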

Pulkit Chadha
22 Oct 2024
10 min read

Connecting Cloud Object Storage with Databricks Unity Catalog

This article is an excerpt from the book, Data Engineering with Databricks Cookbook, by Pulkit Chadha. This book shows you how to use Apache Spark, Delta Lake, and Databricks to build data pipelines, manage and transform data, optimize performance, and more. Additionally, you’ll implement DataOps and DevOps practices, and orchestrate data workflows.

Introduction

Databricks Unity Catalog allows you to manage and access data in cloud object storage using a unified namespace and a consistent set of APIs. With Unity Catalog, you can do the following: Create and manage storage credentials, external locations, storage locations, and volumes using SQL commands or the Unity Catalog UI. Access data from various cloud platforms (AWS S3, Azure Blob Storage, or Google Cloud Storage) and storage formats (Parquet, Delta Lake, CSV, or JSON) using the same SQL syntax or Spark APIs. Apply fine-grained access control and data governance policies to your data using Databricks SQL Analytics or Databricks Runtime. In this article, you will learn what Unity Catalog is and how it integrates with AWS S3.

Getting ready

Before you start setting up and configuring Unity Catalog, you need to have the following prerequisites: A Databricks workspace with administrator privileges. A Databricks workspace with the Unity Catalog feature enabled. A cloud storage account (such as AWS S3, Azure Blob Storage, or Google Cloud Storage) with the necessary permissions to read and write data.

How to do it…

In this section, we will first create a storage credential, backed by an IAM role with access to an S3 bucket. Then, we will create an external location in Databricks Unity Catalog that will use the storage credential to access the S3 bucket.

Creating a storage credential

You must create a storage credential to access data from an external location or a volume. In this example, you will create a storage credential that uses an IAM role to access the S3 bucket. The steps are as follows: 1.
Go to Catalog Explorer: Click on Catalog in the left panel and go to Catalog Explorer. 2. Create storage credentials: Click on +Add and select Add a storage credential. Figure 10.1 – Add a storage credential 3. Enter storage credential details: Give the credential a name, the IAM role ARN that allows Unity Catalog to access the storage location on your cloud tenant, and a comment if you want, and click on Create. Figure 10.2 – Create a new storage credential Important note: To learn more about IAM roles in AWS, you can reference the user guide here: https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html. 4. Get External ID: In the Storage credential created dialog, copy the External ID value and click on Done. Figure 10.3 – External ID for the storage credential 5. Update the trust policy with an External ID: Update the trust policy associated with the IAM role and add the External ID value for sts:ExternalId: Figure 10.4 – Updated trust policy with External ID

Creating an external location

An external location contains a reference to a storage credential and a cloud storage path. You need to create an external location to access data from a custom storage location that Unity Catalog uses to reference external tables. In this example, you will create an external location that points to the de-book-ext-loc folder in an S3 bucket. To create an external location, you can follow these steps: 1. Go to Catalog Explorer: Click on Catalog in the left panel to go to Catalog Explorer. 2. Create external location: Click on +Add and select Add an external location: Figure 10.5 – Add an external location 3. Pick an external location creation method: Select Manual and then click on Next: Figure 10.6 – Create a new external location 4. Enter external location details: Enter the external location name, select the storage credential, and enter the S3 URL; then, click on the Create button: Figure 10.7 – Create a new external location manually 5.
Test connection: Test the connection to make sure you have set up the credentials accurately and that Unity Catalog is able to access cloud storage: Figure 10.8 – Test connection for external location If everything is set up right, you should see a screen like the following. Click on Done: Figure 10.9 – Test connection results

See also

Databricks Unity Catalog: https://www.databricks.com/product/unity-catalog
What is Unity Catalog: https://docs.databricks.com/en/data-governance/unity-catalog/index.html
Databricks Unity Catalog documentation: https://docs.databricks.com/en/compute/access-mode-limitations.html
Databricks SQL documentation: https://docs.databricks.com/en/data-governance/unity-catalog/create-tables.html
Databricks Unity Catalog: A Comprehensive Guide to Features, Capabilities, and Architecture: https://atlan.com/databricks-unity-catalog/
Step By Step Guide on Databricks Unity Catalog Setup and its key Features: https://medium.com/@sauravkum780/step-by-step-guide-on-databricks-unitycatalog-setup-and-its-features-1d0366c282b7

Conclusion

In summary, connecting to cloud object storage using Databricks Unity Catalog provides a streamlined approach to managing and accessing data across various cloud platforms such as AWS S3, Azure Blob Storage, and Google Cloud Storage. By utilizing a unified namespace, consistent APIs, and powerful governance features, Unity Catalog simplifies the process of creating and managing storage credentials and external locations. With built-in fine-grained access controls, you can securely manage data stored in different formats and cloud environments, all while leveraging Databricks' powerful data analytics capabilities. This guide walks through setting up an IAM role and creating an external location in AWS S3, demonstrating how easy it is to connect cloud storage with Unity Catalog.

Author Bio

Pulkit Chadha is a seasoned technologist with over 15 years of experience in data engineering.
His proficiency in crafting and refining data pipelines has been instrumental in driving success across diverse sectors such as healthcare, media and entertainment, hi-tech, and manufacturing. Pulkit’s tailored data engineering solutions are designed to address the unique challenges and aspirations of each enterprise he collaborates with.
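Returning to step 5 of "Creating a storage credential" above: the trust policy shown in Figure 10.4 is not reproduced in the text, so the following Python sketch only illustrates the general shape of an IAM trust policy that requires an External ID on sts:AssumeRole. The principal ARN and External ID are placeholders, the helper name is hypothetical, and the exact policy that Databricks requires should be taken from its documentation rather than from this sketch.

```python
import json

def trust_policy(principal_arn, external_id):
    """Build a trust policy that only allows AssumeRole with the given External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }

policy = trust_policy(
    "arn:aws:iam::<ACCOUNT-ID>:role/<UNITY-CATALOG-ROLE>",  # placeholder principal
    "<EXTERNAL-ID-FROM-STEP-4>",                            # placeholder External ID
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON is what would be pasted into the role's trust relationship; the External ID condition is what ties the role to the specific storage credential created in Unity Catalog.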

Natasha Mathur
01 Jun 2018
9 min read

Building RESTful web services with Kotlin

Kotlin has been eating up the Java world. It has already become a hit in the Android ecosystem, which was dominated by Java, and has been welcomed with open arms. Kotlin is not limited to Android development and can be used to develop server-side and client-side web applications as well. Kotlin is 100% compatible with the JVM, so you can use any existing Java framework such as Spring Boot, Vert.x, or JSF. In this tutorial, we will learn how to implement RESTful web services using Kotlin.

This article is an excerpt from the book 'Kotlin Programming Cookbook', written by Aanand Shekhar Roy and Rashi Karanpuria.

Setting up dependencies for building RESTful services
In this recipe, we will lay the foundation for developing the RESTful service. We will see how to set up dependencies and run our first Spring Boot web application. Spring Boot provides great support for Kotlin, which makes the two easy to use together. So let's get started. We will be using IntelliJ IDEA and the Gradle build system. If you don't have IntelliJ IDEA, you can get it from https://www.jetbrains.com/idea/.

How to do it…
Let's follow the given steps to set up the dependencies for building RESTful services:
First, we will create a new project in IntelliJ IDEA. We will be using the Gradle build system for managing dependencies, so create a Gradle project.
When you have created the project, just add the following lines to your build.gradle file.
These lines of code contain the Spring Boot dependencies that we will need to develop the web app:

    buildscript {
        ext.kotlin_version = '1.1.60' // Required for Kotlin integration
        ext.spring_boot_version = '1.5.4.RELEASE'
        repositories {
            jcenter()
        }
        dependencies {
            classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version" // Required for Kotlin integration
            classpath "org.jetbrains.kotlin:kotlin-allopen:$kotlin_version" // See https://kotlinlang.org/docs/reference/compiler-plugins.html#kotlin-spring-compiler-plugin
            classpath "org.springframework.boot:spring-boot-gradle-plugin:$spring_boot_version"
        }
    }
    apply plugin: 'kotlin' // Required for Kotlin integration
    apply plugin: "kotlin-spring" // See https://kotlinlang.org/docs/reference/compiler-plugins.html#kotlin-spring-compiler-plugin
    apply plugin: 'org.springframework.boot'
    jar {
        baseName = 'gs-rest-service'
        version = '0.1.0'
    }
    sourceSets {
        main.java.srcDirs += 'src/main/kotlin'
    }
    repositories {
        jcenter()
    }
    dependencies {
        compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version" // Required for Kotlin integration
        compile 'org.springframework.boot:spring-boot-starter-web'
        testCompile('org.springframework.boot:spring-boot-starter-test')
    }

Let's now create an App.kt file inside a package directory under src/main/kotlin. It is important to keep the App.kt file in a package (we've used the college package). Otherwise, you will get a warning that says the following:

    ** WARNING ** : Your ApplicationContext is unlikely to start due to a `@ComponentScan` of the default package.

The reason for this warning is that, without a package declaration, the class ends up in the "default package," which is discouraged. Now, let's try to run the App.kt class.
We will put the following code in it to test whether it runs:

    package college

    import org.springframework.boot.SpringApplication
    import org.springframework.boot.autoconfigure.SpringBootApplication

    @SpringBootApplication
    open class App

    fun main(args: Array<String>) {
        SpringApplication.run(App::class.java, *args)
    }

Now run the project; if everything goes well, you will see output with the following line at the end:

    Started AppKt in 5.875 seconds (JVM running for 6.445)

We now have our application running on our embedded Tomcat server. If you go to http://localhost:8080, you will see an error page. This is a 404 error; the reason is that we haven't told our application what to do when a user requests the / path.

Creating a REST controller
In the previous recipe, we learned how to set up dependencies for creating RESTful services. Finally, we launched our backend on the http://localhost:8080 endpoint but got a 404 error because our application wasn't configured to handle requests at that path (/). We will start from that point and learn how to create a REST controller. Let's get started! We will be using IntelliJ IDEA for coding purposes. To set up the environment, refer to the previous recipe. You can also find the source in the repository at https://gitlab.com/aanandshekharroy/kotlin-webservices.

How to do it…
In this recipe, we will create a REST controller that will fetch us information about students in a college.
We will be using an in-memory database backed by a list to keep things simple.

Let's first create a Student class with name and roll number properties:

    package college

    class Student() {
        lateinit var roll_number: String
        lateinit var name: String
        constructor(roll_number: String, name: String) : this() {
            this.roll_number = roll_number
            this.name = name
        }
    }

Next, we will create the StudentDatabase class, which will act as a database for the application:

    @Component
    class StudentDatabase {
        private val students = mutableListOf<Student>()
    }

Note that we have annotated the StudentDatabase class with @Component, which means its lifecycle will be controlled by Spring (because we want it to act as a database for our application). We also need a @PostConstruct annotation, because it's an in-memory database that is destroyed when the application closes, and we would like to have a filled database whenever the application launches. So we will create an init method, which will add a few items to the "database" at startup time:

    @PostConstruct
    private fun init() {
        students.add(Student("2013001", "Aanand Shekhar Roy"))
        students.add(Student("2013165", "Rashi Karanpuria"))
    }

Now, we will create a few other methods that will help us deal with our database:

getStudents: Gets the list of students present in our database:

    fun getStudents() = students

addStudent: This method will add a student to our database:

    fun addStudent(student: Student): Boolean {
        students.add(student)
        return true
    }

Now let's put this database to use. We will be creating a REST controller that will handle the request. We will create a StudentController and annotate it with @RestController. Using @RestController is simple, and it's the preferred method for creating MVC RESTful web services. Once created, we need to provide our database using Spring dependency injection, for which we will need the @Autowired annotation.
Here's how our StudentController looks:

    @RestController
    class StudentController {
        @Autowired
        private lateinit var database: StudentDatabase
    }

Now we will set our response to the / path. We will show the list of students in our database. For that, we will simply create a method that lists the students. We will need to annotate it with @RequestMapping and provide parameters such as the path and the request method (GET, POST, and so on):

    @RequestMapping("", method = arrayOf(RequestMethod.GET))
    fun students() = database.getStudents()

This is what our controller looks like now. It is a simple REST controller:

    package college

    import org.springframework.beans.factory.annotation.Autowired
    import org.springframework.web.bind.annotation.RequestMapping
    import org.springframework.web.bind.annotation.RequestMethod
    import org.springframework.web.bind.annotation.RestController

    @RestController
    class StudentController {
        @Autowired
        private lateinit var database: StudentDatabase

        @RequestMapping("", method = arrayOf(RequestMethod.GET))
        fun students() = database.getStudents()
    }

Now when you restart the server and go to http://localhost:8080, you will see the list of students returned as the response. As you can see, Spring is intelligent enough to provide the response in JSON format, which makes it easy to design APIs.

Now let's try to create another endpoint that will fetch a student's details from a roll number (this also requires importing GetMapping and PathVariable from org.springframework.web.bind.annotation):

    @GetMapping("/student/{roll_number}")
    fun studentWithRollNumber(@PathVariable("roll_number") roll_number: String) =
        database.getStudentWithRollNumber(roll_number)

This endpoint relies on a getStudentWithRollNumber method in StudentDatabase that the excerpt does not show. A minimal version, assuming a simple lookup in the list, could be:

    fun getStudentWithRollNumber(roll_number: String) = students.find { it.roll_number == roll_number }

Now, if you try the http://localhost:8080/student/2013001 endpoint, you will see the given output:

    {"roll_number":"2013001","name":"Aanand Shekhar Roy"}

Next, we will try to add a student to the database.
We will be doing it via the POST method:

    @RequestMapping("/add", method = arrayOf(RequestMethod.POST))
    fun addStudent(@RequestBody student: Student) =
        if (database.addStudent(student)) student
        else throw Exception("Something went wrong")

There's more…
So far, our server has been dependent on the IDE. We would definitely want to make it independent of an IDE. Thanks to Gradle, it is very easy to create a runnable JAR with just the following command:

    ./gradlew clean bootRepackage

The preceding command is platform independent and uses the Gradle build system to build the application. Now, you just need to type the following command to run it:

    java -jar build/libs/gs-rest-service-0.1.0.jar

You can then see the following output as before:

    Started AppKt in 4.858 seconds (JVM running for 5.548)

This means your server is running successfully.

Creating the Application class for Spring Boot
The SpringApplication class is used to bootstrap our application. We've used it in the previous recipes; in this recipe, we will see how to create the Application class for Spring Boot. We will be using IntelliJ IDEA for coding purposes. To set up the environment, read the previous recipes, especially the Setting up dependencies for building RESTful services recipe.

How to do it…
If you've used Spring Boot before, you must be familiar with using @Configuration, @EnableAutoConfiguration, and @ComponentScan in your main class. These were used so frequently that Spring Boot provides a convenient @SpringBootApplication alternative. Spring Boot looks for a public static main method; in Kotlin, we use a top-level function outside the Application class. As you may have noticed while setting up the dependencies, we used the kotlin-spring plugin, hence we don't need to make the Application class open.
Here's an example of the Spring Boot application:

    package college

    import org.springframework.boot.SpringApplication
    import org.springframework.boot.autoconfigure.SpringBootApplication

    @SpringBootApplication
    class Application

    fun main(args: Array<String>) {
        SpringApplication.run(Application::class.java, *args)
    }

The application calls the static SpringApplication.run() method, which takes two parameters and starts an autoconfigured Tomcat web server when the Spring application starts. When everything is set, you can start the application by executing the following command:

    ./gradlew bootRun

If everything goes well, you will see the startup output in the console, ending with the message: Started AppKt in xxx seconds. This means that your application is up and running. In order to run it as an independent server, you need to create a JAR, which you can do as follows:

    ./gradlew clean bootRepackage

Now, to run it, you just need to type the following command:

    java -jar build/libs/gs-rest-service-0.1.0.jar

We learned how to set up dependencies for building RESTful services, create a REST controller, and create the application class for Spring Boot. If you are interested in learning more about Kotlin, be sure to check out the 'Kotlin Programming Cookbook'.

Build your first Android app with Kotlin
5 reasons to choose Kotlin over Java
Getting started with Kotlin programming
Forget C and Java. Learn Kotlin: the next universal programming language
Kai Nacke, Amy Kwan
24 Oct 2024
10 min read

Mastering Code Generation: Exploring the LLVM Backend

This article is an excerpt from the book, Learn LLVM 17 - Second Edition, by Kai Nacke and Amy Kwan. Learn how to build your own compiler, from reading the source to emitting optimized machine code. This book guides you through the JIT compilation framework, extending LLVM in a variety of ways, and using the right tools for troubleshooting.

Introduction
Generating optimized machine code is a critical task in the compilation process, and the LLVM backend plays a pivotal role in this transformation. The backend translates the LLVM Intermediate Representation (IR), derived from the Abstract Syntax Tree (AST), into machine code that can be executed on target architectures. Understanding how to generate this IR effectively is essential for leveraging LLVM's capabilities. This article delves into the intricacies of generating LLVM IR using a simple expression language example. We'll explore the necessary steps, from declaring library functions to implementing a code generation visitor, ensuring a comprehensive understanding of the LLVM backend's functionality.

Generating code with the LLVM backend
The task of the backend is to create optimized machine code from the LLVM IR of a module. The IR is the interface to the backend and can be created using a C++ interface or in textual form. Again, the IR is generated from the AST.

Textual representation of LLVM IR
Before trying to generate the LLVM IR, it should be clear what we want to generate. For our example expression language, the high-level plan is as follows:
1. Ask the user for the value of each variable.
2. Calculate the value of the expression.
3. Print the result.
To ask the user to provide a value for a variable and to print the result, two library functions are used: calc_read() and calc_write(). For the with a: 3*a expression, the generated IR is as follows:
1. The library functions must be declared, like in C. The syntax also resembles C. The type before the function name is the return type.
The type names surrounded by parentheses are the argument types. The declaration can appear anywhere in the file:

    declare i32 @calc_read(ptr)
    declare void @calc_write(i32)

2. The calc_read() function takes the variable name as a parameter. The following construct defines a constant, holding a and the null byte used as a string terminator in C:

    @a.str = private constant [2 x i8] c"a\00"

3. The main() function follows. The parameter names are omitted because they are not used. Just as in C, the body of the function is enclosed in braces:

    define i32 @main(i32, ptr) {

4. Each basic block must have a label. Because this is the first basic block of the function, we name it entry:

    entry:

5. The calc_read() function is called to read the value for the a variable. In the IR shown here, the string constant is passed directly as ptr @a.str; older IR with typed pointers needed a nested getelementptr instruction to compute the pointer to the first element of the string constant. The function result is assigned to the unnamed %2 variable:

      %2 = call i32 @calc_read(ptr @a.str)

6. Next, the variable is multiplied by 3:

      %3 = mul nsw i32 3, %2

7. The result is printed on the console via a call to the calc_write() function:

      call void @calc_write(i32 %3)

8. Last, the main() function returns 0 to indicate a successful execution:

      ret i32 0
    }

Each value in the LLVM IR is typed, with i32 denoting the 32-bit integer type and ptr denoting a pointer.

Note: previous versions of LLVM used typed pointers. For example, a pointer to a byte was expressed as i8* in LLVM. Since LLVM 16, opaque pointers are the default. An opaque pointer is just a pointer to memory, without carrying any type information about it. The notation in LLVM IR is ptr.
Since it is now clear what the IR looks like, let’s generate it from the AST.

Generating the IR from the AST
The interface, provided in the CodeGen.h header file, is very small:

    #ifndef CODEGEN_H
    #define CODEGEN_H

    #include "AST.h"

    class CodeGen {
    public:
      void compile(AST *Tree);
    };
    #endif

Because the AST contains the information, the basic idea is to use a visitor to walk the AST. The CodeGen.cpp file is implemented as follows:

1. The required includes are at the top of the file:

    #include "CodeGen.h"
    #include "llvm/ADT/StringMap.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/Support/raw_ostream.h"

2. The namespace of the LLVM libraries is used for name lookups:

    using namespace llvm;

3. First, some private members are declared in the visitor. Each compilation unit is represented in LLVM by the Module class and the visitor has a pointer to the module called M. For easy IR generation, the Builder (of type IRBuilder<>) is used. LLVM has a class hierarchy to represent types in IR. You can look up the instances for basic types such as i32 from the LLVM context. These basic types are used very often. To avoid repeated lookups, we cache the needed type instances VoidTy, Int32Ty, and PtrTy, along with the constant Int32Zero. The V member is the current calculated value, which is updated through the tree traversal. And last, nameMap maps a variable name to the value returned from the calc_read() function:

    namespace {
    class ToIRVisitor : public ASTVisitor {
      Module *M;
      IRBuilder<> Builder;
      Type *VoidTy;
      Type *Int32Ty;
      PointerType *PtrTy;
      Constant *Int32Zero;
      Value *V;
      StringMap<Value *> nameMap;

4. The constructor initializes all members:

    public:
      ToIRVisitor(Module *M) : M(M), Builder(M->getContext()) {
        VoidTy = Type::getVoidTy(M->getContext());
        Int32Ty = Type::getInt32Ty(M->getContext());
        PtrTy = PointerType::getUnqual(M->getContext());
        Int32Zero = ConstantInt::get(Int32Ty, 0, true);
      }

5. For each function, a FunctionType instance must be created.
In C++ terminology, this is a function prototype. A function itself is defined with a Function instance. The run() method defines the main() function in the LLVM IR first:

      void run(AST *Tree) {
        FunctionType *MainFty = FunctionType::get(
            Int32Ty, {Int32Ty, PtrTy}, false);
        Function *MainFn = Function::Create(
            MainFty, GlobalValue::ExternalLinkage, "main", M);

6. Then we create the BB basic block with the entry label, and attach it to the IR builder:

        BasicBlock *BB = BasicBlock::Create(M->getContext(), "entry", MainFn);
        Builder.SetInsertPoint(BB);

7. With this preparation done, the tree traversal can begin:

        Tree->accept(*this);

8. After the tree traversal, the computed value is printed via a call to the calc_write() function. Again, a function prototype (an instance of FunctionType) has to be created. The only parameter is the current value, V:

        FunctionType *CalcWriteFnTy =
            FunctionType::get(VoidTy, {Int32Ty}, false);
        Function *CalcWriteFn = Function::Create(
            CalcWriteFnTy, GlobalValue::ExternalLinkage, "calc_write", M);
        Builder.CreateCall(CalcWriteFnTy, CalcWriteFn, {V});

9. The generation finishes by returning 0 from the main() function:

        Builder.CreateRet(Int32Zero);
      }

10. A WithDecl node holds the names of the declared variables. First, we create a function prototype for the calc_read() function:

      virtual void visit(WithDecl &Node) override {
        FunctionType *ReadFty =
            FunctionType::get(Int32Ty, {PtrTy}, false);
        Function *ReadFn = Function::Create(
            ReadFty, GlobalValue::ExternalLinkage, "calc_read", M);

11. The method loops through the variable names:

        for (auto I = Node.begin(), E = Node.end(); I != E; ++I) {

12. For each variable, a string with the variable name is created:

          StringRef Var = *I;
          Constant *StrText = ConstantDataArray::getString(
              M->getContext(), Var);
          GlobalVariable *Str = new GlobalVariable(
              *M, StrText->getType(),
              /*isConstant=*/true, GlobalValue::PrivateLinkage,
              StrText, Twine(Var).concat(".str"));

13. Then the IR code to call the calc_read() function is created. The string created in the previous step is passed as a parameter:

          CallInst *Call = Builder.CreateCall(ReadFty, ReadFn, {Str});

14. The returned value is stored in the nameMap map for later use:

          nameMap[Var] = Call;
        }

15. The tree traversal continues with the expression:

        Node.getExpr()->accept(*this);
      };

16. A Factor node is either a variable name or a number. For a variable name, the value is looked up in the nameMap map. For a number, the value is converted to an integer and turned into a constant value:

      virtual void visit(Factor &Node) override {
        if (Node.getKind() == Factor::Ident) {
          V = nameMap[Node.getVal()];
        } else {
          int intval;
          Node.getVal().getAsInteger(10, intval);
          V = ConstantInt::get(Int32Ty, intval, true);
        }
      };

17. And last, for a BinaryOp node, the right calculation operation must be used:

      virtual void visit(BinaryOp &Node) override {
        Node.getLeft()->accept(*this);
        Value *Left = V;
        Node.getRight()->accept(*this);
        Value *Right = V;
        switch (Node.getOperator()) {
        case BinaryOp::Plus:
          V = Builder.CreateNSWAdd(Left, Right); break;
        case BinaryOp::Minus:
          V = Builder.CreateNSWSub(Left, Right); break;
        case BinaryOp::Mul:
          V = Builder.CreateNSWMul(Left, Right); break;
        case BinaryOp::Div:
          V = Builder.CreateSDiv(Left, Right); break;
        }
      };
    };
    }

18. With this, the visitor class is complete. The compile() method creates the global context and the module, runs the tree traversal, and dumps the generated IR to the console:

    void CodeGen::compile(AST *Tree) {
      LLVMContext Ctx;
      Module *M = new Module("calc.expr", Ctx);
      ToIRVisitor ToIR(M);
      ToIR.run(Tree);
      M->print(outs(), nullptr);
    }

We have now implemented the frontend of the compiler, from reading the source up to generating the IR. Of course, all these components must work together on user input, which is the task of the compiler driver. We also need to implement the functions needed at runtime.
Both are topics of the next section covered in the book.

Conclusion
In conclusion, the process of generating LLVM IR from an AST involves multiple steps, each crucial for producing efficient machine code. This article highlighted the structure and components necessary for this task, including function declarations, basic block management, and tree traversal using a visitor pattern. By carefully managing these elements, developers can harness the power of LLVM to create optimized and reliable machine code. The integration of all these components, alongside user input and runtime functions, completes the frontend implementation of the compiler. This sets the stage for the next phase, focusing on the compiler driver and runtime functions, ensuring seamless execution and integration of the compiled code.

Author Bio
Kai Nacke is a professional IT architect currently residing in Toronto, Canada. He holds a diploma in computer science from the Technical University of Dortmund, Germany, and his diploma thesis on universal hash functions was recognized as the best of the semester. With over 20 years of experience in the IT industry, Kai has extensive expertise in the development and architecture of business and enterprise applications. In his current role, he evolves an LLVM/clang-based compiler. For several years, Kai served as the maintainer of LDC, the LLVM-based D compiler. He is the author of D Web Development and Learn LLVM 12, both published by Packt. In the past, he was a speaker in the LLVM developer room at the Free and Open Source Software Developers’ European Meeting (FOSDEM).

Amy Kwan is a compiler developer currently residing in Toronto, Canada. Originally from the Canadian prairies, Amy holds a Bachelor of Science in Computer Science from the University of Saskatchewan. In her current role, she leverages LLVM technology as a backend compiler developer. Previously, Amy has been a speaker at the LLVM Developer Conference in 2022 alongside Kai Nacke.

Savia Lobo
16 Dec 2019
6 min read

PowerShell Basics for IT Professionals

PowerShell is Microsoft’s automation platform for IT Pros. Of late, there have been a lot of questions around the complexity of this automation tool. At Microsoft Ignite 2018, Jason Himmelstein, Director of Technical Strategy and Strategic Partnerships, Office Apps & Services MVP, explained the basics of PowerShell and how to truly optimize your SharePoint implementation using this powerful IT pro toolset. While in this post we look at the big picture, you can check out the complete video here: ‘Introduction to PowerShell for the anxious IT pro’.

Want to do more with PowerShell? After learning the basics, you can learn how to use PowerShell to automate complex Windows server tasks. You can also improve PowerShell's usability, and control and manage Windows-based environments by working through exciting recipes given in Windows Server 2019 Automation with PowerShell Cookbook - Third Edition written by Thomas Lee.

Himmelstein starts off by saying that PowerShell isn’t a packaged executable, nor is it a developer-centric tool that requires one to understand code; it is easy for an IT Pro to understand.

What is PowerShell?
Windows PowerShell is Microsoft’s task automation framework, consisting of a command-line shell and associated scripting language built on .NET Framework. It provides full access to COM and WMI, enabling administrators to perform administrative tasks on both local and remote Windows systems. In simple words, PowerShell is an object-based, not a text-based, command-line interface for Microsoft technologies. This means results in PowerShell can be acted upon and not just read. One can cause huge damage to an environment using PowerShell, as there is no back button. You can check the logs to see what went wrong, but you cannot undo actions.
Why PowerShell matters
Regardless of the platform a person uses, such as Office 365 or Azure, PowerShell can be easily adopted due to its cross-platform capability. Himmelstein also highlights that one can get started with Azure PowerShell by trying it out in an Azure Cloud Shell environment, an interactive, authenticated, browser-accessible shell for managing Azure resources. Azure Cloud Shell comes equipped with commonly used CLI tools including Linux shell interpreters, PowerShell modules, Azure tools, text editors, source control, build tools, container tools, database tools and more. Cloud Shell also includes language support for several popular programming languages such as Node.js, .NET and Python. Cloud Shell also securely authenticates automatically for instant access to your resources through the Azure CLI or Azure PowerShell cmdlets. Users can use PowerShell in Cloud Shell. One can also develop applications using PowerShell or use PowerShell via Source Control Management (SCM).

Basics of PowerShell

PowerShell Hardware
There are two ways one can use PowerShell. One is via the PowerShell Console, which is similar to a command line. The other is the PowerShell ISE (Integrated Scripting Environment). One thing Himmelstein encourages is, “we run PowerShell in the Console and we write PowerShell in the ISE.” The reason is that certain functionalities do not work in the ISE when one hits the ‘Run’ command. In such cases, the user will have to take that PowerShell out, copy it, save the file and run it in a command window.

Cmdlets
Cmdlets are the main building blocks of PowerShell. They are mini commands that each perform one action. The output of one cmdlet can be piped into further cmdlets. Cmdlets can also be combined with comparison operators such as -eq, -lt, and -match, so one can diff easily within PowerShell.
Modules
There are four types of modules in PowerShell:
Script: A script module is a file (.psm1) that contains any valid Windows PowerShell code.
Binary: A binary module is a .NET Framework assembly (.dll) that contains compiled code.
Manifest: A module manifest is a Windows PowerShell data file (.psd1) that describes the contents of a module and determines how a module is processed.
Dynamic: A dynamic module does not persist to disk. It is created using New-Module, is intended to be short-lived, and cannot be accessed by Get-Module.
Himmelstein prefers not to use the Dynamic module as it persists for just one session.

Objects and Members
Objects are instances of classes and have properties and methods. Members are the properties and methods of an object. Properties define what an object is, and methods define what you can do with the object. Himmelstein puts together all these terms in a simple way:
Objects = stuff
Cmdlets = things you can do with the stuff
Modules = list of things you can do with the stuff
Properties = details about the stuff
Methods = instructions for things you can do with the stuff

Pipeline
Using pipelines, one can chain commands together for processing. The output object of one command becomes the input of the next.

Functional Explanation
Get-Command: Gets all the cmdlets installed on your computer.
Get-Help: Displays additional information about a cmdlet.
Get-Member: Lists the properties and methods of a command or object.
Get-Verb: Gets approved Windows PowerShell verbs.
Start-Transcript: Logs everything you do in that PowerShell window to a file.
Get-History: If you didn’t start a transcript, you can still review your history before closing your Shell or ISE window.

Tips for PowerShell beginners
Use variables: You can use any variable names except the ones that are reserved by the system; PowerShell will notify you when you try to use a reserved variable.
Call one thing at a time.
Comment your scripts, as this may save you a lot of time.
Create scripts using an ISE/IDE (you can also use Visual Studio Code) and then execute them in the Shell.
Dispose of your objects. Close the command window by typing Exit.
Test before using in Production.
Write reusable scripts.

What PowerShell beginners should avoid
Rewriting your variables.
Hard coding values such as passwords in your scripts.
Taking code from the internet or a vendor and just running it in your environment (you should read all code before you run it in your environment).
Assuming the code is not harmful; it is. There is no back button in PowerShell and you cannot undo things.
Running your code in an IDE/ISE and expecting everything to work.

PowerShell Syntax and Bracketology

Syntax
‘#’ is for Comment
‘+’ is for Add
‘=’ is for assignment and ‘-eq’ is for Equal
‘-ne’ is for ‘not equal’; ‘!’ and ‘-not’ are for logical negation

Brackets
‘()’ Curved brackets, also known as parentheses, are used for required options, compulsory arguments, or control structures.
‘{}’ Curly brackets are used for block expressions within a command block and are also used to open a code block.
‘[]’ Square brackets are used to denote optional elements or parameters and are also used for match functions.

Now that you know the basics of PowerShell, you can start performing key admin tasks on Windows Server 2019. To further learn how to employ best practices for writing PowerShell scripts and configuring Windows Server 2019 and leverage PowerShell to automate complex Windows server tasks, check out our book, Windows Server 2019 Automation with PowerShell Cookbook - Third Edition written by Thomas Lee.

Weaponizing PowerShell with Metasploit and how to defend against PowerShell attacks [Tutorial]
Scripting with Windows Powershell Desired State Configuration [Video]
Automate tasks using Azure PowerShell and Azure CLI [Tutorial]