
How-To Tutorials

7018 Articles
Packt
21 Sep 2010
6 min read

Ubuntu Server and WordPress in 15 Minutes Flat

Introduction

Ubuntu Server is a robust, powerful and user-friendly distribution engineered by a dedicated team at Canonical as well as hundreds (if not thousands) of volunteers around the world. It powers thousands of server installations, both public and private, and is becoming a very popular and trusted solution for all types of server needs.

In this article I will outline how to install Ubuntu Server with the goal of running and publishing your own blog using the WordPress blogging software. This can be used to run a personal blog out of your home, or even a corporate blog in a workplace. Hundreds of companies use WordPress as their blogging software of choice; I've even deployed it at my office. I personally maintain about a dozen WordPress installations, all at varying levels of popularity and traffic. WordPress scales well, is easy to maintain, and is very intuitive to use. If you're not familiar with the WordPress blogging software I'd invite you to go check it out at http://www.wordpress.com.

Requirements

In order to get this whole process started you'll only need a few simple things. First, a copy of Ubuntu Server. At the time of this writing, the latest release is 10.04.1 LTS (Long Term Support), which will receive security and errata updates for five years. You can download a free copy of Ubuntu Server here: http://www.ubuntu.com/server

In addition to a copy of Ubuntu Server you'll need, of course, a platform to install it on. This could be a physical server or a virtual machine. Your times (the 15 minute goal) may vary based on the speed of your hardware. I based this article on the following platform and specifications:

  • Dell D630
  • Core 2 Duo 2.10 GHz
  • 2 GB RAM
  • VirtualBox 3.2.8 Open Source Edition

Again, your mileage may vary depending on your hardware and network, but overall this article will get you from zero to blogger in no time!
The last requirement, which I mentioned only briefly in the last paragraph, is network access. If you're installing this on a physical machine, make sure that you'll have local network access to that machine. If you're planning on installing this on a virtual machine, make sure that you configure the virtual machine to use bridged networking, making it accessible to your local area network.

To recap, your requirements are:

  • Ubuntu Server 10.04.1 LTS .iso (or printed CD)
  • Physical or virtual machine to provision
  • Local network access to said machine

Getting started

Once you have everything prepared we can jump right in and get started. Start up your virtual machine, or drop in your CD-ROM, and we'll start the installation. I've taken screenshots of each step in the process so you should be able to follow along closely. In most situations I chose the default configuration. If you are unsure about the configuration requirements during installation, it is generally safe to select the default. Again, just follow my lead and you should be fine!

This is the initial installer screen. You'll notice there are a number of options available. The highlighted option (also the default) of "Install Ubuntu Server" is what you'll want to select here.

Next, the installer will prompt you for your preferred or native language. The default here is English, and was my selection. You'll notice that there is a huge number of available languages here. This is one of the goals and strengths of Ubuntu, "that software tools should be usable by people in their local language". Select your preferred language and move on to the next step.

The next step is to select your country. If you selected English as your primary language you'll then need to select your region. The default is United States, and was also my selection.

The Ubuntu installer can automatically detect your keyboard layout if you ask it to.
The default prompt is no, which then allows you to select your keyboard from a list. I prefer to use the auto-detection, which I find a bit faster. You can use your own preference here, but be sure you select the correct layout. Nothing is more frustrating than not being able to type properly on your keyboard!

Next you'll need to assign a hostname to your machine. This is an enjoyable part of the process for me, as I get to assign a unique name to the machine I'll be working with. This always seems to personalize the process for me, and I've chosen a number of creative names for my machines. Select whatever you like here, just make sure it is unique compared to the other machines on your current network.

To help ensure that your clock is set properly the Ubuntu installer will auto-detect or prompt you for your time zone. I've found that, when installing on physical hardware, the auto-detection is usually pretty accurate. When installing on virtual hardware it has a more difficult time. The screenshot above was taken on virtual hardware, which required me to select my time zone manually. If this is the case for you, find your time zone and hit ENTER.

The next step in the installation process is partitioning the disks. Unless you have specific needs here, I'd suggest selecting the defaults. If you're wondering whether or not you do have specific needs, you probably don't. For our purpose here, setting up a web server to run WordPress, the default is just fine. Select "Guided – use entire disk and set up LVM" and hit ENTER.

The installer will prompt you with a confirmation dialog before writing partitioning changes to the disk. Because making changes to partitions and filesystems will destroy any existing data on the disk(s), this requires a second confirmation. If you are installing on a newly created virtual machine you should have nothing to worry about here.
If you are installing on physical hardware, please note that this will destroy any existing data, so be sure you are OK with that before continuing.

You also have the option of defining the size of the disk made available to your installation. Again, I selected the default here, which is to use 100% of the available space. If you have more specific requirements, make them here.

Lastly, with regard to partitioning, there is one final confirmation. This screen outlines the partitions that will be created or changed and the filesystems and formatting that will be applied to those partitions. I selected the default values in each of these filesystem-related screens. If you've done the same, and you're OK with losing any existing data that might be on the machine, finalize this change by selecting YES.

At this point the installer will install the base system within the newly created partitions. This will take a few minutes (again, your mileage may vary depending on hardware type). There are no prompts during this process, just a progress bar and a report of the packages that are being installed and configured.

Packt
16 Jun 2015
17 min read

Digging Deep into Requests

In this article by Rakesh Vidya Chandra and Bala Subrahmanyam Varanasi, authors of the book Python Requests Essentials, we are going to deal with advanced topics in the Requests module. There are many more features in the Requests module that make interaction with the web a cakewalk. Let us get to know more about the different ways to use the Requests module, which will help us understand how easy it is to use.

In a nutshell, we will cover the following topics:

  • Persisting parameters across requests using Session objects
  • Revealing the structure of request and response
  • Using prepared requests
  • Verifying SSL certificate with Requests
  • Body Content Workflow
  • Using generator for sending chunk encoded requests
  • Getting the request method arguments with event hooks
  • Iterating over streaming API
  • Self-describing the APIs with link headers
  • Transport Adapter

Persisting parameters across Requests using Session objects

The Requests module contains a Session object, which has the capability to persist settings across requests. Using this Session object, we can persist cookies, create prepared requests, use the keep-alive feature and do many more things. The Session object contains all the methods of the Requests API such as GET, POST, PUT, DELETE and so on. Before using all the capabilities of the Session object, let us get to know how to use sessions and persist cookies across requests.

Let us use the session's get method to fetch a resource.

>>> import requests
>>> session = requests.Session()
>>> response = session.get("https://google.co.in", cookies={"new-cookie-identifier": "1234abcd"})

In the preceding example, we created a session object with requests and used its get method to access a web resource. The cookie value which we set in the preceding example is accessible through response.request.headers:
>>> response.request.headers
CaseInsensitiveDict({'Cookie': 'new-cookie-identifier=1234abcd', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'})
>>> response.request.headers['Cookie']
'new-cookie-identifier=1234abcd'

With a session object, we can specify default values for the properties that need to be sent to the server with GET, POST, PUT and so on. We do this by assigning values to properties such as headers and auth on the Session object.

>>> session.params = {"key1": "value", "key2": "value2"}
>>> session.auth = ('username', 'password')
>>> session.headers.update({'foo': 'bar'})

In the preceding example, we have set default values for three properties (params, auth, and headers) on the session object. We can override them in a subsequent request if we want to, as shown in the following example:

>>> session.get('http://mysite.com/new/url', headers={'foo': 'new-bar'})

Revealing the structure of request and response

A Request object is the one created by the user when they interact with a web resource. It is sent to the server as a prepared request and accepts a number of parameters, most of which are optional. Let us take a bird's-eye view of the parameters:

  • method: This is the HTTP method to be used to interact with the web service. For example: GET, POST, PUT.
  • url: The web address to which the request needs to be sent.
  • headers: A dictionary of headers to be sent in the request.
  • files: This can be used while dealing with multipart uploads. It is a dictionary of files, with the file name as key and the file object as value.
  • data: This is the body to be attached to the request.
  • json: There are two cases that come into the picture here. If json is provided (and data is not), the content-type in the header is changed to application/json and json acts as the body of the request. In the second case, if both json and data are provided together, json is silently ignored and data is used as the body.
  • params: A dictionary of URL parameters to append to the URL.
  • auth: This is used when we need to specify the authentication for the request. It is a tuple containing username and password.
  • cookies: A dictionary or a cookie jar of cookies which can be added to the request.
  • hooks: A dictionary of callback hooks.

A Response object contains the response of the server to an HTTP request. It is generated once Requests gets a response back from the server. It contains all of the information returned by the server and also stores the Request object we created originally.

Whenever we make a call to a server using Requests, two major transactions take place:

  • We construct a Request object, which is sent out to the server to request a resource
  • A Response object is generated by the requests module

Now, let us look at an example of getting a resource from Python's official site.

>>> response = requests.get('https://python.org')

In the preceding line of code, a Request object is constructed and sent to 'https://python.org'. The request that was actually sent is then stored in the response.request variable.
We can access the headers of the Request object which was sent off to the server in the following way:

>>> response.request.headers
CaseInsensitiveDict({'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'})

The headers returned by the server can be accessed with the response's headers attribute, as shown in the following example:

>>> response.headers
CaseInsensitiveDict({'content-length': '45950', 'via': '1.1 varnish', 'x-cache': 'HIT', 'accept-ranges': 'bytes', 'strict-transport-security': 'max-age=63072000; includeSubDomains', 'vary': 'Cookie', 'server': 'nginx', 'age': '557','content-type': 'text/html; charset=utf-8', 'public-key-pins': 'max-age=600; includeSubDomains; ..)

The response object contains different attributes like _content, status_code, headers, url, history, encoding, reason, cookies, elapsed, request.

>>> response.status_code
200
>>> response.url
u'https://www.python.org/'
>>> response.elapsed
datetime.timedelta(0, 1, 904954)
>>> response.reason
'OK'

Using prepared Requests

Every request we send to the server is turned into a PreparedRequest by default. The request attribute of the Response object which is received from an API call or a session call is actually the PreparedRequest that was used.

There might be cases in which we need to send a request that requires an extra step of adding a different parameter. Parameters can be cookies, files, auth, timeout and so on. We can handle this extra step efficiently by using a combination of sessions and prepared requests. Let us look at an example:

>>> from requests import Request, Session
>>> header = {}
>>> request = Request('get', 'some_url', headers=header)

In the preceding example we are constructing a GET request with a header. Now, take an instance where we are planning to send the request with the same method, URL, and headers, but we want to add some more parameters to it.
In this case, we can prepare the request through a session so that it picks up the complete session-level state. This is done with the session object.

>>> from requests import Request, Session
>>> session = Session()
>>> request1 = Request('GET', 'some_url', headers=header)

Now, let us prepare the request using the session object so that it receives the values of the session-level state:

>>> prepare = session.prepare_request(request1)

We can now send the prepared request with more parameters, as follows:

>>> response = session.send(prepare, stream=True, verify=True)
>>> response.status_code
200

Voila! Huge time saving!

The prepare method prepares the complete request with the supplied parameters. In the previous example, the prepare_request method was used. There are also some other methods like prepare_auth, prepare_body, prepare_cookies, prepare_headers, prepare_hooks, prepare_method, prepare_url which are used to create individual properties.

Verifying an SSL certificate with Requests

Requests provides the facility to verify an SSL certificate for HTTPS requests. We can use the verify argument to check whether the host's SSL certificate is verified or not.

Let us consider a website which has no SSL certificate. We shall send a GET request to it with the verify argument. The syntax to send the request is as follows:

requests.get('no ssl certificate site', verify=True)

As the website doesn't have an SSL certificate, this will result in an error similar to the following:

requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))

Let us verify the SSL certificate for a website which is certified. Consider the following example:

>>> requests.get('https://python.org', verify=True)
<Response [200]>

In the preceding example, the result was 200, as the website in question has a valid SSL certificate. If we do not want to verify the SSL certificate with a request, we can pass the argument verify=False.
By default, the value of verify is True.

Body content workflow

Take an instance where a continuous stream of data is being downloaded when we make a request. In this situation, the client has to listen to the server continuously until it receives the complete data. Or consider the case where we want to look at the response headers first and worry about the body later. In both situations, we can use the stream parameter. Let us look at an example:

>>> r = requests.get("https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz", stream=True)

If we make a request with the parameter stream=True, the connection remains open and only the headers of the response are downloaded. This gives us the capability to fetch the content whenever we need it, subject to conditions such as the number of bytes of data. The syntax is as follows:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content

With stream=True we can also access the response as a file-like object through response.raw, or use the iter_content method to iterate over the response data in chunks. This avoids reading large responses into memory at once. The syntax is as follows:

iter_content(chunk_size=size in bytes, decode_unicode=False)

In the same way, we can iterate through the content using the iter_lines method, which iterates over the response data one line at a time. The syntax is as follows:

iter_lines(chunk_size=size in bytes, decode_unicode=None, delimiter=None)

The important thing to note while using the stream parameter is that the connection is not released when it is set to True, unless all the data is consumed or response.close is executed.

Keep-alive facility

As urllib3 supports the reuse of the same socket connection for multiple requests, we can send many requests over one socket and receive the responses using the keep-alive feature in the Requests library. Within a session, this is automatic.
Every request made within a session automatically uses the appropriate connection by default. The connection that is being used will be released after all the data from the body is read.

Streaming uploads

A file-like object of massive size can be streamed and uploaded using the Requests library. All we need to do is supply the stream as the value of the data argument in the request call, as shown in the following lines:

with open('massive-body', 'rb') as file:
    requests.post('http://example.com/some/stream/url', data=file)

Using generator for sending chunk encoded Requests

Chunked transfer encoding is a mechanism for transferring data in an HTTP request. With this mechanism, the data is sent in a series of chunks. Requests supports chunked transfer encoding for both outgoing and incoming requests. In order to send a chunk encoded request, we need to supply a generator for the body. The usage is shown in the following example:

>>> def generator():
...     yield "Hello "
...     yield "World!"
...
>>> requests.post('http://example.com/some/chunked/url/path', data=generator())

Getting the request method arguments with event hooks

We can use hooks to alter portions of the request process or to handle signal events. For example, there is a hook named response, which receives the response generated from a request. Hooks are supplied as a dictionary which can be passed as a parameter to the request. The syntax is as follows:

hooks = {hook_name: callback_function, … }

The callback_function may or may not return a value. When it returns a value, that value is assumed to replace the data that was passed in. If the callback function doesn't return any value, there won't be any effect on the data.

Here is an example of a callback function. Since it is attached to the response hook, its first argument is the response:

>>> def print_attributes(response, *args, **kwargs):
...     print(response.url)
...     print(response.status_code)
...     print(response.headers)

If there is an error in the execution of callback_function, you'll receive a warning message in the standard output.

Now let us print some of the attributes of the response, using the preceding callback function:

>>> requests.get('https://www.python.org/', hooks=dict(response=print_attributes))
https://www.python.org/
200
CaseInsensitiveDict({'content-type': 'text/html; ...})
<Response [200]>

Iterating over streaming API

A streaming API tends to keep the request open, allowing us to collect the stream data in real time. While dealing with a continuous stream of data, we can use iter_lines() in Requests to ensure that none of the messages are missed. iter_lines() iterates over the response data line by line. This is achieved by setting the parameter stream to True while sending the request.

Keep in mind that iter_lines() is not reentrant safe: calling it more than once on the same response may result in loss of received data.

Consider the following example taken from http://docs.python-requests.org/en/latest/user/advanced/#streaming-requests:

>>> import json
>>> import requests
>>> r = requests.get('http://httpbin.org/stream/4', stream=True)
>>> for line in r.iter_lines():
...     if line:
...         print(json.loads(line))

In the preceding example, the response contains a stream of data. With the help of iter_lines(), we print the data by iterating through every line.

Encodings

As specified in the HTTP protocol (RFC 7230), applications can request the server to return HTTP responses in an encoded format. Decoding turns the response content into a readable form that is easy to work with. When the HTTP headers do not state the encoding, Requests will try to guess it with the help of chardet.

If we access the response headers of a request, they contain the content-type key.
Let us look at a response header's content-type:

>>> r = requests.get('http://google.com')
>>> r.headers['content-type']
'text/html; charset=ISO-8859-1'

In the preceding example the content type is 'text/html; charset=ISO-8859-1', so the server declares the charset itself. When no explicit charset is present in the headers and the content type is a text type, Requests follows RFC 2616 and assumes the charset to be ISO-8859-1.

In case we are dealing with a different encoding such as 'utf-8', we can specify it explicitly by setting the response.encoding property.

HTTP verbs

Requests supports the full range of HTTP verbs, which are described in the following list. For most of the supported verbs, 'url' is the only argument that must be passed.

  • GET: The GET method requests a representation of the specified resource. Apart from retrieving the data, there will be no other effect of using this method. Definition is given as requests.get(url, **kwargs)
  • POST: The POST verb is used for the creation of new resources. The submitted data will be handled by the server at the specified resource. Definition is given as requests.post(url, data=None, json=None, **kwargs)
  • PUT: This method uploads a representation of the specified URI. If the URI is not pointing to any resource, the server can create a new object with the given data; otherwise it will modify the existing resource. Definition is given as requests.put(url, data=None, **kwargs)
  • DELETE: This is pretty easy to understand. It is used to delete the specified resource. Definition is given as requests.delete(url, **kwargs)
  • HEAD: This verb is useful for retrieving the meta-information written in the response headers without having to fetch the response body. Definition is given as requests.head(url, **kwargs)
  • OPTIONS: OPTIONS is an HTTP method which returns the HTTP methods that the server supports for a specified URL. Definition is given as requests.options(url, **kwargs)
  • PATCH: This method is used to apply partial modifications to a resource. Definition is given as requests.patch(url, data=None, **kwargs)

Self-describing the APIs with link headers

Take the case of accessing a resource whose information is spread across several pages. If we need to reach the next page of the resource, we can make use of the link headers. The link headers contain metadata about the requested resource, which in our case is the next page information.

>>> url = "https://api.github.com/search/code?q=addClass+user:mozilla&page=1&per_page=4"
>>> response = requests.head(url=url)
>>> response.headers['link']
'<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2&per_page=4>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=250&per_page=4>; rel="last"'

In the preceding example, we specified in the URL that we want to access page number one and that it should contain four records. Requests automatically parses the link headers, so the information about the next page is readily available. When we accessed the link header, the output showed the URLs of the next and last pages, including the page number and the number of records per page.

Transport Adapter

A Transport Adapter provides an interface for Requests sessions to connect with HTTP and HTTPS. With the help of Transport Adapters, we can configure requests according to the HTTP service we opt to use. Requests ships with a Transport Adapter called HTTPAdapter.

Consider the following example:

>>> session = requests.Session()
>>> adapter = requests.adapters.HTTPAdapter(max_retries=6)
>>> session.mount("http://google.co.in", adapter)

In this example, we created a session in which every request to a URL starting with http://google.co.in is retried up to six times when the connection fails.

Summary

In this article, we learnt about creating sessions and using the session with different criteria.
We also looked deeply into HTTP verbs and using proxies. We learnt about streaming requests, dealing with SSL certificate verifications and streaming responses. We also got to know how to use prepared requests, link headers and chunk encoded requests.
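
Returning to the link-header section above, it may help to see what "parsing" such a header amounts to. The following is a minimal sketch using only the standard library; the names parse_link_header and _LINK_RE are hypothetical helpers written for this illustration, not part of Requests, and the sketch assumes each entry carries a single quoted rel parameter as in the GitHub output shown earlier. In real code, Requests already exposes the parsed result as response.links.

```python
import re

# Hypothetical helper pattern: matches one entry such as
#   <https://example.com/?page=2>; rel="next"
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Map each rel name to its URL (simplified: only the rel parameter is read)."""
    return {rel: url for url, rel in _LINK_RE.findall(value)}


# The header value from the link-header example in this article.
header = (
    '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2&per_page=4>; rel="next", '
    '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=250&per_page=4>; rel="last"'
)

links = parse_link_header(header)
print(links["next"])  # URL of the next page, ready to pass to requests.get
```

With Requests itself, the equivalent lookup would be response.links["next"]["url"], which avoids maintaining a hand-written parser.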

Packt
14 Oct 2009
5 min read

Essbase ASO (Aggregate Storage Option)

Welcome to the exciting world of Essbase Analytics known as the Aggregate Storage Option (ASO). Well, now you're ready to take everything one step further. You see, the Block Storage Option (BSO) is the original database architecture that Essbase uses behind the scenes to store data in an Essbase database. The ASO method is entirely different.

What is ASO

ASO is Essbase's alternative to the sometimes cumbersome BSO method of storing data in an Essbase database. In fact, BSO is exactly what makes Essbase a superior OLAP analytical tool, but it is also BSO that can occasionally be a detriment to the level of system performance demanded in today's business world.

In a BSO database, all data is stored, except for dynamically calculated members. All data consolidations and parent-child relationships in the database outline are stored as well. While the block storage method is quite efficient from a data-to-size ratio perspective, a BSO database can require large amounts of overhead to deliver the retrieval performance demanded by the business customer.

The ASO database efficiently stores not only zero-level data, but can also store aggregated hierarchical data, with the understanding that stored hierarchies can only have the no-consolidation (~) or the addition (+) operator assigned to them, and that the no-consolidation (~) operator can only be used underneath Label Only members.

Outline member consolidations are performed on the fly using dynamic calculations, and only at the time of the request for data. This is the main reason why ASO is a valuable option worth considering when building an Essbase system for your customer.

Because of the simplified levels of data stored in the ASO database, a simpler method of storing the physical data on disk can also be used. It is this simplified storage method that can help deliver higher performance for the customer.
Your choice of one database type over the other will always depend on balancing the customer's needs with the server's physical capabilities, along with the volume of data. These factors must be given equal consideration.

Creating an aggregate storage Application|Database

Believe it or not, creating an ASO Essbase application and database is as easy as creating a BSO application and database. All you need to do is follow these simple steps:

1. Right-click on the server name in your EAS console for the server on which you want to create your ASO application.
2. Select Create application | Using aggregate storage as shown in the following screenshot:
3. Click on Using aggregate storage and that's it. The rest of the steps are easy to follow and basically the same as for a BSO application.

To create an ASO application and database, you follow virtually the same steps as you do to create a BSO application and database. However, there are some important differences, and here we list a few:

  • A BSO database outline can be converted into an Aggregate Storage database outline, but an Aggregate Storage database outline cannot be converted into a Block Storage database outline.

    Steps to convert a BSO application into an ASO application:

    1. Open the BSO outline that you wish to convert, select the Essbase database and click on the File | Wizards | Aggregate Storage Outline Conversion option. You will see the first screen, Select Source Outline. The source of the outline can be in a file system or on the Essbase Server. In this case, we have selected the OTL from the Essbase Server; then click Next as shown in the following screenshot:
    2. In the next screen, the conversion wizard will verify the conversion and display a message that the conversion has completed successfully. Click Next.
    3. Here, Essbase prompts you to select the destination of the ASO outline. If you have not yet created an ASO application, you can click on Create Aggregate Storage Application in the bottom-right corner of the screen as shown in the next screenshot:
    4. Enter the Application and the Database name and click on OK. Your new ASO application is created; now click on Finish. Your BSO application is now converted into an ASO application. You may still need to tweak the ASO application settings and outline members to be the best fit for your needs.

  • In an ASO database, all dimensions are Sparse, so there is no need to try to determine the best Dense/Sparse settings as you would with a BSO database.
  • Although Essbase recommends that you only have one Essbase database in an Essbase application, you can create more than one database per application when you are using BSO. When you create an ASO application, Essbase will only allow one database per application.

There is quite a bit to know about ASO, but have no fear: with all that you know about Essbase and how to design and build an Essbase system, it will seem easy for you.

Keep reading for more valuable information on ASO, such as when it is a good time to use ASO, how to query ASO databases effectively, and what the differences between ASO and BSO are. If you understand the differences, you can then understand the benefits.

M.T. White
22 Aug 2023
12 min read

ChatGPT and Azure Low Code Machine Learning

Introduction

ChatGPT can do many amazing things. It can easily troubleshoot code, generate source code, and much more. However, software development, and by extension data engineering, comprises much more than text-based programming. For example, Azure offers a low/no-code tool that can be used to generate machine learning models without having to spend countless hours writing code by hand. However, there is a caveat to this service: a person has to know what they are doing to use it. For many, building a machine-learning system is a complex task. This is where ChatGPT comes into play. ChatGPT can easily guide a user in building the pipeline needed to meet their goals. What is especially interesting about this notion is that we can use AI to help build other AI! To demonstrate this, we’re going to use ChatGPT as a guiding tool to build our own linear regression model in Azure.

Expectations

At the time of writing this article, ChatGPT cannot produce actual diagrams; however, what it can do is guide you. This means that you still have to be familiar with the overall gist of the technology, but you don’t have to be a data science expert to figure out how to make a model from scratch. ChatGPT is an excellent tool to get you up and running quickly, but as stated in many other articles throughout the internet, ChatGPT is not a replacement for developers. On a side note, this tutorial is meant to be interactive. ChatGPT will only generate a series of steps that must be interpreted. This means that if you don’t know how to carry out a task, there are a lot of opportunities to create your own prompts. With that, let’s start!

Setup

This tutorial assumes that you have a basic knowledge of Azure and the Azure machine learning services. However, we’re going to use ChatGPT as a guide from start to finish, so if you’re not sure of what you’re doing you can always modify the prompts or create new ones to help you along. 
As an exercise, if you do get stuck it is a good idea to try to write some prompts on your own to get a feel for ChatGPT. With that, let’s start by having ChatGPT guide us in creating the necessary environment for our machine learning pipeline.

Environment

To begin, let’s get our environment set up. To figure out how to do this, let’s use the following prompt:

What infrastructure will I need to build a no-code Azure machine learning pipeline?

Once you run this prompt you should be met with something akin to the following:

To build a no-code Azure Machine Learning pipeline, you will need the following infrastructure components:

1. Azure Subscription: You need an active Azure subscription to access and utilize Azure Machine Learning services.
2. Azure Machine Learning Workspace: Create an Azure Machine Learning workspace, which serves as the top-level resource for organizing and managing your machine learning assets. The workspace provides a centralized location to store your pipelines, datasets, models, and other resources.
3. Compute Resources: Azure Machine Learning provides various compute options for running your pipeline. Depending on your workload and requirements, you can choose from options such as Azure Machine Learning Compute, Azure Databricks, or remote VMs. These resources will execute the pipeline steps and perform the required computations.
4. Data Storage: Azure Machine Learning supports multiple data storage options, such as Azure Blob Storage, Azure Data Lake Storage, or Azure SQL Database. You need a suitable data storage solution to store your input datasets, intermediate data, and any other data required for your pipeline.
5. Networking and Security: Ensure that your Azure subscription and associated resources are properly secured. 
Azure provides various security features, such as Virtual Network (VNet) integration, network security groups (NSGs), role-based access control (RBAC), and Azure Active Directory (AAD) authentication, to protect your resources and control access.
6. Monitoring and Logging: Azure Machine Learning offers monitoring and logging capabilities to track the execution of your pipelines, monitor performance metrics, and troubleshoot issues. You can use Azure Monitor, Azure Log Analytics, or other monitoring tools to gain insights into pipeline performance and health.
7. Integration with DevOps: If you have a DevOps workflow, you may want to integrate your no-code pipeline with your existing CI/CD processes. Azure Machine Learning can be integrated with Azure DevOps, GitHub Actions, or other tools to automate the deployment and management of your pipelines.

Remember to evaluate your specific workload requirements and choose the appropriate infrastructure components accordingly. The exact infrastructure setup will vary depending on factors such as data size, computational needs, security requirements, and integration with existing systems.

This prompt generated a long list of infrastructure components that are necessary for a production-level pipeline. However, our pipeline will only be used for practice, so all we need to do is follow steps 1-3. In other words, all we need for this tutorial is an Azure subscription, an Azure Machine Learning workspace, and a compute instance. If you can create these by yourself, do so; if you can’t, try writing your own prompt. For example, try engineering a prompt that can give you the steps necessary to set up a compute instance. You could try something like the following:

How do I set up a compute instance in Azure?

If this prompt doesn’t provide you with the necessary information, try rewording it and giving it more details until you can do it with the provided steps.

Pipeline

The next step is building out the pipeline.  
The pipeline is the logic for the model. In other words, the pipeline is the algorithm that will load the data for the model, clean the data, train the model, and evaluate the model if necessary. In short, the best way to think of the pipeline is as a graphical machine learning program. As stated before, ChatGPT is mostly limited to text-based responses. However, we can still get step-by-step directions with the following prompt:

Build a minimal linear regression model using Azure Machine Learning Studio

Which will output the following response:

As an AI language model, I don't have direct access to external tools or platforms like Azure Machine Learning Studio, and my capabilities are limited to providing instructions and guidance. However, I can walk you through the steps to build a minimal linear regression model using Azure Machine Learning Studio. Please note that the steps might slightly vary based on updates to the platform since my last update in September 2021. Make sure to check the official Azure Machine Learning documentation for any changes. Here's a general outline of the steps:

1. Set up Azure Machine Learning Studio: If you don't have an Azure Machine Learning account, sign up for one. Once you have access to Azure Machine Learning Studio, create a new project or workspace.
2. Create a new Experiment: In Azure Machine Learning Studio, create a new experiment by clicking on "Experiments" and then "New." Give your experiment a descriptive name.
3. Data Ingestion: Upload your dataset to Azure Machine Learning Studio. You can do this by clicking on "Datasets" and then "New" to upload your dataset.
4. Data Cleaning and Preparation: Prepare your data by cleaning and transforming it if needed. This might include removing missing values, encoding categorical variables, and splitting the data into training and testing sets.
5. Add Linear Regression Module: Drag and drop the "Linear Regression" module from the left panel to the canvas in your experiment.
6. Connect Data to the Model: Connect the dataset you uploaded earlier to the "Linear Regression" module by clicking and dragging from the dataset to the input of the linear regression module.
7. Train the Model: Run the experiment to train the linear regression model on your dataset.
8. Evaluate the Model: After the training is complete, you can use evaluation modules like "Score Model" and "Evaluate Model" to assess the performance of the trained linear regression model.
9. Deploy the Model (Optional): If you want to use the trained model for predictions in real-world scenarios, you can deploy the model as a web service in Azure Machine Learning Studio.
10. Test the Deployed Model (Optional): Test the deployed model to ensure it is working correctly and providing accurate predictions.

Remember, this is just a minimal example of building a linear regression model using Azure Machine Learning Studio. Depending on your dataset and specific use case, you may need to perform additional steps like feature engineering, hyperparameter tuning, and more thorough model evaluation. Always refer to the official documentation and resources from Azure Machine Learning Studio for the most up-to-date and detailed instructions.

Analyzing the output, the first two steps are self-explanatory. The third step will require some preparation since we’re going to use a custom dataset for this tutorial. This tutorial is going to use a dataset that consists of 14 points. Normally, you’ll want to have a dataset of at least 30 points to ensure normal distribution; however, for this example, we’re going to assume the data has a normal distribution.  
With that, to train the model for this tutorial, copy the following data into either a plain text file or an Excel spreadsheet and save it as a .csv file:

Once you have the data saved, you’ll need to create a data asset and upload the file to Azure. If you don’t know the steps to upload the dataset, you should try writing another prompt for directions. If you need more information on this step you could try a prompt like:

How do I upload a custom dataset to Azure Machine Learning Studio?

Moving on to the rest of the instructions, we can construct a pipeline that resembles the pipeline in Figure 1:

Figure 1 – Completed Pipeline

This pipeline is, for the most part, a one-to-one representation of the instructions. However, it expands on Step 4. In this pipeline, we added a Select Columns in Dataset component to ensure we are only including the necessary data for the model. From there we ran that data to a Split Data component, as was suggested by ChatGPT. Other than that, the model is exactly as described, with the exception of the last two steps, which are for deployment and testing; that is, the two steps that were labeled as “optional” by ChatGPT.

For this tutorial, build the model as seen in Figure 1 and run it. After you run the pipeline, you can see how well the model performed by clicking the Evaluate Model component. There is a lot of information that can be unpacked, but you can see the scores by navigating to the metrics tab. If you used the same dataset, your numbers should be around the values that are in Figure 2.

Figure 2 – Linear Regression Outputs

At this point, ChatGPT has guided us in building a linear regression model. Overall, the linear regression model that ChatGPT guided us to build is a very simple model that, all things considered, is fairly accurate.

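To make the pipeline's logic concrete, the sketch below mirrors the same stages in plain Python: split the data, train a linear regression by ordinary least squares, score the held-out rows, and evaluate the result. It is only an illustration of the idea, not what Azure Machine Learning Studio runs internally. The 14-point dataset, the 70/30 split, and every function name are assumptions made up for this example; the tutorial's own data is not reproduced here.

```python
import math
import random

# Hypothetical 14-point dataset (NOT the tutorial's data): y is roughly 2x + 1.
NOISE = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, 0.2, -0.2, 0.0, -0.1]
DATA = [(float(x), 2.0 * x + 1.0 + n) for x, n in zip(range(1, 15), NOISE)]


def split_data(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_linear_regression(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score_model(model, rows):
    """Predict y for every row using the fitted (slope, intercept)."""
    slope, intercept = model
    return [slope * x + intercept for x, _ in rows]


def evaluate_model(actual, predicted):
    """Return mean absolute error, root mean squared error, and R squared."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / len(actual)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mae": sum(abs(e) for e in errors) / len(errors),
        "rmse": math.sqrt(ss_res / len(errors)),
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


if __name__ == "__main__":
    train, test = split_data(DATA)
    model = train_linear_regression(train)
    metrics = evaluate_model([y for _, y in test], score_model(model, test))
    print("slope=%.3f intercept=%.3f" % model)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The four functions correspond loosely to the Split Data, Linear Regression training, Score Model, and Evaluate Model stages named in ChatGPT's response, which may help when reading the metrics tab of the finished pipeline.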
Summary

This tutorial has been a crash course on how ChatGPT can be used to build no-code solutions in Azure Machine Learning Studio. What’s incredible about this tutorial is that we used AI to help build another AI system. However, as was seen in the tutorial, ChatGPT was only a guide. For graphical systems, ChatGPT, at best, can be used as a guide. This means that for systems like Azure Machine Learning Studio, a basic understanding of using the system is required. As such, for graphical systems ChatGPT is best utilized by people who are knowledgeable about the system but need guidance for the task at hand. For example, if this were a real project, the ideal engineer would be a person who knows how to use Azure Machine Learning Studio but needs help with creating the pipeline logic.

In terms of graphical-based programming, ChatGPT is almost a little ironic. For example, when it comes to text-based programming in Java, Python, or whatever other language, ChatGPT can be used to generate working code. However, because ChatGPT currently cannot generate graphical programs for systems like Azure Machine Learning Studio, a person will require a more in-depth knowledge of the system. As ChatGPT matures this may change, but for now it is best to still have a knowledgeable engineer driving ChatGPT and implementing its solutions. Overall, ChatGPT is an excellent assistant but will require a person who is knowledgeable about the technology being used.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry, where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming

Packt
29 Jan 2010
9 min read

Basic Doctest in Python

Doctest will be the mainstay of your testing toolkit. You'll be using it for tests, of course, but also for things that you may not think of as tests right now. For example, program specifications and API documentation both benefit from being written as doctests and checked alongside your other tests.

Like program source code, doctest tests are written in plain text. Doctest extracts the tests and ignores the rest of the text, which means that the tests can be embedded in human-readable explanations or discussions. This is the feature that makes doctest so suitable for non-classical uses such as program specifications.

Time for action – creating and running your first doctest

We'll create a simple doctest, to demonstrate the fundamentals of using doctest.

Open a new text file in your editor, and name it test.txt.
Insert the following text into the file:

    This is a simple doctest that checks some of Python's arithmetic
    operations.

    >>> 2 + 2
    4

    >>> 3 * 3
    10

We can now run the doctest. The details of how we do that depend on which version of Python we're using. At the command prompt, change to the directory where you saved test.txt.
If you are using Python 2.6 or higher, type:

    $ python -m doctest test.txt

If you are using Python 2.5 or lower, the above command may seem to work, but it won't produce the expected result. This is because Python 2.6 is the first version in which doctest looks for test file names on the command line when you invoke it this way.
If you're using an older version of Python, you can run your doctest by typing:

    $ python -c "__import__('doctest').testfile('test.txt')"

When the test is run, you should see output as shown in the following screen:

What just happened?

You wrote a doctest file that describes a couple of arithmetic operations, and executed it to check whether Python behaved as the tests said it should. You ran the tests by telling Python to execute doctest on the files that contained the tests. 
In this case, Python's behavior differed from the tests because, according to the tests, three times three equals ten! Python disagrees on that. As doctest expected one thing and Python did something different, doctest presented you with a nice little error report showing where to find the failed test, and how the actual result differed from the expected result. At the bottom of the report is a summary showing how many tests failed in each file tested, which is helpful when you have more than one file containing tests.

Remember, doctest files are for computer and human consumption. Try to write the test code in a way that human readers can easily understand, and add in plenty of plain language commentary.

The syntax of doctests

You might have guessed from looking at the previous example: doctest recognizes tests by looking for sections of text that look like they've been copied and pasted from a Python interactive session. Anything that can be expressed in Python is valid within a doctest.

Lines that start with a >>> prompt are sent to a Python interpreter. Lines that start with a ... prompt are sent as continuations of the code from the previous line, allowing you to embed complex block statements into your doctests. Finally, any lines that don't start with >>> or ..., up to the next blank line or >>> prompt, represent the output expected from the statement. The output appears as it would in an interactive Python session, including both the return value and anything printed to the console. If you don't have any output lines, doctest takes that to mean that the statement is expected to have no visible result on the console.

Doctest ignores anything in the file that isn't part of a test, which means that you can place explanatory text, HTML, line-art diagrams, or whatever else strikes your fancy in between your tests. We took advantage of that in the previous doctest, to add an explanatory sentence before the test itself.

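The syntax rules above can also be tried without creating a file. The following is a minimal sketch, written for Python 3, that feeds a doctest-formatted string to the standard library's doctest.DocTestParser and doctest.DocTestRunner. The helper name run_spec and the sample text are made up for this illustration.

```python
import doctest

# A raw string, so that \n reaches the doctest source unchanged.
SPEC = r'''
This sentence is ordinary commentary, so doctest skips it.

>>> 2 + 2
4

A block statement continues on lines that start with the ... prompt:

>>> def double(x):
...     return x * 2
>>> double(21)
42

A statement with no expected output lines must print nothing:

>>> import sys

Printed text and the return value both count as expected output:

>>> def test_write():
...     sys.stdout.write("Hello\n")
...     return True
>>> test_write()
Hello
True
'''


def run_spec(text, name="spec"):
    """Parse doctest-formatted text and run every example found in it."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, "<spec>", 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # result has .failed and .attempted counts


if __name__ == "__main__":
    print(run_spec(SPEC))
    # The article's deliberately wrong example prints a failure report:
    print(run_spec(">>> 3 * 3\n10\n", name="wrong"))
```

Each >>> statement counts as one test, which is why defining a function and calling it are reported as two separate examples.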
Time for action – writing a more complex test

We'll write another test (you can add it to test.txt if you like) which shows off most of the details of doctest syntax.

Insert the following text into your doctest file (test.txt), separated from the existing tests by at least one blank line:

    Now we're going to take some more of doctest's syntax for a spin.

    >>> import sys
    >>> def test_write():
    ...     sys.stdout.write("Hello\n")
    ...     return True
    >>> test_write()
    Hello
    True

Think about it for a moment: What does this do? Do you expect the test to pass, or to fail?
Run doctest on the test file, just as we discussed before. Because we added the new tests to the same file containing the tests from before, we still see the notification that three times three does not equal ten. Now, though, we also see that five tests were run, which means our new tests ran and succeeded.

What just happened?

As far as doctest is concerned, we added three tests to the file. The first one says that when we import sys, nothing visible should happen. The second test says that when we define the test_write function, nothing visible should happen. The third test says that when we call the test_write function, Hello and True should appear on the console, in that order, on separate lines. Since all three of these tests pass, doctest doesn't bother to say much about them. All it did was increase the number of tests reported at the bottom from two to five.

Expecting exceptions

That's all well and good for testing that things work as expected, but it is just as important to make sure that things fail when they're supposed to fail. Put another way, sometimes your code is supposed to raise an exception, and you need to be able to write tests that check that behavior as well. Fortunately, doctest follows nearly the same principle in dealing with exceptions that it does with everything else: it looks for text that looks like a Python interactive session. 
That means it looks for text that looks like a Python exception report and traceback, matching it against any exception that gets raised.

Doctest does handle exceptions a little differently from other tools. It doesn't just match the text precisely and report a failure if it doesn't match. Exception tracebacks tend to contain many details that are not relevant to the test, but which can change unexpectedly. Doctest deals with this by ignoring the traceback entirely: it's only concerned with the first line, Traceback (most recent call last), which tells it that you expect an exception, and the part after the traceback, which tells it which exception you expect. Doctest only reports a failure if one of these parts does not match. That's helpful for a second reason as well: manually figuring out what the traceback would look like when you're writing your tests would require a significant amount of effort, and would gain you nothing. It's better to simply omit it.

Time for action – expecting an exception

This is yet another test that you can add to test.txt, this time testing some code that ought to raise an exception.

Insert the following text into your doctest file (note that the last line of this text must be a single line):

    Here we use doctest's exception syntax to check that Python is
    correctly enforcing its grammar.

    >>> def faulty():
    ...     yield 5
    ...     return 7
    Traceback (most recent call last):
    SyntaxError: 'return' with argument inside generator (<doctest test.txt[5]>, line 3)

The test is supposed to raise an exception, so it will fail if it doesn't raise the exception, or if it raises the wrong exception. Make sure you have your mind wrapped around that: if the test code executes successfully, the test fails, because it expected an exception.
Run the tests using doctest and the following screen will be displayed:

What just happened?

Since Python doesn't allow a function to contain both yield statements and return statements with values, having the test define such a function caused an exception. In this case, the exception was a SyntaxError with the expected value. As a result, doctest considered it a match with the expected output, and thus the test passed. When dealing with exceptions, it is often desirable to be able to use a wildcard matching mechanism. Doctest provides this facility through its ellipsis directive, which we'll discuss later.

Expecting blank lines in the output

Doctest uses the first blank line to identify the end of the expected output. So what do you do when the expected output actually contains a blank line? Doctest handles this situation by matching a line that contains only the text <BLANKLINE> in the expected output with a real blank line in the actual output.

Using directives to control doctest

Sometimes, the default behavior of doctest makes writing a particular test inconvenient. That's where doctest directives come to our rescue. Directives are specially formatted comments that you place after the source code of a test, which tell doctest to alter its default behavior in some way. A directive comment begins with # doctest:, after which comes a comma-separated list of options that either enable or disable various behaviors. To enable a behavior, write a + (plus symbol) followed by the behavior name. To disable a behavior, write a - (minus symbol) followed by the behavior name.

Ignoring part of the result

It's fairly common that only part of the output of a test is actually relevant to determining whether the test passes. By using the +ELLIPSIS directive, you can make doctest treat the text ... (called an ellipsis) in the expected output as a wildcard, which will match any text in the output. When you use an ellipsis, doctest will scan ahead until it finds text matching whatever comes after the ellipsis in the expected output, and continue matching from there. 
This can lead to surprising results such as an ellipsis matching against a 0-length section of the actual output, or against multiple lines. For this reason, it needs to be used thoughtfully.
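The exception matching, <BLANKLINE>, and +ELLIPSIS behaviors described above can be seen together in one short run. This is a sketch for Python 3 that uses a ValueError rather than the generator SyntaxError shown earlier. The helper run_spec and the sample text are invented for the illustration.

```python
import doctest

# A raw string, so that \n reaches the doctest source unchanged.
SPEC = r'''
Only the Traceback header and the final exception line are compared;
the indented traceback body in between is ignored.

>>> int("not a number")
Traceback (most recent call last):
    this traceback detail is never checked
ValueError: invalid literal for int() with base 10: 'not a number'

<BLANKLINE> stands for an empty line in the expected output.

>>> print("first\n\nthird")
first
<BLANKLINE>
third

With +ELLIPSIS, three dots match any run of text, even across many items.

>>> print(list(range(100)))  # doctest: +ELLIPSIS
[0, 1, ..., 99]

The wildcard can also match nothing at all, which may be surprising.

>>> print("ab")  # doctest: +ELLIPSIS
a...b
'''


def run_spec(text, name="spec"):
    """Parse doctest-formatted text and run every example found in it."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, "<spec>", 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # result has .failed and .attempted counts


if __name__ == "__main__":
    print(run_spec(SPEC))
```

The last example shows the zero-length match mentioned in the text: the expected output a...b accepts the actual output ab, so an ellipsis should be surrounded by enough fixed text to make the intent unambiguous.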

Packt
23 Mar 2010
4 min read

Recording an Interview with Skype using Audacity 1.3

In a previous article we described everything that you need to know about recording voice tracks. In this article by Bethany Hiitola, author of Getting started with Audacity 1.3, we will learn all the details of using third-party Internet telephony software such as Skype to record telephone interviews. We will also cover how to set up a timed recording.

Recording an interview with Skype

If you are interested in doing more than solo podcasts with Audacity, you can always try creating interview podcasts. You can record these live in your office with your computer's internal microphone, or with additional microphones. However, you can't always perform an interview from the comfort of your office, due to conflicting schedules and the location of your interviewee. Hence, let's learn how to record an interview using your phone and your computer.

First you'll need to install another piece of software that allows you to make phone calls using your computer. The program we are going to use for this example is Skype. However, you could use other software that does the same thing for your Internet telephony setup.

Download and install Skype

Skype is software that allows us to make voice calls over the Internet, particularly to other users of Skype. Some numbers (such as toll-free numbers) are free of charge, while calls to landlines and mobile phones may require a small fee. For details on pricing for Skype credits for landline and cell calls go to: http://www.skype.com/.

Let's briefly discuss how to download and install Skype.

First, go to http://www.skype.com/ and download the appropriate version of the software for your computer.
Once the installation package has been downloaded to your computer, double-click on it to begin the installation.
For Mac computers, a .DMG file is downloaded. All you need to do is uncompress that file and drag-and-drop the Skype package to the Applications folder. For any Windows device, an .exe file is downloaded. 
Double-click on that file to begin the installation. For Linux, there are multiple distributions available.
If you aren't already prompted to do so, start the Skype application and follow the on-screen instructions to sign up for a new Skype account.
Once you have registered and signed in, the main Skype screen is displayed, which should look similar to the next screenshot:

Set up Skype for your telephone interview

For our project, we've been using the computer's internal microphone, so there shouldn't be any additional setup in either Skype or Audacity. However, to be sure, you may want to check the recording input devices in Audacity to make sure that you can record both sides of the interview. To do this, use the following steps:

In the Audacity window, go to the main menu, and then select Audacity and then Preferences. When using a computer running the Microsoft Windows or Linux operating systems, you can find these preferences from the main menu. Select File and then Preferences.
In the Audacity Preferences window, select Devices.
Check the Device settings under Recording. Particularly if you are using multiple inputs, it may be best to select Stereo Mixer or a similar input.

If there are many devices listed in the Recording | Device drop-down menu, perform a few interview tests with a friend on Skype prior to the recording session, to determine which of the connected devices will actually be doing the recording.

You might also want to turn off all notifications in Skype. These are all of the alert sounds for events such as contacts logging in and out, incoming call alerts, and so on. To do this, follow the steps shown below:

Open Skype and log in.
From the main menu, select Skype and then Preferences.
Select the Notifications tab.
Be sure to uncheck the Play Sound checkbox. This will make sure that all sounds are suspended and won't interrupt your recording session.
Sugandha Lahoti
30 Dec 2017
5 min read

How to create a Box and Whisker Plot in Tableau

This article is an excerpt from a book written by Shweta Sankhe-Savale, titled Tableau Cookbook – Recipes for Data Visualization. With the recipes in this book, learn to create beautiful data visualizations in no time on Tableau.

In today’s tutorial, we will learn how to create a Box and Whisker plot in Tableau.

The Box plot, or Box and Whisker plot as it is popularly known, is a convenient statistical representation of the variation in a statistical population. It is a great way of showing a number of data points as well as showing the outliers and the central tendencies of data. This visual representation of the distribution within a dataset was first introduced by American mathematician John W. Tukey in 1969. A box plot is significantly easier to plot than, say, a histogram, and it does not require the user to make assumptions regarding the bin sizes and number of bins; yet it gives significant insight into the distribution of the dataset.

The box plot primarily consists of four parts:

The median provides the central tendency of our dataset. It is the value that divides our dataset into two parts, values that are either higher or lower than the median. The position of the median within the box indicates the skewness in the data as it shifts either towards the upper or lower quartile.
The upper and lower quartiles, which form the box, represent the degree of dispersion or spread of the data between them. The difference between the upper and lower quartile is called the Interquartile Range (IQR) and it indicates the mid-spread within which 50 percent of the points in our dataset lie.
The upper and lower whiskers in a box plot can either be plotted at the maximum and minimum value in the dataset, or 1.5 times the IQR on the upper and lower side. Plotting the whiskers at the maximum and minimum values includes 100 percent of all values in the dataset, including all the outliers. 
Plotting the whiskers at 1.5 times the IQR on the upper and lower side, by contrast, leaves the outliers in the data as points beyond the whiskers.
The points lying between the lower whisker and the lower quartile are the lower 25 percent of values in the dataset, whereas the points lying between the upper whisker and the upper quartile are the upper 25 percent of values in the dataset.

For a symmetric distribution, such as the normal distribution, the two halves of the box plot mirror each other around the median. However, in most cases, the box plot will quickly show the underlying variations and trends in data and allows for easy comparison between datasets:

Getting Ready

Create a Box and Whisker plot in a new sheet in a workbook. For this purpose, we will connect to an Excel file named Data for Box plot & Gantt chart, which has been uploaded on https://1drv.ms/f/s!Av5QCoyLTBpnhkGyrRrZQWPHWpcY. Let us save this Excel file in the Documents | My Tableau Repository | Datasources | Tableau Cookbook data folder. The data contains information about customers in terms of their gender and recorded weight. The data contains 100 records, one record per customer. Using this data, let us look at how we can create a Box and Whisker plot.

How to do it

1. Once we have downloaded and saved the data from the link provided in the Getting Ready section, we will create a new worksheet in our existing workbook and rename it to Box and Whisker plot.
2. Since we haven't connected to the new dataset yet, establish a new data connection by pressing Ctrl + D on our keyboard.
3. Select the Excel option and connect to the Data for Box plot & Gantt chart file, which is saved in our Documents | My Tableau Repository | Datasources | Tableau Cookbook data folder.
4. Next let us select the table named Box and Whisker plot data by double-clicking on it. Let us go ahead with the Live option to connect to this data.
5. Next let us multi-select the Customer and Gender fields from the Dimensions pane and Weight from the Measures pane by doing a Ctrl + Select. 
Refer to the following image: 6. Next let us click on the Show Me! button and select the box-and-whisker plot. Refer to the highlighted section in the following image: 7. Once we click on the box-and-whisker plot option, we will see the following view: How it works In the preceding chart, we get two box and whisker plots: one for each gender. The whiskers are the maximum and minimum extent of the data. Furthermore, in each category we can see some circles, which are essentially representing a customer. Thus, within each gender category, the graph is showing the distribution of customers by their respective weights. When we hover over any of these circles, we can see details of the customer in terms of name, gender, and recorded weight in the tooltip. Refer to the following image: However, when we hover over the box (gray section), we will see the details in terms of median, lower quartiles, upper quartiles, and so on. Refer to the following image: Thus, a summary of the box plot that we created is as follows: In more simple terms, for the female category, the majority of the population lies between the weight range of 44 to 75, whereas for the male category, the majority of the population lies between the weight range of 44 to 82. Please note that in our visualization, even though the Row shelf displays SUM(Weight), since we have Customer in the Detail shelf, there's only one entry per customer, so SUM(Weight) is actually the same as MIN(Weight), MAX(Weight), or AVG(Weight). We learnt the basics of Box and Whisker plot and how to create them using Tableau. If you had fun with this recipe, do check out our book Tableau Cookbook – Recipes for Data Visualization to create interactive dashboards and beautiful data visualizations with Tableau.        
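The median, quartiles, IQR, and 1.5 × IQR whiskers described in this recipe can also be computed by hand. The following is a minimal Python sketch, not Tableau's own calculation: the function name box_plot_summary, the whisker_factor parameter, and the sample weights are all made up for illustration, and the quartiles are taken as medians of the lower and upper halves of the sorted data, which is only one of several common conventions (Tableau may use a different one, so its tooltip values can differ slightly).

```python
from statistics import median


def box_plot_summary(values, whisker_factor=1.5):
    """Summarise a list of numbers the way a box plot does.

    Hypothetical helper for illustration; needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    mid = n // 2

    # Quartiles as medians of the lower and upper halves
    # (the middle value is left out of both halves when n is odd).
    lower_half = data[:mid]
    upper_half = data[mid + 1:] if n % 2 else data[mid:]
    q1, q2, q3 = median(lower_half), median(data), median(upper_half)

    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr

    # Whiskers stop at the most extreme points still inside the fences;
    # anything beyond the fences is reported as an outlier.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]

    return {
        "median": q2,
        "lower_quartile": q1,
        "upper_quartile": q3,
        "iqr": iqr,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up weights, not the recipe's dataset.
    sample = [44, 50, 55, 58, 60, 62, 65, 70, 75, 120]
    print(box_plot_summary(sample))
```

With the made-up sample, the value 120 falls beyond the upper fence and is reported as an outlier, while the upper whisker stops at 75. Setting whisker_factor to a very large number reproduces the other whisker option, where the whiskers sit at the minimum and maximum values.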

Packt
18 Feb 2016
7 min read

Introduction to Machine Learning with R

If science fiction stories are to be believed, the invention of artificial intelligence inevitably leads to apocalyptic wars between machines and their makers. In the early stages, computers are taught to play simple games of tic-tac-toe and chess. Later, machines are given control of traffic lights and communications, followed by military drones and missiles. The machine's evolution takes an ominous turn once the computers become sentient and learn how to teach themselves. Having no more need for human programmers, humankind is then deleted. (For more resources related to this topic, see here.) Thankfully, at the time of writing this, machines still require user input. Though your impressions of machine learning may be colored by these mass-media depictions, today's algorithms are too application-specific to pose any danger of becoming self-aware. The goal of today's machine learning is not to create an artificial brain, but rather to assist us in making sense of the world's massive data stores. Putting popular misconceptions aside, in this article we will learn the following topics: Installing R packages Loading and unloading R packages Machine learning with R Many of the algorithms needed for machine learning with R are not included as part of the base installation. Instead, the algorithms needed for machine learning are available via a large community of experts who have shared their work freely. These must be installed on top of base R manually. Thanks to R's status as free open source software, there is no additional charge for this functionality. A collection of R functions that can be shared among users is called a package. Free packages exist for each of the machine learning algorithms covered in this book. In fact, this book only covers a small portion of all of R's machine learning packages. 
If you are interested in the breadth of R packages, you can view a list at Comprehensive R Archive Network (CRAN), a collection of web and FTP sites located around the world to provide the most up-to-date versions of R software and packages. If you obtained the R software via download, it was most likely from CRAN at http://cran.r-project.org/index.html. If you do not already have R, the CRAN website also provides installation instructions and information on where to find help if you have trouble. The Packages link on the left side of the page will take you to a page where you can browse packages in an alphabetical order or sorted by the publication date. At the time of writing this, a total 6,779 packages were available—a jump of over 60% in the time since the first edition was written, and this trend shows no sign of slowing! The Task Views link on the left side of the CRAN page provides a curated list of packages as per the subject area. The task view for machine learning, which lists the packages covered in this book (and many more), is available at http://cran.r-project.org/web/views/MachineLearning.html. Installing R packages Despite the vast set of available R add-ons, the package format makes installation and use a virtually effortless process. To demonstrate the use of packages, we will install and load the RWeka package, which was developed by Kurt Hornik, Christian Buchta, and Achim Zeileis (see Open-Source Machine Learning: R Meets Weka in Computational Statistics 24: 225-232 for more information). The RWeka package provides a collection of functions that give R access to the machine learning algorithms in the Java-based Weka software package by Ian H. Witten and Eibe Frank. More information on Weka is available at http://www.cs.waikato.ac.nz/~ml/weka/ To use the RWeka package, you will need to have Java installed (many computers come with Java preinstalled). 
Java is a set of programming tools available for free, which allow for the use of cross-platform applications such as Weka. For more information, and to download Java on your system, you can visit http://java.com.

The most direct way to install a package is via the install.packages() function. To install the RWeka package, at the R command prompt, simply type:

> install.packages("RWeka")

R will then connect to CRAN and download the package in the correct format for your OS. Some packages such as RWeka require additional packages to be installed before they can be used (these are called dependencies). By default, the installer will automatically download and install any dependencies.

The first time you install a package, R may ask you to choose a CRAN mirror. If this happens, choose the mirror residing at a location close to you. This will generally provide the fastest download speed.

The default installation options are appropriate for most systems. However, in some cases, you may want to install a package to another location. For example, if you do not have root or administrator privileges on your system, you may need to specify an alternative installation path. This can be accomplished using the lib option, as follows:

> install.packages("RWeka", lib="/path/to/library")

The installation function also provides additional options for installation from a local file, installation from source, or using experimental versions. You can read about these options in the help file, by using the following command:

> ?install.packages

More generally, the question mark operator can be used to obtain help on any R function. Simply type ? before the name of the function.

Loading and unloading R packages

In order to conserve memory, R does not load every installed package by default. Instead, packages are loaded by users as they are needed, using the library() function. The name of this function leads some people to incorrectly use the terms library and package interchangeably.
However, to be precise, a library refers to the location where packages are installed and never to a package itself. To load the RWeka package we installed previously, you can type the following:

> library(RWeka)

Aside from RWeka, there are several other R packages. To unload an R package, use the detach() function. For example, to unload the RWeka package shown previously, use the following command:

> detach("package:RWeka", unload = TRUE)

This will free up any resources used by the package.

Summary

Machine learning originated at the intersection of statistics, database science, and computer science. It is a powerful tool, capable of finding actionable insight in large quantities of data. Still, caution must be used in order to avoid common abuses of machine learning in the real world. Conceptually, learning involves the abstraction of data into a structured representation and the generalization of this structure into action that can be evaluated for utility. In practical terms, a machine learner uses data containing examples and features of the concept to be learned and summarizes this data in the form of a model, which is then used for predictive or descriptive purposes. These purposes can be grouped into tasks, including classification, numeric prediction, pattern detection, and clustering. Among the many options, machine learning algorithms are chosen on the basis of the input data and the learning task. R provides support for machine learning in the form of community-authored packages. These powerful tools are free to download; however, they need to be installed before they can be used.
To learn more about R, you can refer to the following books published by Packt Publishing (https://www.packtpub.com/):

Machine Learning with R - Second Edition (https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition)
R for Data Science (https://www.packtpub.com/big-data-and-business-intelligence/r-data-science)
R Data Science Essentials (https://www.packtpub.com/big-data-and-business-intelligence/r-data-science-essentials)
R Graphs Cookbook Second Edition (https://www.packtpub.com/big-data-and-business-intelligence/r-graph-cookbook-%E2%80%93-second-edition)

Packt
17 Feb 2014
7 min read

Translating a file in SDL Trados Studio

(For more resources related to this topic, see here.) Opening an individual document for translation To open a document for translation in SDL Trados Studio, perform the following steps: In any view, choose File | Open | Translate Single Document or press Ctrl + Shift + O. Browse to the file that you want to open for translation, select it, and click Open. Alternatively, you can open a file using drag and drop. You must be in the Editor view to do this. Drag the file from Windows Explorer into the Navigation pane, shown in the following screenshot: In the Open Document window, shown in the following screenshot, select the desired Source Language and Target Language. If you are using the sample file, please choose English (US) as your source language and a language of your choice as your target language. Select one or more TMs by clicking Add and browsing to select an existing TM. You can also choose to create a new TM at this point by clicking Create (choose New File-based Translation Memory, specify a Name and Location for the file, and click Finish). If you are working with our sample file, please create or select a TM of your own at this point. The following screenshot shows the Open Document window after we add the TM: For any other settings, click the Advanced button at the bottom-left corner of the window. Click OK to open the document for translation in the side-by-side editor. Translating in the side-by-side editor The side-by-side editor is made up of five columns, numbered in the following screenshot: At the top left is a tab showing the name of the active document. The numbers in circles represent the following: Column 1: The segment number. Column 2: The source text, divided into segments when you open the file for translation. Column 3: The segment status and translation origin, indicating what work you have done on each segment at any given point in time, and where the match came from. 
The icons have the following meanings: Column 4: Where you type the translation. Column 5: Information to indicate the context of each segment within the structure of the original document. For example, in the sample file, the H in Segment 1 shows that the text is formatted as a heading in the original MS Word document. To find out what the icons in the segment status column and the information in the document structure column mean, move your mouse pointer over that part of the segment to display a tool tip or click on it for more detailed information. Translating the text To begin translating, click in the first target segment and type the translation. As soon as you start typing, the status symbol changes from (Not Translated) to (Draft), showing that you have edited the segment but not stored it in the TM yet, as shown in the following screenshot. Segment 1 of the sample file is a heading, as indicated by the letter H on the right. Notice that the visual formatting of the text as displayed by SDL Trados Studio is replicated when you type the translation. When you are happy with your translation, press Ctrl + Enter to store the translated segment in the TM and move to the next segment that needs translating (pressing Enter alone has no effect). Alternatively, in the Home tab, click the Confirm button, shown on the left of the following screenshot: This action is generally described as confirming the segment. The status symbol changes from (Draft), to (Translated) to indicate that the segment has been confirmed. Segments that you translate or edit must be confirmed in this way, or they will not be stored in the TM. The default confirm action (Ctrl + Enter) actually moves you to the next unconfirmed segment, skipping any confirmed segments in between. To show more options for confirming segments, as shown in the preceding screenshot, click the drop-down arrow under the Confirm button. 
To go to the next segment down, whether confirmed or not, choose Confirm and Move to Next Segment (Ctrl + Alt + Enter). If you are translating a file that produces lots of 100% matches that you do not wish to check immediately, choose Confirm and Translate until Next Fuzzy Match (Ctrl + Alt + F). You will then move down the bilingual file, automatically confirming any 100% matches, and only stopping at the next match that is less than 100%. Now translate and confirm Segment 2. This moves you into Segment 3, which is a fuzzy or partial match as indicated by the figure 82% in the following screenshot: Typing accented characters The ability to type accented characters in SDL Trados Studio is dependent on the keyboard settings in MS Windows, as with any other application that you might run on MS Windows. If you are using an English language keyboard and want to type accented characters in the target segment, you can use the Alt codes (such as Alt + 0233 for é). It is also possible to change the keyboard to follow the target language layout, via the Control Panel in MS Windows. The Translation Results window Whenever you move into a new segment (as from Segment 2 to Segment 3), the TM (or TMs if more than one is active) are searched for matches, and the highest match appears in the target segment (this action is called Lookup). If a match is found, the results are displayed in the Translation Results window, and an icon appears in the segment status column in the side-by-side editor to show the match level. By default, if no matches are found, the target segment remains empty, and the Translation Results window displays the text No matches found. The Translation Results window displays the text in the current segment in the white area at the top, and any match from the TM underneath it, as shown in the following screenshot. 
The blue and red text in the source segment indicates the words that need to be added to and deleted from the new segment compared to the match from the TM (in a similar form to that used in Track Changes in MS Word). In this case, for example, we need to add quite long and delete short in the translation. Edit the target segment to make the translation correct, and then confirm. When you edit and confirm the segment, the fuzzy match icon changes to a transparent background, as shown in the following screenshot. Notice that the fuzzy match value remains even after you confirm the segment. Thus, the percentage values displayed always indicate the value of the match as originally offered by the TM (the translation origin). Inserting matches from the TM The following screenshot shows the sample file before we edit and confirm Segment 4: Each match in the Translation Results window has a number, as shown on the left of the following screenshot. As you will see when you get to Segment 4 (which we will now edit and confirm) the highest match (with the number 1 in the column on the left) is automatically inserted whenever you move into an empty target segment. To insert a different match instead, press Ctrl and the numbers on the main keyboard. For example, to insert match number 2, press Ctrl + 2. To insert the match currently highlighted in blue in the Translation Results window, click the Apply Translation button or choose Home | Apply Translation (Ctrl + T). You can also scroll the list of matches to insert other matches via the Select Previous Match (Alt + Pg Up) and Select Next Match (Alt + Pg Dn) buttons. Notice that the yellow bar at the bottom of the Translation Results window shows the name of the TM providing the match in the segment that is highlighted in blue, as in the preceding screenshot. Summary In this article we learned the basic process of opening a document in SDL Trados Studio and translating it. 
We opened an individual document for translation and translated it in the side-by-side editor.

Packt
22 Apr 2015
15 min read

Develop a Digital Clock

In this article by Samarth Shah, author of the book Learning Raspberry Pi, we will take your Raspberry Pi to the real world. Make sure you have all the components listed here before you go ahead:

Raspberry Pi with Raspbian OS.
A keyboard/mouse.
A monitor to display the content of Raspberry Pi. If you don't have a monitor, you can install the VNC server on Raspberry Pi and use the VNC viewer on your laptop to display the content.
Hook-up wires of different colors (keep around 30 wires, each around 10 cm long).
An HD44780-based LCD. Note: I have used JHD162A.
A breadboard.
A 10K potentiometer (optional). You will be using the potentiometer to control the contrast of the LCD, so if you don't have one, the contrast will be fixed, which is okay for this project. Potentiometer is just a fancy word for a variable resistor. Basically, it is a three-terminal resistor with a sliding or rotating contact, which is used for changing the value of the resistor.

Setting up Raspberry Pi

Once you have all the components listed in the previous section, some software needs to be installed before you get started. First, update the package list:

sudo apt-get update

For controlling the GPIO pins of Raspberry Pi, you will be using Python, so python-dev, python-setuptools, and RPi.GPIO (a Python module for controlling the GPIO pins) are required. Install them using the following commands:

sudo apt-get install python-dev
sudo apt-get install python-setuptools
sudo apt-get install python-rpi.gpio

Now, your Raspberry Pi is all set to control the LCD, but before you go ahead and start connecting LCD pins with Raspberry Pi GPIO pins, you need to understand how an LCD works and, more specifically, how an HD44780-based LCD works.

Understanding HD44780-based LCD

The LCD character displays can be found in espresso machines, laser printers, children's toys, and maybe even the odd toaster.
The Hitachi HD44780 controller has become an industry standard for these types of displays. If you look at the back side of the LCD that you have bought, you will find 16 pins. The logic levels used in the rest of this article are:

Vcc / HIGH / '1'    +5 V
GND / LOW / '0'     0 V

The following table depicts the HD44780 pin number and functionality:

Pin number    Functionality
1             Ground
2             VCC
3             Contrast adjustment
4             Register select
5             Read/Write (R/W)
6             Clock (Enable)
7             Bit 0
8             Bit 1
9             Bit 2
10            Bit 3
11            Bit 4
12            Bit 5
13            Bit 6
14            Bit 7
15            Backlight anode (+)
16            Backlight cathode (-)

Pin 1 and Pin 2 are the power supply pins. They need to be connected with ground and the +5 V power supply respectively.

Pin 3 is a contrast setting pin. It should be connected to a potentiometer to control the contrast. However, with a JHD162A LCD, you can connect this pin directly to ground if you don't have a potentiometer; you will initially see dark boxes, but the display will work.

Pin 4, Pin 5, and Pin 6 are the control pins.

Pin 7 to Pin 14 are the data pins of the LCD. Pin 7 is the least significant bit and Pin 14 is the most significant bit of the data inputs. You can use the LCD in two modes, that is, 4-bit or 8-bit. In the next section, you will be doing 4-bit operation to control the LCD. If you want to display some number/character on the display, you have to input the appropriate codes for that number/character on these pins.

Pin 15 and Pin 16 provide the power supply to the backlight of the LCD. A backlight is a light within the LCD panel, which makes seeing the characters on the screen easier. When you leave your cell phone or MP3 player untouched for some time, the screen goes dark. This is the backlight turning off. It is possible to use the LCD without the backlight as well. JHD162A has a backlight, so you need to connect the power supply and ground to these pins respectively.

Pin 4, Pin 5, and Pin 6 are the most important pins. As mentioned in the table, Pin 4 is the register select pin.
This allows you to switch between two operating modes of the LCD, namely the instruction and character modes. Depending on the status of this pin, the data on the 8 data pins (D0-D7) is treated as either an instruction or as character data. To display some characters on the LCD, you have to activate the character mode. To give instructions such as "clear the display" and "move cursor to home", you have to activate the instruction mode. To set the LCD in the instruction mode, set Pin 4 to '0', and to put it in the character mode, set Pin 4 to '1'.

Mostly, you will be using the LCD to display something on the screen; however, sometimes you may need to read what has been written to the LCD. In this case, Pin 5 (read-write) is used. If you set Pin 5 to 0, it will work in the write mode, and if you set Pin 5 to 1, it will work in the read mode. For all practical purposes, Pin 5 (R/W) can be permanently set to 0, that is, connected to GND (Ground).

Pin 6 (enable pin) has a very simple function. This is just the clock input for the LCD. The instruction or the character data at the data pins (Pin 7-Pin 14) is processed by the LCD on the falling edge of this pin. The enable pin should normally be held at 1, that is, Vcc, by a pull-up resistor. When a momentary button switch is pressed, the pin goes low, and it goes back to high again when you release the switch. Your instruction or character will be executed on the falling edge of the pulse, that is, the moment when the switch gets closed.

So, the flow diagram of a typical write sequence to LCD will be:

Connecting LCD pins and Raspberry Pi GPIO pins

Having understood the way the LCD works, you are ready to connect your LCD with Raspberry Pi.
Connect LCD pins with Raspberry Pi pins using the following table:

LCD pin    Functionality            Raspberry Pi pin
1          Ground                   Pin 6
2          Vcc                      Pin 2
3          Contrast adjustment      Pin 6
4          Register select          Pin 26
5          Read/Write (R/W)         Pin 6
6          Clock (Enable)           Pin 24
7          Bit 0                    Not used
8          Bit 1                    Not used
9          Bit 2                    Not used
10         Bit 3                    Not used
11         Bit 4                    Pin 22
12         Bit 5                    Pin 18
13         Bit 6                    Pin 16
14         Bit 7                    Pin 12
15         Backlight anode (+)      Pin 2
16         Backlight cathode (-)    Pin 6

Your LCD should have come with a 16-pin single row pin header (male/female) soldered to the 16 pins of the LCD. If you didn't get any pin header with the LCD, you have to buy a 16-pin single row female pin header. If the LCD that you bought doesn't have a soldered 16-pin single row pin header, you have to solder it. Please note that if you have not done soldering before, don't try to solder it by yourself. Ask a friend who can help or, easiest of all, take it back to the place where you bought it; they can solder it for you in about 5 minutes.

The final connections are shown in the following screenshot. Make sure your Raspberry Pi is not running when you are connecting LCD pins with Raspberry Pi pins. Connect LCD pins with Raspberry Pi using hook-up wires.

Before you start scripting, you need to understand a few more things about operating the LCD in the 4-bit mode:

The LCD defaults to the 8-bit mode, so the first command must be sent in the 8-bit mode, instructing the LCD to operate in the 4-bit mode.

In the 4-bit mode, a write sequence is a bit different from what has been depicted earlier for the 8-bit mode. In the following diagram, Step 3 will be different in the 4-bit mode, as there are only 4 data pins available for data transfer and you cannot transfer 8 bits at the same time. So in this case, as per the HD44780 datasheet, you first have to send the upper "nibble" to LCD Pin 11 to Pin 14. Execute those bits by setting the Enable pin to LOW (as instructions/characters get executed on the falling edge of the pulse) and back to HIGH again.
Once you have sent the upper "nibble", send the lower "nibble" to LCD Pin 11 to Pin 14. To execute the instruction/character sent, set the Enable pin to LOW. So, the typical 4-bit mode write process will look like this:

Scripting

Once you have connected LCD pins with Raspberry Pi, as per the previously shown diagram, boot up your Raspberry Pi. The following are the steps to develop a digital clock:

Create a new Python script by right-clicking Create | New File | DigitalClock.py. Here is the code that needs to be copied in the DigitalClock.py file:

    #!/usr/bin/python
    import RPi.GPIO as GPIO
    import time
    from time import sleep
    from datetime import datetime

    class HD44780:
        def __init__(self, pin_rs=7, pin_e=8, pins_db=[18, 23, 24, 25]):
            self.pin_rs = pin_rs
            self.pin_e = pin_e
            self.pins_db = pins_db
            GPIO.setmode(GPIO.BCM)
            GPIO.setup(self.pin_e, GPIO.OUT)
            GPIO.setup(self.pin_rs, GPIO.OUT)
            for pin in self.pins_db:
                GPIO.setup(pin, GPIO.OUT)
            # Clear only after every pin has been set up as an output
            self.clear()

        def clear(self):
            # Blank / Reset LCD
            self.cmd(0x33)
            self.cmd(0x32)
            self.cmd(0x28)
            self.cmd(0x0C)
            self.cmd(0x06)
            self.cmd(0x01)

        def cmd(self, bits, char_mode=False):
            # Send command to LCD
            sleep(0.001)
            bits = bin(bits)[2:].zfill(8)
            GPIO.output(self.pin_rs, char_mode)
            # Upper nibble first
            for pin in self.pins_db:
                GPIO.output(pin, False)
            for i in range(4):
                if bits[i] == "1":
                    GPIO.output(self.pins_db[i], True)
            # Pulse Enable; the LCD latches the data on the falling edge
            GPIO.output(self.pin_e, True)
            GPIO.output(self.pin_e, False)
            # Then the lower nibble
            for pin in self.pins_db:
                GPIO.output(pin, False)
            for i in range(4, 8):
                if bits[i] == "1":
                    GPIO.output(self.pins_db[i-4], True)
            GPIO.output(self.pin_e, True)
            GPIO.output(self.pin_e, False)

        def message(self, text):
            # Send string to LCD. Newline wraps to second line
            for char in text:
                if char == '\n':
                    self.cmd(0xC0)  # next line
                else:
                    self.cmd(ord(char), True)

    if __name__ == '__main__':
        while True:
            # Re-initializing the LCD also clears the display
            lcd = HD44780()
            lcd.message(" " + datetime.now().strftime("%H:%M:%S"))
            time.sleep(1)

Run the preceding script by executing the following command:

sudo python DigitalClock.py

This code accesses the Raspberry Pi GPIO port, which requires root privileges, so sudo is used. Now, your digital clock is ready. The current Raspberry Pi time will get displayed on the LCD screen.

Only 4 data pins of the LCD are connected, so this script works in the LCD 4-bit mode. I have used the JHD162A LCD, which is based on the HD44780 controller. The controlling mechanism is the same for all HD44780 LCDs, so the preceding class, HD44780, can also be used for controlling another HD44780-based LCD.

Once you have understood the preceding LCD 4-bit operation, it will be much easier to understand this code:

    if __name__ == '__main__':
        while True:
            lcd = HD44780()
            lcd.message(" " + datetime.now().strftime("%H:%M:%S"))
            time.sleep(1)

Every second, an HD44780 object is initialized (which also clears the display) and the current time is sent to the LCD by calling the message function of the lcd object. The most important part of this project is the HD44780 class. The HD44780 class has the following four components:

__init__: This will initialize the necessary components.
clear: This will clear the LCD and instruct it to work in the 4-bit mode.
cmd: This is the core of the class. Each and every command/instruction that has to be executed will get executed in this function.
message: This will be used to display a message on the screen.
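The 4-bit write sequence described earlier can also be traced without any hardware attached. The following is a minimal sketch that only computes which levels would be driven and in what order; it does not call RPi.GPIO. The helper names byte_to_nibbles, nibble_to_pin_levels, and write_sequence are hypothetical and are not part of the DigitalClock.py script.

```python
def byte_to_nibbles(value):
    """Split an 8-bit value into its (upper, lower) 4-bit nibbles."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    return (value >> 4) & 0x0F, value & 0x0F


def nibble_to_pin_levels(nibble):
    """Return True/False levels for four data lines, most significant bit first."""
    return [bool(nibble & mask) for mask in (0b1000, 0b0100, 0b0010, 0b0001)]


def write_sequence(value, char_mode=False):
    """List the steps of one 4-bit write as (signal, level) tuples.

    RS selects instruction (False) or character (True) mode, then each
    nibble is placed on the data lines and Enable is pulsed high and low;
    the falling edge is what latches the nibble.
    """
    upper, lower = byte_to_nibbles(value)
    steps = [("RS", char_mode)]
    for nibble in (upper, lower):
        steps.append(("DATA", nibble_to_pin_levels(nibble)))
        steps.append(("E", True))
        steps.append(("E", False))
    return steps


if __name__ == "__main__":
    # Trace the write of the character "A" (0x41) in character mode.
    for step in write_sequence(ord("A"), char_mode=True):
        print(step)
```

Printing the steps for a command such as 0x28 shows the upper nibble 0x2 going out before the lower nibble 0x8, which is the same order the cmd function uses when it walks the binary string from left to right.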
The __init__ function

The __init__ function initializes the necessary GPIO port for LCD operation:

    def __init__(self, pin_rs=7, pin_e=8, pins_db=[18, 23, 24, 25]):
        self.pin_rs = pin_rs
        self.pin_e = pin_e
        self.pins_db = pins_db
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(self.pin_e, GPIO.OUT)
        GPIO.setup(self.pin_rs, GPIO.OUT)
        for pin in self.pins_db:
            GPIO.setup(pin, GPIO.OUT)
        # Clear only after every pin has been set up as an output
        self.clear()

GPIO.setmode(GPIO.BCM): Basically, the GPIO library has two modes in which it can operate. One is BOARD and the other one is BCM. The BOARD mode specifies that you are referring to the pin numbers printed on the board. The BCM mode means that you are referring to the pins by the "Broadcom SOC channel" number.

While creating an instance of the HD44780 class, you can specify which pin to use for a specific purpose. By default, it will take GPIO 7 as the RS pin, GPIO 8 as the Enable pin, and GPIO 18, GPIO 23, GPIO 24, and GPIO 25 as data pins.

Once you have defined the mode of the GPIO operation, you have to set up all the pins that will be used as outputs, as they are going to drive the LCD pins. Once this is done, clear and initialize the screen.

The clear function

The clear function clears/resets the LCD:

    def clear(self):
        # Blank / Reset LCD
        self.cmd(0x33)
        self.cmd(0x32)
        self.cmd(0x28)
        self.cmd(0x0C)
        self.cmd(0x06)
        self.cmd(0x01)

You will see a sequence of codes being sent in this function. This code sequence is generated based on the HD44780 datasheet instructions.
A complete discussion of this code is beyond the scope of this article; however, to give a high-level overview, the functionality of each code is as follows: 0x33: This is the function set to the 8-bit mode 0x32: This is the function set to the 8-bit mode again 0x28: This is the function set to the 4-bit mode, which indicates the LCD has two lines 0x0C: This turns on just the LCD and not the cursor 0x06: This sets the entry mode to the autoincrement cursor and disables the shift mode 0x01: This clears the display The cmd function The cmd function sends the command to the LCD as per the LCD operation, and is shown as follows:    def cmd(self, bits, char_mode=False):      # Send command to LCD      sleep(0.001)      bits=bin(bits)[2:].zfill(8)      GPIO.output(self.pin_rs, char_mode)      for pin in self.pins_db:        GPIO.output(pin, False)      for i in range(4):        if bits[i] == "1":          GPIO.output(self.pins_db[i], True)      GPIO.output(self.pin_e, True)      GPIO.output(self.pin_e, False)      for pin in self.pins_db:        GPIO.output(pin, False)      for i in range(4,8):        if bits[i] == "1":          GPIO.output(self.pins_db[i-4], True)      GPIO.output(self.pin_e, True)      GPIO.output(self.pin_e, False) As the 4-bit mode is used drive the LCD, you will be using the flowchart that has been discussed previously. In the first line, the sleep(0.001) second is used because a few LCD operations will take some time to execute, so you have to wait for a certain time before it gets completed. The bits=bin(bits)[2:].zfill(8) line will convert the integer number to binary string and then pad it with zeros on the left to make it proper 8-bit data that can be sent to the LCD processor. The preceding lines of code does the operation as per the 4-bit write operation flowchart. The message function The message function sends the string to LCD, as shown here:    def message(self, text):      # Send string to LCD. 
Newline wraps to second line      for char in text:        if char == 'n':          self.cmd(0xC0) # next line        else:          self.cmd(ord(char),True) This function will take the string as input, convert each character into the corresponding ASCII code, and then send it to the cmd function. For a new line, the 0xC0 code is used. Discover more Raspberry Pi projects in Raspberry Pi LED Blueprints - pick up our guide and see how software can bring LEDs to life in amazing different ways!
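The bit handling in cmd and the newline handling in message can be checked on any machine without a Raspberry Pi. The following is a minimal sketch, not part of the HD44780 class: the helper names to_nibbles and encode_message are hypothetical, and the functions only compute which pin states and bytes would be sent, in the same order as the code above.

```python
def to_nibbles(value):
    """Split a byte into (high, low) nibbles as lists of pin states.

    Index 0 of each list corresponds to pins_db[0], as in cmd().
    """
    bits = bin(value)[2:].zfill(8)  # e.g. 0x28 becomes "00101000"
    high = [bit == "1" for bit in bits[:4]]
    low = [bit == "1" for bit in bits[4:]]
    return high, low


def encode_message(text):
    """Return the (byte, char_mode) pairs that message() would pass to cmd()."""
    frames = []
    for char in text:
        if char == "\n":
            frames.append((0xC0, False))  # command: jump to the second line
        else:
            frames.append((ord(char), True))  # data: the character's code
    return frames


if __name__ == "__main__":
    print(to_nibbles(0x28))
    print(encode_message("Hi\nPi"))
```

Keeping the pure bit logic separate from the GPIO calls like this makes it possible to test the encoding before wiring up the display.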
Packt
06 Jun 2013
19 min read

Playing with Particles

Introducing particle effects Particle effects are the decorative flourishes used in games to represent dynamic and complex phenomena, such as fire, smoke, and rain. A particle effect requires three elements: a System, Emitters, and the Particles themselves. Understanding particle systems Particle systems are the universe in which the particles and emitters live. Much like the universe, we cannot define the size, but we can define a point of origin relative to which all emitters and particles will be placed. We can also have multiple particle systems in existence at any given time, which can be set to draw the particles at different depths. While we can have as many particle systems as we want, it is best to have as few as possible in order to prevent possible memory leaks. The reason for this is that once a particle system is created, it will remain in existence forever unless it is manually destroyed. Destroying the instance that spawned it or changing rooms will not remove the system, so make sure it is removed when it is no longer needed. Destroying a particle system removes all the emitters and particles in that system along with it. Utilizing particle emitters Particle emitters are defined areas within a system from which particles will spawn. There are two types of emitters to choose from: Burst emitters that spawn particles a single time, and Stream emitters that spew particles continuously over time. We can define the size and shape of the region in space for each emitter, as well as how the particles should be distributed within the region. Image When defining the region in space, there are four Shape options: DIAMOND, ELLIPSE, LINE, and RECTANGLE. An example of each can be seen in the preceding diagram, all using exactly the same dimensions, number of particles, and distribution. 
While there is no functional difference between using any one of these shapes, the effect itself can benefit from a properly chosen shape. For example, only a LINE can make an effect appear to be angled 30 degrees. Image The distribution of the particles can also affect how the particles are expelled from the emitter. As can be seen in the preceding diagram, there are three different distributions. LINEAR will spawn particles with an equal random distribution throughout the emitter region. GAUSSIAN will spawn particles more towards the center of the region. INVGAUSSIAN is the inverse of GAUSSIAN, wherein the particles will spawn closer to the edges of the emitter. Applying particles Particles are the graphic resources that are spawned from the emitters. There are two types of particles that can be created: Shapes and Sprites. Shapes are the collection of 64 x 64 pixel sprites that comes built-in with GameMaker: Studio for use as particles. The shapes, as seen in the next diagram, are suitable for the majority of the most common effects, such as fireworks and flames. When wanting to create something more specialized for a game, we can use any Sprite in the Resource tree. Image There are a lot of things we can do with particles by adjusting the many attributes available. We can define ranges for how long it lives, the color it should be, and how it moves. We can even spawn more particles at the point of death for each particle. There are, however, some things that we cannot do. In order to keep the graphics processing costs low, there is no ability to manipulate individual particles within an effect. Also, particles cannot interact with objects in any way, so there is no way to know if a particle has collided with an instance in the world. If we need this kind of control, we need to build objects instead. Designing the look of a particle event is generally a trial and error process that can take a very long time. 
To speed things up, try using one of the many particle effect generators available on the Internet, such as Particle Designer 2.5 by Alert Games found here: http://alertgames.net/index.php?page=s/pd2. HTML5 limitations Using particle effects can really improve the visual quality of a game, but when developing a game intended to be played in a browser we need to be careful. Before implementing a particle effect, it is important to understand potential problems we may encounter. The biggest issue surrounding particles is that in order for them to be rendered smoothly without any lag, they need to be rendered with the graphics processor instead of the main CPU. Most browsers allow this to happen through a JavaScript API called WebGL. It is not, however, an HTML5 standard and Microsoft has stated that they have no plans for Internet Explorer to support it for the foreseeable future. This means a potentially significant portion of the game's potential audience could suffer poor gameplay if particles are used. Additionally, even with WebGL enabled, the functionality for particles to have additive blending and advanced color blending cannot be used, as none of the browsers currently support this feature. Now that we know this we are ready to make some effects! Adding particle effects to the game We are going to build a few different particle effects to demonstrate the various ways effects can be implemented in a game, and to look into some of the issues that might arise. To keep things straightforward, all of the effects we create will be a part of a single, global particle system. We will use both types of emitters, and utilize both shape and sprite-based particles. We will start with a Dust Cloud that will be seen anytime a Pillar is broken or destroyed. We will then add a system to create a unique shrapnel effect for each Pillar type. Finally, we will create some fire and smoke effects for the TNT explosion to demonstrate moving emitters. 
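To make the three emitter distributions described earlier more concrete, here is a rough one-dimensional sketch in Python rather than GML. The article does not say how GameMaker: Studio samples positions internally, so the formulas, the function names, and the choice of spread below are all assumptions made purely for illustration.

```python
import random


def sample_position(lo, hi, distribution, rng):
    """Pick one spawn coordinate in [lo, hi] for the given distribution name."""
    center = (lo + hi) / 2.0
    half = (hi - lo) / 2.0
    if distribution == "linear":
        return rng.uniform(lo, hi)  # equal chance anywhere in the region
    # Offset from the center, clamped so the result stays inside the region.
    offset = max(-half, min(half, rng.gauss(0.0, half / 3.0)))
    if distribution == "gaussian":
        return center + offset  # clustered around the center
    if distribution == "invgaussian":
        # Mirror the offset so that small offsets land near the edges instead.
        return hi - offset if offset >= 0 else lo - offset
    raise ValueError("unknown distribution: " + distribution)


def mean_distance_from_center(lo, hi, distribution, count=2000, seed=1):
    """Average distance of sampled positions from the region's center."""
    rng = random.Random(seed)
    center = (lo + hi) / 2.0
    total = sum(abs(sample_position(lo, hi, distribution, rng) - center)
                for _ in range(count))
    return total / count


if __name__ == "__main__":
    for name in ("gaussian", "linear", "invgaussian"):
        print(name, round(mean_distance_from_center(0, 100, name), 1))
```

Under these assumptions, the average distance from the center is smallest for the Gaussian case and largest for the inverse Gaussian case, which matches the behavior described for the emitters.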
Creating a Dust Cloud The first effect we are going to create is a simple Dust Cloud. It will burst outwards upon the destruction of each Pillar and dissolve away over time. As this effect will be used in every level of the game, we will make all of its elements global, so they only need to be declared once. Open the Tower Toppling project we were previously working on if it is not already open. We need to make sure that WebGL is enabled when we build the game. Navigate to Resources | Change Global Game Settings and click on the HTML5 tab. On the left-hand side, click on the tab for Graphics. As seen in the following screenshot, there are three options under WebGL in Options. If WebGL is Disabled, the game will not be able to use the GPU and all browsers will suffer from any potential lag. If WebGL is Required, any browser that does not have this capability will be prevented from running the game. The final option is Auto-Detect which will use WebGL if the browser supports it, but will allow all browsers to play the game no matter what. Select Auto-Detect and then click on OK. Image Now that we have WebGL activated we can build our effects. We will start by defining our particle system as a global variable by creating a new script called scr_Global_Particles. code The first effect we are going to make is the Dust Cloud which will be attached to the Pillars. For this we only need a single emitter which we will move to the appropriate position when it is needed. Create a global variable for the emitter and add it to the particle system with the following code at the end of the script: code For this particle, we are going to use one of the built-in shapes, pt_shape_explosion, which looks like a little thick cloud of dust. Add the following code to the end of the script: code Once again we have made this a global variable, so that we have to create this Dust Cloud particle only once. We have declared only the shape attribute of this particle at this time. 
We will add more to this later once we can see what the effect looks like in the game. We need to initialize the particle system with the other global variables. Reopen scr_Global_GameStart and call the particles script. code With everything initialized, we can now create a new script, scr_Particles_DustCloud, which we can use to set the region of the emitter and have it activate a burst of particles. code We start by defining a small area for the emitter based on the position of the instance that calls this script. The region itself will be circular with a Gaussian distribution so that the particles shoot out from the center. We then activate a single burst of 10 dust particles from the emitter. All we need to do now is execute this script from the destruction of a Pillar. Reopen scr_Pillar_Destroy and insert the following line of code on the line before the instance is destroyed: code We need to add this effect to the breaking of the Pillars as well. Reopen scr_Pillar_BreakApart and insert the same code in the same spot. Save the game and then play it. When the glass Pillars are destroyed, we should see thick white clouds appearing as shown in the following screenshot: Image The particles are boring and static at this point, because we have not told the particles to do anything other than to look like the shape of a cloud. Let's fix this by adding some attributes to the particle. Reopen scr_Global_Particles and add the following code at the end of the script: code The first attribute we add is how long we want the particle to live for, which is a range between 15 and 30 steps, or at the speed of our rooms, a half to a whole second. Next, we want the particles to explode outwards, so we set the angle and add some velocity. Both functions that we are using have similar parameters. The first value is the particle type for which this is to be applied. The next two parameters are the minimum and maximum values from which a number will be randomly chosen. 
The fourth parameter sets an incremental value every step. Finally, the last parameter is a wiggle value that will randomly be applied throughout the particle's lifetime. For the Dust Cloud, we are setting the direction to be in any angle and the speed is fairly slow, ranging only a few pixels per step. We also want to change the size of the particles and their transparency, so that the dust appears to dissipate. Save the game and run it again. This time the effect appears much more natural, with the clouds exploding outwards, growing slightly larger, and fading out. It should look something like the next screenshot. The Dust Cloud is now complete. Image Adding in Shrapnel The Dust Cloud effect helps the Pillar destruction appear more believable, but it lacks the bigger chunks of material one would expect to see. We want some Shrapnel of various shapes and sizes to explode outwards for each of the different types of Pillars. We will start with the Glass particles. Create a new Sprite, spr_Particle_Glass, and with Remove Background checked, load Chapter 8/Sprites/Particle_Glass.gif. This sprite is not meant to be animated, though it does have several frames within it. Each frame represents a different shape of particle that will be randomly chosen when the particle is spawned. We will want the particles to rotate as they move outwards, so we need to center the origin. Click on OK. Reopen scr_Global_Particles and initialize the Glass particle at the end of the script. code Once we have created the global variable and the particle, we set the particle type to be a Sprite. When assigning Sprites there are a few extra parameters beyond which resource should be used. The third and fourth parameters are for whether it should be animated, and if so, whether the animation should stretch for the duration of the particle's life. In our case we are not using animation, so it has been set to false. 
The last parameter is for whether we want it to choose a random subimage of the Sprite, which is what we do want it to do. We also need to add some attributes to this particle for life and movement. Add the following code at the end of the script: code When compared with the Dust Cloud, this particle will have a shorter lifespan but will move at a much higher velocity. This will make this effect more intense while keeping the general area small. We have also added some rotational movement through part_type_orientation. The particles can be set to any angle and will rotate 20 degrees per frame with a variance of up to four degrees. This will give us a nice variety in the spin of each particle. There is one additional parameter for orientation, which is whether the angle should be relative to its movement. We have set it to false as we just want the particles to spin freely. To test this effect out, open up scr_Particles_DustCloud and insert a burst emitter before the Dust Cloud is emitted, so that the Glass particles appear behind the other effect. code Save the game and then play it. When the Pillars break apart, there should be shards of Glass exploding out along with the Dust Cloud. The effect should look something like the following screenshot: Image Next we need to create Shrapnel for the Wood and Steel particles. Create new Sprites for spr_Particle_Wood and spr_Particle_Steel with the supplied images in Chapter 8/Sprites/ in the same manner as we did for Glass. As these particles are global, we cannot just swap the Sprite out dynamically. We need to create new particles for each type. In scr_Global_Particles, add particles for both Wood and Steel with the same attributes as Glass. Currently the effect is set to Always create Glass particles, something we do not want to do. To fix this we are going to add a variable, myParticle, to each of the different Pillars to allow us to spawn the appropriate particle. 
Open scr_Pillar_Glass_Create and add the following code at the end of the script: code Repeat the last step for Wood and Steel with the appropriate particle assigned. In order to have the proper particle spawn, all we need to do is reopen scr_Particles_DustCloud and change the variable particle_Glass to myParticle as in the following code: code Save the game and play the game until you can destroy all the three types of Pillars to see the effect. It should look something similar to the following screenshot, where each Pillar spawns its own Shrapnel: Image Making the TNT explosion When the TNT explodes, it shoots out some TNT Fragments which are currently bland looking Sprites. We want the Fragments to be on fire as they streak across the scene. We also want a cloud of smoke to rise up from the explosion to indicate that the explosion we see is actually on fire. This is going to cause some complications. In order to make something appear to be on fire, it will need to change color, say from white to yellow to orange. As we have already mentioned, due to the fact that WebGL is not supported by all browsers, we cannot utilize any of the functions that allow us to blend colors together. This means that we need to work around this issue. The solution is to use several particles instead of one. We will start by creating some custom colors so that we can achieve the look of fire and smoke that we want. Open scr_Global_Colors and add the following colors: code We already have a nice yellow color, so we add an orange, a slightly yellow tinted white, and a partially orange black color. In order to achieve the fake blending effect we will need to spawn one particle type, and upon its death, have it spawn the next particle type. For this to work properly, we need to construct the creation of the particles in the opposite order that they will be seen. In this case, we need to start by building the smoke particle. 
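The reason for building the particles in reverse order can be illustrated with a small Python sketch (Python rather than GML, so this is not GameMaker code). The ParticleType class and its on_death attribute are hypothetical stand-ins for a particle type and its spawn-on-death setting; the point is only that each type must already exist before an earlier type can refer to it.

```python
class ParticleType:
    """A hypothetical stand-in for a particle type with a death-spawn link."""

    def __init__(self, name, on_death=None):
        self.name = name
        self.on_death = on_death  # type spawned when a particle of this type dies


def build_fire_chain():
    """Create the types last-seen first, so each successor already exists."""
    smoke = ParticleType("smoke")
    orange = ParticleType("orange", on_death=smoke)
    yellow = ParticleType("yellow", on_death=orange)
    white = ParticleType("white", on_death=yellow)
    return white  # the first particle the player actually sees


def visible_sequence(first):
    """Follow the death-spawn links to list the types in the order seen."""
    names = []
    current = first
    while current is not None:
        names.append(current.name)
        current = current.on_death
    return names


if __name__ == "__main__":
    print(visible_sequence(build_fire_chain()))
```

Creating smoke first and white last means no type ever refers to something that has not been defined yet, even though the player sees the colors in the opposite order.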
In scr_Global_Particles add a new particle for the smoke with the following attributes: code We start by adding the particle and using the built-in smoke shape. We want the smoke to linger for a while, so we set its life to range between a minimum of a second to almost two full seconds. We then set the direction and speed to be more or less upwards so that the smoke rises. Next, we set the size and have it grow over time. With the alpha values, we don't want the smoke to be completely opaque, so we set it to start at half transparent and fade away over time. Next, we are using part_type_color1 which allows us to tint the particle without affecting the performance very much. Finally, we apply some gravity to the particles so that any angled particles float slowly upwards. The smoke is the final step of our effect and it will be spawned from an orange flame that precedes it. code Once again we set up the particle using the built-in smoke shape, this time with a much shorter lifespan. The general direction is still mainly upwards, though there is more spread than the smoke. These particles are slightly smaller, tinted orange and will be partially transparent for its entire life. We have added a little bit of upward gravity, as this particle is in between fire and smoke. Finally, we are using a function that will spawn a single particle of smoke upon the death of each orange particle. The next particle in the chain for this effect is a yellow particle. This time we are going to use the FLARE shape, which will give a better appearance of fire. It will also be a bit smaller, live slightly longer than the orange particle, and move faster, spreading in all directions. We will not add any transparency to this particle so that it appears to burn bright. code We have only one more particle to create this effect for, which is the hottest and brightest white particle. Its construction is the same as the yellow particle, except it is smaller and faster. 
code We now have all the particles we need for this particle effect; we just need to add an emitter to spawn them. This time we are going to use a stream emitter, so that the fire continuously flows out of each Fragment. Since the Fragments are moving, we will need to have a unique emitter for each Fragment we create. This means it cannot be a global emitter, but rather a local one. Open scr_TNT_Fragment_Create and add the following code at the end of the script: code We create an emitter with a fairly small area for spawning with balanced distribution. At every step, the emitter will create five new Fire particles as long as the emitter exists. The emitter is now created at the same time as the Fragment, but we need the emitter to move along with it. Open scr_TNT_Fragment_Step and add the following code: code As already mentioned we need to destroy the emitter, otherwise it will never stop streaming particles. For this we will need to open obj_TNT_Fragment and add a destroy event with a new Script, scr_TNT_Fragment_Destroy, which removes the emitter attached. code This function will remove the emitter from the system without removing any of the particles that had been spawned. One last thing we need to do is to uncheck the Visible checkbox, as we don't want to see the Fragment sprite, but just the particles. Save the game and detonate the TNT. Instead of just seeing a few Fragments, there are now streaks of fire jetting out of the explosion that turn into dark clouds of smoke that float up. It should look something like the following screenshot: Image Cleaning up the particles At this point, we have built a good variety of effects using various particles and emitters. The effects have added a lot of polish to the game, but there is a flaw with the particles. If the player decides to restart the room or go to the SHOP immediately after the explosion has occurred, the emitters will not be destroyed. 
This means that they will continue to spawn particles forever, and we will lose all references to those emitters. The game will end up looking like the following screenshot: Image The first thing we need to do is to destroy the emitters when we leave the room. Luckily, we have already written a script that does exactly this. Open obj_TNT_Fragment and add a Room End event and attach scr_TNT_Fragment_Destroy to it. Even if we destroy the emitters before changing rooms, any particles remaining in the game will still appear in the next room, if only briefly. What we need to do is clear all the particles from the system. While this might sound like it could be a lot of work, it is actually quite simple. As Overlord is in every level, but not in any other room, we can use it to clean up the scene. Open obj_Overlord, add a Room End event and attach a new Script, scr_Overlord_RoomEnd, with the following line of code: part_particles_clear(system); This function will remove any particle that exists within the system, but will not remove the particle type from memory. It is important that we do not destroy the particle type, as we would not be able to use a particle again if its type no longer exists. Save the game, explode some TNT, and restart the room immediately. You should no longer see any particles in the scene. Summary In this article, we added some spit and polish to the game to really make it shine. We delved into the world of particles and created a variety of effects that add impact to the TNT and Pillar destruction.
Packt
12 Sep 2013
5 min read

Extracting data using DOM (Must know)

Getting ready

This section parses the content of the page at http://jsoup.org. The index.html file in the project is provided if you want to use a file as input instead of connecting to the URL.

How to do it...

The following screenshot shows the page that is going to be parsed:

By viewing the source code for this HTML page, we know the site structure. The jsoup library is quite supportive of the DOM navigation method; it provides ways to find elements and extract their contents efficiently.

1. Create the Document class structure by connecting to the URL.

    Document doc = Jsoup.connect("http://jsoup.org").get();

2. Navigate to the menu tag whose class is nav-sections.

    Elements navDivTag = doc.getElementsByClass("nav-sections");

3. Get the list of all <a> tags under the menu tag.

    Elements list = navDivTag.get(0).getElementsByTag("a");

4. Extract content from each Element object in the previous menu list.

    for (Element menu : list) {
        System.out.print(String.format("[%s]", menu.html()));
    }

The output should look like the following screenshot after running the code:

The complete example source code for this section is placed at sourceSection02.

The API reference for this section is available at: http://jsoup.org/apidocs/org/jsoup/nodes/Element.html

How it works...

Let's have a look at the navigation structure:

    html > body.n1-home > div.wrap > div.header > div.nav-sections > ul > li.n1-news > a

The div class="nav-sections" tag is the parent of the navigation section, so calling getElementsByClass("nav-sections") moves to this tag. Since there is only one tag with this class value in this example, we only need the first element found; we get it at index 0 (the first item of the results).
    Elements navDivTag = doc.getElementsByClass("nav-sections");

The Elements object in jsoup represents a collection (Collection<>) or a list (List<>); therefore, you can easily iterate through this object to get each element, which is known as an Element object.

When at a parent tag, there are several ways to get to the children. You can navigate from the <ul> subtag down to each <li> tag, and then to the <a> tag. Or, you can directly make a query to find all the <a> tags. That's how we retrieved the list, as shown in the following code:

    Elements list = navDivTag.get(0).getElementsByTag("a");

The final part is to print the extracted HTML content of each <a> tag.

Note that the returned list is never null, even when the navigation fails to find any element; therefore, it is good practice to check the size of the list before doing any other task.

Additionally, the Element.html() method is used to return the HTML content of a tag.

There's more...

jsoup is quite a powerful library for DOM navigation. Besides the methods used above, the Element class supports other navigation methods to find and extract elements. The following are the common methods for DOM navigation:

getElementById(String id): Finds an element by ID, including its children.
getElementsByTag(String c): Finds elements, including and recursively under the element that calls this method, with the specified tag name (in this case, c).
getElementsByClass(String className): Finds elements that have this class, including or under the element that calls this method. Case insensitive.
getElementsByAttribute(String key): Finds elements that have a named attribute set. Case insensitive. This method has several relatives, such as getElementsByAttributeStarting(String keyPrefix), getElementsByAttributeValue(String key, String value), and getElementsByAttributeValueNot(String key, String value).
getElementsMatchingText(Pattern pattern): Finds elements whose text matches the supplied regular expression.
getAllElements(): Finds all elements under the specified element (including self and children of children).

It is also worth listing the methods that are used to extract content from an HTML element. The following are the common methods for extracting content:

id(): This retrieves the ID value of an element.
className(): This retrieves the class name value of an element.
attr(String key): This gets the value of a specific attribute.
attributes(): This is used to retrieve all the attributes.
html(): This is used to retrieve the inner HTML value of an element.
data(): This is used to retrieve the data content, usually applied for getting content from the <script> and <style> tags.
text(): This is used to retrieve the text content. This method returns the combined text of all inner children and removes all HTML tags, while the html() method returns everything between the element's opening and closing tags.
tag(): This retrieves the tag of the element.

Summary

In this article we saw how to extract data from an HTML page using DOM navigation. We also saw that jsoup is quite a powerful library for DOM navigation.
Alan Bernardo Palacio
21 Aug 2023
17 min read

Analyzing Eurostat Data Using OpenAI Code Interpreter

OpenAI's recent release of the ChatGPT Code Interpreter plugin has introduced a groundbreaking addition to their language model, ChatGPT. This plugin combines the power of Large Language Models (LLMs) with traditional programming capabilities, revolutionizing programming workflows and enhancing data analysis processes. By eliminating the need to write code or set up separate environments, the Code Interpreter plugin simplifies the process of leveraging the capabilities of ChatGPT for data analysis. Let's explore how this plugin can be used to analyze Eurostat data and derive valuable insights.

Introduction to the ChatGPT Code Interpreter Plugin

The ChatGPT Code Interpreter plugin provides a Python interpreter within a secure execution environment. It supports file uploads and downloads, allowing seamless handling of data. The plugin enables persistent code execution within a chat conversation, allowing users to build on previous code executions. Its purpose is to solve mathematical problems, perform data analysis and visualization, and convert files between different formats.

Simplifying Programming Workflows

Before the introduction of the Code Interpreter plugin, users had to generate code separately and execute it outside of the ChatGPT model. However, with the Code Interpreter, the entire process can be completed within ChatGPT itself. This eliminates the need for external runtime environments and offers an easy-to-use interface for both programmers and non-programmers to access programming capabilities.

Analyzing Eurostat Data Using Code Interpreter

To demonstrate the capabilities of the Code Interpreter plugin, let's analyze Eurostat data. Eurostat provides various datasets related to agriculture, trade, energy, and more. We will focus on analyzing livestock production data in this example.

Installing the Required Libraries

To begin, we need to install the eurostat library, which allows us to access Eurostat data in Python. We can use the following command within the Code Interpreter plugin:

    !pip install eurostat

Next, we need the table of contents in order to locate the data required for the analysis.

Obtaining Eurostat Data

Once the library is installed, we can proceed to obtain the table of contents (TOC) of the available Eurostat datasets. This will help us identify the datasets relevant to our analysis. Here is the code to retrieve the TOC:

    import eurostat
    import pandas as pd

    toc = eurostat.get_toc()
    toc_df = pd.DataFrame(toc[1:],columns=toc[0])
    toc_df.head().to_string()

We obtain a table that contains the following information:

       title                                         code      type     last update of data       last table structure change  data start  data end
    0  Balance of payments                           MED_ECBP  dataset  2020-05-28T23:00:00+0200  2021-02-08T23:00:00+0100     2005        2018
    1  International trade by SITC - 1 000 EUR       MED_ECET  dataset  2020-05-28T23:00:00+0200  2021-02-08T23:00:00+0100     2005        2019
    2  Primary energy production                     MED_EG10  dataset  2020-05-28T23:00:00+0200  2021-02-08T23:00:00+0100     2005        2018
    3  Secondary energy production (transformation)  MED_EG20  dataset  2016-10-28T23:00:00+0200  2021-02-08T23:00:00+0100     2000        2015
    4  Energy consumption                            MED_EG30  dataset  2020-05-28T23:00:00+0200  2021-02-08T23:00:00+0100     2005        2018

Let's save this into a CSV file:

    toc_df.to_csv('toc_df',index=False)

The resulting table will contain information such as the dataset title, code, type, last update of data, and data start/end years.

Identifying Relevant Datasets

Next, we can use the uploaded table of contents file to find the dataset codes that are most relevant to our analysis. In this case, we are interested in datasets related to crop production, livestock numbers, fertilizer usage, and economic accounts for agriculture.
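For comparison, the same kind of lookup can be done by hand with a plain keyword search over the dataset titles. The sketch below is only an illustration: find_datasets is a hypothetical helper, and SAMPLE_TOC is a small hand-copied sample of titles and codes quoted in this article, not the full table of contents returned by the eurostat library.

```python
def find_datasets(rows, keywords):
    """Return (code, title) pairs whose title contains every keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    return [(code, title) for title, code in rows
            if all(keyword in title.lower() for keyword in wanted)]


# Hand-copied sample of (title, code) rows; the real TOC is far larger.
SAMPLE_TOC = [
    ("Balance of payments", "MED_ECBP"),
    ("Primary energy production", "MED_EG10"),
    ("Crop production in national humidity by NUTS 2 regions", "APRO_CPNHR"),
    ("Main livestock indicators by NUTS 2 regions", "EF_LSK_MAIN"),
    ("Economic accounts for agriculture by NUTS 2 regions", "AGR_R_ACCTS"),
]

if __name__ == "__main__":
    for code, title in find_datasets(SAMPLE_TOC, ["livestock", "NUTS 2"]):
        print(code, "-", title)
```

A search like this only matches literal substrings, so it will miss datasets whose titles use different wording; that is the gap the Code Interpreter prompt is meant to cover.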
We can use the following prompt within the Code Interpreter:

    I have uploaded a file with Eurostat datasets and I need to find the codes for the datasets that most resemble the next indicators:
    - Crops production in tonnes by NUTS 2 regions.
    - Livestock and meat numbers by NUTS 2 regions.
    - Poultry production or similar by NUTS 2 regions.
    - Milk or derivatives production or similar by NUTS 2 regions.
    - Economic Accounts for Agriculture (EAA) by NUTS 2 regions, where we will take the total intermediate consumption as a proxy for input energy.

The assistant analyzes the uploaded file and returns the following:

The uploaded file contains the following columns:
- `title`: The title of the dataset.
- `code`: The code of the dataset.
- `type`: The type of the entry (all seem to be datasets).
- `last update of data`: The date of the last update of the dataset.
- `last table structure change`: The date of the last structural change of the dataset.
- `data start`: The start year of the data.
- `data end`: The end year of the data.

To find the codes for the datasets that most resemble the indicators you mentioned, I will search the dataset titles for relevant keywords. For each indicator, I'll look for the following keywords:
1. Crop production: "crop", "production", "NUTS 2".
2. Livestock numbers: "livestock", "NUTS 2".
3. Fertilizer usage: "fertilizer", "NUTS 2".
4. Economic Accounts for Agriculture (EAA): "economic accounts", "agriculture", "NUTS 2".

Let's start this search.

Here are the dataset codes that most resemble your indicators:

1. **Crop production in tonnes by NUTS 2 regions:**
   - `APRO_CPNHR`: Crop production in national humidity by NUTS 2 regions (data available from 2000 to 2023)
   - `APRO_CPNHR_H`: Crop production by NUTS 2 regions - historical data (data available from 1975 to 1999)
   - `APRO_CPSHR`: Crop production in EU standard humidity by NUTS 2 regions (data available from 2000 to 2023)

2. **Livestock numbers by NUTS 2 regions:**
   - `TAI06`: Share of main livestock types in total livestock (data available from 2005 to 2016)
   - `EF_KVAAREG`: Key farm variables: area, livestock (LSU), labour force (data available from 2005 to 2013)
   - `EF_KVECSLEG`: Key variables: area, livestock (LSU), labour force by economic size class (data available from 2005 to 2013)
   - `EF_KVFTREG`: Key variables: area, livestock (LSU), labour force by type of farming (data available from 2005 to 2013)
   - `EF_LSK_MAIN`: Main livestock indicators by NUTS 2 regions (data available from 2005 to 2020)
   - `EF_LS_OVAAREG`: Livestock: number of farms and heads by size of the economic output (data available from 1990 to 2007)
   - `EF_LS_OVESU`: Livestock: number of farms and heads by economic size class (data available from 1990 to 2007)
   - `EF_LS_OVLSUREG`: Livestock: number of farms and heads by livestock size class (data available from 1990 to 2007)
   - `EF_OLSAAREG`: Livestock: number of farms and heads of animals by size of the agricultural area (data available from 2005 to 2013)
   - `EF_OLSECSREG`: Livestock: number of farms and heads of animals by economic size class (data available from 2005 to 2013)
   - `EF_OLSLSUREG`: Livestock: number of farms and heads of animals by livestock size class (data available from 2005 to 2013)
   - `AEI_EF_LS`: Share of main livestock types in total livestock (data available from 2005 to 2016)

3. **Fertilizer usage or similar by NUTS 2 regions:**
   - Unfortunately, there were no datasets found that match this indicator.

4. **Economic Accounts for Agriculture (EAA) by NUTS 2 regions:**
   - `AGR_R_ACCTS`: Economic accounts for agriculture by NUTS 2 regions (data available from 1980 to 2021)
   - `AGR_R_ACCTS_H`: Economic accounts for agriculture by NUTS 2 regions - historical data (data available from 1973 to 2012)

Please note that the data ranges for each dataset may not cover the exact period you are interested in.
You might need to merge data from multiple datasets or look for additional datasets to fill in any gaps.

Based on this, we can retrieve the following datasets:

Crop production in tonnes by NUTS 2 regions:
- APRO_CPNHR: Crop production in national humidity by NUTS 2 regions (data available from 2000 to 2023)
- APRO_CPSHR: Crop production in EU standard humidity by NUTS 2 regions (data available from 2000 to 2023)

Livestock numbers by NUTS 2 regions:
- EF_LSK_MAIN: Main livestock indicators by NUTS 2 regions (data available from 2005 to 2020)

Economic Accounts for Agriculture (EAA) by NUTS 2 regions:
- AGR_R_ACCTS: Economic accounts for agriculture by NUTS 2 regions (data available from 1980 to 2021)

As a next step, we can start downloading one of the datasets using the Eurostat API.

Analyzing Livestock Production Data

We will focus on the "Livestock numbers by NUTS 2 regions" dataset (EF_LSK_MAIN) for our analysis. Let's proceed with downloading and analyzing this dataset:

    # Animal production
    livestock_prod_code = 'EF_LSK_MAIN'
    livestock_prod_pars = eurostat.get_pars(livestock_prod_code)
    print(livestock_prod_pars)
    par_values = eurostat.get_par_values(livestock_prod_code, 'geo')
    # Filter the regions for Germany: codes that start with 'DE' and are four characters long (for example DE11)
    de_par_values = {
        'unit': 'LSU',
        'so_eur': 'TOTAL',
        'geo': [p for p in par_values if p.startswith('DE') and len(p) == 4]}
    # Download data for the filtered German regions
    livestock_prod_data = eurostat.get_data_df(livestock_prod_code, filter_pars=de_par_values)
    print(livestock_prod_data.head().to_string())

Which produces the following result:

    ['freq', 'lsu', 'animals', 'farmtype', 'agrarea', 'so_eur', 'unit', 'geo']
      freq   lsu animals farmtype  agrarea so_eur unit geo\\TIME_PERIOD  2005  2007  2010  2013  2016  2020
    0    A  LSU0   A0010  FT15_SO  HA10-19  TOTAL  LSU            DE11   0.0   0.0   0.0   0.0  None   0.0
    1    A  LSU0   A0010  FT15_SO    HA2-4  TOTAL  LSU            DE11   0.0   0.0   NaN   NaN  None   NaN
    2    A  LSU0   A0010  FT15_SO  HA20-29  TOTAL  LSU            DE11   0.0   0.0   0.0   0.0  None   0.0
    3    A  LSU0   A0010  FT15_SO  HA30-49  TOTAL  LSU            DE11   0.0   0.0   0.0   0.0  None   0.0
    4    A  LSU0   A0010  FT15_SO    HA5-9  TOTAL  LSU            DE11   0.0   0.0   0.0   0.0  None   0.0

The code above filters the dataset for Germany's NUTS 2 regions, specifically focusing on the unit of measurement (LSU) and the standard output in Euros (TOTAL). The resulting DataFrame provides a glimpse of the livestock production data for different regions and years.

Some of the codes are still opaque, specifically the ones for animals, so let's look up their meanings:

    # Get the values in the dataset
    livestock_prod_animal_vals = eurostat.get_par_values(livestock_prod_code, 'animals')
    # Obtain their meaning
    livestock_prod_animal_codes = [p for p in eurostat.get_dic(livestock_prod_code, 'animals') if p[0] in livestock_prod_animal_vals]
    livestock_prod_animal_codes = pd.DataFrame(livestock_prod_animal_codes, columns=['code', 'meaning'])
    print(livestock_prod_animal_codes.head().to_string())

Which produces the following:

        code                                meaning
    0  A0010                              Livestock
    1  A1000  Live horses, asses, mules and hinnies
    2  A2000                    Live bovine animals
    3  A3100           Live swine, domestic species
    4  A4100                             Live sheep

Now let's download both tables (the livestock data and the animal code meanings) and ask the ChatGPT Code Interpreter to run a basic analysis of them.

Visualizing and Analyzing the Livestock Production Data

We can leverage the Code Interpreter plugin to visualize and analyze the downloaded livestock production data. I uploaded the file to the Code Interpreter and asked it to create an analysis of the data. The result is a very good initial evaluation of the data.
The complete response is:

The data seems to pertain to livestock production and has the following columns:
- `freq`: The frequency of the data
- `lsu`: LSU (Livestock Unit) - a reference unit which facilitates the aggregation of livestock from various species and age according to their nutritional or feed requirement.
- `animals`: The type of animals
- `farmtype`: Type of the farm
- `agrarea`: The agricultural area
- `so_eur`: Unknown, needs further clarification
- `unit`: The unit of measurement
- `geo\\TIME_PERIOD`: Geographical region and time period
- `2005`, `2007`, `2010`, `2013`, `2016`, `2020`: These seem to be the years for which the data was collected, indicating the measure of the livestock production.

To provide an insightful analysis, let's check the basic statistics of the numerical data and count the unique values of the categorical data. The basic statistics of the numerical data (years 2005, 2007, 2010, 2013, 2016, and 2020) are as follows:
- For each year, we have a substantial number of entries (count).
- The mean (average) varies across the years, with the highest being around 3,250 for the year 2007 and the lowest around 1,759 for the year 2020.
- The standard deviation, which measures the dispersion of the data from the mean, is quite large for all years, suggesting a significant variation in the data.
- The minimum value is 0 for all years, which indicates that in some cases, there was no livestock production.
- The 25th, 50th (median), and 75th percentiles are 0 for many years, indicating that a substantial portion of the data has low or zero values.
- The maximum value is quite large for all years, reaching up to around 3,062,740 in 2020, suggesting that some entities had very high livestock production.

For the year 2016, there are no data points available. This might be due to missing data or no data collection in that year.

Regarding the categorical data:
- `freq` has only one unique value, which does not provide any variability.
- `lsu`, `animals`, `farmtype`, `agrarea`, and `geo\\TIME_PERIOD` have multiple unique values, indicating different categories within these fields.
- `so_eur` and `unit` also have only one unique value each.

Next, we can further examine the distribution of the numerical data and the most common categories in the categorical data. We could also investigate the missing values in the dataset.

This is a great starting point. Next, I asked the Code Interpreter to run some basic analysis on the data. I uploaded the codes and asked it to join the data and provide some basic visualizations, which resulted in the following answer:

The uploaded file contains codes for different types of livestock and their meanings. Now, let's join this data with the original dataset. After that, we will visualize the livestock production by year and by different categories.

Next, it creates a visualization of the data, which comes with the following description:

The bar chart illustrates the total livestock production over the years. Some observations from the plot include:
- The total livestock production appears to be highest in the years 2005 and 2007.
- There's a noticeable drop in total production in 2010 and 2013.
- The year 2016 does not have any data.
- There's a slight increase in total production in 2020 compared to 2013.

Next, let's analyze the livestock production by different categories, such as the type of animals and the type of farms.

It then creates a visualization of the joined data by type of animal. It not only provides the visualization but also an analysis of the data:

The bar chart shows the top 5 animal types with the highest total livestock production. Some observations from the plot include:
- "Livestock" has the highest total production. However, this category may be a general category that includes various types of animals.
- "Live bovine animals" and "Live swine, domestic species" have the next highest total production levels.
- "Live poultry, domestic species" and "Live sheep" also contribute significantly to the total livestock production.

Next, let's analyze the livestock production by different farm types.

As we can see, the analysis continues with the next step, first creating the visualization and then providing the resulting analysis:

The bar chart shows the top 5 farm types with the highest total livestock production. Some observations from the plot include:
- The farm type "FT15_SO" has the highest total livestock production. However, without a proper definition for this farm type, it's hard to interpret what this category entails.
- The farm types "FT16_SO", "FT10_SO", "FT14_SO", and "FT08_SO" also have significant levels of livestock production.

These visualizations provide a broad overview of the livestock production trends by year, animal type, and farm type. For a more detailed analysis, it would be beneficial to have more context or information about the different categories, especially the farm types.

The Code Interpreter plugin is particularly useful for conducting data analysis and visualization tasks. It allows users to directly interact with large datasets within ChatGPT, posing questions and extracting valuable insights without the need for writing code. The plugin's ability to handle substantial amounts of data enables efficient exploration and analysis, enhancing decision-making processes. Furthermore, it provides visual representations of complex datasets, simplifying data comprehension and aiding in the communication of findings.

The resulting analyses provide insights into the most common animal types and farm types contributing to livestock production.

Conclusion

The Eurostat dataset, integrated with OpenAI's ChatGPT Code Interpreter plugin, streamlines data analysis. This fusion of programming and language models empowers users to extract insights from vast datasets, regardless of coding expertise.
With simplified workflows, the plugin caters to programmers and non-programmers alike, enabling data retrieval, filtering, visualization, and basic analysis. This innovation has applications in marketing, finance, HR, and healthcare, democratizing coding and fostering collaboration. The technology's evolution holds the promise of revolutionizing programming and data analysis, empowering users to glean insights from intricate datasets for informed decision-making across industries.

Author Bio:

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as a founder in startups, and later earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.
Packt
27 Apr 2015
19 min read

Integrating a D3.js visualization into a simple AngularJS application

In this article by Christoph Körner, author of the book Data Visualization with D3 and AngularJS, we will apply the acquired knowledge to integrate a D3.js visualization into a simple AngularJS application. First, we will set up an AngularJS template that serves as a boilerplate for the examples and the application. We will see a typical directory structure for an AngularJS project and initialize a controller. Similar to the previous example, the controller will generate random data that we want to display in an autoupdating chart. Next, we will wrap D3.js in a factory and create a directive for the visualization. You will learn how to isolate the components from each other. We will create a simple AngularJS directive and write a custom compile function to create and update the chart. (For more resources related to this topic, see here.) Setting up an AngularJS application To get started with this article, I assume that you feel comfortable with the main concepts of AngularJS: the application structure, controllers, directives, services, dependency injection, and scopes. I will use these concepts without introducing them in great detail, so if you do not know about one of these topics, first try an intermediate AngularJS tutorial. Organizing the directory To begin with, we will create a simple AngularJS boilerplate for the examples and the visualization application. We will use this boilerplate during the development of the sample application. 
Let's create a project root directory that contains the following files and folders:

bower_components/: This directory contains all third-party components
src/: This directory contains all source files
src/app.js: This file contains the source of the application
src/app.css: This file contains the CSS layout of the application
test/: This directory contains all test files (test/config/ contains all test configurations, test/spec/ contains all unit tests, and test/e2e/ contains all integration tests)
index.html: This is the starting point of the application

Installing AngularJS

In this article, we use AngularJS version 1.3.14, but different patch versions (~1.3.0) should also work fine with the examples. Let's first install AngularJS with the Bower package manager. To do so, we execute the following command in the root directory of the project:

    bower install angular#1.3.14

Now, AngularJS is downloaded and installed to the bower_components/ directory. If you don't want to use Bower, you can also simply download the source files from the AngularJS website and put them in a libs/ directory. Note that if you develop large AngularJS applications, you most likely want to create a separate bower.json file and keep track of all your third-party dependencies.

Bootstrapping the index file

We can move on to the next step and code the index.html file that serves as a starting point for the application and all examples of this section. We need to include the JavaScript application files and the corresponding CSS layouts, and the same for the chart component. Then, we need to initialize AngularJS by placing an ng-app attribute on the html tag; this will create the root scope of the application.
Here, we will call the AngularJS application myApp, as shown in the following code:

    <html ng-app="myApp">
    <head>
      <!-- Include 3rd party libraries -->
      <script src="bower_components/d3/d3.js" charset="UTF-8"></script>
      <script src="bower_components/angular/angular.js" charset="UTF-8"></script>

      <!-- Include the application files -->
      <script src="src/app.js"></script>
      <link href="src/app.css" rel="stylesheet">

      <!-- Include the files of the chart component -->
      <script src="src/chart.js"></script>
      <link href="src/chart.css" rel="stylesheet">
    </head>
    <body>
      <!-- AngularJS examples go here -->
    </body>
    </html>

For all the examples in this section, I will use the exact same setup as the preceding code. I will only change the body of the HTML page or the JavaScript or CSS sources of the application. A comment at the top of each code snippet indicates the file to which the code belongs. If you are not using Bower and previously downloaded D3.js and AngularJS in a libs/ directory, refer to this directory when including the JavaScript files.

Adding a module and a controller

Next, we initialize the AngularJS module in the app.js file and create a main controller for the application. The controller should create random data (that represent some simple logs) in a fixed interval.
Let's generate a random number of visitors every second and store all data points on the scope as follows:

    /* src/app.js */
    // Application Module
    angular.module('myApp', [])

    // Main application controller
    .controller('MainCtrl', ['$scope', '$interval', function ($scope, $interval) {

      // ISO 8601 date string, so that every browser parses it the same way
      var time = new Date('2014-01-01T00:00:00+01:00');

      // Random data point generator
      var randPoint = function() {
        var rand = Math.random;
        return { time: time.toString(), visitors: rand()*100 };
      };

      // We store a list of logs
      $scope.logs = [ randPoint() ];

      // Advance the clock by one second and append a new point every second
      $interval(function() {
        time.setSeconds(time.getSeconds() + 1);
        $scope.logs.push(randPoint());
      }, 1000);
    }]);

In the preceding example, we define an array of logs on the scope that we initialize with a random point. Every second, we will push a new random point to the logs. The points contain a number of visitors and a timestamp, starting with the date 2014-01-01 00:00:00 (timezone GMT+01) and counting up a second on each iteration. I want to keep it simple for now; therefore, we will use just a very basic example of random access log entries. Consider using the cleaner controllerAs syntax for larger AngularJS applications because it makes the scopes in HTML templates explicit. However, for compatibility reasons, I will use the standard controller and $scope notation.

Integrating D3.js into AngularJS

We bootstrapped a simple AngularJS application in the previous section. Now, the goal is to integrate a D3.js component seamlessly into an AngularJS application, in an Angular way. This means that we have to design the AngularJS application and the visualization component such that the modules are fully encapsulated and reusable.
In order to do so, we will use a separation on different levels:

Code of different components goes into different files
Code of the visualization library goes into a separate module
Inside a module, we divide logic into controllers, services, and directives

Using this clear separation allows you to keep files and modules organized and clean. If at any time we want to replace the D3.js backend with a canvas pixel graphic, we can just implement it without interfering with the main application. This means that we want to use a new module for the visualization component and dependency injection. These modules enable us to have full control of the separate visualization component without touching the main application, and they will make the component maintainable, reusable, and testable.

Organizing the directory

First, we add the new files for the visualization component to the project:

src/: This is the default directory to store all the file components for the project
src/chart.js: This is the JS source of the chart component
src/chart.css: This is the CSS layout for the chart component
test/config/: This directory contains all test configurations
test/spec/chart.spec.js: This file contains the unit tests of the chart component
test/e2e/chart.e2e.js: This file contains the integration tests of the chart component

If you develop large AngularJS applications, this is probably not the folder structure that you are aiming for. Especially in bigger applications, you will most likely want to have components in separate folders and directives and services in separate files. Then, we will encapsulate the visualization from the main application and create the new myChart module for it. This will make it possible to inject the visualization component or parts of it (for example, just the chart directive) into the main application.
Wrapping D3.js

In this module, we will wrap D3.js (which is available via the global d3 variable) in a service; actually, we will use a factory to just return the reference to the d3 variable. This enables us to pass D3.js as a dependency inside the newly created module wherever we need it. The advantage of doing so is that the injectable d3 component, or some parts of it, can be mocked for testing easily. Let's assume we are loading data from a remote resource and do not want to wait for the resource to load every time we test the component. Then, the fact that we can mock and override functions without having to modify anything within the component will become very handy. Another great advantage will be defining custom localization configurations directly in the factory. This will guarantee that we have the proper localization wherever we use D3.js in the component. Moreover, in every component, we use the injected d3 variable in a private scope of a function and not in the global scope. This is necessary for clean and encapsulated components; we should never use any variables from the global scope within an AngularJS component.

Now, let's create a second module that stores all the visualization-specific code dependent on D3.js. Thus, we want to create an injectable factory for D3.js, as shown in the following code:

    /* src/chart.js */
    // Chart Module
    angular.module('myChart', [])

    // D3 Factory
    .factory('d3', function() {
      /* We could declare locales or other D3.js
         specific configurations here. */
      return d3;
    });

In the preceding example, we returned d3 from the global scope without modifying it. We can also define custom D3.js specific configurations here (such as locales and formatters). We can go one step further and load the complete D3.js code inside this factory so that d3 will not be available in the global scope at all. However, we don't use this approach here to keep things as simple and understandable as possible.
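The testing benefit described above can be illustrated without AngularJS at all. The following is a minimal sketch, not code from the article: `makeSecondsFormatter` and `d3Mock` are hypothetical names, and the stub only imitates the shape of `d3.time.format` from D3.js version 3. The point is that code which receives d3 as an argument can be exercised with a stand-in object.

```javascript
// Hypothetical consumer: it receives a d3-like object as an argument
// instead of reading the global d3 variable.
function makeSecondsFormatter(d3) {
  var format = d3.time.format('%S');
  return function (date) {
    return format(date) + 's';
  };
}

// Hand-written stub (an assumption for this sketch): it implements only
// the single call used above, so no real D3.js is needed in the test.
var d3Mock = {
  time: {
    format: function (specifier) {
      return function (date) {
        return '[' + specifier + ':' + date.getSeconds() + ']';
      };
    }
  }
};

var formatter = makeSecondsFormatter(d3Mock);
var label = formatter(new Date(2014, 0, 1, 0, 0, 5));
console.log(label); // prints "[%S:5]s"
```

In an AngularJS unit test, the same idea would typically be expressed by overriding the d3 factory with a stub before the directive is instantiated; the sketch above only shows the underlying principle.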
We need to make this module or parts of it available to the main application. In AngularJS, we can do this by injecting the myChart module into the myApp application as follows:

    /* src/app.js */
    angular.module('myApp', ['myChart']);

Usually, we will just inject the directives and services of the visualization module that we want to use in the application, not the whole module. However, to get started and to access all parts of the visualization, we will leave it like this. We can now use the components of the chart module in the AngularJS application by injecting them into the controllers, services, and directives. The boilerplate, with a simple chart.js and chart.css file, is now ready. We can start to design the chart directive.

A chart directive

Next, we want to create a reusable and testable chart directive. The first question that comes to mind is where to put which functionality. Should we create an svg element as the parent for the directive or a div element? Should we draw a data point as a circle in svg and use ng-repeat to replicate these points in the chart? Or should we rather create and modify all data points with D3.js? I will answer all these questions in the following sections.

A directive for SVG

As a general rule, we can say that different concepts should be encapsulated so that they can be replaced anytime by a new technology. Hence, we will use AngularJS with an element directive as a parent element for the visualization. We will bind the data and the options of the chart to the private scope of the directive. In the directive itself, we will create the complete chart including the parent svg container, the axis, and all data points using D3.js.
Let's first add a simple directive for the chart component: /* src/chart.js */ …   // Scatter Chart Directive .directive('myScatterChart', ["d3", function(d3){      return {      restrict: 'E',      scope: {        },      compile: function( element, attrs, transclude ) {                   // Create a SVG root element        var svg = d3.select(element[0]).append('svg');          // Return the link function        return function(scope, element, attrs) { };      }    }; }]); In the preceding example, we first inject d3 to the directive by passing it as an argument to the caller function. Then, we return a directive as an element with a private scope. Next, we define a custom compile function that returns the link function of the directive. This is important because we need to create the svg container for the visualization during the compilation of the directive. Then, during the link phase of the directive, we need to draw the visualization. Let's try to define some of these directives and look at the generated output. We define three directives in the index.html file, as shown in the following code: <!-- index.html --> <div ng-controller="MainCtrl">   <!-- We can use the visualization directives here --> <!-- The first chart --> <my-scatter-chart class="chart"></my-scatter-chart>   <!-- A second chart --> <my-scatter-chart class="chart"></my-scatter-chart>   <!-- Another chart --> <my-scatter-chart class="chart"></my-scatter-chart>   </div> If we look at the output of the html page in the developer tools, we can see that for each base element of the directive, we created a svg parent element for the visualization: Output of the HTML page In the resulting DOM tree, we can see that three svg elements are appended to the directives. We can now start to draw the chart in these directives. Let's fill these elements with some awesome charts. Implementing a custom compile function First, let's add a data attribute to the isolated scope of the directive. 
This gives us access to the dataset, which we will later pass to the directive in the HTML template. Next, we extend the compile function of the directive to create a g group container for the data points and the axis. We will also add a watcher that checks for changes of the scope data array. Every time the data changes, we call a draw() function that redraws the chart of the directive. Let's get started:

    /* src/chart.js */
    ...
    // Scatter Chart Directive
    .directive('myScatterChart', ["d3", function(d3){

      // we will soon implement this function
      var draw = function(svg, width, height, data){ … };

      return {
        restrict: 'E',
        scope: {
          data: '='
        },
        compile: function( element, attrs, transclude ) {

          // Create a SVG root element
          var svg = d3.select(element[0]).append('svg');

          svg.append('g').attr('class', 'data');
          svg.append('g').attr('class', 'x-axis axis');
          svg.append('g').attr('class', 'y-axis axis');

          // Define the dimensions for the chart
          var width = 600, height = 300;

          // Return the link function
          return function(scope, element, attrs) {

            // Watch the data attribute of the scope (deep watch, so pushed points are detected)
            scope.$watch('data', function(newVal, oldVal, scope) {

              // Update the chart
              draw(svg, width, height, scope.data);
            }, true);
          };
        }
      };
    }]);

Now, we implement the draw() function in the beginning of the directive.

Drawing charts

So far, the chart directive should look like the following code. We will now implement the draw() function to draw the axes and the time series data. We start with setting the height and width for the svg element as follows:

    /* src/chart.js */
    ...
    // Scatter Chart Directive
    .directive('myScatterChart', ["d3", function(d3){

      function draw(svg, width, height, data) {
        svg
          .attr('width', width)
          .attr('height', height);
        // code continues here
      }

      return {
        restrict: 'E',
        scope: {
          data: '='
        },
        compile: function( element, attrs, transclude ) { ... }
      };
    }]);

Axis, scale, range, and domain

We first need to create the scales for the data and then the axis for the chart. The implementation looks very similar to the scatter chart. We want to update the axis with the minimum and maximum values of the dataset; therefore, we also add this code to the draw() function:

    /* src/chart.js --> myScatterChart --> draw() */
    function draw(svg, width, height, data) {
      ...
      // Define a margin
      var margin = 30;

      // Define x-scale
      var xScale = d3.time.scale()
        .domain([
          d3.min(data, function(d) { return d.time; }),
          d3.max(data, function(d) { return d.time; })
        ])
        .range([margin, width-margin]);

      // Define x-axis
      var xAxis = d3.svg.axis()
        .scale(xScale)
        .orient('top')
        .tickFormat(d3.time.format('%S'));

      // Define y-scale (linear, because visitors is a plain number)
      var yScale = d3.scale.linear()
        .domain([0, d3.max(data, function(d) { return d.visitors; })])
        .range([margin, height-margin]);

      // Define y-axis
      var yAxis = d3.svg.axis()
        .scale(yScale)
        .orient('left')
        .tickFormat(d3.format('f'));

      // Draw x-axis
      svg.select('.x-axis')
        .attr("transform", "translate(0, " + margin + ")")
        .call(xAxis);

      // Draw y-axis
      svg.select('.y-axis')
        .attr("transform", "translate(" + margin + ")")
        .call(yAxis);
    }

In the preceding code, we create a timescale for the x-axis and a linear scale for the y-axis and adapt the domain of both axes to match the extent of the dataset (we can also use the d3.extent() function to return min and max at the same time). Then, we define the pixel range for our chart area.
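To make the relationship between domain and range concrete, here is a small sketch in plain JavaScript. It is not D3.js: `linearScale` and `extent` are hypothetical stand-ins written for this illustration, and the sample `sampleLogs` array is made up. It maps data values linearly onto the pixel interval between the margins, using the same width, height, and margin as the chart above.

```javascript
// Stand-in for a linear scale: maps the interval domain = [d0, d1]
// onto the interval range = [r0, r1].
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1];
  var r0 = range[0], r1 = range[1];
  return function (value) {
    var t = (value - d0) / (d1 - d0); // relative position inside the domain
    return r0 + t * (r1 - r0);        // same relative position inside the range
  };
}

// Returns [min, max] of the accessor values, similar in spirit to d3.extent().
function extent(data, accessor) {
  var values = data.map(accessor);
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

var chartWidth = 600, chartHeight = 300, chartMargin = 30;

// Made-up sample data: three points, one second apart
var sampleLogs = [
  { time: new Date(2014, 0, 1, 0, 0, 0), visitors: 20 },
  { time: new Date(2014, 0, 1, 0, 0, 1), visitors: 80 },
  { time: new Date(2014, 0, 1, 0, 0, 2), visitors: 50 }
];

var timeOf = function (d) { return d.time.getTime(); };
var xPos = linearScale(extent(sampleLogs, timeOf), [chartMargin, chartWidth - chartMargin]);
var yPos = linearScale([0, 80], [chartMargin, chartHeight - chartMargin]);

console.log(xPos(timeOf(sampleLogs[1]))); // 300, the middle point lands mid-chart
console.log(yPos(40));                    // 150, half of the maximum lands mid-height
```

Note that this sketch returns NaN when the domain has zero width (for example, a dataset with a single point); a real scale implementation has to decide how to handle that case.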
Next, we create two axis objects with the previously defined scales and specify the tick format of each axis. We want to display the number of seconds that have passed on the x-axis and an integer value of the number of visitors on the y-axis. In the end, we draw the axes by calling the axis generator on the axis selection.

Joining the data points

Now, we will draw the data points. We finish the draw() function with this code:

    /* src/chart.js --> myScatterChart --> draw() */
    function draw(svg, width, height, data) {
      ...
      // Add the new data points
      svg.select('.data')
        .selectAll('circle').data(data)
        .enter()
        .append('circle');

      // Update all data points
      svg.select('.data')
        .selectAll('circle').data(data)
        .attr('r', 2.5)
        .attr('cx', function(d) { return xScale(d.time); })
        .attr('cy', function(d) { return yScale(d.visitors); });
    }

In the preceding code, we first create circle elements in the enter selection, that is, for the data points where no corresponding circle is found in the selection. Then, we update the radius and the center point of all circle elements of the chart. Let's look at the generated output of the application:

Output of the chart directive

We notice that the axes and the whole chart rescale as soon as new data points are added to the chart. In fact, this result looks very similar to the previous example, with the main difference that we used a directive to draw this chart. This means that the data of the visualization that belongs to the application is stored and updated in the application itself, whereas the directive is completely decoupled from the data.
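The two-step pattern used in draw() (create elements for new data, then update every element) can be sketched with plain arrays. This is an illustration only, not how D3.js is implemented: `joinByIndex` is a hypothetical helper, and plain objects stand in for the circle elements. Like a data join without a key function, it matches data to elements by index.

```javascript
// Plain objects stand in for DOM elements; data is matched by index.
function joinByIndex(elements, data) {
  var entered = 0;

  // "enter": create an element for every datum that has none yet
  for (var i = elements.length; i < data.length; i++) {
    elements.push({ tag: 'circle', attrs: {} });
    entered++;
  }

  // "update": refresh the attributes of all elements from their datum
  for (var j = 0; j < data.length; j++) {
    elements[j].attrs.cx = data[j].x;
    elements[j].attrs.cy = data[j].y;
  }

  return entered; // number of elements created in this pass
}

var circles = [];
var firstPass = joinByIndex(circles, [{ x: 1, y: 2 }]);
var secondPass = joinByIndex(circles, [{ x: 1, y: 2 }, { x: 3, y: 4 }, { x: 5, y: 6 }]);

console.log(firstPass, secondPass, circles.length); // 1 2 3
```

Each redraw therefore only creates elements for the points that were pushed since the last call, while existing elements are repositioned, which is what allows the chart to rescale as the domain grows. The sketch deliberately ignores the case where data shrinks (the exit selection), which the article's chart does not need because logs are only appended.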
To achieve a nice output like in the previous figure, we need to add some styles to the chart.css file, as shown in the following code:

    /* src/chart.css */
    .axis path,
    .axis line {
      fill: none;
      stroke: #999;
      shape-rendering: crispEdges;
    }
    .tick {
      font: 10px sans-serif;
    }
    circle {
      fill: steelblue;
    }

We need to disable the filling of the axis and enable crisp edges rendering; this will give the whole visualization a much better look.

Summary

In this article, you learned how to properly integrate a D3.js component into an AngularJS application, the Angular way. All files, modules, and components should be maintainable, testable, and reusable. You learned how to set up an AngularJS application and how to organize the folder structure for the visualization component. We put different responsibilities in different files and modules. Every piece that we can separate from the main application can be reused in another application; the goal is to use as much modularization as possible.

As a next step, we created the visualization directive by implementing a custom compile function. This gives us access to the first compilation of the element, where we can append the svg element as a parent for the visualization, along with other container elements.

Resources for Article:

Further resources on this subject:
  • AngularJS Performance [article]
  • An introduction to testing AngularJS directives [article]
  • Our App and Tool Stack [article]

Packt
24 Oct 2009
5 min read

Working with Simple Associations using CakePHP

Database relationships are hard to maintain even in a mid-sized PHP/MySQL application, particularly when multiple levels of relationships are involved, because complicated SQL queries are needed. CakePHP offers a simple yet powerful feature called 'object relational mapping', or ORM, to handle database relationships with ease.

In CakePHP, relations between the database tables are defined through associations, a way to represent the database table relationships inside CakePHP. Once the associations are defined in models according to the table relationships, we are ready to use them. Using CakePHP's ORM, we can save, retrieve, and delete related data into and from different database tables without writing complex SQL queries with multiple JOINs.

In this article by Ahsanul Bari and Anupom Syam, we will have a deep look at various types of associations and their uses. In particular, the purpose of this article is to learn:
  • How to figure out association types from database table relations
  • How to define different types of associations in CakePHP models
  • How to utilize the association for fetching related model data
  • How to relate associated data while saving

There are basically three types of relationship that can take place between database tables:
  • one-to-one
  • one-to-many
  • many-to-many

The first two are simple, as they don't require any additional table to relate the tables in the relationship. In this article, we will first see how to define associations in models for one-to-one and one-to-many relations. Then we will look at how to retrieve and delete related data from, and save data into, database tables using model associations for these simple associations.
Defining One-To-Many Relationship in Models

To see how to define a one-to-many relationship in models, we will think of a situation where we need to store information about some authors and their books, and the relation between authors and books is one-to-many. This means an author can have multiple books but a book belongs to only one author (which is admittedly unrealistic, as in real life a book can also have multiple authors). We are now going to define associations in models for this one-to-many relation, so that our models recognize their relations and can deal with them accordingly.

Time for Action: Defining One-To-Many Relation

1. Create a new database and put a fresh copy of CakePHP inside the web root. Name the database whatever you like but rename the cake folder to relationship. Configure the database in the new Cake installation.

2. Execute the following SQL statement in the database to create a table named authors:

    CREATE TABLE `authors` (
      `id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
      `name` varchar( 127 ) NOT NULL ,
      `email` varchar( 127 ) NOT NULL ,
      `website` varchar( 127 ) NOT NULL
    );

3. Create a books table in our database by executing the following SQL statement:

    CREATE TABLE `books` (
      `id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
      `isbn` varchar( 13 ) NOT NULL ,
      `title` varchar( 64 ) NOT NULL ,
      `description` text NOT NULL ,
      `author_id` int( 11 ) NOT NULL
    );

4. Create the Author model using the following code (/app/models/author.php):

    <?php
    class Author extends AppModel
    {
      var $name = 'Author';
      var $hasMany = 'Book';
    }
    ?>

5. Use the following code to create the Book model (/app/models/book.php):

    <?php
    class Book extends AppModel
    {
      var $name = 'Book';
      var $belongsTo = 'Author';
    }
    ?>

6. Create a controller for the Author model with the following code (/app/controllers/authors_controller.php):

    <?php
    class AuthorsController extends AppController {
      var $name = 'Authors';
      var $scaffold;
    }
    ?>

7. Use the following code to create a controller for the Book model (/app/controllers/books_controller.php):

    <?php
    class BooksController extends AppController {
      var $name = 'Books';
      var $scaffold;
    }
    ?>

8. Now, go to the following URLs and add some test data: http://localhost/relationship/authors/ and http://localhost/relationship/books/

What Just Happened?

We have created two tables, authors and books, for storing author and book information. A foreign key named author_id is added to the books table to establish the one-to-many relation between authors and books. Through this foreign key, an author is related to multiple books, and each book is related to a single author. By Cake convention, the name of a foreign key should be the underscored, singular name of the target model, suffixed with _id.

Once the database tables are created and relations are established between them, we can define associations in models. In both of the model classes, Author and Book, we defined associations to represent the one-to-many relationship between the corresponding two tables. CakePHP provides two types of association, hasMany and belongsTo, to define one-to-many relations in models. These associations are very appropriately named:
  • As an author 'has many' books, the Author model should have a hasMany association to represent its relation with the Book model.
  • As a book 'belongs to' one author, the Book model should have a belongsTo association to denote its relation with the Author model.

In the Author model, an association attribute $hasMany is defined with the value Book to inform the model that every author can be related to many books. We also added a $belongsTo attribute in the Book model and set its value to Author to let the Book model know that every book is related to only one author. After defining the associations, a scaffolded controller was created for each of these models to see how the associations work.
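The foreign-key naming convention and the two directions of the relation can be illustrated with a small, language-agnostic sketch. It is written in plain JavaScript rather than PHP and does not use or reproduce CakePHP's API; foreignKeyFor(), booksOf(), authorOf(), and the sample rows are all hypothetical names invented for this example.

```javascript
// Hypothetical helper: derives the conventional foreign-key column name
// (underscored, singular model name plus "_id") from a model name.
function foreignKeyFor(modelName) {
  const underscored = modelName
    .replace(/([a-z0-9])([A-Z])/g, '$1_$2') // split CamelCase words
    .toLowerCase();
  return underscored + '_id';
}

// Hypothetical sample rows, shaped like the authors and books tables
const authors = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' }
];
const books = [
  { id: 1, title: 'First', author_id: 1 },
  { id: 2, title: 'Second', author_id: 1 },
  { id: 3, title: 'Third', author_id: 2 }
];

// "has many": all books whose foreign key points at the given author
function booksOf(author) {
  const key = foreignKeyFor('Author');
  return books.filter(function (book) { return book[key] === author.id; });
}

// "belongs to": the single author a book's foreign key points at
function authorOf(book) {
  const key = foreignKeyFor('Author');
  return authors.find(function (author) { return author.id === book[key]; });
}

console.log(foreignKeyFor('Author'));    // author_id
console.log(booksOf(authors[0]).length); // 2
console.log(authorOf(books[2]).name);    // Grace
```

Both lookups read the same author_id column; hasMany and belongsTo are simply the two directions in which that single foreign key can be followed.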