
How-To Tutorials

Web Framework Behavior Tuning

Packt
12 Jan 2017
8 min read
In this article by Alex Antonov, the author of the book Spring Boot Cookbook – Second Edition, learn to use and configure spring resources and build your own Spring-based application using Spring Boot. In this article, you will learn about the following topics: Configuring route matching patterns Configuring custom static path mappings Adding custom connectors (For more resources related to this topic, see here.) Introduction We will look into enhancing our web application by doing behavior tuning, configuring the custom routing rules and patterns, adding additional static asset paths, and adding and modifying servlet container connectors and other properties, such as enabling SSL. Configuring route matching patterns When we build web applications, it is not always the case that a default, out-of-the-box, mapping configuration is applicable. At times, we want to create our RESTful URLs that contain characters such as . (dot), which Spring treats as a delimiter defining format, like path.xml, or we might not want to recognize a trailing slash, and so on. Conveniently, Spring provides us with a way to get this accomplished with ease. Let's imagine that the ISBN format does allow the use of dots to separate the book number from the revision with a pattern looking like [isbn-number].[revision]. How to do it… We will configure our application to not use the suffix pattern match of .* and not to strip the values after the dot when parsing the parameters. Let's perform the following steps: Let's add the necessary configuration to our WebConfiguration class with the following content: @Override public void configurePathMatch(PathMatchConfigurer configurer) { configurer.setUseSuffixPatternMatch(false). setUseTrailingSlashMatch(true); } Start the application by running ./gradlew clean bootRun. Let's open http://localhost:8080/books/978-1-78528-415-1.1 in the browser to see the following results: If we enter the correct ISBN, we will see a different result, as shown in the following screenshot: How it works… Let's look at what we did in detail. The configurePathMatch(PathMatchConfigurer configurer) method gives us an ability to set our own behavior in how we want Spring to match the request URL path to the controller parameters: configurer.setUseSuffixPatternMatch(false): This method indicates that we don't want to use the .* suffix so as to strip the trailing characters after the last dot. This translates into Spring parsing out 978-1-78528-415-1.1 as an {isbn} parameter for BookController. So, http://localhost:8080/books/978-1-78528-415-1.1 and http://localhost:8080/books/978-1-78528-415-1 will become different URLs. configurer.setUseTrailingSlashMatch(true): This method indicates that we want to use the trailing / in the URL as a match, as if it were not there. This effectively makes http://localhost:8080/books/978-1-78528-415-1 the same as http://localhost:8080/books/978-1-78528-415-1/. If you want to do further configuration on how the path matching takes place, you can provide your own implementation of PathMatcher and UrlPathHelper, but these will be required in the most extreme and custom-tailored situations and are not generally recommended. Configuring custom static path mappings It is possible to control how our web application deals with static assets and the files that exist on the filesystem or are bundled in the deployable archive. 
Let's say that we want to expose our internal application.properties file via the static web URL of http://localhost:8080/internal/application.properties from our application. To get started with this, proceed with the steps in the next section. How to do it… Let's add a new method, addResourceHandlers, to the WebConfiguration class with the following content: @Override public void addResourceHandlers(ResourceHandlerRegistry registry) { registry.addResourceHandler("/internal/**").addResourceLocations("classpath:/"); } Start the application by running ./gradlew clean bootRun. Let's open http://localhost:8080/internal/application.properties in the browser to see the following results: How it works… The method that we overrode, addResourceHandlers(ResourceHandlerRegistry registry), is another configuration method from WebMvcConfigurer, which gives us an ability to define custom mappings for static resource URLs and connect them with the resources on the filesystem or application classpath. In our case, we defined a mapping of anything that is being accessed via the / internal URL to be looked for in classpath:/ of our application. (For production environment, you probably don't want to expose the entire classpath as a static resource!) So, let's take a detailed look at what we did, as follows: registry.addResourceHandler("/internal/**"): This method adds a resource handler to the registry to handle our static resources, and it returns ResourceHandlerRegistration to us, which can be used to further configure the mapping in a chained fashion. /internal/** is a path pattern that will be used to match against the request URL using PathMatcher. We have seen how PathMatcher can be configured in the previous example but, by default, an AntPathMatcher implementation is used. We can configure more than one URL pattern to be matched to a particular resource location. addResourceLocations("classpath:/"):This method is called on the newly created instance of ResourceHandlerRegistration, and it defines the directories where the resources should be loaded from. These should be valid filesystems or classpath directories, and there can be more than one entered. If multiple locations are provided, they will be checked in the order in which they were entered. setCachePeriod (Integer cachePeriod): Using this method, we can also configure a caching interval for the given resource by adding custom connectors. Another very common scenario in the enterprise application development and deployment is to run the application with two separate HTTP port connectors: one for HTTP and the other for HTTPS. Adding custom connectors Another very common scenario in the enterprise application development and deployment is to run the application with two separate HTTP port connectors: one for HTTP and the other for HTTPS. Getting ready For this recipe, we will undo the changes that we implemented in the previous example. In order to create an HTTPS connector, we will need a few things; but, most importantly, we will need to generate a certificate keystore that is used to encrypt and decrypt the SSL communication with the browser. If you are using Unix or Mac, you can do it by running the following command: $JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA On Windows, this can be achieved via the following code: "%JAVA_HOME%binkeytool" -genkey -alias tomcat -keyalg RSA During the creation of the keystore, you should enter the information that is appropriate to you, including passwords, name, and so on. 
For the purpose of this book, we will use the default password: changeit. Once the execution is complete, a newly generated keystore file will appear in your home directory under the name .keystore. You can find more information about preparing the certificate keystore at https://tomcat.apache.org/tomcat-8.0-doc/ssl-howto.html#Prepare_the_Certificate_Keystore. How to do it… With the keystore creation complete, we will need to create a separate properties file in order to store our configuration for the HTTPS connector, such as port and others. After that, we will create a configuration property binding object and use it to configure our new connector. Perform the following steps: First, we will create a new properties file named tomcat.https.properties in the src/main/resources directory from the root of our project with the following content: custom.tomcat.https.port=8443 custom.tomcat.https.secure=true custom.tomcat.https.scheme=https custom.tomcat.https.ssl=true custom.tomcat.https.keystore=${user.home}/.keystore custom.tomcat.https.keystore-password=changeit Next, we will create a nested static class named TomcatSslConnectorProperties in our WebConfiguration, with the following content: @ConfigurationProperties(prefix = "custom.tomcat.https") public static class TomcatSslConnectorProperties { private Integer port; private Boolean ssl= true; private Boolean secure = true; private String scheme = "https"; private File keystore; private String keystorePassword; //Skipping getters and setters to save space, but we do need them public void configureConnector(Connector connector) { if (port != null) connector.setPort(port); if (secure != null) connector.setSecure(secure); if (scheme != null) connector.setScheme(scheme); if (ssl!= null) connector.setProperty("SSLEnabled", ssl.toString()); if (keystore!= null &&keystore.exists()) { connector.setProperty("keystoreFile", keystore.getAbsolutePath()); connector.setProperty("keystorePassword", keystorePassword); } } } Now, we will need to add our newly created tomcat.http.properties file as a Spring Boot property source and enable TomcatSslConnectorProperties to be bound. This can be done by adding the following code right above the class declaration of the WebConfiguration class: @Configuration @PropertySource("classpath:/tomcat.https.properties") @EnableConfigurationProperties(WebConfiguration.TomcatSslConnectorProperties.class) public class WebConfiguration extends WebMvcConfigurerAdapter {...} Finally, we will need to create an EmbeddedServletContainerFactory Spring bean where we will add our HTTPS connector. We will do that by adding the following code to the WebConfiguration class: @Bean public EmbeddedServletContainerFactory servletContainer(TomcatSslConnectorProperties properties) { TomcatEmbeddedServletContainerFactory tomcat = new TomcatEmbeddedServletContainerFactory(); tomcat.addAdditionalTomcatConnectors( createSslConnector(properties)); return tomcat; } private Connector createSslConnector(TomcatSslConnectorProperties properties) { Connector connector = new Connector(); properties.configureConnector(connector); return connector; } Start the application by running ./gradlew clean bootRun. Let's open https://localhost:8443/internal/tomcat.https.properties in the browser to see the following results: Summary In this article, you learned how to fine-tune the behavior of a web application. This article has given a small gist about custom routes, asset paths, and amending routing patterns. 
You also learned how to add more connectors to the servlet container. Resources for Article: Further resources on this subject: Introduction to Spring Framework [article] Setting up Microsoft Bot Framework Dev Environment [article] Creating our first bot, WebBot [article]
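The route-matching recipe above refers to a BookController that receives the {isbn} path variable, but the controller itself is not shown in this excerpt. The following is only a minimal sketch of what such a controller might look like; the request mapping, method name, and echoed response are assumptions for illustration rather than the book's actual code.

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller: with setUseSuffixPatternMatch(false) in place, a request to
// /books/978-1-78528-415-1.1 binds the full value, dot included, to the {isbn} variable.
@RestController
@RequestMapping("/books")
public class BookController {

    @GetMapping("/{isbn}")
    public String getBook(@PathVariable String isbn) {
        // The book's example would look the ISBN up in a repository;
        // here we simply echo it back.
        return "Requested ISBN: " + isbn;
    }
}
```

With suffix pattern matching disabled, /books/978-1-78528-415-1 and /books/978-1-78528-415-1.1 both reach this handler but bind different isbn values, which is exactly the behavior the recipe describes.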

A Professional Environment for React Native, Part 2

Pierre Monge
12 Jan 2017
4 min read
In Part 1 of this series, I covered the full environment and everything you need to start creating your own React Native applications. Now here in Part 2, we are going to dig in and go over the tools that you can take advantage of for maintaining those React Native apps. Maintaining the application Maintaining a React Native application, just like any software, is very complex and requires a lot of organization. In addition to having strict code (a good syntax with eslint or a good understanding of the code with flow), you must have intelligent code, and you must organize your files, filenames, and variables. It is necessary to have solutions for the maintenance of the application in the long term as well as have tools that provide feedback. Here are some tools that we use, which should be in place early in the cycle of your React Native development. GitHub GitHub is a fantastic tool, but you need to know how to control it. In my company, we have our own Git flow with a Dev branch, a master branch, release branches, bugs and other useful branches. It's up to you to make your own flow for Git! One of the most important things is the Pull Request, or the PR! And if there are many people on your project, it is important for your group to agree on the organization of the code. BugTracker & Tooling We use many tools in my team, but here is our Must-Have list to maintain the application: circleCI: This is a continuous integration tool that we integrate with GitHub. It allows us to pass recurrent tests with each new commit. BugSnag: This is a bug tracking tool that can be used in a React Native integration, which makes it possible to raise user bugs by the webs without the user noticing it. codePush: This is useful for deploying code on versions already in production. And yes, you can change business code while the application is already in production. I do not pay much attention to it, yet the states of applications (Debug, Beta, and Production) are a big part that has to be treated because it is a workset to have for quality work and a long application life. We also have quality assurance in our company, which allows us to validate a product before it is set up, which provides a regular process of putting a React Native app into production. As you can see, there are many tools that will help you maintain a React Native mobile application. Despite the youthfulness of the product, the community grows quickly and developers are excited about creating apps. There are more and more large companies using React Native, such as AirBnB , Wix, Microsoft, and many others. And with the technology growing and improving, there are more and more new tools and integrations coming to React Native. I hope this series has helped you create and maintain your own React Native applications. Here is a summary of the tools covered: Atom is a text editor that's modern, approachable, yet hackable to the core—a tool that you can customize to do anything, but you also need to use it productively without ever touching a config file. GitHub is a web-based Git repository hosting service. CircleCI is a modern continuous integration and delivery platform that software teams love to use. BugSnag monitors application errors to improve customer experiences and code quality. react-native-code-push is a plugin that provides client-side integration, allowing you to easily add a dynamic update experience to your React Native app. About the author Pierre Monge (liroo.pierre@gmail.com) is a 21-year-old student. 
He is a developer in C, JavaScript, and all things web development, and he has recently been creating mobile applications. He is currently working as an intern at a company named Azendoo, where he is developing a 100% React Native application.
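The codePush tool in the Must-Have list above is consumed through the react-native-code-push plugin mentioned in the summary. The article does not include code, so the snippet below is only an illustrative sketch of how the app's root component is typically wrapped so new JavaScript bundles can be delivered to apps already in production; the update policy values shown are assumptions.

```javascript
// App.js -- illustrative sketch, not taken from the article
import React from 'react';
import { Text, View } from 'react-native';
import codePush from 'react-native-code-push';

class App extends React.Component {
  render() {
    return (
      <View>
        <Text>Hello from a CodePush-enabled app</Text>
      </View>
    );
  }
}

// Wrapping the root component lets CodePush check for a new bundle on every
// app start and install it on the next restart, without an app-store release.
export default codePush({
  checkFrequency: codePush.CheckFrequency.ON_APP_START,
  installMode: codePush.InstallMode.ON_NEXT_RESTART,
})(App);
```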

Metric Analytics with Metricbeat

Packt
11 Jan 2017
5 min read
In this article by Bahaaldine Azarmi, the author of the book Learning Kibana 5.0, we will learn about metric analytics, which is fundamentally different in terms of data structure. (For more resources related to this topic, see here.) Author would like to spend a few lines on the following question: What is a metric? A metric is an event that contains a timestamp and usually one or more numeric values. It is appended to a metric file sequentially, where all lines of metrics are ordered based on the timestamp. As an example, here are a few system metrics: 02:30:00 AM    all    2.58    0.00    0.70    1.12    0.05     95.5502:40:00 AM    all    2.56    0.00    0.69    1.05    0.04     95.6602:50:00 AM    all    2.64    0.00    0.65    1.15    0.05     95.50 Unlike logs, metrics are sent periodically, for example, every 10 minutes (as the preceding example illustrates) whereas logs are usually appended to the log file when something happens. Metrics are often used in the context of software or hardware health monitoring, such as resource utilization monitoring, database execution metrics monitoring, and so on. Since version 5.0, Elastic had, at all layers of the solutions, new features to enhance the user experience of metrics management and analytics. Metricbeat is one of the new features in 5.0. It allows the user to ship metrics data, whether from the machine or from applications, to Elasticsearch, and comes with out-of-the-box dashboards for Kibana. Kibana also integrates Timelion with its core, a plugin which has been made for manipulating numeric data, such as metrics. In this article, we'll start by working with Metricbeat. Metricbeat in Kibana The procedure to import the dashboard has been laid out in the subsequent section. Importing the dashboard Before importing the dashboard, let's have a look at the actual metric data that Metricbeat ships. As I have Chrome opened while typing this article, I'm going to filter the data by process name, here chrome: Discover tab filtered by process name   Here is an example of one of the documents I have: { "_index": "metricbeat-2016.09.06", "_type": "metricsets", "_id": "AVcBFstEVDHwfzZYZHB8", "_score": 4.29527, "_source": { "@timestamp": "2016-09-06T20:00:53.545Z", "beat": { "hostname": "MacBook-Pro-de-Bahaaldine.local", "name": "MacBook-Pro-de-Bahaaldine.local" }, "metricset": { "module": "system", "name": "process", "rtt": 5916 }, "system": { "process": { "cmdline": "/Applications/Google Chrome.app/Contents/Versions/52.0.2743.116/Google Chrome Helper.app/Contents/MacOS/Google Chrome Helper --type=ppapi --channel=55142.2188.1032368744 --ppapi-flash-args --lang=fr", "cpu": { "start_time": "09:52", "total": { "pct": 0.0035 } }, "memory": { "rss": { "bytes": 67813376, "pct": 0.0039 }, "share": 0, "size": 3355303936 }, "name": "Google Chrome H", "pid": 76273, "ppid": 55142, "state": "running", "username": "bahaaldine" } }, "type": "metricsets" }, "fields": { "@timestamp": [ 1473192053545 ] } } Metricbeat document example The preceding document breaks down the utilization of resources for the chrome process. We can see, for example, the usage of CPU and memory, as well as the state of the process as a whole. Now how about visualizing the data in an actual dashboard? 
To do so, go into the Kibana folder located in the Metricbeat installation directory: MacBook-Pro-de-Bahaaldine:kibana bahaaldine$ pwd /elastic/metricbeat-5.0.0/kibana MacBook-Pro-de-Bahaaldine:kibana bahaaldine$ ls dashboard import_dashboards.ps1 import_dashboards.sh index-pattern search visualization import_dashboards.sh is the file we will use to import the dashboards in Kibana. Execute the file script like the following: ./import_dashboards.sh –h This should print out the help, which, essentially, will give you the list of arguments you can pass to the script. Here, we need to specify a username and a password as we are using the X-Pack security plugin, which secures our cluster: ./import_dashboards.sh –u elastic:changeme You should normally get a bunch of logs stating that dashboards have been imported, as shown in the following example: Import visualization Servers-overview: {"_index":".kibana","_type":"visualization","_id":"Servers-overview","_version":4,"forced_refresh":false,"_shards":{"total":2,"successful":1,"failed":0},"created":false} Now, at this point, you have metric data in Elasticsearch and dashboards created in Kibana, so you can now visualize the data. Visualizing metrics If you go back into the Kibana/dashboard section and try to open the Metricbeat System Statistics dashboard, you should get something similar to the following: Metricbeat Kibana dashboard You should see in your own dashboard the metric based on the processes that are running on your computer. In my case, I have a bunch of them for which I can visualize the CPU and memory utilization, for example: RAM and CPU utilization As an example, what can be important here is to be sure that Metricbeat has a very low footprint on the overall system in terms of CPU or RAM, as shown here: Metricbeat resource utilization As we can see in the preceding diagram, Metricbeat only uses about 0.4% of the CPU and less than 0.1% of the memory on my Macbook Pro. On the other hand, if I want to get the most resource-consuming processes, I can check in the Top processes data table, which gives the following information: Top processes Besides Google Chrome H, which uses a lot of CPU, zoom.us, a conferencing application, seems to bring a lot of stress to my laptop. Rather than using the Kibana standard visualization to manipulate our metrics, we'll use Timelion instead, and focus on this heavy CPU consuming processes use case. Summary In this article, we have seen how we can use Kibana in the context of technical metric analytics. We relied on the data that Metricbeat is able to ship from a machine and visualized the result both in Kibana dashboard and in Kibana Timelion. Resources for Article: Further resources on this subject: An Introduction to Kibana [article] Big Data Analytics [article] Static Data Management [article]
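The walkthrough above starts from metric documents that are already flowing into Elasticsearch. For completeness, a minimal metricbeat.yml sketch that would ship the system/process metrics seen in the example document might look like the following; the module settings and output host are assumptions based on a default local Metricbeat 5.x install, not configuration taken from the article.

```yaml
# metricbeat.yml -- minimal sketch (assumed defaults, not from the article)
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "process"]  # process feeds the Top processes table
    period: 10s
    processes: [".*"]

output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"      # the article's cluster is secured with X-Pack
  password: "changeme"
```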

Scale your Django App: Gunicorn + Apache + Nginx

Jean Jung
11 Jan 2017
6 min read
One question when starting with Django is "How do I scale my app"? Brandon Rhodes has answered this question in Foundations of Python Network Programming. Rhodes shows us different options, so in this post we will focus on the preferred and main option: Gunicorn + Apache + Nginx. The idea of this architecture is to have Nginx as a proxy to delegate dynamic content to Gunicorn and static content to Apache. As Django, by itself, does not handle static content, and Apache does it very well, we can take advantages from that. Below we will see how to configure everything. Environment Project directory: /var/www/myproject Apache2 Nginx Gunicorn Project settings STATIC_ROOT: /var/www/myproject/static STATIC_URL: /static/ MEDIA_ROOT: /var/www/myproject/media MEDIA_URL: /media/ ALLOWED_HOSTS: myproject.com Gunicorn Gunicorn is a Python and WSGI compatible server, making it our first option when working with Django. It’s possible to install Gunicorn from pip by running: pip install gunicorn To run the Gunicorn server, cd to your project directoryand run: gunicorn myproject.wsgi:application -b localhost:8000 By default, Gunicorn runs just one worker to serve the pages. If you feel you need more workers, you can start them by passing the number of workers to the --workers option. Gunicorn also runs in the foreground, but you need to configure a service on your server. This, however, is not the focus of this post. Visit localhost:8000 on your browser and see that your project is working. You will probably see that your static wasn’t accessible. This is because Django itself cannot serve static files, and Gunicorn is not configured to serve them too. Let’s fix that with Apache in the next section. If your page does not work here, check if you are using a virtualenv and if it is enabled on the Gunicorn running process. Apache Installing Apache takes some time and is not the focus of this post; additionally, a great majority of the readers will already have Apache, so if you don’t know how to install Apache, follow this guide. If you already have configured Apache to serve static content, this one will be very similar to what you have done. If you have never done that, do not be afraid; it will be easy! First of all, change the listening port from Apache. Currently, on Apache2, edit the /etc/apache2/ports.conf and change the line: Listen 80 To: Listen 8001 You can choose other ports too; just be sure to adjust the permissions on the static and media files dir to match the current Apache running user needs. Create a file at /etc/apache2/sites-enabled/myproject.com.conf and add this content: <VirtualHost *:8001> ServerName static.myproject.com ServerAdmin webmaster@localhost CustomLog ${APACHE_LOG_DIR}/static.myproject.com-access.log combined ErrorLog ${APACHE_LOG_DIR}/static.myproject.com-error.log # Possible values include: debug, info, notice, warn, error, crit, # alert, emerg. LogLevel warn DocumentRoot /var/www/myproject Alias /static/ /var/www/myproject/static/ Alias /media/ /var/www/myproject/media/ <Directory /var/www/myproject/static> Require all granted </Directory> <Directory /var/www/myproject/media> Require all granted </Directory> </VirtualHost> Be sure to replace everything needed to fit your project needs. But your project still does not work well because Gunicorn does not know about Apache, and we don’t want it to know anything about that. This is because we will use Nginx, covered in the next session. Nginx Nginx is a very light and powerful web server. 
It is different from Apache, and it does not spawn a new process for every request, so it works very well as a proxy. As I’ve said, when installing Apache, you would lead to this reference to know how to install Nginx. Proxy configuration is very simple in Nginx; just create a file at /etc/nginx/conf.d/myproject.com.conf and put: upstream dynamic { server 127.0.0.1:8000; } upstream static { server 127.0.0.1:8001; } server { listen 80; server_name myproject.com; # Root request handler to gunicorn upstream location / { proxy_pass http://dynamic; } # Static request handler to apache upstream location /static { proxy_pass http://static/static; } # Media request handler to apache upstream location /media { proxy_pass http://static/media; } proxy_set_header X-Real-IP $remote_addr; proxy_set_header Host $host; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; error_page 500 502 503 504 /50x.html; } This way, you have everything working on your machine! If you have more than one machine, you can dedicate the machines to deliver static/dynamic contents. The machine where Nginx runs is the proxy, and it needs to be visible from the Internet. The machines running Apache or Gunicorn can be visible only from your local network. If you follow this architecture, you can just change the Apache and Gunicorn configurations to listen to the default ports, adjust the domain names, and set the Nginx configuration to deliver the connections over the new domains. Where to go now? For more details on deploying Gunicorn with Nginx, see the Gunicorn deployment page. You would like to see the Apache configuration page and the Nginx getting started page to have more information about scalability and security. Summary In this post you saw how to configure Nginx, Apache, and Gunicorn servers to deliver your Django app over a proxy environment, balancing your requests through Apache and Gunicorn. There was a state about how to start more Gunicorn workers and where to find details about scaling each of the servers being used. References [1] - PEP 3333 - WSGI [2] - Gunicorn Project [3] - Apache Project [4] - Nginx Project [5] - Rhodes, B. & Goerzen, J. (2014). Foundations of Python Network Programming. New York, NY: Apress. About the author Jean Jung is a Brazilian developer passionate about technology. Currently a System Analyst at EBANX, an international payment processing cross boarder for Latin America, he's very interested in Python and Artificial Intelligence, specifically Machine Learning, Compilers, and Operational Systems. As a hobby, he's always looking for IoT projects with Arduino.
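The post notes that keeping Gunicorn running in the background as a service is out of its scope. One common way to do that is a systemd unit; the sketch below is only an example under assumed paths, user names, and worker count, so adjust it to your own setup (in particular the virtualenv path).

```ini
# /etc/systemd/system/gunicorn-myproject.service -- illustrative sketch
[Unit]
Description=Gunicorn daemon for myproject
After=network.target

[Service]
User=www-data
Group=www-data
WorkingDirectory=/var/www/myproject
# Assumes a virtualenv at /var/www/myproject/venv; point ExecStart at the
# system-wide gunicorn binary instead if that is how you installed it.
ExecStart=/var/www/myproject/venv/bin/gunicorn myproject.wsgi:application \
    --bind 127.0.0.1:8000 --workers 3
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving the file, systemctl daemon-reload followed by systemctl enable --now gunicorn-myproject starts the worker pool that Nginx proxies to on port 8000.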

Hello, C#! Welcome, .NET Core!

Packt
11 Jan 2017
10 min read
In this article by Mark J. Price, author of the book C# 7 and .NET Core: Modern Cross-Platform Development-Second Edition, we will discuss about setting up your development environment; understanding the similarities and differences between .NET Core, .NET Framework, .NET Standard Library, and .NET Native. Most people learn complex topics by imitation and repetition rather than reading a detailed explanation of theory. So, I will not explain every keyword and step. This article covers the following topics: Setting up your development environment Understanding .NET (For more resources related to this topic, see here.) Setting up your development environment Before you start programming, you will need to set up your Interactive Development Environment (IDE) that includes a code editor for C#. The best IDE to choose is Microsoft Visual Studio 2017, but it only runs on the Windows operating system. To develop on alternative operating systems such as macOS and Linux, a good choice of IDE is Microsoft Visual Studio Code. Using alternative C# IDEs There are alternative IDEs for C#, for example, MonoDevelop and JetBrains Project Rider. They each have versions available for Windows, Linux, and macOS, allowing you to write code on one operating system and deploy to the same or a different one. For MonoDevelop IDE, visit http://www.monodevelop.com/ For JetBrains Project Rider, visit https://www.jetbrains.com/rider/ Cloud9 is a web browser-based IDE, so it's even more cross-platform than the others. Here is the link: https://c9.io/web/sign-up/free Linux and Docker are popular server host platforms because they are relatively lightweight and more cost-effectively scalable when compared to operating system platforms that are more for end users, such as Windows and macOS. Using Visual Studio 2017 on Windows 10 You can use Windows 7 or later to run code, but you will have a better experience if you use Windows 10. If you don't have Windows 10, and then you can create a virtual machine (VM) to use for development. You can choose any cloud provider, but Microsoft Azure has preconfigured VMs that include properly licensed Windows 10 and Visual Studio 2017. You only pay for the minutes your VM is running, so it is a way for users of Linux, macOS, and older Windows versions to have all the benefits of using Visual Studio 2017. Since October 2014, Microsoft has made a professional-quality edition of Visual Studio available to everyone for free. It is called the Community Edition. Microsoft has combined all its free developer offerings in a program called Visual Studio Dev Essentials. This includes the Community Edition, the free level of Visual Studio Team Services, Azure credits for test and development, and free training from Pluralsight, Wintellect, and Xamarin. Installing Microsoft Visual Studio 2017 Download and install Microsoft Visual Studio Community 2017 or later: https://www.visualstudio.com/vs/visual-studio-2017/. Choosing workloads On the Workloads tab, choose the following: Universal Windows Platform development .NET desktop development Web development Azure development .NET Core and Docker development On the Individual components tab, choose the following: Git for Windows GitHub extension for Visual Studio Click Install. You can choose to install everything if you want support for languages such as C++, Python, and R. Completing the installation Wait for the software to download and install. When the installation is complete, click Launch. 
While you wait for Visual Studio 2017 to install, you can jump to the Understanding .NET section in this article.

Signing in to Visual Studio

The first time that you run Visual Studio 2017, you will be prompted to sign in. If you have a Microsoft account, for example, a Hotmail, MSN, Live, or Outlook e-mail address, you can use that account. If you don't, then register for a new one at the following link: https://signup.live.com/

You will see the Visual Studio user interface with the Start Page open in the central area. Like most Windows desktop applications, Visual Studio has a menu bar, a toolbar for common commands, and a status bar at the bottom. On the right is the Solution Explorer window that will list your open projects. To have quick access to Visual Studio in the future, right-click on its entry in the Windows taskbar and select Pin this program to taskbar.

Using older versions of Visual Studio

The free Community Edition has been available since Visual Studio 2013 with Update 4. If you want to use a free version of Visual Studio older than 2013, then you can use one of the more limited Express editions. A lot of the code in this book will work with older versions if you bear in mind when the following features were introduced:

Year   C#   Features
2005   2    Generics with <T>
2008   3    Lambda expressions with => and manipulating sequences with LINQ (from, in, where, orderby, ascending, descending, select, group, into)
2010   4    Dynamic typing with dynamic and multithreading with Task
2012   5    Simplifying multithreading with async and await
2015   6    string interpolation with $"" and importing static types with using static
2017   7    Tuples (with deconstruction), patterns, out variables, literal improvements

Understanding .NET

.NET Framework, .NET Core, .NET Standard Library, and .NET Native are related and overlapping platforms for developers to build applications and services upon.

Understanding the .NET Framework platform

Microsoft's .NET Framework is a development platform that includes a Common Language Runtime (CLR) that manages the execution of code and a rich library of classes for building applications. Microsoft designed the .NET Framework to have the possibility of being cross-platform, but Microsoft put their implementation effort into making it work best with Windows. Practically speaking, the .NET Framework is Windows-only.

Understanding the Mono and Xamarin projects

Third parties developed a cross-platform .NET implementation named the Mono project (http://www.mono-project.com/). Mono is cross-platform, but it fell well behind the official implementation of .NET Framework. It has found a niche as the foundation of the Xamarin mobile platform. Microsoft purchased Xamarin and now includes what used to be an expensive product for free with Visual Studio 2017. Microsoft has renamed the Xamarin Studio development tool to Visual Studio for the Mac. Xamarin is targeted at mobile development and building cloud services to support mobile apps.

Understanding the .NET Core platform

Today, we live in a truly cross-platform world. Modern mobile and cloud development have made Windows a much less important operating system. So, Microsoft has been working on an effort to decouple the .NET Framework from its close ties with Windows. While rewriting .NET to be truly cross-platform, Microsoft has taken the opportunity to refactor .NET to remove major parts that are no longer considered core.
This new product is branded as the .NET Core, which includes a cross-platform implementation of the CLR known as CoreCLR and a streamlined library of classes known as CoreFX. Streamlining .NET .NET Core is much smaller than the current version of the .NET Framework because a lot has been removed. For example, Windows Forms and Windows Presentation Foundation (WPF) can be used to build graphical user interface (GUI) applications, but they are tightly bound to Windows, so they have been removed from the .NET Core. The latest technology for building Windows apps is the Universal Windows Platform (UWP). ASP.NET Web Forms and Windows Communication Foundation (WCF) are old web applications and service technologies that fewer developers choose to use today, so they have also been removed from the .NET Core. Instead, developers prefer to use ASP.NET MVC and ASP.NET Web API. These two technologies have been refactored and combined into a new product that runs on the .NET Core named ASP.NET Core. The Entity Framework (EF) 6.x is an object-relational mapping technology for working with data stored in relational databases such as Oracle and Microsoft SQL Server. It has gained baggage over the years, so the cross-platform version has been slimmed down and named Entity Framework Core. Some data types in .NET that are included with both the .NET Framework and the .NET Core have been simplified by removing some members. For example, in the .NET Framework, the File class has both a Close and Dispose method, and either can be used to release the file resources. In .NET Core, there is only the Dispose method. This reduces the memory footprint of the assembly and simplifies the API you have to learn. The .NET Framework 4.6 is about 200 MB. The .NET Core is about 11 MB. Eventually, the .NET Core may grow to a similar larger size. Microsoft's goal is not to make the .NET Core smaller than the .NET Framework. The goal is to componentize .NET Core to support modern technologies and to have fewer dependencies so that deployment requires only those components that your application really needs. Understanding the .NET Standard The situation with .NET today is that there are three forked .NET platforms, all controlled by Microsoft: .NET Framework, Xamarin, and .NET Core. Each have different strengths and weaknesses. This has led to the problem that a developer must learn three platforms, each with annoying quirks and limitations. So, Microsoft is working on defining the .NET Standard 2.0: a set of APIs that all .NET platforms must implement. Today, in 2016, there is the .NET Standard 1.6, but only .NET Core 1.0 supports it; .NET Framework and Xamarin do not! .NET Standard 2.0 will be implemented by .NET Framework, .NET Core, and Xamarin. For .NET Core, this will add many of the missing APIs that developers need to port old code written for .NET Framework to the new cross-platform .NET Core. .NET Standard 2.0 will probably be released towards the end of 2017, so I hope to write a third edition of this book for when that's finally released. The future of .NET The .NET Standard 2.0 is the near future of .NET, and it will make it much easier for developers to share code between any flavor of .NET, but we are not there yet. For cross-platform development, .NET Core is a great start, but it will take another version or two to become as mature as the current version of the .NET Framework. 
This book will focus on the .NET Core, but will use the .NET Framework when important or useful features have not (yet) been implemented in the .NET Core.

Understanding the .NET Native compiler

Another .NET initiative is the .NET Native compiler. This compiles C# code to native CPU instructions ahead-of-time (AoT) rather than using the CLR to compile IL just-in-time (JIT) to native code later. The .NET Native compiler improves execution speed and reduces the memory footprint for applications. It supports the following:

UWP apps for Windows 10, Windows 10 Mobile, Xbox One, HoloLens, and Internet of Things (IoT) devices such as Raspberry Pi
Server-side web development with ASP.NET Core
Console applications for use on the command line

Comparing .NET technologies

The following table summarizes and compares the .NET technologies:

Platform         Feature set                             C# compiles to             Host OSes
.NET Framework   Mature and extensive                    IL executed by a runtime   Windows only
Xamarin          Mature and limited to mobile features                              iOS, Android, Windows Mobile
.NET Core        Brand-new and somewhat limited                                     Windows, Linux, macOS, Docker
.NET Native      Brand-new and very limited              Native code

Summary

In this article, we have learned how to set up the development environment and discussed the .NET technologies in detail. Resources for Article: Further resources on this subject: Introduction to C# and .NET [article] Reactive Programming with C# [article] Functional Programming in C# [article]
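As a small illustration of the API simplification described earlier (file types in .NET Core keeping only Dispose), releasing a file handle is typically done with a using block. This snippet is a generic sketch rather than code from the book, and the file name is made up.

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // The using block guarantees Dispose() is called when the scope ends,
        // which is the one remaining way to release the file resource in .NET Core.
        using (StreamReader reader = new StreamReader(File.OpenRead("notes.txt")))
        {
            Console.WriteLine(reader.ReadLine());
        }
    }
}
```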

ML Package

Packt
11 Jan 2017
18 min read
In this article by Denny Lee, the author of the book Learning PySpark, has provided a brief implementation and theory on ML packages. So, let's get to it! In this article, we will reuse a portion of the dataset. The data can be downloaded from http://www.tomdrabas.com/data/LearningPySpark/births_transformed.csv.gz. (For more resources related to this topic, see here.) Overview of the package At the top level, the package exposes three main abstract classes: a Transformer, an Estimator, and a Pipeline. We will shortly explain each with some short examples. Transformer The Transformer class, like the name suggests, transforms your data by (normally) appending a new column to your DataFrame. At the high level, when deriving from the Transformer abstract class, each and every new Transformer needs to implement a .transform(...) method. The method, as a first and normally the only obligatory parameter, requires passing a DataFrame to be transformed. This, of course, varies method-by-method in the ML package: other popular parameters are inputCol and outputCol; these, however, frequently default to some predefined values, such as 'features' for the inputCol parameter. There are many Transformers offered in the spark.ml.feature and we will briefly describe them here: Binarizer: Given a threshold, the method takes a continuous variable and transforms it into a binary one. Bucketizer: Similar to the Binarizer, this method takes a list of thresholds (the splits parameter) and transforms a continuous variable into a multinomial one. ChiSqSelector: For the categorical target variables (think, classification models), the feature allows you to select a predefined number of features (parameterized by the numTopFeatures parameter) that explain the variance in the target the best. The selection is done, as the name of the method suggest using a Chi-Square test. It is one of the two-step methods: first, you need to .fit(...) your data (so the method can calculate the Chi-square tests). Calling the .fit(...) method (you pass your DataFrame as a parameter) returns a ChiSqSelectorModel object that you can then use to transform your DataFrame using the .transform(...) method. More information on Chi-square can be found here: http://ccnmtl.columbia.edu/projects/qmss/the_chisquare_test/about_the_chisquare_test.html. CountVectorizer: Useful for a tokenized text (such as [['Learning', 'PySpark', 'with', 'us'],['us', 'us', 'us']]). It is of two-step methods: first, you need to .fit(...), that is, learn the patterns from your dataset, before you can .transform(...) with the CountVectorizerModel returned by the .fit(...) method. The output from this transformer, for the tokenized text presented previously, would look similar to this: [(4, [0, 1, 2, 3], [1.0, 1.0, 1.0, 1.0]),(4, [3], [3.0])]. DCT: The Discrete Cosine Transform takes a vector of real values and returns a vector of the same length, but with the sum of cosine functions oscillating at different frequencies. Such transformations are useful to extract some underlying frequencies in your data or in data compression. ElementwiseProduct: A method that returns a vector with elements that are products of the vector passed to the method, and a vector passed as the scalingVec parameter. For example, if you had a [10.0, 3.0, 15.0] vector and your scalingVec was [0.99, 3.30, 0.66], then the vector you would get would look as follows: [9.9, 9.9, 9.9]. 
HashingTF: A hashing trick transformer that takes a list of tokenized text and returns a vector (of predefined length) with counts. From PySpark's documentation: Since a simple modulo is used to transform the hash function to a column index, it is advisable to use a power of two as the numFeatures parameter; otherwise the features will not be mapped evenly to the columns. IDF: The method computes an Inverse Document Frequency for a list of documents. Note that the documents need to already be represented as a vector (for example, using either the HashingTF or CountVectorizer). IndexToString: A complement to the StringIndexer method. It uses the encoding from the StringIndexerModel object to reverse the string index to original values. MaxAbsScaler: Rescales the data to be within the [-1, 1] range (thus, does not shift the center of the data). MinMaxScaler: Similar to the MaxAbsScaler with the difference that it scales the data to be in the [0.0, 1.0] range. NGram: The method that takes a list of tokenized text and returns n-grams: pairs, triples, or n-mores of subsequent words. For example, if you had a ['good', 'morning', 'Robin', 'Williams'] vector you would get the following output: ['good morning', 'morning Robin', 'Robin Williams']. Normalizer: A method that scales the data to be of unit norm using the p-norm value (by default, it is L2). OneHotEncoder: A method that encodes a categorical column to a column of binary vectors. PCA: Performs the data reduction using principal component analysis. PolynomialExpansion: Performs a polynomial expansion of a vector. For example, if you had a vector symbolically written as [x, y, z], the method would produce the following expansion: [x, x*x, y, x*y, y*y, z, x*z, y*z, z*z]. QuantileDiscretizer: Similar to the Bucketizer method, but instead of passing the splits parameter you pass the numBuckets one. The method then decides, by calculating approximate quantiles over your data, what the splits should be. RegexTokenizer: String tokenizer using regular expressions. RFormula: For those of you who are avid R users - you can pass a formula such as vec ~ alpha * 3 + beta (assuming your DataFrame has the alpha and beta columns) and it will produce the vec column given the expression. SQLTransformer: Similar to the previous, but instead of R-like formulas you can use SQL syntax. The FROM statement should be selecting from __THIS__ indicating you are accessing the DataFrame. For example: SELECT alpha * 3 + beta AS vec FROM __THIS__. StandardScaler: Standardizes the column to have 0 mean and standard deviation equal to 1. StopWordsRemover: Removes stop words (such as 'the' or 'a') from a tokenized text. StringIndexer: Given a list of all the words in a column, it will produce a vector of indices. Tokenizer: Default tokenizer that converts the string to lower case and then splits on space(s). VectorAssembler: A highly useful transformer that collates multiple numeric (vectors included) columns into a single column with a vector representation. For example, if you had three columns in your DataFrame: df = spark.createDataFrame( [(12, 10, 3), (1, 4, 2)], ['a', 'b', 'c']) The output of calling: ft.VectorAssembler(inputCols=['a', 'b', 'c'], outputCol='features') .transform(df) .select('features') .collect() It would look as follows: [Row(features=DenseVector([12.0, 10.0, 3.0])), Row(features=DenseVector([1.0, 4.0, 2.0]))] VectorIndexer: A method for indexing categorical column into a vector of indices. 
It works in a column-by-column fashion, selecting distinct values from the column, sorting and returning an index of the value from the map instead of the original value. VectorSlicer: Works on a feature vector, either dense or sparse: given a list of indices it extracts the values from the feature vector. Word2Vec: The method takes a sentence (string) as an input and transforms it into a map of {string, vector} format, a representation that is useful in natural language processing. Note that there are many methods in the ML package that have an E letter next to it; this means the method is currently in beta (or Experimental) and it sometimes might fail or produce erroneous results. Beware. Estimators Estimators can be thought of as statistical models that need to be estimated to make predictions or classify your observations. If deriving from the abstract Estimator class, the new model has to implement the .fit(...) method that fits the model given the data found in a DataFrame and some default or user-specified parameters. There are a lot of estimators available in PySpark and we will now shortly describe the models available in Spark 2.0. Classification The ML package provides a data scientist with seven classification models to choose from. These range from the simplest ones (such as Logistic Regression) to more sophisticated ones. We will provide short descriptions of each of them in the following section: LogisticRegression: The benchmark model for classification. The logistic regression uses logit function to calculate the probability of an observation belonging to a particular class. At the time of writing, the PySpark ML supports only binary classification problems. DecisionTreeClassifier: A classifier that builds a decision tree to predict a class for an observation. Specifying the maxDepth parameter limits the depth the tree grows, the minInstancePerNode determines the minimum number of observations in the tree node required to further split, the maxBins parameter specifies the maximum number of bins the continuous variables will be split into, and the impurity specifies the metric to measure and calculate the information gain from the split. GBTClassifier: A Gradient Boosted Trees classification model for classification. The model belongs to the family of ensemble models: models that combine multiple weak predictive models to form a strong one. At the moment the GBTClassifier model supports binary labels, and continuous and categorical features. RandomForestClassifier: The models produce multiple decision trees (hence the name - forest) and use the mode output of those decision trees to classify observations. The RandomForestClassifier supports both binary and multinomial labels. NaiveBayes: Based on the Bayes' theorem, this model uses conditional probability theory to classify observations. The NaiveBayes model in PySpark ML supports both binary and multinomial labels. MultilayerPerceptronClassifier: A classifier that mimics the nature of a human brain. Deeply rooted in the Artificial Neural Networks theory, the model is a black-box, that is, it is not easy to interpret the internal parameters of the model. 
The model consists, at a minimum, of three, fully connected layers (a parameter that needs to be specified when creating the model object) of artificial neurons: the input layer (that needs to be equal to the number of features in your dataset), a number of hidden layers (at least one), and an output layer with the number of neurons equal to the number of categories in your label. All the neurons in the input and hidden layers have a sigmoid activation function, whereas the activation function of the neurons in the output layer is softmax. OneVsRest: A reduction of a multiclass classification to a binary one. For example, in the case of a multinomial label, the model can train multiple binary logistic regression models. For example, if label == 2 the model will build a logistic regression where it will convert the label == 2 to 1 (or else label values would be set to 0) and then train a binary model. All the models are then scored and the model with the highest probability wins. Regression There are seven models available for regression tasks in the PySpark ML package. As with classification, these range from some basic ones (such as obligatory Linear Regression) to more complex ones: AFTSurvivalRegression: Fits an Accelerated Failure Time regression model; It is a parametric model that assumes that a marginal effect of one of the features accelerates or decelerates a life expectancy (or process failure). It is highly applicable for the processes with well-defined stages. DecisionTreeRegressor: Similar to the model for classification with an obvious distinction that the label is continuous instead of binary (or multinomial). GBTRegressor: As with the DecisionTreeRegressor, the difference is the data type of the label. GeneralizedLinearRegression: A family of linear models with differing kernel functions (link functions). In contrast to the linear regression that assumes normality of error terms, the GLM allows the label to have different error term distributions: the GeneralizedLinearRegression model from the PySpark ML package supports gaussian, binomial, gamma, and poisson families of error distributions with a host of different link functions. IsotonicRegression: A type of regression that fits a free-form, non-decreasing line to your data. It is useful to fit the datasets with ordered and increasing observations. LinearRegression: The most simple of regression models, assumes linear relationship between features and a continuous label, and normality of error terms. RandomForestRegressor: Similar to either DecisionTreeRegressor or GBTRegressor, the RandomForestRegressor fits a continuous label instead of a discrete one. Clustering Clustering is a family of unsupervised models that is used to find underlying patterns in your data. The PySpark ML package provides four most popular models at the moment: BisectingKMeans: A combination of k-means clustering method and hierarchical clustering. The algorithm begins with all observations in a single cluster and iteratively splits the data into k clusters. Check out this website for more information on pseudo-algorithm: http://minethedata.blogspot.com/2012/08/bisecting-k-means.html. KMeans: It is the famous k-mean algorithm that separates data into k clusters, iteratively searching for centroids that minimize the sum of square distances between each observation and the centroid of the cluster it belongs to. GaussianMixture: This method uses k Gaussian distributions with unknown parameters to dissect the dataset. 
Using the Expectation-Maximization algorithm, the parameters for the Gaussians are found by maximizing the log-likelihood function. Beware that for datasets with many features this model might perform poorly due to the curse of dimensionality and numerical issues with Gaussian distributions. LDA: This model is used for topic modeling in natural language processing applications. There is also one recommendation model available in PySpark ML, but I will refrain from describing it here. Pipeline A Pipeline in PySpark ML is a concept of an end-to-end transformation-estimation process (with distinct stages) that ingests some raw data (in a DataFrame form), performs necessary data carpentry (transformations), and finally estimates a statistical model (estimator). A Pipeline can be purely transformative, that is, consisting of Transformers only. A Pipeline can be thought of as a chain of multiple discrete stages. When a .fit(...) method is executed on a Pipeline object, all the stages are executed in the order they were specified in the stages parameter; the stages parameter is a list of Transformer and Estimator objects. The .fit(...) method of the Pipeline object executes the .transform(...) method for the Transformers and the .fit(...) method for the Estimators. Normally, the output of a preceding stage becomes the input for the following stage: when deriving from either the Transformer or Estimator abstract classes, one needs to implement the .getOutputCol() method that returns the value of the outputCol parameter specified when creating an object. Predicting chances of infant survival with ML In this section, we will use the portion of the dataset to present the ideas of PySpark ML. If you have not yet downloaded the data, it can be accessed here: http://www.tomdrabas.com/data/LearningPySpark/births_transformed.csv.gz. In this section, we will, once again, attempt to predict the chances of the survival of an infant. Loading the data First, we load the data with the help of the following code: import pyspark.sql.types as typ labels = [ ('INFANT_ALIVE_AT_REPORT', typ.IntegerType()), ('BIRTH_PLACE', typ.StringType()), ('MOTHER_AGE_YEARS', typ.IntegerType()), ('FATHER_COMBINED_AGE', typ.IntegerType()), ('CIG_BEFORE', typ.IntegerType()), ('CIG_1_TRI', typ.IntegerType()), ('CIG_2_TRI', typ.IntegerType()), ('CIG_3_TRI', typ.IntegerType()), ('MOTHER_HEIGHT_IN', typ.IntegerType()), ('MOTHER_PRE_WEIGHT', typ.IntegerType()), ('MOTHER_DELIVERY_WEIGHT', typ.IntegerType()), ('MOTHER_WEIGHT_GAIN', typ.IntegerType()), ('DIABETES_PRE', typ.IntegerType()), ('DIABETES_GEST', typ.IntegerType()), ('HYP_TENS_PRE', typ.IntegerType()), ('HYP_TENS_GEST', typ.IntegerType()), ('PREV_BIRTH_PRETERM', typ.IntegerType()) ] schema = typ.StructType([ typ.StructField(e[0], e[1], False) for e in labels ]) births = spark.read.csv('births_transformed.csv.gz', header=True, schema=schema) We specify the schema of the DataFrame; our severely limited dataset now only has 17 columns. Creating transformers Before we can use the dataset to estimate a model, we need to do some transformations. Since statistical models can only operate on numeric data, we will have to encode the BIRTH_PLACE variable. Before we do any of this, since we will use a number of different feature transformations. Let's import them all: import pyspark.ml.feature as ft To encode the BIRTH_PLACE column, we will use the OneHotEncoder method. 
However, the method cannot accept StringType columns - it can only deal with numeric types so first we will cast the column to an IntegerType: births = births .withColumn( 'BIRTH_PLACE_INT', births['BIRTH_PLACE'] .cast(typ.IntegerType())) Having done this, we can now create our first Transformer: encoder = ft.OneHotEncoder( inputCol='BIRTH_PLACE_INT', outputCol='BIRTH_PLACE_VEC') Let's now create a single column with all the features collated together. We will use the VectorAssembler method: featuresCreator = ft.VectorAssembler( inputCols=[ col[0] for col in labels[2:]] + [encoder.getOutputCol()], outputCol='features' ) The inputCols parameter passed to the VectorAssembler object is a list of all the columns to be combined together to form the outputCol - the 'features'. Note that we use the output of the encoder object (by calling the .getOutputCol() method), so we do not have to remember to change this parameter's value should we change the name of the output column in the encoder object at any point. It's now time to create our first estimator. Creating an estimator In this example, we will (once again) use the Logistic Regression model. However, we will showcase some more complex models from the .classification set of PySpark ML models, so we load the whole section: import pyspark.ml.classification as cl Once loaded, let's create the model by using the following code: logistic = cl.LogisticRegression( maxIter=10, regParam=0.01, labelCol='INFANT_ALIVE_AT_REPORT') We would not have to specify the labelCol parameter if our target column had the name 'label'. Also, if the output of our featuresCreator would not be called 'features' we would have to specify the featuresCol by (most conveniently) calling the getOutputCol() method on the featuresCreator object. Creating a pipeline All that is left now is to create a Pipeline and fit the model. First, let's load the Pipeline from the ML package: from pyspark.ml import Pipeline Creating Pipeline is really easy. Here's how our pipeline should look like conceptually: Converting this structure into a Pipeline is a walk in the park: pipeline = Pipeline(stages=[ encoder, featuresCreator, logistic ]) That's it! Our pipeline is now created so we can (finally!) estimate the model. Fitting the model Before you fit the model we need to split our dataset into training and testing datasets. Conveniently, the DataFrame API has the .randomSplit(...) method: births_train, births_test = births .randomSplit([0.7, 0.3], seed=666) The first parameter is a list of dataset proportions that should end up in, respectively, births_train and births_test subsets. The seed parameter provides a seed to the randomizer. You can also split the dataset into more than two subsets as long as the elements of the list sum up to 1, and you unpack the output into as many subsets. For example, we could split the births dataset into three subsets like this: train, test, val = births. randomSplit([0.7, 0.2, 0.1], seed=666) The preceding code would put a random 70% of the births dataset into the train object, 20% would go to the test, and the val DataFrame would hold the remaining 10%. Now it is about time to finally run our pipeline and estimate our model: model = pipeline.fit(births_train) test_model = model.transform(births_test) The .fit(...) method of the pipeline object takes our training dataset as an input. Under the hood, the births_train dataset is passed first to the encoder object. 
The DataFrame that is created at the encoder stage then gets passed to the featuresCreator that creates the 'features' column. Finally, the output from this stage is passed to the logistic object that estimates the final model. The .fit(...) method returns the PipelineModel object (the model object in the preceding snippet) that can then be used for prediction; we obtain predictions by calling the .transform(...) method and passing the testing dataset created earlier. Here's what test_model looks like when we run the following command: test_model.take(1) It generates the following output: As you can see, we get all the columns from the Transformers and Estimators. The logistic regression model outputs several columns: rawPrediction is the value of the linear combination of features and the β coefficients, probability is the calculated probability for each of the classes, and finally, prediction, which is our final class assignment. Evaluating the performance of the model Obviously, we would now like to test how well our model did. PySpark exposes a number of evaluation methods for classification and regression in the .evaluation section of the package: import pyspark.ml.evaluation as ev We will use the BinaryClassificationEvaluator to test how well our model performed: evaluator = ev.BinaryClassificationEvaluator( rawPredictionCol='probability', labelCol='INFANT_ALIVE_AT_REPORT') The rawPredictionCol can either be the rawPrediction column produced by the estimator or the probability column. Let's see how well our model performed: print(evaluator.evaluate(test_model, {evaluator.metricName: 'areaUnderROC'})) print(evaluator.evaluate(test_model, {evaluator.metricName: 'areaUnderPR'})) The preceding code produces the following result: The area under the ROC of 74% and the area under the PR curve of 71% indicate a reasonably well-performing model, but nothing extraordinary; if we had other features, we could drive this up. Saving the model PySpark allows you to save the Pipeline definition for later use. It not only saves the pipeline structure, but also all the definitions of all the Transformers and Estimators: pipelinePath = './infant_oneHotEncoder_Logistic_Pipeline' pipeline.write().overwrite().save(pipelinePath) So, you can load it up later and use it straight away to .fit(...) and predict: loadedPipeline = Pipeline.load(pipelinePath) loadedPipeline.fit(births_train).transform(births_test).take(1) The preceding code produces the same result (as expected): Summary In this article, we studied the ML package. We explained what Transformers and Estimators are, and showed their role in another concept introduced in the ML library: the Pipeline. Subsequently, we also presented how to use some of the methods to fine-tune the hyperparameters of models. Finally, we gave some examples of how to use some of the feature extractors and models from the library. Resources for Article: Further resources on this subject: Package Management [article] Everything in a Package with concrete5 [article] Writing a Package in Python [article]
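The hyperparameter fine-tuning mentioned in the summary is not reproduced in this excerpt, so here is a minimal, hedged sketch of how such a grid search is commonly wired up with pyspark.ml.tuning, reusing the pipeline, logistic, evaluator, births_train, and births_test objects defined above; the grid values and the fold count are arbitrary illustrations, not recommendations from the book.

# A minimal grid-search sketch with pyspark.ml.tuning.
# Assumes `pipeline`, `logistic`, `evaluator`, `births_train`, and
# `births_test` exist as created earlier; the parameter values and the
# number of folds are illustrative only.
import pyspark.ml.tuning as tune

grid = tune.ParamGridBuilder() \
    .addGrid(logistic.maxIter, [2, 10, 50]) \
    .addGrid(logistic.regParam, [0.01, 0.05, 0.3]) \
    .build()

cv = tune.CrossValidator(
    estimator=pipeline,          # the whole Pipeline is the estimator
    estimatorParamMaps=grid,
    evaluator=evaluator,         # the BinaryClassificationEvaluator from above
    numFolds=3)

cvModel = cv.fit(births_train)

# Score the held-out data with the best model found during the search
results = cvModel.transform(births_test)
print(evaluator.evaluate(results, {evaluator.metricName: 'areaUnderROC'}))

Passing the whole Pipeline as the estimator keeps the encoding and feature assembly inside each cross-validation fold, which avoids leaking information from the held-out fold into the transformations.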
Visualization Dashboard Design

Packt
10 Jan 2017
18 min read
In this article by David Baldwin, the author of the book Mastering Tableau, we will cover how you need to create some effective dashboards. (For more resources related to this topic, see here.) Since that fateful week in Manhattan, I've read Edward Tufte, Stephen Few, and other thought leaders in the data visualization space. This knowledge has been very fruitful. For instance, quite recently a colleague told me that one of his clients thought a particular dashboard had too many bar charts and he wanted some variation. I shared the following two quotes: Show data variation, not design variation. –Edward Tufte in The Visual Display of Quantitative Information Variety might be the spice of life, but, if it is introduced on a dashboard for its own sake, the display suffers. –Stephen Few in Information Dashboard Design Those quotes proved helpful for my colleague. Hopefully the following information will prove helpful to you. Additionally I would also like to draw attention to Alberto Cairo—a relatively new voice providing new insight. Each of these authors should be considered a must-read for anyone working in data visualization. Visualization design theory Dashboard design Sheet selection Visualization design theory Any discussion on designing dashboards should begin with information about constructing well-designed content. The quality of the dashboard layout and the utilization of technical tips and tricks do not matter if the content is subpar. In other words we should consider the worksheets displayed on dashboards and ensure that those worksheets are well-designed. Therefore, our discussion will begin with a consideration of visualization design principles. Regarding these principles, it's tempting to declare a set of rules such as: To plot change over time, use a line graph To show breakdowns of the whole, use a treemap To compare discrete elements, use a bar chart To visualize correlation, use a scatter plot But of course even a cursory review of the preceding list brings to mind many variations and alternatives! Thus, we will consider various rules while always keeping in mind that rules (at least rules such as these) are meant to be broken. Formatting rules The following formatting rules encompass fonts, lines, and bands. Fonts are, of course, an obvious formatting consideration. Lines and bands, however, may not be something you typically think of when formatting, especially when considering formatting from the perspective of Microsoft Word. But if we broaden formatting considerations to think of Adobe Illustrator, InDesign, and other graphic design tools, then lines and bands are certainly considered. This illustrates that data visualization is closely related to graphic design and that formatting considers much more than just textual layout. Rule – keep the font choice simple Typically using one or two fonts on a dashboard is advisable. More fonts can create a confusing environment and interfere with readability. Fonts chosen for titles should be thick and solid while the body fonts should be easy to read. As of Tableau 10.0 choosing appropriate fonts is simple because of the new Tableau Font Family. Go to Format | Font to display the Format Font window to see and choose these new fonts: Assuming your dashboard is primarily intended for the screen, sans serif fonts are best. On the rare occasions a dashboard is primarily intended for print, you may consider serif fonts; particularly if the print resolution is high. 
Rule – Trend line > Fever line > Reference line > Drop line > Zero line > Grid line The preceding pseudo formula is intended to communicate line visibility. For example, trend line visibility should be greater than fever line visibility. Visibility is usually enhanced by increasing line thickness but may be enhanced via color saturation or by choosing a dotted or dashed line over a solid line. The trend line, if present, is usually the most visible line on the graph. Trend lines are displayed via the Analytics pane and can be adjusted via Format à Lines. The fever line (for example, the line used on a time-series chart) should not be so heavy as to obscure twists and turns in the data. Although a fever line may be displayed as dotted or dashed by utilizing the Pages shelf, this is usually not advisable because it may obscure visibility. The thickness of a fever line can be adjusted by clicking on the Size shelf in the Marks View card. Reference lines are usually less prevalent than either fever or trend lines and can be formatted by going to Format | Reference lines. Drop lines are not frequently used. To deploy drop lines, right-click in a blank portion of the view and go to Drop lines | Show drop lines. Next, click on a point in the view to display a drop line. To format droplines, go to Format | Droplines. Drop lines are relevant only if at least one axis is utilized in the visualization. Zero lines (sometimes referred to as base lines) display only if zero or negative values are included in the view or positive numerical values are relatively close to zero. Format zero lines by going to Format | Lines. Grid lines should be the most muted lines on the view and may be dispensed with altogether. Format grid lines by going to Format | Lines. Rule – band in groups of three to five Visualizations comprised of a tall table of text or horizontal bars should segment dimension members in groups of three to five. Exercise – banding Navigate to https://public.tableau.com/profile/david.baldwin#!/ to locate and download the workbook. Navigate to the worksheet titled Banding. Select the Superstore data source and place Product Name on the Rows shelf. Double-click on Discount, Profit, Quantity, and Sales. Navigate to Format | Shading and set Band Size under Row Banding so that three to five lines of text are encompassed by each band. Be sure to set an appropriate color for both Pane and Header: Note that after completing the preceding five steps, Tableau defaulted to banding every other row. This default formatting is fine for a short table but is quite busy for a tall table. The band in groups of three to five rule is influenced by Dona W. Wong, who, in her book The Wall Street Journal Guide to Information Graphics, recommends separating long tables or bar charts with thin rules to separate the bars in groups of three to five to help the readers read across. Color rules It seems slightly ironic to discuss color rules in a black-and-white publication such as Mastering Tableau. Nonetheless, even in a monochromatic setting, a discussion of color is relevant. For example, exclusive use of black text communicates differently than using variations of gray. The following survey of color rules should be helpful to ensure that you use colors effectively in a variety of settings. Rule – keep colors simple and limited Stick to the basic hues and provide only a few (perhaps three to five) hue variations. 
Alberto Cairo, in his book The Functional Art: An Introduction to Information Graphics and Visualization, provides insights into why this is important. The limited capacity of our visual working memory helps explain why it's not advisable to use more than four or five colors or pictograms to identify different phenomena on maps and charts. Rule – respect the psychological implication of colors In Western society, there is a color vocabulary so pervasive, it's second nature. Exit signs marking stairwell locations are red. Traffic cones are orange. Baby boys are traditionally dressed in blue while baby girls wear pink. Similarly, in Tableau reds and oranges should usually be associated with negative performance while blues and greens should be associated with positive performance. Using colors counterintuitively can cause confusion. Rule – be colorblind-friendly Colorblindness is usually manifested as an inability to distinguish red and green or blue and yellow. Red/green and blue/yellow are on opposite sides of the color wheel. Consequently, the challenges these color combinations present for colorblind individuals can be easily recreated with image editing software such as Photoshop. If you are not colorblind, convert an image with these color combinations to grayscale and observe. The challenge presented to the 8.0% of the males and 0.5% of the females who are color blind becomes immediately obvious! Rule – use pure colors sparingly The resulting colors from the following exercise should be a very vibrant red, green, and blue. Depending on the monitor, you may even find it difficult to stare directly at the colors. These are known as pure colors and should be used sparingly; perhaps only to highlight particularly important items. Exercise – using pure colors Open the workbook and navigate to the worksheet entitled Pure Colors. Select the Superstore data source and place Category on both the Rows shelf and the Color shelf. Set the Fit to Entire View. Click on the Color shelf and choose Edit Colors…. In the Edit Colors dialog box, double-click on the color icons to the left of each dimension member; that is, Furniture, Office Supplies, and Technology: Within the resulting dialog box, set furniture to an HTML value of #0000ff, Office Supplies to #ff0000, and Technology to #00ff00. Rule – color variations over symbol variation Deciphering different symbols takes more mental energy for the end user than distinguishing color. Therefore color variation should be used over symbol variation. This rule can actually be observed in Tableau defaults. Create a scatter plot and place a dimension with many members on the Color shelf and Shape shelf respectively. Note that by default, the view will display 20 unique colors but only 10 unique shapes. Older versions of Tableau (such as Tableau 9.0) display warnings that include text such as “…the recommended maximum for this shelf is 10”: Visualization type rules We won't spend time here to delve into a lengthy list of visualization type rules. However, it does seem appropriate to review at least a couple of rules. In the following exercise, we will consider keeping shapes simple and effectively using pie charts. Rule – keep shapes simple Too many shape details impede comprehension. This is because shape details draw the user's focus away from the data. Consider the following exercise on using two different shopping cart images. Exercise – shapes Open the workbook associated and navigate to the worksheet entitled Simple Shopping Cart. 
Note that the visualization is a scatterplot showing the top 10 selling Sub-Categories in terms of total sales and profits. On your computer, navigate to the Shapes directory located in the My Tableau Repository. On my computer, the path is C:UsersDavid BaldwinDocumentsMy Tableau RepositoryShapes. Within the Shapes directory, create a folder named My Shapes. Reference the link included in the comment section of the worksheet to download the assets. In the downloaded material, find the images titled Shopping_Cart and Shopping_Cart_3D and copy those images into the My Shapes directory created previously. Within Tableau, access the Simple Shopping Cart worksheet. Click on the Shape shelf and then select More Shapes. Within the Edit Shape dialog box, click on the Reload Shapes button. Select the My Shapes palette and set the shape to the simple shopping cart. After closing the dialog box, click on the Size shelf and adjust as desired. Also adjust other aspects of the visualization as desired. Navigate to the 3D Shopping Cart worksheet and then repeat steps 8 to 11. Instead of using the simple shopping cart, use the 3D shopping cart: Compare the two visualizations. Which version of the shopping cart is more attractive? Likely the cart with the 3D look was your choice. Why not choose the more attractive image? Making visualizations attractive is only of secondary concern. The primary goal is to display the data as clearly and efficiently as possible. A simple shape is grasped more quickly and intuitively than a complex shape. Besides, the cuteness of the 3D image will quickly wear off. Rule – use pie charts sparingly Edward Tufte makes an acrid (and somewhat humorous) comment against the use of pie charts in his book The Visual Display of Quantitative Information. A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them. Given their low density and failure to order numbers along a visual dimension, pie charts should never be used. The present sentiment in data visualization circles is largely sympathetic to Tufte's criticism. There may, however, be some exceptions; that is, some circumstances where a pie chart is optimal. Consider the following visualization: Which of the four visualizations best demonstrates that A accounts for 25% of the whole? Clearly it is the pie chart! Therefore, perhaps it is fairer to refer to pie charts as limited and to use them sparingly as opposed to considering them inherently evil. Compromises In this section, we will transition from more or less strict rules to compromises. Often, building visualizations is a balancing act. It's common to encounter contradictory directions from books, blogs, consultants, and within organizations. One person may insist on utilizing every pixel of space while another urges simplicity and whitespace. One counsels a guided approach while another recommends building wide open dashboards that allow end users to discover their own path. Avant gardes may crave esoteric visualizations while those of a more conservative bent prefer to stay with the conventional. We now explore a few of the more common competing requests and suggests compromises. Make the dashboard simple versus make the dashboard robust Recently a colleague showed me a complex dashboard he had just completed. 
Although he was pleased that he had managed to get it working well, he felt the need to apologize by saying, “I know it's dense and complex but it's what the client wanted.” Occam's Razor encourages the simplest possible solution for any problem. For my colleague's dashboard, the simplest solution was rather complex. This is OK! Complexity in Tableau dashboarding need not be shunned. But a clear understanding of some basic guidelines can help the author intelligently determine how to compromise between demands for simplicity and demands for robustness. More frequent data updates necessitate simpler design. Some Tableau dashboards may be near-real-time. Third-party technology may be utilized to force a browser displaying a dashboard via Tableau Server to refresh every few minutes to ensure the absolute latest data displays. In such cases, the design should be quite simple. The end user must be able to see at a glance all pertinent data and should not use that dashboard for extensive analysis. Conversely, a dashboard that is refreshed monthly can support high complexity and thus may be used for deep exploration. Greater end user expertise supports greater dashboard complexity. Know thy users. If they want easy, at-a-glance visualizations, keep the dashboards simple. If they like deep dives, design accordingly. Smaller audiences require more precise design. If only a few people monitor a given dashboard, it may require a highly customized approach. In such cases, specifications may be detailed, complex, and difficult to execute and maintain because the small user base has expectations that may not be natively easy to produce in Tableau. Screen resolution and visualization complexity are proportional. Users with low-resolution devices will need to interact fairly simply with a dashboard. Thus the design of such a dashboard will likely be correspondingly uncomplicated. Conversely, high-resolution devices support greater complexity. Greater distance from the screen requires larger dashboard elements. If the dashboard is designed for conference room viewing, the elements on the dashboard may need to be fairly large to meet the viewing needs of those far from the screen. Thus the dashboard will likely be relatively simple. Conversely, a dashboard to be viewed primarily on end users desktops can be more complex. Although these points are all about simple versus complex, do not equate simple with easy. A simple and elegantly designed dashboard can be more difficult to create than a complex dashboard. In the words of Steve Jobs: Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains. Present dense information versus present sparse information Normally, a line graph should have a maximum of four to five lines. However, there are times when you may wish to display many lines. A compromise can be achieved by presenting many lines and empowering the end user to highlight as desired. The following line graph displays the percentage of Internet usage by country from 2000 to 2012. Those countries with the largest increases have been highlighted. Assuming that Highlight Selected Items has been activated within the Color legend, the end user can select items (countries in this case) from the legend to highlight as desired. 
Or, even better, a worksheet can be created listing all countries and used in conjunction with a highlight action on a dashboard to focus attention on selected items on the line graph: Tell a story versus allow a story to be discovered Albert Cairo, in his excellent book The Functional Art: An Introduction to Information Graphics and Visualization, includes a section where he interviews prominent data visualization and information graphics professionals. Two of these interviews are remarkable for their opposing views. I… feel that many visualization designers try to transform the user into an editor.  They create these amazing interactive tools with tons of bubbles, lines, bars, filters, and scrubber bars, and expect readers to figure the story out by themselves, and draw conclusions from the data. That's not an approach to information graphics I like. – Jim Grimwade The most fascinating thing about the rise of data visualization is exactly that anyone can explore all those large data sets without anyone telling us what the key insight is. – Moritz Stefaner Fortunately, the compromise position can be found in the Jim Grimwade interview: [The New York Times presents] complex sets of data, and they let you go really deep into the figures and their connections. But beforehand, they give you some context, some pointers as to what you can do with those data. If you don't do this… you will end up with a visualization that may look really beautiful and intricate, but that will leave readers wondering, What has this thing really told me? What is this useful for? – Jim Grimwade Although the case scenarios considered in the preceding quotes are likely quite different from the Tableau work you are involved in, the underlying principles remain the same. You can choose to tell a story or build a platform that allows the discovery of numerous stories. Your choice will differ depending on the given dataset and audience. If you choose to create a platform for story discovery, be sure to take the New York Times approach suggested by Grimwade. Provide hints, pointers, and good documentation to lead your end user to successfully interact with the story you wish to tell or successfully discover their own story. Document, Document, Document! But don't use any space! Immediately above we considered the suggestion Provide hints, pointers, and good documentation… but there's an issue. These things take space. Dashboard space is precious. Often Tableau authors are asked to squeeze more and more stuff on a dashboard and are hence looking for ways to conserve space. Here are some suggestions for maximizing documentation on a dashboard while minimally impacting screen real estate. Craft titles for clear communication Titles are expected. Not just a title for a dashboard and worksheets on the dashboard, but also titles for legends, filters and other objects. These titles can be used for effective and efficient documentation. For instance a filter should not just read Market. Instead it should say something like Select a Market. Notice the imperative statement. The user is being told to do something and this is a helpful hint. Adding a couple of words to a title will usually not impact dashboard space. Use subtitles to relay instructions A subtitle will take some extra space but it does not have to be much. A small, italicized font immediately underneath a title is an obvious place a user will look at for guidance. Consider an example: red represents loss. 
This short sentence could be used as a subtitle that may eliminate the need for a legend and thus actually save space. Use intuitive icons Consider a use case of navigating from one dashboard to another. Of course you could associate an action with some hyperlinked text stating Click here to navigate to another dashboard. But this seems quite unnecessary when an action can be associated with a small, innocuous arrow, such as is natively used in PowerPoint, to communicate the same thing. Store more extensive documentation in a tooltip associated with a help icon. A small question mark in the top-right corner of an application is common. This clearly communicates where to go if additional help is required. As shown in the following exercise, it's easy to create a similar feature on a Tableau dashboard. Summary Hence from this article we studied to create some effective dashboards that are very beneficial in corporate world as a statistical tool to calculate average growth in terms of revenue. Resources for Article: Further resources on this subject: Say Hi to Tableau [article] Tableau Data Extract Best Practices [article] Getting Started with Tableau Public [article]
Elastic Stack Overview

Packt
10 Jan 2017
9 min read
In this article by Ravi Kumar Gupta and Yuvraj Gupta, from the book, Mastering Elastic Stack, we will have an overview of Elastic Stack, it's very easy to read a log file of a few MBs or hundreds, so is it to keep data of this size in databases or files and still get sense out of it. But then a day comes when this data takes terabytes and petabytes, and even notepad++ would refuse to open a data file of a few hundred MBs. Then we start to find something for huge log management, or something that can index the data properly and make sense out of it. If you Google this, you would stumble upon ELK Stack. Elasticsearch manages your data, Logstash reads the data from different sources, and Kibana makes a fine visualization of it. Recently, ELK Stack has evolved as Elastic Stack. We will get to know about it in this article. The following are the points that will be covered in this article: Introduction to ELK Stack The birth of Elastic Stack Who uses the Stack (For more resources related to this topic, see here.) Introduction to ELK Stack It all began with Shay Banon, who started it as an open source project, Elasticsearch, successor of Compass, which gained popularity to become one of the top open source database engines. Later, based on the distributed model of working, Kibana was introduced to visualize the data present in Elasticsearch. Earlier, to put data into Elasticsearch we had rivers, which provided us with a specific input via which we inserted data into Elasticsearch. However, with growing popularity this setup required a tool via which we can insert data into Elasticsearch and have flexibility to perform various transformations on data to make unstructured data structured to have full control on how to process the data. Based on this premise, Logstash was born, which was then incorporated into the Stack, and together these three tools, Elasticsearch, Logstash, and Kibana were named ELK Stack. The following diagram is a simple data pipeline using ELK Stack: As we can see from the preceding figure, data is read using Logstash and indexed to Elasticsearch. Later we can use Kibana to read the indices from Elasticsearch and visualize it using charts and lists. Let's understand these components separately and the role they play in the making of the Stack. Logstash As we got to know that rivers were used initially to put data into Elasticsearch before ELK Stack. For ELK Stack, Logstash is the entry point for all types of data. Logstash has so many plugins to read data from a number of sources and so many output plugins to submit data to a variety of destinations and one of those is the Elasticsearch plugin, which helps to send data to Elasticsearch. After Logstash became popular, eventually rivers got deprecated as they made the cluster unstable and also performance issues were observed. Logstash does not ship data from one end to another; it helps us with collecting raw data and modifying/filtering it to convert it to something meaningful, formatted, and organized. The updated data is then sent to Elasticsearch. If there is no plugin available to support reading data from a specific source, or writing the data to a location, or modifying it in your way, Logstash is flexible enough to allow you to write your own plugins. Simply put, Logstash is open source, highly flexible, rich with plugins, can read your data from your choice of location, normalizes it as per your defined configurations, and sends it to a particular destination as per the requirements. 
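Logstash pipelines are defined in its own configuration files (input, filter, and output plugin blocks) rather than in a general-purpose language. Purely as a conceptual illustration of the collect, parse, and ship flow described above, the following Python sketch reads a log file, normalizes each line, and indexes it into Elasticsearch with the official elasticsearch client; the log format, file name, and index name are hypothetical, and none of this is how Logstash itself is implemented or configured.

# Conceptual illustration only: what a collect -> parse -> ship pipeline does.
# Logstash does this through its input/filter/output plugins; this sketch just
# mimics the idea with the `elasticsearch` Python client. The log format, file
# name, and index name are hypothetical.
import re
from datetime import datetime
from elasticsearch import Elasticsearch

LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3})')

es = Elasticsearch(['http://localhost:9200'])

with open('access.log') as handle:
    for raw in handle:
        match = LOG_LINE.match(raw)
        if not match:
            continue                      # drop lines we cannot parse
        doc = match.groupdict()
        doc['status'] = int(doc['status'])
        doc['@timestamp'] = datetime.strptime(
            doc.pop('ts'), '%d/%b/%Y:%H:%M:%S %z').isoformat()
        # Ship the normalized event; older client versions also expect a
        # doc_type argument here.
        es.index(index='weblogs', body=doc)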
Elasticsearch All of the data read by Logstash is sent to Elasticsearch for indexing. There is a lot more than just indexing. Elasticsearch is not only used to index data, but it is a full-text search engine, highly scalable, distributed, and offers many more things. Elasticsearch manages and maintains your data in the form of indices, offers you to query, access, and aggregate the data using its APIs. Elasticsearch is based on Lucene, thus providing you all of the features that Lucene does. Kibana Kibana uses Elasticsearch APIs to read/query data from Elasticsearch indices to visualize and analyze in the form of charts, graphs and tables. Kibana is in the form of a web application, providing you a highly configurable user interface that lets you query the data, create a number of charts to visualize, and make actual sense out of the data stored. After a robust ELK Stack, as time passed, a few important and complex demands took place, such as authentication, security, notifications, and so on. This demand led for few other tools such as Watcher (providing alerting and notification based on changes in data), Shield (authentication and authorization for securing clusters), Marvel (monitoring statistics of the cluster), ES-Hadoop, Curator, and Graph as requirement arose. The birth of Elastic Stack All the jobs of reading data were done using Logstash, but that's resource consuming. Since Logstash runs on JVM, it consumes a good amount of memory. The community realized the need of improvement and to make the pipelining process resource friendly and lightweight. Back in 2015, Packetbeat was born, a project which was an effort to make a network packet analyzer that could read from different protocols, parse the data, and ship to Elasticsearch. Being lightweight in nature did the trick and a new concept of Beats was formed. Beats are written in Go programming language. The project evolved a lot, and now ELK stack was no more than just Elasticsearch, Logstash, and Kibana, but Beats also became a significant component. The pipeline now looked as follows: Beat A Beat reads data, parses it, and can ship to either Elasticsearch or Logstash. The difference is that they are lightweight, serve a specific purpose, and are installed as agents. There are few beats available such as Topbeat, Filebeat, Packetbeat, and so on, which are supported and provided by the Elastic.co and a good number of Beats already written by the community. If you have a specific requirement, you can write your own Beat using the libbeat library. In simple words, Beats can be treated as very light weight agents to ship data to either Logstash or Elasticsearch and offer you an infrastructure using the libbeat library to create your own Beats. Together Elasticsearch, Logstash, Kibana, and Beats become Elastic Stack, formally known as ELK Stack. Elastic Stack did not just add Beats to its team, but they will be using the same version always. The starting version of the Elastic Stack will be 5.0.0 and the same version will apply to all the components. This version and release method is not only for Elastic Stack, but for other tools of the Elastic family as well. Due to so many tools, there was a problem of unification, wherein each tool had their own version and every version was not compatible with each other, hence leading to a problem. To solve this, now all of the tools will be built, tested, and released together. All of these components play a significant role in creating a pipeline. 
While Beats and Logstash are used to collect the data, parse it, and ship it, Elasticsearch creates indices, which is finally used by Kibana to make visualizations. While Elastic Stack helps with a pipeline, other tools add security, notifications, monitoring, and other such capabilities to the setup. Who uses Elastic Stack? In the past few years, implementations of Elastic Stack have been increasing very rapidly. In this section, we will consider a few case studies to understand how Elastic Stack has helped this development. Salesforce Salesforce developed a new plugin named ELF (Event Log Files) to collect Salesforce logged data to enable auditing of user activities. The purpose was to analyze the data to understand user behavior and trends in Salesforce. The plugin is available on GitHub at https://github.com/developerforce/elf_elk_docker. This plugin simplifies the Stack configuration and allows us to download ELF to get indexed and finally sensible data that can be visualized using Kibana. This implementation utilizes Elasticsearch, Logstash, and Kibana. CERN There is not just one use case that Elastic Stack helped CERN (European Organization for Nuclear Research), but five. At CERN, Elastic Stack is used for the following: Messaging Data monitoring Cloud benchmarking Infrastructure monitoring Job monitoring Multiple Kibana dashboards are used by CERN for a number of visualizations. Green Man Gaming This is an online gaming platform where game providers publish their games. The website wanted to make a difference by proving better gameplay. They started using Elastic Stack to do log analysis, search, and analysis of gameplay data. They began with setting up Kibana dashboards to gain insights about the counts of gamers by the country and currency used by gamers. That helped them to understand and streamline the support and help in order to provide an improved response. Apart from these case studies, Elastic Stack is used by a number of other companies to gain insights of the data they own. Sometimes, not all of the components are used, that is, not all of the times a Beat would be used and Logstash would be configured. Sometimes, only an Elasticsearch and Kibana combination is used. If we look at the users within the organization, all of the titles who are expected to do big data analysis, business intelligence, data visualizations, log analysis, and so on, can utilize Elastic Stack for their technical forte. A few of these titles are data scientists, DevOps, and so on. Stack competitors Well, it would be wrong to call for Elastic Stack competitors because Elastic Stack has been emerged as a strong competitor to many other tools in the market in recent years and is growing rapidly. 
A few of these are:
Open source:
Graylog: visit https://www.graylog.org/ for more information
InfluxDB: visit https://influxdata.com/ for more information
Others:
Logscape: visit http://logscape.com/ for more information
Logsene: visit http://sematext.com/logsene/ for more information
Splunk: visit http://www.splunk.com/ for more information
Sumo Logic: visit https://www.sumologic.com/ for more information
Kibana competitors:
Grafana: visit http://grafana.org/ for more information
Graphite: visit https://graphiteapp.org/ for more information
Elasticsearch competitors:
Lucene/Solr: visit http://lucene.apache.org/solr/ or https://lucene.apache.org/ for more information
Sphinx: visit http://sphinxsearch.com/ for more information
Most of these tools compete on log management, while Elastic Stack is much more than that: it gives you the ability to analyze any type of data, not just logs. Resources for Article: Further resources on this subject: AIO setup of OpenStack – preparing the infrastructure code environment [article] Our App and Tool Stack [article] Using the OpenStack Dashboard [article]
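To make the earlier point about Kibana concrete (its charts are ultimately built on Elasticsearch queries and aggregations), here is a rough, hedged sketch of the kind of aggregation request a bar chart might issue, sent with the official elasticsearch Python client; the client choice, the weblogs index name, and the field name are my assumptions and are not taken from the article.

# A rough sketch of the sort of query a Kibana visualization issues under the
# hood. The index name `weblogs` and the field name are hypothetical; the
# body-style call matches older versions of the Python client.
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

response = es.search(
    index='weblogs',
    body={
        'size': 0,                      # we only want the aggregation buckets
        'aggs': {
            'hits_per_country': {
                'terms': {'field': 'geoip.country_name', 'size': 10}
            }
        }
    })

for bucket in response['aggregations']['hits_per_country']['buckets']:
    print(bucket['key'], bucket['doc_count'])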
Installing and Using Vue.js

Packt
10 Jan 2017
14 min read
In this article by Olga Filipova, the author of the book Learning Vue.js 2, we explore the key concepts of the Vue.js framework to understand what goes on behind the scenes. We will also analyze all possible ways of installing Vue.js and learn the ways of debugging and testing our applications. (For more resources related to this topic, see here.) So, in this article we are going to learn: What the MVVM architectural paradigm is and how it applies to Vue.js How to install, start, run, and debug a Vue application MVVM architectural pattern Do you remember how to create the Vue instance? We were instantiating it by calling new Vue({…}). You also remember that in the options we were passing the element on the page where this Vue instance should be bound and the data object that contained the properties we wanted to bind to our view. The data object is our model, and the DOM element where the Vue instance is bound is our view. Classic View-Model representation where Vue instance binds one to another Meanwhile, our Vue instance is what binds our model to the view and vice versa. Our application thus follows the Model-View-ViewModel (MVVM) pattern, where the Vue instance is a ViewModel. The simplified diagram of Model View ViewModel pattern Our Model contains data and some business logic, and our View is responsible for its representation. The ViewModel handles data binding, ensuring that data changed in the Model immediately affects the View layer and vice versa. Our Views thus become completely data-driven. The ViewModel becomes responsible for the control of data flow, making data binding fully declarative for us. Installing, using, and debugging a Vue.js application In this section, we will analyze all possible ways of installing Vue.js. We will also create a skeleton for our application. We will also learn the ways of debugging and testing our applications. Installing Vue.js There are a number of ways to install Vue.js, ranging from the classic approach of including the downloaded script in HTML within <script> tags to using tools such as Bower, npm, or Vue's command-line interface (vue-cli) to bootstrap the whole application. Let's have a look at all these methods and choose our favorite. In all these examples we will just show a header on a page saying Learning Vue.js. Standalone Download the vue.js file. There are two versions: a minified one and a developer version. The development version is here: https://vuejs.org/js/vue.js. The minified version is here: https://vuejs.org/js/vue.min.js. If you are developing, make sure you use the development, non-minified version of Vue. You will love the nice tips and warnings on the console. Then just include vue.js in the script tags: <script src="vue.js"></script> Vue is registered as a global variable. You are ready to use it. Our example will then look as simple as the following: <div id="app"> <h1>{{ message }}</h1> </div> <script src="vue.js"></script> <script> var data = { message: "Learning Vue.js" }; new Vue({ el: "#app", data: data }); </script> CDN Vue.js is available on the following CDNs: jsdelivr: https://cdn.jsdelivr.net/vue/1.0.25/vue.min.js cdnjs: https://cdnjs.cloudflare.com/ajax/libs/vue/1.0.25/vue.min.js npmcdn: https://npmcdn.com/vue@1.0.25/dist/vue.min.js Just put the URL in the src attribute of the script tag and you are ready to use Vue! <script src="https://cdnjs.cloudflare.com/ajax/libs/vue/1.0.25/vue.min.js"></script> Be aware, though, that the CDN version might not be synchronized with the latest available version of Vue. 
Thus, the example will look like exactly the same as in the standalone version, but instead of using downloaded file in the <script> tags, we are using a CDN URL. Bower If you are already managing your application with bower and don't want to use other tools, there's also a bower distribution of Vue. Just call bower install: # latest stable release bower install vue Our example will look exactly like the two previous examples, but it will include the file from the bower folder: <script src=“bower_components/vue/dist/vue.js”></script> CSP-compliant CSP (content security policy) is a security standard that provides a set of rules that must be obeyed by the application in order to prevent security attacks. If you are developing applications for browsers, more likely you know pretty well about this policy! For the environments that require CSP-compliant scripts, there’s a special version of Vue.js here: https://github.com/vuejs/vue/tree/csp/dist Let’s do our example as a Chrome application to see the CSP compliant vue.js in action! Start from creating a folder for our application example. The most important thing in a Chrome application is the manifest.json file which describes your application. Let’s create it. It should look like the following: { "manifest_version": 2, "name": "Learning Vue.js", "version": "1.0", "minimum_chrome_version": "23", "icons": { "16": "icon_16.png", "128": "icon_128.png" }, "app": { "background": { "scripts": ["main.js"] } } } The next step is to create our main.js file which will be the entry point for the Chrome application. The script should listen for the application launching and open a new window with given sizes. Let’s create a window of 500x300 size and open it with index.html: chrome.app.runtime.onLaunched.addListener(function() { // Center the window on the screen. var screenWidth = screen.availWidth; var screenHeight = screen.availHeight; var width = 500; var height = 300; chrome.app.window.create("index.html", { id: "learningVueID", outerBounds: { width: width, height: height, left: Math.round((screenWidth-width)/2), top: Math.round((screenHeight-height)/2) } }); }); At this point the Chrome specific application magic is over and now we shall just create our index.html file that will do the same thing as in the previous examples. It will include the vue.js file and our script where we will initialize our Vue application: <html lang="en"> <head> <meta charset="UTF-8"> <title>Vue.js - CSP-compliant</title> </head> <body> <div id="app"> <h1>{{ message }}</h1> </div> <script src="assets/vue.js"></script> <script src="assets/app.js"></script> </body> </html> Download the CSP-compliant version of vue.js and add it to the assets folder. Now let’s create the app.js file and add the code that we already wrote added several times: var data = { message: "Learning Vue.js" }; new Vue({ el: "#app", data: data }); Add it to the assets folder. Do not forget to create two icons of 16 and 128 pixels and call them icon_16.png and icon_128.png. Your code and structure in the end should look more or less like the following: Structure and code for the sample Chrome application using vue.js And now the most important thing. Let’s check if it works! It is very very simple. Go to chrome://extensions/ url in your Chrome browser. Check Developer mode checkbox. Click on Load unpacked extension... and check the folder that we’ve just created. Your app will appear in the list! Now just open a new tab, click on apps, and check that your app is there. Click on it! 
Sample Chrome application using vue.js in the list of chrome apps Congratulations! You have just created a Chrome application! NPM NPM installation method is recommended for large-scale applications. Just run npm install vue: # latest stable release npm install vue # latest stable CSP-compliant release npm install vue@csp And then require it: var Vue = require(“vue”); Or, for ES2015 lovers: import Vue from “vue”; Our HTML in our example will look exactly like in the previous examples: <html lang="en"> <head> <meta charset="UTF-8"> <title>Vue.js - NPM Installation</title> </head> <body> <div id="app"> <h1>{{ message }}</h1> </div> <script src="main.js"></script> </body> </html> Now let’s create a script.js file that will look almost exactly the same as in standalone or CDN version with only difference that it will require vue.js: var Vue = require("vue"); var data = { message: "Learning Vue.js" }; new Vue({ el: "#app", data: data }); Let’s install vue and browserify in order to be able to compile our script.js into the main.js file: npm install vue –-save-dev npm install browserify –-save-dev In the package.json file add also a script for build that will execute browserify on script.js transpiling it into main.js. So our package.json file will look like this: { "name": "learningVue", "scripts": { "build": "browserify script.js -o main.js" }, "version": "0.0.1", "devDependencies": { "browserify": "^13.0.1", "vue": "^1.0.25" } } Now run: npm install npm run build And open index.html in the browser. I have a friend that at this point would say something like: really? So many steps, installations, commands, explanations… Just to output some header? I’m out! If you are also thinking this, wait. Yes, this is true, now we’ve done something really simple in a rather complex way, but if you stay with me a bit longer, you will see how complex things become easy to implement if we use the proper tools. Also, do not forget to check your Pomodoro timer, maybe it’s time to take a rest! Vue-cli Vue provides its own command line interface that allows bootstrapping single page applications using whatever workflows you want. It immediately provides hot reloading and structure for test driven environment. After installing vue-cli just run vue init <desired boilerplate> <project-name> and then just install and run! # install vue-cli $ npm install -g vue-cli # create a new project $ vue init webpack learn-vue # install and run $ cd learn-vue $ npm install $ npm run dev Now open your browser on localhost:8080. You just used vue-cli to scaffold your application. Let’s adapt it to our example. Open a source folder. In the src folder you will find an App.vue file. Do you remember we talked about Vue components that are like bricks from which you build your application? Do you remember that we were creating them and registering inside our main script file and I mentioned that we will learn to build components in more elegant way? Congratulations, you are looking at the component built in a fancy way! Find the line that says import Hello from './components/Hello'. This is exactly how the components are being reused inside other components. Have a look at the template at the top of the component file. At some point it contains the tag <hello></hello>. This is exactly where in our HTML file will appear the Hello component. Have a look at this component, it is in the src/components folder. As you can see, it contains a template with {{ msg }} and a script that exports data with defined msg. 
This is exactly the same what we were doing in our previous examples without using components. Let’s slightly modify the code to make it the same as in the previous examples. In the Hello.vue file change the msg in data object: <script> export default { data () { return { msg: “Learning Vue.js” } } } </script> In the App.vue component remove everything from the template except the hello tag, so the template looks like this: <template> <div id="app"> <hello></hello> </div> </template> Now if you rerun the application you will see our example with beautiful styles we didn’t touch: vue application bootstrapped using vue-cli Besides webpack boilerplate template you can use the following configurations with your vue-cli: webpack-simple: A simple Webpack + vue-loader setup for quick prototyping. browserify: A full-featured Browserify + vueify setup with hot-reload, linting and unit testing. browserify-simple: A simple Browserify + vueify setup for quick prototyping. simple: The simplest possible Vue setup in a single HTML file Dev build My dear reader, I can see your shining eyes and I can read your mind. Now that you know how to install and use Vue.js and how does it work, you definitely want to put your hands deeply into the core code and contribute! I understand you. For this you need to use development version of Vue.js which you have to download from GitHub and compile yourself. Let’s build our example with this development version vue. Create a new folder, for example, dev-build and copy all the files from the npm example to this folder. Do not forget to copy the node_modules folder. You should cd into it and download files from GitHub to it, then run npm install and npm run build. cd <APP-PATH>/node_modules git clone https://github.com/vuejs/vue.git cd vue npm install npm run build Now build our example application: cd <APP-PATH> npm install npm run build Open index.html in the browser, you will see the usual Learning Vue.js header. Let’s now try to change something in vue.js source! Go to the node_modules/vue/src folder. Open config.js file. The second line defines delimeters: let delimiters = ['{{', '}}'] This defines the delimiters used in the html templates. The things inside these delimiters are recognized as a Vue data or as a JavaScript code. Let’s change them! Let’s replace “{{” and “}}” with double percentage signs! Go on and edit the file: let delimiters = ['%%', '%%'] Now rebuild both Vue source and our application and refresh the browser. What do you see? After changing Vue source and replacing delimiters, {{}} delimeters do not work anymore! The message inside {{}} is no longer recognized as data that we passed to Vue. In fact, it is being rendered as part of HTML. Now go to the index.html file and replace our curly brackets delimiters with double percentage: <div id="app"> <h1>%% message %%</h1> </div> Rebuild our application and refresh the browser! What about now? You see how easy it is to change the framework’s code and to try out your changes. I’m sure you have plenty of ideas about how to improve or add some functionality to Vue.js. So change it, rebuild, test, deploy! Happy pull requests! Debug Vue application You can debug your Vue application the same way you debug any other web application. Use your developer tools, breakpoints, debugger statements, and so on. Vue also provides vue devtools so it gets easier to debug Vue applications. 
You can download and install it from the Chrome web store: https://chrome.google.com/webstore/detail/vuejs-devtools/nhdogjmejiglipccpnnnanhbledajbpd After installing it, open, for example, our shopping list application. Open developer tools. You will see the Vue tab has automatically appeared: Vue devtools In our case we only have one component—Root. As you can imagine, once we start working with components and having lots of them, they will all appear in the left part of the Vue devtools palette. Click on the Root component and inspect it. You’ll see all the data attached to this component. If you try to change something, for example, add a shopping list item, check or uncheck a checkbox, change the title, and so on, all these changes will be immediately propagated to the data in the Vue devtools. You will immediately see the changes on the right side of it. Let’s try, for example, to add shopping list item. Once you start typing, you see on the right how newItem changes accordingly: The changes in the models are immediately propagated to the Vue devtools data When we start adding more components and introduce complexity to our Vue applications, the debugging will certainly become more fun! Summary In this article we have analyzed the behind the scenes of Vue.js. We learned how to install Vue.js. We also learned how to debug Vue application. Resources for Article: Further resources on this subject: API with MongoDB and Node.js [Article] Tips & Tricks for Ext JS 3.x [Article] Working with Forms using REST API [Article]
Data Storage in Force.com

Packt
09 Jan 2017
14 min read
In this article by Andrew Fawcett, author of the book Force.com Enterprise Architecture - Second Edition, we will discuss how it is important to consider your customers' storage needs and use cases around their data creation and consumption patterns early in the application design phase. This ensures that your object schema is the most optimum one with respect to large data volumes, data migration processes (inbound and outbound), and storage cost. In this article, we will extend the Custom Objects in the FormulaForce application as we explore how the platform stores and manages data. We will also explore the difference between your applications operational data and configuration data and the benefits of using Custom Metadata Types for configuration management and deployment. (For more resources related to this topic, see here.) You will obtain a good understanding of the types of storage provided and how the costs associated with each are calculated. It is also important to understand the options that are available when it comes to reusing or attempting to mirror the Standard Objects such as Account, Opportunity, or Product, which extend the discussion further into license cost considerations. You will also become aware of the options for standard and custom indexes over your application data. Finally, we will have some insight into new platform features for consuming external data storage from within the platform. In this article, we will cover the following topics: Mapping out end user storage requirements Understanding the different storage types Reusing existing Standard Objects Importing and exporting application data Options for replicating and archiving data External data sources Mapping out end user storage requirements During the initial requirements and design phase of your application, the best practice is to create user categorizations known as personas. Personas consider the users' typical skills, needs, and objectives. From this information, you should also start to extrapolate their data requirements, such as the data they are responsible for creating (either directly or indirectly, by running processes) and what data they need to consume (reporting). Once you have done this, try to provide an estimate of the number of records that they will create and/or consume per month. Share these personas and their data requirements with your executive sponsors, your market researchers, early adopters, and finally the whole development team so that they can keep them in mind and test against them as the application is developed. For example, in our FormulaForce application, it is likely that managers will create and consume data, whereas race strategists will mostly consume a lot of data. Administrators will also want to manage your applications configuration data. Finally, there will likely be a background process in the application, generating a lot of data, such as the process that records Race Data from the cars and drivers during the qualification stages and the race itself, such as sector (a designated portion of the track) times. You may want to capture your conclusions regarding personas and data requirements in a spreadsheet along with some formulas that help predict data storage requirements. This will help in the future as you discuss your application with Salesforce during the AppExchange Listing process and will be a useful tool during the sales cycle as prospective customers wish to know how to budget their storage costs with your application installed. 
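To make the persona-driven storage estimate above a little more concrete, here is a small, hedged sketch of the kind of spreadsheet formula described, expressed in Python; the personas and monthly record counts are invented examples, and the flat 2 KB-per-record figure is the size the platform counts for most Custom Object records, as discussed in the next section.

# A back-of-the-envelope storage model built from persona estimates.
# The personas and monthly record counts below are hypothetical; the
# 2 KB-per-record figure is the flat size Salesforce uses for most
# Custom Object records (see the next section).
RECORD_SIZE_KB = 2

personas = {
    # persona: estimated records created per month (assumed numbers)
    'Race manager': 200,
    'Race strategist': 50,
    'Telemetry process (sector times)': 500000,
}

def projected_storage_mb(months):
    total_records = sum(personas.values()) * months
    return total_records * RECORD_SIZE_KB / 1024.0

for horizon in (1, 12, 36):
    print('{:>2} month(s): ~{:,.0f} MB'.format(horizon, projected_storage_mb(horizon)))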
Understanding the different storage types The storage used by your application records contributes to the most important part of the overall data storage allocation on the platform. There is also another type of storage used by the files uploaded or created on the platform. From the Storage Usage page under the Setup menu, you can see a summary of the records used, including those that reside in the Salesforce Standard Objects. Later in this article, we will create a Custom Metadata Type object to store configuration data. Storage consumed by this type of object is not reflected on the Storage Usage page and is managed and limited in a different way. The preceding page also shows which users are using the most amount of storage. In addition to the individual's User details page, you can also locate the Used Data Space and Used File Space fields; next to these are the links to view the users' data and file storage usage. The limit shown for each is based on a calculation between the minimum allocated data storage depending on the type of organization or the number of users multiplied by a certain number of MBs, which also depends on the organization type; whichever is greater becomes the limit. For full details of this, click on the Help for this Page link shown on the page. Data storage Unlike other database platforms, Salesforce typically uses a fixed 2 KB per record size as part of its storage usage calculations, regardless of the actual number of fields or the size of the data within them on each record. There are some exceptions to this rule, such as Campaigns that take up 8 KB and stored Email Messages use up the size of the contained e-mail, though all Custom Object records take up 2 KB. Note that this record size also applies even if the Custom Object uses large text area fields. File storage Salesforce has a growing number of ways to store file-based data, ranging from the historic Document tab, to the more sophisticated Content tab, to using the Files tab, and not to mention Attachments, which can be applied to your Custom Object records if enabled. Each has its own pros and cons for end users and file size limits that are well defined in the Salesforce documentation. From the perspective of application development, as with data storage, be aware of how much your application is generating on behalf of the user and give them a means to control and delete that information. In some cases, consider if the end user would be happy to have the option to recreate the file on demand (perhaps as a PDF) rather than always having the application to store it. Reusing the existing Standard Objects When designing your object model, a good knowledge of the existing Standard Objects and their features is the key to knowing when and when not to reference them. Keep in mind the following points when considering the use of Standard Objects: From a data storage perspective: Ignoring Standard Objects creates a potential data duplication and integration effort for your end users if they are already using similar Standard Objects as pre-existing Salesforce customers. Remember that adding additional custom fields to the Standard Objects via your package will not increase the data storage consumption for those objects. From a license cost perspective: Conversely, referencing some Standard Objects might cause additional license costs for your users, since not all are available to the users without additional licenses from Salesforce. 
Make sure that you understand the differences between Salesforce (CRM) and Salesforce Platform licenses with respect to the Standard Objects available. Currently, the Salesforce Platform license provides Accounts and Contacts; however, to use the Opportunity or Product objects, a Salesforce (CRM) license is needed by the user. Refer to the Salesforce documentation for the latest details on these. Use your user personas to define which Standard Objects your users use and reference them via lookups, Apex code, and Visualforce accordingly. You may wish to use extension packages and/or dynamic Apex and SOQL to make these kinds of references optional. Since Developer Edition orgs have all these licenses and objects available (although in a limited quantity), make sure that you review your Package dependencies before clicking on the Upload button each time to check for unintentional references. Importing and exporting data Salesforce provides a number of its own tools for importing and exporting data as well as a number of third-party options based on the Salesforce APIs; these are listed on AppExchange. When importing records with other record relationships, it is not possible to predict and include the IDs of related records, such as the Season record ID when importing Race records; in this section, we will present a solution to this. Salesforce provides the Data Import Wizard, which is available under the Setup menu. This tool only supports Custom Objects and Custom Settings. Custom Metadata Type records are essentially considered metadata by the platform, and as such, you can use packages, developer tools, and Change Sets to migrate these records between orgs. There is an open source CSV data loader for Custom Metadata Types at https://github.com/haripriyamurthy/CustomMetadataLoader. It is straightforward to import a CSV file with a list of race Seasons since this is a top-level object and has no other object dependencies. However, to import the Race information (which is a child object related to Season), the Season and Fastest Lap By record IDs are required, which will typically not be present in a Race import CSV file by default. Note that IDs are unique across the platform and cannot be shared between orgs. External ID fields help address this problem by allowing Salesforce to use the existing values of such fields as a secondary means to associate records being imported that need to reference parent or related records. All that is required is that the related record Name or, ideally, a unique external ID be included in the import data file. This CSV file includes three columns: Year, Name, and Fastest Lap By (the Twitter handle of the driver who performed the fastest lap of that race). You may remember that a Driver record can also be identified by this since the field has been defined as an External ID field. Both the 2014 Season record and the Lewis Hamilton Driver record should already be present in your packaging org. Now, run the Data Import Wizard and complete the settings as shown in the following screenshot: Next, complete the field mappings as shown in the following screenshot: Click on Start Import and then on OK to review the results once the data import has completed. You should find that four new Race records have been created under the 2014 Season, with the Fastest Lap By field correctly associated with the Lewis Hamilton Driver record.
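To make the file format concrete, a hypothetical Race import CSV using the three columns described above might look like the following (the race names and Twitter handle are illustrative values, not the exact contents of the book's sample file); note that none of the columns contains a Salesforce record ID, only the Season year and name plus the Driver's external ID:

Year,Name,Fastest Lap By
2014,Spa,@lewishamilton
2014,Monza,@lewishamilton
2014,Suzuka,@lewishamilton
2014,Austin,@lewishamilton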
Note that these tools will also stress your Apex Trigger code for volumes, as they typically have the bulk mode enabled and insert records in chunks of 200 records. Thus, it is recommended that you test your triggers to at least this level of record volumes. Options for replicating and archiving data Enterprise customers often have legacy and/or external systems that are still being used or that they wish to phase out in the future. As such, they may have requirements to replicate aspects of the data stored in the Salesforce platform to another system. Likewise, in order to move unwanted data off the platform and manage their data storage costs, there is a need to archive data. The following lists some platform and API facilities that can help you and/or your customers build solutions to replicate or archive data. There are, of course, a number of AppExchange solutions listed that provide applications that use these APIs already: Replication API: This API exists in both web service SOAP and Apex forms. It allows you to develop a scheduled process to query the platform for any new, updated, or deleted records between a given time period for a specific object. The getUpdated and getDeleted API methods return only the IDs of the records, requiring you to use the conventional Salesforce APIs to query the remaining data for the replication. The frequency with which this API is called is important to avoid gaps. Refer to the Salesforce documentation for more details. Outbound Messaging: This feature offers a more real-time alternative to the Replication API. An outbound message event can be configured using the standard workflow feature of the platform. This event, once configured against a given object, provides a Web Service Definition Language (WSDL) file that describes a web service endpoint to be called when records are created and updated. It is the responsibility of a web service developer to create the endpoint based on this definition. Note that there is no provision for deletion with this option. Bulk API: This API provides a means to move up to 5,000 chunks of Salesforce data (up to 10 MB or 10,000 records per chunk) per rolling 24-hour period. Salesforce and third-party data loader tools, including the Salesforce Data Loader tool, offer this as an option. It can also be used to delete records without them going into the recycle bin. This API is ideal for building solutions to archive data. Heroku Connect: This is a seamless data synchronization solution between Salesforce and Heroku Postgres. For further information, refer to https://www.heroku.com/connect. External data sources One of the downsides of moving data off the platform in an archive use case, or of not being able to replicate data onto the platform, is that the end users have to move between applications and logins to view data; this causes an overhead as the process and data are not connected. Salesforce Connect (previously known as Lightning Connect) is a chargeable add-on feature of the platform that provides the ability to surface external data within the Salesforce user interface via the so-called External Objects and External Data Sources configurations under Setup. These offer similar functionality to Custom Objects, such as List views, Layouts, and Custom Buttons. Currently, Reports and Dashboards are not supported, though it is possible to build custom report solutions via Apex, Visualforce, or Lightning Components.
External Data Sources can be connected to existing OData-based end points and secured through OAuth or Basic Authentication. Alternatively, Apex provides a Connector API whereby developers can implement adapters to connect to other HTTP-based APIs. Depending on the capabilities of the associated External Data Source, users accessing External Objects using the data source can read and even update records through the standard Salesforce UIs such as Salesforce Mobile and desktop interfaces. Summary This article explored the declarative aspects of developing an application on the platform that applies to how an application is stored and how relational data integrity is enforced through the use of the lookup field deletion constraints and applying unique fields. Upload the latest version of the FormulaForce package and install it into your test org. The summary page during the installation of new and upgraded components should look something like the following screenshot. Note that the permission sets are upgraded during the install. Once you have installed the package in your testing org, visit the Custom Metadata Types page under Setup and click on Manage Records next to the object. You will see that the records are shown as managed and cannot be deleted. Click on one of the records to see that the field values themselves cannot also be edited. This is the effect of the Field Manageability checkbox when defining the fields. The Namespace Prefix shown here will differ from yours. Try changing or adding the Track Lap Time records in your packaging org, for example, update a track time on an existing record. Upload the package again then upgrade your test org. You will see the records are automatically updated. Conversely, any records you created in your test org will be retained between upgrades. In this article, we have now covered some major aspects of the platform with respect to packaging, platform alignment, and how your application data is stored as well as the key aspects of your application's architecture. Resources for Article: Further resources on this subject: Process Builder to the Rescue [article] Custom Coding with Apex [article] Building, Publishing, and Supporting Your Force.com Application [article]

Deep learning and regression analysis

Packt
09 Jan 2017
6 min read
In this article by Richard M. Reese and Jennifer L. Reese, authors of the book Java for Data Science, we will discuss how neural networks can be used to perform regression analysis. However, other techniques may offer a more effective solution. With regression analysis, we want to predict a result based on several input variables. (For more resources related to this topic, see here.) We can perform regression analysis using an output layer that consists of a single neuron that sums the weighted input plus bias of the previous hidden layer. Thus, the result is a single value representing the regression. Preparing the data We will use a car evaluation database to demonstrate how to predict the acceptability of a car based on a series of attributes. The file containing the data we will be using can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data. It consists of car data such as price, number of passengers, and safety information, and an assessment of its overall quality. It is this latter element that we will try to predict. The comma-delimited values in each attribute are shown next, along with substitutions. The substitutions are needed because the model expects numeric data:

Attribute | Original value | Substituted value
Buying price | vhigh, high, med, low | 3,2,1,0
Maintenance price | vhigh, high, med, low | 3,2,1,0
Number of doors | 2, 3, 4, 5-more | 2,3,4,5
Seating | 2, 4, more | 2,4,5
Cargo space | small, med, big | 0,1,2
Safety | low, med, high | 0,1,2

There are 1,728 instances in the file. The cars are marked with four classes:

Class | Number of instances | Percentage of instances | Original value | Substituted value
Unacceptable | 1210 | 70.023% | unacc | 0
Acceptable | 384 | 22.222% | acc | 1
Good | 69 | 3.99% | good | 2
Very good | 65 | 3.76% | v-good | 3

Setting up the class We start with the definition of a CarRegressionExample class, as shown next: public class CarRegressionExample { public CarRegressionExample() { try { ... } catch (IOException | InterruptedException ex) { // Handle exceptions } } public static void main(String[] args) { new CarRegressionExample(); } } Reading and preparing the data The first task is to read in the data. We will use the CSVRecordReader class to get the data: RecordReader recordReader = new CSVRecordReader(0, ","); recordReader.initialize(new FileSplit(new File("car.txt"))); DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, 1728, 6, 4); With this dataset, we will split the data into two sets. Sixty-five percent of the data is used for training and the rest for testing: DataSet dataset = iterator.next(); dataset.shuffle(); SplitTestAndTrain testAndTrain = dataset.splitTestAndTrain(0.65); DataSet trainingData = testAndTrain.getTrain(); DataSet testData = testAndTrain.getTest(); The data now needs to be normalized: DataNormalization normalizer = new NormalizerStandardize(); normalizer.fit(trainingData); normalizer.transform(trainingData); normalizer.transform(testData); We are now ready to build the model. Building the model A MultiLayerConfiguration instance is created using a series of NeuralNetConfiguration.Builder methods. The following is the code used. We will discuss the individual methods following the code. Note that this configuration uses two layers.
The last layer uses the softmax activation function, which is used for regression analysis: MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder() .iterations(1000) .activation("relu") .weightInit(WeightInit.XAVIER) .learningRate(0.4) .list() .layer(0, new DenseLayer.Builder() .nIn(6).nOut(3) .build()) .layer(1, new OutputLayer .Builder(LossFunctions.LossFunction .NEGATIVELOGLIKELIHOOD) .activation("softmax") .nIn(3).nOut(4).build()) .backprop(true).pretrain(false) .build(); Two layers are created. The first is the input layer. The DenseLayer.Builder class is used to create this layer. The DenseLayer class is a feed-forward and fully connected layer. The created layer uses the six car attributes as input. The output consists of three neurons that are fed into the output layer and is duplicated here for your convenience: .layer(0, new DenseLayer.Builder() .nIn(6).nOut(3) .build()) The second layer is the output layer created with the OutputLayer.Builder class. It uses a loss function as the argument of its constructor. The softmax activation function is used since we are performing regression as shown here: .layer(1, new OutputLayer .Builder(LossFunctions.LossFunction .NEGATIVELOGLIKELIHOOD) .activation("softmax") .nIn(3).nOut(4).build()) Next, a MultiLayerNetwork instance is created using the configuration. The model is initialized, its listeners are set, and then the fit method is invoked to perform the actual training. The ScoreIterationListener instance will display information as the model trains which we will see shortly in the output of this example. Its constructor argument specifies the frequency that information is displayed: MultiLayerNetwork model = new MultiLayerNetwork(conf); model.init(); model.setListeners(new ScoreIterationListener(100)); model.fit(trainingData); We are now ready to evaluate the model. Evaluating the model In the next sequence of code, we evaluate the model against the training dataset. An Evaluation instance is created using an argument specifying that there are four classes. The test data is fed into the model using the output method. The eval method takes the output of the model and compares it against the test data classes to generate statistics. The getLabels method returns the expected values: Evaluation evaluation = new Evaluation(4); INDArray output = model.output(testData.getFeatureMatrix()); evaluation.eval(testData.getLabels(), output); out.println(evaluation.stats()); The output of the training follows, which is produced by the ScoreIterationListener class. However, the values you get may differ due to how the data is selected and analyzed. 
Notice that the score improves with the iterations but levels out after about 500 iterations: 12:43:35.685 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 0 is 1.443480901811554 12:43:36.094 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 100 is 0.3259061845624861 12:43:36.390 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 200 is 0.2630572026049783 12:43:36.676 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 300 is 0.24061281470878784 12:43:36.977 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 400 is 0.22955121170274934 12:43:37.292 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 500 is 0.22249920540161677 12:43:37.575 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 600 is 0.2169898450109222 12:43:37.872 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 700 is 0.21271599814600958 12:43:38.161 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 800 is 0.2075677126088741 12:43:38.451 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 900 is 0.20047317735870715 This is followed by the results of the stats method as shown next. The first part reports on how examples are classified and the second part displays various statistics: Examples labeled as 0 classified by model as 0: 397 times Examples labeled as 0 classified by model as 1: 10 times Examples labeled as 0 classified by model as 2: 1 times Examples labeled as 1 classified by model as 0: 8 times Examples labeled as 1 classified by model as 1: 113 times Examples labeled as 1 classified by model as 2: 1 times Examples labeled as 1 classified by model as 3: 1 times Examples labeled as 2 classified by model as 1: 7 times Examples labeled as 2 classified by model as 2: 21 times Examples labeled as 2 classified by model as 3: 14 times Examples labeled as 3 classified by model as 1: 2 times Examples labeled as 3 classified by model as 3: 30 times ==========================Scores======================================== Accuracy: 0.9273 Precision: 0.854 Recall: 0.8323 F1 Score: 0.843 ======================================================================== The regression model does a reasonable job with this dataset. Summary In this article, we examined deep learning and regression analysis. We showed how to prepare the data and class, build the model, and evaluate the model. We used sample data and displayed output statistics to demonstrate the relative effectiveness of our model. Resources for Article: Further resources on this subject: KnockoutJS Templates [article] The Heart of It All [article] Bringing DevOps to Network Operations [article]

Exploring Structure from Motion Using OpenCV

Packt
09 Jan 2017
20 min read
In this article by Roy Shilkrot, coauthor of the book Mastering OpenCV 3, we will discuss the notion of Structure from Motion (SfM), or better put, extracting geometric structures from images taken with a camera under motion, using OpenCV's API to help us. First, let's constrain the otherwise very broad approach to SfM using a single camera, usually called a monocular approach, and a discrete and sparse set of frames rather than a continuous video stream. These two constrains will greatly simplify the system we will sketch out in the coming pages and help us understand the fundamentals of any SfM method. In this article, we will cover the following: Structure from Motion concepts Estimating the camera motion from a pair of images (For more resources related to this topic, see here.) Throughout the article, we assume the use of a calibrated camera—one that was calibrated beforehand. Calibration is a ubiquitous operation in computer vision, fully supported in OpenCV using command-line tools. We, therefore, assume the existence of the camera's intrinsic parameters embodied in the K matrix and the distortion coefficients vector—the outputs from the calibration process. To make things clear in terms of language, from this point on, we will refer to a camera as a single view of the scene rather than to the optics and hardware taking the image. A camera has a position in space and a direction of view. Between two cameras, there is a translation element (movement through space) and a rotation of the direction of view. We will also unify the terms for the point in the scene, world, real, or 3D to be the same thing, a point that exists in our real world. The same goes for points in the image or 2D, which are points in the image coordinates, of some real 3D point that was projected on the camera sensor at that location and time. Structure from Motion concepts The first discrimination we should make is the difference between stereo (or indeed any multiview), 3D reconstruction using calibrated rigs, and SfM. A rig of two or more cameras assumes we already know what the "motion" between the cameras is, while in SfM, we don't know what this motion is and we wish to find it. Calibrated rigs, from a simplistic point of view, allow a much more accurate reconstruction of 3D geometry because there is no error in estimating the distance and rotation between the cameras—it is already known. The first step in implementing an SfM system is finding the motion between the cameras. OpenCV may help us in a number of ways to obtain this motion, specifically using the findFundamentalMat and findEssentialMat functions. Let's think for one moment of the goal behind choosing an SfM algorithm. In most cases, we wish to obtain the geometry of the scene, for example, where objects are in relation to the camera and what their form is. Having found the motion between the cameras picturing the same scene, from a reasonably similar point of view, we would now like to reconstruct the geometry. In computer vision jargon, this is known as triangulation, and there are plenty of ways to go about it. It may be done by way of ray intersection, where we construct two rays: one from each camera's center of projection and a point on each of the image planes. The intersection of these rays in space will, ideally, intersect at one 3D point in the real world that was imaged in each camera, as shown in the following diagram: In reality, ray intersection is highly unreliable. 
This is because the rays usually do not intersect, making us fall back to using the middle point on the shortest segment connecting the two rays. OpenCV contains a simple API for a more accurate form of triangulation, the triangulatePoints function, so this part we do not need to code on our own. After you have learned how to recover 3D geometry from two views, we will see how you can incorporate more views of the same scene to get an even richer reconstruction. At that point, most SfM methods try to optimize the bundle of estimated positions of our cameras and 3D points by means of Bundle Adjustment. OpenCV contains means for Bundle Adjustment in its new Image Stitching Toolbox. However, the beauty of working with OpenCV and C++ is the abundance of external tools that can be easily integrated into the pipeline. We will, therefore, see how to integrate an external bundle adjuster, the Ceres non-linear optimization package. Now that we have sketched an outline of our approach to SfM using OpenCV, we will see how each element can be implemented. Estimating the camera motion from a pair of images Before we set out to actually find the motion between two cameras, let's examine the inputs and the tools we have at hand to perform this operation. First, we have two images of the same scene from (hopefully not extremely) different positions in space. This is a powerful asset, and we will make sure that we use it. As for tools, we should take a look at mathematical objects that impose constraints over our images, cameras, and the scene. Two very useful mathematical objects are the fundamental matrix (denoted by F) and the essential matrix (denoted by E). They are mostly similar, except that the essential matrix is assuming usage of calibrated cameras; this is the case for us, so we will choose it. OpenCV allows us to find the fundamental matrix via the findFundamentalMat function and the essential matrix via the findEssentialMatrix function. Finding the essential matrix can be done as follows: Mat E = findEssentialMat(leftPoints, rightPoints, focal, pp); This function makes use of matching points in the "left" image, leftPoints, and "right" image, rightPoints, which we will discuss shortly, as well as two additional pieces of information from the camera's calibration: the focal length, focal, and principal point, pp. The essential matrix, E, is a 3 x 3 matrix, which imposes the following constraint on a point in one image and a point in the other image: x'K­TEKx = 0, where x is a point in the first image one, x' is the corresponding point in the second image, and K is the calibration matrix. This is extremely useful, as we are about to see. Another important fact we use is that the essential matrix is all we need in order to recover the two cameras' positions from our images, although only up to an arbitrary unit of scale. So, if we obtain the essential matrix, we know where each camera is positioned in space and where it is looking. We can easily calculate the matrix if we have enough of those constraint equations, simply because each equation can be used to solve for a small part of the matrix. In fact, OpenCV internally calculates it using just five point-pairs, but through the Random Sample Consensus algorithm (RANSAC), many more pairs can be used and make for a more robust solution. Point matching using rich feature descriptors Now we will make use of our constraint equations to calculate the essential matrix. 
To get our constraints, remember that for each point in image A, we must find a corresponding point in image B. We can achieve such a matching using OpenCV's extensive 2D feature-matching framework, which has greatly matured in the past few years. Feature extraction and descriptor matching is an essential process in computer vision and is used in many methods to perform all sorts of operations, for example, detecting the position and orientation of an object in the image or searching a big database of images for similar images through a given query. In essence, feature extraction means selecting points in the image that would make for good features and computing a descriptor for them. A descriptor is a vector of numbers that describes the surrounding environment around a feature point in an image. Different methods have different lengths and data types for their descriptor vectors. Descriptor Matching is the process of finding a corresponding feature from one set in another using its descriptor. OpenCV provides very easy and powerful methods to support feature extraction and matching. Let's examine a very simple feature extraction and matching scheme: vector<KeyPoint> keypts1, keypts2; Mat desc1, desc2; // detect keypoints and extract ORB descriptors Ptr<Feature2D> orb = ORB::create(2000); orb->detectAndCompute(img1, noArray(), keypts1, desc1); orb->detectAndCompute(img2, noArray(), keypts2, desc2); // matching descriptors Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming"); vector<DMatch> matches; matcher->match(desc1, desc2, matches); You may have already seen similar OpenCV code, but let's review it quickly. Our goal is to obtain three elements: feature points for two images, descriptors for them, and a matching between the two sets of features. OpenCV provides a range of feature detectors, descriptor extractors, and matchers. In this simple example, we use the ORB class to get both the 2D location of Oriented BRIEF (ORB) (where Binary Robust Independent Elementary Features (BRIEF)) feature points and their respective descriptors. We use a brute-force binary matcher to get the matching, which is the most straightforward way to match two feature sets by comparing each feature in the first set to each feature in the second set (hence the phrasing "brute-force"). In the following image, we will see a matching of feature points on two images from the Fountain-P11 sequence found at http://cvlab.epfl.ch/~strecha/multiview/denseMVS.html: Practically, raw matching like we just performed is good only up to a certain level, and many matches are probably erroneous. For that reason, most SfM methods perform some form of filtering on the matches to ensure correctness and reduce errors. One form of filtering, which is built into OpenCV's brute-force matcher, is cross-check filtering. That is, a match is considered true if a feature of the first image matches a feature of the second image, and the reverse check also matches the feature of the second image with the feature of the first image. Another common filtering mechanism, used in the provided code, is to filter based on the fact that the two images are of the same scene and have a certain stereo-view relationship between them. In practice, the filter tries to robustly calculate the fundamental or essential matrix and retain those feature pairs that correspond to this calculation with small errors. An alternative to using rich features, such as ORB, is to use optical flow. 
The following information box provides a short overview of optical flow. It is possible to use optical flow instead of descriptor matching to find the required point matching between two images, while the rest of the SfM pipeline remains the same. OpenCV recently extended its API for getting the flow field from two images, and now it is faster and more powerful. Optical flow It is the process of matching selected points from one image to another, assuming that both images are part of a sequence and relatively close to one another. Most optical flow methods compare a small region, known as the search window or patch, around each point from image A to the same area in image B. Following a very common rule in computer vision, called the brightness constancy constraint (and other names), the small patches of the image will not change drastically from one image to other, and therefore the magnitude of their subtraction should be close to zero. In addition to matching patches, newer methods of optical flow use a number of additional methods to get better results. One is using image pyramids, which are smaller and smaller resized versions of the image, which allow for working from coarse to-fine—a very well-used trick in computer vision. Another method is to define global constraints on the flow field, assuming that the points close to each other move together in the same direction. Finding camera matrices Now that we have obtained matches between keypoints, we can calculate the essential matrix. However, we must first align our matching points into two arrays, where an index in one array corresponds to the same index in the other. This is required by the findEssentialMat function as we've seen in the Estimating Camera Motion section. We would also need to convert the KeyPoint structure to a Point2f structure. We must pay special attention to the queryIdx and trainIdx member variables of DMatch, the OpenCV struct that holds a match between two keypoints, as they must align with the way we used the DescriptorMatcher::match() function. The following code section shows how to align a matching into two corresponding sets of 2D points, and how these can be used to find the essential matrix: vector<KeyPoint> leftKpts, rightKpts; // ... obtain keypoints using a feature extractor vector<DMatch> matches; // ... obtain matches using a descriptor matcher //align left and right point sets vector<Point2f> leftPts, rightPts; for (size_t i = 0; i < matches.size(); i++) { // queryIdx is the "left" image leftPts.push_back(leftKpts[matches[i].queryIdx].pt); // trainIdx is the "right" image rightPts.push_back(rightKpts[matches[i].trainIdx].pt); } //robustly find the Essential Matrix Mat status; Mat E = findEssentialMat( leftPts, //points from left image rightPts, //points from right image focal, //camera focal length factor pp, //camera principal point cv::RANSAC, //use RANSAC for a robust solution 0.999, //desired solution confidence level 1.0, //point-to-epipolar-line threshold status); //binary vector for inliers We may later use the status binary vector to prune those points that align with the recovered essential matrix. Refer to the following image for an illustration of point matching after pruning. The red arrows mark feature matches that were removed in the process of finding the matrix, and the green arrows are feature matches that were kept: Now we are ready to find the camera matrices; however, the new OpenCV 3 API makes things very easy for us by introducing the recoverPose function. 
First, we will briefly examine the structure of the camera matrix we will use: This is the model for our camera; it consists of two elements, rotation (denoted as R) and translation (denoted as t). The interesting thing about it is that it holds a very essential equation: x = PX, where x is a 2D point on the image and X is a 3D point in space. There is more to it, but this matrix gives us a very important relationship between the image points and the scene points. So, now that we have a motivation for finding the camera matrices, we will see how it can be done. The following code section shows how to decompose the essential matrix into the rotation and translation elements: Mat E; // ... find the essential matrix Mat R, t; //placeholders for rotation and translation //Find Pright camera matrix from the essential matrix //Cheirality check is performed internally. recoverPose(E, leftPts, rightPts, R, t, focal, pp, mask); Very simple. Without going too deep into mathematical interpretation, this conversion of the essential matrix to rotation and translation is possible because the essential matrix was originally composed by these two elements. Strictly for satisfying our curiosity, we can look at the following equation for the essential matrix, which appears in the literature:. We see that it is composed of (some form of) a translation element, t, and a rotational element, R. Note that a cheirality check is internally performed in the recoverPose function. The cheirality check makes sure that all triangulated 3D points are in front of the reconstructed camera. Camera matrix recovery from the essential matrix has in fact four possible solutions, but the only correct solution is the one that will produce triangulated points in front of the camera, hence the need for a cheirality check. Note that what we just did only gives us one camera matrix, and for triangulation, we require two camera matrices. This operation assumes that one camera matrix is fixed and canonical (no rotation and no translation): The other camera that we recovered from the essential matrix has moved and rotated in relation to the fixed one. This also means that any of the 3D points that we recover from these two camera matrices will have the first camera at the world origin point (0, 0, 0). One more thing we can think of adding to our method is error checking. Many times, the calculation of an essential matrix from point matching is erroneous, and this affects the resulting camera matrices. Continuing to triangulate with faulty camera matrices is pointless. We can install a check to see if the rotation element is a valid rotation matrix. Keeping in mind that rotation matrices must have a determinant of 1 (or -1), we can simply do the following: bool CheckCoherentRotation(const cv::Mat_<double>& R) { if (fabsf(determinant(R)) - 1.0 > 1e-07) { cerr << "rotation matrix is invalid" << endl; return false; } return true; } We can now see how all these elements combine into a function that recovers the P matrices. 
First, we will introduce some convenience data structures and type short hands: typedef std::vector<cv::KeyPoint> Keypoints; typedef std::vector<cv::Point2f> Points2f; typedef std::vector<cv::Point3f> Points3f; typedef std::vector<cv::DMatch> Matching; struct Features { //2D features Keypoints keyPoints; Points2f points; cv::Mat descriptors; }; struct Intrinsics { //camera intrinsic parameters cv::Mat K; cv::Mat Kinv; cv::Mat distortion; }; Now, we can write the camera matrix finding function: void findCameraMatricesFromMatch( constIntrinsics& intrin, constMatching& matches, constFeatures& featuresLeft, constFeatures& featuresRight, cv::Matx34f& Pleft, cv::Matx34f& Pright) { { //Note: assuming fx = fy const double focal = intrin.K.at<float>(0, 0); const cv::Point2d pp(intrin.K.at<float>(0, 2), intrin.K.at<float>(1, 2)); //align left and right point sets using the matching Features left; Features right; GetAlignedPointsFromMatch( featuresLeft, featuresRight, matches, left, right); //find essential matrix Mat E, mask; E = findEssentialMat( left.points, right.points, focal, pp, RANSAC, 0.999, 1.0, mask); Mat_<double> R, t; //Find Pright camera matrix from the essential matrix recoverPose(E, left.points, right.points, R, t, focal, pp, mask); Pleft = Matx34f::eye(); Pright = Matx34f(R(0,0), R(0,1), R(0,2), t(0), R(1,0), R(1,1), R(1,2), t(1), R(2,0), R(2,1), R(2,2), t(2)); } At this point, we have the two cameras that we need in order to reconstruct the scene. The canonical first camera, in the Pleft variable, and the second camera we calculated, form the essential matrix in the Pright variable. Choosing the image pair to use first Given we have more than just two image views of the scene, we must choose which two views we will start the reconstruction from. In their paper, Snavely et al. suggest that we pick the two views that have the least number of homography inliers. A homography is a relationship between two images or sets of points that lie on a plane; the homography matrix defines the transformation from one plane to another. In case of an image or a set of 2D points, the homography matrix is of size 3 x 3. When Snavely et al. look for the lowest inlier ratio, they essentially suggest to calculate the homography matrix between all pairs of images and pick the pair whose points mostly do not correspond with the homography matrix. This means the geometry of the scene in these two views is not planar or at least not the same plane in both views, which helps when doing 3D reconstruction. For reconstruction, it is best to look at a complex scene with non-planar geometry, with things closer and farther away from the camera. 
The following code snippet shows how to use OpenCV's findHomography function to count the number of inliers between two views whose features were already extracted and matched: int findHomographyInliers( const Features& left, const Features& right, const Matching& matches) { //Get aligned feature vectors Features alignedLeft; Features alignedRight; GetAlignedPointsFromMatch(left, right, matches, alignedLeft, alignedRight); //Calculate homography with at least 4 points Mat inlierMask; Mat homography; if(matches.size() >= 4) { homography = findHomography(alignedLeft.points, alignedRight.points, cv::RANSAC, RANSAC_THRESHOLD, inlierMask); } if(matches.size() < 4 or homography.empty()) { return 0; } return countNonZero(inlierMask); } The next step is to perform this operation on all pairs of image views in our bundle and sort them based on the ratio of homography inliers to outliers: //sort pairwise matches to find the lowest Homography inliers map<float, ImagePair> pairInliersCt; const size_t numImages = mImages.size(); //scan all possible image pairs (symmetric) for (size_t i = 0; i < numImages - 1; i++) { for (size_t j = i + 1; j < numImages; j++) { if (mFeatureMatchMatrix[i][j].size() < MIN_POINT_CT) { //Not enough points in matching pairInliersCt[1.0] = {i, j}; continue; } //Find number of homography inliers const int numInliers = findHomographyInliers( mImageFeatures[i], mImageFeatures[j], mFeatureMatchMatrix[i][j]); const float inliersRatio = (float)numInliers / (float)(mFeatureMatchMatrix[i][j].size()); pairInliersCt[inliersRatio] = {i, j}; } } Note that the std::map<float, ImagePair> will internally sort the pairs based on the map's key: the inliers ratio. We then simply need to traverse this map from the beginning to find the image pair with least inlier ratio, and if that pair cannot be used, we can easily skip ahead to the next pair. Summary In this article, we saw how OpenCV v3 can help us approach Structure from Motion in a manner that is both simple to code and to understand. OpenCV v3's new API contains a number of useful functions and data structures that make our lives easier and also assist in a cleaner implementation. However, the state-of-the-art SfM methods are far more complex. There are many issues we choose to disregard in favor of simplicity, and plenty more error examinations that are usually in place. Our chosen methods for the different elements of SfM can also be revisited. Some methods even use the N-view triangulation once they understand the relationship between the features in multiple images. If we would like to extend and deepen our familiarity with SfM, we will certainly benefit from looking at other open source SfM libraries. One particularly interesting project is libMV, which implements a vast array of SfM elements that may be interchanged to get the best results. There is a great body of work from University of Washington that provides tools for many flavors of SfM (Bundler and VisualSfM). This work inspired an online product from Microsoft, called PhotoSynth, and 123D Catch from Adobe. There are many more implementations of SfM readily available online, and one must only search to find quite a lot of them. Resources for Article: Further resources on this subject: Basics of Image Histograms in OpenCV [article] OpenCV: Image Processing using Morphological Filters [article] Face Detection and Tracking Using ROS, Open-CV and Dynamixel Servos [article]

Writing a Reddit Reader with RxPHP

Packt
09 Jan 2017
9 min read
In this article by Martin Sikora, author of the book, PHP Reactive Programming, we will cover writing a CLI Reddit reader app using RxPHP, and we will see how Disposables are used in the default classes that come with RxPHP, and how these are going to be useful for unsubscribing from Observables in our app. (For more resources related to this topic, see here.) Examining RxPHP's internals As we know, Disposables as a means for releasing resources used by Observers, Observables, Subjects, and so on. In practice, a Disposable is returned, for example, when subscribing to an Observable. Consider the following code from the default RxObservable::subscribe() method: function subscribe(ObserverI $observer, $scheduler = null) { $this->observers[] = $observer; $this->started = true; return new CallbackDisposable(function () use ($observer) { $this->removeObserver($observer); }); } This method first adds the Observer to the array of all subscribed Observers. It then marks this Observable as started and, at the end, it returns a new instance of the CallbackDisposable class, which takes a Closure as an argument and invokes it when it's disposed. This is probably the most common use case for Disposables. This Disposable just removes the Observer from the array of subscribers and therefore, it receives no more events emitted from this Observable. A closer look at subscribing to Observables It should be obvious that Observables need to work in such way that all subscribed Observables iterate. Then, also unsubscribing via a Disposable will need to remove one particular Observer from the array of all subscribed Observables. However, if we have a look at how most of the default Observables work, we find out that they always override the Observable::subscribe() method and usually completely omit the part where it should hold an array of subscribers. Instead, they just emit all available values to the subscribed Observer and finish with the onComplete() signal immediately after that. For example, we can have a look at the actual source code of the subscribe() method of the RxReturnObservable class: function subscribe(ObserverI $obs, SchedulerI $sched = null) { $value = $this->value; $scheduler = $scheduler ?: new ImmediateScheduler(); $disp = new CompositeDisposable(); $disp->add($scheduler->schedule(function() use ($obs, $val) { $obs->onNext($val); })); $disp->add($scheduler->schedule(function() use ($obs) { $obs->onCompleted(); })); return $disp; } The ReturnObservable class takes a single value in its constructor and emits this value to every Observer as they subscribe. The following is a nice example of how the lifecycle of an Observable might look: When an Observer subscribes, it checks whether a Scheduler was also passed as an argument. Usually, it's not, so it creates an instance of ImmediateScheduler. Then, an instance of CompositeDisposable is created, which is going to keep an array of all Disposables used by this method. When calling CompositeDisposable::dispose(), it iterates all disposables it contains and calls their respective dispose() methods. Right after that we start populating our CompositeDisposable with the following: $disposable->add($scheduler->schedule(function() { ... })); This is something we'll see very often. SchedulerInterface::schedule() returns a DisposableInterface, which is responsible for unsubscribing and releasing resources. 
In this case, when we're using ImmediateScheduler, which has no other logic, it just evaluates the Closure immediately: function () use ($obs, $val) { $observer->onNext($val); } Since ImmediateScheduler::schedule() doesn't need to release any resources (it didn't use any), it just returns an instance of RxDisposableEmptyDisposable that does literally nothing. Then the Disposable is returned, and could be used to unsubscribe from this Observable. However, as we saw in the preceding source code, this Observable doesn't let you unsubscribe, and if we think about it, it doesn't even make sense because ReturnObservable class's value is emitted immediately on subscription. The same applies to other similar Observables, such as IteratorObservable, RangeObservable or ArrayObservable. These just contain recursive calls with Schedulers but the principle is the same. A good question is, why on Earth is this so complicated? All the preceding code does could be stripped into the following three lines (assuming we're not interested in using Schedulers): function subscribe(ObserverI $observer) { $observer->onNext($this->value); $observer->onCompleted(); } Well, for ReturnObservable this might be true, but in real applications, we very rarely use any of these primitive Observables. It's true that we usually don't even need to deal with Schedulers. However, the ability to unsubscribe from Observables or clean up any resources when unsubscribing is very important and we'll use it in a few moments. A closer look at Operator chains Before we start writing our Reddit reader, we should talk briefly about an interesting situation that might occur, so it doesn't catch us unprepared later. We're also going to introduce a new type of Observable, called ConnectableObservable. Consider this simple Operator chain with two subscribers: // rxphp_filters_observables.php use RxObservableRangeObservable; use RxObservableConnectableObservable; $connObs = new ConnectableObservable(new RangeObservable(0, 6)); $filteredObs = $connObs ->map(function($val) { return $val ** 2; }) ->filter(function($val) { return $val % 2;, }); $disposable1 = $filteredObs->subscribeCallback(function($val) { echo "S1: ${val}n"; }); $disposable2 = $filteredObs->subscribeCallback(function($val) { echo "S2: ${val}n"; }); $connObservable->connect(); The ConnectableObservable class is a special type of Observable that behaves similarly to Subject (in fact, internally, it really uses an instance of the Subject class). Any other Observable emits all available values right after you subscribe to it. However, ConnectableObservable takes another Observable as an argument and lets you subscribe Observers to it without emitting anything. When you call ConnectableObservable::connect(), it connects Observers with the source Observables, and all values go one by one to all subscribers. Internally, it contains an instance of the Subject class and when we called subscribe(), it just subscribed this Observable to its internal Subject. Then when we called the connect() method, it subscribed the internal Subject to the source Observable. In the $filteredObs variable we keep a reference to the last Observable returned from filter() call, which is an instance of AnnonymousObservable where, on next few lines, we subscribe both Observers. Now, let's see what this Operator chain prints: $ php rxphp_filters_observables.php S1: 1 S2: 1 S1: 9 S2: 9 S1: 25 S2: 25 As we can see, each value went through both Observers in the order they were emitted. 
Just out of curiosity, we can also have a look at what would happen if we didn't use ConnectableObservable, and used just the RangeObservable instead: $ php rxphp_filters_observables.php S1: 1 S1: 9 S1: 25 S2: 1 S2: 9 S2: 25 This time, RangeObservable emitted all values to the first Observer and then, again, all values to the second Observer. Right now, we can tell that the Observable had to generate all the values twice, which is inefficient, and with a large dataset, this might cause a performance bottleneck. Let's go back to the first example with ConnectableObservable, and modify the filter() call so it prints all the values that go through: $filteredObservable = $connObservable ->map(function($val) { return $val ** 2; }) ->filter(function($val) { echo "Filter: $valn"; return $val % 2; }); Now we run the code again and see what happens: $ php rxphp_filters_observables.php Filter: 0 Filter: 0 Filter: 1 S1: 1 Filter: 1 S2: 1 Filter: 4 Filter: 4 Filter: 9 S1: 9 Filter: 9 S2: 9 Filter: 16 Filter: 16 Filter: 25 S1: 25 Filter: 25 S2: 25 Well, this is unexpected! Each value is printed twice. This doesn't mean that the Observable had to generate all the values twice, however. It's not obvious at first sight what happened, but the problem is that we subscribed to the Observable at the end of the Operator chain. As stated previously, $filteredObservable is an instance of AnnonymousObservable that holds many nested Closures. By calling its subscribe() method, it runs a Closure that's created by its predecessor, and so on. This leads to the fact that every call to subscribe() has to invoke the entire chain. While this might not be an issue in many use cases, there are situations where we might want to do some special operation inside one of the filters. Also, note that calls to the subscribe() method might be out of our control, performed by another developer who wanted to use an Observable we created for them. It's good to know that such a situation might occur and could lead to unwanted behavior. It's sometimes hard to see what's going on inside Observables. It's very easy to get lost, especially when we have to deal with multiple Closures. Schedulers are prime examples. Feel free to experiment with the examples shown here and use debugger to examine step-by-step what code gets executed and in what order. So, let's figure out how to fix this. We don't want to subscribe at the end of the chain multiple times, so we can create an instance of Subject class, where we'll subscribe both Observers, and the Subject class itself will subscribe to the AnnonymousObservable as discussed a moment ago: // ... use RxSubjectSubject; $subject = new Subject(); $connObs = new ConnectableObservable(new RangeObservable(0, 6)); $filteredObservable = $connObs ->map(function($val) { return $val ** 2; }) ->filter(function($val) { echo "Filter: $valn"; return $val % 2; }) ->subscribe($subject); $disposable1 = $subject->subscribeCallback(function($val) { echo "S1: ${val}n"; }); $disposable2 = $subject->subscribeCallback(function($val) { echo "S2: ${val}n"; }); $connObservable->connect(); Now we can run the script again and see that it does what we wanted it to do: $ php rxphp_filters_observables.php Filter: 0 Filter: 1 S1: 1 S2: 1 Filter: 4 Filter: 9 S1: 9 S2: 9 Filter: 16 Filter: 25 S1: 25 S2: 25 This might look like an edge case, but soon we'll see that this issue, left unhandled, could lead to some very unpredictable behavior. 
We'll bring out both these issues (proper usage of Disposables and Operator chains) when we start writing our Reddit reader. Summary In this article, we looked in more depth at how to use Disposables and Operators, how these work internally, and what it means for us. We also looked at a couple of new classes from RxPHP, such as ConnectableObservable, and CompositeDisposable. Resources for Article: Further resources on this subject: Working with JSON in PHP jQuery [article] Working with Simple Associations using CakePHP [article] Searching Data using phpMyAdmin and MySQL [article]

A Professional Environment for React Native, Part 1

Pierre Monge
09 Jan 2017
5 min read
React Native, a new framework, allows you to build mobile apps using JavaScript. It uses the same design as React.js, letting you compose a rich mobile UI from declarative components. Although many developers are talking about this technology, React Native is not yet approved by most professionals for several reasons:

- React Native isn't fully stable yet. At the time of writing, we are at version 0.40.
- It can be scary to use a web technology in a mobile application.
- It's hard to find good React Native developers, because knowing the React.js stack is not enough to maintain a mobile React Native app from A to Z!

To address these concerns, this series will act as a guide, detailing how we see things in my development team. We will cover the entire React Native environment as well as discuss how to maintain a React Native application. This series may be of interest to companies that want to implement a React Native solution, and also to anyone looking for the right tools to maintain a mobile application in React Native. Let's start here in part 1 by exploring the React Native environment.

The environment

The React Native environment is pretty consistent. To manage all the parts of such an application, you will need a native stack, a JavaScript stack, and specific components from React Native. Let's examine all the aspects of the React Native environment:

- The native part consists of two important pieces of software: Android Studio (Android) and Xcode (iOS). Both come with their own emulators, so there is no need for a physical device! The downside of Android Studio, however, is that you need to download the SDK, find the right versions, and download them all. In addition, these two programs take up a lot of room on your hard disk!
- The JavaScript part naturally consists of Node.js, but we must add Watchman to it to watch for file changes in real time.
- The React Native CLI automates the linking of all this software. You only have to run react-native init helloworld to create a project, and react-native run-ios --scheme 'Dev' to launch the project on an iOS simulator in debug mode. The supplied react-native commands will handle almost everything!

You have, no doubt, come to our first conclusion: React Native has a lot of prerequisites, and although each dependency makes sense, you will have to master them all, which can take some time. And also a lot of space on your hard drive! Try this as your starting point if you want more information on getting started with React Native.

Atom, linter, and flow

A developer never starts coding without his text editor and his little tricks, just as a woodcutter never goes into the forest without his ax! More than 80% of the people around me use Atom as a text editor. And they are not wrong! React Native is 100% open source, and Atom is open source as well, and it is full of plug-ins and themes of all kinds. I personally use a lot of plug-ins, such as color-picker, file-icons, indent-guide-improved, and minimap, but there are some plug-ins that should be essential for every JavaScript coder, especially for your React Native application.

linter-eslint

To work alone or in a group, you must have a common syntax for all your files. To do this, we use linter-eslint with the fbjs configuration. This plug-in provides the following:

- Consistent indentation
- Consistent conventions for defining variables, objects, classes, and so on
- Warnings about non-existent or unused variables and functions
- And many other great benefits

Flow

One question you may be asking is: what is the biggest problem with using JavaScript? One issue with JavaScript has always been that it is a language without static types. In fact, there are types, such as String, Number, Boolean, Function, and so on, but that's just not enough: there is no static typing. To deal with this, we use Flow, which allows you to perform type checks before runtime. This is, of course, useful for catching bugs early! There is even a plug-in version for Atom: linter-flow.

Conclusion

At this point, you should have everything you need to create your first React Native mobile applications. Here are some great examples of apps that are out there already. Check out part 2 in this series, where I cover the tools that can help you maintain your React Native apps.

About the author

Pierre Monge (liroo.pierre@gmail.com) is a 21-year-old student. He is a developer in C, JavaScript, and all things related to web development, and he has recently been creating mobile applications. He is currently working as an intern at a company named Azendoo, where he is developing a 100% React Native application.

Creating Hello World in Xamarin.Forms

Packt
06 Jan 2017
16 min read
Since the beginning of Xamarin's life as a company, their motto has always been to present the native APIs on iOS and Android idiomatically to C#. This was a great strategy in the beginning, because applications built with Xamarin.iOS or Xamarin.Android were pretty much indistinguishable from native Objective-C or Java applications. Code sharing was generally limited to non-UI code, which left a potential gap to fill in the Xamarin ecosystem: a cross-platform UI abstraction. Xamarin.Forms is the solution to this problem, a cross-platform UI framework that renders native controls on each platform. Xamarin.Forms is a great framework for those who know C# (and XAML) but may not want to get into the full details of using the native iOS and Android APIs.

In this chapter, we will do the following:

- Create Hello World in Xamarin.Forms
- Discuss the Xamarin.Forms architecture
- Use XAML with Xamarin.Forms
- Cover data binding and MVVM with Xamarin.Forms

Creating Hello World in Xamarin.Forms

To understand how a Xamarin.Forms application is put together, let's begin by creating a simple Hello World application. Open Xamarin Studio and perform the following steps:

1. Create a new Multiplatform | App | Forms App project from the new solution dialog.
2. Name your solution something appropriate, such as HelloForms.
3. Make sure Use Portable Class Library is selected.
4. Click Next, then click Create.

Notice the three new projects that were successfully created:

- HelloForms
- HelloForms.Android
- HelloForms.iOS

In Xamarin.Forms applications, the bulk of your code will be shared, and each platform-specific project is just a small amount of code that starts up the Xamarin.Forms framework. Let's examine the minimum parts of a Xamarin.Forms application:

- App.xaml and App.xaml.cs in the HelloForms PCL library -- this class is the main starting point of the Xamarin.Forms application. A simple property, MainPage, is set to the first page in the application. In the default project template, HelloFormsPage is created with a single label that will be rendered as a UILabel on iOS and a TextView on Android.
- MainActivity.cs in the HelloForms.Android project -- the main launcher activity of the Android application. The important parts for Xamarin.Forms here are the call to Forms.Init(this, bundle), which initializes the Android-specific portion of the Xamarin.Forms framework, and the call to LoadApplication(new App()), which starts our Xamarin.Forms application (see the sketch at the end of this section).
- AppDelegate.cs in the HelloForms.iOS project -- very similar to Android, except that iOS applications start up using a UIApplicationDelegate class. Forms.Init() will initialize the iOS-specific parts of Xamarin.Forms and, just like Android's LoadApplication(new App()), will start the Xamarin.Forms application.

Go ahead and run the iOS project; you should see something similar to the following screenshot:

If you run the Android project, you will get a UI very similar to the iOS one shown in the following screenshot, but using native Android controls:

Even though it's not covered in this book, Xamarin.Forms also supports Windows Phone, WinRT, and UWP applications. However, a PC running Windows and Visual Studio is required to develop for Windows platforms. If you can get a Xamarin.Forms application working on iOS and Android, then getting a Windows Phone version working should be a piece of cake.
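For reference, the Android startup code described above typically looks roughly like the following sketch. It follows the shape of the default template at the time of writing; the exact base class and attribute values may differ in your generated project (newer templates use FormsAppCompatActivity, for example), and the namespace is omitted here for brevity:

using Android.App;
using Android.Content.PM;
using Android.OS;

[Activity(Label = "HelloForms", MainLauncher = true,
    ConfigurationChanges = ConfigChanges.ScreenSize | ConfigChanges.Orientation)]
public class MainActivity : global::Xamarin.Forms.Platform.Android.FormsApplicationActivity
{
    protected override void OnCreate(Bundle bundle)
    {
        base.OnCreate(bundle);

        // Initialize the Android-specific parts of Xamarin.Forms.
        global::Xamarin.Forms.Forms.Init(this, bundle);

        // Start the shared Xamarin.Forms application defined in the PCL.
        LoadApplication(new App());
    }
}

The iOS AppDelegate follows the same two-step pattern: call Forms.Init() and then LoadApplication(new App()).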
Understanding the architecture behind Xamarin.Forms

Getting started with Xamarin.Forms is very easy, but it is always good to look behind the scenes to understand how everything is put together. In the earlier chapters of this book, we created a cross-platform application using the native iOS and Android APIs directly. Certain applications are much better suited to that development approach, so understanding the difference between a Xamarin.Forms application and a classic Xamarin application is important when choosing the framework best suited to your app.

Xamarin.Forms is an abstraction over the native iOS and Android APIs that you can call directly from C#. So, Xamarin.Forms uses the same APIs you would in a classic Xamarin application, while providing a framework that allows you to define your UIs in a cross-platform way. An abstraction layer such as this is in many ways a very good thing, because it gives you the benefit of sharing the code driving your UI as well as any backend C# code that could also have been shared in a standard Xamarin app. The main disadvantage, however, is a slight hit in performance that might make it more difficult to create a perfect, buttery-smooth experience. Xamarin.Forms gives you the option of writing renderers and effects that allow you to override your UI in a platform-specific way. This gives you the ability to drop down to native controls where needed.

Have a look at the differences between a Xamarin.Forms application and a traditional Xamarin app in the following diagram:

In both applications, the business logic and backend code can be shared, but Xamarin.Forms gives an enormous benefit by allowing your UI code to be shared as well.

Additionally, Xamarin.Forms applications have two project templates to choose from, so let's cover each option:

- Xamarin.Forms Shared: Creates a shared project with all of your Xamarin.Forms code, an iOS project, and an Android project
- Xamarin.Forms Portable: Creates a Portable Class Library (PCL) containing all shared Xamarin.Forms code, an iOS project, and an Android project

Both options will work well for any application, in general. Shared projects are basically a collection of code files that get added automatically by another project referencing them. Using a shared project allows you to use preprocessor statements to implement platform-specific code. PCL projects, on the other hand, create a portable .NET assembly that can be used on iOS, Android, and various other platforms. PCLs can't use preprocessor statements, so you generally set up platform-specific code with interfaces or abstract/base classes (see the sketch at the end of this section). In most cases, I think a PCL is a better option, since it inherently encourages better programming practices. See Chapter 3, Code Sharing between iOS and Android, for details on the advantages and disadvantages of these two code-sharing techniques.
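To make the interface-based approach concrete, here is a minimal sketch, assuming a hypothetical IClipboardService that the shared PCL code consumes and each platform project implements. The names are illustrative and not from the book's sample application:

// In the PCL: only the abstraction lives in shared code.
public interface IClipboardService
{
    void SetText(string text);
}

// In the iOS or Android project: a platform-specific implementation.
public class ClipboardService : IClipboardService
{
    public void SetText(string text)
    {
        // Call the native clipboard API for this platform here.
    }
}

// Shared code depends only on the interface, so it stays cross-platform.
public class NoteViewModel
{
    readonly IClipboardService clipboard;

    public NoteViewModel(IClipboardService clipboard)
    {
        this.clipboard = clipboard;
    }

    public void CopyNote(string note)
    {
        clipboard.SetText(note);
    }
}

The platform implementation can be handed to the shared code via constructor injection, as above, or registered with Xamarin.Forms' DependencyService, which is covered later in this chapter.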
Using XAML in Xamarin.Forms

In addition to defining Xamarin.Forms controls from C# code, Xamarin has provided the tooling for developing your UI in Extensible Application Markup Language (XAML). XAML is a declarative language that is basically a set of XML elements that map to certain controls in the Xamarin.Forms framework. Using XAML is comparable to using HTML to define the UI on a webpage, with the exception that XAML in Xamarin.Forms creates C# objects that represent a native UI.

To understand how XAML works in Xamarin.Forms, let's create a new page with different types of Xamarin.Forms controls on it. Return to your HelloForms project from earlier, and open the HelloFormsPage.xaml file. Add the following XAML code between the <ContentPage> tags:

<StackLayout Orientation="Vertical" Padding="10,20,10,10">
    <Label Text="My Label" XAlign="Center" />
    <Button Text="My Button" />
    <Entry Text="My Entry" />
    <Image Source="https://www.xamarin.com/content/images/pages/branding/assets/xamagon.png" />
    <Switch IsToggled="true" />
    <Stepper Value="10" />
</StackLayout>

Go ahead and run the application on iOS; your application will look something like the following screenshot:

On Android, the application looks identical to iOS, except that it uses native Android controls instead of their iOS counterparts:

In our XAML, we created a StackLayout control, which is a container for other controls. It can lay out controls either vertically or horizontally, one by one, as defined by the Orientation value. We also applied a padding of 10 around the sides and bottom, and 20 from the top to adjust for the iOS status bar. You may be familiar with this syntax for defining rectangles if you know WPF or Silverlight. Xamarin.Forms uses the same syntax of left, top, right, and bottom values, delimited by commas.

We also used several of the built-in Xamarin.Forms controls to see how they work:

- Label: We used this earlier in the chapter. Used only for displaying text, this maps to a UILabel on iOS and a TextView on Android.
- Button: A general-purpose button that can be tapped by a user. This control maps to a UIButton on iOS and a Button on Android.
- Entry: This control is a single-line text entry. It maps to a UITextField on iOS and an EditText on Android.
- Image: This is a simple control for displaying an image on the screen, which maps to a UIImageView on iOS and an ImageView on Android. We used the Source property of this control, which loads an image from a web address. Using URLs on this property is nice, but it is best for performance to include the image in your project where possible.
- Switch: This is an on/off switch or toggle button. It maps to a UISwitch on iOS and a Switch on Android.
- Stepper: This is a general-purpose input for entering numbers using plus and minus buttons. On iOS, this maps to a UIStepper, while on Android, Xamarin.Forms implements this functionality with two buttons.

These are just some of the controls provided by Xamarin.Forms. There are also more complicated controls, such as the ListView and TableView, which you would expect for delivering mobile UIs.

Even though we used XAML in this example, you could also implement this Xamarin.Forms page from C#. Here is an example of what that would look like:

public class UIDemoPageFromCode : ContentPage
{
    public UIDemoPageFromCode()
    {
        var layout = new StackLayout
        {
            Orientation = StackOrientation.Vertical,
            Padding = new Thickness(10, 20, 10, 10),
        };
        layout.Children.Add(new Label
        {
            Text = "My Label",
            XAlign = TextAlignment.Center,
        });
        layout.Children.Add(new Button
        {
            Text = "My Button",
        });
        layout.Children.Add(new Entry
        {
            Text = "My Entry",
        });
        layout.Children.Add(new Image
        {
            Source = "https://www.xamarin.com/content/images/pages/branding/assets/xamagon.png",
        });
        layout.Children.Add(new Switch
        {
            IsToggled = true,
        });
        layout.Children.Add(new Stepper
        {
            Value = 10,
        });
        Content = layout;
    }
}

So, you can see how using XAML can be a bit more readable, and it is generally a bit better at declaring UIs than C#. However, using C# to define your UIs is still a viable, straightforward approach.
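As a side note, before we get to commands and data binding, these controls also expose ordinary .NET events that you can wire up directly in code. The following is a small sketch, assuming a hypothetical ButtonDemoPage and an illustrative alert message; Clicked is a plain .NET event on Button, and DisplayAlert is available on any Page:

using Xamarin.Forms;

public class ButtonDemoPage : ContentPage
{
    public ButtonDemoPage()
    {
        var button = new Button { Text = "My Button" };

        // Handle the tap with a plain event handler instead of a data-bound command.
        button.Clicked += async (sender, e) =>
        {
            await DisplayAlert("Hello", "My Button was tapped", "OK");
        };

        Content = button;
    }
}

In the next section, we will see how the MVVM pattern replaces handlers like this with commands bound from XAML.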
Using data-binding and MVVM

At this point, you should be grasping the basics of Xamarin.Forms, but you may be wondering how the MVVM design pattern fits into the picture. The MVVM design pattern was originally conceived for use alongside XAML and the powerful data-binding features XAML provides, so it is only natural that it is a perfect design pattern to use with Xamarin.Forms.

Let's cover the basics of how data binding and MVVM are set up with Xamarin.Forms:

- Your Model and ViewModel layers will remain mostly unchanged from the MVVM pattern we covered earlier in the book.
- Your ViewModels should implement the INotifyPropertyChanged interface, which facilitates data binding. To simplify things in Xamarin.Forms, you can use the BindableObject base class and call OnPropertyChanged when values change on your ViewModels.
- Any Page or control in Xamarin.Forms has a BindingContext, which is the object that it is data-bound to. In general, you can set a corresponding ViewModel to each view's BindingContext property.
- In XAML, you can set up a data binding by using syntax of the form Text="{Binding Name}". This example binds the Text property of the control to a Name property of the object residing in the BindingContext.
- In conjunction with data binding, events can be translated to commands using the ICommand interface. So, for example, the click event of a Button can be data-bound to a command exposed by a ViewModel. There is a built-in Command class in Xamarin.Forms to support this.
- Data binding can also be set up in C# code using the Binding class. However, it is generally much easier to set up bindings with XAML, since the syntax has been simplified with XAML markup extensions.

Now that we have covered the basics, let's go through, step by step, and partially convert our XamSnap sample application from earlier in the book to use Xamarin.Forms. For the most part, we can reuse the Model and ViewModel layers, although we will have to make a few minor changes to support data binding with XAML.

Let's begin by creating a new Xamarin.Forms application backed by a PCL, named XamSnap:

1. First, create three folders in the XamSnap project named Views, ViewModels, and Models.
2. Add the appropriate ViewModels and Models classes from the XamSnap application from the earlier chapters; these are found in the XamSnap project.
3. Build the project, just to make sure everything is saved. You will get a few compiler errors, which we will resolve shortly.

The first class we will need to edit is the BaseViewModel class; open it and make the following changes:

public class BaseViewModel : BindableObject
{
    protected readonly IWebService service = DependencyService.Get<IWebService>();
    protected readonly ISettings settings = DependencyService.Get<ISettings>();

    bool isBusy = false;

    public bool IsBusy
    {
        get { return isBusy; }
        set
        {
            isBusy = value;
            OnPropertyChanged();
        }
    }
}

First of all, we removed the calls to the ServiceContainer class, because Xamarin.Forms provides its own IoC container called the DependencyService. It functions very similarly to the container we built in the previous chapters, except that it only has one method, Get<T>, and registrations are set up via an assembly attribute that we will add shortly. Additionally, we removed the IsBusyChanged event in favor of the INotifyPropertyChanged interface that supports data binding. Inheriting from BindableObject gave us the helper method, OnPropertyChanged, which we use to inform bindings in Xamarin.Forms that the value has changed.
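The OnPropertyChanged helper provided by BindableObject is declared with an optional parameter marked [CallerMemberName], so a hand-rolled equivalent would look roughly like this simplified sketch (the real Xamarin.Forms implementation does more, but the calling pattern is the same):

using System.ComponentModel;
using System.Runtime.CompilerServices;

public class ObservableBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // When called with no argument from inside a property setter,
    // the compiler fills in that property's name automatically.
    protected void OnPropertyChanged([CallerMemberName] string propertyName = null)
    {
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
    }
}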
Notice that we didn't pass a string containing the property name to OnPropertyChanged. This method uses a lesser-known feature of .NET 4.5 called CallerMemberName, which will automatically fill in the calling property's name at compile time.

Next, let's set up the services we need with the DependencyService. Open App.xaml.cs in the root of the PCL project and add the following two lines above the namespace declaration:

[assembly: Dependency(typeof(XamSnap.FakeWebService))]
[assembly: Dependency(typeof(XamSnap.FakeSettings))]

The DependencyService will automatically pick up these attributes and inspect the types we declared. Any interfaces these types implement will be returned for any future callers of DependencyService.Get<T>. I normally put all Dependency declarations in the App.xaml.cs file, just so they are easy to manage and in one place.

Next, let's modify LoginViewModel by adding a new property:

public Command LoginCommand { get; set; }

We'll use this shortly for data-binding the command of a Button. One last change in the view model layer is to set up INotifyPropertyChanged for MessageViewModel:

Conversation[] conversations;

public Conversation[] Conversations
{
    get { return conversations; }
    set
    {
        conversations = value;
        OnPropertyChanged();
    }
}

Likewise, you could repeat this pattern for the remaining public properties throughout the view model layer, but this is all we will need for this example.

Next, let's create a new Forms ContentPage XAML file named LoginPage in the Views folder. In the code-behind file, LoginPage.xaml.cs, we'll just need to make a few changes:

public partial class LoginPage : ContentPage
{
    readonly LoginViewModel loginViewModel = new LoginViewModel();

    public LoginPage()
    {
        Title = "XamSnap";
        BindingContext = loginViewModel;

        loginViewModel.LoginCommand = new Command(async () =>
        {
            try
            {
                await loginViewModel.Login();
                await Navigation.PushAsync(new ConversationsPage());
            }
            catch (Exception exc)
            {
                await DisplayAlert("Oops!", exc.Message, "Ok");
            }
        });

        InitializeComponent();
    }
}

We did a few important things here, including setting the BindingContext to our LoginViewModel. We set up the LoginCommand, which basically invokes the Login method and displays a message if something goes wrong. It also navigates to a new page if successful. We also set the Title, which will show up in the top navigation bar of the application.

Next, open LoginPage.xaml and add the following XAML code inside the ContentPage:

<StackLayout Orientation="Vertical" Padding="10,10,10,10">
    <Entry Placeholder="Username" Text="{Binding UserName}" />
    <Entry Placeholder="Password" Text="{Binding Password}" IsPassword="true" />
    <Button Text="Login" Command="{Binding LoginCommand}" />
    <ActivityIndicator IsVisible="{Binding IsBusy}" IsRunning="true" />
</StackLayout>

This sets up the basics of two text fields, a button, and a spinner, complete with all the bindings to make everything work. Since we set up the BindingContext from the LoginPage code-behind file, all the properties are bound to LoginViewModel.
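For these bindings to update the view model, LoginViewModel needs UserName and Password properties that raise change notifications. They are not shown in this excerpt, but inside LoginViewModel they would follow the same pattern as IsBusy, roughly:

string userName, password;

public string UserName
{
    get { return userName; }
    set
    {
        userName = value;
        OnPropertyChanged();
    }
}

public string Password
{
    get { return password; }
    set
    {
        password = value;
        OnPropertyChanged();
    }
}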
Next, create ConversationsPage as a XAML page, just like before, and edit the ConversationsPage.xaml.cs code-behind file:

public partial class ConversationsPage : ContentPage
{
    readonly MessageViewModel messageViewModel = new MessageViewModel();

    public ConversationsPage()
    {
        Title = "Conversations";
        BindingContext = messageViewModel;
        InitializeComponent();
    }

    protected async override void OnAppearing()
    {
        try
        {
            await messageViewModel.GetConversations();
        }
        catch (Exception exc)
        {
            await DisplayAlert("Oops!", exc.Message, "Ok");
        }
    }
}

In this case, we repeated a lot of the same steps. The exception is that we used the OnAppearing method as a way to load the conversations to display on the screen.

Now let's add the following XAML code to ConversationsPage.xaml:

<ListView ItemsSource="{Binding Conversations}">
    <ListView.ItemTemplate>
        <DataTemplate>
            <TextCell Text="{Binding UserName}" />
        </DataTemplate>
    </ListView.ItemTemplate>
</ListView>

In this example, we used a ListView to data-bind a list of items and display them on the screen. We defined a DataTemplate, which represents the cell created for each item in the list that the ItemsSource is data-bound to. In our case, a TextCell displaying the UserName is created for each item in the Conversations list.

Last but not least, we must return to the App.xaml.cs file and modify the startup page:

MainPage = new NavigationPage(new LoginPage());

We used a NavigationPage here so that Xamarin.Forms can push and pop between different pages. This uses a UINavigationController on iOS, so you can see how the native APIs are being used on each platform.

At this point, if you compile and run the application, you will get a functional iOS and Android application that can log in and view a list of conversations:

Summary

In this chapter, we covered the basics of Xamarin.Forms and how it can be very useful for building your own cross-platform applications. Xamarin.Forms shines for certain types of apps, but it can be limiting if you need to write more complicated UIs or take advantage of native drawing APIs. We discovered how to use XAML to declare our Xamarin.Forms UIs and understood how Xamarin.Forms controls are rendered on each platform. We also dived into the concepts of data binding and how to use the MVVM design pattern with Xamarin.Forms. Last but not least, we began porting the XamSnap application from earlier in the book to Xamarin.Forms, and were able to reuse a lot of our existing code.

In the next chapter, we will cover the process of submitting applications to the iOS App Store and Google Play. Getting your app into the store can be a time-consuming process, but guidance from the next chapter will give you a head start.