Use App Metrics to analyze HTTP traffic, errors & network performance of a .NET Core app [Tutorial]

Aaron Lazar
20 Aug 2018
12 min read
App Metrics is an open source tool that can be plugged into ASP.NET Core applications. It provides real-time insights into how the application is performing and gives a complete overview of the application's health status. It provides metrics in a JSON format and integrates with Grafana dashboards for visual reporting. App Metrics is based on .NET Standard and runs cross-platform, and its various extensions and reporting dashboards can run on Windows as well as Linux. In this article, we will focus on App Metrics to analyse HTTP traffic, errors, and network performance in .NET Core.

This tutorial is an extract from the book C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Setting up App Metrics with ASP.NET Core

We can set up App Metrics in an ASP.NET Core application in three easy steps, which are as follows:

Install App Metrics. App Metrics can be installed as NuGet packages. Here are the two packages that can be added through NuGet to your .NET Core project:

Install-Package App.Metrics
Install-Package App.Metrics.AspnetCore.Mvc

Add App Metrics in Program.cs. Add UseMetrics to the BuildWebHost method in Program.cs, as follows:

public static IWebHost BuildWebHost(string[] args) =>
    WebHost.CreateDefaultBuilder(args)
        .UseMetrics()
        .UseStartup<Startup>()
        .Build();

Add App Metrics in Startup.cs. Finally, we can add a metrics resource filter in the ConfigureServices method of the Startup class as follows:

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc(options => options.AddMetricsResourceFilter());
}

Run your application. Build and run the application. We can test whether App Metrics is running well by using the URLs shown in the following table. Just append the URL to the application's root URL:

URL             Description
/metrics        Shows metrics using the configured metrics formatter
/metrics-text   Shows metrics using the configured text formatter
/env            Shows environment information, which includes the operating system, machine name, assembly name, and version

Appending /metrics or /metrics-text to the application's root URL gives complete information about application metrics. /metrics returns a JSON response that can be parsed and represented in a view with some custom parsing.
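If you want to verify those endpoints from code rather than a browser, a quick smoke test can be written with HttpClient. This is a minimal sketch under the assumption that the application listens on http://localhost:5000; adjust the base address to whatever URL your application actually uses:

using System;
using System.Net.Http;

class MetricsSmokeTest
{
    static void Main()
    {
        // Assumed application root URL; change it to match your launch settings.
        using (var client = new HttpClient { BaseAddress = new Uri("http://localhost:5000") })
        {
            // /metrics-text returns the plain-text report; /metrics returns JSON.
            var report = client.GetStringAsync("/metrics-text").GetAwaiter().GetResult();
            Console.WriteLine(report);
        }
    }
}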
Tracking middleware

With App Metrics, we can manually define the typical web metrics which are essential to record telemetry information. However, for ASP.NET Core there is tracking middleware that can be used and configured in the project, which contains some built-in key metrics specific to the web application.

The metrics recorded by the tracking middleware are as follows:

Apdex: This is used to monitor the user's satisfaction based on the overall performance of the application. Apdex is an open industry standard that measures the user's satisfaction based on the application's response time. We can configure the threshold of time, T, for each request cycle, and the metrics are calculated based on the following conditions:

User satisfaction   Description
Satisfactory        If the response time is less than or equal to the threshold time (T)
Tolerating          If the response time is between the threshold time (T) and 4 times the threshold time (T) in seconds
Frustrating         If the response time is greater than 4 times the threshold time (T)

Response times: This provides the overall throughput of the requests being processed by the application and the duration it takes per route within the application.

Active requests: This provides the list of active requests which have been received on the server in a particular amount of time.

Errors: This provides the aggregated results of errors as a percentage, which includes the overall error request rate, the overall count of each uncaught exception type, the total number of error requests per HTTP status code, and so on.

POST and PUT sizes: This provides the request sizes for HTTP POST and PUT requests.

Adding tracking middleware

We can add the tracking middleware as a NuGet package as follows:

Install-Package App.Metrics.AspNetCore.Tracking

Tracking middleware provides a set of middleware that is added to record telemetry for a specific metric. We can add the following middleware in the Configure method to measure performance metrics:

app.UseMetricsApdexTrackingMiddleware();
app.UseMetricsRequestTrackingMiddleware();
app.UseMetricsErrorTrackingMiddleware();
app.UseMetricsActiveRequestMiddleware();
app.UseMetricsPostAndPutSizeTrackingMiddleware();
app.UseMetricsOAuth2TrackingMiddleware();

Alternatively, we can also use the meta-pack middleware, which adds all the available tracking middleware so that we have information about all the different metrics in the preceding code:

app.UseMetricsAllMiddleware();

Next, we will add the tracking middleware in our ConfigureServices method as follows:

services.AddMetricsTrackingMiddleware();

In the main Program.cs class, we will modify the BuildWebHost method and add the UseMetricsWebTracking method as follows:

public static IWebHost BuildWebHost(string[] args) =>
    WebHost.CreateDefaultBuilder(args)
        .UseMetrics()
        .UseMetricsWebTracking()
        .UseStartup<Startup>()
        .Build();

Setting up configuration

Once the middleware is added, we need to set up the default threshold and other configuration values so that reporting can be generated accordingly. The web tracking properties can be configured in the appsettings.json file. Here is the content of the appsettings.json file that contains the MetricsWebTrackingOptions JSON key:

"MetricsWebTrackingOptions": {
    "ApdexTrackingEnabled": true,
    "ApdexTSeconds": 0.1,
    "IgnoredHttpStatusCodes": [ 404 ],
    "IgnoredRoutesRegexPatterns": [],
    "OAuth2TrackingEnabled": true
},

ApdexTrackingEnabled is set to true so that the customer satisfaction report will be generated, and ApdexTSeconds is the threshold that decides whether the request response time was satisfactory, tolerating, or frustrating. IgnoredHttpStatusCodes contains the list of status codes to ignore; here, responses that return a 404 status will not be recorded. IgnoredRoutesRegexPatterns is used to ignore specific URIs that match a regular expression, and OAuth2TrackingEnabled can be set to monitor and record the metrics for each client, providing information specific to the request rate, error rate, and POST and PUT sizes per client.

Run the application and do some navigation. Appending /metrics-text to your application URL will display the complete report in textual format. Here is a sample snapshot of what the textual metrics look like.
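To make the Apdex numbers concrete, here is a small illustrative sketch (not part of App Metrics itself) that classifies a handful of hypothetical response times against the ApdexTSeconds value of 0.1 configured above and computes the score using the standard Apdex formula, score = (satisfied + tolerating / 2) / total:

using System;
using System.Linq;

class ApdexExample
{
    static void Main()
    {
        const double T = 0.1; // ApdexTSeconds from appsettings.json

        // Hypothetical response times, in seconds, for five requests.
        double[] responseTimes = { 0.05, 0.09, 0.15, 0.35, 0.6 };

        int satisfied = responseTimes.Count(t => t <= T);               // <= T
        int tolerating = responseTimes.Count(t => t > T && t <= 4 * T); // between T and 4T
        int frustrating = responseTimes.Count(t => t > 4 * T);          // > 4T

        double score = (satisfied + tolerating / 2.0) / responseTimes.Length;
        Console.WriteLine($"Satisfied: {satisfied}, Tolerating: {tolerating}, Frustrating: {frustrating}");
        Console.WriteLine($"Apdex score: {score:F2}"); // 1.0 means every user is satisfied
    }
}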
Adding visual reports

There are various extensions and reporting plugins available that provide a visual reporting dashboard. Some of them are GrafanaCloud Hosted Metrics, InfluxDB, Prometheus, ElasticSearch, Graphite, HTTP, Console, and Text File. We will configure the InfluxDB extension and see how visual reporting can be achieved.

Setting up InfluxDB

InfluxDB is an open source time series database developed by InfluxData. It is written in the Go language and is widely used to store time series data for real-time analytics. Grafana is the server that provides reporting dashboards that can be viewed through a browser. InfluxDB can easily be imported as an extension in Grafana to display visual reporting from the InfluxDB database.

Setting up the Windows Subsystem for Linux

In this section, we will set up InfluxDB on the Windows Subsystem for Linux. First of all, we need to enable the Windows Subsystem for Linux by executing the following command from PowerShell as an Administrator:

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

After running the preceding command, restart your computer. Next, we will install a Linux distro from the Microsoft Store; in our case, we will install Ubuntu. Go to the Microsoft Store, search for Ubuntu, and install it. Once the installation is done, click on Launch. This will open up a console window, which will ask you to create a user account for the Linux OS. Specify the username and password that will be used, and hit Enter. To open bash later, open the command prompt, write bash, and hit Enter; from the bash shell, update Ubuntu to the latest stable version.

Installing InfluxDB

Here, we will go through some steps to install the InfluxDB database in Ubuntu:

To set up InfluxDB, open a command prompt in Administrator mode and run the bash shell. Execute the following commands to add the InfluxDB repository on your local PC:

$ curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
$ source /etc/lsb-release
$ echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list

Install InfluxDB by executing the following command:

$ sudo apt-get update && sudo apt-get install influxdb

Execute the following command to run InfluxDB:

$ sudo influxd

Start the InfluxDB shell by running the following command:

$ sudo influx

It will open up the shell where database-specific commands can be executed. Create a database by executing the following command. Specify a meaningful name for the database; in our case, it is appmetricsdb:

> create database appmetricsdb

Installing Grafana

Grafana is an open source tool used to display dashboards in a web interface. There are various dashboards available that can be imported from the Grafana website to display real-time analytics. Grafana can simply be downloaded as a zip file from http://docs.grafana.org/installation/windows/. Once it is downloaded, we can start the Grafana server by clicking on the grafana-server.exe executable in the bin directory. Grafana provides a website that listens on port 3000. If the Grafana server is running, we can access the site by navigating to http://localhost:3000.
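Before wiring the application to InfluxDB, it can help to confirm from code that the database is reachable. The sketch below assumes a default InfluxDB 1.x installation listening on port 8086 and uses its /ping endpoint, which returns 204 No Content when the server is up:

using System;
using System.Net;
using System.Net.Http;

class InfluxDbPing
{
    static void Main()
    {
        using (var client = new HttpClient())
        {
            // Default InfluxDB HTTP API address; adjust it if you changed the port.
            var response = client.GetAsync("http://127.0.0.1:8086/ping").GetAwaiter().GetResult();

            Console.WriteLine(response.StatusCode == HttpStatusCode.NoContent
                ? "InfluxDB is up and responding."
                : $"Unexpected status code: {(int)response.StatusCode}");
        }
    }
}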
Adding the InfluxDB dashboard

There is an out-of-the-box InfluxDB dashboard available in Grafana which can be imported from the following link: https://grafana.com/dashboards/2125. Copy the dashboard ID and use it to import the dashboard into the Grafana website. We can import the InfluxDB dashboard by going to the Manage option on the Grafana website. From the Manage option, click on the + Dashboard button and hit the New Dashboard option. Clicking on Import Dashboard will lead to Grafana asking you for the dashboard ID. Paste the dashboard ID (for example, 2125) copied earlier into the box and hit Tab. The system will show the dashboard's details, and clicking on the Import button will import it into the system.

Configuring InfluxDB

We will now configure the InfluxDB dashboard and add a data source that connects to the database that we just created. To proceed, we will go to the Data Sources section on the Grafana website and click on the Add New Datasource option to add the data source for the InfluxDB database.

Modifying the Configure and ConfigureServices methods in Startup

Up to now, we have set up Ubuntu and the InfluxDB database on our machine. We have also set up the InfluxDB data source and added a dashboard through the Grafana website. Next, we will configure our ASP.NET Core web application to push real-time information to the InfluxDB database. Here is the modified ConfigureServices method that initializes the MetricsBuilder to define the attributes related to the application name, environment, and connection details:

public void ConfigureServices(IServiceCollection services)
{
    var metrics = new MetricsBuilder()
        .Configuration.Configure(
            options =>
            {
                options.WithGlobalTags((globalTags, info) =>
                {
                    globalTags.Add("app", info.EntryAssemblyName);
                    globalTags.Add("env", "stage");
                });
            })
        .Report.ToInfluxDb(
            options =>
            {
                options.InfluxDb.BaseUri = new Uri("http://127.0.0.1:8086");
                options.InfluxDb.Database = "appmetricsdb";
                options.HttpPolicy.Timeout = TimeSpan.FromSeconds(10);
            })
        .Build();

    services.AddMetrics(metrics);
    services.AddMetricsReportScheduler();
    services.AddMetricsTrackingMiddleware();
    services.AddMvc(options => options.AddMetricsResourceFilter());
}

In the preceding code, we have set the app tag to the assembly name and the env tag to stage. http://127.0.0.1:8086 is the URL of the InfluxDB server that listens for the telemetry being pushed by the application, and appmetricsdb is the database that we created in the preceding section. Then, we added the AddMetrics middleware and specified the metrics containing the configuration. AddMetricsTrackingMiddleware is used to track the web telemetry information which is displayed on the dashboard, and AddMetricsReportScheduler is used to push the telemetry information to the database.

Here is the Configure method, which contains UseMetricsAllMiddleware to use App Metrics. UseMetricsAllMiddleware adds all the middleware available in App Metrics:

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    if (env.IsDevelopment())
    {
        app.UseBrowserLink();
        app.UseDeveloperExceptionPage();
    }
    else
    {
        app.UseExceptionHandler("/Error");
    }
    app.UseStaticFiles();
    app.UseMetricsAllMiddleware();
    app.UseMvc();
}

Rather than calling UseMetricsAllMiddleware, we can also add individual middleware explicitly based on the requirements. Here is the list of middleware that can be added:

app.UseMetricsApdexTrackingMiddleware();
app.UseMetricsRequestTrackingMiddleware();
app.UseMetricsErrorTrackingMiddleware();
app.UseMetricsActiveRequestMiddleware();
app.UseMetricsPostAndPutSizeTrackingMiddleware();
app.UseMetricsOAuth2TrackingMiddleware();
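Besides the metrics recorded automatically by the tracking middleware, App Metrics lets you record your own measurements through the IMetrics service that AddMetrics registers in the container. The controller below is only a hedged sketch of that idea; the OrdersController and the orders_placed counter name are invented for illustration:

using App.Metrics;
using App.Metrics.Counter;
using Microsoft.AspNetCore.Mvc;

public class OrdersController : Controller
{
    // Counter definition; the name is arbitrary and only used for illustration.
    private static readonly CounterOptions OrdersPlaced =
        new CounterOptions { Name = "orders_placed" };

    private readonly IMetrics _metrics;

    // IMetrics is resolved from dependency injection because services.AddMetrics(metrics) registered it.
    public OrdersController(IMetrics metrics)
    {
        _metrics = metrics;
    }

    [HttpPost]
    public IActionResult Place()
    {
        // Record a custom measurement alongside the built-in web telemetry.
        _metrics.Measure.Counter.Increment(OrdersPlaced);
        return Ok();
    }
}

Custom measurements recorded this way are flushed to InfluxDB by the same report scheduler that pushes the built-in metrics, so they show up next to the web telemetry in Grafana.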
Testing the ASP.NET Core App and reporting on the Grafana dashboard

To test the ASP.NET Core application and see visual reporting on the Grafana dashboard, we will go through the following steps:

Start the Grafana server by running {installation_directory}\bin\grafana-server.exe.
Start bash from the command prompt and run the sudo influxd command to start the InfluxDB server.
Run the ASP.NET Core application.
Access http://localhost:3000 and click on the App Metrics dashboard.

This will start gathering telemetry information and will display the performance metrics, as shown in the following screenshots. The first graph shows the total throughput in requests per minute (RPM), the error percentage, and the active requests. Next is the Apdex score, colorizing user satisfaction into three different colors, where red is frustrating, orange is tolerating, and green is satisfactory; the graph shows the blue line being drawn on the green bar, which means that the application performance is satisfactory. The throughput graph shows all the requests being made, with each request colorized in red, orange, or green; in this case, there are two HTTP GET requests, for the about and contact us pages. Finally, the response time graph shows the response time of both requests.

If you liked this article and would like to learn more such techniques, go and pick up the full book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Get to know ASP.NET Core Web API [Tutorial]
How to call an Azure function from an ASP.NET Core MVC application
ASP.NET Core High Performance

Best practices for C# code optimization [Tutorial]

Aaron Lazar
17 Aug 2018
9 min read
There are many factors that negatively impact the performance of a .NET Core application. Sometimes these are minor things that were not considered at the time of writing the code and are not addressed by the accepted best practices. As a result, to solve these problems, programmers often resort to ad hoc solutions. However, when bad practices are combined, they produce performance issues. It is always better to know the best practices that help developers write cleaner code and make the application performant. In this article, we will learn about the following topics:

Boxing and unboxing overhead
String concatenation
Exception handling
for versus foreach
Delegates

This tutorial is an extract from the book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Boxing and unboxing overhead

The boxing and unboxing methods are not always good to use and they negatively impact the performance of mission-critical applications. Boxing is a method of converting a value type to an object type, and is done implicitly, whereas unboxing is a method of converting an object type back to a value type and requires explicit casting.

Let's go through an example where we have two methods executing a loop of one million iterations, with each iteration incrementing a counter by 1. The AvoidBoxingUnboxing method uses a primitive integer to initialize and increment it on each iteration, whereas the BoxingUnboxing method boxes by assigning the numeric value to the object type first and then unboxes it on each iteration to convert it back to the integer type, as shown in the following code:

private static void AvoidBoxingUnboxing()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    //No boxing: counter is a primitive int
    int counter = 0;
    for (int i = 0; i < 1000000; i++)
    {
        counter = i + 1;
    }
    watch.Stop();
    Console.WriteLine($"Time taken {watch.ElapsedMilliseconds}");
}

private static void BoxingUnboxing()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    //Boxing
    object counter = 0;
    for (int i = 0; i < 1000000; i++)
    {
        //Unboxing
        counter = (int)i + 1;
    }
    watch.Stop();
    Console.WriteLine($"Time taken {watch.ElapsedMilliseconds}");
}

When we run both methods, we will clearly see the difference in performance. The BoxingUnboxing method executes about seven times slower than the AvoidBoxingUnboxing method, as shown in the following screenshot.

For mission-critical applications, it's always better to avoid boxing and unboxing. However, in .NET Core, we have many other types that internally use objects and perform boxing and unboxing. Most of the types under System.Collections and System.Collections.Specialized use objects and object arrays for internal storage, and when we store primitive types in these collections, they perform boxing and convert each primitive value to an object type, adding extra overhead and negatively impacting the performance of the application. Other types in System.Data, namely DataSet, DataTable, and DataRow, also use object arrays under the hood. Types under the System.Collections.Generic namespace or typed arrays are the best approaches to use when performance is the primary concern. For example, HashSet<T>, LinkedList<T>, and List<T> are all generic collections.
For example, here is a program that stores integer values in an ArrayList:

private static void AddValuesInArrayList()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    ArrayList arr = new ArrayList();
    for (int i = 0; i < 1000000; i++)
    {
        arr.Add(i);
    }
    watch.Stop();
    Console.WriteLine($"Total time taken is {watch.ElapsedMilliseconds}");
}

Let's write another program that uses a generic list of the integer type:

private static void AddValuesInGenericList()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    List<int> lst = new List<int>();
    for (int i = 0; i < 1000000; i++)
    {
        lst.Add(i);
    }
    watch.Stop();
    Console.WriteLine($"Total time taken is {watch.ElapsedMilliseconds}");
}

When running both programs, the difference is pretty noticeable: the code with the generic List<int> is over 10 times faster than the code with ArrayList.

String concatenation

In .NET, strings are immutable objects. Two strings refer to the same memory on the heap until the string value is changed. If either string is changed, a new string is created on the heap and allocated a new memory space. Immutable objects are generally thread safe and eliminate race conditions between multiple threads: any change in the string value creates and allocates a new object in memory rather than producing conflicting scenarios across threads.

For example, let's initialize a string and assign the Hello World value to the string variable a:

String a = "Hello World";

Now, let's assign the a string variable to another variable, b:

String b = a;

Both a and b point to the same value on the heap. Now, suppose we change the value of b to Hope this helps:

b = "Hope this helps";

This will create another object on the heap, where a points to the same value as before and b refers to the new memory space that contains the new text. With each change to a string, a new memory space is allocated for the object. In some cases, this may be overkill: when the frequency of string modification is high, each modification allocates a separate memory space, which creates work for the garbage collector in collecting the unused objects and freeing up space. In such a scenario, it is highly recommended that you use the StringBuilder class.
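As a brief illustration of that recommendation, the following sketch contrasts repeated string concatenation with StringBuilder using the same Stopwatch pattern as the earlier examples; the iteration count of 10,000 is arbitrary:

using System;
using System.Diagnostics;
using System.Text;

class StringConcatenationExample
{
    static void Main()
    {
        var watch = Stopwatch.StartNew();
        string result = string.Empty;
        for (int i = 0; i < 10000; i++)
        {
            // Each += allocates a brand new string on the heap.
            result += i.ToString();
        }
        watch.Stop();
        Console.WriteLine($"String concatenation: {watch.ElapsedMilliseconds} ms, length {result.Length}");

        watch = Stopwatch.StartNew();
        var builder = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            // StringBuilder appends into an internal buffer instead of reallocating a string.
            builder.Append(i);
        }
        string built = builder.ToString();
        watch.Stop();
        Console.WriteLine($"StringBuilder: {watch.ElapsedMilliseconds} ms, length {built.Length}");
    }
}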
Exception handling

Improper handling of exceptions can also decrease the performance of an application. The following list contains some of the best practices for dealing with exceptions in .NET Core:

Always use a specific exception type, or a type that can catch the exception, for the code you have written in the method. Using the Exception type for all cases is not a good practice.
It is always a good practice to use try, catch, and finally blocks where the code can throw exceptions. The finally block is usually used to clean up resources and return a proper response that the calling code is expecting.
In deeply nested code, don't add a try-catch block at every level; let the exception propagate and handle it in the calling method or main method. Catching exceptions on multiple stack frames slows down performance and is not recommended.
Always use exceptions for fatal conditions that terminate the program. Using exceptions for noncritical conditions, such as converting a value to an integer or reading a value from an empty array, is not recommended and should be handled through custom logic. For example, converting a string value to the integer type can be done by using the Int32.TryParse method rather than by using the Convert.ToInt32 method and then failing at the point where the string does not represent a number.
While throwing an exception, add a meaningful message so that the user knows where the exception actually occurred rather than having to go through the stack trace. For example, the following code shows a way of throwing an exception and adding a custom message based on the method and class being called:

static string GetCountryDetails(Dictionary<string, string> countryDictionary, string key)
{
    try
    {
        return countryDictionary[key];
    }
    catch (KeyNotFoundException ex)
    {
        KeyNotFoundException argEx = new KeyNotFoundException(
            "Error occurred while executing GetCountryDetails method. Cause: Key not found", ex);
        throw argEx;
    }
}

Throw exceptions rather than returning custom messages or error codes, and handle them in the main calling method.
When logging exceptions, always check the inner exception and read the exception message or stack trace. It is helpful, and gives the actual point in the code where the error is thrown.

For versus foreach

for and foreach are two alternative ways of iterating over a list of items, and each of them operates in a different way. The for loop actually loads all the items of the list in memory first and then uses an indexer to iterate over each element, whereas foreach uses an enumerator and iterates until it reaches the end of the list. The following table shows the types of collections that are good to use with for and foreach:

Type                  for/foreach
Typed array           Good for both
ArrayList             Better with for
Generic collections   Better with for
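To see the two iteration styles side by side, a sketch like the following can be used; it times both approaches over the same generic list, and the exact timings you observe will of course vary:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class ForVersusForeach
{
    static void Main()
    {
        var numbers = new List<int>();
        for (int i = 0; i < 1000000; i++)
        {
            numbers.Add(i);
        }

        long sum = 0;
        var watch = Stopwatch.StartNew();
        // for: index-based access through the list's indexer.
        for (int i = 0; i < numbers.Count; i++)
        {
            sum += numbers[i];
        }
        watch.Stop();
        Console.WriteLine($"for: {watch.ElapsedMilliseconds} ms, sum = {sum}");

        sum = 0;
        watch = Stopwatch.StartNew();
        // foreach: the list's enumerator walks the items one by one.
        foreach (int number in numbers)
        {
            sum += number;
        }
        watch.Stop();
        Console.WriteLine($"foreach: {watch.ElapsedMilliseconds} ms, sum = {sum}");
    }
}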
Delegates

Delegates are a type in .NET that holds a reference to a method. The type is equivalent to a function pointer in C or C++. When defining a delegate, we can specify both the parameters that the method can take and its return type. This way, the referenced methods will have the same signature.

Here is a simple delegate that takes a string and returns an integer:

delegate int Log(string n);

Now, suppose we have a LogToConsole method that has the same signature as the one shown in the following code. This method takes the string and writes it to the console window:

static int LogToConsole(string a)
{
    Console.WriteLine(a);
    return 1;
}

We can initialize and use this delegate like this:

Log logDelegate = LogToConsole;
logDelegate("This is a simple delegate call");

Suppose we have another method called LogToDatabase that writes the information to the database:

static int LogToDatabase(string a)
{
    Console.WriteLine(a);
    //Log to database
    return 1;
}

Here is the initialization of a new logDelegate instance that references the LogToDatabase method:

Log logDelegateDatabase = LogToDatabase;
logDelegateDatabase("This is a simple delegate call");

The preceding delegates are unicast delegates, as each instance refers to a single method. On the other hand, we can also create multicast delegates by adding LogToDatabase to the same logDelegate instance, as follows:

Log logDelegate = LogToConsole;
logDelegate += LogToDatabase;
logDelegate("This is a simple delegate call");

The preceding code seems pretty straightforward and optimized, but under the hood, it has a huge performance overhead. In .NET, delegates are implemented by the MulticastDelegate class, which is optimized to run unicast delegates: it stores the reference of the method in the target property and calls the method directly. For multicast delegates, it uses the invocation list, which is a generic list that holds the references to each method that is added. With multicast delegates, each target property holds a reference to the generic list that contains the methods and executes them in sequence. This adds overhead for multicast delegates, which take more time to execute.

If you liked this article and would like to learn more such techniques, grab this book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Behavior Scripting in C# and Javascript for game developers
Exciting New Features in C# 8.0
Exploring Language Improvements in C# 7.2 and 7.3

Use Rust for web development [Tutorial]

Aaron Lazar
17 Aug 2018
13 min read
You might think that Rust is only meant for complex system development, or that it should be used where security is the number one concern. Thinking of using it for web development might sound like huge overkill. We already have proven web-oriented languages that have worked until now, such as PHP or JavaScript, right?

This is far from true. Many projects use the web as their platform, and for them it's sometimes more important to be able to receive a lot of traffic without investing in expensive servers than to keep using legacy technologies, especially in new products. This is where Rust comes in handy. Thanks to its speed and some really well thought out web-oriented frameworks, Rust performs even better than the legacy web programming languages. In this tutorial, we'll see how Rust can be used for web development. This article is an extract from Rust High Performance, authored by Iban Eguia Moraza.

Rust is even trying to replace some of the JavaScript on the client side of applications, since Rust can compile to WebAssembly, making it extremely powerful for heavy client-side web workloads.

Creating extremely efficient web templates

We have seen that Rust is a really efficient language, and metaprogramming allows for the creation of even more efficient code. Rust has great templating language support, such as Handlebars and Tera. Rust's Handlebars implementation is much faster than the JavaScript implementation, while Tera is a template engine created for Rust based on Jinja2. In both cases, you define a template file and then use Rust to parse it. Even though this will be reasonable for most web development, in some cases it might be slower than pure Rust alternatives. This is where the Maud crate comes in. We will see how it works and how it achieves orders of magnitude faster performance than its counterparts.

To use Maud, you will need nightly Rust, since it uses procedural macros. As we saw in previous chapters, if you are using rustup you can simply run rustup override set nightly. Then, you will need to add Maud to your Cargo.toml file in the [dependencies] section:

[dependencies]
maud = "0.17.2"

Maud brings an html!{} procedural macro that enables you to write HTML in Rust. You will therefore need to import the necessary crate and macro in your main.rs or lib.rs file, as you will see in the following code. Remember to also add the procedural macro feature at the beginning of the crate:

#![feature(proc_macro)]

extern crate maud;
use maud::html;

You will now be able to use the html!{} macro in your main() function. This macro will return a Markup object, which you can then convert to a String or return to Rocket or Iron for your website implementation (you will need to use the relevant Maud features in that case). Let's see what a short template implementation looks like:

fn main() {
    use maud::PreEscaped;
    let user_name = "FooBar";
    let markup = html! {
        (PreEscaped("<!DOCTYPE html>"))
        html {
            head {
                title { "Test website" }
                meta charset="UTF-8";
            }
            body {
                header {
                    nav {
                        ul {
                            li { "Home" }
                            li { "Contact Us" }
                        }
                    }
                }
                main {
                    h1 { "Welcome to our test template!" }
                    p { "Hello, " (user_name) "!" }
                }
                footer {
                    p { "Copyright © 2017 - someone" }
                }
            }
        }
    };

    println!("{}", markup.into_string());
}

It seems like a complex template, but it contains just the basic information a new website should have. We first add a doctype, making sure it will not be escaped (that is what the PreEscaped is for), and then we start the HTML document with two parts: the head and the body.
In the head, we add the required title and the charset meta element to tell the browser that we will be using UTF-8. Then, the body contains the three usual sections, even though this can, of course, be modified: one header, one main section, and one footer. I added some example information in each of the sections and showed you how to add a dynamic variable in the main section inside a paragraph.

The interesting syntax here is that you can create elements with attributes, such as the meta element, even without content, by finishing them early with a semicolon. You can use any HTML tag and add variables. The generated code will be escaped, except if you ask for non-escaped data, and it will be minified so that it occupies the least space when being transmitted. Inside the parentheses, you can call any function or variable that returns a type that implements the Display trait, and you can even add any Rust code if you add braces around it, with the last statement returning a Display element. This works on attributes too.

This gets processed at compile time, so that at runtime it will only need to perform the minimum possible amount of work, making it extremely efficient. And not only that; the template will be typesafe thanks to Rust's compile-time guarantees, so you won't forget to close a tag or an attribute. There is a complete guide to the templating engine at https://maud.lambda.xyz/.

Connecting with a database

If we want to use SQL/relational databases in Rust, there is no other crate to think about than Diesel. If you need access to NoSQL databases such as Redis or MongoDB, you will also find proper crates, but since the most used databases are relational databases, we will check Diesel here. Diesel makes working with MySQL/MariaDB, PostgreSQL, and SQLite very easy by providing a great ORM and typesafe query builder. It prevents all potential SQL injections at compile time, but is still extremely fast. In fact, it's usually faster than using prepared statements, due to the way it manages connections to databases. Without entering into technical details, we will check how this stable framework works.

The development of Diesel has been impressive and it's already working in stable Rust. It even has a stable 1.x version, so let's check how we can map a simple table. Diesel comes with a command-line interface program, which makes it much easier to use. To install it, run cargo install diesel_cli. Note that, by default, this will try to install it for PostgreSQL, MariaDB/MySQL, and SQLite. For this short tutorial, you need to have the SQLite 3 development files installed, but if you want to avoid installing all the MariaDB/MySQL or PostgreSQL files, you should run the following command:

cargo install --no-default-features --features sqlite diesel_cli

Then, since we will be using SQLite for our short test, add a file named .env to the current directory, with the following content:

DATABASE_URL=test.sqlite

We can now run diesel setup and diesel migration generate initial_schema. This will create the test.sqlite SQLite database and a migrations folder, with the first empty initial_schema migration. Let's add this to the initial schema's up.sql file:

CREATE TABLE 'users' (
  'username' TEXT NOT NULL PRIMARY KEY,
  'password' TEXT NOT NULL,
  'email' TEXT UNIQUE
);

In its counterpart down.sql file, we will need to drop the created table:

DROP TABLE `users`;

Then, we can execute diesel migration run and check that everything went smoothly.
We can execute diesel migration redo to check that the rollback and recreation worked properly. We can now start using the ORM. We will need to add diesel, diesel_infer_schema, and dotenv to our Cargo.toml. The dotenv crate will read the .env file to generate the environment variables. If you want to avoid using all the MariaDB/MySQL or PostgreSQL features, you will need to configure diesel for it:

[dependencies]
dotenv = "0.10.1"

[dependencies.diesel]
version = "1.1.1"
default-features = false
features = ["sqlite"]

[dependencies.diesel_infer_schema]
version = "1.1.0"
default-features = false
features = ["sqlite"]

Let's now create a structure that we will be able to use to retrieve data from the database. We will also need some boilerplate code to make everything work:

#[macro_use]
extern crate diesel;
#[macro_use]
extern crate diesel_infer_schema;
extern crate dotenv;

use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use dotenv::dotenv;
use std::env;

#[derive(Debug, Queryable)]
struct User {
    username: String,
    password: String,
    email: Option<String>,
}

fn establish_connection() -> SqliteConnection {
    dotenv().ok();

    let database_url = env::var("DATABASE_URL")
        .expect("DATABASE_URL must be set");
    SqliteConnection::establish(&database_url)
        .expect(&format!("error connecting to {}", database_url))
}

mod schema {
    infer_schema!("dotenv:DATABASE_URL");
}

Here, the establish_connection() function will call dotenv() so that the variables in the .env file get into the environment, and then it uses the DATABASE_URL variable to establish the connection with the SQLite database and returns the handle.

The schema module will contain the schema of the database. The infer_schema!() macro will get the DATABASE_URL variable and connect to the database at compile time to generate the schema. Make sure you run all the migrations before compiling.

We can now develop a small main() function with the basics to list all of the users from the database:

fn main() {
    use schema::users::dsl::*;

    let connection = establish_connection();
    let all_users = users
        .load::<User>(&connection)
        .expect("error loading users");

    println!("{:?}", all_users);
}

This will just load all of the users from the database into a list. Notice the use statement at the beginning of the function. This retrieves the required information from the schema for the users table so that we can then call users.load(). As you can see in the guides at diesel.rs, you can also generate Insertable objects, which might not have some of the fields with default values, and you can perform complex queries by filtering the results in the same way you would write a SELECT statement.

Creating a complete web server

There are multiple web frameworks for Rust. Some of them work in stable Rust, such as Iron and Nickel, and some don't, such as Rocket. We will talk about the latter since, even if it forces you to use the latest nightly branch, it's so much more powerful than the rest that it really makes no sense to use any of the others if you have the option to use Rust nightly. Using Diesel with Rocket, apart from the funny wordplay, works seamlessly. You will probably be using the two of them together, but in this section, we will learn how to create a small Rocket server without any further complexity.
There are some boilerplate code implementations that add a database, cache, OAuth, templating, response compression, JavaScript minification, and SASS minification to the website, such as my Rust web template on GitHub, if you need to start developing a real-life Rust web application.

Rocket trades that nightly instability, which will break your code more often than not, for simplicity and performance. Developing a Rocket application is really easy and the performance of the results is astonishing. It's even faster than using some other, seemingly simpler frameworks, and of course, it's much faster than most of the frameworks in other languages. So, how does it feel to develop a Rocket application?

We start by adding the latest rocket and rocket_codegen crates to our Cargo.toml file and adding a nightly override to our current directory by running rustup override set nightly. The rocket crate contains all the code to run the server, while the rocket_codegen crate is actually a compiler plugin that modifies the language to adapt it for web development. We can now write the default Hello, world! Rocket example:

#![feature(plugin)]
#![plugin(rocket_codegen)]

extern crate rocket;

#[get("/")]
fn index() -> &'static str {
    "Hello, world!"
}

fn main() {
    rocket::ignite().mount("/", routes![index]).launch();
}

In this example, we can see how we ask Rust to let us use plugins and then import the rocket_codegen plugin. This will enable us to use attributes such as #[get] or #[post] with request information that will generate boilerplate code when compiled, leaving our code fairly simple for our development. Also, note that this code has been checked with Rocket 0.3 and it might fail in a future version, since the library is not stable yet.

In this case, you can see that the index() function will respond to any GET request with a base URL. This can be modified to accept only certain URLs or to get the path of something from the URL. You can also have overlapping routes with different priorities, so that if one is not taken because of a request guard, the next will be tried. And, talking about request guards, you can create objects that are generated when processing a request and that will only let the request reach a given function if they are properly built. This means that you can, for example, create a User object that gets generated by checking the cookies in the request and comparing them against a Redis database, only allowing the execution of the function for logged-in users. This easily prevents many logic flaws.

The main() function ignites the Rocket and mounts the index route at /. This means that you can have multiple routes with the same path mounted at different route paths, and they do not need to know about the whole path in the URL. In the end, it will launch the Rocket server, and if you run it with cargo run, it will print its startup information to the console.

If you go to the URL, you will see the Hello, World! message. Rocket is highly configurable. It has a rocket_contrib crate which offers templates and further features, and you can create responders to add GZip compression to responses. You can also create your own error responders when an error occurs. You can also configure the behavior of Rocket by using the Rocket.toml file and environment variables. As you can see in the startup output, it runs in development mode by default, which adds some debugging information. You can configure different behaviors for staging and production modes and make them perform faster.
Also, make sure that you compile the code in --release mode in production. If you want to develop a web application with Rocket, make sure you check https://rocket.rs/ for further information. Future releases also look promising: Rocket will implement native CSRF and XSS prevention, which, in theory, should prevent all XSS and CSRF attacks at compile time. It will also make further customizations to the engine possible.

If you found this article useful and would like to learn more such tips, head over and pick up the book, Rust High Performance, authored by Iban Eguia Moraza.

Mozilla is building a bridge between Rust and JavaScript
Perform Advanced Programming with Rust
Say hello to Sequoia: a new Rust based OpenPGP library to secure your apps

Task parallel library for easy multi-threading in .NET Core [Tutorial]

Aaron Lazar
16 Aug 2018
11 min read
Compared to the classic threading model in .NET, the Task Parallel Library (TPL) minimizes the complexity of using threads and provides an abstraction through a set of APIs that help developers focus more on the application program instead of on how the threads will be provisioned. In this article, we'll learn the benefits of using TPL over traditional threading techniques for concurrency and high performance.

There are several benefits of using TPL over threads:

It autoscales the concurrency to a multicore level
It autoscales LINQ queries to a multicore level
It handles the partitioning of the work and uses ThreadPool where required
It is easy to use and reduces the complexity of working with threads directly

This tutorial is an extract from the book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Creating a task using TPL

TPL APIs are available in the System.Threading and System.Threading.Tasks namespaces. They work around the task, which is a program or a block of code that runs asynchronously. An asynchronous task can be run by calling either the Task.Run or TaskFactory.StartNew methods. When we create a task, we provide a named delegate, anonymous method, or a lambda expression that the task executes.

Here is a code snippet that uses a lambda expression to execute the ExecuteLongRunningTask method using Task.Run:

class Program
{
    static void Main(string[] args)
    {
        Task t = Task.Run(() => ExecuteLongRunningTask(5000));
        t.Wait();
    }

    public static void ExecuteLongRunningTask(int millis)
    {
        Thread.Sleep(millis);
        Console.WriteLine("Hello World");
    }
}

In the preceding code snippet, we have executed the ExecuteLongRunningTask method asynchronously using the Task.Run method. The Task.Run method returns a Task object that can be used to wait for the asynchronous piece of code to execute completely before the program ends. To wait for the task, we have used the Wait method.

Alternatively, we can also use the Task.Factory.StartNew method, which is more advanced and provides more options. While calling the Task.Factory.StartNew method, we can specify a CancellationToken, TaskCreationOptions, and a TaskScheduler to set the state, specify other options, and schedule tasks.

TPL uses multiple cores of the CPU out of the box. When a task is executed using the TPL API, it automatically splits the task into one or more threads and utilizes multiple processors, if they are available. The decision as to how many threads will be created is calculated at runtime by the CLR. Whereas a thread only has an affinity to a single processor, running any task on multiple processors needs a proper manual implementation.
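As a hedged illustration of those extra options, the following sketch passes a CancellationToken, a TaskCreationOptions value, and the default TaskScheduler to Task.Factory.StartNew; the LongRunning hint is only an example of what can be specified:

using System;
using System.Threading;
using System.Threading.Tasks;

class StartNewOptionsExample
{
    static void Main()
    {
        var tokenSource = new CancellationTokenSource();

        Task t = Task.Factory.StartNew(
            () =>
            {
                Thread.Sleep(2000);
                Console.WriteLine("Work finished");
            },
            tokenSource.Token,               // allows the work to be cancelled later
            TaskCreationOptions.LongRunning, // hints that this task may run for a while
            TaskScheduler.Default);          // schedules it on the default (thread pool) scheduler

        t.Wait();
    }
}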
Task-based asynchronous pattern (TAP)

When developing any software, it is always good to implement the best practices while designing its architecture. The task-based asynchronous pattern is one of the recommended patterns that can be used when working with TPL. There are, however, a few things to bear in mind while implementing TAP.

Naming convention

The method executing asynchronously should have the naming suffix Async. For example, if the method name starts with ExecuteLongRunningOperation, it should have the suffix Async, with the resulting name of ExecuteLongRunningOperationAsync.

Return type

The method signature should return either a System.Threading.Tasks.Task or a System.Threading.Tasks.Task<TResult>. The Task return type is equivalent to a method that returns void, whereas TResult is the data type.

Parameters

The out and ref parameters are not allowed as parameters in the method signature. If multiple values need to be returned, tuples or a custom data structure can be used. The method should always return Task or Task<TResult>, as discussed previously. Here are a few signatures for both synchronous and asynchronous methods:

Synchronous method                          Asynchronous method
void Execute();                             Task ExecuteAsync();
List<string> GetCountries();                Task<List<string>> GetCountriesAsync();
Tuple<int, string> GetState(int stateID);   Task<Tuple<int, string>> GetStateAsync(int stateID);
Person GetPerson(int personID);             Task<Person> GetPersonAsync(int personID);

Exceptions

The asynchronous method should always throw exceptions that are assigned to the returned task. However, usage errors, such as passing null parameters to the asynchronous method, should be handled properly.

Let's suppose we want to generate several documents dynamically based on a predefined templates list, where each template populates its placeholders with dynamic values and writes the result to the filesystem. We assume that this operation will take a sufficient amount of time to generate a document for each template. Here is a code snippet showing how the exceptions can be handled:

static void Main(string[] args)
{
    List<Template> templates = GetTemplates();
    IEnumerable<Task> asyncDocs = from template in templates
                                  select GenerateDocumentAsync(template);
    try
    {
        Task.WaitAll(asyncDocs.ToArray());
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
    }
    Console.Read();
}

private static async Task<int> GenerateDocumentAsync(Template template)
{
    //To simulate a long running operation
    Thread.Sleep(3000);
    //Throwing exception intentionally
    throw new Exception();
}

In the preceding code, we have a GenerateDocumentAsync method that performs a long running operation, such as reading the template from the database, populating placeholders, and writing a document to the filesystem. To simulate this process, we used Thread.Sleep to sleep the thread for three seconds and then throw an exception that will be propagated to the calling method. The Main method loops over the templates list and calls the GenerateDocumentAsync method for each template. Each GenerateDocumentAsync method returns a task. When calling an asynchronous method, the exception is actually hidden until the Wait, WaitAll, WhenAll, or other such methods are called. In the preceding example, the exception will be thrown once the Task.WaitAll method is called, and will be logged to the console.

Task status

The task object provides a TaskStatus that is used to know whether the task is running the method, has completed it, has encountered a fault, or whether some other occurrence has taken place. A task constructed directly (rather than through Task.Run) initially has the status of Created, and when the Start method is called, its status changes to Running. When applying the TAP pattern, all the methods return the Task object, and whether or not they use Task.Run inside, the method body should already be activated. That means that the status should be anything other than Created. The TAP pattern ensures the consumer that the task is activated and that starting the task is not required.
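A small sketch of how these states can be observed is to print Task.Status at different points; the exact intermediate value you see may vary with scheduling:

using System;
using System.Threading;
using System.Threading.Tasks;

class TaskStatusExample
{
    static void Main()
    {
        Task t = Task.Run(() => Thread.Sleep(1000));

        // Typically WaitingToRun or Running here, and never Created, because Task.Run activates the task.
        Console.WriteLine($"Status after Task.Run: {t.Status}");

        t.Wait();

        // RanToCompletion once the delegate has finished without faulting or being cancelled.
        Console.WriteLine($"Status after Wait: {t.Status}");
    }
}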
Task cancellation

Cancellation is optional for TAP-based asynchronous methods. If a method accepts a CancellationToken as a parameter, it can be used by the calling party to cancel the task. However, for TAP, the cancellation should be handled properly. Here is a basic example showing how cancellation can be implemented:

static void Main(string[] args)
{
    CancellationTokenSource tokenSource = new CancellationTokenSource();
    CancellationToken token = tokenSource.Token;
    Task.Factory.StartNew(() => SaveFileAsync(path, bytes, token));
}

static Task<int> SaveFileAsync(string path, byte[] fileBytes, CancellationToken cancellationToken)
{
    if (cancellationToken.IsCancellationRequested)
    {
        Console.WriteLine("Cancellation is requested...");
        cancellationToken.ThrowIfCancellationRequested();
    }
    //Do some file save operation
    File.WriteAllBytes(path, fileBytes);
    return Task.FromResult<int>(0);
}

In the preceding code, we have a SaveFileAsync method that takes a byte array and a CancellationToken as parameters. In the Main method, we initialize the CancellationTokenSource, which can be used to cancel the asynchronous operation later in the program. To test the cancellation scenario, we will just call the Cancel method of the tokenSource after the Task.Factory.StartNew method, and the operation will be canceled. Moreover, when the task is canceled, its status is set to Canceled and the IsCompleted property is set to true.

Task progress reporting

With TPL, we can use the IProgress<T> interface to get real-time progress notifications from asynchronous operations. This can be used in scenarios where we need to update the user interface or a console app about the progress of asynchronous operations. When defining TAP-based asynchronous methods, defining IProgress<T> as a parameter is optional. We can have overloaded methods that help consumers in the case of specific needs. However, they should only be used if the asynchronous method supports them. Here is the modified version of SaveFileAsync that updates the user about the real progress:

static void Main(string[] args)
{
    var progressHandler = new Progress<string>(value =>
    {
        Console.WriteLine(value);
    });

    var progress = progressHandler as IProgress<string>;
    CancellationTokenSource tokenSource = new CancellationTokenSource();
    CancellationToken token = tokenSource.Token;
    Task.Factory.StartNew(() => SaveFileAsync(path, bytes, token, progress));
    Console.Read();
}

static Task<int> SaveFileAsync(string path, byte[] fileBytes, CancellationToken cancellationToken, IProgress<string> progress)
{
    if (cancellationToken.IsCancellationRequested)
    {
        progress.Report("Cancellation is called");
        Console.WriteLine("Cancellation is requested...");
    }

    progress.Report("Saving File");
    File.WriteAllBytes(path, fileBytes);
    progress.Report("File Saved");
    return Task.FromResult<int>(0);
}

Implementing TAP using compilers

Any method that is attributed with the async keyword (for C#) or Async (for Visual Basic) is called an asynchronous method. The async keyword can be applied to a method, anonymous method, or lambda expression, and the language compiler can execute that task asynchronously.
Here is a simple implementation of a TAP method using the compiler approach:

static void Main(string[] args)
{
    var t = ExecuteLongRunningOperationAsync(100000);
    Console.WriteLine("Called ExecuteLongRunningOperationAsync method, now waiting for it to complete");
    t.Wait();
    Console.Read();
}

public static async Task<int> ExecuteLongRunningOperationAsync(int millis)
{
    Task t = Task.Factory.StartNew(() => RunLoopAsync(millis));
    await t;
    Console.WriteLine("Executed RunLoopAsync method");
    return 0;
}

public static void RunLoopAsync(int millis)
{
    Console.WriteLine("Inside RunLoopAsync method");
    for (int i = 0; i < millis; i++)
    {
        Debug.WriteLine($"Counter = {i}");
    }
    Console.WriteLine("Exiting RunLoopAsync method");
}

In the preceding code, we have the ExecuteLongRunningOperationAsync method, which is implemented as per the compiler approach. It calls RunLoopAsync, which executes a loop for the number of iterations passed in the parameter. The async keyword on the ExecuteLongRunningOperationAsync method tells the compiler that this method has to be executed asynchronously, and, once the await statement is reached, the method returns to the Main method, which writes a line on the console and waits for the task to complete. Once RunLoopAsync has executed, the control comes back to await and starts executing the next statements in the ExecuteLongRunningOperationAsync method.

Implementing TAP with greater control over Task

As we know, TPL is centered on the Task and Task<TResult> objects. We can execute an asynchronous task by calling the Task.Run method to execute a delegate method or a block of code asynchronously, and use Wait or other methods on that task. However, this approach is not always adequate, and there are scenarios where we may have different approaches to executing asynchronous operations, such as an Event-based Asynchronous Pattern (EAP) or an Asynchronous Programming Model (APM). To implement TAP principles here, and to get the same control over asynchronous operations executing with different models, we can use the TaskCompletionSource<TResult> object.

The TaskCompletionSource<TResult> object is used to create a task that executes an asynchronous operation. When the asynchronous operation completes, we can use the TaskCompletionSource<TResult> object to set the result, exception, or state of the task. Here is a basic example that executes the ExecuteTask method, which returns a Task; the ExecuteTask method uses the TaskCompletionSource<TResult> object to wrap the response as a Task and executes ExecuteLongRunningTask through the Task.Factory.StartNew method:

static void Main(string[] args)
{
    var t = ExecuteTask();
    t.Wait();
    Console.Read();
}

public static Task<int> ExecuteTask()
{
    var tcs = new TaskCompletionSource<int>();
    Task<int> t1 = tcs.Task;
    Task.Factory.StartNew(() =>
    {
        try
        {
            ExecuteLongRunningTask(10000);
            tcs.SetResult(1);
        }
        catch (Exception ex)
        {
            tcs.SetException(ex);
        }
    });
    return tcs.Task;
}

public static void ExecuteLongRunningTask(int millis)
{
    Thread.Sleep(millis);
    Console.WriteLine("Executed");
}

So now, we've been able to use TPL and TAP over traditional threads, thus improving performance. If you liked this article and would like to learn more such techniques, pick up this book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan.

Get to know ASP.NET Core Web API [Tutorial]
.NET Core completes move to the new compiler – RyuJIT
Applying Single Responsibility principle from SOLID in .NET Core

Getting started with F# for .Net Core application development [Tutorial]

Aaron Lazar
16 Aug 2018
17 min read
F# is Microsoft's functional-first programming language that can be used along with the .NET Core framework. In this article, we will get introduced to F# so we can leverage .NET Core for our application development. This article is extracted from the book, .NET Core 2.0 By Example, written by Rishabh Verma and Neha Shrivastava.

Basics of classes

Classes are types of object which can contain functions, properties, and events. An F# class must have a parameter and a function attached like a member. Both properties and functions can use the member keyword. The following is the class definition syntax:

type [access-modifier] type-name [type-params] [access-modifier] (parameter-list) [ as identifier ] =
    [ class ]
    [ inherit base-type-name(base-constructor-args) ]
    [ let-bindings ]
    [ do-bindings ]
    member-list
    [ end ]

// Mutually recursive class definitions:
type [access-modifier] type-name1 ...
and [access-modifier] type-name2 ...

Let's discuss the preceding syntax for class declaration:

type: In the F# language, a class definition starts with the type keyword.
access-modifier: The F# language supports three access modifiers: public, private, and internal. By default, it considers the public modifier if no other access modifier is provided. The protected keyword is not used in the F# language, because it would push the design towards object-oriented rather than functional programming. For example, F# usually calls a member using a lambda expression, and if we make a member type protected and call an object of a different instance, it will not work.
type-name: It is any of the previously mentioned valid identifiers; the default access modifier is public.
type-params: It defines optional generic type parameters.
parameter-list: It defines constructor parameters; the default access modifier for the primary constructor is public.
identifier: It is used with the optional as keyword; the as keyword gives a name to an instance variable which can be used in the type definition to refer to the instance of the type.
inherit: This keyword allows us to specify the base class for a class.
let-bindings: This is used to declare fields or function values in the context of a class.
do-bindings: This is useful for the execution of code to create an object.
member-list: The member-list comprises extra constructors, instance and static method declarations, abstract bindings, interface declarations, and event and property declarations.

Here is an example of a class:

type StudentName(firstName, lastName) =
    member this.FirstName = firstName
    member this.LastName = lastName

In the previous example, we have not defined the parameter type. By default, the program considers it as a string value, but we can explicitly define a data type, as follows:

type StudentName(firstName:string, lastName:string) =
    member this.FirstName = firstName
    member this.LastName = lastName

Constructor of a class

In F#, the constructor works in a different way to other .NET languages. The constructor creates an instance of a class. A parameter list defines the arguments of the primary constructor and class. The constructor contains let and do bindings, which we will discuss next. We can add multiple constructors, apart from the primary constructor, using the new keyword, and they must invoke the primary constructor, which is defined with the class declaration. The syntax for defining a new constructor is as shown:

new (argument-list) = constructor-body

Here is an example to explain the concept.
In the following code, the StudentDetail class has two constructors: a primary constructor that takes two arguments and another constructor that takes no arguments:

type StudentDetail(x: int, y: int) =
    do printfn "%d %d" x y
    new() = StudentDetail(0, 0)

A let and do binding

A let and do binding creates the primary constructor of a class and runs when an instance of a class is created. A function is compiled into a member if it has a let binding. If the let binding is a value which is not used in any function or member, then it is compiled into a local variable of the constructor; otherwise, it is compiled into a field of the class. The do expression executes the initialization code. As any extra constructors always call the primary constructor, let and do bindings always execute, irrespective of which constructor is called. Fields that are created by let bindings can be accessed through the methods and properties of the class, though they cannot be accessed from static methods, even if the static methods take an instance variable as a parameter:

type Student(name) as self =
    let data = name
    do self.PrintMessage()
    member this.PrintMessage() =
        printf " Student name is %s" data

Generic type parameters

F# also supports generic type parameters. We can specify multiple generic type parameters separated by commas. The syntax of a generic parameter declaration is as follows:

type MyGenericClassExample<'a> (x: 'a) =
    do printfn "%A" x

The type of the parameter is inferred from where it is used. In the following code, we create a MyGenericClassExample instance and pass a sequence of tuples, so here the parameter type becomes a sequence of tuples:

let g1 = MyGenericClassExample( seq { for i in 1 .. 10 -> (i, i*i) } )

Properties

Values related to an object are represented by properties. In object-oriented programming, properties represent data associated with an instance of an object. The following snippet shows two types of property syntax:

// Property that has both get and set defined.
[ attributes ]
[ static ] member [accessibility-modifier] [self-identifier.]PropertyName
    with [accessibility-modifier] get() =
        get-function-body
    and [accessibility-modifier] set parameter =
        set-function-body

// Alternative syntax for a property that has get and set.
[ attributes-for-get ]
[ static ] member [accessibility-modifier-for-get] [self-identifier.]PropertyName =
    get-function-body
[ attributes-for-set ]
[ static ] member [accessibility-modifier-for-set] [self-identifier.]PropertyName
    with set parameter =
        set-function-body

There are two kinds of property declaration:

Explicitly specify the value: We should use the explicit way to implement the property if it has a non-trivial implementation. We should use the member keyword for an explicit property declaration.
Automatically generate the value: We should use this when the property is just a simple wrapper for a value.

There are many ways of implementing an explicit property, based on need:

Read-only: Only the get() method
Write-only: Only the set() method
Read/write: Both get() and set() methods

An example is shown as follows:

// A read-only property.
member this.MyReadOnlyProperty = myInternalValue

// A write-only property.
member this.MyWriteOnlyProperty with set (value) = myInternalValue <- value

// A read-write property.
member this.MyReadWriteProperty
    with get () = myInternalValue
    and set (value) = myInternalValue <- value

Backing stores are private values that contain the data for properties.
The member val keyword instructs the compiler to create a backing store automatically and then takes an expression to initialize the property. The F# language favors immutable types, but if we want to make a property mutable, we should use get and set. As shown in the following example, the MyClassExample class has two properties: propExample1 is read-only and is initialized to the argument provided to the primary constructor, and propExample2 is a settable property initialized with the string value ".Net Core 2.0":

type MyClassExample(propExample1 : int) =
    member val propExample1 = propExample1
    member val propExample2 = ".Net Core 2.0" with get, set

Automatically implemented properties don't work efficiently with some libraries, for example, Entity Framework. In these cases, we should use explicit properties.

Static and instance properties

Properties can be further categorized as static or instance properties. Static properties, as the name suggests, can be invoked without any instance. The self-identifier is omitted for a static property, while it is necessary for an instance property. The following is an example of a static property:

static member MyStaticProperty
    with get() = myStaticValue
    and set(value) = myStaticValue <- value

Abstract properties

Abstract properties have no implementation and are fully abstract. They can be virtual. They should not be private, and if one accessor is abstract, all the others must be abstract as well. The following is an example of an abstract property and how to use it:

// Abstract property in abstract class.
// The property is an int type that has a get and set method.
[<AbstractClass>]
type AbstractBase() =
    abstract Property1 : int with get, set

// Implementation of the abstract property
type Derived1() =
    inherit AbstractBase()
    let mutable value = 10
    override this.Property1
        with get() = value
        and set(v : int) = value <- v

// A type with a "virtual" property.
type Base1() =
    let mutable value = 10
    abstract Property1 : int with get, set
    default this.Property1
        with get() = value
        and set(v : int) = value <- v

// A derived type that overrides the virtual property
type Derived2() =
    inherit Base1()
    let mutable value2 = 11
    override this.Property1
        with get() = value2
        and set(v) = value2 <- v

Inheritance and casts

In F#, the inherit keyword is used while declaring a class. The following is the syntax:

type MyDerived(...) =
    inherit MyBase(...)

In a derived class, we can access all methods and members of the base class, as long as they are not private. To refer to base class instances in the F# language, the base keyword is used.

Virtual methods and overrides

In F#, the abstract keyword is used to declare a virtual member, and the default keyword provides its default implementation, so the complete definition of a virtual member is written with abstract plus default. In this respect, F# differs from other .NET languages. Let's have a look at the following example:

type MyClassExampleBase() =
    let mutable x = 0
    abstract member virtualMethodExample : int -> int
    default u.virtualMethodExample (a : int) = x <- x + a; x

type MyClassExampleDerived() =
    inherit MyClassExampleBase()
    override u.virtualMethodExample (a: int) = a + 1

In the previous example, we declared a virtual method, virtualMethodExample, in a base class, MyClassExampleBase, and overrode it in a derived class, MyClassExampleDerived.

Constructors and inheritance

An inherited class constructor must be called in a derived class. If a base class constructor contains some arguments, then it takes parameters of the derived class as input.
In the following example, we will see how derived class arguments are passed to the base class constructor with inheritance:

type MyClassBase2(x: int) =
    let mutable z = x * x
    do for i in 1..z do printf "%d " i

type MyClassDerived2(y: int) =
    inherit MyClassBase2(y * 2)
    do for i in 1..y do printf "%d " i

If a class has multiple constructors, such as new(str) or new(), and this class is inherited in a derived class, we can use a base class constructor to assign values. For example, DerivedClass, which inherits BaseClass, has new(str1, str2), and in place of the first string, we pass inherit BaseClass(str1). Similarly, for the parameterless case, we write inherit BaseClass(). Let's explore the following example in more detail:

type BaseClass =
    val string1 : string
    new (str) = { string1 = str }
    new () = { string1 = "" }

type DerivedClass =
    inherit BaseClass
    val string2 : string
    new (str1, str2) = { inherit BaseClass(str1); string2 = str2 }
    new (str2) = { inherit BaseClass(); string2 = str2 }

let obj1 = DerivedClass("A", "B")
let obj2 = DerivedClass("A")

Functions and lambda expressions

A lambda expression is a kind of anonymous function, which means it doesn't have a name attached to it. If we want to create a function which can be called, we can use the fun keyword with a lambda expression. We can pass input parameters to the lambda function created using the fun keyword. Such a function is quite similar to a normal F# function. Let's see a normal F# function and a lambda function:

// Normal F# function
let addNumbers a b = a + b
// Evaluating values
let sumResult = addNumbers 5 6

// Lambda function and evaluating values
let sumResult = (fun (a:int) (b:int) -> a + b) 5 6

// Both functions will return sumResult = 11

Handling data – tuples, lists, record types, and data manipulation

F# supports many data types, for example:

Primitive types: bool, int, float, string
Aggregate types: class, struct, union, record, and enum
Arrays: int[], int[ , ], and float[ , , ]
Tuples: type1 * type2 * ..., for example (a,1,2,true) is of type char * int * int * bool
Generics: list<'x>, dictionary<'key, 'value>

In an F# function, we can pass one tuple instead of multiple parameters of different types. Declaration of a tuple is very simple, and we can assign the values of a tuple to different variables, for example:

let tuple1 = 1,2,3
// assigning values to variables: v1=1, v2=2, v3=3
let v1,v2,v3 = tuple1
// if we want to assign only two values out of three, use "_" to skip the value.
// Assigned values: v1=1, v3=3
let v1,_,v3 = tuple1

As the preceding examples show, tuples support pattern matching. F# also has option types; an option type captures the idea that a value may or may not be present at runtime.

List

List is a generic type implementation. An F# list is similar to a linked list implementation in any other functional language. It has a special opening and closing bracket construct, and a short form for the standard empty list ([ ]) syntax:

let empty = []              // an empty list of generic type; its type is 'a list
let intList = [10;20;30;40] // an integer list

The cons operator, a double colon (::), is used to prepend an item to a list.
To append one list to another, we use the append operator, @:

// prepend item x onto the list xs
let addItem xs x = x :: xs
// add item 50 to the intList above; the final result is [50;10;20;30;40]
let newIntList = addItem intList 50

// using @ to append two lists
printfn "%A" (["hi"; "team"] @ ["how";"are";"you"])
// result – ["hi"; "team"; "how"; "are"; "you"]

Lists are decomposable using pattern matching into a head and a tail part, where the head is the first item in the list and the tail is the remaining list, for example:

printfn "%A" newIntList.Head
printfn "%A" newIntList.Tail
printfn "%A" newIntList.Tail.Tail.Head
let rec listLength (l: 'a list) =
    if l.IsEmpty then 0
    else 1 + (listLength l.Tail)
printfn "%d" (listLength newIntList)

Record type

The class, struct, union, record, and enum types come under aggregate types. The record type is one of them; it can have any number of members of any individual type. Record type members are immutable by default, but we can make them mutable. In general, a record type uses its members as immutable data. There is no way to execute logic during instantiation, as record types don't have constructors. A record type also supports match expressions, depending on the values inside those records, and it can decompose those values for individual handling, for example:

type Box = { width: float; height: int }
let giftbox = { width = 6.2; height = 3 }

In the previous example, we declared a Box with a float width and an integer height. When we declare giftbox, the compiler automatically detects its type as Box by matching the value types. We can also specify the type like this:

let giftbox = { Box.width = 6.2; Box.height = 3 }
// or
let giftbox : Box = { width = 6.2; height = 3 }

This kind of type declaration is used when we have the same fields or field types declared in more than one type. This declaration is called a record expression.

Object-oriented programming in F#

F# also supports implementation inheritance, the creation of objects, and interface instances. In F#, constructed types are fully compatible .NET classes which support one or more constructors. We can implement a do block with code logic, which can run at the time of class instance creation. The constructed type supports inheritance for class hierarchy creation. We use the inherit keyword to inherit a class. If a member doesn't have an implementation, we can use the abstract keyword for its declaration. We need to use the AbstractClass attribute on the class to inform the compiler that it is abstract. If the AbstractClass attribute is not used and the type has only abstract members, the F# compiler automatically creates an interface type; the interface is inferred by the compiler.

The override keyword is used to override the base class implementation; to use the base class implementation of the same member, we use the base keyword. In F#, interfaces can be inherited from another interface. In a class, if we use the interface construct, we have to implement all the members of the interface in that class as well. In general, it is not possible to use interface members from outside the class instance, unless we upcast the instance type to the required interface type. To create an instance of a class or interface, the object expression syntax is used.
We need to override virtual members if we are creating a class instance, and we need member implementations for interface instantiation:

type IExampleInterface =
    abstract member IntValue: int with get
    abstract member HelloString: unit -> string

type PrintValues() =
    interface IExampleInterface with
        member x.IntValue = 15
        member x.HelloString() =
            sprintf "Hello friends %d" (x :> IExampleInterface).IntValue

let example =
    let varValue = PrintValues() :> IExampleInterface
    { new IExampleInterface with
        member x.IntValue = varValue.IntValue
        member x.HelloString() = sprintf "<b>%s</b>" (varValue.HelloString()) }

printfn "%A" (example.HelloString())

Exception handling

The exception keyword is used to create a custom exception in F#; these exceptions adhere to Microsoft best practices, such as supplied constructors, serialization support, and so on. The raise keyword is used to throw an exception. Apart from this, F# has some helper functions, such as failwith, which throws a failure exception at F# runtime, and invalidOp and invalidArg, which throw the .NET Framework standard invalid operation and invalid argument exceptions, respectively.

try/with is used to catch an exception; if an exception occurs on an expression or while evaluating a value, then the try/with expression can be used on the right side of the value evaluation to assign the value back to some other value. try/with also supports pattern matching to check an individual exception type and extract items from it. try/finally expression handling depends on the actual code block. Let's take an example of declaring and using a custom exception:

exception MyCustomExceptionExample of int * string

raise (MyCustomExceptionExample(10, "Error!"))

In the previous example, we created a custom exception called MyCustomExceptionExample, using the exception keyword, with the value fields which we want to pass. Then we used the raise keyword to raise the exception, passing the values which we want to display while running the application or throwing the exception. However, when running this code, we don't get our custom message in the error output; the standard exception message is displayed instead, without the message that we passed.

In order to display our custom error message, we need to override the standard Message property on the exception type. We will use a pattern matching assignment to get the two values and upcast the actual type, due to the internal representation of the exception object. If we run this program again, we will get the custom message in the exception:

exception MyCustomExceptionExample of int * string
    with
        override x.Message =
            let (MyCustomExceptionExample(i, s)) = upcast x
            sprintf "Int: %d Str: %s" i s

raise (MyCustomExceptionExample(20, "MyCustomErrorMessage!"))

Now the exception message contains our custom message, with the integer and string values included in the output.

We can also use the helper function failwith to raise a failure exception; it includes our message as the error message, as follows:

failwith "An error has occurred"

An example of the invalidArg helper function follows. In this factorial function, we are checking that the value of x is greater than zero.
For cases where x is less than 0, we call invalidArg, passing "x" as the name of the invalid parameter and an error message saying the value should be greater than zero. The invalidArg helper function throws an invalid argument exception from the standard System namespace in .NET:

let rec factorial x =
    if x < 0 then invalidArg "x" "Value should be greater than zero"
    match x with
    | 0 -> 1
    | _ -> x * (factorial (x - 1))

By now, you should be pretty familiar with the F# programming language and ready to use it in your application development alongside C#. If you found this tutorial helpful and you're interested in learning more, head over to the book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

.NET Core completes move to the new compiler – RyuJIT
Applying Single Responsibility principle from SOLID in .NET Core
Unit Testing in .NET Core with Visual Studio 2017 for better code quality

Multithreading in Rust using Crates [Tutorial]

Aaron Lazar
15 Aug 2018
17 min read
The crates.io ecosystem in Rust offers approaches to improve our development speed as well as the performance of our code. In this tutorial, we'll learn how to use the crates ecosystem to manipulate threads in Rust. This article is an extract from Rust High Performance, authored by Iban Eguia Moraza.

Using non-blocking data structures

One of the issues we saw earlier was that if we wanted to share something more complex than an integer or a Boolean between threads, and if we wanted to mutate it, we needed to use a Mutex. This is not entirely true, since one crate, Crossbeam, allows us to use great data structures that do not require locking a Mutex. They are therefore much faster and more efficient.

Often, when we want to share information between threads, it's usually a list of tasks that we want to work on cooperatively. Other times, we want to create information in multiple threads and add it to a list of information. It's therefore not so usual for multiple threads to be working with exactly the same variables, since as we have seen, that requires synchronization and it will be slow. This is where Crossbeam shows all its potential.

Crossbeam gives us some multithreaded queues and stacks, where we can insert data and consume data from different threads. We can, in fact, have some threads doing an initial processing of the data and others performing a second phase of the processing. Let's see how we can use these features. First, add crossbeam to the dependencies of the crate in the Cargo.toml file. Then, we start with a simple example:

extern crate crossbeam;

use std::thread;
use std::sync::Arc;
use crossbeam::sync::MsQueue;

fn main() {
    let queue = Arc::new(MsQueue::new());

    let handles: Vec<_> = (1..6)
        .map(|_| {
            let t_queue = queue.clone();
            thread::spawn(move || {
                for _ in 0..1_000_000 {
                    t_queue.push(10);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let final_queue = Arc::try_unwrap(queue).unwrap();
    let mut sum = 0;
    while let Some(i) = final_queue.try_pop() {
        sum += i;
    }

    println!("Final sum: {}", sum);
}

Let's first understand what this example does. It will iterate 1,000,000 times in 5 different threads, and each time it will push a 10 to a queue. Queues are FIFO lists: first input, first output. This means that the first number entered will be the first one to pop() and the last one will be the last to do so. In this case, all of them are a 10, so it doesn't matter.

Once the threads finish populating the queue, we iterate over it and we add all the numbers. A simple computation should make you able to guess that if everything goes perfectly, the final number should be 50,000,000. If you run it, that will be the result, and that's not all. If you run it by executing cargo run --release, it will run blazingly fast. On my computer, it took about one second to complete. If you want, try to implement this code with the standard library Mutex and a vector, and you will see that the performance difference is amazing.

As you can see, we still needed to use an Arc to control the multiple references to the queue. This is needed because the queue itself cannot be duplicated and shared; it has no reference count.

Crossbeam not only gives us FIFO queues. We also have LIFO stacks. LIFO comes from last input, first output, and it means that the last element you inserted in the stack will be the first one to pop().
Let's see the difference with a couple of threads:

extern crate crossbeam;

use std::thread;
use std::sync::Arc;
use std::time::Duration;
use crossbeam::sync::{MsQueue, TreiberStack};

fn main() {
    let queue = Arc::new(MsQueue::new());
    let stack = Arc::new(TreiberStack::new());

    let in_queue = queue.clone();
    let in_stack = stack.clone();
    let in_handle = thread::spawn(move || {
        for i in 0..5 {
            in_queue.push(i);
            in_stack.push(i);
            println!("Pushed :D");
            thread::sleep(Duration::from_millis(50));
        }
    });

    let mut final_queue = Vec::new();
    let mut final_stack = Vec::new();

    let mut last_q_failed = 0;
    let mut last_s_failed = 0;

    loop {
        // Get the queue
        match queue.try_pop() {
            Some(i) => {
                final_queue.push(i);
                last_q_failed = 0;
                println!("Something in the queue! :)");
            }
            None => {
                println!("Nothing in the queue :(");
                last_q_failed += 1;
            }
        }

        // Get the stack
        match stack.try_pop() {
            Some(i) => {
                final_stack.push(i);
                last_s_failed = 0;
                println!("Something in the stack! :)");
            }
            None => {
                println!("Nothing in the stack :(");
                last_s_failed += 1;
            }
        }

        // Check if we finished
        if last_q_failed > 1 && last_s_failed > 1 {
            break;
        } else if last_q_failed > 0 || last_s_failed > 0 {
            thread::sleep(Duration::from_millis(100));
        }
    }

    in_handle.join().unwrap();

    println!("Queue: {:?}", final_queue);
    println!("Stack: {:?}", final_stack);
}

As you can see in the code, we have two shared variables: a queue and a stack. The secondary thread will push new values to each of them, in the same order, from 0 to 4. Then, the main thread will try to get them back. It will loop indefinitely and use the try_pop() method. The pop() method could be used, but it will block the thread if the queue or the stack is empty. This will happen in any case once all values get popped, since no new values are being added, so the try_pop() method helps not to block the main thread and to end gracefully.

The way it checks whether all the values were popped is by counting how many times it failed to pop a new value. Every time it fails, it waits for 100 milliseconds, while the push thread only waits for 50 milliseconds between pushes. This means that if it tries to pop new values two times and there are no new values, the pusher thread has already finished.

It adds values to two vectors as they are popped and then prints the result. In the meantime, it prints messages about pushing and popping new values. Note that the output can be different in your case, since threads don't need to be executed in any particular order.

In one example run, the main thread first tries to get something from the queue and the stack, but there is nothing there, so it sleeps. The second thread then starts pushing things, two numbers actually. After this, the queue and the stack will be [0, 1]. Then, it pops the first item from each of them. From the queue, it will pop the 0 and from the stack it will pop the 1 (the last one), leaving the queue as [1] and the stack as [0]. It will go back to sleep and the secondary thread will insert a 2 in each variable, leaving the queue as [1, 2] and the stack as [0, 2]. Then, the main thread will pop two elements from each of them. From the queue, it will pop the 1 and the 2, while from the stack it will pop the 2 and then the 0, leaving both empty. The main thread then goes to sleep, and for the next two tries, the secondary thread will push one element and the main thread will pop it, twice.
It might seem a little bit complex, but the idea is that these queues and stacks can be used efficiently between threads without requiring a Mutex, and they accept any Send type. This means that they are great for complex computations, and even for multi-staged complex computations.

The Crossbeam crate also has some helpers to deal with epochs and even some variants of the mentioned types. For multithreading, Crossbeam also adds a great utility: scoped threads.

Scoped threads

In all our examples, we have used standard library threads. As we have discussed, these threads have their own stack, so if we want to use variables that we created in the main thread, we will need to send them to the thread. This means that we will need to use things such as Arc to share non-mutable data. Not only that, having their own stack means that they will also consume more memory and eventually make the system slower if they use too much.

Crossbeam gives us some special threads that allow sharing stacks between them. They are called scoped threads. Using them is pretty simple and the crate documentation explains them perfectly; you will just need to create a Scope by calling crossbeam::scope(). You will need to pass a closure that receives the Scope. You can then call spawn() on that scope the same way you would do it in std::thread, but with one difference: you can share immutable variables among threads if they were created inside the scope or moved to it.

This means that for the queues or stacks we just talked about, or for atomic data, you can simply call their methods without requiring an Arc! This will improve the performance even further. Let's see how it works with a simple example:

extern crate crossbeam;

fn main() {
    let all_nums: Vec<_> = (0..1_000_u64).into_iter().collect();
    let mut results = Vec::new();

    crossbeam::scope(|scope| {
        for num in &all_nums {
            results.push(scope.spawn(move || num * num + num * 5 + 250));
        }
    });

    let final_result: u64 = results.into_iter().map(|res| res.join()).sum();
    println!("Final result: {}", final_result);
}

Let's see what this code does. It will first just create a vector with all the numbers from 0 to 1,000. Then, for each of them, in a crossbeam scope, it will run one scoped thread per number and perform a supposedly complex computation. This is just an example, since it will just return the result of a simple second-order function.

Interestingly enough, though, the scope.spawn() method allows returning a result of any type, which is great in our case. The code will add each result to a vector. This won't directly add the resulting number, since it will be executed in parallel. It will add a result guard, which we will be able to check outside the scope.

Then, after all the threads run and return the results, the scope will end. We can now check all the results, which are guaranteed to be ready for us. For each of them, we just need to call join() and we will get the result. Then, we sum them up to check that they are actual results from the computation.

This join() method can also be called inside the scope to get the results, but it will mean that if you do it inside the for loop, for example, you will block the loop until the result is generated, which is not efficient. The best thing is to at least run all the computations first and then start checking the results. If you want to perform more computations after them, you might find it useful to run the new computation in another loop or iterator inside the crossbeam scope.
But how does crossbeam allow you to use the variables outside the scope freely? Won't there be data races? Here is where the magic happens. The scope will join all the inner threads before exiting, which means that no further code will be executed in the main thread until all the scoped threads finish. This means that we can use the variables of the main thread, also called the parent stack (the main thread being the parent of the scope in this case), without any issue.

We can actually check what is happening by using the println!() macro. If we remember from previous examples, printing to the console after spawning some threads would usually run even before the spawned threads, due to the time it takes to set them up. In this case, since we have crossbeam preventing it, we won't see that. Let's check the example:

extern crate crossbeam;

fn main() {
    let all_nums: Vec<_> = (0..10).into_iter().collect();

    crossbeam::scope(|scope| {
        for num in all_nums {
            scope.spawn(move || {
                println!("Next number is {}", num);
            });
        }
    });

    println!("Main thread continues :)");
}

If you run this code, you will see that the scoped threads run without any particular order. In one run it might first print the 1, then the 0, then the 2, and so on; your output will probably be different. The interesting thing, though, is that the main thread won't continue executing until all the threads have finished. Therefore, reading and modifying variables in the main thread is perfectly safe.

There are two main performance advantages with this approach. Arc will require a call to malloc() to allocate memory in the heap, which will take time if it's a big structure and the memory is a bit full. Interestingly enough, that data is already in our stack, so if possible, we should try to avoid duplicating it in the heap. Moreover, the Arc will have a reference counter, as we saw, and it will even be an atomic reference counter, which means that every time we clone the reference, we will need to atomically increment the count. This takes time, even more than incrementing simple integers.

Most of the time, we might be waiting for some expensive computations to run, and it would be great if they just gave all the results when finished. We can still add some more chained computations, using scoped threads, that will only be executed after the first ones finish, so we should use scoped threads more often than normal threads, if possible.

Using thread pool

So far, we have seen multiple ways of creating new threads and sharing information between them. Nevertheless, the ideal number of threads we should spawn to do all the work should be around the number of virtual processors in the system. This means we should not spawn one thread for each chunk of work. Nevertheless, controlling what work each thread does can be complex, since you have to make sure that all threads have work to do at any given point in time.

Here is where thread pooling comes in handy. The Threadpool crate will enable you to iterate over all your work and, for each of your small chunks, you can call something similar to a thread::spawn(). The interesting thing is that each task will be assigned to an idle thread, and no new thread will be created for each task. The number of threads is configurable and you can get the number of CPUs with other crates. Not only that, if one of the threads panics, it will automatically add a new one to the pool.
To see an example, first let's add threadpool and num_cpus as dependencies in our Cargo.toml file. Then, let's see an example code:

extern crate num_cpus;
extern crate threadpool;

use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::with_name("my worker".to_owned(), num_cpus::get());
    println!("Pool threads: {}", pool.max_count());

    let result = Arc::new(AtomicUsize::new(0));

    for i in 0..1_000_000 {
        let t_result = result.clone();
        pool.execute(move || {
            t_result.fetch_add(i, Ordering::Relaxed);
        });
    }

    pool.join();

    let final_res = Arc::try_unwrap(result).unwrap().into_inner();
    println!("Final result: {}", final_res);
}

This code will create a thread pool with as many threads as the number of logical CPUs in your computer. Then, it will add each number from 0 to 1,000,000 to an atomic usize, just to test parallel processing. Each addition will be performed by one thread. Doing this with one thread per operation (1,000,000 threads) would be really inefficient. In this case, though, it will use the appropriate number of threads, and the execution will be really fast.

There is another crate that gives thread pools an even more interesting parallel processing feature: Rayon.

Using parallel iterators

If you can see the big picture in these code examples, you'll have realized that most of the parallel work has a long loop, giving work to different threads. It happened with simple threads and it happens even more with scoped threads and thread pools. It's usually the case in real life, too. You might have a bunch of data to process, and you can probably separate that processing into chunks, iterate over them, and hand them over to various threads to do the work for you.

The main issue with that approach is that if you need to use multiple stages to process a given piece of data, you might end up with lots of boilerplate code that can make it difficult to maintain. Not only that, you might find yourself not using parallel processing sometimes due to the hassle of having to write all that code.

Luckily, Rayon has multiple data parallelism primitives around iterators that you can use to parallelize any iterative computation. You can almost forget about the Iterator trait and use Rayon's ParallelIterator alternative, which is as easy to use as the standard library trait!

Rayon uses a parallel iteration technique called work stealing. For each iteration of the parallel iterator, the new value or values get added to a queue of pending work. Then, when a thread finishes its work, it checks whether there is any pending work to do and, if there is, it starts processing it. This, in most languages, is a clear source of data races, but thanks to Rust, this is no longer an issue, and your algorithms can run extremely fast and in parallel.

Let's look at how to use it for an example similar to those we have seen in this chapter. First, add rayon to your Cargo.toml file and then let's start with the code:

extern crate rayon;

use rayon::prelude::*;

fn main() {
    let result = (0..1_000_000_u64)
        .into_par_iter()
        .map(|e| e * 2)
        .sum::<u64>();

    println!("Result: {}", result);
}

As you can see, this works just as you would write it with a sequential iterator, yet it's running in parallel.
Of course, running this example sequentially will be faster than running it in parallel, thanks to compiler optimizations, but when you need to process data from files, for example, or perform very complex mathematical computations, parallelizing the input can give great performance gains.

Rayon implements these parallel iteration traits for all standard library iterators and ranges. Not only that, it can also work with standard library collections, such as HashMap and Vec. In most cases, if you are using the iter() or into_iter() methods from the standard library in your code, you can simply use par_iter() or into_par_iter() in those calls and your code should now be parallel and work perfectly.

But beware: sometimes parallelizing something doesn't automatically improve its performance. Take into account that if you need to update some shared information between the threads, they will need to synchronize somehow, and you will lose performance. Therefore, multithreading is only great if workloads are completely independent and you can execute one without any dependency on the rest.

If you found this article useful and would like to learn more such tips, head over to pick up this book, Rust High Performance, authored by Iban Eguia Moraza.

Rust 1.28 is here with global allocators, nonZero types and more
Java Multithreading: How to synchronize threads to implement critical sections and avoid race conditions
Multithreading with Qt

Understanding functional reactive programming in Scala [Tutorial]

Fatema Patrawala
15 Aug 2018
6 min read
Like OOP (Object-Oriented Programming), Functional Programming (FP) is a programming paradigm. It is a programming style in which we write programs in terms of pure functions and immutable data, and it treats program execution as the evaluation of functions. As we use pure functions and immutable data to write our applications, we get lots of benefits for free. For instance, with immutable data, we do not need to worry about shared mutable state, side effects, and thread safety. It follows a declarative programming style, which means programming is done in terms of expressions, not statements. In OOP or imperative programming paradigms, we use statements to write programs, whereas FP treats everything as an expression.

In this Scala functional programming tutorial we will understand the principles and benefits of FP and why functional reactive programming is the best fit for Reactive programming in Scala. This Scala tutorial is an extract taken from the book Scala Reactive Programming written by Rambabu Posa.

Principles of functional programming

FP has the following principles:

Pure functions
Immutable data
No side effects
Referential transparency (RT)
Functions are first-class citizens
Functions, including anonymous functions, higher-order functions, combinators, partial functions, partially-applied functions, function currying, and closures
Tail recursion
Function composability

A pure function is a function that always returns the same result for the same inputs, irrespective of how many times and where you run it. We get lots of benefits with immutable data; for instance, no shared data, no side effects, and thread safety for free. Just as an object is a first-class citizen in OOP, in FP a function is a first-class citizen. This means that we can use a function as any of these:

An object
A value
Data
A data type
An operation

In simple words, in FP we treat both functions and data as the same. We can compose functions in sequential order so that we can solve even complex problems easily.

Higher-Order Functions (HOF) are functions that take one or more functions as their parameters, return a function as their result, or do both. For instance, map(), flatMap(), and filter() are some of the important and frequently used higher-order functions. Consider the following example:

map(x => x*x)

Here, the map() function is an example of a higher-order function because it takes an anonymous function as its parameter. This anonymous function, x => x * x, is of type Int => Int; it takes an Int as input and returns an Int as its result. An anonymous function is a function without any name.

Benefits of functional programming

FP provides us with many benefits:

Thread-safe code
Easy-to-write concurrent and parallel code
We can write simple, readable, and elegant code
Type safety
Composability
Supports declarative programming

As we use pure functions and immutability in FP, we get thread safety for free. One of the greatest benefits of FP is function composability. We can compose multiple functions one by one and execute them either sequentially or in parallel. This gives us a great approach to solving even complex problems easily.
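Before moving on to Reactive programming, here is a small, self-contained Scala sketch of the FP building blocks just described—pure functions, a higher-order function, and composition. The function and value names are purely illustrative and not taken from the book:

object FpBasics extends App {
  // Pure functions: the same input always produces the same output, with no side effects.
  val double: Int => Int = x => x * 2
  val addTen: Int => Int = x => x + 10

  // Function composition: build a new function out of existing ones.
  val doubleThenAddTen: Int => Int = double andThen addTen

  // Higher-order functions (map, filter) operating on immutable data.
  val numbers = List(1, 2, 3, 4, 5)
  val transformed = numbers.map(doubleThenAddTen).filter(_ % 2 == 0)

  println(doubleThenAddTen(5)) // 20
  println(transformed)         // List(12, 14, 16, 18, 20)
}

Because double and addTen are pure and the List is immutable, doubleThenAddTen and the map/filter pipeline are safe to run from any thread without synchronization, which is exactly the property FP brings to Reactive systems.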
Functional Reactive programming

The combination of FP and RP is known as functional Reactive programming or, for short, FRP. It is a multi-paradigm approach that combines the benefits and best features of two of the most popular programming paradigms: FP and RP.

FRP is a programming style that uses the RP paradigm to support asynchronous, non-blocking data streaming with backpressure, and also uses the FP paradigm to utilize its features (such as pure functions, immutability, no side effects, and RT) and its HOFs or combinators (such as map, flatMap, filter, reduce, fold, and zip). In simple words, FRP is a programming paradigm that supports RP using FP features and its building blocks. In short, FRP = FP + RP.

Today, we have many FRP solutions, frameworks, tools, and technologies. Here's a list of a few FRP technologies:

Scala, Play Framework, and Akka Toolkit
RxJS
Reactive-banana
Reactive
Sodium
Haskell

This book is dedicated to discussing Lightbend's FRP technology stack—Lagom Framework, Scala, Play Framework, and Akka Toolkit (Akka Streams). FRP technologies are mainly useful in developing interactive programs, such as rich GUIs (graphical user interfaces), animations, multiplayer games, computer music, or robot controllers.

Types of Reactive Programming

Even though most projects or companies use the FP paradigm to develop their Reactive systems or solutions, there are a couple of ways to use RP. They are known as the types of RP:

FRP (Functional Reactive Programming)
OORP (Object-Oriented Reactive Programming)

However, FP is the best programming paradigm to conflate with RP; we get all the benefits of FP for free.

Why FP is the best fit for RP

When we conflate RP with FP, we get the following benefits:

Composability—we can compose multiple data streams using functional operations so that we can solve even complex problems easily
Thread safety
Readability
Simple, concise, clear, and easy-to-understand code
Easy-to-write asynchronous, concurrent, and parallel code
Supports very flexible and easy-to-use operations
Supports declarative programming
Easy-to-write, more scalable, highly available, and robust code

In FP, we concentrate on what to do to fulfill a job, whereas in other programming paradigms, such as OOP or imperative programming (IP), we concentrate on how to do it. Declarative programming gives us the following benefits:

No side effects
Enforces the use of immutability
Easy to write concise and understandable code

The main property of RP is real-time data streaming, and the main property of FP is composability. If we combine these two paradigms, we get more benefits and can develop better solutions easily. In RP, everything is a stream, while in FP everything is a function. We can use these functions to perform operations on data streams.

We learnt the principles and benefits of Scala functional programming. To build fault-tolerant, robust, and distributed applications in Scala, grab the book Scala Reactive Programming today.

Introduction to the Functional Programming
Manipulating functions in functional programming
Why functional programming in Python matters: Interview with best selling author, Steven Lott

MongoDB Sharding: Sharding clusters and choosing the right shard key [Tutorial]

Fatema Patrawala
14 Aug 2018
9 min read
Sharding was one of the features that MongoDB offered from an early stage, since version 1.6 was released in August 2010. Sharding is the ability to horizontally scale out our database by partitioning our datasets across different servers—the shards. Foursquare and Bitly are two of the most famous early customers of MongoDB that were also using sharding from its inception all the way to the general availability release. In this article we will learn how to design a sharding cluster and how to make the single most important decision around it: choosing the shard key. This article is a MongoDB shard tutorial taken from the book Mastering MongoDB 3.x by Alex Giamas.

Sharding setup in MongoDB

Sharding is performed at the collection level. We can have collections that we don't want or need to shard for several reasons; we can leave these collections unsharded. These collections will be stored in the primary shard. The primary shard is different for each database in MongoDB. The primary shard is automatically selected by MongoDB when we create a new database in a sharded environment: MongoDB will pick the shard that has the least data stored at the moment of creation. If we want to change the primary shard at any other point, we can issue the following command:

> db.runCommand( { movePrimary : "mongo_books", to : "UK_based" } )

We thus move the database named mongo_books to the shard named UK_based.

Choosing the shard key

Choosing our shard key is the most important decision we need to make. The reason is that once we shard our data and deploy our cluster, it becomes very difficult to change the shard key. First, we will go through the process of changing the shard key.

Changing the shard key

There is no command or simple procedure to change the shard key in MongoDB. The only way to change the shard key involves backing up and restoring all of our data, something that may range from being extremely difficult to impossible in high-load production environments. The steps to change our shard key are as follows:

1. Export all data from MongoDB.
2. Drop the original sharded collection.
3. Configure sharding with the new key.
4. Presplit the new shard key range.
5. Restore our data back into MongoDB.

From these steps, step 4 is the one that needs some more explanation. MongoDB uses chunks to split data in a sharded collection. If we bootstrap a MongoDB sharded cluster from scratch, chunks will be calculated automatically by MongoDB. MongoDB will then distribute the chunks across different shards to ensure that there is an equal number of chunks in each shard. The only case in which we cannot really do this is when we want to load data into a newly sharded collection. The reasons are threefold:

MongoDB creates splits only after an insert operation.
Chunk migration will copy all of the data in that chunk from one shard to another.
Only floor(n/2) chunk migrations can happen at any given time, where n is the number of shards we have. Even with three shards, this is only floor(1.5) = 1 chunk migration at a time.

These three limitations combined mean that letting MongoDB figure it out on its own will definitely take much longer and may result in an eventual failure. This is why we want to presplit our data and give MongoDB some guidance on where our chunks should go.
Considering our example of the mongo_books database and the books collection, this would be:

> db.runCommand( { split : "mongo_books.books", middle : { id : 50 } } )

The middle command parameter will split our key space into documents that have id<=50 and documents that have id>50. There is no need for a document with id=50 to exist in our collection, as this will only serve as the guidance value for our partitions. In this example, we chose 50 assuming that our keys follow a uniform distribution (that is, the same count of keys for each value) in the range of values from 0 to 100.

We should aim to create at least 20-30 chunks to grant MongoDB flexibility in potential migrations. We can also use bounds and find instead of middle if we want to manually define the partition key, but both parameters need data to exist in our collection before applying them.

Choosing the correct shard key

After the previous section, it's now self-evident that we need to give great consideration to the choice of our shard key, as it is something that we have to stick with. A great shard key has three characteristics:

High cardinality
Low frequency
Non-monotonically changing in value

We will go over the definitions of these three properties first to understand what they mean.

High cardinality means that the shard key must have as many distinct values as possible. A Boolean can take only the values true/false, and so it is a bad shard key choice. A 64-bit long value field that can take any value from −(2^63) to 2^63 − 1 is a good example in terms of cardinality.

Low frequency directly relates to the argument about high cardinality. A low-frequency shard key will have a distribution of values as close as possible to a perfectly random/uniform distribution. Using the example of our 64-bit long value, it is of little use to us if we have a field that can take values ranging from −(2^63) to 2^63 − 1 only to end up observing the values of 0 and 1 all the time. In fact, it is as bad as using a Boolean field, which can also take only two values after all. If we have a shard key with high-frequency values, we will end up with chunks that are indivisible. These chunks cannot be further divided and will grow in size, negatively affecting the performance of the shard that contains them.

Non-monotonically changing values mean that our shard key should not be, for example, an integer that always increases with every new insert. If we choose a monotonically increasing value as our shard key, this will result in all writes ending up in the last of all of our shards, limiting our write performance. If we want to use a monotonically changing value as the shard key, we should consider using hash-based sharding. In the next section, we will describe different sharding strategies and their advantages and disadvantages.

Range-based sharding

The default and most widely used sharding strategy is range-based sharding. This strategy will split our collection's data into chunks, grouping documents with nearby values in the same shard. For our example database and collection, mongo_books and books respectively, we have:

> sh.shardCollection("mongo_books.books", { id: 1 } )

This creates a range-based shard key on id with ascending direction. The direction of our shard key will determine which documents will end up in the first shard and which ones in the subsequent ones. This is a good strategy if we plan to have range-based queries, as these will be directed to the shard that holds the result set instead of having to query all shards.
Hash-based sharding

If we don't have a shard key (or can't create one) that achieves the three goals mentioned previously, we can use the alternative strategy of hash-based sharding. In this case, we are trading data distribution for query isolation.

Hash-based sharding will take the values of our shard key and hash them in a way that guarantees close to uniform distribution. This way we can be sure that our data will distribute evenly across shards. The downside is that only exact match queries will get routed to the exact shard that holds the value. Any range query will have to go out and fetch data from all shards. For our example database and collection (mongo_books and books respectively), we have:

> sh.shardCollection("mongo_books.books", { id: "hashed" } )

Similar to the preceding example, we are now using the id field as our hashed shard key. Suppose we use fields with float values for hash-based sharding; then we will end up with collisions if the precision of our floats is more than 2^53. These fields should be avoided where possible.

Coming up with our own key

Range-based sharding does not need to be confined to a single key. In fact, in most cases, we would like to combine multiple keys to achieve high cardinality and low frequency. A common pattern is to combine a low-cardinality first part (but still having more distinct values than two times the number of shards that we have) with a high-cardinality key as its second field. This achieves both read and write distribution from the first part of the sharding key, and then cardinality and read locality from the second part.

On the other hand, if we don't have range queries, we can get away with using hash-based sharding on a primary key, as this will exactly target the shard and document that we are going after. To make things more complicated, these considerations may change depending on our workload. A workload that consists almost exclusively (say 99.5%) of reads won't care about write distribution. We can use the built-in _id field as our shard key and this will only add 0.5% load to the last shard. Our reads will still be distributed across shards. Unfortunately, in most cases, this is not simple.

Location-based data

Due to government regulations and the desire to have our data as close to our users as possible, there is often a constraint and need to limit data to a specific data center. By placing different shards at different data centers, we can satisfy this requirement.

To summarize, we learned about MongoDB sharding and got to know techniques to choose the correct shard key. Get the expert guide Mastering MongoDB 3.x today to build fault-tolerant MongoDB applications.

MongoDB 4.0 now generally available with support for multi-platform, mobile, ACID transactions and more
MongoDB going relational with 4.0 release
Indexing, Replicating, and Sharding in MongoDB [Tutorial]

Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 2

Guest Contributor
14 Aug 2018
8 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming and the Modern Golang Programming video courses.

We published Part 1 of this paper yesterday, and here we come up with Part 2, which covers containers and serverless applications. Let us get started.

Containers

The technology of software containers is the next key technology that needs to be discussed to practically explain cloud native applications. A container is simply the idea of encapsulating some software inside an isolated user space, or "container." For example, a MySQL database can be isolated inside a container where the environmental variables and the configurations that it needs will live. Software outside the container will not see the environmental variables or configuration contained inside the container by default. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server.

Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools, and accompanying files, on the same machine. In a cloud environment, this ability translates into saved costs and efforts, as the need for provisioning and buying server nodes for each microservice will diminish, since different microservices can be deployed on the same host without disrupting each other. Containers combined with microservices architectures are powerful tools to build modern, portable, scalable, and cost-efficient software. In a production environment, more than a single server node combined with numerous containers would be needed to achieve scalability and redundancy.

Containers also add more benefits to cloud native applications beyond microservices isolation. With a container, you can move your microservice, with all the configuration, dependencies, and environmental variables that it needs, to fresh server nodes without the need to reconfigure the environment, achieving powerful portability.

Due to the power and popularity of the software container technology, some new operating systems, like CoreOS or Photon OS, are built from the ground up to function as hosts for containers. One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google, and IBM utilize Docker containers in their infrastructure as well as in their products.

Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management, and scaling of containers. It was built by Google to facilitate the management of their containers, which are counted by billions per week. Kubernetes provides some powerful features such as load balancing between containers, restarts for failed containers, and orchestration of storage utilized by the containers. The project is part of the Cloud Native Computing Foundation, along with Prometheus.
Container complexities

In the case of containers, the task of managing them can sometimes get rather complex, for the same reasons as managing expanding numbers of microservices. As containers or microservices grow in number, there needs to be a mechanism to identify where each container or microservice is deployed, what its purpose is, and what resources it needs to keep running.

Serverless applications

Serverless architecture is a new software architectural paradigm that was popularized with the AWS Lambda service. In order to fully understand serverless applications, we must first cover an important concept known as 'Function as a Service', or FaaS for short. Function as a Service, or FaaS, is the idea that a cloud provider such as Amazon, or even a local piece of software such as Fission.io or funktion, would provide a service where a user can request a function to run remotely in order to perform a very specific task; after the function concludes, the function results return back to the user. No services or stateful data are maintained, and the function code is provided by the user to the service that runs the function.

The idea behind properly designed cloud native production applications that utilize the serverless architecture is that, instead of building multiple microservices expected to run continuously in order to carry out individual tasks, you build an application that has fewer microservices combined with FaaS, where FaaS covers tasks that don't need a service to run continuously. FaaS is a smaller construct than a microservice. For example, in the case of the event booking application we covered earlier, there were multiple microservices covering different tasks. If we use a serverless applications model, some of those microservices would be replaced with a number of functions that serve their purpose. Here is a diagram that showcases the application utilizing a serverless architecture:

In this diagram, the event handler microservice as well as the booking handler microservice were replaced with a number of functions that produce the same functionality. This eliminates the need to run and maintain the two existing microservices.

Serverless architectures have the advantage that no virtual machines and/or containers need to be provisioned to build the part of the application that utilizes FaaS. The computing instances that run the functions cease to exist from the user's point of view once their functions conclude. Furthermore, the number of microservices and/or containers that need to be monitored and maintained by the user decreases, saving cost, time, and effort. Serverless architectures provide yet another powerful software building tool in the hands of software engineers and architects to design flexible and scalable software. Well-known FaaS offerings are AWS Lambda by Amazon, Azure Functions by Microsoft, Cloud Functions by Google, and many more.

Another definition for serverless applications is applications that utilize the BaaS, or backend as a service, paradigm. BaaS is the idea that developers only write the client code of their application, which then relies on several pre-built software services hosted in the cloud, accessible via APIs. BaaS is popular in mobile app programming, where developers rely on a number of backend services to drive the majority of the functionality of the application. Examples of BaaS services are Firebase and Parse.
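To make the FaaS idea above a little more concrete, here is a minimal, hedged C# sketch of what a stateless booking-confirmation function might look like. The type and method names (BookingRequest, BookingResult, Handle) are purely illustrative and are not taken from the paper or from any provider SDK; a real deployment would wrap such a handler in the packaging and trigger attributes of the chosen platform (AWS Lambda for .NET, Azure Functions, and so on). The point is simply that the function performs one focused task and keeps no state of its own; anything that must survive the call would be written to an external store such as a database or Redis.

    using System;

    // Illustrative input/output types; a real function would use the event
    // payload format defined by the FaaS provider.
    public class BookingRequest
    {
        public string EventId { get; set; }
        public string CustomerEmail { get; set; }
        public int Seats { get; set; }
    }

    public class BookingResult
    {
        public bool Confirmed { get; set; }
        public string ConfirmationCode { get; set; }
    }

    public class ConfirmBookingFunction
    {
        // A single, focused, stateless task: validate the request and produce a result.
        // The booking record itself would be persisted by an external service or store.
        public BookingResult Handle(BookingRequest request)
        {
            if (request == null || request.Seats <= 0)
            {
                return new BookingResult { Confirmed = false };
            }

            return new BookingResult
            {
                Confirmed = true,
                ConfirmationCode = Guid.NewGuid().ToString("N")
            };
        }
    }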
Disadvantages of serverless applications

Similar to microservices and cloud native applications, the serverless architecture is not suitable for all scenarios. The functions provided by FaaS don't keep state by themselves, which means special considerations need to be observed when writing the function code. This is unlike a full microservice, where the developer has full control over the state. One approach to keep state in the case of FaaS, in spite of this limitation, is to propagate the state to a database or a memory cache like Redis.

The startup times for the functions are not always fast, since time is spent sending the request to the FaaS service provider and, in some cases, starting a computing instance that runs the function. These delays have to be accounted for when designing serverless applications. FaaS functions do not run continuously like microservices, which makes them unsuitable for any task that requires continuous running of the software. Serverless applications also have the same limitation as other cloud native applications, where portability of the application from one cloud provider to another, or from the cloud to a local environment, becomes challenging because of vendor lock-in.

Conclusion

Cloud computing architectures have opened avenues for developing efficient, scalable, and reliable software. This paper covered some significant concepts in the world of cloud computing, such as microservices, cloud native applications, containers, and serverless applications. Microservices are the building blocks for most scalable cloud native applications; they decouple the application tasks into various efficient services. Containers are how microservices can be isolated and deployed safely to production environments without polluting them. Serverless applications decouple application tasks into smaller constructs, mostly called functions, that can be consumed via APIs. Cloud native applications make use of all those architectural patterns to build scalable, reliable, and always available software.

You read Part 2 of Modern cloud native architectures, a white paper by Mina Andrawos. Also read Part 1, which covers microservices and cloud native applications with their advantages and disadvantages. If you are interested to learn more, check out Mina's Cloud Native programming with Golang to explore practical techniques for building cloud-native apps that are scalable, reliable, and always available.

About Author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled with the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.

Build Java EE containers using Docker [Tutorial]
Are containers the end of virtual machines?
Why containers are driving DevOps

Access application data with Entity Framework in .NET Core [Tutorial]

Aaron Lazar
14 Aug 2018
14 min read
In this tutorial, we will get started with using the Entity Framework and create a simple console application to perform CRUD operations. The intent is to get started with EF Core and understand how to use it. Before we dive into coding, let us see the two development approaches that EF Core supports:

Code-first
Database-first

These two paradigms have been supported for a very long time and therefore we will just look at them at a very high level. EF Core mainly targets the code-first approach and has limited support for the database-first approach, as there is no support for the visual designer or wizard for the database model out of the box. However, there are third-party tools and extensions that support this. The list of third-party tools and extensions can be seen at https://docs.microsoft.com/en-us/ef/core/extensions/. This tutorial has been extracted from the book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

In the code-first approach, we first write the code; that is, we first create the domain model classes and then, using these classes, EF Core APIs create the database and tables, using migration based on the convention and configuration provided. We will look at conventions and configurations a little later in this section. The following diagram illustrates the code-first approach:

In the database-first approach, as the name suggests, we have an existing database or we create a database first and then use EF Core APIs to create the domain and context classes. As mentioned, currently EF Core has limited support for it due to a lack of tooling. So, our preference will be for the code-first approach throughout our examples. The reader can discover the third-party tools mentioned previously to learn more about the EF Core database-first approach as well. The following image illustrates the database-first approach:

Building Entity Framework Core Console App

Now that we understand the approaches and know that we will be using the code-first approach, let's dive into coding our getting started with EF Core console app. Before we do so, we need to have SQL Express installed in our development machine. If SQL Express is not installed, download the SQL Express 2017 edition from https://www.microsoft.com/en-IN/sql-server/sql-server-downloads and run the setup wizard. We will do the Basic installation of SQL Express 2017 for our learning purposes, as shown in the following screenshot:

Our objective is to learn how to use EF Core and so we will not do anything fancy in our console app. We will just do simple Create, Read, Update, Delete (CRUD) operations of a simple class called Person, as defined here:

    public class Person
    {
        public int Id { get; set; }

        public string Name { get; set; }

        public bool Gender { get; set; }

        public DateTime DateOfBirth { get; set; }

        public int Age
        {
            get
            {
                var age = DateTime.Now.Year - this.DateOfBirth.Year;
                if (DateTime.Now.DayOfYear < this.DateOfBirth.DayOfYear)
                {
                    age = age - 1;
                }

                return age;
            }
        }
    }

As we can see in the preceding code, the class has simple properties. To perform the CRUD operations on this class, let's create a console app by performing the following steps:

Create a new .NET Core console project named GettingStartedWithEFCore, as shown in the following screenshot:

Create a new folder named Models in the project node and add the Person class to this newly created folder. This will be our model entity class, which we will use for CRUD operations.

Next, we need to install the EF Core package.
Before we do that, it's important to know that EF Core provides support for a variety of databases. A few of the important ones are:

SQL Server
SQLite
InMemory (for testing)

The complete and comprehensive list can be seen at https://docs.microsoft.com/en-us/ef/core/providers/. We will be working with SQL Server on Windows for our learning purposes, so let's install the SQL Server package for Entity Framework Core. To do so, let's install the Microsoft.EntityFrameworkCore.SqlServer package from the NuGet Package Manager in Visual Studio 2017. Right-click on the project, select Manage NuGet Packages, and then search for Microsoft.EntityFrameworkCore.SqlServer. Select the matching result and click Install:

Next, we will create a class called Context, as shown here:

    public class Context : DbContext
    {
        public DbSet<Person> Persons { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            //// Get the connection string from configuration
            optionsBuilder.UseSqlServer(@"Server=.\SQLEXPRESS;Database=PersonDatabase;Trusted_Connection=True;");
        }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            modelBuilder.Entity<Person>().Property(nameof(Person.Name)).IsRequired();
        }
    }

The class looks quite simple, but it has the following subtle and important things to make note of:

The Context class derives from DbContext, which resides in the Microsoft.EntityFrameworkCore namespace. DbContext is an integral part of EF Core and if you have worked with EF, you will already be aware of it. An instance of DbContext represents a session with the database and can be used to query and save instances of your entities. DbContext is a combination of the Unit Of Work and Repository patterns. Typically, you create a class that derives from DbContext and contains Microsoft.EntityFrameworkCore.DbSet properties for each entity in the model. If the properties have a public setter, they are automatically initialized when the instance of the derived context is created.

It contains a property named Persons (plural of the model class Person) of type DbSet<Person>. This will map to the Persons table in the underlying database.

The class overrides the OnConfiguring method of DbContext and specifies the connection string to be used with the SQL Server database. The connection string should be read from the configuration file, appSettings.json, but for the sake of brevity and simplicity, it's hardcoded in the preceding code. The OnConfiguring method allows us to select and configure the data source to be used with a context using DbContextOptionsBuilder. Let's look at the connection string. Server= specifies the server. It can be .\SQLEXPRESS, .\SQLSERVER, .\LOCALDB, or any other instance name based on the installation you have done. Database= specifies the database name that will be created. Trusted_Connection=True specifies that we are using integrated security or Windows authentication. An enthusiastic reader should read the official Microsoft Entity Framework documentation on configuring the context at https://docs.microsoft.com/en-us/ef/core/miscellaneous/configuring-dbcontext.

The OnModelCreating method allows us to configure the model using the ModelBuilder Fluent API. This is the most powerful method of configuration and allows configuration to be specified without modifying the entity classes. The Fluent API configuration has the highest precedence and will override conventions and data annotations.
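As a side note, the same ModelBuilder API can express many other configurations. The following is a hedged sketch (not part of the original example) that adds two illustrative calls for the Person model; the maximum length of 100 and the table name People are arbitrary values chosen for demonstration, and the rest of this tutorial keeps the default Persons table, so treat these lines as illustration only:

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //// Same required-Name rule as in the Context class above.
        modelBuilder.Entity<Person>().Property(nameof(Person.Name)).IsRequired();

        //// Illustrative additions (not used later in this tutorial):
        //// cap the length of the Name column and map the entity to a
        //// differently named table instead of the conventional "Persons".
        modelBuilder.Entity<Person>().Property(p => p.Name).HasMaxLength(100);
        modelBuilder.Entity<Person>().ToTable("People");
    }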
The IsRequired() configuration shown in OnModelCreating has the same effect as the following data annotation on the Name property in the Person class:

    [Required]
    public string Name { get; set; }

The preceding point highlights the flexibility and configuration that EF Core brings to the table. EF Core uses a combination of conventions, attributes, and Fluent API statements to build a database model at runtime. All we have to do is to perform actions on the model classes using a combination of these and they will automatically be translated to appropriate changes in the database. Before we conclude this point, let's have a quick look at each of the different ways to configure a database model:

EF Core conventions: The conventions in EF Core are comprehensive. They are the default rules by which EF Core builds a database model based on classes. A few of the simpler yet important default conventions are listed here:

EF Core creates database tables for all DbSet<TEntity> properties in a Context class, with the same name as that of the property. In the preceding example, the table name would be Persons based on this convention.

EF Core creates tables for entities that are not included as DbSet properties but are reachable through reference properties in the other DbSet entities. If the Person class had a complex/navigation property, EF Core would have created a table for it as well.

EF Core creates columns for all the scalar read-write properties of a class, with the same name as the property, by default. It uses the reference and collection properties for building relationships among corresponding tables in the database. In the preceding example, the scalar properties of Person correspond to columns in the Persons table.

EF Core assumes a property named ID, or one that is suffixed with ID, to be a primary key. If the property is an integer type or Guid type, then EF Core also assumes it to be IDENTITY and automatically assigns a value when inserting the data. This is precisely what we will make use of in our example while inserting or creating a new Person.

EF Core maps the data type of a database column based on the data type of the property defined in the C# class. A few of the mappings between C# data types and SQL Server column data types are listed in the following table:

    C# data type        SQL Server data type
    int                 int
    string              nvarchar(Max)
    decimal             decimal(18,2)
    float               real
    byte[]              varbinary(Max)
    datetime            datetime
    bool                bit
    byte                tinyint
    short               smallint
    long                bigint
    double              float

There are many other conventions, and we can define custom conventions as well. For more details, please read the official Microsoft documentation at https://docs.microsoft.com/en-us/ef/core/modeling/.

Attributes: Conventions are often not enough to map the class to database objects. In such scenarios, we can use attributes called data annotation attributes to get the desired results. The [Required] attribute that we have just seen is an example of a data annotation attribute.

Fluent API: This is the most powerful way of configuring the model and can be used in addition to, or in place of, attributes. The code written in the OnModelCreating method is an example of a Fluent API statement.

If we check now, there is no PersonDatabase database. So, we need to create the database from the model by adding a migration. EF Core includes different migration commands to create or update the database based on the model.
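As an aside that is not part of the original steps, pending migrations can also be applied programmatically at application startup by calling Database.Migrate() on the context; this is sometimes convenient for small samples and test environments. A hedged sketch follows (the DatabaseBootstrapper class name is made up for illustration); the tutorial itself continues with the command-line tooling below.

    using Microsoft.EntityFrameworkCore;

    public static class DatabaseBootstrapper
    {
        //// Applies any pending migrations and creates the database if it
        //// does not exist yet. Call this once at startup, e.g. from Main().
        public static void EnsureDatabase()
        {
            using (var context = new Context())
            {
                context.Database.Migrate();
            }
        }
    }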
To add a migration in Visual Studio 2017, go to Tools | NuGet Package Manager | Package Manager Console, as shown in the following screenshot:

This will open the Package Manager Console window. Select the Default project as GettingStartedWithEFCore and type the following command:

    add-migration CreatePersonDatabase

If you are not using Visual Studio 2017 and you are dependent on .NET Core CLI tooling, you can use the following command:

    dotnet ef migrations add CreatePersonDatabase

We have not installed the Microsoft.EntityFrameworkCore.Design package, so it will give an error: Your startup project 'GettingStartedWithEFCore' doesn't reference Microsoft.EntityFrameworkCore.Design. This package is required for the Entity Framework Core Tools to work. Ensure your startup project is correct, install the package, and try again.

So let's first go to the NuGet Package Manager and install this package. After successful installation of this package, if we run the preceding command again, we should be able to run the migrations successfully. It will also tell us the command to undo the migration by displaying the message To undo this action, use Remove-Migration. We should see the new files added in the Solution Explorer in the Migrations folder, as shown in the following screenshot:

Although we have migrations applied, we have still not created a database. To create the database, we need to run the following commands.

In Visual Studio 2017:

    update-database -verbose

In .NET Core CLI:

    dotnet ef database update

If all goes well, we should have the database created with the Persons table (property of type DbSet<Person>) in the database. Let's validate the table and database by using SQL Server Management Studio (SSMS). If SSMS is not installed on your machine, you can also use Visual Studio 2017 to view the database and table. Let's check the created database. In Visual Studio 2017, click on the View menu and select Server Explorer, as shown in the following screenshot:

In Server Explorer, right-click on Data Connections and then select Add Connection. The Add Connection dialog will show up. Enter .\SQLEXPRESS in the Server name (since we installed SQL Express 2017) and select PersonDatabase as the database, as shown in the following screenshot:

On clicking OK, we will see the database named PersonDatabase, and if we expand the tables, we can see the Persons table as well as the _EFMigrationsHistory table. Notice that the properties in the Person class that had setters are the only properties that get transformed into table columns in the Persons table. Notice that the Age property is read-only in the class we created and therefore we do not see an Age column in the database table, as shown in the following screenshot:

This is the first migration to create a database. Whenever we add or update the model classes or configurations, we need to sync the database with the model using the add-migration and update-database commands. With this, we have our model class ready and the corresponding database created. The following image summarizes how the properties have been mapped from the C# class to the database table columns:

Now, we will use the Context class to perform CRUD operations. Let's go back to our Main.cs and write the following code.
The code is well commented, so please go through the comments to understand the flow:

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Getting started with EF Core");
            Console.WriteLine("We will do CRUD operations on Person class.");

            //// Lets create an instance of Person class.
            Person person = new Person()
            {
                Name = "Rishabh Verma",
                Gender = true, //// For demo true = Male, false = Female. Prefer enum in real cases.
                DateOfBirth = new DateTime(2000, 10, 23)
            };

            using (var context = new Context())
            {
                //// Context has strongly typed property named Persons which refers to Persons table.
                //// It has methods Add, Find, Update, Remove to perform CRUD among many others.
                //// Use AddRange to add multiple persons in once.
                //// Complete set of APIs can be seen by using F12 on the Persons property below in Visual Studio IDE.
                var personData = context.Persons.Add(person);

                //// Though we have done Add, nothing has actually happened in database. All changes are in context only.
                //// We need to call save changes, to persist these changes in the database.
                context.SaveChanges();

                //// Notice above that Id is Primary Key (PK) and hence has not been specified in the person object passed to context.
                //// So, to know the created Id, we can use the below Id
                int createdId = personData.Entity.Id;

                //// If all goes well, person data should be persisted in the database.
                //// Use proper exception handling to discover unhandled exception if any. Not showing here for simplicity and brevity.
                //// createdId variable would now hold the id of created person.

                //// READ BEGINS
                Person readData = context.Persons.Where(j => j.Id == createdId).FirstOrDefault();
                //// We have the data of person where Id == createdId, i.e. details of Rishabh Verma.

                //// Lets update the person data all together just for demonstrating update functionality.
                //// UPDATE BEGINS
                person.Name = "Neha Shrivastava";
                person.Gender = false;
                person.DateOfBirth = new DateTime(2000, 6, 15);
                person.Id = createdId; //// For update cases, we need this to be specified.

                //// Update the person in context.
                context.Persons.Update(person);

                //// Save the updates.
                context.SaveChanges();

                //// DELETE the person object.
                context.Remove(readData);
                context.SaveChanges();
            }

            Console.WriteLine("All done. Please press Enter key to exit...");
            Console.ReadLine();
        }
    }

With this, we have completed our sample app to get started with EF Core. I hope this simple example will set you up to start using EF Core with confidence and encourage you to start exploring it further. The detailed features of EF Core can be learned from the official Microsoft documentation available at https://docs.microsoft.com/en-us/ef/core/. If you're interested in learning more, head over to this book, .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

How to build a chatbot with Microsoft Bot framework
Working with Entity Client and Entity SQL
Get to know ASP.NET Core Web API [Tutorial]

Polymorphism and type-pattern matching in Python [Tutorial]

Aaron Lazar
13 Aug 2018
11 min read
Some functional programming languages offer clever approaches to the problem of working with statically typed function definitions. The problem is that many functions we'd like to write are entirely generic with respect to data type. For example, most of our statistical functions are identical for int or float numbers, as long as the division returns a value that is a subclass of numbers.Real (for example, Decimal, Fraction, or float). In many functional languages, sophisticated type or type-pattern matching rules are used by the compiler to make a single generic definition work for multiple data types. Python doesn't have this problem and doesn't need the pattern matching. In this article, we'll understand how to achieve polymorphism and type-pattern matching in Python. This Python tutorial is an extract taken from the 2nd edition of the bestseller, Functional Python Programming, authored by Steven Lott.

Instead of the (possibly) complex features of statically typed functional languages, Python changes the approach dramatically. Python uses dynamic selection of the final implementation of an operator based on the data types being used. In Python, we always write generic definitions. The code isn't bound to any specific data type. The Python runtime will locate the appropriate operations based on the types of the actual objects in use. The 3.3.7 Coercion rules section of the language reference manual and the numbers module in the library provide details on how this mapping from operation to special method name works. This means that the compiler doesn't certify that our functions are expecting and producing the proper data types. We generally rely on unit testing and the mypy tool for this kind of type checking.

In rare cases, we might need to have different behavior based on the types of data elements. We have two ways to tackle this:

We can use the isinstance() function to distinguish the different cases
We can create our own subclass of numbers.Number or NamedTuple and implement proper polymorphic special method names

In some cases, we'll actually need to do both so that we can include appropriate data type conversions for each operation. Additionally, we'll also need to use the cast() function to make the types explicit to the mypy tool.

The ranking example in the previous section is tightly bound to the idea of applying rank-ordering to simple pairs. While this is the way the Spearman correlation is defined, a multivariate dataset has a need to do rank-order correlation among all the variables. The first thing we'll need to do is generalize our idea of rank-order information. The following is a NamedTuple value that handles a tuple of ranks and a raw data object:

    from typing import NamedTuple, Tuple, Any

    class Rank_Data(NamedTuple):
        rank_seq: Tuple[float]
        raw: Any

A typical use of this kind of class definition is shown in this example:

    >>> data = {'key1': 1, 'key2': 2}
    >>> r = Rank_Data((2, 7), data)
    >>> r.rank_seq[0]
    2
    >>> r.raw
    {'key1': 1, 'key2': 2}

The row of raw data in this example is a dictionary. There are two rankings for this particular item in the overall list. An application can get the sequence of rankings as well as the original raw data item.

We'll add some syntactic sugar to our ranking function. In many previous examples, we've required either an iterable or a concrete collection. The for statement is graceful about working with either one.
However, we don't always use the for statement, and for some functions, we've had to explicitly use iter() to make an iterable out of a collection. We can handle this situation with a simple isinstance() check, as shown in the following code snippet:

    def some_function(seq_or_iter: Union[Sequence, Iterator]):
        if isinstance(seq_or_iter, Sequence):
            yield from some_function(iter(seq_or_iter), key)
            return
        # Do the real work of the function using the Iterator

This example includes a type check to handle the small difference between a Sequence object and an Iterator. Specifically, the function uses iter() to create an Iterator from a Sequence, and calls itself recursively with the derived value.

For rank-ordering, the Union[Sequence, Iterator] will be supported. Because the source data must be sorted for ranking, it's easier to use list() to transform a given iterator into a concrete sequence. The essential isinstance() check will be used, but instead of creating an iterator from a sequence (as shown previously), the following examples will create a sequence object from an iterator. In the context of our rank-ordering function, we can make the function somewhat more generic. The following two expressions define the inputs:

    Source = Union[Rank_Data, Any]
    Union[Sequence[Source], Iterator[Source]]

There are four combinations defined by these two types:

    Sequence[Rank_Data]
    Sequence[Any]
    Iterator[Rank_Data]
    Iterator[Any]

Handling four combination data types

Here's the rank_data() function with three cases for handling the four combinations of data types:

    from typing import (
        Callable, Sequence, Iterator, Union, Iterable, TypeVar, cast
    )

    K_ = TypeVar("K_")  # Some comparable key type used for ranking.
    Source = Union[Rank_Data, Any]

    def rank_data(
            seq_or_iter: Union[Sequence[Source], Iterator[Source]],
            key: Callable[[Rank_Data], K_] = lambda obj: cast(K_, obj)
        ) -> Iterable[Rank_Data]:
        if isinstance(seq_or_iter, Iterator):
            # Iterator? Materialize a sequence object
            yield from rank_data(list(seq_or_iter), key)
            return

        data: Sequence[Rank_Data]
        if isinstance(seq_or_iter[0], Rank_Data):
            # Collection of Rank_Data is what we prefer.
            data = seq_or_iter
        else:
            # Convert to Rank_Data and process.
            empty_ranks: Tuple[float] = cast(Tuple[float], ())
            data = list(
                Rank_Data(empty_ranks, raw_data)
                for raw_data in cast(Sequence[Source], seq_or_iter)
            )

        for r, rd in rerank(data, key):
            new_ranks = cast(
                Tuple[float],
                rd.rank_seq + cast(Tuple[float], (r,)))
            yield Rank_Data(new_ranks, rd.raw)

We've decomposed the ranking into three cases to cover the four different types of data. The following are the cases defined by the union of unions:

Given an Iterator (an object without a usable __getitem__() method), we'll materialize a list object to work with. This will work for Rank_Data as well as any other raw data type. This case covers objects which are Iterator[Rank_Data] as well as Iterator[Any].

Given a Sequence[Any], we'll wrap the unknown objects into Rank_Data tuples with an empty collection of rankings to create a Sequence[Rank_Data].

Finally, given a Sequence[Rank_Data], add yet another ranking to the tuple of ranks inside each Rank_Data container.

The first case calls rank_data() recursively. The other two cases both rely on a rerank() function that builds a new Rank_Data tuple with additional ranking values. This contains several rankings for a complex record of raw data values. Note that a relatively complex cast() expression is required to disambiguate the use of generic tuples for the rankings.
The mypy tool offers a reveal_type() function that can be incorporated to debug the inferred types.

The rerank() function follows a slightly different design to the example of the rank() function shown previously. It yields two-tuples with the rank and the original data object:

    def rerank(
            rank_data_iter: Iterable[Rank_Data],
            key: Callable[[Rank_Data], K_]
        ) -> Iterator[Tuple[float, Rank_Data]]:
        sorted_iter = iter(
            sorted(
                rank_data_iter, key=lambda obj: key(obj.raw)
            )
        )
        # Apply ranker to head, *tail = sorted(rank_data_iter)
        head = next(sorted_iter)
        yield from ranker(sorted_iter, 0, [head], key)

The idea behind rerank() is to sort a collection of Rank_Data objects. The first item, head, is used to provide a seed value to the ranker() function. The ranker() function can examine the remaining items in the iterable to see if they match this initial value; this allows computing a proper rank for a batch of matching items.

The ranker() function accepts a sorted iterable of data, a base rank number, and an initial collection of items of the minimum rank. The result is an iterable sequence of two-tuples with a rank number and an associated Rank_Data object:

    def ranker(
            sorted_iter: Iterator[Rank_Data],
            base: float,
            same_rank_seq: List[Rank_Data],
            key: Callable[[Rank_Data], K_]
        ) -> Iterator[Tuple[float, Rank_Data]]:
        try:
            value = next(sorted_iter)
        except StopIteration:
            dups = len(same_rank_seq)
            yield from yield_sequence(
                (base+1+base+dups)/2, iter(same_rank_seq))
            return
        if key(value.raw) == key(same_rank_seq[0].raw):
            yield from ranker(
                sorted_iter, base, same_rank_seq+[value], key)
        else:
            dups = len(same_rank_seq)
            yield from yield_sequence(
                (base+1+base+dups)/2, iter(same_rank_seq))
            yield from ranker(
                sorted_iter, base+dups, [value], key)

This starts by attempting to extract the next item from the sorted_iter collection of sorted Rank_Data items. If this fails with a StopIteration exception, there is no next item, and the source was exhausted. The final output is the final batch of equal-valued items in the same_rank_seq sequence.

If the sequence has a next item, the key() function extracts the key value. If this new value matches the keys in the same_rank_seq collection, it is accumulated into the current batch of same-valued keys. The final result is based on the rest of the items in sorted_iter, the current value for the rank, a larger batch of same_rank items that now includes the head value, and the original key() function.

If the next item's key doesn't match the current batch of equal-valued items, the final result has two parts. The first part is the batch of equal-valued items accumulated in same_rank_seq. This is followed by the reranking of the remainder of the sorted items. The base value for these is incremented by the number of equal-valued items, a fresh batch of equal-rank items is initialized with the distinct key, and the original key() extraction function is provided.

The output from ranker() depends on the yield_sequence() function, which looks as follows:

    def yield_sequence(
            rank: float,
            same_rank_iter: Iterator[Rank_Data]
        ) -> Iterator[Tuple[float, Rank_Data]]:
        head = next(same_rank_iter)
        yield rank, head
        yield from yield_sequence(rank, same_rank_iter)

We've written this in a way that emphasizes the recursive definition. For any practical work, this should be optimized into a single for statement. When doing Tail-Call Optimization to transform a recursion into a loop, define unit test cases first. Be sure the recursion passes the unit test cases before optimizing.
The following are some examples of using this function to rank (and rerank) data. We'll start with a simple collection of scalar values:

    >>> scalars = [0.8, 1.2, 1.2, 2.3, 18]
    >>> list(rank_data(scalars))
    [Rank_Data(rank_seq=(1.0,), raw=0.8),
     Rank_Data(rank_seq=(2.5,), raw=1.2),
     Rank_Data(rank_seq=(2.5,), raw=1.2),
     Rank_Data(rank_seq=(4.0,), raw=2.3),
     Rank_Data(rank_seq=(5.0,), raw=18)]

Each value becomes the raw attribute of a Rank_Data object.

When we work with a slightly more complex object, we can also have multiple rankings. The following is a sequence of two-tuples:

    >>> pairs = ((2, 0.8), (3, 1.2), (5, 1.2), (7, 2.3), (11, 18))
    >>> rank_x = list(rank_data(pairs, key=lambda x: x[0]))
    >>> rank_x
    [Rank_Data(rank_seq=(1.0,), raw=(2, 0.8)),
     Rank_Data(rank_seq=(2.0,), raw=(3, 1.2)),
     Rank_Data(rank_seq=(3.0,), raw=(5, 1.2)),
     Rank_Data(rank_seq=(4.0,), raw=(7, 2.3)),
     Rank_Data(rank_seq=(5.0,), raw=(11, 18))]
    >>> rank_xy = list(rank_data(rank_x, key=lambda x: x[1]))
    >>> rank_xy
    [Rank_Data(rank_seq=(1.0, 1.0), raw=(2, 0.8)),
     Rank_Data(rank_seq=(2.0, 2.5), raw=(3, 1.2)),
     Rank_Data(rank_seq=(3.0, 2.5), raw=(5, 1.2)),
     Rank_Data(rank_seq=(4.0, 4.0), raw=(7, 2.3)),
     Rank_Data(rank_seq=(5.0, 5.0), raw=(11, 18))]

Here, we defined a collection of pairs. Then, we ranked the two-tuples, assigning the sequence of Rank_Data objects to the rank_x variable. We then ranked this collection of Rank_Data objects, creating a second rank value and assigning the result to the rank_xy variable.

The resulting sequence can be used for a slightly modified rank_corr() function to compute the rank correlations of any of the available values in the rank_seq attribute of the Rank_Data objects. We'll leave this modification as an exercise for you.

If you found this tutorial useful and would like to learn more such techniques, head over to get Steven Lott's bestseller, Functional Python Programming.

Why functional programming in Python matters: Interview with best selling author, Steven Lott
Top 7 Python programming books you need to read
Members Inheritance and Polymorphism

Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 1

Guest Contributor
13 Aug 2018
9 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language, and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming, and the Modern Golang Programming video courses.

This paper sheds some light and provides practical exposure on some key topics in the modern software industry, namely cloud native applications. This includes microservices, containers, and serverless applications. The paper will cover the practical advantages and disadvantages of the technologies covered.

Microservices

The microservices architecture has gained reputation as a powerful approach to architect modern software applications. So what are microservices? Microservices can be described as simply the idea of separating the functionality required from a software application into multiple independent small software services, or "microservices." Each microservice is responsible for an individual focused task. In order for microservices to collaborate together to form a large scalable application, they communicate and exchange data.

Microservices were born out of the need to tame the complexity and inflexibility of "monolithic" applications. A monolithic application is a type of application where all required functionality is coded together into the same service. For example, here is a diagram representing a monolithic events (like concerts, shows, etc.) booking application that takes care of the booking, payment processing, and event reservation:

The application can be used by a customer to book a concert or a show. A user interface will be needed. Furthermore, we will also need a search functionality to look for events, a bookings handler to process the user booking and then save it, and an events handler to help find the event, ensure it has seats available, then link it to the booking. In a production level application, more tasks will be needed, like payment processing for example, but for now let's focus on the four tasks outlined in the above figure.

This monolithic application will work well with small to medium load. It will run on a single server, connect to a single database, and will probably be written in the same programming language. Now, what will happen if the business grows exponentially and hundreds of thousands or millions of users need to be handled and processed? Initially, the short term solution would be to ensure that the server where the application runs has powerful hardware specifications to withstand higher loads, and if not, then add more memory, storage, and processing power to the server. This is called vertical scaling, which is the act of increasing the power of the hardware, like RAM and hard drive capacity, to run heavy applications. However, this is typically not sustainable in the long run as the load on the application continues to grow.

Another challenge with monolithic applications is the inflexibility caused by being limited to only one or two programming languages. This inflexibility can affect the overall quality and efficiency of the application.
For example, Node.js is a popular JavaScript framework for building web applications, whereas R is popular for data science applications. A monolithic application will make it difficult to utilize both technologies, whereas in a microservices application, we can simply build a data science service written in R and a web service written in Node.js.

The microservices version of the events application will take the below form:

This application will be capable of scaling among multiple servers, a practice known as horizontal scaling. Each service can be deployed on a different server with dedicated resources, or in separate containers (more on that later). The different services can be written in different programming languages, enabling greater flexibility, and different dedicated teams can focus on different services, achieving more overall quality for the application. Another notable advantage of using microservices is the ease of continuous delivery, which is the ability to deploy software often, and at any time. The reason why microservices make continuous delivery easier is that a new feature deployed to one microservice is less likely to affect other microservices, compared to monolithic applications.

Issues with Microservices

One notable drawback of relying heavily on microservices is the fact that they can become too complicated to manage in the long run as they grow in numbers and scope. There are approaches to mitigate this by utilizing monitoring tools such as Prometheus to detect problems, using container technologies such as Docker to avoid polluting the host environments, and avoiding over-designing the services. However, these approaches take effort and time.

Cloud native applications

Microservices architectures are a natural fit for cloud native applications. A cloud native application is simply defined as an application built from the ground up for cloud computing architectures. This simply means that our application is cloud native if we design it as if it is expected to be deployed on a distributed, and scalable infrastructure. For example, building an application with a redundant microservices architecture (we'll see an example shortly) makes the application cloud native, since this architecture allows our application to be deployed in a distributed manner that allows it to be scalable and almost always available. A cloud native application does not need to always be deployed to a public cloud like AWS; we can deploy it to our own distributed cloud-like infrastructure instead, if we have one.

In fact, what makes an application fully cloud native is beyond just using microservices. Your application should employ continuous delivery, which is your ability to continuously deliver updates to your production applications without disruptions. Your application should also make use of services like message queues and technologies like containers and serverless (containers and serverless are important topics for modern software architectures, so we'll be discussing them in the next few sections).

Cloud native applications assume access to numerous server nodes, access to pre-deployed software services like message queues or load balancers, ease of integration with continuous delivery services, among other things. If you deploy your cloud native application to a commercial cloud like AWS or Azure, your application gets the option to utilize cloud-only software services.
For example, DynamoDB is a powerful database engine that can only be used on Amazon Web Services for production applications. Another example is the DocumentDB database in Azure. There are also cloud-only message queues such as Amazon Simple Queue Service (SQS), which can be used to allow communication between microservices in the Amazon Web Services cloud.

As mentioned earlier, cloud native microservices should be designed to allow redundancy between services. If we take the events booking application as an example, the application will look like this:

Multiple server nodes would be allocated per microservice, allowing a redundant microservices architecture to be deployed. If the primary node or service fails for any reason, the secondary can take over, ensuring lasting reliability and availability for cloud native applications. This availability is vital for fault intolerant applications such as e-commerce platforms, where downtime translates into large amounts of lost revenue.

Cloud native applications provide great value for developers, enterprises, and startups. A notable tool worth mentioning in the world of microservices and cloud computing is Prometheus. Prometheus is an open source system monitoring and alerting tool that can be used to monitor complex microservices architectures and alert when an action needs to be taken. Prometheus was originally created by SoundCloud to monitor their systems, but then grew to become an independent project. The project is now a part of the cloud native computing foundation, which is a foundation tasked with building a sustainable ecosystem for cloud native applications.

Cloud native limitations

For cloud native applications, you will face some challenges if the need arises to migrate some or all of the applications. That is due to multiple reasons, depending on where your application is deployed. For example, if your cloud native application is deployed on a public cloud like AWS, cloud native APIs are not cross-cloud platform. So, a DynamoDB database API utilized in an application will only work on AWS, but not on Azure, since DynamoDB belongs exclusively to AWS. The API will also never work in a local environment because DynamoDB can only be utilized in AWS in production.

Another reason is that there are some assumptions made when some cloud native applications are built, like the fact that there will be a virtually unlimited number of server nodes to utilize when needed, and that a new server node can be made available very quickly. These assumptions are sometimes hard to guarantee in a local data center environment, where real servers, networking hardware, and wiring need to be purchased.

This brings us to the end of Part 1 of this whitepaper. Check out Part 2 tomorrow to learn about containers and serverless applications, along with their practical advantages and limitations.

About Author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled with the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.
Building microservices from a monolith Java EE app [Tutorial]
6 Ways to blow up your Microservices!
Have Microservices killed the monolithic architecture? Maybe not!

Building a Tic-tac-toe game in ASP.Net Core 2.0 [Tutorial]

Aaron Lazar
13 Aug 2018
28 min read
Learning is more fun if we do it while making games. With this thought, let's continue our quest to learn .NET Core 2.0 by writing a Tic-tac-toe game in .NET Core 2.0. We will develop the game in the ASP.NET Core 2.0 web app, using SignalR Core. We will follow a step-by-step approach and use Visual Studio 2017 as the primary IDE, but will list the steps needed while using the Visual Studio Code editor as well. Let's do the project setup first and then we will dive into the coding. This tutorial has been extracted from the book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

Installing SignalR Core NuGet package

Create a new ASP.NET Core 2.0 MVC app named TicTacToeGame. With this, we will have a basic working ASP.NET Core 2.0 MVC app in place. However, to leverage SignalR Core in our app, we need to install the SignalR Core NuGet and the client packages. To install the SignalR Core NuGet package, we can perform one of the following two approaches in the Visual Studio IDE:

In the context menu of the TicTacToeGame project, click on Manage NuGet Packages. It will open the NuGet Package Manager for the project. In the Browse section, search for the Microsoft.AspNetCore.SignalR package and click Install. This will install SignalR Core in the app. Please note that currently the package is in the preview stage and hence the pre-release checkbox has to be ticked:

Edit the TicTacToeGame.csproj file, add the following code snippet in the ItemGroup code containing package references, and click Save. As soon as the file is saved, the tooling will take care of restoring the packages and in a while, the SignalR package will be installed. This approach can be used with Visual Studio Code as well. Although Visual Studio Code detects the unresolved dependencies and may prompt you to restore the package, it is recommended that immediately after editing and saving the file, you run the dotnet restore command in the terminal window at the location of the project:

    <ItemGroup>
        <PackageReference Include="Microsoft.AspNetCore.All" Version="2.0.0" />
        <PackageReference Include="Microsoft.AspNetCore.SignalR" Version="1.0.0-alpha1-final" />
    </ItemGroup>

Now we have server-side packages installed. We still need to install the client-side package of SignalR, which is available through npm. To do so, we need to first ascertain whether we have npm installed on the machine or not. If not, we need to install it. npm is distributed with Node.js, so we need to download and install Node.js from https://nodejs.org/en/. The installation is quite straightforward. Once this installation is done, open a Command Prompt at the project location and run the following command:

    npm install @aspnet/signalr-client

This will install the SignalR client package. Just go to the package location (npm creates a node_modules folder in the project directory). The relative path from the project directory would be \node_modules\@aspnet\signalr-client\dist\browser. From this location, copy the signalr-client-1.0.0-alpha1-final.js file into the wwwroot\js folder. In the current version, the name is signalr-client-1.0.0-alpha1-final.js. With this, we are done with the project setup and we are ready to use SignalR goodness as well. So let's dive into the coding.

Coding the game

In this section, we will implement our gaming solution. The end output will be the working two-player Tic-Tac-Toe game.
We will do the coding in steps for ease of understanding:

In the Startup class, we modify the ConfigureServices method to add SignalR to the container, by writing the following code:

    //// Adds SignalR to the services container.
    services.AddSignalR();

In the Configure method of the same class, we configure the pipeline to use SignalR and intercept and wire up requests containing gameHub to the SignalR hub that we will be creating, with the following code:

    //// Use - SignalR & let it know to intercept and map any request having gameHub.
    app.UseSignalR(routes =>
    {
        routes.MapHub<GameHub>("gameHub");
    });

The following is the code for both methods, for the sake of clarity and completion. Other methods and properties are removed for brevity:

    // This method gets called by the run-time. Use this method to add services to the container.
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc();

        //// Adds SignalR to the services container.
        services.AddSignalR();
    }

    // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
    public void Configure(IApplicationBuilder app, IHostingEnvironment env)
    {
        if (env.IsDevelopment())
        {
            app.UseDeveloperExceptionPage();
            app.UseBrowserLink();
        }
        else
        {
            app.UseExceptionHandler("/Home/Error");
        }

        app.UseStaticFiles();

        app.UseMvc(routes =>
        {
            routes.MapRoute(
                name: "default",
                template: "{controller=Home}/{action=Index}/{id?}");
        });

        //// Use - SignalR & let it know to intercept and map any request having gameHub.
        app.UseSignalR(routes =>
        {
            routes.MapHub<GameHub>("gameHub");
        });
    }

The previous two steps set up SignalR for us. Now, let's start with the coding of the player registration form. We want the player to be registered with a name and display picture. Later, the server will also need to know whether the player is playing, waiting for a move, searching for an opponent, and so on. Let's create the Player model in the Models folder in the app. The code comments are self-explanatory:

    /// <summary>
    /// The player class. Each player of Tic-Tac-Toe game would be an instance of this class.
    /// </summary>
    internal class Player
    {
        /// <summary>
        /// Gets or sets the name of the player. This would be set at the time user registers.
        /// </summary>
        public string Name { get; set; }

        /// <summary>
        /// Gets or sets the opponent player. The player against whom the player would be playing.
        /// This is determined/set when the players click Find Opponent Button in the UI.
        /// </summary>
        public Player Opponent { get; set; }

        /// <summary>
        /// Gets or sets a value indicating whether the player is playing.
        /// This is set when the player starts a game.
        /// </summary>
        public bool IsPlaying { get; set; }

        /// <summary>
        /// Gets or sets a value indicating whether the player is waiting for opponent to make a move.
        /// </summary>
        public bool WaitingForMove { get; set; }

        /// <summary>
        /// Gets or sets a value indicating whether the player is searching for opponent.
        /// </summary>
        public bool IsSearchingOpponent { get; set; }

        /// <summary>
        /// Gets or sets the time when the player registered.
        /// </summary>
        public DateTime RegisterTime { get; set; }

        /// <summary>
        /// Gets or sets the image of the player.
        /// This would be set at the time of registration, if the user selects the image.
        /// </summary>
        public string Image { get; set; }

        /// <summary>
        /// Gets or sets the connection id of the player connection with the gameHub.
        /// </summary>
        public string ConnectionId { get; set; }
    }

Now, we need to have a UI in place so that the player can fill in the form and register. We also need to show the image preview to the player when he/she browses the image. To do so, we will use the Index.cshtml view of the HomeController class that comes with the default MVC template. We will refer to the following two .js files in the _Layout.cshtml partial view so that they are available to all the views. Alternatively, you could add these in the Index.cshtml view as well, but it's highly recommended that common scripts be added in _Layout.cshtml. The version of the script file may be different in your case. These are the currently available latest versions. Although jQuery is not required to be the library of choice for us, we will use jQuery to keep the code clean, simple, and compact. With these references, we have jQuery and SignalR available to us on the client side:

    <script src="~/lib/jquery/dist/jquery.js"></script> <!-- jQuery-->
    <script src="~/js/signalr-client-1.0.0-alpha1-final.js"></script> <!-- SignalR-->

After adding these references, create the simple HTML UI for the image preview and registration, as follows:

    <div id="divPreviewImage"> <!-- To display the browsed image-->
        <fieldset>
            <div class="form-group">
                <div class="col-lg-2">
                    <image src="" id="previewImage" style="height:100px;width:100px;border:solid 2px dotted; float:left" />
                </div>
                <div class="col-lg-10" id="divOpponentPlayer"> <!-- To display image of opponent player-->
                    <image src="" id="opponentImage" style="height:100px;width:100px;border:solid 2px dotted; float:right;" />
                </div>
            </div>
        </fieldset>
    </div>
    <div id="divRegister"> <!-- Our Registration form-->
        <fieldset>
            <legend>Register</legend>
            <div class="form-group">
                <label for="name" class="col-lg-2 control-label">Name</label>
                <div class="col-lg-10">
                    <input type="text" class="form-control" id="name" placeholder="Name">
                </div>
            </div>
            <div class="form-group">
                <label for="image" class="col-lg-2 control-label">Avatar</label>
                <div class="col-lg-10">
                    <input type="file" class="form-control" id="image" />
                </div>
            </div>
            <div class="form-group">
                <div class="col-lg-10 col-lg-offset-2">
                    <button type="button" class="btn btn-primary" id="btnRegister">Register</button>
                </div>
            </div>
        </fieldset>
    </div>

When the player registers by clicking the Register button, the player's details need to be sent to the server. To do this, we will write the JavaScript to send the details to our gameHub:

    let hubUrl = '/gameHub';
    let httpConnection = new signalR.HttpConnection(hubUrl);
    let hubConnection = new signalR.HubConnection(httpConnection);
    var playerName = "";
    var playerImage = "";
    var hash = "#";

    hubConnection.start();

    $("#btnRegister").click(function () { //// Fires on button click
        playerName = $('#name').val(); //// Sets the player name with the input name.
        playerImage = $('#previewImage').attr('src'); //// Sets the player image variable with specified image
        var data = playerName.concat(hash, playerImage); //// The registration data to be sent to server.
        hubConnection.invoke('RegisterPlayer', data); //// Invoke the "RegisterPlayer" method on gameHub.
    });

    $("#image").change(function () { //// Fires when image is changed.
        readURL(this); //// HTML 5 way to read the image as data url.
    });

    function readURL(input) {
        if (input.files && input.files[0]) { //// Go in only if image is specified.
var reader = new FileReader(); reader.onload = imageIsLoaded; reader.readAsDataURL(input.files[0]); } } function imageIsLoaded(e) { if (e.target.result) { $('#previewImage').attr('src', e.target.result); //// Sets the image source for preview. $("#divPreviewImage").show(); } }; The player now has a UI to input the name and image, see the preview image, and click Register. On clicking the Register button, we are sending the concatenated name and image to the gameHub on the server through hubConnection.invoke('RegisterPlayer', data);  So, it's quite simple for the client to make a call to the server. Initialize the hubConnection by specifying hub name as we did in the first three lines of the preceding code snippet. Start the connection by hubConnection.start();, and then invoke the server hub method by calling the invoke method, specifying the hub method name and the parameter it expects. We have not yet created the hub, so let's create the GameHub class on the server: /// <summary> /// The Game Hub class derived from Hub /// </summary> public class GameHub : Hub { /// <summary> /// To keep the list of all the connected players registered with the game hub. We could have /// used normal list but used concurrent bag as its thread safe. /// </summary> private static readonly ConcurrentBag<Player> players = new ConcurrentBag<Player>(); /// <summary> /// Registers the player with name and image. /// </summary> /// <param name="nameAndImageData">The name and image data sent by the player.</param> public void RegisterPlayer(string nameAndImageData) { var splitData = nameAndImageData?.Split(new char[] { '#' }, StringSplitOptions.None); string name = splitData[0]; string image = splitData[1]; var player = players?.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId); if (player == null) { player = new Player { ConnectionId = Context.ConnectionId, Name = name, IsPlaying = false, IsSearchingOpponent = false, RegisterTime = DateTime.UtcNow, Image = image }; if (!players.Any(j => j.Name == name)) { players.Add(player); } } this.OnRegisterationComplete(Context.ConnectionId); } /// <summary> /// Fires on completion of registration. /// </summary> /// <param name="connectionId">The connectionId of the player which registered</param> public void OnRegisterationComplete(string connectionId) { //// Notify this connection id that the registration is complete. this.Clients.Client(connectionId). InvokeAsync(Constants.RegistrationComplete); } } The code comments make it self-explanatory. The class should derive from the SignalR Hub class for it to be recognized as Hub. There are two methods of interest which can be overridden. Notice that both the methods follow the async pattern and hence return Task: Task OnConnectedAsync(): This method fires when a client/player connects to the hub. Task OnDisconnectedAsync(Exception exception): This method fires when a client/player disconnects or looses the connection. We will override this method to handle the scenario where the player disconnects. 
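The override of OnDisconnectedAsync itself is not reproduced in this excerpt, although the later discussion relies on it. The following is only a hedged sketch of what such an override inside GameHub might look like, using the players collection declared above and the opponentDisconnected handler that the client wires up later in this article; it is an illustration, not the book's exact code:

public override async Task OnDisconnectedAsync(Exception exception)
{
    //// Find the player that owned this connection, if any.
    var player = players?.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId);
    if (player?.Opponent != null)
    {
        //// Let the opponent know this player left, so the client can declare a winner.
        //// The book most likely uses a Constants member rather than the literal string.
        await this.Clients.Client(player.Opponent.ConnectionId)
            .InvokeAsync("opponentDisconnected");
    }
    //// Housekeeping, such as removing the player from the players bag and discarding
    //// any game in progress, would also belong here.
    await base.OnDisconnectedAsync(exception);
}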
There are also a few properties that the hub class exposes: Context: This property is of type HubCallerContext and gives us access to the following properties: Connection: Gives access to the current connection User: Gives access to the ClaimsPrincipal of the user who is currently connected ConnectionId: Gives the current connection ID string Clients: This property is of type IHubClients and gives us the way to communicate to all the clients via the client proxy Groups: This property is of type IGroupManager and provides a way to add and remove connections to the group asynchronously To keep the things simple, we are not using a database to keep track of our registered players. Rather we will use an in-memory collection to keep the registered players. We could have used a normal list of players, such as List<Player>, but then we would need all the thread safety and use one of the thread safety primitives, such as lock, monitor, and so on, so we are going with ConcurrentBag<Player>, which is thread safe and reasonable for our game development. That explains the declaration of the players collection in the class. We will need to do some housekeeping to add players to this collection when they resister and remove them when they disconnect. We saw in previous step that the client invoked the RegisterPlayer method of the hub on the server, passing in the name and image data. So we defined a public method in our hub, named RegisterPlayer, accepting the name and image data string concatenated through #. This is just one of the simple ways of accepting the client data for demonstration purposes, we can also use strongly typed parameters. In this method, we split the string on # and extract the name as the first part and the image as the second part. We then check if the player with the current connection ID already exists in our players collection. If it doesn't, we create a Player object with default values and add them to our players collection. We are distinguishing the player based on the name for demonstration purposes, but we can add an Id property in the Player class and make different players have the same name also. After the registration is complete, the server needs to update the player, that the registration is complete and the player can then look for the opponent. To do so, we make a call to the OnRegistrationComplete method which invokes a method called  registrationComplete on the client with the current connection ID. Let's understand the code to invoke the method on the client: this.Clients.Client(connectionId).InvokeAsync(Constants.RegistrationComplete); On the Clients property, we can choose a client having a specific connection ID (in this case, the current connection ID from the Context) and then call InvokeAsync to invoke a method on the client specifying the method name and parameters as required. In the preceding case method, the name is registrationComplete with no parameters. Now we know how to invoke a server method from the client and also how to invoke the client method from the server. We also know how to select a specific client and invoke a method there. We can invoke the client method from the server, for all the clients, a group of clients, or a specific client, so rest of the coding stuff would be just a repetition of these two concepts. Next, we need to implement the registrationComplete method on the client. On registration completion, the registration form should be hidden and the player should be able to find an opponent to play against. 
To do so, we would write JavaScript code to hide the registration form and show the UI for finding the opponent. On clicking the Find Opponent button, we need the server to pair us against an opponent, so we need to invoke a hub method on server to find opponent. The server can respond us with two outcomes: It finds an opponent player to play against. In this case, the game can start so we need to simulate the coin toss, determine the player who can make the first move, and start the game. This would be a game board in the client-user interface. It doesn't find an opponent and asks the player to wait for another player to register and search for an opponent. This would be a no opponent found screen in the client. In both the cases, the server would do some processing and invoke a method on the client. Since we need a lot of different user interfaces for different scenarios, let's code the HTML markup inside div to make it easier to show and hide sections based on the server response. We will add the following code snippet in the body. The comments specify the purpose of each of the div elements and markup inside them: <div id="divFindOpponentPlayer"> <!-- Section to display Find Opponent --> <fieldset> <legend>Find a player to play against!</legend> <div class="form-group"> <input type="button" class="btn btn-primary" id="btnFindOpponentPlayer" value="Find Opponent Player" /> </div> </fieldset> </div> <div id="divFindingOpponentPlayer"> <!-- Section to display opponent not found, wait --> <fieldset> <legend>Its lonely here!</legend> <div class="form-group"> Looking for an opponent player. Waiting for someone to join! </div> </fieldset> </div> <div id="divGameInformation" class="form-group"> <!-- Section to display game information--> <div class="form-group" id="divGameInfo"></div> <div class="form-group" id="divInfo"></div> </div> <div id="divGame" style="clear:both"> <!-- Section where the game board would be displayed --> <fieldset> <legend>Game On</legend> <div id="divGameBoard" style="width:380px"></div> </fieldset> </div> The following client-side code would take care of Steps 7 and 8. Though the comments are self-explanatory, we will quickly see what all stuff is that is going on here. We handle the registartionComplete method and display the Find Opponent Player section. This section has a button to find an opponent player called btnFindOpponentPlayer. We define the event handler of the button to invoke the FindOpponent method on the hub. We will see the hub method implementation later, but we know that the hub method would either find an opponent or would not find an opponent, so we have defined the methods opponentFound and opponentNotFound, respectively, to handle these scenarios. In the opponentNotFound method, we just display a section in which we say, we do not have an opponent player. In the opponentFound method, we display the game section, game information section, opponent display picture section, and draw the Tic-Tac-Toe game board as a 3×3 grid using CSS styling. All the other sections are hidden: $("#btnFindOpponentPlayer").click(function () { hubConnection.invoke('FindOpponent'); }); hubConnection.on('registrationComplete', data => { //// Fires on registration complete. Invoked by server hub $("#divRegister").hide(); // hide the registration div $("#divFindOpponentPlayer").show(); // display find opponent player div. }); hubConnection.on('opponentNotFound', data => { //// Fires when no opponent is found. 
$('#divFindOpponentPlayer').hide(); //// hide the find opponent player section. $('#divFindingOpponentPlayer').show(); //// display the finding opponent player div. }); hubConnection.on('opponentFound', (data, image) => { //// Fires when opponent player is found. $('#divFindOpponentPlayer').hide(); $('#divFindingOpponentPlayer').hide(); $('#divGame').show(); //// Show game board section. $('#divGameInformation').show(); //// Show game information $('#divOpponentPlayer').show(); //// Show opponent player image. opponentImage = image; //// sets the opponent player image for display $('#opponentImage').attr('src', opponentImage); //// Binds the opponent player image $('#divGameInfo').html("<br/><span><strong> Hey " + playerName + "! You are playing against <i>" + data + "</i> </strong></span>"); //// displays the information of opponent that the player is playing against. //// Draw the tic-tac-toe game board, A 3x3 grid :) by proper styling. for (var i = 0; i < 9; i++) { $("#divGameBoard").append("<span class='marker' id=" + i + " style='display:block;border:2px solid black;height:100px;width:100px;float:left;margin:10px;'>" + i + "</span>"); } }); First we need to have a Game object to track a game, players involved, moves left, and check if there is a winner. We will have a Game class defined as per the following code. The comments detail the purpose of the methods and the properties defined: internal class Game { /// <summary> /// Gets or sets the value indicating whether the game is over. /// </summary> public bool IsOver { get; private set; } /// <summary> /// Gets or sets the value indicating whether the game is draw. /// </summary> public bool IsDraw { get; private set; } /// <summary> /// Gets or sets Player 1 of the game /// </summary> public Player Player1 { get; set; } /// <summary> /// Gets or sets Player 2 of the game /// </summary> public Player Player2 { get; set; } /// <summary> /// For internal housekeeping, To keep track of value in each of the box in the grid. /// </summary> private readonly int[] field = new int[9]; /// <summary> /// The number of moves left. We start the game with 9 moves remaining in a 3x3 grid. /// </summary> private int movesLeft = 9; /// <summary> /// Initializes a new instance of the <see cref="Game"/> class. /// </summary> public Game() { //// Initialize the game for (var i = 0; i < field.Length; i++) { field[i] = -1; } } /// <summary> /// Place the player number at a given position for a player /// </summary> /// <param name="player">The player number would be 0 or 1</param> /// <param name="position">The position where player number would be placed, should be between 0 and ///8, both inclusive</param> /// <returns>Boolean true if game is over and we have a winner.</returns> public bool Play(int player, int position) { if (this.IsOver) { return false; } //// Place the player number at the given position this.PlacePlayerNumber(player, position); //// Check if we have a winner. If this returns true, //// game would be over and would have a winner, else game would continue. return this.CheckWinner(); } } Now we have the entire game mystery solved with the Game class. We know when the game is over, we have the method to place the player marker, and check the winner. The following server side-code on the GameHub will handle Steps 7 and 8: /// <summary> /// The list of games going on. /// </summary> private static readonly ConcurrentBag<Game> games = new ConcurrentBag<Game>(); /// <summary> /// To simulate the coin toss. 
Like heads and tails, 0 belongs to one player and 1 to opponent. /// </summary> private static readonly Random toss = new Random(); /// <summary> /// Finds the opponent for the player and sets the Seraching for Opponent property of player to true. /// We will use the connection id from context to identify the current player. /// Once we have 2 players looking to play, we can pair them and simulate coin toss to start the game. /// </summary> public void FindOpponent() { //// First fetch the player from our players collection having current connection id var player = players.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId); if (player == null) { //// Since player would be registered before making this call, //// we should not reach here. If we are here, something somewhere in the flow above is broken. return; } //// Set that player is seraching for opponent. player.IsSearchingOpponent = true; //// We will follow a queue, so find a player who registered earlier as opponent. //// This would only be the case if more than 2 players are looking for opponent. var opponent = players.Where(x => x.ConnectionId != Context.ConnectionId && x.IsSearchingOpponent && !x.IsPlaying).OrderBy(x =>x.RegisterTime).FirstOrDefault(); if (opponent == null) { //// Could not find any opponent, invoke opponentNotFound method in the client. Clients.Client(Context.ConnectionId) .InvokeAsync(Constants.OpponentNotFound); return; } //// Set both players as playing. player.IsPlaying = true; player.IsSearchingOpponent = false; //// Make him unsearchable for opponent search opponent.IsPlaying = true; opponent.IsSearchingOpponent = false; //// Set each other as opponents. player.Opponent = opponent; opponent.Opponent = player; //// Notify both players that they can play by invoking opponentFound method for both the players. //// Also pass the opponent name and opoonet image, so that they can visualize it. //// Here we are directly using connection id, but group is a good candidate and use here. Clients.Client(Context.ConnectionId) .InvokeAsync(Constants.OpponentFound, opponent.Name, opponent.Image); Clients.Client(opponent.ConnectionId) .InvokeAsync(Constants.OpponentFound, player.Name, player.Image); //// Create a new game with these 2 player and add it to games collection. games.Add(new Game { Player1 = player, Player2 = opponent }); } Here, we have created a games collection to keep track of ongoing games and a Random field named toss to simulate the coin toss. How FindOpponent works is documented in the comments and is intuitive to understand. Once the game starts, each player has to make a move and then wait for the opponent to make a move, until the game ends. The move is made by clicking on the available grid cells. Here, we need to ensure that cell position that is already marked by one of the players is not changed or marked. So, as soon as a valid cell is marked, we set its CSS class to notAvailable so we know that the cell is taken. While clicking on a cell, we will check whether the cell has notAvailablestyle. If yes, it cannot be marked. If not, the cell can be marked and we then send the marked position to the server hub. We also see the waitingForMove, moveMade, gameOver, and opponentDisconnected events invoked by the server based on the game state. The code is commented and is pretty straightforward. 
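One gap worth noting before the client-side move handling: the Game class shown earlier calls PlacePlayerNumber and CheckWinner from its Play method, but their bodies are not included in this excerpt. The sketch below shows one plausible implementation against the field array and movesLeft counter; the winning-line table is standard Tic-Tac-Toe logic and an assumption, not necessarily the book's exact code:

/// <summary>
/// Sketch only: records the player number at the given position and counts down the remaining moves.
/// </summary>
private void PlacePlayerNumber(int player, int position)
{
    if (position >= 0 && position < field.Length && field[position] == -1)
    {
        field[position] = player;
        movesLeft--;
    }
}

/// <summary>
/// Sketch only: checks every row, column and diagonal for a winner; flags a draw when no moves remain.
/// </summary>
private bool CheckWinner()
{
    int[][] lines =
    {
        new[] { 0, 1, 2 }, new[] { 3, 4, 5 }, new[] { 6, 7, 8 }, //// rows
        new[] { 0, 3, 6 }, new[] { 1, 4, 7 }, new[] { 2, 5, 8 }, //// columns
        new[] { 0, 4, 8 }, new[] { 2, 4, 6 }                     //// diagonals
    };
    foreach (var line in lines)
    {
        int first = field[line[0]];
        if (first != -1 && first == field[line[1]] && first == field[line[2]])
        {
            this.IsOver = true;
            return true; //// We have a winner.
        }
    }
    if (movesLeft == 0)
    {
        this.IsOver = true;
        this.IsDraw = true; //// Board is full with no winner: a draw.
    }
    return false;
}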
The moveMade method in the following code makes use of the MoveInformation class, which we will define at the server for sharing move information with both players: //// Triggers on clicking the grid cell. $(document).on('click', '.marker', function () { if ($(this).hasClass("notAvailable")) { //// Cell is already taken. return; } hubConnection.invoke('MakeAMove', $(this)[0].id); //// Cell is valid, send details to hub. }); //// Fires when player has to make a move. hubConnection.on('waitingForMove', data => { $('#divInfo').html("<br/><span><strong> Your turn <i>" + playerName + "</i>! Make a winning move! </strong></span>"); }); //// Fires when move is made by either player. hubConnection.on('moveMade', data => { if (data.Image == playerImage) { //// Move made by player. $("#" + data.ImagePosition).addClass("notAvailable"); $("#" + data.ImagePosition).css('background-image', 'url(' + data.Image + ')'); $('#divInfo').html("<br/><strong>Waiting for <i>" + data.OpponentName + "</i> to make a move. </strong>"); } else { $("#" + data.ImagePosition).addClass("notAvailable"); $("#" + data.ImagePosition).css('background-image', 'url(' + data.Image + ')'); $('#divInfo').html("<br/><strong>Waiting for <i>" + data.OpponentName + "</i> to make a move. </strong>"); } }); //// Fires when the game ends. hubConnection.on('gameOver', data => { $('#divGame').hide(); $('#divInfo').html("<br/><span><strong>Hey " + playerName + "! " + data + " </strong></span>"); $('#divGameBoard').html(" "); $('#divGameInfo').html(" "); $('#divOpponentPlayer').hide(); }); //// Fires when the opponent disconnects. hubConnection.on('opponentDisconnected', data => { $("#divRegister").hide(); $('#divGame').hide(); $('#divGameInfo').html(" "); $('#divInfo').html("<br/><span><strong>Hey " + playerName + "! Your opponent disconnected or left the battle! You are the winner ! Hip Hip Hurray!!!</strong></span>"); }); After every move, both players need to be updated by the server about the move made, so that both players' game boards are in sync. So, on the server side we will need an additional model called MoveInformation, which will contain information on the latest move made by the player and the server will send this model to both the clients to keep them in sync: /// <summary> /// While playing the game, players would make moves. This class contains the information of those moves. /// </summary> internal class MoveInformation { /// <summary> /// Gets or sets the opponent name. /// </summary> public string OpponentName { get; set; } /// <summary> /// Gets or sets the player who made the move. /// </summary> public string MoveMadeBy { get; set; } /// <summary> /// Gets or sets the image position. The position in the game board (0-8) where the player placed his /// image. /// </summary> public int ImagePosition { get; set; } /// <summary> /// Gets or sets the image. The image of the player that he placed in the board (0-8) /// </summary> public string Image { get; set; } } Finally, we will wire up the remaining methods in the GameHub class to complete the game coding. The MakeAMove method is called every time a player makes a move. Also, we have overidden the OnDisconnectedAsync method to inform a player when their opponent disconnects. In this method, we also keep our players and games list current. The comments in the code explain the workings of the methods: /// <summary> /// Invoked by the player to make a move on the board. 
/// </summary> /// <param name="position">The position to place the player</param> public void MakeAMove(int position) { //// Lets find a game from our list of games where one of the player has the same connection Id as the current connection has. var game = games?.FirstOrDefault(x => x.Player1.ConnectionId == Context.ConnectionId || x.Player2.ConnectionId == Context.ConnectionId); if (game == null || game.IsOver) { //// No such game exist! return; } //// Designate 0 for player 1 int symbol = 0; if (game.Player2.ConnectionId == Context.ConnectionId) { //// Designate 1 for player 2. symbol = 1; } var player = symbol == 0 ? game.Player1 : game.Player2; if (player.WaitingForMove) { return; } //// Update both the players that move is made. Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.MoveMade, new MoveInformation { OpponentName = player.Name, ImagePosition = position, Image = player.Image }); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.MoveMade, new MoveInformation { OpponentName = player.Name, ImagePosition = position, Image = player.Image }); //// Place the symbol and look for a winner after every move. if (game.Play(symbol, position)) { Remove<Game>(games, game); Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.GameOver, $"The winner is {player.Name}"); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.GameOver, $"The winner is {player.Name}"); player.IsPlaying = false; player.Opponent.IsPlaying = false; this.Clients.Client(player.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); this.Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); } //// If no one won and its a tame draw, update the players that the game is over and let them look for new game to play. if (game.IsOver && game.IsDraw) { Remove<Game>(games, game); Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.GameOver, "Its a tame draw!!!"); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.GameOver, "Its a tame draw!!!"); player.IsPlaying = false; player.Opponent.IsPlaying = false; this.Clients.Client(player.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); this.Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); } if (!game.IsOver) { player.WaitingForMove = !player.WaitingForMove; player.Opponent.WaitingForMove = !player.Opponent.WaitingForMove; Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.WaitingForOpponent, player.Opponent.Name); Clients.Client(player.ConnectionId) .InvokeAsync(Constants.WaitingForOpponent, player.Opponent.Name); } } With this, we are done with the coding of the game and are ready to run the game app. So there you have it! You've just built your first game in .NET Core! The detailed source code can be downloaded from Github. If you're interested in learning more, head on over to get the book, .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava. Applying Single Responsibility principle from SOLID in .NET Core Unit Testing in .NET Core with Visual Studio 2017 for better code quality Get to know ASP.NET Core Web API [Tutorial]
Writing web services with functional Python programming [Tutorial]

Aaron Lazar
12 Aug 2018
18 min read
In this article we'll understand how functional programming can be applied to web services in Python. This article is an extract from the 2nd edition of the bestseller, Functional Python Programming, written by Steven Lott. We'll look at a RESTful web service, which can slice and dice a source of data and provide downloads as JSON, XML, or CSV files. We'll provide an overall WSGI-compatible wrapper. The functions that do the real work of the application won't be narrowly constrained to fit the WSGI standard. We'll use a simple dataset with four subcollections: the Anscombe Quartet. It's a small set of data but it can be used to show the principles of a RESTful web service. We'll split our application into two tiers: a web tier, which will be a simple WSGI application, and data service tier, which will be more typical functional programming. We'll look at the web tier first so that we can focus on a functional approach to provide meaningful results. We need to provide two pieces of information to the web service: The quartet that we want: this is a slice and dice operation. The idea is to slice up the information by filtering and extracting meaningful subsets. The output format we want. The data selection is commonly done through the request path. We can request /anscombe/I/ or /anscombe/II/ to pick specific datasets from the quartet. The idea is that a URL defines a resource, and there's no good reason for the URL to ever change. In this case, the dataset selectors aren't dependent on dates or some organizational approval status, or other external factors. The URL is timeless and absolute. The output format is not a first-class part of the URL. It's just a serialization format, not the data itself. In some cases, the format is requested through the HTTP Accept header. This is hard to use from a browser, but easy to use from an application using a RESTful API. When extracting data from the browser, a query string is commonly used to specify the output format. We'll use the ?form=json method at the end of the path to specify the JSON output format. A URL we can use will look like this: http://localhost:8080/anscombe/III/?form=csv This would request a CSV download of the third dataset. Creating the Web Server Gateway Interface First, we'll use a simple URL pattern-matching expression to define the one and only routing in our application. In a larger or more complex application, we might have more than one such pattern: import re path_pat= re.compile(r"^/anscombe/(?P<dataset>.*?)/?$") This pattern allows us to define an overall script in the WSGI sense at the top level of the path. In this case, the script is anscombe. We'll take the next level of the path as a dataset to select from the Anscombe Quartet. The dataset value should be one of I, II, III, or IV. We used a named parameter for the selection criteria. In many cases, RESTful APIs are described using a syntax, as follows: /anscombe/{dataset}/ We translated this idealized pattern into a proper, regular expression, and preserved the name of the dataset selector in the path. Here are some example URL paths that demonstrate how this pattern works: >>> m1 = path_pat.match( "/anscombe/I" ) >>> m1.groupdict() {'dataset': 'I'} >>> m2 = path_pat.match( "/anscombe/II/" ) >>> m2.groupdict() {'dataset': 'II'} >>> m3 = path_pat.match( "/anscombe/" ) >>> m3.groupdict() {'dataset': ''} Each of these examples shows the details parsed from the URL path. When a specific series is named, this is located in the path. 
When no series is named, then an empty string is found by the pattern. Here's the overall WSGI application: import traceback import urllib.parse def anscombe_app( environ: Dict, start_response: SR_Func ) -> Iterable[bytes]: log = environ['wsgi.errors'] try: match = path_pat.match(environ['PATH_INFO']) set_id = match.group('dataset').upper() query = urllib.parse.parse_qs(environ['QUERY_STRING']) print(environ['PATH_INFO'], environ['QUERY_STRING'], match.groupdict(), file=log) dataset = anscombe_filter(set_id, raw_data()) content_bytes, mime = serialize( query['form'][0], set_id, dataset) headers = [ ('Content-Type', mime), ('Content-Length', str(len(content_bytes))), ] start_response("200 OK", headers) return [content_bytes] except Exception as e: # pylint: disable=broad-except traceback.print_exc(file=log) tb = traceback.format_exc() content = error_page.substitute( title="Error", message=repr(e), traceback=tb) content_bytes = content.encode("utf-8") headers = [ ('Content-Type', "text/html"), ('Content-Length', str(len(content_bytes))), ] start_response("404 NOT FOUND", headers) return [content_bytes] This application will extract two pieces of information from the request: the PATH_INFO and the QUERY_STRING keys in the environment dictionary. The PATH_INFO request will define which set to extract. The QUERY_STRING request will specify an output format. It's important to note that query strings can be quite complex. Rather than assume it is simply a string like ?form=json, we've used the urllib.parse module to properly locate all of the name-value pairs in the query string. The value with the 'form' key in the dictionary extracted from the query string can be found in query['form'][0]. This should be one of the defined formats. If it isn't, an exception will be raised, and an error page displayed. After locating the path and query string, the application processing is highlighted in bold. These two statements rely on three functions to gather, filter, and serialize the results: The raw_data() function reads the raw data from a file. The result is a dictionary with lists of Pair objects. The anscombe_filter() function accepts a selection string and the dictionary of raw data and returns a single list of Pair objects. The list of pairs is then serialized into bytes by the serialize() function. The serializer is expected to produce byte's, which can then be packaged with an appropriate header, and returned. We elected to produce an HTTP Content-Length header as part of the result. This header isn't required, but it's polite for large downloads. Because we decided to emit this header, we are forced to create a bytes object with the serialization of the data so we can count the bytes. If we elected to omit the Content-Length header, we could change the structure of this application dramatically. Each serializer could be changed to a generator function, which would yield bytes as they are produced. For large datasets, this can be a helpful optimization. For the user watching a download, however, it might not be so pleasant because the browser can't display how much of the download is complete. A common optimization is to break the transaction into two parts. The first part computes the result and places a file into a Downloads directory. The response is a 302 FOUND with a Location header that identifies the file to download. Generally, most clients will then request the file based on this initial response. 
The file can be downloaded by Apache httpd or Nginx without involving the Python application. For this example, all errors are treated as a 404 NOT FOUND error. This could be misleading, since a number of individual things might go wrong. More sophisticated error handling could give more try:/except: blocks to provide more informative feedback. For debugging purposes, we've provided a Python stack trace in the resulting web page. Outside the context of debugging, this is a very bad idea. Feedback from an API should be just enough to fix the request, and nothing more. A stack trace provides too much information to potentially malicious users. Getting raw data Here's what we're using for this application: from Chapter_3.ch03_ex5 import ( series, head_map_filter, row_iter) from typing import ( NamedTuple, Callable, List, Tuple, Iterable, Dict, Any) RawPairIter = Iterable[Tuple[float, float]] class Pair(NamedTuple): x: float y: float pairs: Callable[[RawPairIter], List[Pair]] \ = lambda source: list(Pair(*row) for row in source) def raw_data() -> Dict[str, List[Pair]]: with open("Anscombe.txt") as source: data = tuple(head_map_filter(row_iter(source))) mapping = { id_str: pairs(series(id_num, data)) for id_num, id_str in enumerate( ['I', 'II', 'III', 'IV']) } return mapping The raw_data() function opens the local data file, and applies the row_iter() function to return each line of the file parsed into a row of separate items. We applied the head_map_filter() function to remove the heading from the file. The result created a tuple-of-list structure, which is assigned the variable data. This handles parsing the input into a structure that's useful. The resulting structure is an instance of the Pair subclass of the NamedTuple class, with two fields that have float as their type hints. We used a dictionary comprehension to build the mapping from id_str to pairs assembled from the results of the series() function. The series() function extracts (x, y) pairs from the input document. In the document, each series is in two adjacent columns. The series named I is in columns zero and one; the series() function extracts the relevant column pairs. The pairs() function is created as a lambda object because it's a small generator function with a single parameter. This function builds the desired NamedTuple objects from the sequence of anonymous tuples created by the series() function. Since the output from the raw_data() function is a mapping, we can do something like the following example to pick a specific series by name: >>> raw_data()['I'] [Pair(x=10.0, y=8.04), Pair(x=8.0, y=6.95), ... Given a key, for example, 'I', the series is a list of Pair objects that have the x, y values for each item in the series. Applying a filter In this application, we're using a simple filter. The entire filter process is embodied in the following function: def anscombe_filter( set_id: str, raw_data_map: Dict[str, List[Pair]] ) -> List[Pair]: return raw_data_map[set_id] We made this trivial expression into a function for three reasons: The functional notation is slightly more consistent and a bit more flexible than the subscript expression We can easily expand the filtering to do more We can include separate unit tests in the docstring for this function While a simple lambda would work, it wouldn't be quite as convenient to test. For error handling, we've done exactly nothing. We've focused on what's sometimes called the happy path: an ideal sequence of events. 
Any problems that arise in this function will raise an exception. The WSGI wrapper function should catch all exceptions and return an appropriate status message and error response content. For example, it's possible that the set_id method will be wrong in some way. Rather than obsess over all the ways it could be wrong, we'll simply allow Python to throw an exception. Indeed, this function follows the Python advice that, it's better to seek forgiveness than to ask permission. This advice is materialized in code by avoiding permission-seeking: there are no preparatory if statements that seek to qualify the arguments as valid. There is only forgiveness handling: an exception will be raised and handled in the WSGI wrapper. This essential advice applies to the preceding raw data and the serialization that we will see now. Serializing the results Serialization is the conversion of Python data into a stream of bytes, suitable for transmission. Each format is best described by a simple function that serializes just that one format. A top-level generic serializer can then pick from a list of specific serializers. The picking of serializers leads to the following collection of functions: Serializer = Callable[[str, List[Pair]], bytes] SERIALIZERS: Dict[str, Tuple[str, Serializer]]= { 'xml': ('application/xml', serialize_xml), 'html': ('text/html', serialize_html), 'json': ('application/json', serialize_json), 'csv': ('text/csv', serialize_csv), } def serialize( format: str, title: str, data: List[Pair] ) -> Tuple[bytes, str]: mime, function = SERIALIZERS.get( format.lower(), ('text/html', serialize_html)) return function(title, data), mime The overall serialize() function locates a specific serializer in the SERIALIZERS dictionary, which maps a format name to a two-tuple. The tuple has a MIME type that must be used in the response to characterize the results. The tuple also has a function based on the Serializer type hint. This function will transform a name and a list of Pair objects into bytes that will be downloaded. The serialize() function doesn't do any data transformation. It merely maps a name to a function that does the hard work of transformation. Returning a function permits the overall application to manage the details of memory or file-system serialization. Serializing to the file system, while slow, permits larger files to be handled. We'll look at the individual serializers below. The serializers fall into two groups: those that produce strings and those that produce bytes. A serializer that produces a string will need to have the string encoded as bytes for download. A serializer that produces bytes doesn't need any further work. For the serializers, which produce strings, we can use function composition with a standardized convert-to-bytes function. Here's a decorator that can standardize the conversion to bytes: from typing import Callable, TypeVar, Any, cast from functools import wraps def to_bytes( function: Callable[..., str] ) -> Callable[..., bytes]: @wraps(function) def decorated(*args, **kw): text = function(*args, **kw) return text.encode("utf-8") return cast(Callable[..., bytes], decorated) We've created a small decorator named @to_bytes. This will evaluate the given function and then encode the results using UTF-8 to get bytes. Note that the decorator changes the decorated function from having a return type of str to a return type of bytes. We haven't formally declared parameters for the decorated function, and used ... instead of the details. 
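As a quick, throwaway illustration of what the decorator does (the greet function here is hypothetical and not part of the application):

@to_bytes
def greet(name: str) -> str:
    return "Hello, {0}".format(name)

print(greet("Anscombe"))  # b'Hello, Anscombe' -- the str result is UTF-8 encoded by the decorator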
We'll show how this is used with JSON, CSV, and HTML serializers. The XML serializer produces bytes directly and doesn't need to be composed with this additional function. We could also do the functional composition in the initialization of the serializers mapping. Instead of decorating the function definition, we could decorate the reference to the function object. Here's an alternative definition for the serializer mapping: SERIALIZERS = { 'xml': ('application/xml', serialize_xml), 'html': ('text/html', to_bytes(serialize_html)), 'json': ('application/json', to_bytes(serialize_json)), 'csv': ('text/csv', to_bytes(serialize_csv)), } This replaces decoration at the site of the function definition with decoration when building this mapping data structure. It seems potentially confusing to defer the decoration. Serializing data into JSON or CSV formats The JSON and CSV serializers are similar because both rely on Python's libraries to serialize. The libraries are inherently imperative, so the function bodies are strict sequences of statements. Here's the JSON serializer: import json @to_bytes def serialize_json(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_json( "test", data ) b'[{"x": 2, "y": 3}, {"x": 5, "y": 7}]' """ obj = [dict(x=r.x, y=r.y) for r in data] text = json.dumps(obj, sort_keys=True) return text We created a list-of-dict structure and used the json.dumps() function to create a string representation. The JSON module requires a materialized list object; we can't provide a lazy generator function. The sort_keys=True argument value is helpful for unit testing. However, it's not required for the application and represents a bit of overhead. Here's the CSV serializer: import csv import io @to_bytes def serialize_csv(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_csv("test", data) b'x,y\\r\\n2,3\\r\\n5,7\\r\\n' """ buffer = io.StringIO() wtr = csv.DictWriter(buffer, Pair._fields) wtr.writeheader() wtr.writerows(r._asdict() for r in data) return buffer.getvalue() The CSV module's readers and writers are a mixture of imperative and functional elements. We must create the writer, and properly create headings in a strict sequence. We've used the _fields attribute of the Pair namedtuple to determine the column headings for the writer. The writerows() method of the writer will accept a lazy generator function. In this case, we used the _asdict() method of each Pair object to return a dictionary suitable for use with the CSV writer. Serializing data into XML We'll look at one approach to XML serialization using the built-in libraries. This will build a document from individual tags. A common alternative approach is to use Python introspection to examine and map Python objects and class names to XML tags and attributes. Here's our XML serialization: import xml.etree.ElementTree as XML def serialize_xml(series: str, data: List[Pair]) -> bytes: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_xml( "test", data ) b'<series name="test"><row><x>2</x><y>3</y></row><row><x>5</x><y>7</y></row></series>' """ doc = XML.Element("series", name=series) for row in data: row_xml = XML.SubElement(doc, "row") x = XML.SubElement(row_xml, "x") x.text = str(row.x) y = XML.SubElement(row_xml, "y") y.text = str(row.y) return cast(bytes, XML.tostring(doc, encoding='utf-8')) We created a top-level element, <series>, and placed <row> sub-elements underneath that top element. 
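Before moving on to HTML, here is a hedged, doctest-style view of how the serialize() dispatcher defined earlier ties these serializers together, assuming the module is assembled with the decorated functions registered in SERIALIZERS (the values follow from the doctests above):

>>> content, mime = serialize("json", "test", [Pair(2, 3), Pair(5, 7)])
>>> mime
'application/json'
>>> content
b'[{"x": 2, "y": 3}, {"x": 5, "y": 7}]'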
Within each <row> sub-element, we've created <x> and <y> tags, and assigned text content to each tag. The interface for building an XML document using the ElementTree library tends to be heavily imperative. This makes it a poor fit for an otherwise functional design. In addition to the imperative style, note that we haven't created a DTD or XSD. We have not properly assigned a namespace to our tags. We also omitted the <?xml version="1.0"?> processing instruction that is generally the first item in an XML document. The XML.tostring() function has a type hint that states it returns str. This is generally true, but when we provide the encoding parameter, the result type changes to bytes. There's no easy way to formalize the idea of variant return types based on parameter values, so we use an explicit cast() to inform mypy of the actual type. A more sophisticated serialization library could be helpful here. There are many to choose from. Visit https://wiki.python.org/moin/PythonXml for a list of alternatives. Serializing data into HTML In our final example of serialization, we'll look at the complexity of creating an HTML document. The complexity arises because in HTML, we're expected to provide an entire web page with a great deal of context information. Here's one way to tackle this HTML problem: import string data_page = string.Template("""\ <html> <head><title>Series ${title}</title></head> <body> <h1>Series ${title}</h1> <table> <thead><tr><td>x</td><td>y</td></tr></thead> <tbody> ${rows} </tbody> </table> </body> </html> """) @to_bytes def serialize_html(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_html("test", data) #doctest: +ELLIPSIS b'<html>...<tr><td>2</td><td>3</td></tr>\\n<tr><td>5</td><td>7</td></tr>... """ text = data_page.substitute( title=series, rows="\n".join( "<tr><td>{0.x}</td><td>{0.y}</td></tr>".format(row) for row in data) ) return text Our serialization function has two parts. The first part is a string.Template() function that contains the essential HTML page. It has two placeholders where data can be inserted into the template. The ${title} method shows where title information can be inserted, and the ${rows} method shows where the data rows can be inserted. The function creates individual data rows using a simple format string. These are joined into a longer string, which is then substituted into the template. While workable for simple cases like the preceding example, this isn't ideal for more complex result sets. There are a number of more sophisticated template tools to create HTML pages. A number of these include the ability to embed the looping in the template, separate from the function that initializes serialization. If you found this tutorial useful and would like to learn more such techniques, head over to get Steven Lott's bestseller, Functional Python Programming. What is the difference between functional and object-oriented programming? Should you move to Python 3? 7 Python experts’ opinions Is Python edging R out in the data science wars?
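One practical detail this excerpt does not cover is how to serve anscombe_app locally on the http://localhost:8080 address used in the example URLs. Under the assumption that the application is importable (the module name anscombe_service here is hypothetical), a minimal sketch using the standard library's wsgiref server would be:

from wsgiref.simple_server import make_server

from anscombe_service import anscombe_app  # hypothetical module name; import from wherever the app lives

if __name__ == "__main__":
    httpd = make_server('localhost', 8080, anscombe_app)
    print("Serving on http://localhost:8080/anscombe/I/?form=json")
    httpd.serve_forever()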

Implementing RNN in TensorFlow for spam prediction [Tutorial]

Packt Editorial Staff
11 Aug 2018
11 min read
Artificial neural networks (ANN) are an abstract representation of the human nervous system, which contains a collection of neurons that communicate with each other through connections called axons. A recurrent neural network (RNN) is a class of ANN where connections between units form a directed cycle. RNNs make use of information from the past. That way, they can make predictions in data with high temporal dependencies. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior. In this article we will look at: Implementation of basic RNNs in TensorFlow. An example of how to implement an RNN in TensorFlow for spam predictions. Train a model that will learn to distinguish between spam and non-spam emails using the text of the email. This article is an extract taken from the book Deep Learning with TensorFlow – Second Edition, written by Giancarlo Zaccone, Md. Rezaul Karim. Implementing basic RNNs in TensorFlow TensorFlow has tf.contrib.rnn.BasicRNNCell and tf.nn.rnn_cell. BasicRNNCell, which provide the basic building blocks of RNNs. However, first let's implement a very simple RNN model, without using either of these. The idea is to have a better understanding of what goes on under the hood. We will create an RNN composed of a layer of five recurrent neurons using the ReLU activation function. We will assume that the RNN runs over only two-time steps, taking input vectors of size 3 at each time step. The following code builds this RNN, unrolled through two-time steps: n_inputs = 3 n_neurons = 5 X1 = tf.placeholder(tf.float32, [None, n_inputs]) X2 = tf.placeholder(tf.float32, [None, n_inputs]) Wx = tf.get_variable("Wx", shape=[n_inputs,n_neurons], dtype=tf. float32, initializer=None, regularizer=None, trainable=True, collections=None) Wy = tf.get_variable("Wy", shape=[n_neurons,n_neurons], dtype=tf. float32, initializer=None, regularizer=None, trainable=True, collections=None) b = tf.get_variable("b", shape=[1,n_neurons], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) Y1 = tf.nn.relu(tf.matmul(X1, Wx) + b) Y2 = tf.nn.relu(tf.matmul(Y1, Wy) + tf.matmul(X2, Wx) + b) Then we initialize the global variables as follows: init_op = tf.global_variables_initializer() This network looks much like a two-layer feedforward neural network, but both layers share the same weights and bias vectors. Additionally, we feed inputs at each layer and receive outputs from each layer. X1_batch = np.array([[0, 2, 3], [2, 8, 9], [5, 3, 8], [3, 2, 9]]) # t = 0 X2_batch = np.array([[5, 6, 8], [1, 0, 0], [8, 2, 0], [2, 3, 6]]) # t = 1 These mini-batches contain four instances, each with an input sequence composed of exactly two inputs. At the end, Y1_val and Y2_val contain the outputs of the network at both time steps for all neurons and all instances in the mini-batch. Then we create a TensorFlow session and execute the computational graph as follows: with tf.Session() as sess:        init_op.run()        Y1_val, Y2_val = sess.run([Y1, Y2], feed_dict={X1:        X1_batch, X2: X2_batch}) Finally, we print the result: print(Y1_val) # output at t = 0 print(Y2_val) # output at t = 1 The following is the output: >>> [[ 0. 0. 0. 2.56200171 1.20286 ] [ 0. 0. 0. 12.39334488 2.7824254 ] [ 0. 0. 0. 13.58520699 5.16213894] [ 0. 0. 0. 9.95982838 6.20652485]] [[ 0. 0. 0. 14.86255169 6.98305273] [ 0. 0. 26.35326385 0.66462421 18.31009483] [ 5.12617588 4.76199865 20.55905533 11.71787453 18.92538261] [ 0. 0. 
19.75175095 3.38827515 15.98449326]] The network we created is simple, but if you run it over 100 time steps, for example, the graph is going to be very big. Implementing an RNN for spam prediction In this section, we will see how to implement an RNN in TensorFlow to predict spam/ham from texts. Data description and preprocessing The popular spam dataset from the UCI ML repository will be used, which can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip. The dataset contains texts from several emails, some of which were marked as spam. Here we will train a model that will learn to distinguish between spam and non-spam emails using only the text of the email. Let's get started by importing the required libraries and model: import os import re import io import requests import numpy as np import matplotlib.pyplot as plt import tensorflow as tf from zipfile import ZipFile from tensorflow.python.framework import ops import warnings Additionally, we can stop printing the warning produced by TensorFlow if you want: warnings.filterwarnings("ignore") os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' ops.reset_default_graph() Now, let's create the TensorFlow session for the graph: sess = tf.Session() The next task is setting the RNN parameters: epochs = 300 batch_size = 250 max_sequence_length = 25 rnn_size = 10 embedding_size = 50 min_word_frequency = 10 learning_rate = 0.0001 dropout_keep_prob = tf.placeholder(tf.float32) Let's manually download the dataset and store it in a text_data.txt file in the temp directory. First, we set the path: data_dir = 'temp' data_file = 'text_data.txt' if not os.path.exists(data_dir):    os.makedirs(data_dir) Now, we directly download the dataset in zipped format: if not os.path.isfile(os.path.join(data_dir, data_file)):    zip_url = 'http://archive.ics.uci.edu/ml/machine-learning- databases/00228/smsspamcollection.zip'    r = requests.get(zip_url)    z = ZipFile(io.BytesIO(r.content))    file = z.read('SMSSpamCollection') We still need to format the data: text_data = file.decode()    text_data = text_data.encode('ascii',errors='ignore')    text_data = text_data.decode().split('\n') Now, store in it the directory mentioned earlier in a text file: with open(os.path.join(data_dir, data_file), 'w') as file_conn:        for text in text_data:            file_conn.write("{}\n".format(text)) else:    text_data = []    with open(os.path.join(data_dir, data_file), 'r') as file_conn:        for row in file_conn:            text_data.append(row)    text_data = text_data[:-1] Let's split the words that have a word length of at least 2: text_data = [x.split('\t') for x in text_data if len(x)>=1] [text_data_target, text_data_train] = [list(x) for x in zip(*text_data)] Now we create a text cleaning function: def clean_text(text_string):    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)    text_string = " ".join(text_string.split())    text_string = text_string.lower()    return(text_string) We call the preceding method to clean the text: text_data_train = [clean_text(x) for x in text_data_train] Now we need to do one of the most important tasks, which is creating word embedding –changing text into numeric vectors: vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_sequence_length, min_frequency=min_word_frequency) text_processed = np.array(list(vocab_processor.fit_transform(text_data_train))) Now let's shuffle to make the dataset balance: text_processed = np.array(text_processed) text_data_target = 
np.array([1 if x=='ham' else 0 for x in text_data_target]) shuffled_ix = np.random.permutation(np.arange(len(text_data_target))) x_shuffled = text_processed[shuffled_ix] y_shuffled = text_data_target[shuffled_ix] Now that we have shuffled the data, we can split the data into a training and testing set: ix_cutoff = int(len(y_shuffled)*0.75) x_train, x_test = x_shuffled[:ix_cutoff], x_shuffled[ix_cutoff:] y_train, y_test = y_shuffled[:ix_cutoff], y_shuffled[ix_cutoff:] vocab_size = len(vocab_processor.vocabulary_) print("Vocabulary size: {:d}".format(vocab_size)) print("Training set size: {:d}".format(len(y_train))) print("Test set size: {:d}".format(len(y_test))) Following is the output of the preceding code: >>> Vocabulary size: 933 Training set size: 4180 Test set size: 1394 Before we start training, let's create placeholders for our TensorFlow graph: x_data = tf.placeholder(tf.int32, [None, max_sequence_length]) y_output = tf.placeholder(tf.int32, [None]) Let's create the embedding: embedding_mat = tf.get_variable("embedding_mat", shape=[vocab_size, embedding_size], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) embedding_output = tf.nn.embedding_lookup(embedding_mat, x_data) Now it's time to construct our RNN. The following code defines the RNN cell: cell = tf.nn.rnn_cell.BasicRNNCell(num_units = rnn_size) output, state = tf.nn.dynamic_rnn(cell, embedding_output, dtype=tf.float32) output = tf.nn.dropout(output, dropout_keep_prob) Now let's define the way to get the output from our RNN sequence: output = tf.transpose(output, [1, 0, 2]) last = tf.gather(output, int(output.get_shape()[0]) - 1) Next, we define the weights and the biases for the RNN: weight = bias = tf.get_variable("weight", shape=[rnn_size, 2], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) bias = tf.get_variable("bias", shape=[2], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) The logits output is then defined. It uses both the weight and the bias from the preceding code: logits_out = tf.nn.softmax(tf.matmul(last, weight) + bias) Now we define the losses for each prediction so that later on, they can contribute to the loss function: losses = tf.nn.sparse_softmax_cross_entropy_with_logits_v2(logits=logits_ou t, labels=y_output) We then define the loss function: loss = tf.reduce_mean(losses) We now define the accuracy of each prediction: accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(logits_out, 1), tf.cast(y_output, tf.int64)), tf.float32)) We then create the training_op with RMSPropOptimizer: optimizer = tf.train.RMSPropOptimizer(learning_rate) train_step = optimizer.minimize(loss) Now let's initialize all the variables using the global_variables_initializer() method: init_op = tf.global_variables_initializer() sess.run(init_op) Additionally, we can create some empty lists to keep track of the training loss, testing loss, training accuracy, and the testing accuracy in each epoch: train_loss = [] test_loss = [] train_accuracy = [] test_accuracy = [] Now we are ready to perform the training, so let's get started. The workflow of the training goes as follows: Shuffle the training data Select the training set and calculate generations Run training step for each batch Run loss and accuracy of training Run the evaluation steps. 
The following codes include all of the aforementioned steps: shuffled_ix = np.random.permutation(np.arange(len(x_train)))    x_train = x_train[shuffled_ix]    y_train = y_train[shuffled_ix]    num_batches = int(len(x_train)/batch_size) + 1    for i in range(num_batches):        min_ix = i * batch_size        max_ix = np.min([len(x_train), ((i+1) * batch_size)])        x_train_batch = x_train[min_ix:max_ix]        y_train_batch = y_train[min_ix:max_ix]        train_dict = {x_data: x_train_batch, y_output: \ y_train_batch, dropout_keep_prob:0.5}        sess.run(train_step, feed_dict=train_dict)        temp_train_loss, temp_train_acc = sess.run([loss,\                         accuracy], feed_dict=train_dict)    train_loss.append(temp_train_loss)    train_accuracy.append(temp_train_acc)    test_dict = {x_data: x_test, y_output: y_test, \ dropout_keep_prob:1.0}    temp_test_loss, temp_test_acc = sess.run([loss, accuracy], \                    feed_dict=test_dict)    test_loss.append(temp_test_loss)    test_accuracy.append(temp_test_acc)    print('Epoch: {}, Test Loss: {:.2}, Test Acc: {:.2}'.format(epoch+1, temp_test_loss, temp_test_acc)) print('\nOverall accuracy on test set (%): {}'.format(np.mean(temp_test_acc)*100.0)) Following is the output of the preceding code: >>> Epoch: 1, Test Loss: 0.68, Test Acc: 0.82 Epoch: 2, Test Loss: 0.68, Test Acc: 0.82 Epoch: 3, Test Loss: 0.67, Test Acc: 0.82 … Epoch: 997, Test Loss: 0.36, Test Acc: 0.96 Epoch: 998, Test Loss: 0.36, Test Acc: 0.96 Epoch: 999, Test Loss: 0.35, Test Acc: 0.96 Epoch: 1000, Test Loss: 0.35, Test Acc: 0.96 Overall accuracy on test set (%): 96.19799256324768 Well done! The accuracy of the RNN is above 96%, which is outstanding. Now let's observe how the loss propagates across each iteration and over time: epoch_seq = np.arange(1, epochs+1) plt.plot(epoch_seq, train_loss, 'k--', label='Train Set') plt.plot(epoch_seq, test_loss, 'r-', label='Test Set') plt.title('RNN training/test loss') plt.xlabel('Epochs') plt.ylabel('Loss') plt.legend(loc='upper left') plt.show() Figure 1: a) RNN training and test loss per epoch b) test accuracy per epoch We also plot the accuracy over time: plt.plot(epoch_seq, train_accuracy, 'k--', label='Train Set') plt.plot(epoch_seq, test_accuracy, 'r-', label='Test Set') plt.title('Test accuracy') plt.xlabel('Epochs') plt.ylabel('Accuracy') plt.legend(loc='upper left') plt.show() We discussed the implementation of RNNs in TensorFlow. We saw how to make predictions with data that has a high temporal dependency and how to develop real-life predictive models that make the predictive analytics easier using RNNs. If you want to delve into neural networks and implement deep learning algorithms check out this book, Deep learning with TensorFlow - Second Edition. Top 5 Deep Learning Architectures Understanding Sentiment Analysis and other key NLP concepts Facelifting NLP with Deep Learning
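As a closing, hedged sketch (not part of the original article): once training finishes, the same session and graph can score a new message. This assumes the session, placeholders, clean_text, and vocab_processor defined above are still in scope; the sample message is made up.

# Classify a new, made-up message with the trained graph.
new_texts = [clean_text("Congratulations! You have won a free prize, call now")]
new_processed = np.array(list(vocab_processor.transform(new_texts)))
probs = sess.run(logits_out, feed_dict={x_data: new_processed, dropout_keep_prob: 1.0})
# The label encoding above maps 'ham' to 1 and spam to 0, so column 1 is the ham probability.
print('ham' if np.argmax(probs[0]) == 1 else 'spam')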