How-To Tutorials

Create enterprise-grade Angular forms in TypeScript [Tutorial]

Sugandha Lahoti
04 Jul 2018
11 min read
TypeScript is an open-source programming language that adds optional static typing to JavaScript. To give you a flavor of its benefits, here is a quick look at some of the things TypeScript brings to the table:

- A compilation step
- Strong or static typing
- Type definitions for popular JavaScript libraries
- Encapsulation
- Private and public member variables
- Decorators

In this article, we will learn how to build forms with TypeScript. We will cover as much as it takes to build business applications that collect user information. Here is a breakdown of what you should expect from this article:

- Typed form input and output
- Form controls
- Validation
- Form submission and handling

This article is an excerpt from the book TypeScript 2.x for Angular Developers, written by Chris Nwamba.

Creating types for forms

We want to utilize TypeScript as much as possible, as it simplifies our development process and makes our app behavior more predictable. For this reason, we will create a simple data class to serve as a type for the form values.

First, create a new Angular project to follow along with the examples. Then, use the following command to create a new class:

```
ng g class flight
```

The class is generated in the app folder; replace its content with the following data class:

```typescript
export class Flight {
  constructor(
    public fullName: string,
    public from: string,
    public to: string,
    public type: string,
    public adults: number,
    public departure: Date,
    public children?: number,
    public infants?: number,
    public arrival?: Date,
  ) {}
}
```

This class represents all the values our form (yet to be created) will have. The properties followed by a question mark (?) are optional, which means that TypeScript will throw no errors when the respective values are not supplied.
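To see how the optional properties behave, here is a small, hypothetical sketch (not part of the book's code) that constructs Flight instances with and without the optional fields; the names and dates are placeholders:

```typescript
import { Flight } from './flight';

// All required fields supplied; optional fields omitted entirely.
const oneWay = new Flight('Ada Obi', 'Lagos', 'London', 'One Way', 2, new Date('2018-08-01'));

// Optional fields supplied as well.
const returnTrip = new Flight(
  'Ada Obi', 'Lagos', 'London', 'Return', 2, new Date('2018-08-01'),
  1,                      // children (optional)
  0,                      // infants (optional)
  new Date('2018-08-15')  // arrival (optional)
);

console.log(oneWay.children); // undefined -- and the compiler does not complain
```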
Before jumping into creating forms, let's start with a clean slate. Replace the contents of app.component.html with the following:

```html
<div class="container">
  <h3 class="text-center">Book a Flight</h3>
  <div class="col-md-offset-3 col-md-6">
    <!-- TODO: Form here -->
  </div>
</div>
```

Run the app and leave it running. You should see the page heading served at port 4200 of localhost (remember to include Bootstrap).

The form module

Now that we have a contract that we want the form to follow, let's generate the form's component:

```
ng g component flight-form
```

The command also adds the component as a declaration to our App module:

```typescript
import { BrowserModule } from '@angular/platform-browser';
import { NgModule } from '@angular/core';

import { AppComponent } from './app.component';
import { FlightFormComponent } from './flight-form/flight-form.component';

@NgModule({
  declarations: [
    AppComponent,
    // Component added after being generated
    FlightFormComponent
  ],
  imports: [
    BrowserModule
  ],
  providers: [],
  bootstrap: [AppComponent]
})
export class AppModule { }
```

What makes Angular forms special and easy to use are functionalities provided out of the box, such as the NgForm directive. Such functionalities are not available in the core browser module but in the forms module. Hence, we need to import it:

```typescript
import { BrowserModule } from '@angular/platform-browser';
import { NgModule } from '@angular/core';
// Import the forms module
import { FormsModule } from '@angular/forms';

import { AppComponent } from './app.component';
import { FlightFormComponent } from './flight-form/flight-form.component';

@NgModule({
  declarations: [
    AppComponent,
    FlightFormComponent
  ],
  imports: [
    BrowserModule,
    // Add the forms module to the imports array
    FormsModule
  ],
  providers: [],
  bootstrap: [AppComponent]
})
export class AppModule { }
```

Simply importing FormsModule and adding it to the imports array is all we need to do.

Two-way binding

The perfect time to start showing some form controls using the form component in the browser is right now. Keeping the state in sync between the data layer (model) and the view can be very challenging, but with Angular it's just a matter of using one directive exposed by FormsModule:

```html
<!-- ./app/flight-form/flight-form.component.html -->
<form>
  <div class="form-group">
    <label for="fullName">Full Name</label>
    <input
      type="text"
      id="fullName"
      class="form-control"
      [(ngModel)]="flightModel.fullName"
      name="fullName"
    >
  </div>
</form>
```

Angular relies on the name attribute internally to carry out binding. For this reason, the name attribute is required.

Pay attention to [(ngModel)]="flightModel.fullName"; it binds a property on the component class to the form. This model will be of the Flight type, which is the class we created earlier:

```typescript
// ./app/flight-form/flight-form.component.ts
import { Component, OnInit } from '@angular/core';
import { Flight } from '../flight';

@Component({
  selector: 'app-flight-form',
  templateUrl: './flight-form.component.html',
  styleUrls: ['./flight-form.component.css']
})
export class FlightFormComponent implements OnInit {
  flightModel: Flight;

  constructor() {
    this.flightModel = new Flight('', '', '', '', 0, new Date(), 0, 0, new Date());
  }

  ngOnInit() {}
}
```

The flightModel property is added to the component as a Flight type and initialized with some default values (note that departure and arrival are typed as Date, so they are initialized with Date objects rather than empty strings).

Include the component in the app HTML, so it can be displayed in the browser:

```html
<div class="container">
  <h3 class="text-center">Book a Flight</h3>
  <div class="col-md-offset-3 col-md-6">
    <app-flight-form></app-flight-form>
  </div>
</div>
```

The form field now shows up in the browser. To see two-way binding in action, use interpolation to display the value of flightModel.fullName. Then, enter a value and watch the live update:

```html
<form>
  <div class="form-group">
    <label for="fullName">Full Name</label>
    <input
      type="text"
      id="fullName"
      class="form-control"
      [(ngModel)]="flightModel.fullName"
      name="fullName"
    >
    {{flightModel.fullName}}
  </div>
</form>
```

More form fields

Let's get hands-on and add the remaining form fields. After all, we can't book a flight by just supplying our names.

The from and to fields are going to be select boxes with a list of cities we can fly into and out of. This list of cities will be stored right in our component class, and then we can iterate over it in the template and render it as a select box:

```typescript
export class FlightFormComponent implements OnInit {
  flightModel: Flight;
  // Array of cities
  cities: Array<string> = [
    'Lagos',
    'Mumbai',
    'New York',
    'London',
    'Nairobi'
  ];

  constructor() {
    this.flightModel = new Flight('', '', '', '', 0, new Date(), 0, 0, new Date());
  }
}
```

The array stores a few cities from around the world as strings.
Let's now use the ngFor directive to iterate over the cities and display them on the form using a select box:

```html
<div class="row">
  <div class="col-md-6">
    <label for="from">From</label>
    <select type="text" id="from" class="form-control" [(ngModel)]="flightModel.from" name="from">
      <option *ngFor="let city of cities" value="{{city}}">{{city}}</option>
    </select>
  </div>
  <div class="col-md-6">
    <label for="to">To</label>
    <select type="text" id="to" class="form-control" [(ngModel)]="flightModel.to" name="to">
      <option *ngFor="let city of cities" value="{{city}}">{{city}}</option>
    </select>
  </div>
</div>
```

Neat and clean! When clicked, each select drop-down shows the list of cities, as expected.

Next, let's add the trip type field (radio buttons), the departure date field (date control), and the arrival date field (date control):

```html
<div class="row" style="margin-top: 15px">
  <div class="col-md-5">
    <label for="" style="display: block">Trip Type</label>
    <label class="radio-inline">
      <input type="radio" name="type" [(ngModel)]="flightModel.type" value="One Way"> One way
    </label>
    <label class="radio-inline">
      <input type="radio" name="type" [(ngModel)]="flightModel.type" value="Return"> Return
    </label>
  </div>
  <div class="col-md-4">
    <label for="departure">Departure</label>
    <input type="date" id="departure" class="form-control" [(ngModel)]="flightModel.departure" name="departure">
  </div>
  <div class="col-md-3">
    <label for="arrival">Arrival</label>
    <input type="date" id="arrival" class="form-control" [(ngModel)]="flightModel.arrival" name="arrival">
  </div>
</div>
```

The data is bound to these controls in much the same way as the text and select fields we created previously. The major difference is the type of control (radio buttons and dates).

Lastly, add the number of passengers (adults, children, and infants):

```html
<div class="row" style="margin-top: 15px">
  <div class="col-md-4">
    <label for="adults">Adults</label>
    <input type="number" id="adults" class="form-control" [(ngModel)]="flightModel.adults" name="adults">
  </div>
  <div class="col-md-4">
    <label for="children">Children</label>
    <input type="number" id="children" class="form-control" [(ngModel)]="flightModel.children" name="children">
  </div>
  <div class="col-md-4">
    <label for="infants">Infants</label>
    <input type="number" id="infants" class="form-control" [(ngModel)]="flightModel.infants" name="infants">
  </div>
</div>
```

The passenger fields are all of the number type because we only need to pick the number of passengers coming on board in each category.

Validating the form and form fields

Angular greatly simplifies form validation by using its built-in directives and state properties. You can use the state properties to check whether a form field has been touched. If it's touched but violates a validation rule, you can use the ngIf directive to display the associated errors. Let's see an example of validating the full name field:

```html
<div class="form-group">
  <label for="fullName">Full Name</label>
  <input
    type="text"
    id="fullName"
    class="form-control"
    [(ngModel)]="flightModel.fullName"
    name="fullName"
    #name="ngModel"
    required
    minlength="6">
</div>
```

We just added three extra significant attributes to our form's full name field: #name, required, and minlength. The #name attribute is completely different from the name attribute: the former is a template variable that holds information about this field via the ngModel value, while the latter is the usual form input name attribute.
In Angular, validation rules are passed as attributes, which is why required and minlength are there. The fields are now validated, but there is no feedback to the user about what went wrong. Let's add some error messages to be shown when form fields are violated:

```html
<div *ngIf="name.invalid && (name.dirty || name.touched)" class="text-danger">
  <div *ngIf="name.errors.required">
    Name is required.
  </div>
  <div *ngIf="name.errors.minlength">
    Name must be at least 6 characters long.
  </div>
</div>
```

The ngIf directive shows these div elements conditionally:

- If the form field has been touched but there's no value in it, the "Name is required" error is shown.
- "Name must be at least 6 characters long" is shown when the field has been touched but the content length is less than 6.

Submitting forms

We need to consider a few factors before submitting a form:

- Is the form valid?
- Is there a handler for the form prior to submission?

To make sure that the form is valid, we can disable the Submit button:

```html
<form #flightForm="ngForm">
  <div class="form-group" style="margin-top: 15px">
    <button class="btn btn-primary btn-block" [disabled]="!flightForm.form.valid">
      Submit
    </button>
  </div>
</form>
```

First, we add a template variable called flightForm to the form and then use the variable to check whether the form is valid. If the form is invalid, we disable the button from being clicked.

To handle the submission, add an ngSubmit event to the form. This event will be fired when the button is clicked:

```html
<form #flightForm="ngForm" (ngSubmit)="handleSubmit()">
  ...
</form>
```

You can now add a class method, handleSubmit, to handle the form submission. A simple log to the console may be enough for this example:

```typescript
export class FlightFormComponent implements OnInit {
  flightModel: Flight;
  cities: Array<string> = [
    ...
  ];

  constructor() {
    this.flightModel = new Flight('', '', '', '', 0, new Date(), 0, 0, new Date());
  }

  // Handler for submission
  handleSubmit() {
    console.log(this.flightModel);
  }
}
```
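As a small extension (not from the book), the handler can also receive the NgForm itself through the flightForm template variable and reset it after a successful submission. A sketch, assuming you pass the form reference in from the template:

```html
<!-- hypothetical: pass the NgForm reference into the handler -->
<form #flightForm="ngForm" (ngSubmit)="handleSubmit(flightForm)">
  ...
</form>
```

```typescript
import { NgForm } from '@angular/forms';

// ...inside FlightFormComponent, a hypothetical variant of the handler shown above
handleSubmit(form: NgForm) {
  if (form.valid) {
    console.log(this.flightModel);
    form.resetForm(); // clears the values and resets the validation state
  }
}
```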
In this article, we discussed collecting user input via forms. We covered important features of forms, such as typed inputs, validation, two-way binding, and submission. All of this should prepare you for getting started with building business applications. If you liked this article, you may want to read the book TypeScript 2.x for Angular Developers to learn about typed DOM events and event handling, among other interesting things you can do with TypeScript.

- TypeScript 2.9 release candidate is here
- How to install and configure TypeScript
- How to work with classes in TypeScript

Why do React developers love Redux for state management?

Sugandha Lahoti
03 Jul 2018
3 min read
Redux is an implementation of Flux, a pattern for managing application state in React. Redux brings a clean and testable design to the table using a purely functional approach. It fills a missing piece in the React ecosystem and sits at the core of most complex React projects. This video tutorial talks about why Redux is needed and touches upon the Redux flow.

Why Redux?

If you have written a large-scale application before, you will know that managing application state can become a pain as the app grows. Application state includes server responses, cached data, and data that has not been persisted to the server yet. Furthermore, the user interface (UI) state constantly increases in complexity.

Let's take the example of an e-commerce website. Such a website contains a lot of components — for instance, the product view, the menu section, and the filter panel. Whenever we have such a complex app, whether it is a mobile or a web app, it becomes difficult for components to communicate and to know each other's updated state. For instance, when you interact with the price filter slider, the product view changes. This can work if a parent component calls the child component and they share properties, but that only scales to simple apps. For complex apps, it becomes difficult to manage the state and the update history across multiple components. Redux comes to the rescue here. In order to understand how Redux functions, we will go through its flow.

Redux Flow

Action

Whenever a state change occurs in the components, it triggers an action creator. An action creator is a function that returns an action. Actions are plain JavaScript objects of information that send data from your application to your store. They are the only source of information for the store.

Reducers

After the action creator returns this object, it is handled by reducers. Reducers specify how the application's state changes in response to the actions sent to the store, depending on the action type.

Store

The store is the object that brings them together. It holds the application state, allows access to that state, and allows the state to be updated.

Provider

The provider distributes the data retrieved from a store to all the other components by wrapping a main base component.

This may all seem highly theoretical and a bit difficult to digest at first, but once you apply it in practice, you will get used to the terminology and to how Redux flows.
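To make the flow above concrete, here is a minimal, hypothetical sketch of the action → reducer → store cycle using the redux package; the names priceFilterChanged and filtersReducer are illustrative, not from the video:

```javascript
const { createStore } = require('redux');

// Action creator: returns a plain action object
const priceFilterChanged = (maxPrice) => ({
  type: 'PRICE_FILTER_CHANGED',
  payload: maxPrice
});

// Reducer: computes the next state based on the action type
const filtersReducer = (state = { maxPrice: Infinity }, action) => {
  switch (action.type) {
    case 'PRICE_FILTER_CHANGED':
      return { ...state, maxPrice: action.payload };
    default:
      return state;
  }
};

// Store: holds the state and lets interested code subscribe to changes
const store = createStore(filtersReducer);
store.subscribe(() => console.log('new state:', store.getState()));

// Interacting with the price slider would dispatch the action
store.dispatch(priceFilterChanged(100)); // new state: { maxPrice: 100 }
```

In a React application, the Provider component from react-redux would then wrap the root component and make this store available to the rest of the component tree.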
Don't forget to watch the video tutorial from Learning React Native Development by Mifta Sintaha to know more about Redux. For a comprehensive guide to building React Native mobile apps, buy the full video course from the Packt store.

- Introduction to Redux
- Creating Reusable Generic Modals in React and Redux
- Minko Gechev: "Developers should learn all major front-end frameworks to go to the next level"

Automating OpenStack Networking and Security with Ansible 2 [Tutorial]

Vijin Boricha
03 Jul 2018
9 min read
OpenStack is software that helps us build a system similar to popular cloud providers such as AWS or GCP. OpenStack provides an API and a dashboard to manage the resources that it controls. Basic operations, such as creating and managing virtual machines, block storage, object storage, identity management, and so on, are supported out of the box.

This is an excerpt from Ansible 2 Cloud Automation Cookbook, written by Aditya Patawari and Vikas Aggarwal. No matter which cloud platform you are using, this book will help you orchestrate your cloud infrastructure.

In the case of OpenStack, we control the underlying hardware and network, which comes with its own pros and cons: we can use custom network solutions, and we can use economical equipment or high-end devices, depending on the actual need. This can get us the features we want and may end up saving money. In this article, we will leverage Ansible 2 to automate some not-so-common networking tasks in OpenStack.

Caution: Although OpenStack can be hosted on premises, several cloud providers offer OpenStack as a service. These providers may choose to turn off certain features or provide add-on features. Even while configuring OpenStack in a self-hosted environment, we may choose to toggle certain features or configure a few things differently. Therefore, inconsistencies may occur. All the code examples in this article were tested on a self-hosted OpenStack Pike (released in August 2017) running on CentOS 7.4.

Managing security groups

Security groups are firewalls that can be used to allow or disallow the flow of traffic. They can be applied to virtual machines. Security groups and virtual machines have a many-to-many relationship: a single security group can be applied to multiple virtual machines, and a single virtual machine can have multiple security groups.

How to do it…

Let's create a security group as follows:

```yaml
- name: create a security group for web servers
  os_security_group:
    name: web-sg
    state: present
    description: security group for web servers
```

The name parameter has to be unique. The description parameter is optional, but we recommend using it to state the purpose of the security group.

The preceding task will create a security group for us, but there are no rules attached to it. A firewall without any rules is of little use, so let's go ahead and add a rule to allow access to port 80 as follows:

```yaml
- name: allow port 80 for http
  os_security_group_rule:
    security_group: web-sg
    protocol: tcp
    port_range_min: 80
    port_range_max: 80
    remote_ip_prefix: 0.0.0.0/0
```

We also need SSH access to this server, so we should allow port 22 as well:

```yaml
- name: allow port 22 for SSH
  os_security_group_rule:
    security_group: web-sg
    protocol: tcp
    port_range_min: 22
    port_range_max: 22
    remote_ip_prefix: 0.0.0.0/0
```

How it works…

For this module, we need to specify the name of the security group; the rule that we are creating will be associated with this group. We have to supply the protocol and the port range information. If we want to whitelist only one port, then that port is both the upper and lower bound of the range. Lastly, we have to specify the allowed addresses in the form of a CIDR. The address 0.0.0.0/0 signifies that port 80 is open to everyone. This task adds an ingress-type rule and allows traffic on port 80 to reach the instance. Similarly, the second rule allows traffic on port 22.
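As a quick illustration (not part of the recipe), the security group can then be attached to an instance at boot time through the os_server module's security_groups parameter; the image, flavor, and keypair names below are placeholders for whatever exists in your environment:

```yaml
- name: boot a web server with the web-sg security group attached
  os_server:
    name: web01
    state: present
    image: centos-7.4            # placeholder image name
    flavor: m1.small             # placeholder flavor
    key_name: my-keypair         # placeholder keypair
    security_groups:
      - web-sg
      - default
```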
Managing network resources

A network is a basic building block of the infrastructure. Most cloud providers will supply a sample or default network, and while setting up a self-hosted OpenStack instance, a single network is typically created automatically. However, if the network is not created, or if we want to create another network for the purpose of isolation or compliance, we can do so using the os_network module.

How to do it…

Let's go ahead and create an isolated network and name it private, as follows:

```yaml
- name: creating a private network
  os_network:
    state: present
    name: private
```

In the preceding example, we created a logical network with no subnets. A network with no subnets is of little use, so the next step is to create a subnet:

```yaml
- name: creating a private subnet
  os_subnet:
    state: present
    network_name: private
    name: app
    cidr: 192.168.0.0/24
    dns_nameservers:
      - 8.8.4.4
      - 8.8.8.8
    host_routes:
      - destination: 0.0.0.0/0
        nexthop: 104.131.86.234
      - destination: 192.168.0.0/24
        nexthop: 192.168.0.1
```

How it works…

The preceding task creates a subnet named app in the network called private. We have also supplied a CIDR for the subnet, 192.168.0.0/24. We are using Google DNS for the nameservers as an example here, but this information should be obtained from the IT department of the organization. Similarly, we have set up example host routes, but this information should be obtained from the IT department as well. After successful execution of this recipe, our network is ready to use.
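To make instances on this subnet reachable from outside, the subnet is typically attached to a router with an external gateway. The following is a sketch only, assuming an external network named public already exists in your deployment:

```yaml
- name: creating a router that connects the app subnet to the outside world
  os_router:
    state: present
    name: app-router
    network: public          # assumed external (gateway) network
    interfaces:
      - app                  # the subnet created above
```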
User management

OpenStack provides an elaborate user management mechanism. If we are coming from a typical third-party cloud provider, such as AWS or GCP, it can look overwhelming. The following list explains the building blocks of user management:

- Domain: A collection of projects and users that defines an administrative entity. Typically, a domain represents a company or a customer account. For a self-hosted setup, domains could be created on the basis of departments or environments. A user with administrative privileges on the domain can further create projects, groups, and users.
- Group: A collection of users owned by a domain. We can add and remove privileges from a group, and our changes will be applied to all the users within the group.
- Project: A project creates a virtual isolation for resources and objects. This can be used to separate departments and environments as well.
- Role: A collection of privileges that can be applied to groups or users.
- User: A user can be a person or a virtual entity, such as a program, that accesses OpenStack services.

For complete documentation of the user management components, go through the OpenStack Identity document at https://docs.openstack.org/keystone/pike/admin/identity-concepts.html.

How to do it…

Let's go ahead and start creating some of these basic building blocks of user management. We should note that, most likely, a default version of these building blocks will already be present in most setups.

1. We'll start with a domain called demodomain, as follows:

```yaml
- name: creating a demo domain
  os_keystone_domain:
    name: demodomain
    description: Demo Domain
    state: present
  register: demo_domain
```

2. After we have the domain, let's create a role, as follows:

```yaml
- name: creating a demo role
  os_keystone_role:
    state: present
    name: demorole
```

3. Projects can be created as follows:

```yaml
- name: creating a demo project
  os_project:
    state: present
    name: demoproject
    description: Demo Project
    domain_id: "{{ demo_domain.id }}"
    enabled: True
```

4. Once we have a role and a project, we can create a group, as follows:

```yaml
- name: creating a demo group
  os_group:
    state: present
    name: demogroup
    description: "Demo Group"
    domain_id: "{{ demo_domain.id }}"
```

5. Let's create our first user:

```yaml
- name: creating a demo user
  os_user:
    name: demouser
    password: secret-pass
    update_password: on_create
    email: demo@example.com
    domain: "{{ demo_domain.id }}"
    state: present
```

6. Now we have a user and a group. Let's add the user to the group that we created before:

```yaml
- name: adding user to the group
  os_user_group:
    user: demouser
    group: demogroup
```

7. We can also associate a user or a group with a role:

```yaml
- name: adding role to the group
  os_user_role:
    group: demogroup
    role: demorole
    domain: "{{ demo_domain.id }}"
```

How it works…

In step 1, os_keystone_domain takes a name as a mandatory parameter. We also supplied a description for our convenience. We are going to use the details of this domain later, so we saved it in a variable called demo_domain.

In step 2, os_keystone_role just takes a name and creates a role. Note that a role is not tied to a domain.

In step 3, the os_project module requires a name. We added a description for our convenience. Projects are tied to a domain, so we used the demo_domain variable that we registered in a previous task.

In step 4, groups are tied to domains as well, so along with the name, we specify the description and domain ID as we did before. At this point, the group is empty, and there are no users associated with it.

In step 5, we supply a name along with a password for the user. The update_password parameter is set to on_create, which means that the password won't be modified for an existing user. This is great for the sake of idempotency. We also specify the email ID, which would be required for recovering the password and several other use cases. Lastly, we add the domain ID to create the user in the right domain.

In step 6, the os_user_group module helps us associate demouser with demogroup.

In step 7, os_user_role takes a parameter for a user or group and associates it with a role.

A lot of these divisions might not be required for every organization. We recommend going through the documentation and understanding the use case for each of them. Another point to note is that we might not even see the user management bits on a day-to-day basis. Depending on the setup and our responsibilities, we might only interact with modules that manage resources, such as virtual machines and storage.
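If the python-openstackclient package is available, a quick way to confirm that the playbook did what we expected is to list the objects it created; this is just a verification sketch, not part of the recipe:

```
# Verify the building blocks created above
openstack domain list
openstack project list --domain demodomain
openstack group list --domain demodomain
openstack user list --domain demodomain
openstack role assignment list --names
```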
We learned how to solve complex OpenStack networking tasks with Ansible 2. To learn more about managing other public cloud platforms, such as AWS and Azure, refer to the book Ansible 2 Cloud Automation Cookbook.

- Getting Started with Ansible 2
- System Architecture and Design of Ansible
- An In-depth Look at Ansible Plugins

Interactive dashboard with vRealize Operations Manager [Tutorial]

Vijin Boricha
03 Jul 2018
14 min read
Creating a dashboard is a relatively simple exercise; creating a good dashboard requires tuning and some tweaking. The tricky part, and one of the biggest challenges when creating a dashboard, is placing all the relevant information on a single pane of glass. The number one goal when creating a dashboard is to get all the information across at a glance.

This is an excerpt from Mastering vRealize Operations Manager - Second Edition, written by Spas Kaloferov, Scott Norris, and Christopher Slater.

Out of the 46 widgets vRealize Operations 6.6 has available, we will only use a handful regularly. The most commonly used widgets, from experience, are the scoreboard, metric picker, heat map, object list, and metric chart. The rest are generally only used for specific use cases.

There are basically two types of dashboards that we can create: an interactive dashboard or a static dashboard. An interactive dashboard is typically used for troubleshooting or similar activities, where you expect the user to interact with widgets to get the information they are after. A static or display dashboard typically uses self-providing widgets such as scoreboards and heatmaps, and is designed for display monitors or other situations where an administrator is keeping an eye on environment changes.

Each widget has the ability to be a self-provider, which means we set the information we want to display directly in the widget. The other option is to set up interactions and have widgets provide information based on an object or metric selection in another widget.

In this article, we will focus on the interactive dashboard. We will create a dashboard that looks at vSphere cluster information and, at a glance, shows the overall health and general cluster information an administrator would need. Working through this will give you the knowledge needed to create any type of dashboard; it shows how to configure the more common widgets in a way that can be replicated on a greater scale.

When creating a dashboard, you will generally go through the following steps:

1. Start the New Dashboard wizard from the Actions menu.
2. Configure the general dashboard settings.
3. Add and configure individual widgets.
4. (Optional) Configure widget interactions.
5. (Optional) Configure dashboard navigation.

Creating an interactive dashboard

You can create a dashboard by using the New Dashboard wizard. Alternatively, you can clone an existing dashboard and modify the clone. Perform the following steps to create a new dashboard:

1. Navigate to the Dashboards page, click Actions, and then click Create Dashboard.
2. Under Dashboard Configuration, give the dashboard a meaningful name and provide a description. If you set Is default to Yes, the dashboard appears on the home page when you log in. By default, the Recommendations dashboard appears on the home page when a user logs in; you can change the default dashboard.
3. Next, click Widget List to bring up all the available widgets. Click and drag the widgets we need from the left pane to the right. We will use the following:
   - Object List
   - Metric Picker
   - Metric Chart
   - Generic Scoreboard
   - Heat Map

You can arrange widgets in the dashboard by dragging them to the desired column position.
The left pane and the right pane are collapsed so that you have more room for your dashboard workspace, which is the center pane. To edit a widget, click the little pen icon at the top of the widget.

The Object List

The Object List widget configuration options include some of the more common options, such as Title, Refresh Content, and Refresh Interval. Options also exist that are specific to this widget:

- Mode: You can select Self, Children, or Parent mode. This setting is used by widget interactions.
- Auto Select First Row: This enables you to select whether or not to start with the first row of data.
- Select which tags to filter: This enables you to select objects from an object tree to observe. For example, you can choose to observe information about objects managed by the vCenter Server instance named VCVA01.

You can add different metrics using the Additional Column option during widget configuration. Using the Additional Column pane, you can add metrics that are specific to each object in the data grid columns.

Perform the following steps to edit the Object List widget in our example dashboard:

1. Click the pen icon on the Object List widget and the Edit Object List window will appear.
2. Edit the Title, select On for Auto Select First Row, and select Cluster Compute Resource under Object Types Tag. Click Save to continue.

With tag selection, multiple tags can be selected; if this is done, only objects that fall under both tag types will be shown in the widget.

The next thing we want to do is click Widget Interactions on the left pane; this is where we link the widgets. For example, when we select a virtual machine in an Object List widget, any linked widgets change to display the information of that object. We will see a Selected Object(s) drop-down list followed by a green arrow pointing to each widget, meaning that whatever we select in the drop-down list feeds the associated widget. Here, our new Cluster List will feed Metric Picker, Scoreboard, and Heatmap, while Metric Picker will feed Metric Chart. Notice that a widget like Metric Chart can be fed by more than one widget. Click APPLY INTERACTIONS to save the links.

The Metric Picker

Now, if we select a metric in the Metric Picker widget, it is plotted in the Metric Chart widget. Metric Picker contains all the available metrics for the selected object, such as an ESXi host or a virtual machine.

The Heatmap

Next up, we will edit the Heatmap widget. For this example, we will use the Heatmap widget to display the capacity remaining for the datastores attached to the vSphere cluster. This is the best way to see at a glance that none of the datastores are over 90% used or getting close. We need to make the following changes:

1. Give the widget a new name describing what we are trying to achieve.
2. Change Group By to Cluster Compute Resource - this is what we want the parent container to be.
3. Change Mode to Instance - this mode type is best used when interacting with other widgets to get its objects.
4. Change Object Type to Datastore - this is the object that we want displayed.
5. Change Attribute Kind to Disk Space | Capacity Remaining (%) - the metric of the object that we want to use.
6. Change the color range to 0 (Min) and 20 (Max) - because we really only want to know whether a datastore is getting close to the threshold, minimizing the range gives us more granular colors.
7. Swap the colors around, making it red on the left and green on the right. This is done by clicking the little color square at each end and picking a new color. We do this because the metric is capacity remaining, so 0% remaining needs to be red.

Click Save, and the heatmap now shows one square per datastore. Move the mouse over each box to reveal more detail about the object.

The Scoreboard

Time to modify the last widget. This one will be a little more complicated because of how we display what we want while remaining interactive. When we configured the widget interactions, the Scoreboard widget was populated automatically with a default set of metrics.

Now, let's go back to our dashboard creation and edit the Scoreboard widget. It has quite a lot of configuration options compared to the others, most of which control how the boxes are laid out, such as the number of columns, the box size, and the rounding of decimals. What we want to do for this widget is:

1. Name the scoreboard something meaningful.
2. Round the decimals to 1 - this cuts down the number of decimal places returned in the displayed value.
3. Under Metric Configuration, choose the Host-Util file from the drop-down list.

But what about the object selection in the lower half of the Scoreboard widget? It is only used if we make the widget a self-provider, which is an option at the top left of the edit window. We can choose objects and metrics there, but they are ignored when Self Provider is set to Off. If we now click Save, we should see the new configuration of the Scoreboard widget. I've also changed the Visual Theme to Original in the Scoreboard widget configuration options to change the way the scoreboard visualizes the information.

The Scoreboard widget may not always display the information we need. To get the widget to display the information we want, while remaining interactive to our selections in the Cluster List widget, we have to create a metric configuration (XML) file.

Metric Configuration Files (XML)

Most widgets are edited through the GUI with the objects and metrics we want displayed, but some require a metric configuration file to define which metrics the widget should display. Metric configuration files provide a custom set of metrics for the supported widgets, and they store the metric attribute keys in XML format.
These widgets support customization using metric configuration files:

- Scoreboard
- Metric Chart
- Property List
- Rolling View Chart
- Sparkline Chart
- Topology Graph

To keep this simple, we will configure four metrics to be displayed:

- CPU usage for the cluster in %
- CPU demand for the cluster
- Memory ballooning
- CPU usage for the cluster in MHz

Perform the following steps to create a metric configuration file:

1. Open a text editor, add the following code, and save it as an XML file; in this case we will call it clusterexample.xml:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AdapterKinds>
  <AdapterKind adapterKindKey="VMWARE">
    <ResourceKind resourceKindKey="ClusterComputeResource">
      <Metric attrkey="cpu|capacity_usagepct_average" label="CPU" unit="%" yellow="50" orange="75" red="90" />
      <Metric attrkey="cpu|demandPct" label="CPU Demand" unit="%" yellow="50" orange="75" red="90" />
      <Metric attrkey="cpu|usagemhz_average" label="CPU Usage" unit="GHz" yellow="8" orange="16" red="20" />
      <Metric attrkey="mem|vmmemctl_average" label="Balloon Mem" unit="GB" yellow="100" orange="150" red="200" />
    </ResourceKind>
  </AdapterKind>
</AdapterKinds>
```

2. Using WinSCP or a similar product, upload this file to the following location on the vRealize Operations virtual appliance:

```
/usr/lib/vmware-vcops/tomcat-web-app/webapps/vcops-web-ent/WEB-INF/classes/resources/reskndmetrics
```

In this location, you will notice some built-in sample XML files. Alternatively, you can create the XML file from the vRealize Operations user interface: navigate to Administration | Configuration, and then Metric Configuration.

3. Now let's go back to our dashboard and edit the Scoreboard widget. Under Metric Configuration, choose the clusterexample.xml file that we just created from the drop-down list. Click Save to save the configuration.

We have now completed the new dashboard; click Save on the bottom right to save it. We can go back and edit this dashboard whenever we need to, and it is now available on the home page.

For the Scoreboard widget we have used an XML file so that the widget displays the metrics we want to see when an object is selected in another widget. How can we get the correct metric and adapter names to use in this file? Glad you asked. The simplest way is to create a non-interactive dashboard containing the widget we require, configured with all the information we want to display in our interactive one. For example, let's quickly create a temp dashboard with a single scoreboard widget and populate it by manually selecting the objects and metrics with Self Provider set to On:

1. Create another dashboard and drag and drop a single Scoreboard widget.
2. Edit the Scoreboard widget and configure it with all the information we would like.
3. Search for an object in the middle pane and select the metrics we want in the right pane.
4. Configure the box label and Measurement Unit.

A thing to note here is that we selected the memory balloon metric but gave it a label of GB. This is because of a new feature in 6.x that automatically upscales metrics when they are shown on a scoreboard; this also applies to datastore GB to TB, CPU MHz to GHz, and network throughput from KBps to MBps. Typically, in 5.x we would create super metrics to make this happen.
The downside to this is that the badge colors still have to be set in the metric's base unit.

Save this dashboard once it has the metrics we want. Locate it in the dashboard list, select it, click the little cog, and select Export Dashboards. This automatically downloads a file called Dashboard<Date>.json. Open this file in a text editor and look through it; it contains all the information we need to write our XML metric configuration file. First off are the resourceKindKey and adapterKindKey. These are pretty self-explanatory: the resource kind is the cluster resource, and the adapter is the one collecting the metrics, in this case the inbuilt vCenter adapter called VMWARE. Next come the resources, where we find each metricKey, which is the most important piece, as well as the color settings, unit, and label.
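For illustration only, a trimmed-down fragment of such an export might look like the following. The field names are the ones called out above; the real exported JSON varies by version and contains many more fields, so treat this as a sketch rather than the actual schema:

```json
{
  "resources": [
    {
      "adapterKindKey": "VMWARE",
      "resourceKindKey": "ClusterComputeResource",
      "metrics": [
        {
          "metricKey": "cpu|capacity_usagepct_average",
          "label": "CPU",
          "unit": "%",
          "yellow": 50,
          "orange": 75,
          "red": 90
        }
      ]
    }
  ]
}
```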
There it is: the adapterKindKey, resourceKindKey, metricKey, label, unit, and color values from the export map directly onto the attributes we used in clusterexample.xml earlier. Any widget with the Metric Configuration setting available can use the XML files you create, and an XML file can contain multiple AdapterKind entries, as you may need metrics from different adapters.

Today, we learned to create a dashboard that is interactive based on selections made within widgets. You also unraveled the mystery of the metric configuration XML file and how to get the information you require into it. To know more about super metrics and when to use them, check out the book Mastering vRealize Operations Manager - Second Edition.

- What to expect from vSphere 6.7
- How to ace managing the Endpoint Operations Management Agent with vROps
- Troubleshooting techniques in vRealize Operations components

Configuring and deploying HBase [Tutorial]

Natasha Mathur
02 Jul 2018
21 min read
HBase is inspired by the Google Bigtable architecture and is fundamentally a non-relational, open source, column-oriented distributed NoSQL database. Written in Java, it is designed and developed by many engineers under the framework of the Apache Software Foundation. Architecturally, it sits on Apache Hadoop and uses the Hadoop Distributed File System (HDFS) as its foundation; it is a column-oriented database empowered by that fault-tolerant distributed file system. In addition, it provides advanced features such as auto-sharding, load balancing, in-memory caching, replication, compression, near real-time lookups, and strong consistency (using multi-versioning). It uses block caches and Bloom filters to provide faster responses to online/real-time requests, and it supports multiple clients running on heterogeneous platforms by providing user-friendly APIs.

In this tutorial, we will discuss how to effectively set up a mid- to large-sized HBase cluster on top of the Hadoop/HDFS framework, and we will help you set up HBase on a fully distributed cluster. For the cluster setup, we will consider RHEL (Red Hat Enterprise Linux 6.2, 64-bit) and use six nodes. This article is an excerpt taken from the book HBase High Performance Cookbook, written by Ruchir Choudhry. The book provides a solid understanding of the HBase basics. Let's get started!

Configuring and deploying HBase

Before we start HBase in fully distributed mode, we will first set up Hadoop 2.2.0 in distributed mode, and then set up HBase on top of the Hadoop cluster, because HBase stores its data in HDFS.

Getting ready

The first step is to create a directory, /u/HBaseB, and download the tar file from the location given later. The location can be local, a mount point, or in a cloud environment; it can be block storage:

```
wget -b http://apache.mirrors.pair.com/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
```

The -b option downloads the tar file as a background process. The output is piped to wget-log; you can tail this log file using tail -200f wget-log.

Untar it using the following command:

```
tar -xzvf hadoop-2.2.0.tar.gz
```

This untars the file into a folder named hadoop-2.2.0 in your current directory.

Once the untar process is done, for clarity it's recommended to use two different folders: one for the NameNode and the other for the DataNode. I am assuming app is a user and app is a group on the Linux platform with read/write/execute access to these locations. If not, please create the app user and app group if you have sudo su - or root/admin access; otherwise, ask your administrator to create this user and group for you on all the nodes and directories you will be accessing.

To keep the NameNode data and the DataNode data clearly separated, let's create two folders inside /u/HBaseB using the following command:

```
mkdir NameNodeData DataNodeData
```

NameNodeData will hold the data used by the name nodes, and DataNodeData will hold the data used by the data nodes. Running ls -ltr will show the following results:
```
drwxrwxr-x 2 app app  4096 Jun 19 22:22 NameNodeData
drwxrwxr-x 2 app app  4096 Jun 19 22:22 DataNodeData

-bash-4.1$ pwd
/u/HBaseB/hadoop-2.2.0
-bash-4.1$ ls -ltr
total 60K
drwxr-xr-x 2 app app 4.0K Mar 31 08:49 bin
drwxrwxr-x 2 app app 4.0K Jun 19 22:22 DataNodeData
drwxr-xr-x 3 app app 4.0K Mar 31 08:49 etc
```

The steps in planning a Hadoop cluster are:

- Hardware details required for it
- Software required to do the setup
- OS required to do the setup
- Configuration steps

The HDFS core architecture is based on master/slave, where an HDFS cluster comprises a solo NameNode, which is used as the master node and owns the responsibility for orchestrating and handling the file system namespace and controlling access to files by clients. It performs this task by storing all modifications to the underlying file system and propagating these changes as logs and edits appended to the native file system files. The SecondaryNameNode is designed to merge the fsimage and the edits log files regularly, keeping the size of the edit logs within an acceptable limit. In a true cluster/distributed environment, it runs on a different machine; it works as a checkpoint in HDFS.

We will require the following for the NameNode:

| Component | Details | Used for nodes/systems |
| --- | --- | --- |
| Operating system | Red Hat 6.2 Linux x86_64 GNU/Linux, or another standard Linux kernel | All the setup for Hadoop/HBase and the other components used |
| Hardware/CPUs | 16 to 32 CPU cores | NameNode/Secondary NameNode |
| Hardware/CPUs | 2 quad-/hex-/octo-core CPUs | DataNodes |
| Hardware/RAM | 128 to 256 GB; in special cases, 128 GB to 512 GB RAM | NameNode/Secondary NameNode |
| Hardware/RAM | 128 GB to 512 GB of RAM | DataNodes |
| Hardware/storage | It's pivotal to have the NameNode server on a robust and reliable storage platform, as it is responsible for many key activities such as edit-log journaling. Because these machines are so important and the NameNode plays a central role in orchestrating everything, RAID or any robust storage device is acceptable. | NameNode/Secondary NameNode |
| Hardware/storage | 2 to 4 TB hard disks in a JBOD | DataNodes |

RAID stands for Redundant Array of Independent (or Inexpensive) Disks. There are many levels of RAID, but for a master or NameNode, RAID 1 will be enough. JBOD stands for Just a Bunch Of Disks: the design is to have multiple hard drives stacked over each other with no redundancy, so the calling software needs to take care of failure and redundancy. In essence, it works as a single logical volume.

Before we start the cluster setup, a quick recap of the Hadoop setup is essential, with brief descriptions.

How to do it

1. Let's create a directory where you will have all the software components to be downloaded. For simplicity, let's take it as /u/HBaseB.
2. Create different users for different purposes. The format will be user/group; this is essentially required to differentiate the roles used for specific purposes:
   - hdfs/hadoop is for handling the Hadoop-related setup
   - yarn/hadoop is for the YARN-related setup
   - hbase/hadoop
   - pig/hadoop
   - hive/hadoop
   - zookeeper/hadoop
   - hcat/hadoop
3. Set up directories for the Hadoop cluster. Let's assume /u is a shared mount point; we can create specific directories that will be used for specific purposes.

Please make sure that you have adequate privileges on the folders to add, edit, and execute commands. Also, you must set up passwordless communication between the different machines, for example from the name node to the data nodes and from the HBase master to all the region server nodes.
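As a quick aside (not from the book), passwordless SSH between the nodes is usually set up with an SSH key pair. A minimal sketch, assuming the app user and that hostnames such as datanode01 resolve in your environment:

```
# Run as the app user on the NameNode / HBase master
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa

# Copy the public key to every DataNode / region server host (hostnames are placeholders)
ssh-copy-id app@datanode01
ssh-copy-id app@datanode02

# Verify that no password prompt appears
ssh app@datanode01 hostname
```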
Once the earlier-mentioned structure is created, we can download the tar files. The layout should look similar to the following:

```
-bash-4.1$ ls -ltr
total 32
drwxr-xr-x  9 app app 4096 hadoop-2.2.0
drwxr-xr-x 10 app app 4096 zookeeper-3.4.6
drwxr-xr-x 15 app app 4096 pig-0.12.1
drwxrwxr-x  7 app app 4096 hbase-0.98.3-hadoop2
drwxrwxr-x  8 app app 4096 apache-hive-0.13.1-bin
drwxrwxr-x  7 app app 4096 Jun 30 01:04 mahout-distribution-0.9
```

You can download these tar files from the following locations:

```
wget https://archive.apache.org/dist/hbase/hbase-0.98.3/hbase-0.98.3-hadoop1-bin.tar.gz
wget https://www.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
wget https://archive.apache.org/dist/mahout/0.9/mahout-distribution-0.9.tar.gz
wget https://archive.apache.org/dist/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz
wget https://archive.apache.org/dist/pig/pig-0.12.1/pig-0.12.1.tar.gz
```

The following steps walk through the configuration.

Let's assume that there is a /u directory and that you have downloaded the entire stack of software there. Go to /u/HBaseB/hadoop-2.2.0/etc/hadoop/ and look for the file core-site.xml. Place the following lines in this configuration file:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://addressofbsdnsofmynamenode-hadoop:9001</value>
  </property>
</configuration>
```

You can specify any port that you want to use, as long as it does not clash with ports already in use by the system for other purposes. Save the file. This sets up the master/NameNode.

Now, let's set up the SecondaryNameNode. Edit /u/HBaseB/hadoop-2.2.0/etc/hadoop/core-site.xml on that node:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://custom-location-of-your-hdfs</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/u/HBaseB/dn001/hadoop/hdfs/secdn,/u/HBaseB/dn002/hadoop/hdfs/secdn</value>
  </property>
</configuration>
```

The separation of the directory structure gives a clean separation of the HDFS blocks and keeps the configuration as simple as possible; it also allows us to do proper maintenance.

Now, let's move on to changing the setup for HDFS; the file location will be /u/HBaseB/hadoop-2.2.0/etc/hadoop/hdfs-site.xml.
Add these properties to hdfs-site.xml.

For the NameNode:

```xml
<property>
  <name>dfs.name.dir</name>
  <value>/u/HBaseB/nn01/hadoop/hdfs/nn,/u/HBaseB/nn02/hadoop/hdfs/nn</value>
</property>
```

For the DataNode:

```xml
<property>
  <name>dfs.data.dir</name>
  <value>/u/HBaseB/dnn01/hadoop/hdfs/dn,/u/HBaseB/dnn02/hadoop/hdfs/dn</value>
</property>
```

Now, let's set the HTTP addresses for the NameNode and SecondaryNameNode:

```xml
<property>
  <name>dfs.http.address</name>
  <value>yournamenode.full.hostname:50070</value>
</property>
<property>
  <name>dfs.secondary.http.address</name>
  <value>secondary.yournamenode.full.hostname:50090</value>
</property>
```

We could also go for an HTTPS setup for the NameNode, but let's keep that optional for now.

Let's set up the YARN resource manager in /u/HBaseB/hadoop-2.2.0/etc/hadoop/yarn-site.xml.

For the resource tracker, part of the YARN resource manager:

```xml
<property>
  <name>yarn.yourresourcemanager.resourcetracker.address</name>
  <value>youryarnresourcemanager.full.hostname:8025</value>
</property>
```

For the resource scheduler, part of the YARN resource manager:

```xml
<property>
  <name>yarn.yourresourcemanager.scheduler.address</name>
  <value>yourresourcemanager.full.hostname:8030</value>
</property>
```

For the resource manager address:

```xml
<property>
  <name>yarn.yourresourcemanager.address</name>
  <value>yourresourcemanager.full.hostname:8050</value>
</property>
```

For the resource manager admin address:

```xml
<property>
  <name>yarn.yourresourcemanager.admin.address</name>
  <value>yourresourcemanager.full.hostname:8041</value>
</property>
```

To set up the local dirs:

```xml
<property>
  <name>yarn.yournodemanager.local-dirs</name>
  <value>/u/HBaseB/dnn01/hadoop/hdfs/yarn,/u/HBaseB/dnn02/hadoop/hdfs/yarn</value>
</property>
```

To set up the log location:

```xml
<property>
  <name>yarn.yournodemanager.logdirs</name>
  <value>/u/HBaseB/var/log/hadoop/yarn</value>
</property>
```

This completes the configuration changes required for YARN. Now, let's make the changes for MapReduce. Open /u/HBaseB/hadoop-2.2.0/etc/hadoop/mapred-site.xml and place the following property between the <configuration> and </configuration> tags:

```xml
<property>
  <name>mapreduce.yourjobhistory.address</name>
  <value>yourjobhistoryserver.full.hostname:10020</value>
</property>
```

Once we have configured the MapReduce job history details, we can move on to configure HBase. Go to /u/HBaseB/hbase-0.98.3-hadoop2/conf and open hbase-site.xml. You will see a template with empty <configuration> </configuration> tags; we need to add the following lines between them:

```xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hbase.yournamenode.full.hostname:8020/apps/hbase/data</value>
</property>
<property>
  <name>hbase.yourmaster.info.bindAddress</name>
  <value>$hbase.yourmaster.full.hostname</value>
</property>
```

This completes the HBase changes.

ZooKeeper: now let's focus on the setup of ZooKeeper. For a distributed environment, go to /u/HBaseB/zookeeper-3.4.6/conf and rename zoo_sample.cfg to zoo.cfg. Open zoo.cfg with vi zoo.cfg and add the server entries as follows; this declares two ZooKeeper instances on different hosts:

```
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
```

If you want to test this setup locally, please use different port combinations. In a production-like setup, server.1=zoo1:2888:3888 breaks down as server.id=host:port:port, that is:

```
server.1 = server.id
zoo1     = host
2888     = port
3888     = port
```
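For reference, a working zoo.cfg usually also carries the timing, data directory, and client port settings. The following is a minimal sketch only; the dataDir path is an assumption based on the layout used in this setup, and the hostnames are placeholders:

```
# Minimal zoo.cfg sketch
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/u/HBaseB/zookeeper-3.4.6/zooData
clientPort=2181
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
```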
Atomic broadcasting is an atomic messaging system that keeps all the servers in sync and provides reliable delivery, total order, causal order, and so on.

Region servers: before concluding, let's go through the region server setup process. Go to the folder /u/HBaseB/hbase-0.98.3-hadoop2/conf and edit the regionservers file, specifying the region servers accordingly:

```
RegionServer1
RegionServer2
RegionServer3
RegionServer4
```

RegionServer1 is the IP or fully qualified CNAME of the first region server. You can have as many region servers as you like (N=4 in our case), but each one's CNAME and mapping in the regionservers file need to be different.

Copy all the configuration files of HBase and ZooKeeper to the relevant hosts dedicated to HBase and ZooKeeper. As the setup is in fully distributed cluster mode, we will be using different hosts for HBase and its components and a dedicated host for ZooKeeper.

Next, we validate the setup by adding the required environment variables to .bashrc; this makes sure we are able to configure the NameNode as expected. It is preferable to put them in your profile, essentially /etc/profile, so that only the intended shell is affected.

Now let's format the NameNode:

```
sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/bin/hadoop namenode -format
```

HDFS is implemented on top of the existing local file system of your cluster. When you start the Hadoop setup for the first time, you need to start with a clean slate, and hence any existing data needs to be formatted and erased. Before formatting, check whether there is another Hadoop cluster running and using the same HDFS; if it is formatted accidentally, all of its data will be lost.

Once formatting is complete, start the NameNode:

```
/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
```

Now let's start the SecondaryNameNode:

```
sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start secondarynamenode
```

Repeat the same procedure on the DataNodes:

```
sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode
```

Test 01: see whether you can reach http://namenode.full.hostname:50070 from your browser.

Test 02: create a file and copy it into HDFS:

```
sudo su $HDFS_USER
touch /tmp/hello.txt
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -mkdir -p /app
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -mkdir -p /app/apphduser
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -copyFromLocal /tmp/hello.txt /app/apphduser
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -ls /app/apphduser
```

This creates a specific directory for this application user in the HDFS file system (/app/apphduser). apphduser is a directory created in HDFS for a specific user, so that data is separated by user; in a true production environment many users will be using the cluster. You can also use hdfs dfs -ls / if the hadoop command is reported as deprecated. You must see hello.txt once the last command executes.

Test 03: browse http://datanode.full.hostname:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=$datanode.full.hostname:8020 (it is important to change the datanode host name and other parameters accordingly). You should see the details of the DataNode.
Hitting the Test 03 URL shows the DataNode's browse screen in the browser; the command-line listing shows the same contents.

Validate the YARN/MapReduce setup. Execute this command from the ResourceManager host:

<login as $YARN_USER>
/u/HBaseB/hadoop-2.2.0/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

Execute the following command from the NodeManager host:

<login as $YARN_USER>
/u/HBaseB/hadoop-2.2.0/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

Executing the following commands will create the log directory in HDFS and apply the required access rights:

cd /u/HBaseB/hadoop-2.2.0/bin
hadoop fs -mkdir /app-logs              # creates the directory in HDFS
hadoop fs -chown $YARN_USER /app-logs   # changes the ownership
hadoop fs -chmod 1777 /app-logs         # sets the sticky bit so users can only remove their own files

To execute MapReduce jobs, start the job history server:

<login as $MAPRED_USER>
/u/HBaseB/hadoop-2.2.0/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

Let's run a few tests to be sure we have configured everything properly:

Test 01: From a browser, or with curl, browse http://yourresourcemanager.full.hostname:8088/.

Test 02:

sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/bin/hadoop jar /u/HBaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar teragen 100 /test/10gsort/input
/u/HBaseB/hadoop-2.2.0/bin/hadoop jar /u/HBaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar

Validate the HBase setup. Login as $HDFS_USER and create the HBase directory in HDFS:

/u/HBaseB/hadoop-2.2.0/bin/hadoop fs -mkdir -p /apps/hbase
/u/HBaseB/hadoop-2.2.0/bin/hadoop fs -chown -R app:app /apps/hbase

Now login as $HBASE_USER:

/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start master

This command starts the master node. Now let's move to the HBase region server nodes:

/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start regionserver

This command starts the region servers. For a single machine, a direct sudo ./hbase master start can also be used. Check the log files at /opt/HBaseB/hbase-0.98.5-hadoop2/logs for any errors.

Now let's log in and open the shell, which connects us to the HBase master:

sudo su - $HBASE_USER
/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase shell

Validate the ZooKeeper setup. If you want to use an external ZooKeeper (an existing ensemble that is not managed by HBase), make sure no HBase-managed internal ZooKeeper is running at the same time. For this, edit /opt/HBaseB/hbase-0.98.5-hadoop2/conf/hbase-env.sh and change the following statement from true to false (HBASE_MANAGES_ZK=false):

# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false

Once this is done we can add zoo.cfg to HBase's CLASSPATH; HBase uses zoo.cfg as its default lookup for ZooKeeper configuration:

dataDir=/opt/HBaseB/zookeeper-3.4.6/zooData   # this is the place where the ZooKeeper data will be present
server.1=172.28.182.45:2888:3888              # IP and ports for server 01
server.2=172.29.75.37:4888:5888               # IP and ports for server 02

You can edit the log4j.properties file, located at /opt/HBaseB/zookeeper-3.4.6/conf, and point it at the location where you want to keep the logs:

# Define some default values that can be overridden by system properties
zookeeper.root.logger=INFO, CONSOLE
zookeeper.console.threshold=INFO
zookeeper.log.dir=.
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
# you can specify the location here
zookeeper.tracelog.file=zookeeper_trace.log

Once this is done, start ZooKeeper with the following command:

-bash-4.1$ sudo /u/HBaseB/zookeeper-3.4.6/bin/zkServer.sh start
Starting zookeeper ... STARTED

You can also redirect the console output to a ZooKeeper log file, for example:

sudo /u/HBaseB/zookeeper-3.4.6/bin/zkServer.sh start > /u/HBaseB/zookeeper-3.4.6/zoo.out 2>&1

Here 2 refers to the second file descriptor of the process, that is stderr; > means redirect; and &1 means the target of the redirection should be the same location as the first file descriptor, that is stdout.
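Once every ensemble member reports STARTED, you can also smoke-test the quorum from code. The snippet below is not part of the recipe: it uses the third-party kazoo Python client (pip install kazoo) and assumes the default clientPort of 2181, which the zoo.cfg fragment above does not show since it only lists the quorum and leader-election ports.

# zk_smoke_test.py -- optional sketch, not part of the original recipe.
# Requires the kazoo library and assumes clientPort=2181 in zoo.cfg.
from kazoo.client import KazooClient

zk = KazooClient(hosts="172.28.182.45:2181,172.29.75.37:2181")
zk.start(timeout=10)                      # raises an exception if no ensemble member answers
print("Connected. Top-level znodes:", zk.get_children("/"))
zk.stop()

If HBase has already registered with this ensemble, an /hbase znode should typically appear in the listing.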
How it works

Sizing the environment is critical to the success of any project, and optimizing it to your needs is a complex task. We dissect the cluster into a master and a slave setup, which breaks down into the following parts:

Master - NameNode
Master - Secondary NameNode
Master - JobTracker
Master - YARN ResourceManager
Master - HBase Master
Slave - DataNode
Slave - MapReduce TaskTracker
Slave - YARN NodeManager
Slave - HBase region server

NameNode: The architecture of Hadoop gives us the capability to set up a fully fault-tolerant, highly available Hadoop/HBase cluster. Doing so requires a master and slave setup. In a fully HA setup, the NameNodes are configured in an active-passive way: one node is active at any given point in time and the other remains passive. The active node interacts with the clients and works as their coordinator, while the standby node keeps itself synchronized with the active node so that its state stays intact and live; in the case of a failover it is ready to take the load without any downtime. We also have to make sure that when the passive node takes over after a failure, it is in perfect sync with the node that was taking the traffic. This is done by JournalNodes (JNs), which use daemon threads to keep the primary and secondary in perfect sync.

JournalNode: By design, the JournalNodes only ever allow a single NameNode, the active/primary, to act as a writer at a time. If the active node fails, the passive NameNode immediately takes charge and transforms itself into the active node, which means the newly active node starts writing to the JournalNodes. This prevents the other NameNode from staying in the active state and confirms that the newly active node is working as the failover node.

JobTracker: This is an integral part of the Hadoop ecosystem. It works as a service that farms out MapReduce tasks to specific nodes in the cluster.

ResourceManager (RM): Its responsibility is limited to scheduling, that is, mediating the available resources in the system between the different needs of applications, along with membership tasks such as registering new nodes and retiring dead ones, which it does by constantly monitoring heartbeats based on its internal configuration. Because of this explicit separation of responsibilities and clear modularity, and thanks to the inbuilt, robust scheduler API, the ResourceManager can scale and support different design needs on one end and cater to different programming models on the other.

HBase Master: The Master server is the main orchestrator for all the region servers in the HBase cluster. Usually it is placed on the ZooKeeper nodes; in a real cluster configuration, you will have 5 to 6 ZooKeeper nodes.
DataNode: This is the real workhorse and does most of the heavy lifting; it runs the MapReduce jobs and stores the chunks of HDFS data. The core objective of the DataNode is to run on commodity hardware and to be agnostic to failures. It keeps part of the HDFS data, and multiple copies of the same data are sprinkled around the cluster, which makes the DataNode architecture fully fault tolerant. This is the reason a DataNode can use JBOD rather than relying on expensive RAID.

MapReduce: Jobs run on these DataNodes in parallel as subtasks, and the resulting data stays consistent across the cluster.

So we learned about the HBase basics and how to configure and set it up. We set up HBase to store data in the Hadoop Distributed File System. We also explored the working structure of RAID and JBOD and the differences between the two storage approaches.

If you found this post useful, be sure to check out the book 'HBase High Performance Cookbook' to learn more about configuring HBase in terms of administering and managing clusters, as well as other concepts in HBase.

Understanding the HBase Ecosystem
Configuring HBase
5 Mistakes Developers make when working with HBase


Administration rights for Power BI users

Pravin Dhandre
02 Jul 2018
8 min read
In this tutorial, you will understand and learn administration rights/rules for Power BI users. This includes setting and monitoring rules like; who in the organization can utilize which feature, how Power BI Premium capacity is allocated and by whom, and other settings such as embed codes and custom visuals. This article is an excerpt from a book written by Brett Powell titled Mastering Microsoft Power BI. The admin portal is accessible to Office 365 Global Administrators and users mapped to the Power BI service administrator role. To open the admin portal, log in to the Power BI service and select the Admin portal item from the Settings (Gear icon) menu in the top right, as shown in the following screenshot: All Power BI users, including Power BI free users, are able to access the Admin portal. However, users who are not admins can only view the Capacity settings page. The Power BI service administrators and Office 365 global administrators have view and edit access to the following seven pages: Administrators of Power BI most commonly utilize the Tenant settings and Capacity settings as described in the Tenant Settings and Power BI Premium Capacities sections later in this tutorial. However, the admin portal can also be used to manage any approved custom visuals for the organization. Usage metrics The Usage metrics page of the Admin portal provides admins with a Power BI dashboard of several top metrics, such as the most consumed dashboards and the most consumed dashboards by workspace. However, the dashboard cannot be modified and the tiles of the dashboard are not linked to any underlying reports or separate dashboards to support further analysis. Given these limitations, alternative monitoring solutions are recommended, such as the Office 365 audit logs and usage metric datasets specific to Power BI apps. Details of both monitoring options are included in the app usage metrics and Power BI audit log activities sections later in this chapter. Users and Audit logs The Users and Audit logs pages only provide links to the Office 365 admin center. In the admin center, Power BI users can be added, removed and managed. If audit logging is enabled for the organization via the Create audit logs for internal activity and auditing and compliance tenant setting, this audit log data can be retrieved from the Office 365 Security & Compliance Center or via PowerShell. This setting is noted in the following section regarding the Tenant settings tab of the Power BI admin portal. An Office 365 license is not required to utilize the Office 365 admin center for Power BI license assignments or to retrieve Power BI audit log activity. Tenant settings The Tenant settings page of the Admin portal allows administrators to enable or disable various features of the Power BI web service. Likewise, the administrator could allow only a certain security group to embed Power BI content in SaaS applications such as SharePoint Online. The following diagram identifies the 18 tenant settings currently available in the admin portal and the scope available to administrators for configuring each setting: From a data security perspective, the first seven settings within the Export and Sharing and Content packs and apps groups are most important. For example, many organizations choose to disable the Publish to web feature for the entire organization. Additionally, only certain security groups may be allowed to export data or to print hard copies of reports and dashboards. 
As shown in the Scope column of the previous table and the following example, granular security group configurations are available to minimize risk and manage the overall deployment. Currently, only one tenant setting is available for custom visuals and this setting (Custom visuals settings) can be enabled or disabled for the entire organization only. For organizations that wish to restrict or prohibit custom visuals for security reasons, this setting can be used to eliminate the ability to add, view, share, or interact with custom visuals. More granular controls to this setting are expected later in 2018, such as the ability to define users or security groups of users who are allowed to use custom visuals. In the following screenshot from the Tenant settings page of the Admin portal, only the users within the BI Admin security group who are not also members of the BI Team security group are allowed to publish apps to the entire organization: For example, a report author who also helps administer the On-premises data gateway via the BI Admin security group would be denied the ability to publish apps to the organization given membership in the BI Team security group. Many of the tenant setting configurations will be more simple than this example, particularly for smaller organizations or at the beginning of Power BI deployments. However, as adoption grows and the team responsible for Power BI changes, it's important that the security groups created to help administer these settings are kept up to date. Embed Codes Embed Codes are created and stored in the Power BI service when the Publish to web feature is utilized. As described in the Publish to web section of the previous chapter, this feature allows a Power BI report to be embedded in any website or shared via URL on the public internet. Users with edit rights to the workspace of the published to web content are able to manage the embed codes themselves from within the workspace. However, the admin portal provides visibility and access to embed codes across all workspaces, as shown in the following screenshot: Via the Actions commands on the far right of the Embed Codes page, a Power BI Admin can view the report in a browser (diagonal arrow) or remove the embed code. The Embed Codes page can be helpful to periodically monitor the usage of the Publish to web feature and for scenarios in which data was included in a publish to web report that shouldn't have been, and thus needs to be removed. As shown in the Power BI Tenant settings table referenced in the previous section, this feature can be enabled or disabled for the entire organization or for specific users within security groups. Organizational Custom visuals The Custom Visuals page allows admins to upload and manage custom visuals (.pbiviz files) that have been approved for use within the organization. For example, an organization may have proprietary custom visuals developed internally, which it wishes to expose to business users. Alternatively, the organization may wish to define a set of approved custom visuals, such as only the custom visuals that have been certified by Microsoft. In the following screenshot, the Chiclet Slicer custom visual is added as an organizational custom visual from the Organizational visuals page of the Power BI admin portal: The Organizational visuals page provides a link (Add a custom visual) to launch the form and identifies all uploaded visuals, as well as their last update. 
Once a visual has been uploaded, it can be deleted but not updated or modified. Therefore, when a new version of an organizational visual becomes available, this visual can be added to the list of organizational visuals with a descriptive title (Chiclet Slicer v2.0). Deleting an organizational custom visual will cause any reports that use this visual to stop rendering. The following screenshot reflects the uploaded Chiclet Slicer custom visual on the Organization visuals page: Once the custom visual has been uploaded as an organizational custom visual, it will be accessible to users in Power BI Desktop. In the following screenshot from Power BI Desktop, the user has opened the MARKETPLACE of custom visuals and selected MY ORGANIZATION: In this screenshot, rather than searching through the MARKETPLACE, the user can go directly to visuals defined by the organization. The marketplace of custom visuals can be launched via either the Visualizations pane or the From Marketplace icon on the Home tab of the ribbon. Organizational custom visuals are not supported for reports or dashboards shared with external users. Additionally, organizational custom visuals used in reports that utilize the publish to web feature will not render outside the Power BI tenant. Moreover, Organizational custom visuals are currently a preview feature. Therefore, users must enable the My organization custom visuals feature via the Preview features tab of the Options window in Power BI Desktop. With this, we got you acquainted with features and processes applicable in administering Power BI for an organization. This includes the configuration of tenant settings in the Power BI admin portal, analyzing the usage of Power BI assets, and monitoring overall user activity via the Office 365 audit logs. If you found this tutorial useful, do check out the book Mastering Microsoft Power BI to develop visually rich, immersive, and interactive Power BI reports and dashboards. Unlocking the secrets of Microsoft Power BI A tale of two tools: Tableau and Power BI Building a Microsoft Power BI Data Model

How to migrate Power BI datasets to Microsoft Analysis Services models [Tutorial]

Pravin Dhandre
29 Jun 2018
5 min read
The Azure Analysis Services web designer, supports the ability to import a data model contained within a Power BI Desktop file. The imported or migrated model can then take advantage of the resources available to the Azure Analysis Services server and can be accessed from client tools such as Power BI Desktop. Additionally, Azure Analysis Services provides a Visual Studio project file and a Model.bim file for the migrated model that a corporate BI team can use in SSDT for Visual Studio. In this tutorial, you will learn how to migrate your Power BI data to Microsoft Analysis Services for further self-service BI solutions and delivering flexibility to a huge network of stakeholders. This article is an excerpt from a book written by Brett Powell titled Mastering Microsoft Power BI. The following process migrates the model within a Power BI Desktop file to an Azure Analysis Server and downloads the Visual Studio project file for the migrated model: Open the Web designer from the Overview page of the Azure Analysis Services resource in the Azure portal On the Models form, click Add and then provide a name for the new model in the New model form Select the Power BI Desktop File source icon at the bottom and choose the file on the Import menu Click Import to begin the migration process The following screenshot represents these four steps from the Azure Analysis Services web designer: In this example, a Power BI Desktop file (AdWorks Enterprise.pbix) that contains an import mode model based on two on-premises sources (SQL Server and Excel) is imported via the Azure Analysis Services web designer. Once the import is complete, the Field list from the model will be exposed on the right and the imported model will be accessible from client tools like any other Azure Analysis Services model. For example, refreshing the Azure AS server in SQL Server Management Studio will expose the new database (AdWorks Enterprise). Likewise, the Azure Analysis Services database connection in Power BI Desktop (Get Data | Azure) can be used to connect to the migrated model, as shown in the following screenshot: Just like the SQL Server Analysis Services database connection (Get Data | Database), the only required field is the name of the server which is provided in the Azure portal. From the Overview page of the Azure Analysis Services resource, select the Open in Visual Studio project option from the context menu on the far right, as shown in the following screenshot: Save the zip file provided by Azure Analysis Services to a secure local network location. Extract the files from the zip file to expose the Analysis Services project and .bim file, as shown in the following screenshot: In Visual Studio, open a project/solution (File | Open | Project/Solution) and navigate to the downloaded project file (.smproj). Select the project file and click Open. Double-click the Model.bim file in the Solution Explorer window to expose the metadata of the migrated model. All of the objects of the data model built into the Power BI Desktop file including Data Sources, Queries, and Measures are accessible in SSDT just like standard Analysis Services projects, as shown in the following screenshot: The preceding screenshot from Diagram view in SQL Server Data Tools exposes the two on-premises sources of the imported PBIX file via the Tabular Model Explorer window. By default, the deployment server of the Analysis Services project in SSDT is set to the Azure Analysis Services server. 
As an alternative to a new solution with a single project, an existing solution with an existing Analysis Services project could be opened and the new project from the migration could be added to this solution. This can be accomplished by right-clicking the existing solution's name in the Solution Explorer window and selecting the Existing project from the Add menu (Add | Existing project). This approach allows the corporate BI developer to view and compare both models and optionally implement incremental changes, such as new columns or measures that were exclusive to the Power BI Desktop file. The following screenshot from a solution in Visual Studio includes both the migrated model (via the project file) and an existing Analysis Services model (AdWorks Import): The ability to quickly migrate Power BI datasets to Analysis Services models complements the flexibility and scale of Power BI Premium capacity in allowing organizations to manage and deploy Power BI on their terms. By now, you have successfully migrated your Power BI datasets to Analysis Services and can enjoy the complete flexibility of making further edits to your model for mining much better insights out of it. If you found this tutorial useful, do check out the book Mastering Microsoft Power BI and start producing insightful and beautiful reports from hundreds of data sources and scale across the enterprise. How to use M functions within Microsoft Power BI for querying data Building a Microsoft Power BI Data Model How to build a live interactive visual dashboard in Power BI with Azure Stream


Indexing, Replicating, and Sharding in MongoDB [Tutorial]

Amey Varangaonkar
29 Jun 2018
11 min read
MongoDB is an open source, document-oriented, and cross-platform database. It is primarily written in C++. It is also the leading NoSQL database and tied with the SQL database in the fifth position after PostgreSQL. It provides high performance, high availability, and easy scalability. MongoDB uses JSON-like documents with schema. MongoDB, developed by MongoDB Inc., is free to use. It is published under a combination of the GNU Affero General Public License and the Apache License. In this article, we look at the indexing, replication and sharding features offered by MongoDB. The following excerpt is taken from the book 'Seven NoSQL Databases in a Week' written by Aaron Ploetz et al. Introduction to MongoDB indexing Indexes allow efficient execution of MongoDB queries. If we don't have indexes, MongoDB has to scan all the documents in the collection to select those documents that match the criteria. If proper indexing is used, MongoDB can limit the scanning of documents and select documents efficiently. Indexes are a special data structure that store some field values of documents in an easy-to-traverse way. Indexes store the values of specific fields or sets of fields, ordered by the values of fields. The ordering of field values allows us to apply effective algorithms of traversing, such as the mid-search algorithm, and also supports range-based operations effectively. In addition, MongoDB can return sorted results easily. Indexes in MongoDB are the same as indexes in other database systems. MongoDB defines indexes at the collection level and supports indexes on fields and sub-fields of documents. The default _id index MongoDB creates the default _id index when creating a document. The _id index prevents users from inserting two documents with the same _id value. You cannot drop an index on an _id field. The following syntax is used to create an index in MongoDB: >db.collection.createIndex(<key and index type specification>, <options>); The preceding method creates an index only if an index with the same specification does not exist. MongoDB indexes use the B-tree data structure. The following are the different types of indexes: Single field: In addition to the _id field index, MongoDB allows the creation of an index on any single field in ascending or descending order. For a single field index, the order of the index does not matter as MongoDB can traverse indexes in any order. The following is an example of creating an index on the single field where we are creating an index on the firstName field of the user_profiles collection: The query gives acknowledgment after creating the index: This will create an ascending index on the firstName field. To create a descending index, we have to provide -1 instead of 1. Compound index: MongoDB also supports user-defined indexes on multiple fields. The order of fields defined while creating an index has a significant effect. For example, a compound index defined as {firstName:1, age:-1} will sort data by firstName first and then each firstName with age. Multikey index: MongoDB uses multi-key indexes to index the content in the array. If you index the field that contains the array values, MongoDB creates an index for each field in the object of an array. These indexes allow queries to select the document by matching the element or set of elements of the array. MongoDB automatically decides whether to create multi-key indexes or not. Text indexes: MongoDB provides text indexes that support the searching of string contents in the MongoDB collection. 
To create text indexes, we have to use the db.collection.createIndex() method, but we need to pass a text string literal in the query: You can also create text indexes on multiple fields, for example: Once the index is created, we get an acknowledgment: Compound indexes can be used with text indexes to define an ascending or descending order of the index. Hashed index: To support hash-based sharding, MongoDB supports hashed indexes. In this approach, indexes store the hash value and query, and the select operation checks the hashed indexes. Hashed indexes can support only equality-based operations. They are limited in their performance of range-based operations. Indexes have the following properties: Unique indexes: Indexes should maintain uniqueness. This makes MongoDB drop the duplicate value from indexes. Partial Indexes: Partial indexes apply the index on documents of a collection that match a specified condition. By applying an index on the subset of documents in the collection, partial indexes have a lower storage requirement as well as a reduced performance cost. Sparse index: In the sparse index, MongoDB includes only those documents in the index in which the index field is present, other documents are discarded. We can combine unique indexes with a sparse index to reject documents that have duplicate values but ignore documents that have an indexed key. TTL index: TTL indexes are a special type of indexes where MongoDB will automatically remove the document from the collection after a certain amount of time. Such indexes are ideal to remove machine-generated data, logs, and session information that we need for a finite duration. The following TTL index will automatically delete data from the log table after 3000 seconds: Once the index is created, we get an acknowledgment message: The limitations of indexes: A single collection can have up to 64 indexes only. The qualified index name is <database-name>.<collection-name>.$<index-name> and cannot have more than 128 characters. By default, the index name is a combination of index type and field name. You can specify an index name while using the createIndex() method to ensure that the fully-qualified name does not exceed the limit. There can be no more than 31 fields in the compound index. The query cannot use both text and geospatial indexes. You cannot combine the $text operator, which requires text indexes, with some other query operator required for special indexes. For example, you cannot combine the $text operator with the $near operator. Fields with 2d sphere indexes can only hold geometry data. 2d sphere indexes are specially provided for geometric data operations. For example, to perform operations on co-ordinate, we have to provide data as points on a planer co-ordinate system, [x, y]. For non-geometries, the data query operation will fail. The limitation on data: The maximum number of documents in a capped collection must be less than 2^32. We should define it by the max parameter while creating it. If you do not specify, the capped collection can have any number of documents, which will slow down the queries. The MMAPv1 storage engine will allow 16,000 data files per database, which means it provides the maximum size of 32 TB. We can set the storage.mmapv1.smallfile parameter to reduce the size of the database to 8 TB only. Replica sets can have up to 50 members. Shard keys cannot exceed 512 bytes. Replication in MongoDB A replica set is a group of MongoDB instances that store the same set of data. 
Replicas are basically used in production to ensure a high availability of data. Redundancy and data availability: because of replication, we have redundant data across the MongoDB instances. We are using replication to provide a high availability of data to the application. If one instance of MongoDB is unavailable, we can serve data from another instance. Replication also increases the read capacity of applications as reading operations can be sent to different servers and retrieve data faster. By maintaining data on different servers, we can increase the locality of data and increase the availability of data for distributed applications. We can use the replica copy for backup, reporting, as well as disaster recovery. Working with replica sets A replica set is a group of MongoDB instances that have the same dataset. A replica set has one arbiter node and multiple data-bearing nodes. In data-bearing nodes, one node is considered the primary node while the other nodes are considered the secondary nodes. All write operations happen at the primary node. Once a write occurs at the primary node, the data is replicated across the secondary nodes internally to make copies of the data available to all nodes and to avoid data inconsistency. If a primary node is not available for the operation, secondary nodes use election algorithms to select one of their nodes as a primary node. A special node, called an arbiter node, is added in the replica set. This arbiter node does not store any data. The arbiter is used to maintain a quorum in the replica set by responding to a heartbeat and election request sent by the secondary nodes in replica sets. As an arbiter does not store data, it is a cost-effective resource used in the election process. If votes in the election process are even, the arbiter adds a voice to choose a primary node. The arbiter node is always the arbiter, it will not change its behavior, unlike a primary or secondary node. The primary node can step down and work as secondary node, while secondary nodes can be elected to perform as primary nodes. Secondary nodes apply read/write operations from a primary node to secondary nodes asynchronously. Automatic failover in replication Primary nodes always communicate with other members every 10 seconds. If it fails to communicate with the others in 10 seconds, other eligible secondary nodes hold an election to choose a primary-acting node among them. The first secondary node that holds the election and receives the majority of votes is elected as a primary node. If there is an arbiter node, its vote is taken into consideration while choosing primary nodes. Read operations Basically, the read operation happens at the primary node only, but we can specify the read operation to be carried out from secondary nodes also. A read from a secondary node does not affect data at the primary node. Reading from secondary nodes can also give inconsistent data. Sharding in MongoDB Sharding is a methodology to distribute data across multiple machines. Sharding is basically used for deployment with a large dataset and high throughput operations. The single database cannot handle a database with large datasets as it requires larger storage, and bulk query operations can use most of the CPU cycles, which slows down processing. For such scenarios, we need more powerful systems. One approach is to add more capacity to a single server, such as adding more memory and processing units or adding more RAM on the single server, this is also called vertical scaling. 
Another approach is to divide a large dataset across multiple systems and serve a data application to query data from multiple servers. This approach is called horizontal scaling. MongoDB handles horizontal scaling through sharding. Sharded clusters MongoDB's sharding consists of the following components: Shard: Each shard stores a subset of sharded data. Also, each shard can be deployed as a replica set. Mongos: Mongos provide an interface between a client application and sharded cluster to route the query. Config server: The configuration server stores the metadata and configuration settings for the cluster. The MongoDB data is sharded at the collection level and distributed across sharded clusters. Shard keys: To distribute documents in collections, MongoDB partitions the collection using the shard key. MongoDB shards data into chunks. These chunks are distributed across shards in sharded clusters. Advantages of sharding Here are some of the advantages of sharding: When we use sharding, the load of the read/write operations gets distributed across sharded clusters. As sharding is used to distribute data across a shard cluster, we can increase the storage capacity by adding shards horizontally. MongoDB allows continuing the read/write operation even if one of the shards is unavailable. In the production environment, shards should deploy with a replication mechanism to maintain high availability and add fault tolerance in a system. Indexing, sharding and replication are three of the most important tasks to perform on any database, as they ensure optimal querying and database performance. In this article, we saw how MongoDB facilitates these tasks and makes them as easy as possible for the administrators to take care of. If you found the excerpt to be useful, make sure you check out our book Seven NoSQL Databases in a Week to learn more about the different database administration techniques in MongoDB, as well as the other popularly used NoSQL databases such as Redis, HBase, Neo4j, and more. Read more Top 5 programming languages for crunching Big Data effectively Top 5 NoSQL Databases Is Apache Spark today’s Hadoop?


Build an Actuator app for controlling Illumination with Raspberry Pi 3

Gebin George
28 Jun 2018
10 min read
In this article, we will look at how to build an actuator application for controlling illuminations. This article is an excerpt from the book, Mastering Internet of Things, written by Peter Waher. This book will help you design and implement IoT solutions with single board computers. Preparing our project Let's create a new Universal Windows Platform application project. This time, we'll call it Actuator. We can also use the Raspberry Pi 3, even though we will only use the relay in this project. To make the persistence of application states even easier, we'll also include the latest version of the NuGet package Waher.Runtime.Settings in the project. It uses the underlying object database defined by Waher.Persistence to persist application settings. Defining control parameters Actuators come in all sorts, types, and sizes, from the very complex to the very simple. While it would be possible to create a proprietary format that configures the actuator in a bulk operation, such a method is doomed to fail if you aim for any type of interoperable communication. Since the internet is based on interoperability as a core principle, we should consider this from the start, during the design phase. Interoperability means devices can learn to operate together, even if they are from different manufacturers. To achieve this, devices must be able to describe what they can do, in a way that each participant understands. To be able to do this, we need a way to break down (divide and conquer) a complex actuator into parts that are easily described and understood. One way is to see an actuator as a collection of control parameters. Each control parameter is a named parameter with a simple and recognizable data type. (In the same way, we can see a sensor as a collection of sensor data fields.): For our example, we will only need one control parameter: A Boolean control parameter controlling the state of our relay. We'll just call it Output, for simplicity. Understanding relays Relays, simply put, are electric switches that we can control using a small output signal. They're perfect for small controllers, like Raspberry Pi, to switch other circuits with higher voltages on and off. The simplest example is to use a relay to switch a lamp on and off. We can't light the lamp using the voltage available to us in Raspberry Pi, but we can use a relay as a switch to control the lamp. The principal part of a normal relay is a coil. When electricity runs through it, it magnetizes an iron core, which in turn moves a lever from the Normally Closed (NC) connector to the Normally Open (NO) connector. When electricity is cut, a spring returns the lever from the NO connector to the NC connector. This movement of the lever from one connector to the other causes a characteristic clicking sound. This tells you that the relay works. The lever in turn is connected to the Common Ground (COM) connector. The following figure illustrates how a simple relay is constructed. We control the flow of the current through the coil (L1) using our output SIGNAL (D1 in our case). Internally, in the relay, a resistor (R1) is placed before the base pin of the transistor (T1), to adapt the signal voltage to an appropriate level. When we connect or cut the current through the coil, it will induce a reverse current. This may be harmful for the transistor when the current is being cut. 
For that reason, a fly-back diode (D1) is added, allowing excess current to be fed back, avoiding harm to the transistor: Connecting our lamp Now that we know how a relay works, it's relatively easy to connect our lamp to it. Since we want the lamp to be illuminated when we turn the relay on (set D1to HIGH), we will use the NO and COM connectors, and let the NC connector be. If the lamp has a normal two-wire AC cable, we can insert the relay into the AC circuit by simply cutting one of the wires, inserting one end into the NO connector and the other into the COM connector, as is illustrated in the following figure: Be sure to follow appropriate safety regulations when working with electricity. Connecting an LED An alternative to working with the alternating current (AC) is to use a low-power direct current (DC) source and an LED to simulate a lamp. You can connect the COM connector to a resistor and an LED, and then to ground (GND) on one end, and the NO directly to the 5V or 3.3V source on the Raspberry Pi on the other end. The size of the resistor is determined by how much current the LED needs to light up, and the voltage source you choose. If the LED needs 20 mA, and you connect it to a 5V source, Ohms Law tells us we need an R = U/I = 5V/0.02A = 250 Ω resistor. The following figure illustrates this: Controlling the output The relay is connected to our digital output pin 9 on the Arduino board. As such, controlling it is a simple call to the digitalWrite() method on our arduino object. Since we will need to perform this control action from various locations in code in later chapters, we'll create a method for it: internal async Task SetOutput(bool On, string Actor) { if (this.arduino != null) { this.arduino.digitalWrite(9, On ? PinState.HIGH : PinState.LOW); The first parameter simply states the new value of the control parameter. We'll add a second parameter that describes who is making the requested change. This will come in handy later, when we allow online users to change control parameters. Persisting control parameter states If the device reboots for some reason, for instance after a power outage, it's normally desirable that it returns to the state it was in before it shut down. For this, we need to persist the output value. We can use the object database defined in Waher.Persistence and Waher.Persistence.Files for this. But for simple control states, we don't need to create our own data-bearing classes. That has already been done by Waher.Runtime.Settings. To use it, we first include the NuGet, as described earlier. We must also include its assembly when we initialize the runtime inventory, which is used by the object database: Types.Initialize( typeof(FilesProvider).GetTypeInfo().Assembly, typeof(App).GetTypeInfo().Assembly, typeof(RuntimeSettings).GetTypeInfo().Assembly); Depending on the build version selected when creating your UWP application, different versions of .NET Standard will be supported. Build 10586 for instance, only supports .NET Standard up to v1.4. Build 16299, however, supports .NET Standard up to v2.0. The Waher.Runtime.Inventory.Loader library, available as a NuGet package, provides the capability to loop through existing assemblies in a simple manner, but it requires support for .NET Standard 1.5. You can call its TypesLoader.Initialize() method to initialize Waher.Runtime.Inventory with all assemblies available in the runtime. It also dynamically loads all permitted assemblies available in the application folder that have not been loaded. 
Saving the current control state is then simply a matter of calling the Set() or SetAsync() methods on the static RuntimeSettings class, defined in the Waher.Runtime.Settings namespace: await RuntimeSettings.SetAsync("Actuator.Output", On); During the initialization of the device, we then call the Get() or GetAsync() methods to get the last value, if it exists. If it does not exist, a default value we define is returned: bool LastOn = await RuntimeSettings.GetAsync("Actuator.Output", false); this.arduino.digitalWrite(1, LastOn ? PinState.HIGH : PinState.LOW); Logging important control events In distributed IoT control applications, it's vitally important to make sure unauthorized access to the system is avoided. While we will dive deeper into this subject in later chapters, one important tool we can start using it to log everything of a security interest in the event log. We can decide what to do with the event log later, whether we want to analyze or store it locally or distribute it in the network for analysis somewhere else. But unless we start logging events of security interest directly when we develop, we risk forgetting logging certain events later. So, let's log an event every time the output is set: Log.Informational("Setting Control Parameter.", string.Empty, Actor ?? "Windows user", new KeyValuePair<string, object>("Output", On)); If the Actor parameter is null, we assume the control parameter has been set from the Windows GUI. We use this fact, to update the window if the change has been requested from somewhere else: if (Actor != null) await MainPage.Instance.OutputSet(On); Using Raspberry Pi GPIO pins directly The Raspberry Pi can also perform input and output without an Arduino board. But the General-Purpose Input/Output (GPIO) pins available only supports digital input and output. Since the relay module is controlled through a digital output, we can connect it directly to the Raspberry Pi, if we want. That way, we don't need the Arduino board. (We wouldn't be able to test-run the application on the local machine either, though.) Checking whether GPIO is available GPIO pins are accessed through the GpioController class defined in the Windows.Devices.Gpio namespace. First, we must check that GPIO is available on the machine. We do this by getting the default controller, and checking whether it's available: gpio = GpioController.GetDefault(); if (gpio != null) { ... } else Log.Error("Unable to get access to GPIO pin " + gpioOutputPin.ToString()); Initializing the GPIO output pin Once we have access to the controller, we can try to open exclusive access to the GPIO pin we've connected the relay to: if (gpio.TryOpenPin(gpioOutputPin, GpioSharingMode.Exclusive, out this.gpioPin, out GpioOpenStatus Status) && Status == GpioOpenStatus.PinOpened) { ... } else Log.Error("Unable to get access to GPIO pin " + gpioOutputPin.ToString()); Through the GpioPin object gpioPin, we can now control the pin. The first step is to set the operating mode for the pin. This is done by calling the SetDriveMode() method. There are many different modes a pin can be set to, not all necessarily supported by the underlying firmware and hardware. To check that a mode is supported, call the IsDriveModeSupported() method first: if (this.gpioPin.IsDriveModeSupported(GpioPinDriveMode.Output)) { This.gpioPin.SetDriveMode(GpioPinDriveMode.Output); ... 
} else Log.Error("Output mode not supported for GPIO pin " + gpioOutputPin.ToString()); There are various output modes available: Output, OutputOpenDrain, OutputOpenDrainPullUp, OutputOpenSource, and OutputOpenSourcePullDown. The code documentation for each flag describes the particulars of each option. Setting the GPIO pin output To set the actual output value, we call the Write() method on the pin object: bool LastOn = await RuntimeSettings.GetAsync("Actuator.Output", false); this.gpioPin.Write(LastOn ? GpioPinValue.High : GpioPinValue.Low); We need to make a similar change in the SetOutput() method. The Actuator project in the MIOT repository uses the Arduino use case by default. The GPIO code is also available through conditional compiling. It is activated by uncommenting the GPIO switch definition on the first row of the App.xaml.cs file. You can also perform Digital Input using principles similar to the preceding ones, with some differences. First, you select an input drive mode: Input, InputPullUp or InputPullDown. You then use the Read() method to read the current state of the pin. You can also use the ValueChanged event to get a notification whenever the input pin changes value. We saw how to create a simple actuator app for the Raspberry Pi using C#. If you found our post useful, do check out this title Mastering Internet of Things, to build complex projects using motions detectors, controllers, sensors, and Raspberry Pi 3. Should you go with Arduino Uno or Raspberry Pi 3 for your next IoT project? Build your first Raspberry Pi project Meet the Coolest Raspberry Pi Family Member: Raspberry Pi Zero W Wireless


Build and train an RNN chatbot using TensorFlow [Tutorial]

Sunith Shetty
28 Jun 2018
21 min read
Chatbots are increasingly used as a way to provide assistance to users. Many companies, including banks, mobile/landline companies and large e-sellers now use chatbots for customer assistance and for helping users in pre and post sales queries. They are a great tool for companies which don't need to provide additional customer service capacity for trivial questions: they really look like a win-win situation! In today’s tutorial, we will understand how to train an automatic chatbot that will be able to answer simple and generic questions, and how to create an endpoint over HTTP for providing the answers via an API. This article is an excerpt from a book written by Luca Massaron, Alberto Boschetti, Alexey Grigorev, Abhishek Thakur, and Rajalingappaa Shanmugamani titled TensorFlow Deep Learning Projects. There are mainly two types of chatbot: the first is a simple one, which tries to understand the topic, always providing the same answer for all questions about the same topic. For example, on a train website, the questions Where can I find the timetable of the City_A to City_B service? and What's the next train departing from City_A? will likely get the same answer, that could read Hi! The timetable on our network is available on this page: <link>. This types of chatbots use classification algorithms to understand the topic (in the example, both questions are about the timetable topic). Given the topic, they always provide the same answer. Usually, they have a list of N topics and N answers; also, if the probability of the classified topic is low (the question is too vague, or it's on a topic not included in the list), they usually ask the user to be more specific and repeat the question, eventually pointing out other ways to do the question (send an email or call the customer service number, for example). The second type of chatbots is more advanced, smarter, but also more complex. For those, the answers are built using an RNN, in the same way, that machine translation is performed. Those chatbots are able to provide more personalized answers, and they may provide a more specific reply. In fact, they don't just guess the topic, but with an RNN engine, they're able to understand more about the user's questions and provide the best possible answer: in fact, it's very unlikely you'll get the same answers with two different questions using these types of chatbots. The input corpus Unfortunately, we haven't found any consumer-oriented dataset that is open source and freely available on the Internet. Therefore, we will train the chatbot with a more generic dataset, not really focused on customer service. Specifically, we will use the Cornell Movie Dialogs Corpus, from the Cornell University. The corpus contains the collection of conversations extracted from raw movie scripts, therefore the chatbot will be able to answer more to fictional questions than real ones. The Cornell corpus contains more than 200,000 conversational exchanges between 10+ thousands of movie characters, extracted from 617 movies. The dataset is available here: https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html. We would like to thank the authors for having released the corpus: that makes experimentation, reproducibility and knowledge sharing easier. The dataset comes as a .zip archive file. After decompressing it, you'll find several files in it: README.txt contains the description of the dataset, the format of the corpora files, the details on the collection procedure and the author's contact. 
Chameleons.pdf is the original paper for which the corpus has been released. Although the goal of the paper is strictly not around chatbots, it studies the language used in dialogues, and it's a good source of information to understanding more movie_conversations.txt contains all the dialogues structure. For each conversation, it includes the ID of the two characters involved in the discussion, the ID of the movie and the list of sentences IDs (or utterances, to be more precise) in chronological order. For example, the first line of the file is: u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197'] That means that user u0 had a conversation with user u2 in the movie m0 and the conversation had 4 utterances: 'L194', 'L195', 'L196' and 'L197' movie_lines.txt contains the actual text of each utterance ID and the person who produced it. For example, the utterance L195 is listed here as: L195 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Well, I thought we'd start with pronunciation, if that's okay with you. So, the text of the utterance L195 is Well, I thought we'd start with pronunciation, if that's okay with you. And it was pronounced by the character u2 whose name is CAMERON in the movie m0. movie_titles_metadata.txt contains information about the movies, including the title, year, IMDB rating, the number of votes in IMDB and the genres. For example, the movie m0 here is described as: m0 +++$+++ 10 things i hate about you +++$+++ 1999 +++$+++ 6.90 +++$+++ 62847 +++$+++ ['comedy', 'romance'] So, the title of the movie whose ID is m0 is 10 things i hate about you, it's from 1999, it's a comedy with romance and it received almost 63 thousand votes on IMDB with an average score of 6.9 (over 10.0) movie_characters_metadata.txt contains information about the movie characters, including the name the title of the movie where he/she appears, the gender (if known) and the position in the credits (if known). For example, the character “u2” appears in this file with this description: u2 +++$+++ CAMERON +++$+++ m0 +++$+++ 10 things i hate about you +++$+++ m +++$+++ 3 The character u2 is named CAMERON, it appears in the movie m0 whose title is 10 things i hate about you, his gender is male and he's the third person appearing in the credits. raw_script_urls.txt contains the source URL where the dialogues of each movie can be retrieved. For example, for the movie m0 that's it: m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html As you will have noticed, most files use the token  +++$+++  to separate the fields. Beyond that, the format looks pretty straightforward to parse. Please take particular care while parsing the files: their format is not UTF-8 but ISO-8859-1. Creating the training dataset Let's now create the training set for the chatbot. We'd need all the conversations between the characters in the correct order: fortunately, the corpora contains more than what we actually need. For creating the dataset, we will start by downloading the zip archive, if it's not already on disk. We'll then decompress the archive in a temporary folder (if you're using Windows, that should be C:Temp), and we will read just the movie_lines.txt and the movie_conversations.txt files, the ones we really need to create a dataset of consecutive utterances. Let's now go step by step, creating multiple functions, one for each step, in the file corpora_downloader.py. The first function we need is to retrieve the file from the Internet, if not available on disk. 
def download_and_decompress(url, storage_path, storage_dir): import os.path directory = storage_path + "/" + storage_dir zip_file = directory + ".zip" a_file = directory + "/cornell movie-dialogs corpus/README.txt" if not os.path.isfile(a_file): import urllib.request import zipfile urllib.request.urlretrieve(url, zip_file) with zipfile.ZipFile(zip_file, "r") as zfh: zfh.extractall(directory) return This function does exactly that: it checks whether the “README.txt” file is available locally; if not, it downloads the file (thanks for the urlretrieve function in the urllib.request module) and it decompresses the zip (using the zipfile module). The next step is to read the conversation file and extract the list of utterance IDS. As a reminder, its format is: u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197'], therefore what we're looking for is the fourth element of the list after we split it on the token  +++$+++ . Also, we'd need to clean up the square brackets and the apostrophes to have a clean list of IDs. For doing that, we shall import the re module, and the function will look like this. import re def read_conversations(storage_path, storage_dir): filename = storage_path + "/" + storage_dir + "/cornell movie-dialogs corpus/movie_conversations.txt" with open(filename, "r", encoding="ISO-8859-1") as fh: conversations_chunks = [line.split(" +++$+++ ") for line in fh] return [re.sub('[[]']', '', el[3].strip()).split(", ") for el in conversations_chunks] As previously said, remember to read the file with the right encoding, otherwise, you'll get an error. The output of this function is a list of lists, each of them containing the sequence of utterance IDS in a conversation between characters. Next step is to read and parse the movie_lines.txt file, to extract the actual utterances texts. As a reminder, the file looks like this line: L195 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Well, I thought we'd start with pronunciation, if that's okay with you. Here, what we're looking for are the first and the last chunks. def read_lines(storage_path, storage_dir): filename = storage_path + "/" + storage_dir + "/cornell movie-dialogs corpus/movie_lines.txt" with open(filename, "r", encoding="ISO-8859-1") as fh: lines_chunks = [line.split(" +++$+++ ") for line in fh] return {line[0]: line[-1].strip() for line in lines_chunks} The very last bit is about tokenization and alignment. We'd like to have a set whose observations have two sequential utterances. In this way, we will train the chatbot, given the first utterance, to provide the next one. Hopefully, this will lead to a smart chatbot, able to reply to multiple questions. Here's the function: def get_tokenized_sequencial_sentences(list_of_lines, line_text): for line in list_of_lines: for i in range(len(line) - 1): yield (line_text[line[i]].split(" "), line_text[line[i+1]].split(" ")) Its output is a generator containing a tuple of the two utterances (the one on the right follows temporally the one on the left). Also, utterances are tokenized on the space character. Finally, we can wrap up everything into a function, which downloads the file and unzip it (if not cached), parse the conversations and the lines, and format the dataset as a generator. 
As a default, we will store the files in the /tmp directory: def retrieve_cornell_corpora(storage_path="/tmp", storage_dir="cornell_movie_dialogs_corpus"): download_and_decompress("http://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip", storage_path, storage_dir) conversations = read_conversations(storage_path, storage_dir) lines = read_lines(storage_path, storage_dir) return tuple(zip(*list(get_tokenized_sequencial_sentences(conversations, lines)))) At this point, our training set looks very similar to the training set used in the translation project. We can, therefore, use some pieces of code we've developed in the machine learning translation article. For example, the corpora_tools.py file can be used here without any change (also, it requires the data_utils.py). Given that file, we can dig more into the corpora, with a script to check the chatbot input. To inspect the corpora, we can use the corpora_tools.py, and the file we've previously created. Let's retrieve the Cornell Movie Dialog Corpus, format the corpora and print an example and its length: from corpora_tools import * from corpora_downloader import retrieve_cornell_corpora sen_l1, sen_l2 = retrieve_cornell_corpora() print("# Two consecutive sentences in a conversation") print("Q:", sen_l1[0]) print("A:", sen_l2[0]) print("# Corpora length (i.e. number of sentences)") print(len(sen_l1)) assert len(sen_l1) == len(sen_l2) This code prints an example of two tokenized consecutive utterances, and the number of examples in the dataset, that is more than 220,000: # Two consecutive sentences in a conversation Q: ['Can', 'we', 'make', 'this', 'quick?', '', 'Roxanne', 'Korrine', 'and', 'Andrew', 'Barrett', 'are', 'having', 'an', 'incredibly', 'horrendous', 'public', 'break-', 'up', 'on', 'the', 'quad.', '', 'Again.'] A: ['Well,', 'I', 'thought', "we'd", 'start', 'with', 'pronunciation,', 'if', "that's", 'okay', 'with', 'you.'] # Corpora length (i.e. number of sentences) 221616 Let's now clean the punctuation in the sentences, lowercase them and limits their size to 20 words maximum (that is examples where at least one of the sentences is longer than 20 words are discarded). This is needed to standardize the tokens: clean_sen_l1 = [clean_sentence(s) for s in sen_l1] clean_sen_l2 = [clean_sentence(s) for s in sen_l2] filt_clean_sen_l1, filt_clean_sen_l2 = filter_sentence_length(clean_sen_l1, clean_sen_l2) print("# Filtered Corpora length (i.e. number of sentences)") print(len(filt_clean_sen_l1)) assert len(filt_clean_sen_l1) == len(filt_clean_sen_l2) This leads us to almost 140,000 examples: # Filtered Corpora length (i.e. number of sentences) 140261 Then, let's create the dictionaries for the two sets of sentences. Practically, they should look the same (since the same sentence appears once on the left side, and once in the right side) except there might be some changes introduced by the first and last sentences of a conversation (they appear only once). 
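Note that clean_sentence and filter_sentence_length live in corpora_tools.py, reused from the translation project, and are not reproduced in this article. As a rough mental model only (the real implementation may differ in its regular expression and edge cases; the _sketch suffix marks these as stand-ins, not project code), they behave roughly like this:

import re

# Rough sketches, not the actual corpora_tools.py code.
def clean_sentence_sketch(tokens):
    # Lowercase the tokens and keep words and basic punctuation as separate tokens.
    return re.findall(r"[a-z0-9]+|[.,!?']", " ".join(tokens).lower())

def filter_sentence_length_sketch(sentences_l1, sentences_l2, max_len=20):
    # Keep only the pairs where both sentences fit within max_len tokens.
    kept_l1, kept_l2 = [], []
    for s1, s2 in zip(sentences_l1, sentences_l2):
        if 0 < len(s1) <= max_len and 0 < len(s2) <= max_len:
            kept_l1.append(s1)
            kept_l2.append(s2)
    return kept_l1, kept_l2

The key point is that the length filter is applied to the pair, not to each list independently, which is how the question and answer lists stay aligned.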
To make the best out of our corpora, let's build two dictionaries of words and then encode all the words in the corpora with their dictionary indexes: dict_l1 = create_indexed_dictionary(filt_clean_sen_l1, dict_size=15000, storage_path="/tmp/l1_dict.p") dict_l2 = create_indexed_dictionary(filt_clean_sen_l2, dict_size=15000, storage_path="/tmp/l2_dict.p") idx_sentences_l1 = sentences_to_indexes(filt_clean_sen_l1, dict_l1) idx_sentences_l2 = sentences_to_indexes(filt_clean_sen_l2, dict_l2) print("# Same sentences as before, with their dictionary ID") print("Q:", list(zip(filt_clean_sen_l1[0], idx_sentences_l1[0]))) print("A:", list(zip(filt_clean_sen_l2[0], idx_sentences_l2[0]))) That prints the following output. We also notice that a dictionary of 15 thousand entries doesn't contain all the words and more than 16 thousand (less popular) of them don't fit into it: [sentences_to_indexes] Did not find 16823 words [sentences_to_indexes] Did not find 16649 words # Same sentences as before, with their dictionary ID Q: [('well', 68), (',', 8), ('i', 9), ('thought', 141), ('we', 23), ("'", 5), ('d', 83), ('start', 370), ('with', 46), ('pronunciation', 3), (',', 8), ('if', 78), ('that', 18), ("'", 5), ('s', 12), ('okay', 92), ('with', 46), ('you', 7), ('.', 4)] A: [('not', 31), ('the', 10), ('hacking', 7309), ('and', 23), ('gagging', 8761), ('and', 23), ('spitting', 6354), ('part', 437), ('.', 4), ('please', 145), ('.', 4)] As the final step, let's add paddings and markings to the sentences: data_set = prepare_sentences(idx_sentences_l1, idx_sentences_l2, max_length_l1, max_length_l2) print("# Prepared minibatch with paddings and extra stuff") print("Q:", data_set[0][0]) print("A:", data_set[0][1]) print("# The sentence pass from X to Y tokens") print("Q:", len(idx_sentences_l1[0]), "->", len(data_set[0][0])) print("A:", len(idx_sentences_l2[0]), "->", len(data_set[0][1])) And that, as expected, prints: # Prepared minibatch with paddings and extra stuff Q: [0, 68, 8, 9, 141, 23, 5, 83, 370, 46, 3, 8, 78, 18, 5, 12, 92, 46, 7, 4] A: [1, 31, 10, 7309, 23, 8761, 23, 6354, 437, 4, 145, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0] # The sentence pass from X to Y tokens Q: 19 -> 20 A: 11 -> 22 Training the chatbot After we're done with the corpora, it's now time to work on the model. This project requires again a sequence to sequence model, therefore we can use an RNN. Even more, we can reuse part of the code from the previous project: we'd just need to change how the dataset is built, and the parameters of the model. We can then copy the training script, and modify the build_dataset function, to use the Cornell dataset. Mind that the dataset used in this article is bigger than the one used in the machine learning translation article, therefore you may need to limit the corpora to a few dozen thousand lines. On a 4 years old laptop with 8GB RAM, we had to select only the first 30 thousand lines, otherwise, the program ran out of memory and kept swapping. As a side effect of having fewer examples, even the dictionaries are smaller, resulting in less than 10 thousands words each. 
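The numbers in the prepared pair follow a convention worth spelling out: judging from the output, questions are left-padded with 0 up to the maximum question length, while answers are wrapped between a start marker (1) and an end marker (2) and then right-padded with 0. Assuming those special IDs (check data_utils.py for the exact constants), a toy sketch of the scheme looks like this; pad_question and pad_answer are illustrative names only:

# Toy sketch of the padding scheme visible in the printout above.
# Assumed special IDs; they match the numbers shown, but verify against data_utils.py.
PAD, GO, EOS = 0, 1, 2

def pad_question(ids, max_len):
    # Questions are left-padded with PAD up to max_len.
    return [PAD] * (max_len - len(ids)) + ids

def pad_answer(ids, max_len):
    # Answers get a GO marker, an EOS marker, then right padding up to max_len + 2.
    out = [GO] + ids + [EOS]
    return out + [PAD] * (max_len + 2 - len(out))

print(pad_question([68, 8, 9], 5))   # [0, 0, 68, 8, 9]
print(pad_answer([31, 10], 5))       # [1, 31, 10, 2, 0, 0, 0]

With the dataset in this shape, all that is left for training is a build_dataset function adapted to the new corpus, which is exactly what follows.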
def build_dataset(use_stored_dictionary=False): sen_l1, sen_l2 = retrieve_cornell_corpora() clean_sen_l1 = [clean_sentence(s) for s in sen_l1][:30000] ### OTHERWISE IT DOES NOT RUN ON MY LAPTOP clean_sen_l2 = [clean_sentence(s) for s in sen_l2][:30000] ### OTHERWISE IT DOES NOT RUN ON MY LAPTOP filt_clean_sen_l1, filt_clean_sen_l2 = filter_sentence_length(clean_sen_l1, clean_sen_l2, max_len=10) if not use_stored_dictionary: dict_l1 = create_indexed_dictionary(filt_clean_sen_l1, dict_size=10000, storage_path=path_l1_dict) dict_l2 = create_indexed_dictionary(filt_clean_sen_l2, dict_size=10000, storage_path=path_l2_dict) else: dict_l1 = pickle.load(open(path_l1_dict, "rb")) dict_l2 = pickle.load(open(path_l2_dict, "rb")) dict_l1_length = len(dict_l1) dict_l2_length = len(dict_l2) idx_sentences_l1 = sentences_to_indexes(filt_clean_sen_l1, dict_l1) idx_sentences_l2 = sentences_to_indexes(filt_clean_sen_l2, dict_l2) max_length_l1 = extract_max_length(idx_sentences_l1) max_length_l2 = extract_max_length(idx_sentences_l2) data_set = prepare_sentences(idx_sentences_l1, idx_sentences_l2, max_length_l1, max_length_l2) return (filt_clean_sen_l1, filt_clean_sen_l2), data_set, (max_length_l1, max_length_l2), (dict_l1_length, dict_l2_length) By inserting this function into the train_translator.py file and rename the file as train_chatbot.py, we can run the training of the chatbot. After a few iterations, you can stop the program and you'll see something similar to this output: [sentences_to_indexes] Did not find 0 words [sentences_to_indexes] Did not find 0 words global step 100 learning rate 1.0 step-time 7.708967611789704 perplexity 444.90090078460474 eval: perplexity 57.442316329639176 global step 200 learning rate 0.990234375 step-time 7.700247814655302 perplexity 48.8545568311572 eval: perplexity 42.190180314697045 global step 300 learning rate 0.98046875 step-time 7.69800933599472 perplexity 41.620538109894945 eval: perplexity 31.291903031786116 ... ... ... global step 2400 learning rate 0.79833984375 step-time 7.686293318271639 perplexity 3.7086356605442767 eval: perplexity 2.8348589631663046 global step 2500 learning rate 0.79052734375 step-time 7.689657487869262 perplexity 3.211876894960698 eval: perplexity 2.973809378544393 global step 2600 learning rate 0.78271484375 step-time 7.690396382808681 perplexity 2.878854805600354 eval: perplexity 2.563583924617356 Again, if you change the settings, you may end up with a different perplexity. To obtain these results, we set the RNN size to 256 and 2 layers, the batch size of 128 samples, and the learning rate to 1.0. At this point, the chatbot is ready to be tested. Although you can test the chatbot with the same code as in the test_translator.py, here we would like to do a more elaborate solution, which allows exposing the chatbot as a service with APIs. Chatbox API First of all, we need a web framework to expose the API. In this project, we've chosen Bottle, a lightweight simple framework very easy to use. To install the package, run pip install bottle from the command line. To gather further information and dig into the code, take a look at the project webpage, https://bottlepy.org. Let's now create a function to parse an arbitrary sentence provided by the user as an argument. All the following code should live in the test_chatbot_aas.py file. 
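If Bottle is new to you, a tiny smoke test, separate from test_chatbot_aas.py and unrelated to the model, shows the routing pattern that the service relies on. The /ping route, the name parameter and port 8081 are arbitrary choices made up for this example:

from bottle import route, run, request

@route('/ping')
def ping():
    # GET /ping?name=abc returns {"pong": "abc"}.
    return {"pong": request.query.name}

run(host='127.0.0.1', port=8081)

Bottle serializes a returned dict to JSON automatically, which is exactly how the chatbot endpoint below returns its reply.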
Let's start with some imports and the function to clean, tokenize and prepare the sentence using the dictionary: import pickle import sys import numpy as np import tensorflow as tf import data_utils from corpora_tools import clean_sentence, sentences_to_indexes, prepare_sentences from train_chatbot import get_seq2seq_model, path_l1_dict, path_l2_dict model_dir = "/home/abc/chat/chatbot_model" def prepare_sentence(sentence, dict_l1, max_length): sents = [sentence.split(" ")] clean_sen_l1 = [clean_sentence(s) for s in sents] idx_sentences_l1 = sentences_to_indexes(clean_sen_l1, dict_l1) data_set = prepare_sentences(idx_sentences_l1, [[]], max_length, max_length) sentences = (clean_sen_l1, [[]]) return sentences, data_set The function prepare_sentence does the following: Tokenizes the input sentence Cleans it (lowercase and punctuation cleanup) Converts tokens to dictionary IDs Add markers and paddings to reach the default length Next, we will need a function to convert the predicted sequence of numbers to an actual sentence composed of words. This is done by the function decode, which runs the prediction given the input sentence and with softmax predicts the most likely output. Finally, it returns the sentence without paddings and markers: def decode(data_set): with tf.Session() as sess: model = get_seq2seq_model(sess, True, dict_lengths, max_sentence_lengths, model_dir) model.batch_size = 1 bucket = 0 encoder_inputs, decoder_inputs, target_weights = model.get_batch( {bucket: [(data_set[0][0], [])]}, bucket) _, _, output_logits = model.step(sess, encoder_inputs, decoder_inputs, target_weights, bucket, True) outputs = [int(np.argmax(logit, axis=1)) for logit in output_logits] if data_utils.EOS_ID in outputs: outputs = outputs[1:outputs.index(data_utils.EOS_ID)] tf.reset_default_graph() return " ".join([tf.compat.as_str(inv_dict_l2[output]) for output in outputs]) Finally, the main function, that is, the function to run in the script: if __name__ == "__main__": dict_l1 = pickle.load(open(path_l1_dict, "rb")) dict_l1_length = len(dict_l1) dict_l2 = pickle.load(open(path_l2_dict, "rb")) dict_l2_length = len(dict_l2) inv_dict_l2 = {v: k for k, v in dict_l2.items()} max_lengths = 10 dict_lengths = (dict_l1_length, dict_l2_length) max_sentence_lengths = (max_lengths, max_lengths) from bottle import route, run, request @route('/api') def api(): in_sentence = request.query.sentence _, data_set = prepare_sentence(in_sentence, dict_l1, max_lengths) resp = [{"in": in_sentence, "out": decode(data_set)}] return dict(data=resp) run(host='127.0.0.1', port=8080, reloader=True, debug=True) Initially, it loads the dictionary and prepares the inverse dictionary. Then, it uses the Bottle API to create an HTTP GET endpoint (under the /api URL). The route decorator sets and enriches the function to run when the endpoint is contacted via HTTP GET. In this case, the api() function is run, which first reads the sentence passed as HTTP parameter, then calls the prepare_sentence function, described above, and finally runs the decoding step. What's returned is a dictionary containing both the input sentence provided by the user and the reply of the chatbot. Finally, the webserver is turned on, on the localhost at port 8080. Isn't very easy to have a chatbot as a service with Bottle? It's now time to run it and check the outputs. 
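As an aside, you can also sanity-check the two helpers without the web server at all. One minimal way, shown purely as a sketch, is to temporarily add the following two lines inside the __main__ block, just before the run(...) call, so that the dictionaries and length settings are already loaded; the test sentence is arbitrary:

# Temporary offline check: run one sentence through the helpers, bypassing Bottle.
_, data_set = prepare_sentence("how are you ?", dict_l1, max_lengths)
print(decode(data_set))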
To run it, run the following from the command line:

$> python3 -u test_chatbot_aas.py

Then, let's start querying the chatbot with some generic questions. To do so we can use curl, a simple command-line tool; any browser is fine too, just remember that the URL should be encoded, for example, the space character should be replaced with its encoding, that is, %20. curl makes things easier, as it has a simple way to encode the URL request. Here are a couple of examples:

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=where are you?"
{"data": [{"out": "i ' m here with you .", "in": "where are you?"}]}
$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=are you here?"
{"data": [{"out": "yes .", "in": "are you here?"}]}
$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=are you a chatbot?"
{"data": [{"out": "you ' for the stuff to be right .", "in": "are you a chatbot?"}]}
$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=what is your name ?"
{"data": [{"out": "we don ' t know .", "in": "what is your name ?"}]}
$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=how are you?"
{"data": [{"out": "that ' s okay .", "in": "how are you?"}]}

If you are not using --data-urlencode (for example, when querying from a browser), encode the URL yourself:

$> curl -X GET http://127.0.0.1:8080/api?sentence=how%20are%20you?
{"data": [{"out": "that ' s okay .", "in": "how are you?"}]}

Replies are quite funny; always remember that we trained the chatbot on movies, therefore the replies follow that style. To turn off the webserver, use Ctrl + C. To summarize, we've learned to implement a chatbot, which is able to respond to questions through an HTTP endpoint and a GET API. To know more about how to design deep learning systems for a variety of real-world scenarios using TensorFlow, do check out the book TensorFlow Deep Learning Projects. Facebook's Wit.ai: Why we need yet another chatbot development framework? How to build a chatbot with Microsoft Bot framework Top 4 chatbot development frameworks for developers

How to implement immutability functions in Kotlin [Tutorial]

Aaron Lazar
27 Jun 2018
8 min read
Unlike Clojure, Haskell, F#, and the likes, Kotlin is not a pure functional programming language, where immutability is forced; rather, we may refer to Kotlin as a perfect blend of functional programming and OOP languages. It contains the major benefits of both worlds. So, instead of forcing immutability like pure functional programming languages, Kotlin encourages immutability, giving it automatic preference wherever possible. In this article, we'll understand the various methods of implementing immutability in Kotlin. This article has been taken from the book, Functional Kotlin, by Mario Arias and Rivu Chakraborty. In other words, Kotlin has immutable variables (val), but no language mechanisms that would guarantee true deep immutability of the state. If a val variable references a mutable object, its contents can still be modified. We will have a more elaborate discussion and a deeper dive on this topic, but first let us have a look at how we can get referential immutability in Kotlin and the differences between var, val, and const val. By true deep immutability of the state, we mean a property will always return the same value whenever it is called and that the property never changes its value; we can easily avoid this if we have a val  property that has a custom getter. You can find more details at the following link: https://artemzin.com/blog/kotlin-val-does-not-mean-immutable-it-just-means-readonly-yeah/ The difference between var and val So, in order to encourage immutability but still let the developers have the choice, Kotlin introduced two types of variables. The first one is var, which is just a simple variable, just like in any imperative language. On the other hand, val brings us a bit closer to immutability; again, it doesn't guarantee immutability. So, what exactly does the val variable provide us? It enforces read-only, you cannot write into a val variable after initialization. So, if you use a val variable without a custom getter, you can achieve referential immutability. Let's have a look; the following program will not compile: fun main(args: Array<String>) { val x:String = "Kotlin" x+="Immutable"//(1) } As I mentioned earlier, the preceding program will not compile; it will give an error on comment (1). As we've declared variable x as val, x will be read-only and once we initialize x; we cannot modify it afterward. So, now you're probably asking why we cannot guarantee immutability with val ? Let's inspect this with the following example: object MutableVal { var count = 0 val myString:String = "Mutable" get() {//(1) return "$field ${++count}"//(2) } } fun main(args: Array<String>) { println("Calling 1st time ${MutableVal.myString}") println("Calling 2nd time ${MutableVal.myString}") println("Calling 3rd time ${MutableVal.myString}")//(3) } In this program, we declared myString as a val property, but implemented a custom get function, where we tweaked the value of myString before returning it. Have a look at the output first, then we will further look into the program: As you can see, the myString property, despite being val, returned different values every time we accessed it. So, now, let us look into the code to understand such behavior. On comment (1), we declared a custom getter for the val property myString. On comment (2), we pre-incremented the value of count and added it after the value of the field value, myString, and returned the same from the getter. 
So, whenever we requested the myString property, count got incremented and, on the next request, we got a different value. As a result, we broke the immutable behavior of a val property. Compile time constants So, how can we overcome this? How can we enforce immutability? The const val properties are here to help us. Just modify val myString with const val myString and you cannot implement the custom getter. While val properties are read-only variables, const val, on the other hand, are compile time constants. You cannot assign the outcome (result) of a function to const val. Let's discuss some of the differences between val and const val: The val properties are read-only variables, while const val are compile time constants The val properties can have custom getters, but const val cannot We can have val properties anywhere in our Kotlin code, inside functions, as a class member, anywhere, but const val has to be a top-level member of a class/object You cannot write delegates for the const val properties We can have the val property of any type, be it our custom class or any primitive data type, but only primitive data types and String are allowed with a const val property We cannot have nullable data types with the const val properties; as a result, we cannot have null values for the const val properties either As a result, the const val properties guarantee immutability of value but have lesser flexibility and you are bound to use only primitive data types with const val, which cannot always serve our purposes. Now, that I've used the word referential immutability quite a few times, let us now inspect what it means and how many types of immutability there are. Types of immutability There are basically the following two types of immutability: Referential immutability Immutable values Immutable reference  (referential immutability) Referential immutability enforces that, once a reference is assigned, it can't be assigned to something else. Think of having it as a val property of a custom class, or even MutableList or MutableMap; after you initialize the property, you cannot reference something else from that property, except the underlying value from the object. For example, take the following program: class MutableObj { var value = "" override fun toString(): String { return "MutableObj(value='$value')" } } fun main(args: Array<String>) { val mutableObj:MutableObj = MutableObj()//(1) println("MutableObj $mutableObj") mutableObj.value = "Changed"//(2) println("MutableObj $mutableObj") val list = mutableListOf("a","b","c","d","e")//(3) println(list) list.add("f")//(4) println(list) } Have a look at the output before we proceed with explaining the program: So, in this program we've two val properties—list and mutableObj. We initialized mutableObj with the default constructor of MutableObj, since it's a val property it'll always refer to that specific object; but, if you concentrate on comment (2), we changed the value property of mutableObj, as the value property of the MutableObj class is mutable (var). It's the same with the list property, we can add items to the list after initialization, changing its underlying value. Both list and mutableObj are perfect examples of immutable reference; once initialized, the properties can't be assigned to something else, but their underlying values can be changed (you can refer the output). The reason behind that is the data type we used to assign to those properties. 
Both the MutableObj class and the MutableList<String> data structures are mutable themselves, so we cannot restrict value changes for their instances. Immutable values The immutable values, on the other hand, enforce no change on values as well; it is really complex to maintain. In Kotlin, the const val properties enforce immutability of value, but they lack flexibility (we already discussed them) and you're bound to use only primitive types, which can be troublesome in real-life scenarios. Immutable collections Kotlin gives preference to immutability wherever possible, but leaves the choice to the developer whether or when to use it. This power of choice makes the language even more powerful. Unlike most languages, where they have either only mutable (like Java, C#, and so on) or only immutable collections (like F#, Haskell, Clojure, and so on), Kotlin has both and distinguishes between them, leaving the developer with the freedom to choose whether to use an immutable or mutable one. Kotlin has two interfaces for collection objects—Collection<out E> and MutableCollection<out E>; all the collection classes (for example, List, Set, or Map) implement either of them. As the name suggests, the two interfaces are designed to serve immutable and mutable collections respectively. Let us have an example: fun main(args: Array<String>) { val immutableList = listOf(1,2,3,4,5,6,7)//(1) println("Immutable List $immutableList") val mutableList:MutableList<Int> = immutableList.toMutableList()//(2) println("Mutable List $mutableList") mutableList.add(8)//(3) println("Mutable List after add $mutableList") println("Mutable List after add $immutableList") } The output is as follows: So, in this program, we created an immutable list with the help of the listOf method of Kotlin, on comment (1). The listOf method creates an immutable list with the elements (varargs) passed to it. This method also has a generic type parameter, which can be skipped if the elements array is not empty. The listOf method also has a mutable version—mutableListOf() which is identical except that it returns MutableList instead. We can convert an immutable list to a mutable one with the help of the toMutableList() extension function, we did the same in comment (2), to add an element to it on comment (3). However, if you check the output, the original Immutable List remains the same without any changes, the item is, however, added to the newly created MutableList instead. So now you know how to implement immutability in Kotlin. If you found this tutorial helpful, and would like to learn more, head on over to purchase the full book, Functional Kotlin, by Mario Arias and Rivu Chakraborty. Extension functions in Kotlin: everything you need to know Building RESTful web services with Kotlin Building chat application with Kotlin using Node.js, the powerful Server-side JavaScript platform

Build an IoT application with Google Cloud [Tutorial]

Gebin George
27 Jun 2018
19 min read
In this tutorial, we will build a sample internet of things application using Google Cloud IoT. We will start off by implementing the end-to-end solution, where we take the data from the DHT11 sensor and post it to the Google IoT Core state topic. This article is an excerpt from the book, Enterprise Internet of Things Handbook, written by Arvind Ravulavaru. End-to-end communication To get started with Google IoT Core, we need to have a Google account. If you do not have a Google account, you can create one by navigating to this URL: https://accounts.google.com/SignUp?hl=en. Once you have created your account, you can login and navigate to Google Cloud Console: https://console.cloud.google.com. Setting up a project The first thing we are going to do is create a project. If you have already worked with Google Cloud Platform and have at least one project, you will be taken to the first project in the list or you will be taken to the Getting started page. As of the time of writing this book, Google Cloud Platform has a free trial for 12 months with $300 if the offer is still available when you are reading this chapter, I would highly recommend signing up: Once you have signed up, let's get started by creating a new project. From the top menu bar, select the Select a Project dropdown and click on the plus icon to create a new project. You can fill in the details as illustrated in the following screenshot: Click on the Create button. Once the project is created, navigate to the Project and you should land on the Home page. Enabling APIs Following are the steps to be followed for enabling APIs: From the menu on the left-hand side, select APIs & Services | Library as shown in the following screenshot: On the following screen, search for pubsub and select the Pub/Sub API from the results and we should land on a page similar to the following: Click on the ENABLE button and we should now be able to use these APIs in our project. Next, we need to enable the real-time API; search for realtime and we should find something similar to the following: Click on the ENABLE & button. Enabling device registry and devices The following steps should be used for enabling device registry and devices: From the left-hand side menu, select IoT Core and we should land on the IoT Core home page: Instead of the previous screen, if you see a screen to enable APIs, please enable the required APIs from here. Click on the & Create device registry button. On the Create device registry screen, fill the details as shown in the following table: Field Value Registry ID Pi3-DHT11-Nodes Cloud region us-central1 Protocol MQTT HTTP Default telemetry topic device-events Default state topic dht11 After completing all the details, our form should look like the following: We will add the required certificates later on. Click on the Create button and a new device registry will be created. From the Pi3-DHT11-Nodes registry page, click on the Add device button and set the Device ID as Pi3-DHT11-Node or any other suitable name. Leave everything as the defaults and make sure the Device communication is set to Allowed and create a new device. On the device page, we should see a warning as highlighted in the following screenshot: Now, we are going to add a new public key. To generate a public/private key pair, we need to have OpenSSL command line available. You can download and set up OpenSSL from here: https://www.openssl.org/source/. 
Use the following command to generate a certificate pair at the default location on your machine: openssl req -x509 -newkey rsa:2048 -keyout rsa_private.pem -nodes -out rsa_cert.pem -subj "/CN=unused" If everything goes well, you should see an output as shown here: Do not share these certificates anywhere; anyone with these certificates can connect to Google IoT Core as a device and start publishing data. Now, once the certificates are created, we will attach them to the device we have created in IoT Core. Head back to the device page of the Google IoT Core service and under Authentication click on Add public key. On the following screen, fill it in as illustrated: The public key value is the contents of rsa_cert.pem that we generated earlier. Click on the ADD button. Now that the public key has been successfully added, we can connect to the cloud using the private key. Setting up Raspberry Pi 3 with DHT11 node Now that we have our device set up in Google IoT Core, we are going to complete the remaining operation on Raspberry Pi 3 to send data. Pre-requisites The requirements for setting up Raspberry Pi 3 on a DHT11 node are: One Raspberry Pi 3: https://www.amazon.com/Raspberry-Pi-Desktop-Starter-White/dp/B01CI58722 One breadboard: https://www.amazon.com/Solderless-Breadboard-Circuit-Circboard-Prototyping/dp/B01DDI54II/ One DHT11 sensor: https://www.amazon.com/HiLetgo-Temperature-Humidity-Arduino-Raspberry/dp/B01DKC2GQ0 Three male-to-female jumper cables: https://www.amazon.com/RGBZONE-120pcs-Multicolored-Dupont-Breadboard/dp/B01M1IEUAF/ If you are new to the world of Raspberry Pi GPIO's interfacing, take a look at this Raspberry Pi GPIO Tutorial: The Basics Explained on YouTube: https://www.youtube.com/watch?v=6PuK9fh3aL8. The following steps are to be used for the setup process: Connect the DHT11 sensor to Raspberry Pi 3 as shown in the following diagram: Next, power up Raspberry Pi 3 and log in to it. On the desktop, create a new folder named Google-IoT-Device. Open a new Terminal and cd into this folder. Setting up Node.js Refer to the following steps to install Node.js: Open a new Terminal and run the following commands: $ sudo apt update $ sudo apt full-upgrade This will upgrade all the packages that need upgrades. Next, we will install the latest version of Node.js. We will be using the Node 7.x version: $ curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash - $ sudo apt install nodejs This will take a moment to install, and once your installation is done, you should be able to run the following commands to see the version of Node.js and npm: $ node -v $ npm -v Developing the Node.js device app Now, we will set up the app and write the required code: From the Terminal, once you are inside the Google-IoT-Device folder, run the following command: $ npm init -y Next, we will install jsonwebtoken (https://www.npmjs.com/package/jsonwebtoken) and mqtt (https://www.npmjs.com/package/mqtt) from npm. Execute the following command: $ npm install jsonwebtoken mqtt--save Next, we will install rpi-dht-sensor (https://www.npmjs.com/package/rpi-dht-sensor) from npm. 
This module will help in reading the DHT11 temperature and humidity values: $ npm install rpi-dht-sensor --save Your final package.json file should look similar to the following code snippet: { "name": "Google-IoT-Device", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo "Error: no test specified" && exit 1" }, "keywords": [], "author": "", "license": "ISC", "dependencies": { "jsonwebtoken": "^8.1.1", "mqtt": "^2.15.3", "rpi-dht-sensor": "^0.1.1" } } Now that we have the required dependencies installed, let's continue. Create a new file named index.js at the root of the Google-IoT-Device folder. Next, create a folder named certs at the root of the Google-IoT-Device folder and move the two certificates we created using OpenSSL there. Your final folder structure should look something like this: Open index.js in any text editor and update it as shown here: var fs = require('fs'); var jwt = require('jsonwebtoken'); var mqtt = require('mqtt'); var rpiDhtSensor = require('rpi-dht-sensor'); var dht = new rpiDhtSensor.DHT11(2); // `2` => GPIO2 var projectId = 'pi-iot-project'; var cloudRegion = 'us-central1'; var registryId = 'Pi3-DHT11-Nodes'; var deviceId = 'Pi3-DHT11-Node'; var mqttHost = 'mqtt.googleapis.com'; var mqttPort = 8883; var privateKeyFile = '../certs/rsa_private.pem'; var algorithm = 'RS256'; var messageType = 'state'; // or event var mqttClientId = 'projects/' + projectId + '/locations/' + cloudRegion + '/registries/' + registryId + '/devices/' + deviceId; var mqttTopic = '/devices/' + deviceId + '/' + messageType; var connectionArgs = { host: mqttHost, port: mqttPort, clientId: mqttClientId, username: 'unused', password: createJwt(projectId, privateKeyFile, algorithm), protocol: 'mqtts', secureProtocol: 'TLSv1_2_method' }; console.log('connecting...'); var client = mqtt.connect(connectionArgs); // Subscribe to the /devices/{device-id}/config topic to receive config updates. client.subscribe('/devices/' + deviceId + '/config'); client.on('connect', function(success) { if (success) { console.log('Client connected...'); sendData(); } else { console.log('Client not connected...'); } }); client.on('close', function() { console.log('close'); }); client.on('error', function(err) { console.log('error', err); }); client.on('message', function(topic, message, packet) { console.log(topic, 'message received: ', Buffer.from(message, 'base64').toString('ascii')); }); function createJwt(projectId, privateKeyFile, algorithm) { var token = { 'iat': parseInt(Date.now() / 1000), 'exp': parseInt(Date.now() / 1000) + 86400 * 60, // 1 day 'aud': projectId }; var privateKey = fs.readFileSync(privateKeyFile); return jwt.sign(token, privateKey, { algorithm: algorithm }); } function fetchData() { var readout = dht.read(); var temp = readout.temperature.toFixed(2); var humd = readout.humidity.toFixed(2); return { 'temp': temp, 'humd': humd, 'time': new Date().toISOString().slice(0, 19).replace('T', ' ') // https://stackoverflow.com/a/11150727/1015046 }; } function sendData() { var payload = fetchData(); payload = JSON.stringify(payload); console.log(mqttTopic, ': Publishing message:', payload); client.publish(mqttTopic, payload, { qos: 1 }); console.log('Transmitting in 30 seconds'); setTimeout(sendData, 30000); } In the previous code, we first define the projectId, cloudRegion, registryId, and deviceId based on what we have created. Next, we build the connectionArgs object, using which we are going to connect to Google IoT Core using MQTT-SN. 
Do note that the password property is a JSON Web Token (JWT), based on the projectId and privateKeyFile algorithm. The token that is created by this function is valid only for one day. After one day, the cloud will refuse connection to this device if the same token is used. The username value is the Common Name (CN) of the certificate we have created, which is unused. Using mqtt.connect(), we are going to connect to the Google IoT Core. And we are subscribing to the device config topic, which can be used to send device configurations when connected. Once the connection is successful, we callsendData() every 30 seconds to send data to the state topic. Save the previous file and run the following command: $ sudo node index.js And we should see something like this: As you can see from the previous Terminal logs, the device first gets connected then starts transmitting the temperature and humidity along with time. We are sending time as well, so we can save it in the BigQuery table and then build a time series chart quite easily. Now, if we head back to the Device page of Google IoT Core and navigate to the Configuration & state history tab, we should see the data that we are sending to the state topic here: Now that the device is sending data, let's actually read the data from another client. Reading the data from the device For this, you can either use the same Raspberry Pi 3 or another computer. I am going to use MacBook as a client that is interested in the data sent by the Thing. Setting up credentials Before we start reading data from Google IoT Core, we have to set up our computer (for example, MacBook) as a trusted device, so our computer can request data. Let's perform the following steps to set the credentials: To do this, we need to create a new Service account key. From the left-hand-side menu of the Google Cloud Console, select APIs & Services | Credentials. Then click on the Create credentials dropdown and select Service account key as shown in the following screenshot: Now, fill in the details as shown in the following screenshot: We have given access to the entire project for this client and as an Owner. Do not select these settings if this is a production application. Click on Create and you will be asked to download and save the file. Do not share this file; this file is as good as giving someone owner-level permissions to all assets of this project. Once the file is downloaded somewhere safe, create an environment variable with the name GOOGLE_APPLICATION_CREDENTIALS and point it to the path of the downloaded file. You can refer to Getting Started with Authentication at https://cloud.google.com/docs/authentication/getting-started if you are facing any difficulties. Setting up subscriptions The data from the device is being sent to Google IoT Core using the state topic. If you recall, we have named that topic dht11. Now, we are going to create a subscription for this topic: From the menu on the left side, select Pub/Sub | Topics. Now, click on New subscription for the dht11 topic, as shown in the following screenshot: Create a new subscription by setting up the options selected in this screenshot: We are going to use the subscription named dht11-data to get the data from the state topic. Setting up the client Now that we have provided the required credentials as well as subscribed to a Pub/Sub topic, we will set up the Pub/Sub client. Follow these steps: Create a folder named test_client inside the test_client directory. 
Now, run the following command: $ npm init -y Next, install the @google-cloud/pubsub (https://www.npmjs.com/package/@google-cloud/pubsub) module with the help of the following command: $ npm install @google-cloud/pubsub --save Create a file inside the test_client folder named index.js and update it as shown in this code snippet: var PubSub = require('@google-cloud/pubsub'); var projectId = 'pi-iot-project'; var stateSubscriber = 'dht11-data' // Instantiates a client var pubsub = new PubSub({ projectId: projectId, }); var subscription = pubsub.subscription('projects/' + projectId + '/subscriptions/' + stateSubscriber); var messageHandler = function(message) { console.log('Message Begin >>>>>>>>'); console.log('message.connectionId', message.connectionId); console.log('message.attributes', message.attributes); console.log('message.data', Buffer.from(message.data, 'base64').toString('ascii')); console.log('Message End >>>>>>>>>>'); // "Ack" (acknowledge receipt of) the message message.ack(); }; // Listen for new messages subscription.on('message', messageHandler); Update the projectId and stateSubscriber in the previous code. Now, save the file and run the following command: $ node index.js We should see the following output in the console: This way, any client that is interested in the data of this device can use this approach to get the latest data. With this, we conclude the section on posting data to Google IoT Core and fetching the data. In the next section, we are going to work on building a dashboard. Building a dashboard Now that we have seen how a client can read the data from our device on demand, we will move on to building a dashboard, where we display data in real time. For this, we are going to use Google Cloud Functions, Google BigQuery, and Google Data Studio. Google Cloud Functions Cloud Functions are solution for serverless services. Cloud Functions is a lightweight solution for creating standalone and single-purpose functions that respond to cloud events. You can read more about Google Cloud Functions at https://cloud.google.com/functions/. Google BigQuery Google BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google's infrastructure. You can read more about Google BigQuery at https://cloud.google.com/bigquery/. Google Data Studio Google Data Studio helps to build dashboards and reports using various data connectors, such as BigQuery or Google Analytics. You can read more about Google Data Studio at https://cloud.google.com/data-studio/. As of April 2018, these three services are still in beta. As we have already seen in the Architecture section, once the data is published on the state topic, we are going to create a cloud function that will get triggered by the data event on the Pub/Sub client. And inside our cloud function, we are going to get a copy of the published data and then insert it into the BigQuery dataset. Once the data is inserted, we are going to use Google Data Studio to create a new report by linking the BigQuery dataset to the input. So, let's get started. Setting up BigQuery The first thing we are going to do is set up BigQuery: From the side menu of the Google Cloud Platform Console, our project page, click on the BigQuery URL and we should be taken to the Google BigQuery home page. 
Select Create new dataset, as shown in the following screenshot: Create a new dataset with the values illustrated in the following screenshot: Once the dataset is created, click on the plus sign next to the dataset and create an empty table. We are going to name the table dht11_data and we are going have three fields in it, as shown here: Click on the Create Table button to create the table. Now that we have our table ready, we will write a cloud function to insert the incoming data from Pub/Sub into this table. Setting up Google Cloud Function Now, we are going to set up a cloud function that will be triggered by the incoming data: From the Google Cloud Console's left-hand-side menu, select Cloud Functions under Compute. Once you land on the Google Cloud Functions homepage, you will be asked to enable the cloud functions API. Click on Enable API: Once the API is enabled, we will be on the Create function page. Fill in the form as shown here: The Trigger is set to Cloud Pub/Sub topic and we have selected dht11 as the Topic. Under the Source code section; make sure you are in the index.js tab and update it as shown here: var BigQuery = require('@google-cloud/bigquery'); var projectId = 'pi-iot-project'; var bigquery = new BigQuery({ projectId: projectId, }); var datasetName = 'pi3_dht11_dataset'; var tableName = 'dht11_data'; exports.pubsubToBQ = function(event, callback) { var msg = event.data; var data = JSON.parse(Buffer.from(msg.data, 'base64').toString()); // console.log(data); bigquery .dataset(datasetName) .table(tableName) .insert(data) .then(function() { console.log('Inserted rows'); callback(); // task done }) .catch(function(err) { if (err && err.name === 'PartialFailureError') { if (err.errors && err.errors.length > 0) { console.log('Insert errors:'); err.errors.forEach(function(err) { console.error(err); }); } } else { console.error('ERROR:', err); } callback(); // task done }); }; In the previous code, we were using the BigQuery Node.js module to insert data into our BigQuery table. Update projectId, datasetName, and tableName as applicable in the code. Next, click on the package.json tab and update it as shown: { "name": "cloud_function", "version": "0.0.1", "dependencies": { "@google-cloud/bigquery": "^1.0.0" } } Finally, for the Function to execute field, enter pubsubToBQ. pubsubToBQ is the name of the function that has our logic and this function will be called when the data event occurs. Click on the Create button and our function should be deployed in a minute. Running the device Now that the entire setup is done, we will start pumping data into BigQuery: Head back to Raspberry Pi 3 which was sending the DHT11 temperature and humidity data, and run the application. We should see the data being published to the state topic: Now, if we head back to the Cloud Functions page, we should see the requests coming into the cloud function: You can click on VIEW LOGS to view the logs of each function execution: Now, head over to our table in BigQuery and click on the RUN QUERY button; run the query as shown in the following screenshot: Now, all the data that was generated by the DHT11 sensor is timestamped and stored in BigQuery. You can use the Save to Google Sheets button to save this data to Google Sheets and analyze the data there or plot graphs, as shown here: Or we can go one step ahead and use the Google Data Studio to do the same. 
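The pipeline in this article is Node.js end to end, but once the rows land in BigQuery they can be read from any language with a BigQuery client. Purely as an aside, a rough Python sketch would look like the following; it assumes the google-cloud-bigquery package is installed (it is not used anywhere else in this project), that the GOOGLE_APPLICATION_CREDENTIALS variable set earlier is available, and it reuses the project, dataset, table, and field names created above:

from google.cloud import bigquery

# Assumes: pip install google-cloud-bigquery and GOOGLE_APPLICATION_CREDENTIALS set.
client = bigquery.Client(project="pi-iot-project")
query = """
    SELECT time, temp, humd
    FROM `pi-iot-project.pi3_dht11_dataset.dht11_data`
    ORDER BY time DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.time, row.temp, row.humd)

That aside, let's now build the dashboard itself.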
Google Data Studio reports Now that the data is ready in BigQuery, we are going to set up Google Data Studio and then connect both of them, so we can access the data from BigQuery in Google Data Studio: Navigate to https://datastudio.google.com and log in with your Google account. Once you are on the Home page of Google Data Studio, click on the Blank report template. Make sure you read and agree to the terms and conditions before proceeding. Name the report PI3 DHT11 Sensor Data. Using the Create new data source button, we will create a new data source. Click on Create new data source and we should land on a page where we need to create a new Data Source. From the list of Connectors, select BigQuery; you will be asked to authorize Data Studio to interface with BigQuery, as shown in the following screenshot: Once we authorized, we will be shown our projects and related datasets and tables: Select the dht11_data table and click on Connect. This fetches the metadata of the table as shown here: Set the Aggregation for the temp and humd fields to Max and set the Type for time as Date & Time. Pick Minute (mm) from the sub-list. Click on Add to report and you will be asked to authorize Google Data Studio to read data from the table. Once the data source has been successfully linked, we will create a new time series chart. From the menu, select Insert | Time Series link. Update the data configuration of the chart as shown in the following screenshot: You can play with the styles as per your preference and we should see something similar to the following screenshot: This report can then be shared with any user. With this, we have seen the basic features and implementation process needed to work with Google Cloud IoT Core as well other features of the platform. If you found this post useful, do check out the book,  Enterprise Internet of Things Handbook, to build state of the art IoT applications best-fit for Enterprises. Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT Five Most Surprising Applications of IoT How IoT is going to change tech teams

Create a data model in Splunk to enable interactive reports and dashboards

Pravin Dhandre
26 Jun 2018
8 min read
Data models enable you to create Splunk reports and dashboards without having to develop Splunk search. Typically, data models are designed by those that understand the specifics around the format, the semantics of certain data, and the manner in which users may expect to work with that data. In building a typical data model, knowledge managers use knowledge object types (such as lookups, transactions, search-time field extractions, and calculated fields). Today we are going to learn how to create a Splunk data model and how to describe that model with various fields and lookup attributes. This article is an excerpt from a book written by James D. Miller titled Implementing Splunk 7 - Third Edition.  Creating a data model So now that we have a general idea of what a Splunk data model is, let's go ahead and create one. Before we can get started, we need to verify that our user ID is set up with the proper access required to create a data model. By default, only users with an admin or power role can create data models. For other users, the ability to create a data model depends on whether their roles have write access to an app. To begin (once you have verified that you do have access to create a data model), you can click on Settings and then on Data models (under KNOWLEDGE): This takes you to the Data Models (management) page, shown in the next screenshot. This is where a list of data models is displayed. From here, you can manage permissions, acceleration, cloning, and removal of existing data models. You can also use this page to upload a data model or create new data models, using the Upload Data Model and New Data Model buttons on the upper-right corner, respectively. Since this is a new data model, you can click on the button labeled New Data Model. This will open the New Data Model dialog box (shown in the following image). We can fill in the required information in this dialog box: Filling in the new data model dialog You have four fields to fill in order to describe your new Splunk data model (Title, ID, App, and Description): Title: Here you must enter a Title for your data model. This field accepts any character, as well as spaces. The value you enter here is what will appear on the data model listing page. ID: This is an optional field. It gets prepopulated with what you entered for your data model title (with any spaces replaced with underscores. Take a minute to make sure you have a good one, since once you enter the data model ID, you can't change it. App: Here you select (from a drop-down list) the Splunk app that your data model will serve. Description: The description is also an optional field, but I recommend adding something descriptive to later identify your data model. Once you have filled in these fields, you can click on the button labeled Create. This opens the data model (in our example, Aviation Games) in the Splunk Edit Objects page as shown in the following screenshot: The next step in defining a data model is to add the first object. As we have already stated, data models are typically composed of object hierarchies built on root event objects. Each root event object represents a set of data that is defined by a constraint, which is a simple search that filters out events that are not relevant to the object. Getting back to our example, let's create an object for our data model to track purchase requests on our Aviation Games website. 
To define our first event-based object, click on Add Dataset (as shown in the following screenshot): Our data model's first object can either be a Root Event, or Root Search. We're going to add a Root Event, so select Root Event. This will take you to the Add Event Dataset editor: Our example event will expose events that contain the phrase error, which represents processing errors that have occurred within our data source. So, for Dataset Name, we will enter Processing Errors. The Dataset ID will automatically populate when you type in the Dataset Name (you can edit it if you want to change it). For our object's constraint, we'll enter sourcetype=tm1* error. This constraint defines the events that will be reported on (all events that contain the phrase error that are indexed in the data sources starting with tml). After providing Constraints for the event-based object, you can click on Preview to test whether the constraints you've supplied return the kind of events that you want. The following screenshot depicts the preview of the constraints given in this example: After reviewing the output, click on Save. The list of attributes for our root object is displayed: host, source, sourcetype, and _time. If you want to add child objects to client and server errors, you need to edit the attributes list to include additional attributes: Editing fields (attributes) Let's add an auto-extracted attribute, as mentioned earlier in this chapter, to our data model. Remember, auto-extracted attributes are derived by Splunk at search time. To start, click on Add Field: Next, select Auto-Extracted. The Add Auto-Extracted Field window opens: You can scroll through the list of automatically extracted fields and check the fields that you want to include. Since my data model example deals with errors that occurred, I've selected date_mday, date_month, and date_year. Notice that to the right of the field list, you have the opportunity to rename and type set each of the fields that you selected. Rename is self-explanatory, but for Type, Splunk allows you to select String, Number, Boolean, or IPV$ and indicate if the attribute is Required, Optional, Hidden, or Hidden &amp; Required. Optional means that the attribute doesn't have to appear in every event represented by the object. The attribute may appear in some of the object events and not others. Once you have reviewed your selected field types, click on Save: Lookup attributes Let's discuss lookup attributes now. Splunk can use the existing lookup definitions to match the values of an attribute that you select to values of a field in the specified lookup table. It then returns the corresponding field/value combinations and applies them to your object as (lookup) attributes. Once again, if you click on Add Field and select Lookup, Splunk opens the Add Fields with a Lookup page (shown in the following screenshot) where you can select from your currently defined lookup definitions. For this example, we select dnslookup: The dnslookup converts clienthost to clientip. We can configure a lookup attribute using this lookup to add that result to the processing errors objects. Under Input, select clienthost for Field in Lookup and Field in Dataset. Field in Lookup is the field to be used in the lookup table. Field in Dataset is the name of the field used in the event data. In our simple example, Splunk will match the field clienthost with the field host: Under Output, I have selected host as the output field to be matched with the lookup. 
You can provide a Display Name for the selected field. This display name is the name used for the field in your events. I simply typed AviationLookupName for my display name (see the following screenshot): Again, Splunk allows you to click on Preview to review the fields that you want to add. You can use the tabs to view the Events in a table, or view the values of each of the fields that you selected in Output. For example, the following screenshot shows the values of AviationLookupName: Finally, we can click on Save: Add Child object to our model We have just added a root (or parent) object to our data model. The next step is to add some children. Although a child object inherits all the constraints and attributes from its parent, when you create a child, you will give it additional constraints with the intention of further filtering the dataset that the object represents. To add a child object to our data model, click on Add Field and select Child: Splunk then opens the editor window, Add Child Dataset (shown in the following screenshot): On this page, follow these steps: Enter the Object Name: Dimensional Errors. Leave the Object ID as it is: Dimensional_Errors. Under Inherit From, select Processing Errors. This means that this child object will inherit all the attributes from the parent object, Processing Errors. Add the Additional Constraints, dimension, which means that the data models search for the events in this object; when expanded, it will look something like sourcetype=tm1* error dimension. Finally, click on Save to save your changes: Following the previously outlined steps, you can add more objects, each continuing to filter the results until you have the results that you need. With this we learned to create data models, and manage permissions, cloning and accelerating operational data models with ease. If you found this tutorial useful, do check out the book Implementing Splunk 7 - Third Edition and start transforming machine-generated data into valuable and actionable business insights. How to use R to boost your Data Model Building a Microsoft Power BI Data Model Splunk’s Input Methods and Data Feeds

How to set up the Scala Plugin in IntelliJ IDE [Tutorial]

Pavan Ramchandani
26 Jun 2018
2 min read
The Scala Plugin turns a normal IntelliJ IDEA installation into a convenient Scala development environment. In this article, we will discuss how to set up the Scala Plugin for the IntelliJ IDEA IDE. If you do not have IntelliJ IDEA, you can download it from here. By default, IntelliJ IDEA does not come with Scala features. The Scala Plugin adds those features, which means that we can create Scala/Play projects, Scala applications, Scala worksheets, and more.

The Scala Plugin supports the following technologies:
Scala
Play Framework
SBT
Scala.js

It supports three popular OS environments: Windows, Mac, and Linux.

Setting up the Scala Plugin for the IntelliJ IDE

Perform the following steps to install the Scala Plugin for the IntelliJ IDE so that we can develop our Scala-based projects:

1. Open the IntelliJ IDE, go to Configure at the bottom right, and click on the Plugins option available in the drop-down, as shown here. This opens the Plugins window.
2. Now click on Install JetBrains plugins, as shown in the preceding screenshot.
3. Next, type the word Scala in the search bar to see the Scala Plugin, as shown here.
4. Click on the Install button to install the Scala Plugin for IntelliJ IDEA.
5. Now restart IntelliJ IDEA to enable the Scala Plugin features.

After we re-open IntelliJ IDEA, if we access the File | New Project option, we will see a Scala option in the New Project window, as shown in the following screenshot, for creating new Scala or Play Framework-based SBT projects (a minimal SBT project sketch follows at the end of this article). We can see the Play Framework option only in the IntelliJ IDEA Ultimate Edition; as we are using CE (Community Edition), we cannot see that option. It's now time to start developing Scala/Play-based applications using the IntelliJ IDE.

To summarize, we gained an understanding of the Scala Plugin and covered its installation steps for IntelliJ. To learn more about taking a reactive programming approach with Scala, please refer to the book Scala Reactive Programming.

What Scala 3.0 Roadmap looks like!
Building Scalable Microservices
Exploring Scala Performance
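For readers who want to verify the setup, here is a minimal sketch of the kind of SBT-based Scala project that IntelliJ IDEA's New Project wizard generates. The project name, Scala version, and file contents below are illustrative assumptions, not values mandated by the plugin or by this article.

build.sbt (sbt's Scala-based build definition):

name := "hello-scala"        // illustrative project name
version := "0.1.0"
scalaVersion := "2.12.8"     // any Scala version supported by the plugin

src/main/scala/Hello.scala:

// A tiny application used only to confirm that the Scala SDK,
// the SBT import, and the Scala Plugin are wired together correctly.
object Hello extends App {
  println("Hello from the Scala Plugin!")
}

Running Hello from the IDE (or with sbt run from a terminal) should print the greeting, which confirms that the plugin's SBT integration is working.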


How to interact with HBase using HBase shell [Tutorial]

Amey Varangaonkar
25 Jun 2018
9 min read
HBase is among the top five most popular and widely-deployed NoSQL databases. It is used to support critical production workloads across hundreds of organizations. It is supported by multiple vendors (in fact, it is one of the few databases that is multi-vendor), and more importantly has an active and diverse developer and user community. In this article, we see how to work with the HBase shell in order to work efficiently on massive amounts of data. The following excerpt is taken from the book '7 NoSQL Databases in a Week' authored by Aaron Ploetz et al.

Working with the HBase shell

The best way to get started with understanding HBase is through the HBase shell. Before we do that, we first need to install HBase. An easy way to get started is to use the Hortonworks sandbox. You can download the sandbox for free from https://hortonworks.com/products/sandbox/. The sandbox can be installed on Linux, Mac, and Windows. Follow the instructions to get this set up.

On any cluster where the HBase client or server is installed, type hbase shell to get a prompt into HBase:

hbase(main):004:0> version
1.1.2.2.3.6.2-3, r2873b074585fce900c3f9592ae16fdd2d4d3a446, Thu Aug 4 18:41:44 UTC 2016

This tells you the version of HBase that is running on the cluster. In this instance, the HBase version is 1.1.2, provided by a particular Hadoop distribution, in this case HDP 2.3.6:

hbase(main):001:0> help
HBase Shell, version 1.1.2.2.3.6.2-3, r2873b074585fce900c3f9592ae16fdd2d4d3a446, Thu Aug 4 18:41:44 UTC 2016
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

This provides the set of operations that are possible through the HBase shell, which includes DDL, DML, and admin operations.

hbase(main):001:0> create 'sensor_telemetry', 'metrics'
0 row(s) in 1.7250 seconds
=> Hbase::Table - sensor_telemetry

This creates a table called sensor_telemetry, with a single column family called metrics. As we discussed before, HBase doesn't require column names to be defined in the table schema (and in fact, has no provision for you to be able to do so):

hbase(main):001:0> describe 'sensor_telemetry'
Table sensor_telemetry is ENABLED
sensor_telemetry
COLUMN FAMILIES DESCRIPTION
{NAME => 'metrics', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.5030 seconds

This describes the structure of the sensor_telemetry table. The command output indicates that there's a single column family present called metrics, with various attributes defined on it.

BLOOMFILTER indicates the type of bloom filter defined for the table, which can either be a bloom filter of the ROW type, which probes for the presence/absence of a given row key, or of the ROWCOL type, which probes for the presence/absence of a given row key, col-qualifier combination. You can also choose to have BLOOMFILTER set to None.

The BLOCKSIZE configures the minimum granularity of an HBase read. By default, the block size is 64 KB, so if the average cells are less than 64 KB and there's not much locality of reference, you can lower your block size to ensure there's not more I/O than necessary and, more importantly, that your block cache isn't wasted on data that is not needed.
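If you would rather drive HBase from code than from the shell, the following is a minimal Scala sketch of the equivalent table creation through HBase's Java client API. This is not part of the excerpt: the connection settings (an hbase-site.xml on the classpath) and the HBase 1.x-era HTableDescriptor/HColumnDescriptor classes are assumptions based on the 1.1.2 version shown above.

import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

object CreateSensorTelemetry extends App {
  // Picks up hbase-site.xml from the classpath; assumes a reachable cluster
  val conf = HBaseConfiguration.create()
  val connection = ConnectionFactory.createConnection(conf)
  val admin = connection.getAdmin

  // A table with a single column family, 'metrics', mirroring the shell's create command
  val desc = new HTableDescriptor(TableName.valueOf("sensor_telemetry"))
  desc.addFamily(new HColumnDescriptor("metrics"))

  if (!admin.tableExists(desc.getTableName)) admin.createTable(desc)

  admin.close()
  connection.close()
}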
VERSIONS refers to the maximum number of cell versions that are to be kept around:

hbase(main):004:0> alter 'sensor_telemetry', {NAME => 'metrics', BLOCKSIZE => '16384', COMPRESSION => 'SNAPPY'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9660 seconds

Here, we are altering the table and column family definition to change the BLOCKSIZE to 16 KB and the COMPRESSION codec to SNAPPY:

hbase(main):005:0> describe 'sensor_telemetry'
Table sensor_telemetry is ENABLED
sensor_telemetry
COLUMN FAMILIES DESCRIPTION
{NAME => 'metrics', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '16384', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds

This is what the table definition now looks like after our ALTER table statement. Next, let's scan the table to see what it contains:

hbase(main):007:0> scan 'sensor_telemetry'
ROW COLUMN+CELL
0 row(s) in 0.0750 seconds

No surprises, the table is empty. So, let's populate some data into the table:

hbase(main):007:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'temperature', '65'
ERROR: Unknown column family! Valid column names: metrics:*

Here, we are attempting to insert data into the sensor_telemetry table: the value '65' for the column qualifier 'temperature' for the row key '/94555/20170308/18:30'. This is unsuccessful because the column 'temperature' is not associated with any column family. In HBase, you always need the row key, the column family, and the column qualifier to uniquely specify a value. So, let's try this again:

hbase(main):008:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'metrics:temperature', '65'
0 row(s) in 0.0120 seconds

OK, that seemed to be successful. Let's confirm that we now have some data in the table:

hbase(main):009:0> count 'sensor_telemetry'
1 row(s) in 0.0620 seconds
=> 1

OK, it looks like we are on the right track. Let's scan the table to see what it contains:

hbase(main):010:0> scan 'sensor_telemetry'
ROW COLUMN+CELL
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810397402, value=65
1 row(s) in 0.0190 seconds

This tells us we've got data for a single row and a single column. The insert time epoch in milliseconds was 1501810397402. In addition to a scan operation, which scans through all of the rows in the table, HBase also provides a get operation, where you can retrieve data for one or more rows, if you know the keys:

hbase(main):011:0> get 'sensor_telemetry', '/94555/20170308/18:30'
COLUMN CELL
metrics:temperature timestamp=1501810397402, value=65

OK, that returns the row as expected. Next, let's look at the effect of cell versions. As we've discussed before, a value in HBase is defined by a combination of Row-key, Col-family, Col-qualifier, and Timestamp.
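As a rough programmatic counterpart to the shell session above, here is a minimal Scala sketch of the same put and get through HBase's Java client API. The connection details are assumed as before; the row key, column family, and qualifier simply mirror the shell example, and this is a sketch rather than anything prescribed by the excerpt.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object PutAndGetTelemetry extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("sensor_telemetry"))

  // Equivalent of: put 'sensor_telemetry', '/94555/20170308/18:30', 'metrics:temperature', '65'
  val put = new Put(Bytes.toBytes("/94555/20170308/18:30"))
  put.addColumn(Bytes.toBytes("metrics"), Bytes.toBytes("temperature"), Bytes.toBytes("65"))
  table.put(put)

  // Equivalent of: get 'sensor_telemetry', '/94555/20170308/18:30'
  val result = table.get(new Get(Bytes.toBytes("/94555/20170308/18:30")))
  val value = Bytes.toString(result.getValue(Bytes.toBytes("metrics"), Bytes.toBytes("temperature")))
  println(s"metrics:temperature = $value")

  table.close()
  connection.close()
}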
To understand this, let's insert the value '66' for the same row key and column qualifier as before:

hbase(main):012:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'metrics:temperature', '66'
0 row(s) in 0.0080 seconds

Now let's read the value for the row key back:

hbase(main):013:0> get 'sensor_telemetry', '/94555/20170308/18:30'
COLUMN CELL
metrics:temperature timestamp=1501810496459, value=66
1 row(s) in 0.0130 seconds

This is in line with what we expect, and this is the standard behavior we'd expect from any database. A put in HBase is the equivalent of an upsert in an RDBMS. Like an upsert, put inserts a value if it doesn't already exist and updates it if a prior value exists. Now, this is where things get interesting. The get operation in HBase allows us to retrieve data associated with a particular timestamp:

hbase(main):015:0> get 'sensor_telemetry', '/94555/20170308/18:30', {COLUMN => 'metrics:temperature', TIMESTAMP => 1501810397402}
COLUMN CELL
metrics:temperature timestamp=1501810397402, value=65
1 row(s) in 0.0120 seconds

We are able to retrieve the old value of 65 by providing the right timestamp. So, puts in HBase don't overwrite the old value, they merely hide it; we can always retrieve the old values by providing the timestamps. Now, let's insert more data into the table:

hbase(main):028:0> put 'sensor_telemetry', '/94555/20170307/18:30', 'metrics:temperature', '43'
0 row(s) in 0.0080 seconds
hbase(main):029:0> put 'sensor_telemetry', '/94555/20170306/18:30', 'metrics:temperature', '33'
0 row(s) in 0.0070 seconds

Now, let's scan the table back:

hbase(main):030:0> scan 'sensor_telemetry'
ROW COLUMN+CELL
/94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956, value=33
/94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67
3 row(s) in 0.0310 seconds

We can also scan the table in reverse key order:

hbase(main):031:0> scan 'sensor_telemetry', {REVERSED => true}
ROW COLUMN+CELL
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67
/94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43
/94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956, value=33
3 row(s) in 0.0520 seconds

What if we wanted all the rows, but in addition, wanted all the cell versions from each row? We can easily retrieve that:

hbase(main):032:0> scan 'sensor_telemetry', {RAW => true, VERSIONS => 10}
ROW COLUMN+CELL
/94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956, value=33
/94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810496459, value=66
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810397402, value=65

Here, we are retrieving all three values of the row key /94555/20170308/18:30 in the scan result set.
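The same kind of scan can be issued from code. The following minimal Scala sketch combines the reversed scan and the multi-version scan from the shell session above into one Scan object via the Java client API; as before, the connection is assumed, and this is an illustrative sketch rather than the book's own example.

import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

object ScanTelemetry extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("sensor_telemetry"))

  val scan = new Scan()
  scan.setReversed(true)   // walk the row keys in descending order, like {REVERSED => true}
  scan.setMaxVersions(10)  // return up to 10 cell versions per column, like {VERSIONS => 10}

  val scanner = table.getScanner(scan)
  for (result <- scanner.asScala; cell <- result.rawCells()) {
    println(Bytes.toString(result.getRow) + " -> " + Bytes.toString(CellUtil.cloneValue(cell)))
  }

  scanner.close()
  table.close()
  connection.close()
}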
HBase scan operations don't need to go from the beginning to the end of the table; you can optionally specify the row to start scanning from and the row to stop the scan operation at:

hbase(main):034:0> scan 'sensor_telemetry', {STARTROW => '/94555/20170307'}
ROW COLUMN+CELL
/94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43
/94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67
2 row(s) in 0.0550 seconds

HBase also provides the ability to supply filters to the scan operation to restrict what rows are returned. It's possible to implement your own filters, but there's rarely a need to, as there's a large collection of filters that are already implemented:

hbase(main):033:0> scan 'sensor_telemetry', {ROWPREFIXFILTER => '/94555/20170307'}
ROW COLUMN+CELL
/94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43
1 row(s) in 0.0300 seconds

This returns all the rows whose keys have the prefix /94555/20170307:

hbase(main):033:0> scan 'sensor_telemetry', { FILTER => SingleColumnValueFilter.new( Bytes.toBytes('metrics'), Bytes.toBytes('temperature'), CompareFilter::CompareOp.valueOf('EQUAL'), BinaryComparator.new(Bytes.toBytes('66')))}

The SingleColumnValueFilter can be used to scan a table and look for all rows with a given column value.

We saw how easy it is to work with your data in HBase using the HBase shell. If you found this excerpt useful, make sure you check out the book 'Seven NoSQL Databases in a Week' to get more hands-on information about HBase and the other popular NoSQL databases out there today.

Read More
Level Up Your Company's Big Data with Mesos
2018 is the year of graph databases. Here's why.
Top 5 NoSQL Databases