Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials - Programming

1083 Articles
article-image-capability-model-microservices
Packt
17 Jun 2016
19 min read
Save for later

A capability model for microservices

Packt
17 Jun 2016
19 min read
In this article by Rajesh RV, the author of Spring Microservices, you will learn aboutthe concepts of microservices. More than sticking to definitions, it is better to understand microservices by examining some common characteristics of microservices that are seen across many successful microservices implementations. Spring Boot is an ideal framework to implement microservices. In this article, we will examine how to implement microservices using Spring Boot with an example use case. Beyond services, we will have to be aware of the challenges around microservices implementation. This article will also talk about some of the common challenges around microservices. A successful microservices implementation has to have some set of common capabilities. In this article, we will establish a microservices capability model that can be used in a technology-neutral framework to implement large-scale microservices. What are microservices? Microservices is an architecture style used by many organizations today as a game changer to achieve a high degree of agility, speed of delivery, and scale. Microservices give us a way to develop more physically separated modular applications. Microservices are not invented. Many organizations, such as Netflix, Amazon, and eBay, successfully used the divide-and-conquer technique to functionally partition their monolithic applications into smaller atomic units, and each performs a single function. These organizations solved a number of prevailing issues they experienced with their monolithic application. Following the success of these organizations, many other organizations started adopting this as a common pattern to refactor their monolithic applications. Later, evangelists termed this pattern microservices architecture. Microservices originated from the idea of Hexagonal Architecture coined by Alister Cockburn. Hexagonal Architecture is also known as thePorts and Adapters pattern. Microservices is an architectural style or an approach to building IT systems as aset of business capabilities that are autonomous, self-contained, and loosely coupled. The preceding diagram depicts a traditional N-tier application architecture having a presentation layer, business layer, and database layer. Modules A, B, and C represents three different business capabilities. The layers in the diagram represent a separation of architecture concerns. Each layer holds all three business capabilities pertaining to this layer. The presentation layer has the web components of all three modules, the business layer has the business components of all the three modules, and the database layer hosts tables of all the three modules. In most cases, layers are physically spreadable, whereas modules within a layer are hardwired. Let's now examine a microservices-based architecture, as follows: As we can note in the diagram, the boundaries are inverted in the microservices architecture. Each vertical slice represents a microservice. Each microservice has its own presentation layer, business layer, and database layer. Microservices are aligned toward business capabilities. By doing so, changes to one microservice do not impact others. There is no standard for communication or transport mechanisms for microservices. In general, microservices communicate with each other using widely adopted lightweight protocols such as HTTP and REST or messaging protocols such as JMS or AMQP. In specific cases, one might choose more optimized communication protocols such as Thrift, ZeroMQ, Protocol Buffers, or Avro. As microservices are more aligned to the business capabilities and have independently manageable lifecycles, they are the ideal choice for enterprises embarking on DevOps and cloud. DevOps and cloud are two other facets of microservices. Microservices are self-contained, independently deployable, and autonomous services that take full responsibility of a business capability and its execution. They bundle all dependencies, including library dependencies and execution environments such as web servers and containers or virtual machines that abstract physical resources. These self-contained services assume single responsibility and are well enclosed with in a bounded context. Microservices – The honeycomb analogy The honeycomb is an ideal analogy to represent the evolutionary microservices architecture. In the real world, bees build a honeycomb by aligning hexagonal wax cells. They start small, using different materials to build the cells. Construction is based on what is available at the time of building. Repetitive cells form a pattern and result in a strong fabric structure. Each cell in the honeycomb is independent but also integrated with other cells. By adding new cells, the honeycomb grows organically to a big solid structure. The content inside each cell is abstracted and is not visible outside. Damage to one cell does not damage other cells, and bees can reconstruct these cells without impacting the overall honeycomb. Characteristics of microservices The microservices definition discussed at the beginning of this article is arbitrary. Evangelists and practitioners have strong but sometimes different opinions on microservices. There is no single, concrete, and universally accepted definition for microservices. However, all successful microservices implementations exhibit a number of common characteristics. Some of these characteristics are explained as follows: Since microservices are more or less similar to a flavor of SOA, many of the service characteristics of SOA are applicable to microservices, as well. In the microservices world, services are first-class citizens. Microservices expose service endpoints as APIs and abstract all their realization details. The APIs could be synchronous or asynchronous. HTTP/REST is the popular choice for APIs. As microservices are autonomous and abstract everything behind service APIs, it is possible to have different architectures for different microservices. The internal implementation logic, architecture, and technologies, including programming language, database, quality of service mechanisms,and so on, are completely hidden behind the service API. Well-designed microservices are aligned to a single business capability, so they perform only one function. As a result, one of the common characteristics we see in most of the implementations are microservices with smaller footprints. Most of the microservices implementations are automated to the maximum extent possible, from development to production. Most large-scale microservices implementations have a supporting ecosystem in place. The ecosystem's capabilities include DevOps processes, centralized log management, service registry, API gateways, extensive monitoring, service routing and flow control mechanisms, and so on. Successful microservices implementations encapsulate logic and data within the service. This results in two unconventional situations: a distributed data and logic and decentralized governance. A microservice example The Customer profile microservice exampleexplained here demonstrates the implementation of microservice and interaction between different microservices. In this example, two microservices, Customer Profile and Customer Notification, will be developed. As shown in the diagram, the Customer Profile microservice exposes methods to create, read, update, and delete a customer and a registration service to register a customer. The registration process applies a certain business logic, saves the customer profile, and sends a message to the CustomerNotification microservice. The CustomerNotification microservice accepts the message send by the registration service and sends an e-mail message to the customer using an SMTP server. Asynchronous messaging is used to integrate CustomerProfile with the CustomerNotification service. The customer microservices class domain model diagram is as shown here: Implementing this Customer Profile microservice is not a big deal. The Spring framework, together with Spring Boot, provides all the necessary capabilities to implement this microservice without much hassle. The key is CustomerController in the diagram, which exposes the REST endpoint for our microservice. It is also possible to use HATEOAS to explore the repository's REST services directly using the @RepositoryRestResource annotation. The following code sample shows the Spring Boot main class called Application and theREST endpoint definition for theregistration of a new customer: @SpringBootApplication public class Application { public static void main(String[] args) { SpringApplication.run(Application.class, args);  } }   @RestController class CustomerController{ //other code here @RequestMapping( path="/register", method = RequestMethod.POST) Customer register(@RequestBody Customer customer){ returncustomerComponent.register(customer); } } CustomerControllerinvokes a component class, CustomerComponent. The component class/bean handles all the business logic. CustomerRepository is a Spring data JPA repository defined to handlethe persistence of the Customer entity. The whole application will then be deployed as a Spring Boot application by building a standalone jar rather than using the conventional war file. Spring Boot encapsulates the server runtime along with the fat jar it produces. By default, it is an instance of the Tomcat server. CustomerComponent, in addition to calling the CustomerRepository class, sends a message to the RabbitMQ queue, where the CustomerNotification component is listening. This can be easily achieved in Spring using the RabbitMessagingTemplate class as shown in the following Sender implementation: @Component class CustomerComponent { //other code here   Customer register(Customer customer){ customerRespository.save(customer); sender.send(customer.getEmail()); return customer; } }   @Component @Lazy class Sender { RabbitMessagingTemplate template;   @Autowired Sender(RabbitMessagingTemplate template){ this.template = template; }   @Bean Queue queue() { return new Queue("CustomerQ", false); }   public void send(String message){ template.convertAndSend("CustomerQ", message); } } The receiver on the other sideconsumes the message using RabbitListener and sends out an e-mail using theJavaMailSender component. Execute the following code: @Component class Receiver { @Autowired private  JavaMailSenderjavaMailService;   @Bean Queue queue() { return new Queue("CustomerQ", false); }   @RabbitListener(queues = "CustomerQ") public void processMessage(String email) { System.out.println(email); SimpleMailMessagemailMessage=new SimpleMailMessage(); mailMessage.setTo(email); mailMessage.setSubject("Registration"); mailMessage.setText("Successfully Registered"); javaMailService.send(mailMessage);       }   } In this case,CustomerNotification isour secondSpring Boot microservice. In this case, instead of the REST endpoint, it only exposes a message listener end point. Microservices challenges In the previous section,you learned about the right design decisions to be made and the trade-offs to be applied. In this section, we will review some of the challenges with microservices. Take a look at the following list: Data islands: Microservices abstract their own local transactional store, which is used for their own transactional purposes. The type of store and the data structure will be optimized for the services offered by the microservice. This can lead to data islands and, hence, challenges around aggregating data from different transactional stores to derive meaningful information. Logging and monitoring: Log files are a good piece of information for analysis and debugging. As each microservice is deployed independently, they emit separate logs, maybe to a local disk. This will result in fragmented logs. When we scale services across multiple machines, each service instance would produce separate log files. This makes it extremely difficult to debug and understand the behavior of the services through log mining. Dependency management: Dependency management is one of the key issues in large microservices deployments. How do we ensure the chattiness between services is manageable?How do we identify and reduce the impact of a change? How do we know whether all the dependent services are up and running? How will the service behave if one of the dependent services is not available? Organization's culture: One of the biggest challenges in microservices implementation is the organization's culture. An organization following waterfall development or heavyweight release management processes with infrequent release cycles is a challenge for microservices development. Insufficient automation is also a challenge for microservices deployments. Governance challenges: Microservices impose decentralized governance, and this is quite in contrast to the traditional SOA governance. Organizations may find it hard to come up with this change, and this could negatively impact microservices development. How can we know who is consuming service? How do we ensure service reuse? How do we define which services are available in the organization? How do we ensure that the enterprise polices are enforced? Operation overheads: Microservices deployments generally increases the number of deployable units and virtual machines (or containers). This adds significant management overheads and cost of operations.With a single application, a dedicated number of containers or virtual machines in an on-premises data center may not make much sense unless the business benefit is high. With many microservices, the number of Configurable Items (CIs) is too high, and the number of servers in which these CIs are deployed might also be unpredictable. This makes it extremely difficult to manage data in a traditional Configuration Management Database (CMDB). Testing microservices: Microservices also pose a challenge for the testability of services. In order to achieve full service functionality, one service may rely on another service, and this, in turn, may rely on another service, either synchronously or asynchronously. The issue is how we test an end-to-end service to evaluate its behavior. Dependent services may or may not be available at the time of testing. Infrastructure provisioning: As briefly touched upon under operation overheads, manual deployment can severely challenge microservices rollouts. If a deployment has manual elements, the deployer or operational administrators should know the running topology, manually reroute traffic, and then deploy the application one by one until all the services are upgraded. With many server instances running, this could lead to significant operational overheads. Moreover, the chance of error is high in this manual approach. Beyond just services– The microservices capability model Microservice are not as simple as the Customer Profile implementation we discussedearlier. This is specifically true when deploying hundreds or thousands of services. In many cases, an improper microservices implementation may lead to a number of challenges, as mentioned before.Any successful Internet-scale microservices deployment requires a number of additional surrounding capabilities. The following diagram depicts the microservices capability model: The capability model is broadly classified in to four areas, as follows: Core capabilities, which are part of the microservices themselves Supporting capabilities, which are software solutions supporting core microservice implementations Infrastructure capabilities, which are infrastructure-level expectations for a successful microservices implementation Governance capabilities, which are more of process, people, and reference information Core capabilities The core capabilities are explained here: Service listeners (HTTP/Messaging): If microservices are enabled for HTTP-based service endpoints, then the HTTP listener will be embedded within the microservices, thereby eliminating the need to have any external application server requirement. The HTTP listener will be started at the time of the application startup. If the microservice is based on asynchronous communication, then instead of an HTTP listener, a message listener will be stated. Optionally, other protocols could also be considered. There may not be any listeners if the microservices is a scheduled service. Spring Boot and Spring Cloud Streams provide this capability. Storage capability: Microservices have storage mechanisms to store state or transactional data pertaining to the business capability. This is optional, depending on the capabilities that are implemented. The storage could be either a physical storage (RDBMS,such as MySQL, and NoSQL,such as Hadoop, Cassandra, Neo4J, Elasticsearch,and so on), or it could be an in-memory store (cache,such as Ehcache and Data grids,such as Hazelcast, Infinispan,and so on). Business capability definition: This is the core of microservices, in which the business logic is implemented. This could be implemented in any applicable language, such as Java, Scala, Conjure, Erlang, and so on. All required business logic to fulfil the function is embedded within the microservices itself. Event sourcing: Microservices send out state changes to the external world without really worrying about the targeted consumers of these events. They could be consumed by other microservices, supporting services such as audit by replication, external applications,and so on. This will allow other microservices and applications to respond to state changes. Service endpoints and communication protocols: This defines the APIs for external consumers to consume. These could be synchronous endpoints or asynchronous endpoints. Synchronous endpoints could be based on REST/JSON or other protocols such as Avro, Thrift, protocol buffers, and so on. Asynchronous endpoints will be through Spring Cloud Streams backed by RabbitMQ or any other messaging servers or other messaging style implementations, such as Zero MQ. The API gateway: The API gateway provides a level of indirection by either proxying service endpoints or composing multiple service endpoints. The API gateway is also useful for policy enforcements. It may also provide real-time load balancing capabilities. There are many API gateways available in the market. Spring Cloud Zuul, Mashery, Apigee, and 3 Scale are some examples of API gateway providers. User interfaces: Generally, user interfaces are also part of microservices for users to interact with the business capabilities realized by the microservices. These could be implemented in any technology and is channel and device agnostic. Infrastructure capabilities Certain infrastructure capabilities are required for a successful deployment and to manage large-scale microservices. When deploying microservices at scale, not having proper infrastructure capabilities can be challenging and can lead to failures. Cloud: Microservices implementation is difficult in a traditional data center environment with a long lead time to provision infrastructures. Even a large number of infrastructure dedicated per microservice may not be very cost effective. Managing them internally in a data center may increase the cost of ownership and of operations. A cloud-like infrastructure is better for microservices deployment. Containers or virtual machines: Managing large physical machines is not cost effective and is also hard to manage. With physical machines, it is also hard to handle automatic fault tolerance. Virtualization is adopted by many organizations because of its ability to provide an optimal use of physical resources, and it provides resource isolation. It also reduces the overheads in managing large physical infrastructure components. Containers are the next generation of virtual machines. VMWare, Citrix,and so on provide virtual machine technologies. Docker, Drawbridge, Rocket, and LXD are some containerizing technologies. Cluster control and provisioning: Once we have a large number of containers or virtual machines, it is hard to manage and maintain them automatically. Cluster control tools provide a uniform operating environment on top of the containers and share the available capacity across multiple services. Apache Mesos and Kubernetes are examples of cluster control systems. Application lifecycle management: Application lifecycle management tools help to invoke applications when a new container is launched or kill the application when the container shuts down. Application lifecycle management allows to script application deployments and releases. It automatically detects failure scenarios and responds to them, thereby ensuring the availability of the application. This works in conjunction with the cluster control software. Marathon partially address this capability. Supporting capabilities Supporting capabilities are not directly linked to microservices, but these are essential for large-scale microservices development. Software-defined load balancer: The load balancer should be smart enough to understand the changes in deployment topology and respond accordingly. This moves away from the traditional approach of configuring static IP addresses, domain aliases, or cluster address in the load balancer. When new servers are added to the environment, it should automatically detect this and include them in the logical cluster by avoiding any manual interactions. Similarly, if a service instance is unavailable, it should take it out of the load balancer. A combination of Ribbon, Eureka, and Zuul provides this capability in Spring Cloud Netflix. Central log management: As explored earlier in this article, a capability is required to centralize all the logs emitted by service instances with correlation IDs. This helps debug, identify performances bottlenecks, and in predictive analysis. The result of this could feedback into the lifecycle manager to take corrective actions. Service registry: A service registry provides a runtime environment for services to automatically publish their availability at runtime. A registry will be a good source of information to understand the services topology at any point. Eureka from Spring Cloud, ZooKeeper, and Etcd are some of the service registry tools available. Security service: The distributed microservices ecosystem requires a central server to manage service security. This includes service authentication and token services. OAuth2-based services are widely used for microservices security. Spring Security and Spring Security OAuth are good candidates to build this capability. Service configuration: All service configurations should be externalized, as discussed in the Twelve-Factor application principles. A central service for all configurations could be a good choice. The Spring Cloud Config server and Archaius are out-of-the-box configuration servers. Testing tools (Anti-Fragile, RUM, and so on): Netflix uses Simian Army for antifragile testing. Mature services need consistent challenges to see the reliability of the services and how good fallback mechanisms are. Simian Army components create various error scenarios to explore the behavior of the system under failure scenarios. Monitoring and dashboards: Microservices also require a strong monitoring mechanism. This monitoring is not just at the infrastructurelevel but also at the service level. Spring Cloud Netflix Turbine, the Hysterix dashboard,and others provide service-level information. End-to-end monitoring tools,such as AppDynamic, NewRelic, Dynatrace, and other tools such as Statd, Sensu, and Spigo, could add value in microservices monitoring. Dependency and CI management: We also need tools to discover runtime topologies, to find service dependencies, and to manage configurable items (CIs). A graph-based CMDB is more obvious to manage these scenarios. Data lakes: As discussed earlier in this article, we need a mechanism to combine data stored in different microservices and perform near real-time analytics. Data lakesare a good choice to achieve this. Data ingestion tools such as Spring Cloud Data Flow, Flume, and Kafka are used to consume data. HDFS, Cassandra,and others are used to store data. Reliable messaging: If the communication is asynchronous, we may need a reliable messaging infrastructure service, such as RabbitMQ or any other reliable messaging service. Cloud messaging or messaging as service is a popular choice in Internet-scale message-based service endpoints. Process and governance capabilities The last in the puzzle are the process and governance capabilities required for microservices, which are: DevOps: Key in successful implementation is to adopt DevOps. DevOps complements microservices development by supporting agile development, high-velocity delivery, automation, and better change management. DevOps tools: DevOps tools for agile development, continuous integration, continuous delivery, and continuous deployment are essential for a successful delivery of microservices. A lot of emphasis is required in automated, functional, and real user testing as well as synthetic, integration, release, and performance testing. Microservices repository: A microservices repository is where the versioned binaries of microservices are placed. These could be a simple Nexus repository or container repositories such as the Docker registry. Microservice documentation: It is important to have all microservices properly documented. Swagger or API blueprint are helpful in achieving good microservices documentation. Reference architecture and libraries: Reference architecture provides a blueprint at the organization level to ensure that services are developed according to certain standards and guidelines in a consistent manner. Many of these could then be translated to a number of reusable libraries that enforce service development philosophies. Summary In this article,you learned the concepts and characteristics of microservices. We took as example a holiday portal to understand the concept of microservices better. We also examined some of the common challenges in large-scale microservice implementation. Finally, we established a microservices capability model in this article that can be used to deliver successful Internet-scale microservices.
Read more
  • 0
  • 0
  • 31235

article-image-python-3-8-available-walrus-operator-positional-only-parameters-vectorcall
Sugandha Lahoti
15 Oct 2019
6 min read
Save for later

Python 3.8 is now available with walrus operator, positional-only parameters support for Vectorcall, and more

Sugandha Lahoti
15 Oct 2019
6 min read
Yesterday, the latest version of the Python programming language, Python 3.8 was made available with multiple new improvements and features. Features include the new walrus operator and positional-only parameters, runtime audit Hooks, Vectorcall, a fast calling protocol for CPython and more. Earlier this month, the team behind Python announced the release of Python 3.8b2, the second of four planned beta releases. What’s new in Python 3.8 PEP 572: New walrus operator in assignment expressions Python 3.8 has a new walrus operator := that assigns values to variables as part of a larger expression. It is useful when matching regular expressions where match objects are needed twice. It can also be used with while-loops that compute a value to test loop termination and then need that same value again in the body of the loop. It can also be used in list comprehensions where a value computed in a filtering condition is also needed in the expression body. The walrus operator was proposed in PEP 572 (Assignment Expressions) by Chris Angelico, Tim Peters, and Guido van Rossum last year. Since then it has been heavily discussed in the Python community with many questioning whether it is a needed improvement. Others are excited as the operator does make the code more readable. One user commented on HN, “The "walrus operator" will occasionally be useful, but I doubt I will find many effective uses for it. Same with the forced positional/keyword arguments and the "self-documenting" f-string expressions. Even when they have a use, it's usually just to save one line of code or a few extra characters.” https://twitter.com/reynoldsnlp/status/1183498971425042433 https://twitter.com/jakevdp/status/1140071525870997504 PEP 570: New function parameter syntax in positional-only parameters Python 3.8 has a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments. This notation allows pure Python functions to fully emulate behaviors of existing C-coded functions. It can be used to preclude keyword arguments when the parameter name is not helpful. It also allows the parameter name to be changed in the future without the risk of breaking client code. As with PEP 572, this proposal also got mixed reactions from Python developers. In support, one developer said, “Position-only parameters already exist in cpython builtins like range and min. Making their support at the language level would make their existence less confusing and documented.” While others think that this will allow authors to “dictate” how their methods could be used. “Not the biggest fan of this one because it allows library authors to overly dictate how their functions can be used, as in, mark an argument as positional merely because they want to. But cool all the same,” a Redditor commented. PEP 578: Python Audit Hooks and Verified Open Hook Python 3.8 now has an Audit Hook and Verified Open Hook. These hooks allow applications and frameworks written in pure Python code to take advantage of extra notifications. They also allow embedders or system administrators to deploy builds of Python where auditing is always enabled. These are available from Python and native code. PEP 587: New C API to configure the Python Initialization Though Python is highly configurable, its configuration seems scattered all around the code. Python 3.8 adds a new C API to configure the Python Initialization providing finer control on the whole configuration and better error reporting. This PEP also adds _PyRuntimeState.preconfig (PyPreConfig type) and PyInterpreterState.config (PyConfig type) fields to internal structures. PyInterpreterState.config becomes the new reference configuration, replacing global configuration variables and other private variables. PEP 590: Provisional support for Vectorcall, a fast calling protocol for CPython A currently provisional Vectorcall protocol is added to the Python/C API. It is meant to formalize existing optimizations that were already done for various classes. Any extension type implementing a callable can use this protocol. It will be made fully public in Python 3.9. PEP 574: Pickle protocol 5 supports out-of-band data buffers The pickle protocol 5 now introduces support for out-of-band buffers. This means PEP 3118 compatible data can be transmitted separately from the main pickle stream, at the discretion of the communication layer. Parallel filesystem cache for compiled bytecode files There is a new PYTHONPYCACHEPREFIX setting that configures the implicit bytecode cache to use a separate parallel filesystem tree, rather than the default __pycache__ subdirectories within each source directory. Python uses the same ABI whether it’s built-in release or debug mode With Python 3.8, Python uses the same ABI whether it’s built-in release or debug mode. On Unix, when Python is built in debug mode, it is now possible to load C extensions built-in release mode and C extensions built using the stable ABI. On Unix, C extensions are no longer linked to libpython except on Android and Cygwin. Also, on Unix, when Python is built in debug mode, import now also looks for C extensions compiled in release mode and for C extensions compiled with the stable ABI. f-strings now have a = specifier Formatted strings (f-strings) were introduced in Python 3.6 with PEP 498. It enables you to evaluate an expression as part of the string along with inserting the result of function calls and so on. Python 3.8 adds a = specifier to f-strings for self-documenting expressions and debugging. An f-string such as f'{expr=}' will expand to the text of the expression, an equal sign, then the representation of the evaluated expression. One developer expressed their delight on Hacker News, “F strings are pretty awesome. I’m coming from JavaScript and partially java background. JavaScript’s String concatenation can become too complex and I have difficulty with large strings.” Another developer said, “The expansion of f-strings is a welcome addition. The more I use them, the happier I am that they exist.” Someone added to this, “This makes clean string interpolation so much easier to do, especially for print statements. It's almost hard to use python < 3.6 now because of them. New metadata module Python 3.8 has a new importlib.metadata module that provides (provisional) support for reading metadata from third-party packages. It can, for instance, extract an installed package’s version number, list of entry points, and more. You can go through other improved modules, language changes, Build and C API changes, API and Feature removals in Python 3.8 on Python docs. For full details, see the changelog. Python 3.8b2 new features: the walrus operator, positional-only parameters, and much more Python 3.8 beta 1 is now ready for you to test Łukasz Langa at PyLondinium19: “If Python stays synonymous with CPython for too long, we’ll be in big trouble. PyPy will continue to support Python 2.7, even as major Python projects migrate to Python 3.
Read more
  • 0
  • 0
  • 30875

article-image-work-with-classes-in-typescript
Amey Varangaonkar
15 May 2018
8 min read
Save for later

How to work with classes in Typescript

Amey Varangaonkar
15 May 2018
8 min read
If we are developing any application using TypeScript, be it a small-scale or a large-scale application, we will use classes to manage our properties and methods. Prior to ES 2015, JavaScript did not have the concept of classes, and we used functions to create class-like behavior. TypeScript introduced classes as part of its initial release, and now we have classes in ES6 as well. The behavior of classes in TypeScript and JavaScript ES6 closely relates to the behavior of any object-oriented language that you might have worked on, such as C#. This excerpt is taken from the book TypeScript 2.x By Example written by Sachin Ohri. Object-oriented programming in TypeScript Object-oriented programming allows us to represent our code in the form of objects, which themselves are instances of classes holding properties and methods. Classes form the container of related properties and their behavior. Modeling our code in the form of classes allows us to achieve various features of object-oriented programming, which helps us write more intuitive, reusable, and robust code. Features such as encapsulation, polymorphism, and inheritance are the result of implementing classes. TypeScript, with its implementation of classes and interfaces, allows us to write code in an object-oriented fashion. This allows developers coming from traditional languages, such as Java and C#, feel right at home when learning TypeScript. Understanding classes Prior to ES 2015, JavaScript developers did not have any concept of classes; the best way they could replicate the behavior of classes was with functions. The function provides a mechanism to group together related properties and methods. The methods can be either added internally to the function or using the prototype keyword. The following is an example of such a function: function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; this.fullName = function() { return this.firstName + ' ' + this.lastName ; }; } In this preceding example, we have the fullName method encapsulated inside the Name function. Another way of adding methods to functions is shown in the following code snippet with the prototype keyword: function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; } Name.prototype.fullName = function() { return this.firstName + ' ' + this.lastName ; }; These features of functions did solve most of the issues of not having classes, but most of the dev community has not been comfortable with these approaches. Classes make this process easier. Classes provide an abstraction on top of common behavior, thus making code reusable. The following is the syntax for defining a class in TypeScript: The syntax of the class should look very similar to readers who come from an object-oriented background. To define a class, we use a class keyword followed by the name of the class. The News class has three member properties and one method. Each member has a type assigned to it and has an access modifier to define the scope. On line 10, we create an object of a class with the new keyword. Classes in TypeScript also have the concept of a constructor, where we can initialize some properties at the time of object creation. Access modifiers Once the object is created, we can access the public members of the class with the dot operator. Note that we cannot access the author property with the espn object because this property is defined as private. TypeScript provides three types of access modifiers. Public Any property defined with the public keyword will be freely accessible outside the class. As we saw in the previous example, all the variables marked with the public keyword were available outside the class in an object. Note that TypeScript assigns public as a default access modifier if we do not assign any explicitly. This is because the default JavaScript behavior is to have everything public. Private When a property is marked as private, it cannot be accessed outside of the class. The scope of a private variable is only inside the class when using TypeScript. In JavaScript, as we do not have access modifiers, private members are treated similarly to public members. Protected The protected keyword behaves similarly to private, with the exception that protected variables can be accessed in the derived classes. The following is one such example: class base{ protected id: number; } class child extends base{ name: string; details():string{ return `${name} has id: ${this.id}` } } In the preceding code, we extend the child class with the base class and have access to the id property inside the child class. If we create an object of the child class, we will still not have access to the id property outside. Readonly As the name suggests, a property with a readonly access modifier cannot be modified after the value has been assigned to it. The value assigned to a readonly property can only happen at the time of variable declaration or in the constructor. In the above code, line 5 gives an error stating that property name is readonly, and cannot be an assigned value. Transpiled JavaScript from classes While learning TypeScript, it is important to remember that TypeScript is a superset of JavaScript and not a new language on its own. Browsers can only understand JavaScript, so it is important for us to understand the JavaScript that is transpiled by TypeScript. TypeScript provides an option to generate JavaScript based on the ECMA standards. You can configure TypeScript to transpile into ES5 or ES6 (ES 2015) and even ES3 JavaScript by using the flag target in the tsconfig.json file. The biggest difference between ES5 and ES6 is with regard to the classes, let, and const keywords which were introduced in ES6. Even though ES6 has been around for more than a year, most browsers still do not have full support for ES6. So, if you are creating an application that would target older browsers as well, consider having the target as ES5. So, the JavaScript that's generated will be different based on the target setting. Here, we will take an example of class in TypeScript and generate JavaScript for both ES5 and ES6. The following is the class definition in TypeScript: This is the same code that we saw when we introduced classes in the Understanding Classes section. Here, we have a class named News that has three members, two of which are public and one private. The News class also has a format method, which returns a string concatenated from the member variables. Then, we create an object of the News class in line 10 and assign values to public properties. In the last line, we call the format method to print the result. Now let's look at the JavaScript transpiled by TypeScript compiler for this class. ES6 JavaScript ES6, also known as ES 2015, is the latest version of JavaScript, which provides many new features on top of ES5. Classes are one such feature; JavaScript did not have classes prior to ES6. The following is the code generated from the TypeScript class, which we saw previously: If you compare the preceding code with TypeScript code, you will notice minor differences. This is because classes in TypeScript and JavaScript are similar, with just types and access modifiers additional in TypeScript. In JavaScript, we do not have the concept of declaring public members. The author variable, which was defined as private and was initialized at its declaration, is converted to a constructor initialization in JavaScript. If we had not have initialized author, then the produced JavaScript would not have added author in the constructor. ES5 JavaScript ES5 is the most popular JavaScript version supported in browsers, and if you are developing an application that has to support the majority of browser versions, then you need to transpile your code to the ES5 version. This version of JavaScript does not have classes, and hence the transpiled code converts classes to functions, and methods inside the classes are converted to prototypically defined methods on the functions. The following is the code transpiled when we have the target set as ES5 in the TypeScript compiler options: As discussed earlier, the basic difference is that the class is converted to a function. The interesting aspect of this conversion is that the News class is converted to an immediately invoked function expression (IIFE). An IIFE can be identified by the parenthesis at the end of the function declaration, as we see in line 9 in the preceding code snippet. IIFEs cause the function to be executed immediately and help to maintain the correct scope of a function rather than declaring the function in a global scope. Another difference was how we defined the method format in the ES5 JavaScript. The prototype keyword is used to add the additional behavior to the function, which we see here. A couple of other differences you may have noticed include the change of the let keyword to var, as let is not supported in ES5. All variables in ES5 are defined with the var keyword. Also, the format method now does not use a template string, but standard string concatenation to print the output. TypeScript does a good job of transpiling the code to JavaScript while following recommended practices. This helps in making sure we have a robust and reusable code with minimum error cases. If you found this tutorial useful, make sure you check out the book TypeScript 2.x By Example for more hands-on tutorials on how to effectively leverage the power of TypeScript to develop and deploy state-of-the-art web applications. How to install and configure TypeScript Understanding Patterns and Architectures in TypeScript Writing SOLID JavaScript code with TypeScript
Read more
  • 0
  • 0
  • 30675

article-image-what-are-microservices
Packt
20 Jun 2017
12 min read
Save for later

What are Microservices?

Packt
20 Jun 2017
12 min read
In this article written by Gaurav Kumar Aroraa, Lalit Kale, Kanwar Manish, authors of the book Building Microservices with .NET Core, we will start with a brief introduction. Then, we will define its predecessors: monolithic architecture and service-oriented architecture (SOA). After this, we will see how microservices fare against both SOA and the monolithic architecture. We will then compare the advantages and disadvantages of each one of these architectural styles. This will enable us to identify the right scenario for these styles. We will understand the problems that arise from having a layered monolithic architecture. We will discuss the solutions available to these problems in the monolithic world. At the end, we will be able to break down a monolithic application into a microservice architecture. We will cover the following topics in this article: Origin of microservices Discussing microservices (For more resources related to this topic, see here.) Origin of microservices The term microservices was used for the first time in mid-2011 at a workshop of software architects. In March 2012, James Lewis presented some of his ideas about microservices. By the end of 2013, various groups from the IT industry started having discussions on microservices, and by 2014, it had become popular enough to be considered a serious contender for large enterprises. There is no official introduction available for microservices. The understanding of the term is purely based on the use cases and discussions held in the past. We will discuss this in detail, but before that, let's check out the definition of microservices as per Wikipedia (https://en.wikipedia.org/wiki/Microservices), which sums it up as: Microservices is a specialization of and implementation approach for SOA used to build flexible, independently deployable software systems. In 2014, James Lewis and Martin Fowler came together and provided a few real-world examples and presented microservices (refer to http://martinfowler.com/microservices/) in their own words and further detailed it as follows: The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies. It is very important that you see all the attributes James and Martin defined here. They defined it as an architectural style that developers could utilize to develop a single application with the business logic spread across a bunch of small services, each having their own persistent storage functionality. Also, note its attributes: it can be independently deployable, can run in its own process, is a lightweight communication mechanism, and can be written in different programming languages. We want to emphasize this specific definition since it is the crux of the whole concept. And as we move along, it will come together by the time we finish this book. Discussing microservices Until now, we have gone through a few definitions of microservices; now, let's discuss microservices in detail. In short, a microservice architecture removes most of the drawbacks of SOA architectures.  Slicing your application into a number of services is neither SOA nor microservices. However, combining service design and best practices from the SOA world along with a few emerging practices, such as isolated deployment, semantic versioning, providing lightweight services, and service discovery in polyglot programming, is microservices. We implement microservices to satisfy business features and implement them with reduced time to market and greater flexibility. Before we move on to understand the architecture, let's discuss the two important architectures that have led to its existence: The monolithic architecture style SOA Most of us would be aware of the scenario where during the life cycle of an enterprise application development, a suitable architectural style is decided. Then, at various stages, the initial pattern is further improved and adapted with changes that cater to various challenges, such as deployment complexity, large code base, and scalability issues. This is exactly how the monolithic architecture style evolved into SOA, further leading up to microservices. Monolithic architecture The monolithic architectural style is a traditional architecture type and has been widely used in the industry. The term "monolithic" is not new and is borrowed from the Unix world. In Unix, most of the commands exist as a standalone program whose functionality is not dependent on any other program. As seen in the succeeding image, we can have different components in the application such as: User interface: This handles all of the user interaction while responding with HTML or JSON or any other preferred data interchange format (in the case of web services). Business logic: All the business rules applied to the input being received in the form of user input, events, and database exist here. Database access: This houses the complete functionality for accessing the database for the purpose of querying and persisting objects. A widely accepted rule is that it is utilized through business modules and never directly through user-facing components. Software built using this architecture is self-contained. We can imagine a single .NET assembly that contains various components, as described in the following image: As the software is self-contained here, its components are interconnected and interdependent. Even a simple code change in one of the modules may break a major functionality in other modules. This would result in a scenario where we'd need to test the whole application. With the business depending critically on its enterprise application frameworks, this amount of time could prove to be very critical. Having all the components tightly coupled poses another challenge: whenever we execute or compile such software, all the components should be available or the build will fail; refer to the preceding image that represents a monolithic architecture and is a self-contained or a single .NET assembly project. However, monolithic architectures might also have multiple assemblies. This means that even though a business layer (assembly, data access layer assembly, and so on) is separated, at run time, all of them will come together and run as one process.  A user interface depends on other components' direct sale and inventory in a manner similar to all other components that depend upon each other. In this scenario, we will not be able to execute this project in the absence of any one of these components. The process of upgrading any one of these components will be more complex as we may have to consider other components that require code changes too. This results in more development time than required for the actual change. Deploying such an application will become another challenge. During deployment, we will have to make sure that each and every component is deployed properly; otherwise, we may end up facing a lot of issues in our production environments. If we develop an application using the monolithic architecture style, as discussed previously, we might face the following challenges: Large code base: This is a scenario where the code lines outnumber the comments by a great margin. As components are interconnected, we will have to bear with a repetitive code base. Too many business modules: This is in regard to modules within the same system. Code base complexity: This results in a higher chance of code breaking due to the fix required in other modules or services. Complex code deployment: You may come across minor changes that would require whole system deployment. One module failure affecting the whole system: This is in regard to modules that depend on each other. Scalability: This is required for the entire system and not just the modules in it. Intermodule dependency: This is due to tight coupling. Spiraling development time: This is due to code complexity and interdependency. Inability to easily adapt to a new technology: In this case, the entire system would need to be upgraded. As discussed earlier, if we want to reduce development time, ease of deployment, and improve maintainability of software for enterprise applications, we should avoid the traditional or monolithic architecture. Service-oriented architecture In the previous section, we discussed the monolithic architecture and its limitations. We also discussed why it does not fit into our enterprise application requirements. To overcome these issues, we should go with some modular approach where we can separate the components such that they should come out of the self-contained or single .NET assembly. The main difference between SOA & monolithic is not one or multiple assembly. But as the service in SOA runs as separate process, SOA scales better compared to monolithic. Let's discuss the modular architecture, that is, SOA. This is a famous architectural style using which the enterprise applications are designed with a collection of services as its base. These services may be RESTful or ASMX Web services. To understand SOA in more detail, let's discuss "service" first. What is service? Service, in this case, is an essential concept of SOA. It can be a piece of code, program, or software that provides some functionality to other system components. This piece of code can interact directly with the database or indirectly through another service. Furthermore, it can be consumed by clients directly, where the client may either be a website, desktop app, mobile app, or any other device app. Refer to the following diagram: Service refers to a type of functionality exposed for consumption by other systems (generally referred to as clients/client applications). As mentioned earlier, it can be represented by a piece of code, program, or software. Such services are exposed over the HTTP transport protocol as a general practice. However, the HTTP protocol is not a limiting factor, and a protocol can be picked as deemed fit for the scenario. In the following image, Service – direct selling is directly interacting with Database, and three different clients, namely Web, Desktop, and Mobile, are consuming the service. On the other hand, we have clients consuming Service – partner selling, which is interacting with Service – channel partners for database access. A product selling service is a set of services that interacts with client applications and provides database access directly or through another service, in this case, Service – Channel partner.  In the case of Service – direct selling, shown in the preceding example, it is providing some functionality to a Web Store, a desktop application, and a mobile application. This service is further interacting with the database for various tasks, namely fetching data, persisting data, and so on. Normally, services interact with other systems via some communication channel, generally the HTTP protocol. These services may or may not be deployed on the same or single servers. In the preceding image, we have projected an SOA example scenario. There are many fine points to note here, so let's get started. Firstly, our services can be spread across different physical machines. Here, Service-direct selling is hosted on two separate machines. It is a possible scenario that instead of the entire business functionality, only a part of it will reside on Server 1 and the remaining on Server 2. Similarly, Service – partner selling appears to be having the same arrangement on Server 3 and Server 4. However, it doesn't stop Service – channel partners being hosted as a complete set on both the servers: Server 5 and Server 6. A system that uses a service or multiple services in a fashion mentioned in the preceding figure is called an SOA. We will discuss SOA in detail in the following sections. Let's recall the monolithic architecture. In this case, we did not use it because it restricts code reusability; it is a self-contained assembly, and all the components are interconnected and interdependent. For deployment, in this case, we will have to deploy our complete project after we select the SOA (refer to preceding image and subsequent discussion). Now, because of the use of this architectural style, we have the benefit of code reusability and easy deployment. Let's examine this in the wake of the preceding figure: Reusability: Multiple clients can consume the service. The service can also be simultaneously consumed by other services. For example, OrderService is consumed by web and mobile clients. Now, OrderService can also be used by the Reporting Dashboard UI. Stateless: Services do not persist any state between requests from the client, that is, the service doesn't know, nor care, that the subsequent request has come from the client that has/hasn't made the previous request. Contract-based: Interfaces make it technology-agnostic on both sides of implementation and consumption. It also serves to make it immune to the code updates in the underlying functionality. Scalability: A system can be scaled up; SOA can be individually clustered with appropriate load balancing. Upgradation: It is very easy to roll out new functionalities or introduce new versions of the existing functionality. The system doesn't stop you from keeping multiple versions of the same business functionality. Summary In this article, we discussed what the microservice architectural style is in detail, its history, and how it differs from its predecessors: monolithic and SOA. We further defined the various challenges that monolithic faces when dealing with large systems. Scalability and reusability are some definite advantages that SOA provides over monolithic. We also discussed the limitations of the monolithic architecture, including scaling problems, by implementing a real-life monolithic application. The microservice architecture style resolves all these issues by reducing code interdependency and isolating the dataset size that any one of the microservices works upon. We utilized dependency injection and database refactoring for this. We further explored automation, CI, and deployment. These easily allow the development team to let the business sponsor choose what industry trends to respond to first. This results in cost benefits, better business response, timely technology adoption, effective scaling, and removal of human dependency. Resources for Article: Further resources on this subject: Microservices and Service Oriented Architecture [article] Breaking into Microservices Architecture [article] Microservices – Brave New World [article]
Read more
  • 0
  • 0
  • 30666

article-image-python-design-patterns-depth-factory-pattern
Packt
15 Feb 2016
17 min read
Save for later

Python Design Patterns in Depth: The Factory Pattern

Packt
15 Feb 2016
17 min read
Creational design patterns deal with an object creation [j.mp/wikicrea]. The aim of a creational design pattern is to provide better alternatives for situations where a direct object creation (which in Python happens by the __init__() function [j.mp/divefunc], [Lott14, page 26]) is not convenient. In the Factory design pattern, a client asks for an object without knowing where the object is coming from (that is, which class is used to generate it). The idea behind a factory is to simplify an object creation. It is easier to track which objects are created if this is done through a central function, in contrast to letting a client create objects using a direct class instantiation [Eckel08, page 187]. A factory reduces the complexity of maintaining an application by decoupling the code that creates an object from the code that uses it [Zlobin13, page 30]. Factories typically come in two forms: the Factory Method, which is a method (or in Pythonic terms, a function) that returns a different object per input parameter [j.mp/factorympat]; the Abstract Factory, which is a group of Factory Methods used to create a family of related products [GOF95, page 100], [j.mp/absfpat] (For more resources related to this topic, see here.) Factory Method In the Factory Method, we execute a single function, passing a parameter that provides information about what we want. We are not required to know any details about how the object is implemented and where it is coming from. A real-life example An example of the Factory Method pattern used in reality is in plastic toy construction. The molding powder used to construct plastic toys is the same, but different figures can be produced using different plastic molds. This is like having a Factory Method in which the input is the name of the figure that we want (soldier and dinosaur) and the output is the plastic figure that we requested. The toy construction case is shown in the following figure, which is provided by www.sourcemaking.com [j.mp/factorympat]. A software example The Django framework uses the Factory Method pattern for creating the fields of a form. The forms module of Django supports the creation of different kinds of fields (CharField, EmailField) and customizations (max_length, required) [j.mp/djangofacm]. Use cases If you realize that you cannot track the objects created by your application because the code that creates them is in many different places instead of a single function/method, you should consider using the Factory Method pattern [Eckel08, page 187]. The Factory Method centralizes an object creation and tracking your objects becomes much more easier. Note that it is absolutely fine to create more than one Factory Method, and this is how it is typically done in practice. Each Factory Method logically groups the creation of objects that have similarities. For example, one Factory Method might be responsible for connecting you to different databases (MySQL, SQLite), another Factory Method might be responsible for creating the geometrical object that you request (circle, triangle), and so on. The Factory Method is also useful when you want to decouple an object creation from an object usage. We are not coupled/bound to a specific class when creating an object, we just provide partial information about what we want by calling a function. This means that introducing changes to the function is easy without requiring any changes to the code that uses it [Zlobin13, page 30]. Another use case worth mentioning is related with improving the performance and memory usage of an application. A Factory Method can improve the performance and memory usage by creating new objects only if it is absolutely necessary [Zlobin13, page 28]. When we create objects using a direct class instantiation, extra memory is allocated every time a new object is created (unless the class uses caching internally, which is usually not the case). We can see that in practice in the following code (file id.py), it creates two instances of the same class A and uses the id() function to compare their memory addresses. The addresses are also printed in the output so that we can inspect them. The fact that the memory addresses are different means that two distinct objects are created as follows: class A(object):     pass if __name__ == '__main__':     a = A()     b = A()     print(id(a) == id(b))     print(a, b) Executing id.py on my computer gives the following output:>> python3 id.pyFalse<__main__.A object at 0x7f5771de8f60> <__main__.A object at 0x7f5771df2208> Note that the addresses that you see if you execute the file are not the same as I see because they depend on the current memory layout and allocation. But the result must be the same: the two addresses should be different. There's one exception that happens if you write and execute the code in the Python Read-Eval-Print Loop (REPL) (interactive prompt), but that's a REPL-specific optimization which is not happening normally. Implementation Data comes in many forms. There are two main file categories for storing/retrieving data: human-readable files and binary files. Examples of human-readable files are XML, Atom, YAML, and JSON. Examples of binary files are the .sq3 file format used by SQLite and the .mp3 file format used to listen to music. In this example, we will focus on two popular human-readable formats: XML and JSON. Although human-readable files are generally slower to parse than binary files, they make data exchange, inspection, and modification much more easier. For this reason, it is advised to prefer working with human-readable files, unless there are other restrictions that do not allow it (mainly unacceptable performance and proprietary binary formats). In this problem, we have some input data stored in an XML and a JSON file, and we want to parse them and retrieve some information. At the same time, we want to centralize the client's connection to those (and all future) external services. We will use the Factory Method to solve this problem. The example focuses only on XML and JSON, but adding support for more services should be straightforward. First, let's take a look at the data files. The XML file, person.xml, is based on the Wikipedia example [j.mp/wikijson] and contains information about individuals (firstName, lastName, gender, and so on) as follows: <persons>   <person>     <firstName>John</firstName>     <lastName>Smith</lastName>     <age>25</age>     <address>       <streetAddress>21 2nd Street</streetAddress>       <city>New York</city>       <state>NY</state>       <postalCode>10021</postalCode>     </address>     <phoneNumbers>       <phoneNumber type="home">212 555-1234</phoneNumber>       <phoneNumber type="fax">646 555-4567</phoneNumber>     </phoneNumbers>     <gender>       <type>male</type>     </gender>   </person>   <person>     <firstName>Jimy</firstName>     <lastName>Liar</lastName>     <age>19</age>     <address>       <streetAddress>18 2nd Street</streetAddress>       <city>New York</city>       <state>NY</state>       <postalCode>10021</postalCode>     </address>     <phoneNumbers>       <phoneNumber type="home">212 555-1234</phoneNumber>     </phoneNumbers>     <gender>       <type>male</type>     </gender>   </person>   <person>     <firstName>Patty</firstName>     <lastName>Liar</lastName>     <age>20</age>     <address>       <streetAddress>18 2nd Street</streetAddress>       <city>New York</city>       <state>NY</state>       <postalCode>10021</postalCode>     </address>     <phoneNumbers>       <phoneNumber type="home">212 555-1234</phoneNumber>       <phoneNumber type="mobile">001 452-8819</phoneNumber>     </phoneNumbers>     <gender>       <type>female</type>     </gender>   </person> </persons> The JSON file, donut.json, comes from the GitHub account of Adobe [j.mp/adobejson] and contains donut information (type, price/unit i.e. ppu, topping, and so on) as follows: [   {     "id": "0001",     "type": "donut",     "name": "Cake",     "ppu": 0.55,     "batters": {       "batter": [         { "id": "1001", "type": "Regular" },         { "id": "1002", "type": "Chocolate" },         { "id": "1003", "type": "Blueberry" },         { "id": "1004", "type": "Devil's Food" }       ]     },     "topping": [       { "id": "5001", "type": "None" },       { "id": "5002", "type": "Glazed" },       { "id": "5005", "type": "Sugar" },       { "id": "5007", "type": "Powdered Sugar" },       { "id": "5006", "type": "Chocolate with Sprinkles" },       { "id": "5003", "type": "Chocolate" },       { "id": "5004", "type": "Maple" }     ]   },   {     "id": "0002",     "type": "donut",     "name": "Raised",     "ppu": 0.55,     "batters": {       "batter": [         { "id": "1001", "type": "Regular" }       ]     },     "topping": [       { "id": "5001", "type": "None" },       { "id": "5002", "type": "Glazed" },       { "id": "5005", "type": "Sugar" },       { "id": "5003", "type": "Chocolate" },       { "id": "5004", "type": "Maple" }     ]   },   {     "id": "0003",     "type": "donut",     "name": "Old Fashioned",     "ppu": 0.55,     "batters": {       "batter": [         { "id": "1001", "type": "Regular" },         { "id": "1002", "type": "Chocolate" }       ]     },     "topping": [       { "id": "5001", "type": "None" },       { "id": "5002", "type": "Glazed" },       { "id": "5003", "type": "Chocolate" },       { "id": "5004", "type": "Maple" }     ]   } ] We will use two libraries that are part of the Python distribution for working with XML and JSON: xml.etree.ElementTree and json as follows: import xml.etree.ElementTree as etree import json The JSONConnector class parses the JSON file and has a parsed_data() method that returns all data as a dictionary (dict). The property decorator is used to make parsed_data() appear as a normal variable instead of a method as follows: class JSONConnector:     def __init__(self, filepath):         self.data = dict()         with open(filepath, mode='r', encoding='utf-8') as f:             self.data = json.load(f)       @property     def parsed_data(self):         return self.data The XMLConnector class parses the XML file and has a parsed_data() method that returns all data as a list of xml.etree.Element as follows: class XMLConnector:     def __init__(self, filepath):         self.tree = etree.parse(filepath)     @property    def parsed_data(self):         return self.tree The connection_factory() function is a Factory Method. It returns an instance of JSONConnector or XMLConnector depending on the extension of the input file path as follows: def connection_factory(filepath):     if filepath.endswith('json'):         connector = JSONConnector     elif filepath.endswith('xml'):         connector = XMLConnector     else:         raise ValueError('Cannot connect to {}'.format(filepath))     return connector(filepath) The connect_to() function is a wrapper of connection_factory(). It adds exception handling as follows: def connect_to(filepath):     factory = None     try:         factory = connection_factory(filepath)     except ValueError as ve:         print(ve)     return factory The main() function demonstrates how the Factory Method design pattern can be used. The first part makes sure that exception handling is effective as follows: def main():     sqlite_factory = connect_to('data/person.sq3') The next part shows how to work with the XML files using the Factory Method. XPath is used to find all person elements that have the last name Liar. For each matched person, the basic name and phone number information are shown as follows: xml_factory = connect_to('data/person.xml')     xml_data = xml_factory.parsed_data()     liars = xml_data.findall     (".//{person}[{lastName}='{}']".format('Liar'))     print('found: {} persons'.format(len(liars)))     for liar in liars:         print('first name:         {}'.format(liar.find('firstName').text))         print('last name: {}'.format(liar.find('lastName').text))         [print('phone number ({}):'.format(p.attrib['type']),         p.text) for p in liar.find('phoneNumbers')] The final part shows how to work with the JSON files using the Factory Method. Here, there's no pattern matching, and therefore the name, price, and topping of all donuts are shown as follows: json_factory = connect_to('data/donut.json')     json_data = json_factory.parsed_data     print('found: {} donuts'.format(len(json_data)))     for donut in json_data:         print('name: {}'.format(donut['name']))         print('price: ${}'.format(donut['ppu']))         [print('topping: {} {}'.format(t['id'], t['type'])) for t         in donut['topping']] For completeness, here is the complete code of the Factory Method implementation (factory_method.py) as follows: import xml.etree.ElementTree as etree import json class JSONConnector:     def __init__(self, filepath):         self.data = dict()         with open(filepath, mode='r', encoding='utf-8') as f:             self.data = json.load(f)     @property     def parsed_data(self):         return self.data class XMLConnector:     def __init__(self, filepath):         self.tree = etree.parse(filepath)     @property     def parsed_data(self):         return self.tree def connection_factory(filepath):     if filepath.endswith('json'):         connector = JSONConnector     elif filepath.endswith('xml'):         connector = XMLConnector     else:         raise ValueError('Cannot connect to {}'.format(filepath))     return connector(filepath) def connect_to(filepath):     factory = None     try:        factory = connection_factory(filepath)     except ValueError as ve:         print(ve)     return factory def main():     sqlite_factory = connect_to('data/person.sq3')     print()     xml_factory = connect_to('data/person.xml')     xml_data = xml_factory.parsed_data     liars = xml_data.findall(".//{}[{}='{}']".format('person',     'lastName', 'Liar'))     print('found: {} persons'.format(len(liars)))     for liar in liars:         print('first name:         {}'.format(liar.find('firstName').text))         print('last name: {}'.format(liar.find('lastName').text))         [print('phone number ({}):'.format(p.attrib['type']),         p.text) for p in liar.find('phoneNumbers')]     print()     json_factory = connect_to('data/donut.json')     json_data = json_factory.parsed_data     print('found: {} donuts'.format(len(json_data)))     for donut in json_data:     print('name: {}'.format(donut['name']))     print('price: ${}'.format(donut['ppu']))     [print('topping: {} {}'.format(t['id'], t['type'])) for t     in donut['topping']] if __name__ == '__main__':     main() Here is the output of this program as follows: >>> python3 factory_method.pyCannot connect to data/person.sq3found: 2 personsfirst name: Jimylast name: Liarphone number (home): 212 555-1234first name: Pattylast name: Liarphone number (home): 212 555-1234phone number (mobile): 001 452-8819found: 3 donutsname: Cakeprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5005 Sugartopping: 5007 Powdered Sugartopping: 5006 Chocolate with Sprinklestopping: 5003 Chocolatetopping: 5004 Maplename: Raisedprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5005 Sugartopping: 5003 Chocolatetopping: 5004 Maplename: Old Fashionedprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5003 Chocolatetopping: 5004 Maple Notice that although JSONConnector and XMLConnector have the same interfaces, what is returned by parsed_data() is not handled in a uniform way. Different codes must be used to work with each connector. Although it would be nice to be able to use the same code for all connectors, this is at most times not realistic unless we use some kind of common mapping for the data which is very often provided by external data providers. Assuming that you can use exactly the same code for handling the XML and JSON files, what changes are required to support a third format, for example, SQLite? Find an SQLite file or create your own and try it. As is now, the code does not forbid a direct instantiation of a connector. Is it possible to do this? Try doing it (hint: functions in Python can have nested classes). Summary To learn more about design patterns in depth, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Learning Python Design Patterns (https://www.packtpub.com/application-development/learning-python-design-patterns) Learning Python Design Patterns – Second Edition (https://www.packtpub.com/application-development/learning-python-design-patterns-second-edition) Resources for Article:   Further resources on this subject: Recommending Movies at Scale (Python) [article] An In-depth Look at Ansible Plugins [article] Elucidating the Game-changing Phenomenon of the Docker-inspired Containerization Paradigm [article]
Read more
  • 0
  • 0
  • 30615

article-image-regular-expressions-awk-programming
Pavan Ramchandani
18 May 2018
8 min read
Save for later

Regular expressions in AWK programming: What, Why, and How

Pavan Ramchandani
18 May 2018
8 min read
AWK is a pattern-matching language. It searches for a pattern in a file and, upon finding the corresponding match, it performs the file's action on the input line. This pattern could consist of fixed strings or a pattern of text. This variable content or pattern is generally searched with the help of regular expressions. Hence, regular expressions form an important part of AWK programming language. Today we will introduce you to the regular expressions in AWK programming and will get started with string-matching patterns and basic constructs to use with AWK. This article is an excerpt from a book written by Shiwang Kalkhanda, titled Learning AWK Programming. What is a regular expression? A regular expression, or regexpr, is a set of characters used to describe a pattern. A regular expression is generally used to match lines in a file that contain a particular pattern. Many Unix utilities operate on plain text files line by line, such as grep, sed, and awk. Regular expressions search for a pattern on a single line in a file. A regular expression doesn't search for a pattern that begins on one line and ends on another. Other programming languages may support this, notably Perl. Why use regular expressions? Generally, all editors have the ability to perform search-and-replace operations. Some editors can only search for patterns, others can also replace them, and others can also print the line containing that pattern. A regular expression goes many steps beyond this simple search, replace, and printing functionality, and hence it is more powerful and flexible. We can search for a word of a certain size, such as a word that has four characters or numbers. We can search for a word that ends with a particular character, let's say e. You can search for phone numbers, email IDs, and so on, and can also perform validation using regular expressions. They simplify complex pattern-matching tasks and hence form an important part of AWK programming. Other regular expression variations also exist, notably those for Perl. Using regular expressions with AWK There are mainly two types of regular expressions in Linux: Basic regular expressions that are used by vi, sed, grep, and so on Extended regular expressions that are used by awk, nawk, gawk, and egrep Here, we will refer to extended regular expressions as regular expressions in the context of AWK. In AWK, regular expressions are enclosed in forward slashes, '/', (forming the AWK pattern) and match every input record whose text belongs to that set. The simplest regular expression is a string of letters, numbers, or both that matches itself. For example, here we use the ly regular expression string to print all lines that contain the ly pattern in them. We just need to enclose the regular expression in forward slashes in AWK: $ awk '/ly/' emp.dat The output on execution of this code is as follows: Billy Chabra 9911664321 bily@yahoo.com M lgs 1900 Emily Kaur 8826175812 emily@gmail.com F Ops 2100 In this example, the /ly/ pattern matches when the current input line contains the ly sub-string, either as ly itself or as some part of a bigger word, such as Billy or Emily, and prints the corresponding line. Regular expressions as string-matching patterns with AWK Regular expressions are used as string-matching patterns with AWK in the following three ways. We use the '~' and '! ~' match operators to perform regular expression comparisons: /regexpr/: This matches when the current input line contains a sub-string matched by regexpr. It is the most basic regular expression, which matches itself as a string or sub-string. For example, /mail/ matches only when the current input line contains the mail string as a string, a sub-string, or both. So, we will get lines with Gmail as well as Hotmail in the email ID field of the employee database as follows: $ awk '/mail/' emp.dat The output on execution of this code is as follows: Jack Singh 9857532312 jack@gmail.com M hr 2000 Jane Kaur 9837432312 jane@gmail.com F hr 1800 Eva Chabra 8827232115 eva@gmail.com F lgs 2100 Ana Khanna 9856422312 anak@hotmail.com F Ops 2700 Victor Sharma 8826567898 vics@hotmail.com M Ops 2500 John Kapur 9911556789 john@gmail.com M hr 2200 Sam khanna 8856345512 sam@hotmail.com F lgs 2300 Emily Kaur 8826175812 emily@gmail.com F Ops 2100 Amy Sharma 9857536898 amys@hotmail.com F Ops 2500 In this example, we do not specify any expression, hence it automatically matches a whole line, as follows: $ awk '$0 ~ /mail/' emp.dat The output on execution of this code is as follows: Jack Singh 9857532312 jack@gmail.com M hr 2000 Jane Kaur 9837432312 jane@gmail.com F hr 1800 Eva Chabra 8827232115 eva@gmail.com F lgs 2100 Ana Khanna 9856422312 anak@hotmail.com F Ops 2700 Victor Sharma 8826567898 vics@hotmail.com M Ops 2500 John Kapur 9911556789 john@gmail.com M hr 2200 Sam khanna 8856345512 sam@hotmail.com F lgs 2300 Emily Kaur 8826175812 emily@gmail.com F Ops 2100 Amy Sharma 9857536898 amys@hotmail.com F Ops 2500 expression ~ /regexpr /: This matches if the string value of the expression contains a sub-string matched by regexpr. Generally, this left-hand operand of the matching operator is a field. For example, in the following command, we print all the lines in which the value in the second field contains a /Singh/ string: $ awk '$2 ~ /Singh/{ print }' emp.dat We can also use the expression as follows: $ awk '{ if($2 ~ /Singh/) print}' emp.dat The output on execution of the preceding code is as follows: Jack Singh 9857532312 jack@gmail.com M hr 2000 Hari Singh 8827255666 hari@yahoo.com M Ops 2350 Ginny Singh 9857123466 ginny@yahoo.com F hr 2250 Vina Singh 8811776612 vina@yahoo.com F lgs 2300 expression !~ /regexpr /: This matches if the string value of the expression does not contain a sub-string matched by regexpr. Generally, this expression is also a field variable. For example, in the following example, we print all the lines that don't contain the Singh sub-string in the second field, as follows: $ awk '$2 !~ /Singh/{ print }' emp.dat The output on execution of the preceding code is as follows: Jane Kaur 9837432312 jane@gmail.com F hr 1800 Eva Chabra 8827232115 eva@gmail.com F lgs 2100 Amit Sharma 9911887766 amit@yahoo.com M lgs 2350 Julie Kapur 8826234556 julie@yahoo.com F Ops 2500 Ana Khanna 9856422312 anak@hotmail.com F Ops 2700 Victor Sharma 8826567898 vics@hotmail.com M Ops 2500 John Kapur 9911556789 john@gmail.com M hr 2200 Billy Chabra 9911664321 bily@yahoo.com M lgs 1900 Sam khanna 8856345512 sam@hotmail.com F lgs 2300 Emily Kaur 8826175812 emily@gmail.com F Ops 2100 Amy Sharma 9857536898 amys@hotmail.com F Ops 2500 Any expression may be used in place of /regexpr/ in the context of ~; and !~. The expression here could also be if, while, for, and do statements. Basic regular expression construct Regular expressions are made up of two types of characters: normal text characters, called literals, and special characters, such as the asterisk (*, +, ?, .), called metacharacters. There are times when you want to match a metacharacter as a literal character. In such cases, we prefix that metacharacter with a backslash (), which is called an escape sequence. The basic regular expression construct can be summarized as follows: Here is the list of metacharacters, also known as special characters, that are used in building regular expressions:     ^    $    .    [    ]    |    (    )    *    +    ? The following table lists the remaining elements that are used in building a basic regular expression, apart from the metacharacters mentioned before: Literal A literal character (non-metacharacter ), such as A, that matches itself. Escape sequence An escape sequence that matches a special symbol: for example t matches tab. Quoted metacharacter () In quoted metacharacters, we prefix metacharacter with a backslash, such as $ that matches the metacharacter literally. Anchor (^) Matches the beginning of a string. Anchor ($) Matches the end of a string. Dot (.) Matches any single character. Character classes (...) A character class [ABC] matches any one of the A, B, or C characters. Character classes may include abbreviations, such as [A-Za-z]. They match any single letter. Complemented character classes Complemented character classes [^0-9] match any character except a digit. These operators combine regular expressions into larger ones: Alternation (|) A|B matches A or B. Concatenation AB matches A immediately followed by B. Closure (*) A* matches zero or more As. Positive closure (+) A+ matches one or more As. Zero or one (?) A? matches the null string or A. Parentheses () Used for grouping regular expressions and back-referencing. Like regular expressions, (r) can be accessed using n digit in future. Do check out the book Learning AWK Programming to learn more about the intricacies of AWK programming language for text processing. Read More What is the difference between functional and object-oriented programming? What makes a programming language simple or complex?
Read more
  • 0
  • 0
  • 30605
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-firefox-nightly-browser-debugging-your-app-is-now-fun-with-mozillas-new-time-travel-feature
Natasha Mathur
30 Jul 2018
3 min read
Save for later

Firefox Nightly browser: Debugging your app is now fun with Mozilla’s new ‘time travel’ feature

Natasha Mathur
30 Jul 2018
3 min read
Earlier this month, Mozilla announced a fancy new feature called “Time Travel debugging” for its Firefox Nightly web browser at the JSConf EU 2018.  With time travel debugging, you can easily track the bugs in your code or app as it lets you pause and rewind to the exact time when your app broke down. Time travel debugging technology is particularly useful for local web development where it allows you to pause and step forward or backward, pause and rewind to a previous state, rewind to the time a console message was logged and rewind to the time where an element had a certain style. It is also great for times where you might want to save user recordings or view a test recording when the testing fails. With time travel debugging, you can record a tab on your browser and later replay it using WebReplay, an experimental project which allows you to record, rewind and replay the processes for the web. According to Jason Laster, a Senior Software Engineer at Mozilla,“ with time travel, we have a full recording of time, you can jump to any point in the path and see it immediately, you don’t have to refresh or re-click or pause or look at logs”. Here’s a video of Jason Laster talking about the potential of time travel debugging. JSConf  He also mentioned how time travel is “not a new thing” and he was inspired by Dan Abramov, creator of Redux when he showcased Redux at JSConfEU saying how he wanted “time travel” to “reduce his action over time”. With Redux, you get a slider that shows you all the actions over time and as you’re moving, you get to see the UI update as well. In fact, Mozilla rebuilt the debugger in order to use React and redux for its time travel feature. Their debugger comes equipped with Redux dev tools, which shows a list of all the actions for the debugger. So, the dev tools show you the state of the app, sources, and the pause data. Finally, Laster added how “this is just the beginning” and that “they hope to pull this off well in the future”. To use this new time travel debugging feature, you must install the Firefox Nightly browser first. For more details on the new feature, check out the official documentation. Mozilla is building a bridge between Rust and JavaScript Firefox has made a password manager for your iPhone Firefox 61 builds on Firefox Quantum, adds Tab Warming, WebExtensions, and TLS 1.3  
Read more
  • 0
  • 0
  • 30438

article-image-gophercon-2019-go-2-update-open-source-go-library-for-gui-support-for-webassembly-tinygo-for-microcontrollers-and-more
Fatema Patrawala
30 Jul 2019
9 min read
Save for later

GopherCon 2019: Go 2 update, open-source Go library for GUI, support for WebAssembly, TinyGo for microcontrollers and more

Fatema Patrawala
30 Jul 2019
9 min read
Last week Go programmers had a gala time learning, networking and programming at the Marriott Marquis San Diego Marina as the most awaited event GopherCon 2019 was held starting from 24th July till 27th July. GopherCon this year hit the road at San Diego with some exceptional conferences, and many exciting announcements for more than 1800 attendees from around the world. One of the attendees, Andrea Santillana Fernández, says the Go Community is growing, and doing quite well. She wrote on her blog post on the Source graph website that there are 1 million Go programmers around the world and month on month its membership keeps increasing. Indeed there is a significant growth in the Go community, so what did it have in store for the programmers at this year’s GopherCon 2019: On the road to Go 2 The major milestones for the journey to Go 2 were presented by Russ Coxx on Wednesday last week. He explained the main areas of focus for Go 2, which are as below: Error handling Russ notes that writing a program correctly without errors is hard. But writing a program correctly accounting for errors and external dependencies is much more difficult. He listed down a few errors which led in introducing error handling helpers like an optional Unwrap interface, errors.Is and errors.As in Go 1.13 version. Generics Russ spoke about Generics and said that they started exploring a new design since last year. They are working with programming language theory experts on the problem to help refine the proposal of generics code in Go. In a separate session, Ian Lance Taylor, introduced generics codes in Go. He briefly explained the need, implementation and benefits from generics for the Go language. Next, Taylor reviewed the Go contract design draft which included the addition of optional type parameters to types and functions. Taylor defined generics as “Generic programming which enables the representation of functions and data structures in a generic form, with types factored out.” Generic code is written using types, which are specified later. An unspecified type is called as type parameter. A type parameter offers support only when permitted by contracts. A generic code imparts strong basis for sharing codes and building programs. It can be compiled using an interface-based approach which optimizes time as the package is compiled only once. If a generic code is compiled multiple times, it can carry compile time cost. Ian showed a few sample codes written in Generics in Go. Dependency management In Go 2 the team wants to focus on Dependency management and explicitly refer to dependencies similar to Java. Russ explained this by giving a history of how in 2011 they introduced GOPATH to separate the distribution from the actual dependencies so that users could run multiple different distributions and to separate the concerns of the distribution from the external libraries. Then in 2015, they introduced the go vendor spec to formalize the vendor directory and simplify dependency management implementations. But in practice it did not work well. In 2016, they formed the dependency working group. This team started work on dep: a tool to reshape all the existing tools into one.The problem with dep and the vendor directory was multiple distinct incompatible versions of a dependency were represented by one import path. It is now called as the "Import Compatibility Rule". The team took what worked well and learned from VGo. VGo provides package uniqueness without breaking builds. VGo dictates different import paths for incompatible package versions. The team grouped similar packages and gave these groups a name: Modules. The VGo system is now go modules. It now integrates directly with the Go command. The challenge presented going forward is mostly around updating everything to use modules. Everything needs to be updated to work with the new conventions to work well. Tooling Finally, as a result of all these changes, they distilled and refined the Go toolchain. One of the examples of this is gopls or "Go Please". Gopls aims to create a smoother, standard interface to integrate with all editors, IDEs, continuous integration and others. Simple, portable and efficient graphical interfaces in Go Elias Naur presented Gio, a new open source Go library for writing immediate mode GUI programs that run on all the major platforms: Android, iOS/tvOS, macOS, Linux, Windows. The talk covered Gio's unusual design and how it achieves simplicity, portability and performance. Elias said, “I wanted to be able to write a GUI program in GO that I could implement only once and have it work on every platform. This, to me, is the most interesting feature of Gio.” https://twitter.com/rakyll/status/1154450455214190593 Elias also presented Scatter which is a Gio program for end-to-end encrypted messaging over email. Other features of Gio include: Immediate mode design UI state owned by program Only depends on lowest-level platform libraries Minimal dependency tree to keep things low level as possible GPU accelerated vector and text rendering It’s super efficient No garbage generated in drawing or layout code Cross platform (macOS, Linux, Windows, Android, iOS, tvOS, Webassembly) Core is 100% Go while OS-specific native interfaces are optional Gopls, new tool serves as a backend for Go editor Rebecca Stambler, mentioned in her presentation that the Go community has built many amazing tools to improve the Go developer experience. However, when a maintainer disappears or a new Go release wreaks havoc, the Go development experience becomes frustrating and complicated. To solve this issue, Rebecca revealed the details behind a new tool: gopls (pronounced as 'go please'). The tool is currently in development by the Go team and community, and it will ultimately serve as the backend for your Go editor. Below listed functionalities are expected from gopls: Show me errors, like unused variables or typos autocomplete would be nice function signature help, because we often forget While we're at it, hover-accessible "tooltip" documentation in general Help me jump to a variable that is needed to see An outline of package structure Get started with WebAssembly in Go WebAssembly in Go is here and ready to try! Although the landscape is evolving quickly, the opportunity is huge. The ability to deliver truly portable system binaries could potentially replace JavaScript in the browser. WebAssembly has the potential to finally realize the goal of being platform agnostic without having to rely on a JVM. In a session by Johan Brandhorst who introduces the technology, shows how to get started with WebAssembly and Go, discusses what is possible today and what will be possible tomorrow. As of Go 1.13, there is experimental support for WebAssembly using the JavaScript interface but as it is only experimental, using it in production is not recommended. Support for the WASI interface is not currently available but has been planned and may be available as early as in Go 1.14. Better x86 assembly generation from Go Michael McLoughlin in his presentation made the case for code generation techniques for writing x86 assembly from Go. Michael introduced assembly, assembly in Go, the use cases for when you would want to drop into assembly, and techniques for realizing speedups using assembly. He pointed out that most of the time, pure Go will be enough for 97% of programs, but there are those 3% of cases where it is warranted, and the examples he brought up were crypto, syscalls, and scientific computing. Michael then introduced a package called avo which makes high-performance Go assembly easier to write. He said that writing your assembly in Go will allow you to realize the benefits of a high level language such as code readability, the ability to create loops, variables, and functions, and parameterized code generation all while still realizing the benefits of writing assembly. Michael concluded the talk with his ideas for the future of avo. Use avo in projects specifically in large crypto implementations. More architecture support Possibly make avo an assembler itself (these kinds of techniques are used in JIT compilers) avo based libraries (avo/std/math/big, avo/std/crypto) The audience appreciated this talk on Twitter. https://twitter.com/darethas/status/1155336268076576768 The presentation slides for this are available on the blog. Miniature version of Golang, TinyGo for microcontrollers Ron Evans, creator of GoCV, GoBot and "technologist for hire" introduced TinyGo that can run directly on microcontrollers like Arduino and more. TinyGo uses the LLVM compiler toolchain to create native code that can run directly even on the smallest of computing devices. Ron demonstrated how Go code can be run on embedded systems using TinyGo, a compiler intended for use in microcontrollers, WebAssembly (WASM), and command-line tools. Evans began his presentation by countering the idea that Go, while fast, produces executables too large to run on the smallest computers. While that may be true of the standard Go compiler, TinyGo produces much smaller outputs. For example: "Hello World" program compiled using Go 1.12 => 1.1 MB Same program compiled using TinyGo 0.7.0 => 12 KB TinyGo currently lacks support for the full Go language and Go standard library. For example, TinyGo does not have support for the net package, although contributors have created implementations of interfaces that work with the WiFi chip built into Arduino chips. Support for Go Routines is also limited, although simple programs usually work. Evans demonstrated that despite some limitations, thanks to TinyGo, the Go language can still be run in embedded systems. Salvador Evans, son of Ron Evans, assisted him for this demonstration. At age 11, he has become the youngest GopherCon speaker so far. https://twitter.com/erikstmartin/status/1155223328329625600 There were talks by other speakers on topics like, improvements in VSCode for Golang, the first open source Golang interpreter with complete support of the language spec, Athens Project which is a proxy server in Go and how mobile development works in Go. https://twitter.com/ramyanexus/status/1155238591120805888 https://twitter.com/containous/status/1155191121938649091 https://twitter.com/hajimehoshi/status/1155184796386988035 Apart from these there were a whole lot of other talks which happened at the GopherCon 2019. There were live blogs posted by the attendees on various talks and till now more than 25 blogs are posted by the attendees on the Sourcegraph website. The Go team shares new proposals planned to be implemented in Go 1.13 and 1.14 Go introduces generic codes and a new contract draft design at GopherCon 2019 Is Golang truly community driven and does it really matter?  
Read more
  • 0
  • 0
  • 30379

article-image-gain-practical-expertise-latest-edition-software-architecture-with-c-sharp9-dotnet5
Expert Network
08 Jul 2021
3 min read
Save for later

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5 

Expert Network
08 Jul 2021
3 min read
Software architecture is one of the most discussed topics in the software industry today, and its importance will certainly grow more in the future. But the speed at which new features are added to these software solutions keeps increasing, and new architectural opportunities keep emerging. To strengthen your command on this, Packt brings to you the Second Edition of Software Architecture with C# 9 and .NET 5 by Gabriel Baptista and Francesco Abbruzzese – a fully revised and expanded guide, featuring the latest features of .NET 5 and C# 9.  This book covers the most common design patterns and frameworks involved in modern cloud-based and distributed software architectures. It discusses when and how to use each pattern, by providing you with practical real-world scenarios. This book also presents techniques and processes such as DevOps, microservices, Kubernetes, continuous integration, and cloud computing, so that you can have a best-in-class software solution developed and delivered for your customers.   This book will help you to understand the product that your customer wants from you. It will guide you to deliver and solve the biggest problems you can face during development. It also covers the do's and don'ts that you need to follow when you manage your application in a cloud-based environment. You will learn about different architectural approaches, such as layered architectures, service-oriented architecture, microservices, Single Page Applications, and cloud architecture, and understand how to apply them to specific business requirements.   Finally, you will deploy code in remote environments or on the cloud using Azure. All the concepts in this book will be explained with the help of real-world practical use cases where design principles make the difference when creating safe and robust applications. By the end of the book, you will be able to develop and deliver highly scalable and secure enterprise-ready applications that meet the end customers' business needs.   It is worth mentioning that Software Architecture with C# 9 and .NET 5, Second Edition will not only cover the best practices that a software architect should follow for developing C# and .NET Core solutions, but it will also discuss all the environments that we need to master in order to develop a software product according to the latest trends.   This second edition is improved in code, and adapted to the new opportunities offered by C# 9 and .Net 5. We added all new frameworks and technologies such as gRPC, and Blazor, and described Kubernetes in more detail in a dedicated chapter.   To get the most out of this book, understand it as a guidance that you may want to revisit many times for different circumstances. Do not forget to have Visual Studio Community 2019 or higher installed and be sure that you understand C# .NET principles.
Read more
  • 0
  • 0
  • 29869

article-image-2019-stack-overflow-survey-quick-overview
Sugandha Lahoti
10 Apr 2019
5 min read
Save for later

2019 Stack Overflow survey: A quick overview

Sugandha Lahoti
10 Apr 2019
5 min read
The results of the 2019 Stack Overflow survey have just been published: 90,000 developers took the 20-minute survey this year. The survey shed light on some very interesting insights – from the developers’ preferred language for programming, to the development platform they hate the most, to the blockers to developer productivity. As the survey is quite detailed and comprehensive, here’s a quick look at the most important takeaways. Key highlights from the Stack Overflow Survey Programming languages Python again emerged as the fastest-growing programming language, a close second behind Rust. Interestingly, Python and Typescript achieved the same votes with almost 73% respondents saying it was their most loved language. Python was the most voted language developers wanted to learn next and JavaScript remains the most used programming language. The most dreaded languages were VBA and Objective C. Source: Stack Overflow Frameworks and databases in the Stack Overflow survey Developers preferred using React.js and Vue.js web frameworks while dreaded Drupal and jQuery. Redis was voted as the most loved database and MongoDB as the most wanted database. MongoDB’s inclusion in the list is surprising considering its controversial Server Side Public License. Over the last few months, Red Hat dropped support for MongoDB over this license, so did GNU Health Federation. Both of these organizations choose PostgreSQL over MongoDB, which is one of the reasons probably why PostgreSQL was the second most loved and wanted database of Stack Overflow Survey 2019. Source: Stack Overflow It’s interesting to see WebAssembly making its way in the popular technology segment as well as one of the top paying technologies. Respondents who use Clojure, F#, Elixir, and Rust earned the highest salaries Stackoverflow also did a new segment this year called "Blockchain in the real world" which gives insight into the adoption of Blockchain. Most respondents (80%) on the survey said that their organizations are not using or implementing blockchain technology. Source: Stack Overflow Developer lifestyles and learning About 80% of our respondents say that they code as a hobby outside of work and over half of respondents had written their first line of code by the time they were sixteen, although this experience varies by country and by gender. For instance, women wrote their first code later than men and non-binary respondents wrote code earlier than men. About one-quarter of respondents are enrolled in a formal college or university program full-time or part-time. Of professional developers who studied at the university level, over 60% said they majored in computer science, computer engineering, or software engineering. DevOps specialists and site reliability engineers are among the highest paid, most experienced developers most satisfied with their jobs, and are looking for new jobs at the lowest levels. The survey also noted that developers who are system admins or DevOps specialists are 25-30 times more likely to be men than women. Chinese developers are the most optimistic about the future while developers in Western European countries like France and Germany are among the least optimistic. Developers also overwhelmingly believe that Elon Musk will be the most influential person in tech in 2019. With more than 30,000 people responding to a free text question asking them who they think will be the most influential person this year, an amazing 30% named Tesla CEO Musk. For perspective, Jeff Bezos was in second place, being named by ‘only’ 7.2% of respondents. Although, this year the US survey respondents proportion of women, went up from 9% to 11%, it’s still a slow growth and points to problems with inclusion in the tech industry in general and on Stack Overflow in particular. When thinking about blockers to productivity, different kinds of developers report different challenges. Men are more likely to say that being tasked with non-development work is a problem for them, while gender minority respondents are more likely to say that toxic work environments are a problem. Stack Overflow survey demographics and diversity challenges This report is based on a survey of 88,883 software developers from 179 countries around the world. It was conducted between January 23 to February 14 and the median time spent on the survey for qualified responses was 23.3 minutes. The majority of survey respondents this year were people who said they are professional developers or who code sometimes as part of their work, or are students preparing for such a career. Majority of them were from the US, India, China and Europe. Stack Overflow acknowledged that their results did not represent racial disparities evenly and people of color continue to be underrepresented among developers. This year nearly 71% of respondents continued to be of White or European descent, a slight improvement from last year (74%). The survey notes that, “In the United States this year, 22% of respondents are people of color; last year 19% of United States respondents were people of color.” This clearly signifies that a lot of work is still needed to be done particularly for people of color, women, and underrepresented groups. Although, last year in August, Stack Overflow revamped its Code of Conduct to include more virtues around kindness, collaboration, and mutual respect. It also updated  its developers salary calculator to include 8 new countries. Go through the full report to learn more about developer salaries, job priorities, career values, the best music to listen to while coding, and more. Developers believe Elon Musk will be the most influential person in tech in 2019, according to Stack Overflow survey results Creators of Python, Java, C#, and Perl discuss the evolution and future of programming language design at PuPPy Stack Overflow is looking for a new CEO as Joel Spolsky becomes Chairman
Read more
  • 0
  • 0
  • 29753
Packt
16 Feb 2016
11 min read
Save for later

Python Design Patterns in Depth – The Observer Pattern

Packt
16 Feb 2016
11 min read
In this atricle you will see a group of objects when the state of another object changes. A very popular example lies in the Model-View-Controller (MVC) pattern. Assume that we are using the data of the same model in two views, for instance in a pie chart and in a spreadsheet. Whenever the model is modified, both the views need to be updated. That's the role of the Observer design pattern [Eckel08, page 213]. The Observer pattern describes a publish-subscribe relationship between a single object, : the publisher, which is also known as the subject or observable, and one or more objects, : the subscribers, also known as observers. In the MVC example, the publisher is the model and the subscribers are the views. However, MVC is not the only publish-subscribe example. Subscribing to a news feed such as RSS or Atom is another example. Many readers can subscribe to the feed typically using a feed reader, and every time a new item is added, they receive the update automatically. The ideas behind Observer are the same as the ideas behind MVC and the separation of concerns principle, that is, to increase decoupling between the publisher and subscribers, and to make it easy to add/remove subscribers at runtime. Additionally, the publisher is not concerned about who its observers are. It just sends notifications to all the subscribers [GOF95, page 327]. (For more resources related to this topic, see here.) A real-life example In reality, an auction resembles Observer. Every auction bidder has a number paddle that is raised whenever they want to place a bid. Whenever the paddle is raised by a bidder, the auctioneer acts as the subject by updating the price of the bid and broadcasting the new price to all bidders (subscribers). The following figure, courtesy of www.sourcemaking.com, [j.mp/observerpat], shows how the Observer pattern relates to an auction: A software example The django-observer package [j.mp/django-obs] is a third-party Django package that can be used to register callback functions that are executed when there are changes in several Django fields. Many different types of fields are supported (CharField, IntegerField, and so forth). RabbitMQ is a library that can be used to add asynchronous messaging support to an application. Several messaging protocols are supported, such as HTTP and AMQP. RabbitMQ can be used in a Python application to implement a publish-subscribe pattern, which is nothing more than the Observer design pattern [j.mp/rabbitmqobs]. Use cases We generally use the Observer pattern when we want to inform/update one or more objects (observers/subscribers) about a change that happened to another object (subject/publisher/observable). The number of observers as well as who the observers are may vary and can be changed dynamically (at runtime). We can think of many cases where Observer can be useful. Whether it is RSS, Atom, or another format, the idea is the same; you follow a feed, and every time it is updated, you receive a notification about the update [Zlobin13, page 60]. The same concept exists in social networking. If you are connected to another person using a social networking service, and your connection updates something, you are notified about it. It doesn't matter if the connection is a Twitter user that you follow, a real friend on Facebook, or a business colleague on LinkedIn. Event-driven systems are another example where Observer can be (and usually is) used. In such systems, listeners are used to "listen" for specific events. The listeners are triggered when an event they are listening to is created. This can be typing a specific key (of the keyboard), moving the mouse, and more. The event plays the role of the publisher and the listeners play the role of the observers. The key point in this case is that multiple listeners (observers) can be attached to a single event (publisher) [j.mp/magobs]. Implementation We will implement a data formatter. The ideas described here are based on the ActiveState Python Observer code recipe [j.mp/pythonobs]. There is a default formatter that shows a value in the decimal format. However, we can add/register more formatters. In this example, we will add a hex and binary formatter. Every time the value of the default formatter is updated, the registered formatters are notified and take action. In this case, the action is to show the new value in the relevant format. Observer is actually one of the patterns where inheritance makes sense. We can have a base Publisher class that contains the common functionality of adding, removing, and notifying observers. Our DefaultFormatter class derives from Publisher and adds the formatter-specific functionality. We can dynamically add and remove observers on demand. The following class diagram shows an instance of the example using two observers: HexFormatter and BinaryFormatter. Note that, because class diagrams are static, they cannot show the whole lifetime of a system, only the state of it at a specific point in time. We begin with the Publisher class. The observers are kept in the observers list. The add() method registers a new observer, or throws an error if it already exists. The remove() method unregisters an existing observer, or throws an exception if it does not exist. Finally, the notify() method informs all observers about a change: class Publisher: def __init__(self): self.observers = [] def add(self, observer): if observer not in self.observers: self.observers.append(observer) else: print('Failed to add: {}'.format(observer)) def remove(self, observer): try: self.observers.remove(observer) except ValueError: print('Failed to remove: {}'.format(observer)) def notify(self): [o.notify(self) for o in self.observers] Let's continue with the DefaultFormatter class. The first thing that __init__() does is call __init__() method of the base class, since this is not done automatically in Python. A DefaultFormatter instance has name to make it easier for us to track its status. We use name mangling in the _data variable to state that it should not be accessed directly. Note that this is always possible in Python [Lott14, page 54] but fellow developers have no excuse for doing so, since the code already states that they shouldn't. There is a serious reason for using name mangling in this case. Stay tuned. DefaultFormatter treats the _data variable as an integer, and the default value is zero: class DefaultFormatter(Publisher): def __init__(self, name): Publisher.__init__(self) self.name = name self._data = 0 The __str__() method returns information about the name of the publisher and the value of _data. type(self).__name__ is a handy trick to get the name of a class without hardcoding it. It is one of those things that make the code less readable but easier to maintain. It is up to you to decide if you like it or not: def __str__(self): return "{}: '{}' has data = {}".format(type(self).__name__, self.name, self._data) There are two data() methods. The first one uses the @property decorator to give read access to the _data variable. Using this, we can just execute object.data instead of object.data(): @property def data(self): return self._data The second data() method is more interesting. It uses the @setter decorator, which is called every time the assignment (=) operator is used to assign a new value to the _data variable. This method also tries to cast a new value to an integer, and does exception handling in case this operation fails: @data.setter def data(self, new_value): try: self._data = int(new_value) except ValueError as e: print('Error: {}'.format(e)) else: self.notify() The next step is to add the observers. The functionality of HexFormatter and BinaryFormatter is very similar. The only difference between them is how they format the value of data received by the publisher, that is, in hexadecimal and binary, respectively: class HexFormatter: def notify(self, publisher): print("{}: '{}' has now hex data = {}".format(type(self).__name__, publisher.name, hex(publisher.data))) class BinaryFormatter: def notify(self, publisher): print("{}: '{}' has now bin data = {}".format(type(self).__name__, publisher.name, bin(publisher.data))) No example is fun without some test data. The main() function initially creates a DefaultFormatter instance named test1 and afterwards attaches (and detaches) the two available observers. Exception handling is also exercised to make sure that the application does not crash when erroneous data is passed by the user. Moreover, things such as trying to add the same observer twice or removing an observer that does not exist should cause no crashes: def main(): df = DefaultFormatter('test1') print(df) print() hf = HexFormatter() df.add(hf) df.data = 3 print(df) print() bf = BinaryFormatter() df.add(bf) df.data = 21 print(df) print() df.remove(hf) df.data = 40 print(df) print() df.remove(hf) df.add(bf) df.data = 'hello' print(df) print() df.data = 15.8 print(df) Here's how the full code of the example (observer.py) looks: class Publisher: def __init__(self): self.observers = [] def add(self, observer): if observer not in self.observers: self.observers.append(observer) else: print('Failed to add: {}'.format(observer)) def remove(self, observer): try: self.observers.remove(observer) except ValueError: print('Failed to remove: {}'.format(observer)) def notify(self): [o.notify(self) for o in self.observers] class DefaultFormatter(Publisher): def __init__(self, name): Publisher.__init__(self) self.name = name self._data = 0 def __str__(self): return "{}: '{}' has data = {}".format(type(self).__name__, self.name, self._data) @property def data(self): return self._data @data.setter def data(self, new_value): try: self._data = int(new_value) except ValueError as e: print('Error: {}'.format(e)) else: self.notify() class HexFormatter: def notify(self, publisher): print("{}: '{}' has now hex data = {}".format(type(self).__name__, publisher.name, hex(publisher.data))) class BinaryFormatter: def notify(self, publisher): print("{}: '{}' has now bin data = {}".format(type(self).__name__, publisher.name, bin(publisher.data))) def main(): df = DefaultFormatter('test1') print(df) print() hf = HexFormatter() df.add(hf) df.data = 3 print(df) print() bf = BinaryFormatter() df.add(bf) df.data = 21 print(df) print() df.remove(hf) df.data = 40 print(df) print() df.remove(hf) df.add(bf) df.data = 'hello' print(df) print() df.data = 15.8 print(df) if __name__ == '__main__': main() Executing observer.py gives the following output: >>> python3 observer.py DefaultFormatter: 'test1' has data = 0 HexFormatter: 'test1' has now hex data = 0x3 DefaultFormatter: 'test1' has data = 3 HexFormatter: 'test1' has now hex data = 0x15 BinaryFormatter: 'test1' has now bin data = 0b10101 DefaultFormatter: 'test1' has data = 21 BinaryFormatter: 'test1' has now bin data = 0b101000 DefaultFormatter: 'test1' has data = 40 Failed to remove: <__main__.HexFormatter object at 0x7f30a2fb82e8> Failed to add: <__main__.BinaryFormatter object at 0x7f30a2fb8320> Error: invalid literal for int() with base 10: 'hello' BinaryFormatter: 'test1' has now bin data = 0b101000 DefaultFormatter: 'test1' has data = 40 BinaryFormatter: 'test1' has now bin data = 0b1111 DefaultFormatter: 'test1' has data = 15 What we see in the output is that as the extra observers are added, more (and relevant) output is shown, and when an observer is removed, it is not notified any longer. That's exactly what we want: runtime notifications that we are able to enable/disable on demand. The defensive programming part of the application also seems to work fine. Trying to do funny things such as removing an observer that does not exist or adding the same observer twice is not allowed. The messages shown are not very user-friendly but I leave that up to you as an exercise. Runtime failures of trying to pass a string when the API expects a number are also properly handled without causing the application to crash/terminate. This example would be much more interesting if it were interactive. Even a simple menu that allows the user to attach/detach observers at runtime and modify the value of DefaultFormatter would be nice because the runtime aspect becomes much more visible. Feel free to do it. Another nice exercise is to add more observers. For example, you can add an octal formatter, a roman numeral formatter, or any other observer that uses your favorite representation. Be creative and have fun! Summary In this article, we covered the Observer design pattern. We use Observer when we want to be able to inform/notify all stakeholders (an object or a group of objects) when the state of an object changes. An important feature of observer is that the number of subscribers/observers as well as who the subscribers are may vary and can be changed at runtime. You can refer more books on this topic mentioned as follows: Expert Python Programming: https://www.packtpub.com/application-development/expert-python-programming Learning Python Design Patterns: https://www.packtpub.com/application-development/learning-python-design-patterns Resources for Article: Further resources on this subject: Python Design Patterns in Depth: The Singleton Pattern[article] Python Design Patterns in Depth: The Factory Pattern[article] Automating Processes with ModelBuilder and Python[article]
Read more
  • 0
  • 0
  • 29706

article-image-creating-graph-application-python-neo4j-gephi-linkuriousjs
Greg Roberts
12 Oct 2015
13 min read
Save for later

Creating a graph application with Python, Neo4j, Gephi & Linkurious.js

Greg Roberts
12 Oct 2015
13 min read
I love Python, and to celebrate Packt Python week, I’ve spent some time developing an app using some of my favorite tools. The app is a graph visualization of Python and related topics, as well as showing where all our content fits in. The topics are all StackOverflow tags, related by their co-occurrence in questions on the site. The app is available to view at http://gregroberts.github.io/ and in this blog, I’m going to discuss some of the techniques I used to construct the underlying dataset, and how I turned it into an online application. Graphs, not charts Graphs are an incredibly powerful tool for analyzing and visualizing complex data. In recent years, many different graph database engines have been developed to make use of this novel manner of representing data. These databases offer many benefits over traditional, relational databases because of how the data is stored and accessed. Here at Packt, I use a Neo4j graph to store and analyze data about our business. Using the Cypher query language, it’s easy to express complicated relations between different nodes succinctly. It’s not just the technical aspect of graphs which make them appealing to work with. Seeing the connections between bits of data visualized explicitly as in a graph helps you to see the data in a different light, and make connections that you might not have spotted otherwise. This graph has many uses at Packt, from customer segmentation to product recommendations. In the next section, I describe the process I use to generate recommendations from the database. Make the connection For product recommendations, I use what’s known as a hybrid filter. This considers both content based filtering (product x and y are about the same topic) and collaborative filtering (people who bought x also bought y). Each of these methods has strengths and weaknesses, so combining them into one algorithm provides a more accurate signal. The collaborative aspect is straightforward to implement in Cypher. For a particular product, we want to find out which other product is most frequently bought alongside it. We have all our products and customers stored as nodes, and purchases are stored as edges. Thus, the Cypher query we want looks like this: MATCH (n:Product {title:’Learning Cypher’})-[r:purchased*2]-(m:Product) WITH m.title AS suggestion, count(distinct r)/(n.purchased+m.purchased) AS alsoBought WHERE m<>n RETURN* ORDER BY alsoBought DESC and will very efficiently return the most commonly also purchased product. When calculating the weight, we divide by the total units sold of both titles, so we get a proportion returned. We do this so we don’t just get the titles with the most units; we’re effectively calculating the size of the intersection of the two titles’ audiences relative to their overall audience size. The content side of the algorithm looks very similar: MATCH (n:Product {title:’Learning Cypher’})-[r:is_about*2]-(m:Product) WITH m.title AS suggestion, count(distinct r)/(length(n.topics)+length(m.topics)) AS alsoAbout WHERE m<>n RETURN * ORDER BY alsoAbout DESC Implicit in this algorithm is knowledge that a title is_about a topic of some kind. This could be done manually, but where’s the fun in that? In Packt’s domain there already exists a huge, well moderated corpus of technology concepts and their usage: StackOverflow. The tagging system on StackOverflow not only tells us about all the topics developers across the world are using, it also tells us how those topics are related, by looking at the co-occurrence of tags in questions. So in our graph, StackOverflow tags are nodes in their own right, which represent topics. These nodes are connected via edges, which are weighted to reflect their co-occurrence on StackOverflow: edge_weight(n,m) = (Number of questions tagged with both n & m)/(Number questions tagged with n or m) So, to find topics related to a given topic, we could execute a query like this: MATCH (n:StackOverflowTag {name:'Matplotlib'})-[r:related_to]-(m:StackOverflowTag) RETURN n.name, r.weight, m.name ORDER BY r.weight DESC LIMIT 10 Which would return the following: | n.name | r.weight | m.name ----+------------+----------+-------------------- 1 | Matplotlib | 0.065699 | Plot 2 | Matplotlib | 0.045678 | Numpy 3 | Matplotlib | 0.029667 | Pandas 4 | Matplotlib | 0.023623 | Python 5 | Matplotlib | 0.023051 | Scipy 6 | Matplotlib | 0.017413 | Histogram 7 | Matplotlib | 0.015618 | Ipython 8 | Matplotlib | 0.013761 | Matplotlib Basemap 9 | Matplotlib | 0.013207 | Python 2.7 10 | Matplotlib | 0.012982 | Legend There are many, more complex relationships you can define between topics like this, too. You can infer directionality in the relationship by looking at the local network, or you could start constructing Hyper graphs using the extensive StackExchange API. So we have our topics, but we still need to connect our content to topics. To do this, I’ve used a two stage process. Step 1 – Parsing out the topics We take all the copy (words) pertaining to a particular product as a document representing that product. This includes the title, chapter headings, and all the copy on the website. We use this because it’s already been optimized for search, and should thus carry a fair representation of what the title is about. We then parse this document and keep all the words which match the topics we’ve previously imported. #...code for fetching all the copy for all the products key_re = 'W(%s)W' % '|'.join(re.escape(i) for i in topic_keywords) for i in documents tags = re.findall(key_re, i[‘copy’]) i['tags'] = map(lambda x: tag_lookup[x],tags) Having done this for each product, we have a bag of words representing each product, where each word is a recognized topic. Step 2 – Finding the information From each of these documents, we want to know the topics which are most important for that document. To do this, we use the tf-idf algorithm. tf-idf stands for term frequency, inverse document frequency. The algorithm takes the number of times a term appears in a particular document, and divides it by the proportion of the documents that word appears in. The term frequency factor boosts terms which appear often in a document, whilst the inverse document frequency factor gets rid of terms which are overly common across the entire corpus (for example, the term ‘programming’ is common in our product copy, and whilst most of the documents ARE about programming, this doesn’t provide much discriminating information about each document). To do all of this, I use python (obviously) and the excellent scikit-learn library. Tf-idf is implemented in the class sklearn.feature_extraction.text.TfidfVectorizer. This class has lots of options you can fiddle with to get more informative results. import sklearn.feature_extraction.text as skt tagger = skt.TfidfVectorizer(input = 'content', encoding = 'utf-8', decode_error = 'replace', strip_accents = None, analyzer = lambda x: x, ngram_range = (1,1), max_df = 0.8, min_df = 0.0, norm = 'l2', sublinear_tf = False) It’s a good idea to use the min_df & max_df arguments of the constructor so as to cut out the most common/obtuse words, to get a more informative weighting. The ‘analyzer’ argument tells it how to get the words from each document, in our case, the documents are already lists of normalized words, so we don’t need anything additional done. #create vectors of all the documents vectors = tagger.fit_transform(map(lambda x: x['tags'],rows)).toarray() #get back the topic names to map to the graph t_map = tagger.get_feature_names() jobs = [] for ind, vec in enumerate(vectors): features = filter(lambda x: x[1]>0, zip(t_map,vec)) doc = documents[ind] for topic, weight in features: job = ‘’’MERGE (n:StackOverflowTag {name:’%s’}) MERGE (m:Product {id:’%s’}) CREATE UNIQUE (m)-[:is_about {source:’tf_idf’,weight:%d}]-(n) ’’’ % (topic, doc[‘id’], weight) jobs.append(job) We then execute all of the jobs using Py2neo’s Batch functionality. Having done all of this, we can now relate products to each other in terms of what topics they have in common: MATCH (n:Product {isbn10:'1783988363'})-[r:is_about]-(a)-[q:is_about]-(m:Product {isbn10:'1783289007'}) WITH a.name as topic, r.weight+q.weight AS weight RETURN topic ORDER BY weight DESC limit 6 Which returns: | topic ---+------------------ 1 | Machine Learning 2 | Image 3 | Models 4 | Algorithm 5 | Data 6 | Python Huzzah! I now have a graph into which I can throw any piece of content about programming or software, and it will fit nicely into the network of topics we’ve developed. Take a breath So, that’s how the graph came to be. To communicate with Neo4j from Python, I use the excellent py2neo module, developed by Nigel Small. This module has all sorts of handy abstractions to allow you to work with nodes and edges as native Python objects, and then update your Neo instance with any changes you’ve made. The graph I’ve spoken about is used for many purposes across the business, and has grown in size and scope significantly over the last year. For this project, I’ve taken from this graph everything relevant to Python. I started by getting all of our content which is_about Python, or about a topic related to python: titles = [i.n for i in graph.cypher.execute('''MATCH (n)-[r:is_about]-(m:StackOverflowTag {name:'Python'}) return distinct n''')] t2 = [i.n for i in graph.cypher.execute('''MATCH (n)-[r:is_about]-(m:StackOverflowTag)-[:related_to]-(m:StackOverflowTag {name:'Python'}) where has(n.name) return distinct n''')] titles.extend(t2) then hydrated this further by going one or two hops down each path in various directions, to get a large set of topics and content related to Python. Visualising the graph Since I started working with graphs, two visualisation tools I’ve always used are Gephi and Sigma.js. Gephi is a great solution for analysing and exploring graphical data, allowing you to apply a plethora of different layout options, find out more about the statistics of the network, and to filter and change how the graph is displayed. Sigma.js is a lightweight JavaScript library which allows you to publish beautiful graph visualizations in a browser, and it copes very well with even very large graphs. Gephi has a great plugin which allows you to export your graph straight into a web page which you can host, share and adapt. More recently, Linkurious have made it their mission to bring graph visualization to the masses. I highly advise trying the demo of their product. It really shows how much value it’s possible to get out of graph based data. Imagine if your Customer Relations team were able to do a single query to view the entire history of a case or customer, laid out as a beautiful graph, full of glyphs and annotations. Linkurious have built their product on top of Sigma.js, and they’ve made available much of the work they’ve done as the open source Linkurious.js. This is essentially Sigma.js, with a few changes to the API, and an even greater variety of plugins. On Github, each plugin has an API page in the wiki and a downloadable demo. It’s worth cloning the repository just to see the things it’s capable of! Publish It! So here’s the workflow I used to get the Python topic graph out of Neo4j and onto the web. Use Py2neo to graph the subgraph of content and topics pertinent to Python, as described above Add to this some other topics linked to the same books to give a fuller picture of the Python “world” Add in topic-topic edges and product-product edges to show the full breadth of connections observed in the data Export all the nodes and edges to csv files Import node and edge tables into Gephi. The reason I’m using Gephi as a middle step is so that I can fiddle with the visualisation in Gephi until it looks perfect. The layout plugin in Sigma is good, but this way the graph is presentable as soon as the page loads, the communities are much clearer, and I’m not putting undue strain on browsers across the world! The layout of the graph has been achieved using a number of plugins. Instead of using the pre-installed ForceAtlas layouts, I’ve used the OpenOrd layout, which I feel really shows off the communities of a large graph. There’s a really interesting and technical presentation about how this layout works here. Export the graph into gexf format, having applied some partition and ranking functions to make it more clear and appealing. Now it’s all down to Linkurious and its various plugins! You can explore the source code of the final page to see all the details, but here I’ll give an overview of the different plugins I’ve used for the different parts of the visualisation: First instantiate the graph object, pointing to a container (note the CSS of the container, without this, the graph won’t display properly: <style type="text/css"> #container { max-width: 1500px; height: 850px; margin: auto; background-color: #E5E5E5; } </style> … <div id="container"></div> … <script> s= new sigma({ container: 'container', renderer: { container: document.getElementById('container'), type: 'canvas' }, settings: { … } }); sigma.parsers.gexf - used for (trivially!) importing a gexf file into a sigma instance sigma.parsers.gexf( 'static/data/Graph1.gexf', s, function(s) { //callback executed once the data is loaded, use this to set up any aspects of the app which depend on the data }); sigma.plugins.filter - Adds the ability to very simply hide nodes/edges based on a callback function which returns a boolean. This powers the filtering widgets on the page. <input class="form-control" id="min-degree" type="range" min="0" max="0" value="0"> … function applyMinDegreeFilter(e) { var v = e.target.value; $('#min-degree-val').textContent = v; filter .undo('min-degree') .nodesBy( function(n, options) { return this.graph.degree(n.id) >= options.minDegreeVal; },{ minDegreeVal: +v }, 'min-degree' ) .apply(); }; $('#min-degree').change(applyMinDegreeFilter); sigma.plugins.locate - Adds the ability to zoom in on a single node or collection of nodes. Very useful if you’re filtering a very large initial graph function locateNode (nid) { if (nid == '') { locate.center(1); } else { locate.nodes(nid); } }; sigma.renderers.glyphs - Allows you to add custom glyphs to each node. Useful if you have many types of node. Outro This application has been a very fun little project to build. The improvements to Sigma wrought by Linkurious have resulted in an incredibly powerful toolkit to rapidly generate graph based applications with a great degree of flexibility and interaction potential. None of this would have been possible were it not for Python. Python is my right (left, I’m left handed) hand which I use for almost everything. Its versatility and expressiveness make it an incredibly robust Swiss army knife in any data-analysts toolkit.
Read more
  • 0
  • 0
  • 29669

article-image-the-openjdk-transition-things-to-know-and-do
Rich Sharples
28 Feb 2019
5 min read
Save for later

The OpenJDK Transition: Things to know and do

Rich Sharples
28 Feb 2019
5 min read
At this point, it should not be a surprise that a number of major changes were announced in the Java ecosystem, some of which have the potential to force a reassessment of Java roadmaps and even vendor selection for enterprise Java users. Some of the biggest changes are taking place in the Upstream OpenJDK (Open Java Development Kit), which means that users will need to have a backup plan in place for the transition. Read on to better understand exactly what the changes Oracle made to Oracle JDK are, why you should care about them and how to choose the best next steps for your organization. For some quick background, OpenJDK began as a free and open source implementation of the Java Platform, Standard Edition, through a 2006 Sun Microsystems initiative. They made the announcement at the 2006 JavaOne event that Java and the core Java platform would be open sourced. Over the next few years, major components of the Java Platform were released as free, open source software using the GPL. Other vendors - like Red Hat - then got involved with the project about a year later, through a broad contributor agreement which covers the participation of non-Sun engineers in the project. This piece is by Rich Sharples, Senior Director of Product Management at Red Hat , on what Oracle ending free lifecycle support means, and what steps users can take now to prepare for the change. So what is new in the world of Java and JDK? Why should you care? Okay, enough about the history. What are we anticipating for the future of Java and OpenJDK? At the end of January 2019, Oracle officially ended free public updates to Oracle JDK for non-Oracle customer commercial users. These users will no longer be able to get updates without an Oracle support contract. Additionally, Oracle has changed the Oracle JDK license (BCPL) so that commercial use for JDK 11 and beyond will require an Oracle subscription. They do also have a non-proprietary (GPLv2+CPE) distribution but the support policy around that is unclear at this time. This is a big deal because it means that users need to have strategies and plans in place to continue to get the support that Oracle had offered. You may be wondering whether it really is critical to run OpenJDK with support and if you can get away with running it without the support. The truth is that while the open source license and nature of OpenJDK means that you can technically run the software with no commercial support, it doesn’t mean that you necessarily should. There are too many risks associated with running OpenJDK unsupported, including numerous cases of critical vulnerabilities and security flaws. While there is nothing inherently insecure about open source software, when it is deployed without support or even a maintenance plan, it can open up your organization to threats that you may not be prepared to handle and resolve. Additionally, and probably the biggest reason why it is important to have commercial OpenJDK support, is that without a third party to help, you have to be entirely self-sufficient, meaning that you would have to ensure that you have specific engineering efforts that are permanently allocated to monitoring the upstream project and maintaining installations of OpenJDK. Not all organizations have the bandwidth or resources to be able to do this consistently. How can vendors help and what are other key technology considerations? Software vendors, including Red Hat, offer OpenJDK distributions and support to make sure that those who had been reliant on Oracle JDK have a seamless transition to take care of the above risks associated with not having OpenJDK support. It also allows customers and partners to focus on their business software, rather than needing to spend valuable engineering efforts on underlying Java runtimes. Additionally, Java SE 11 has introduced some significant new features, as well as deprecated others, and is the first significant update to the platform that will require users to think seriously about the impact of moving from it. With this, it is important to separate upgrading from Java SE 8 to Java SE 11 from moving from one vendor’s Java SE distribution to another vendor’s. In fact, moving from Java SE distributions - ones that are based in OpenJDK - without needing to change versions should be a fairly simple task. It is not recommended to change both the vendor and the version in one single step. Luckily also, the differences between Oracle JDK and OpenJDK are fairly minor and in fact, are slowly aligning to become more similar. Technology vendors can help in the transition that will inevitably come for OpenJDK - if it has not happened already. Having a proper regression test plan in place to ensure that applications run as they previously did, with a particular focus on performance and scalability is key and is something that vendors can help set up. Auditing and understanding if you need to make any code changes to applications is also a major undertaking that a third-party can help guide you on, that likely includes rulesets for the migrations. Finally, third-party vendors can help you deploy to the new OpenJDK solution and provide patches and security guidance as it comes up, to help keep the environment secure, updated and safe. While there are changes to Oracle JDK, once you find the alternative solution that is best suited for your organization, it should be fairly straightforward and cost-effective to make the transition, and third party vendors have the expertise, knowledge, and experience to help guide users through the OpenJDK transition. OpenJDK Project Valhalla is now in Phase III State of OpenJDK: Past, Present and Future with Oracle Mark Reinhold on the evolution of Java platform and OpenJDK
Read more
  • 0
  • 0
  • 29638
article-image-adding-postgis-layers-using-qgis-tutorial
Pravin Dhandre
26 Jul 2018
5 min read
Save for later

Adding PostGIS layers using QGIS [Tutorial]

Pravin Dhandre
26 Jul 2018
5 min read
Viewing tables as layers is great for creating maps or for simply working on a copy of the database outside the database. In this tutorial, we will establish a connection to our PostGIS database in order to add a table as a layer in QGIS (formerly known as Quantum GIS). Please navigate to the following site to install the latest version LTR of QGIS. On this page, click on Download Now and you will be able to choose a suitable operating system and the relevant settings. QGIS is available for Android, Linux, macOS X, and Windows. You might also be inclined to click on Discover QGIS to get an overview of basic information about the program along with features, screenshots, and case studies. This QGIS tutorial is an excerpt from a book written by Mayra Zurbaran,Pedro Wightman, Paolo Corti, Stephen Mather, Thomas Kraft and Bborie Park, titled PostGIS Cookbook - Second Edition. Loading Database... To begin, create the schema for this tutorial then, download data from the U.S. Census Bureau's FTP site: The shapefile is All Lines for Cuyahoga county in Ohio, which consist of roads and streams among other line features. Extract the ZIP file to your working directory and then load it into your database using shp2pgsql. Be sure to specify the spatial reference system, EPSG/SRID: 4269. When in doubt about using projections, use the service provided by the folks at OpenGeo at the following website: http://prj2epsg.org/search Use the following command to generate the SQL to load the shapefile: shp2pgsql -s 4269 -W LATIN1 -g the_geom -I tl_2012_39035_edges.shp chp11.tl_2012_39035_edges > tl_2012_39035_edges.sql How to do it... Now it's time to give the data we downloaded a look using QGIS. We must first create a connection to the database in order to access the table. Get connected and add the table as a layer by following the ensuing steps: Click on the Add PostGIS Layers icon: Click on the New button below the Connections drop-down menu. Create a new PostGIS connection. After the Add PostGIS Table(s) window opens, create a name for the connection and fill in a few parameters for your database, including Host, Port, Database, Username, and Password: Once you have entered all of the pertinent information for your database, click on the Test Connection button to verify that the connection is successful. If the connection is not successful, double-check for typos and errors. Additionally, make sure you are attempting to connect to a PostGIS-enabled database. If the connection is successful, go ahead and check the Save Username and Save Password checkboxes. This will prevent you from having to enter your login information multiple times throughout the exercise. Click on OK at the bottom of the menu to apply the connection settings. Now you can connect! Make sure the name of your PostGIS connection appears in the drop-down menu and then click on the Connect button. If you choose not to store your username and password, you will be asked to submit this information every time you try to access the database. Once connected, all schemas within the database will be shown and the tables will be made visible by expanding the target schema. Select the table(s) to be added as a layer by simply clicking on the table name or anywhere along its row. Selection(s) will be highlighted in blue. To deselect a table, click on it a second time and it will no longer be highlighted. Select the tl_2012_39035_edges table that was downloaded at the beginning of the tutorial and click on the Add button, as shown in the following screenshot: A subset of the table can also be added as a layer. This is accomplished by double-clicking on the desired table name. The Query Builder window will open, which aids in creating simple SQL WHERE clause statements. Add the roads by selecting the records where roadflg = Y. This can be done by typing a query or using the buttons within Query Builder: Click on the OK button followed by the Add button. A subset of the table is now loaded into QGIS as a layer. The layer is strictly a static, temporary copy of your database. You can make whatever changes you like to the layer and not affect the database table. The same holds true the other way around. Changes to the table in the database will have no effect on the layer in QGIS. If needed, you can save the temporary layer in a variety of formats, such as DXF, GeoJSON, KML, or SHP. Simply right-click on the layer name in the Layers panel and click on Save As. This will then create a file, which you can recall at a later time or share with others. The following screenshot shows the Cuyahoga county road network: You may also use the QGIS Browser Panel to navigate through the now connected PostGIS database and list the schemas and tables. This panel allows you to double-click to add spatial layers to the current project, providing a better user experience not only of connected databases, but on any directory of your machine: How it works... You have added a PostGIS layer into QGIS using the built-in Add PostGIS Table GUI. This was achieved by creating a new connection and entering your database parameters. Any number of database connections can be set up simultaneously. If working with multiple databases is more common for your workflows, saving all of the connections into one XML file and would save time and energy when returning to these projects in QGIS. To explore more on 3D capabilities of PostGIS, including LiDAR point clouds, read PostGIS Cookbook - Second Edition. Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data Top 7 libraries for geospatial analysis Learning R for Geospatial Analysis
Read more
  • 0
  • 0
  • 29598

article-image-introducing-weld-a-runtime-written-in-rust-and-llvm-for-cross-library-optimizations
Bhagyashree R
24 Sep 2019
5 min read
Save for later

Introducing Weld, a runtime written in Rust and LLVM for cross-library optimizations

Bhagyashree R
24 Sep 2019
5 min read
Weld is an open-source Rust project for improving the performance of data-intensive applications. It is an interface and runtime that can be integrated into existing frameworks including Spark, TensorFlow, Pandas, and NumPy without changing their user-facing APIs. The motivation behind Weld Data analytics applications today often require developers to combine various functions from different libraries and frameworks to accomplish a particular task. For instance, a typical Python ecosystem application selects some data using Spark SQL, transforms it using NumPy and Pandas, and trains a model with TensorFlow. This improves developers’ productivity as they are taking advantage of functions from high-quality libraries. However, these functions are usually optimized in isolation, which is not enough to achieve the best application performance. Weld aims to solve this problem by providing an interface and runtime that can optimize across data-intensive libraries and frameworks while preserving their user-facing APIs. In an interview with Federico Carrone, a Tech Lead at LambdaClass, Weld’s main contributor, Shoumik Palkar shared, “The motivation behind Weld is to provide bare-metal performance for applications that rely on existing high-level APIs such as NumPy and Pandas. The main problem it solves is enabling cross-function and cross-library optimizations that other libraries today don’t provide.” How Weld works Weld serves as a common runtime that allows libraries from different domains like SQL and machine learning to represent their computations in a common functional intermediate representation (IR). This IR is then optimized by a compiler optimizer and JIT’d to efficient machine code for diverse parallel hardware. It performs a wide range of optimizations on the IR including loop fusion, loop tiling, and vectorization. “Weld’s IR is natively parallel, so programs expressed in it can always be trivially parallelized,” said Palkar. When Weld was first introduced it was mainly used for cross-library optimizations. However, over time people have started to use it for other applications as well. It can be used to build JITs or new physical execution engines for databases or analytics frameworks, individual libraries, target new kinds of parallel hardware using the IR, and more. To evaluate Weld’s performance the team integrated it with popular data analytics frameworks including Spark, NumPy, and TensorFlow. This prototype showed up to 30x improvements over the native framework implementations. While cross library optimizations between Pandas and NumPy also improved performance by up to two orders of magnitude. Source: Weld Why Rust and LLVM were chosen for its implementation The first iteration of Weld was implemented in Scala because of its algebraic data types, powerful pattern matching, and large ecosystem. However, it did have some shortcomings. Palkar shared in the interview, “We moved away from Scala because it was too difficult to embed a JVM-based language into other runtimes and languages.” It had a managed runtime, clunky build system, and its JIT compilations were quite slow for larger programs. Because of these shortcomings the team wanted to redesign the JIT compiler, core API, and runtime from the ground up. They were in the search for a language that was fast, safe, didn’t have a managed runtime, provided a rich standard library, functional paradigms, good package manager, and great community background. So, they zeroed-in on Rust that happens to meet all these requirements. Rust provides a very minimal, no setup required runtime. It can be easily embedded into other languages such as Java and Python. To make development easier, it has high-quality packages, known as crates, and functional paradigms such as pattern matching. Lastly, it is backed by a great Rust Community. Read also: “Rust is the future of systems programming, C is the new Assembly”: Intel principal engineer, Josh Triplett Explaining the reason why they chose LLVM, Palkar said in the interview, “We chose LLVM because its an open-source compiler framework that has wide use and support; we generate LLVM directly instead of C/C++ so we don’t need to rely on the existence of a C compiler, and because it improves compilation times (we don’t need to parse C/C++ code).” In a discussion on Hacker News many users listed other Weld-like projects that developers may find useful. A user commented, “Also worth checking out OmniSci (formerly MapD), which features an LLVM query compiler to gain large speedups executing SQL on both CPU and GPU.” Users also talked about Numba, an open-source JIT compiler that translates Python functions to optimized machine code at runtime with the help of the LLVM compiler library.  “Very bizarre there is no discussion of numba here, which has been around and used widely for many years, achieves faster speedups than this, and also emits an LLVM IR that is likely a much better starting point for developing a “universal” scientific computing IR than doing yet another thing that further complicates it with fairly needless involvement of Rust,” a user added. To know more about Weld, check out the full interview on Medium. Also, watch this RustConf 2019 talk by Shoumik Palkar: https://www.youtube.com/watch?v=AZsgdCEQjFo&t Other news in Programming Darklang available in private beta GNU community announces ‘Parallel GCC’ for parallelism in real-world compilers TextMate 2.0, the text editor for macOS releases  
Read more
  • 0
  • 0
  • 29549
Modal Close icon
Modal Close icon