Programming | 0 articles | Tech News, Tutorials & Expert Insights

article-image-capability-model-microservices

17 Jun 2016

19 min read

A capability model for microservices

17 Jun 2016

0
0
31235

article-image-python-3-8-available-walrus-operator-positional-only-parameters-vectorcall

Sugandha Lahoti

15 Oct 2019

6 min read

Python 3.8 is now available with walrus operator, positional-only parameters support for Vectorcall, and more

Sugandha Lahoti

15 Oct 2019

6 min read

Yesterday, the latest version of the Python programming language, Python 3.8 was made available with multiple new improvements and features. Features include the new walrus operator and positional-only parameters, runtime audit Hooks, Vectorcall, a fast calling protocol for CPython and more. Earlier this month, the team behind Python announced the release of Python 3.8b2, the second of four planned beta releases. What’s new in Python 3.8 PEP 572: New walrus operator in assignment expressions Python 3.8 has a new walrus operator := that assigns values to variables as part of a larger expression. It is useful when matching regular expressions where match objects are needed twice. It can also be used with while-loops that compute a value to test loop termination and then need that same value again in the body of the loop. It can also be used in list comprehensions where a value computed in a filtering condition is also needed in the expression body. The walrus operator was proposed in PEP 572 (Assignment Expressions) by Chris Angelico, Tim Peters, and Guido van Rossum last year. Since then it has been heavily discussed in the Python community with many questioning whether it is a needed improvement. Others are excited as the operator does make the code more readable. One user commented on HN, “The "walrus operator" will occasionally be useful, but I doubt I will find many effective uses for it. Same with the forced positional/keyword arguments and the "self-documenting" f-string expressions. Even when they have a use, it's usually just to save one line of code or a few extra characters.” https://twitter.com/reynoldsnlp/status/1183498971425042433 https://twitter.com/jakevdp/status/1140071525870997504 PEP 570: New function parameter syntax in positional-only parameters Python 3.8 has a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments. This notation allows pure Python functions to fully emulate behaviors of existing C-coded functions. It can be used to preclude keyword arguments when the parameter name is not helpful. It also allows the parameter name to be changed in the future without the risk of breaking client code. As with PEP 572, this proposal also got mixed reactions from Python developers. In support, one developer said, “Position-only parameters already exist in cpython builtins like range and min. Making their support at the language level would make their existence less confusing and documented.” While others think that this will allow authors to “dictate” how their methods could be used. “Not the biggest fan of this one because it allows library authors to overly dictate how their functions can be used, as in, mark an argument as positional merely because they want to. But cool all the same,” a Redditor commented. PEP 578: Python Audit Hooks and Verified Open Hook Python 3.8 now has an Audit Hook and Verified Open Hook. These hooks allow applications and frameworks written in pure Python code to take advantage of extra notifications. They also allow embedders or system administrators to deploy builds of Python where auditing is always enabled. These are available from Python and native code. PEP 587: New C API to configure the Python Initialization Though Python is highly configurable, its configuration seems scattered all around the code. Python 3.8 adds a new C API to configure the Python Initialization providing finer control on the whole configuration and better error reporting. This PEP also adds _PyRuntimeState.preconfig (PyPreConfig type) and PyInterpreterState.config (PyConfig type) fields to internal structures. PyInterpreterState.config becomes the new reference configuration, replacing global configuration variables and other private variables. PEP 590: Provisional support for Vectorcall, a fast calling protocol for CPython A currently provisional Vectorcall protocol is added to the Python/C API. It is meant to formalize existing optimizations that were already done for various classes. Any extension type implementing a callable can use this protocol. It will be made fully public in Python 3.9. PEP 574: Pickle protocol 5 supports out-of-band data buffers The pickle protocol 5 now introduces support for out-of-band buffers. This means PEP 3118 compatible data can be transmitted separately from the main pickle stream, at the discretion of the communication layer. Parallel filesystem cache for compiled bytecode files There is a new PYTHONPYCACHEPREFIX setting that configures the implicit bytecode cache to use a separate parallel filesystem tree, rather than the default __pycache__ subdirectories within each source directory. Python uses the same ABI whether it’s built-in release or debug mode With Python 3.8, Python uses the same ABI whether it’s built-in release or debug mode. On Unix, when Python is built in debug mode, it is now possible to load C extensions built-in release mode and C extensions built using the stable ABI. On Unix, C extensions are no longer linked to libpython except on Android and Cygwin. Also, on Unix, when Python is built in debug mode, import now also looks for C extensions compiled in release mode and for C extensions compiled with the stable ABI. f-strings now have a = specifier Formatted strings (f-strings) were introduced in Python 3.6 with PEP 498. It enables you to evaluate an expression as part of the string along with inserting the result of function calls and so on. Python 3.8 adds a = specifier to f-strings for self-documenting expressions and debugging. An f-string such as f'{expr=}' will expand to the text of the expression, an equal sign, then the representation of the evaluated expression. One developer expressed their delight on Hacker News, “F strings are pretty awesome. I’m coming from JavaScript and partially java background. JavaScript’s String concatenation can become too complex and I have difficulty with large strings.” Another developer said, “The expansion of f-strings is a welcome addition. The more I use them, the happier I am that they exist.” Someone added to this, “This makes clean string interpolation so much easier to do, especially for print statements. It's almost hard to use python < 3.6 now because of them. New metadata module Python 3.8 has a new importlib.metadata module that provides (provisional) support for reading metadata from third-party packages. It can, for instance, extract an installed package’s version number, list of entry points, and more. You can go through other improved modules, language changes, Build and C API changes, API and Feature removals in Python 3.8 on Python docs. For full details, see the changelog. Python 3.8b2 new features: the walrus operator, positional-only parameters, and much more Python 3.8 beta 1 is now ready for you to test Łukasz Langa at PyLondinium19: “If Python stays synonymous with CPython for too long, we’ll be in big trouble. PyPy will continue to support Python 2.7, even as major Python projects migrate to Python 3.

0
0
30875

article-image-work-with-classes-in-typescript

Amey Varangaonkar

15 May 2018

8 min read

How to work with classes in Typescript

Amey Varangaonkar

15 May 2018

8 min read

If we are developing any application using TypeScript, be it a small-scale or a large-scale application, we will use classes to manage our properties and methods. Prior to ES 2015, JavaScript did not have the concept of classes, and we used functions to create class-like behavior. TypeScript introduced classes as part of its initial release, and now we have classes in ES6 as well. The behavior of classes in TypeScript and JavaScript ES6 closely relates to the behavior of any object-oriented language that you might have worked on, such as C#. This excerpt is taken from the book TypeScript 2.x By Example written by Sachin Ohri. Object-oriented programming in TypeScript Object-oriented programming allows us to represent our code in the form of objects, which themselves are instances of classes holding properties and methods. Classes form the container of related properties and their behavior. Modeling our code in the form of classes allows us to achieve various features of object-oriented programming, which helps us write more intuitive, reusable, and robust code. Features such as encapsulation, polymorphism, and inheritance are the result of implementing classes. TypeScript, with its implementation of classes and interfaces, allows us to write code in an object-oriented fashion. This allows developers coming from traditional languages, such as Java and C#, feel right at home when learning TypeScript. Understanding classes Prior to ES 2015, JavaScript developers did not have any concept of classes; the best way they could replicate the behavior of classes was with functions. The function provides a mechanism to group together related properties and methods. The methods can be either added internally to the function or using the prototype keyword. The following is an example of such a function: function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; this.fullName = function() { return this.firstName + ' ' + this.lastName ; }; } In this preceding example, we have the fullName method encapsulated inside the Name function. Another way of adding methods to functions is shown in the following code snippet with the prototype keyword: function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; } Name.prototype.fullName = function() { return this.firstName + ' ' + this.lastName ; }; These features of functions did solve most of the issues of not having classes, but most of the dev community has not been comfortable with these approaches. Classes make this process easier. Classes provide an abstraction on top of common behavior, thus making code reusable. The following is the syntax for defining a class in TypeScript: The syntax of the class should look very similar to readers who come from an object-oriented background. To define a class, we use a class keyword followed by the name of the class. The News class has three member properties and one method. Each member has a type assigned to it and has an access modifier to define the scope. On line 10, we create an object of a class with the new keyword. Classes in TypeScript also have the concept of a constructor, where we can initialize some properties at the time of object creation. Access modifiers Once the object is created, we can access the public members of the class with the dot operator. Note that we cannot access the author property with the espn object because this property is defined as private. TypeScript provides three types of access modifiers. Public Any property defined with the public keyword will be freely accessible outside the class. As we saw in the previous example, all the variables marked with the public keyword were available outside the class in an object. Note that TypeScript assigns public as a default access modifier if we do not assign any explicitly. This is because the default JavaScript behavior is to have everything public. Private When a property is marked as private, it cannot be accessed outside of the class. The scope of a private variable is only inside the class when using TypeScript. In JavaScript, as we do not have access modifiers, private members are treated similarly to public members. Protected The protected keyword behaves similarly to private, with the exception that protected variables can be accessed in the derived classes. The following is one such example: class base{ protected id: number; } class child extends base{ name: string; details():string{ return `${name} has id: ${this.id}` } } In the preceding code, we extend the child class with the base class and have access to the id property inside the child class. If we create an object of the child class, we will still not have access to the id property outside. Readonly As the name suggests, a property with a readonly access modifier cannot be modified after the value has been assigned to it. The value assigned to a readonly property can only happen at the time of variable declaration or in the constructor. In the above code, line 5 gives an error stating that property name is readonly, and cannot be an assigned value. Transpiled JavaScript from classes While learning TypeScript, it is important to remember that TypeScript is a superset of JavaScript and not a new language on its own. Browsers can only understand JavaScript, so it is important for us to understand the JavaScript that is transpiled by TypeScript. TypeScript provides an option to generate JavaScript based on the ECMA standards. You can configure TypeScript to transpile into ES5 or ES6 (ES 2015) and even ES3 JavaScript by using the flag target in the tsconfig.json file. The biggest difference between ES5 and ES6 is with regard to the classes, let, and const keywords which were introduced in ES6. Even though ES6 has been around for more than a year, most browsers still do not have full support for ES6. So, if you are creating an application that would target older browsers as well, consider having the target as ES5. So, the JavaScript that's generated will be different based on the target setting. Here, we will take an example of class in TypeScript and generate JavaScript for both ES5 and ES6. The following is the class definition in TypeScript: This is the same code that we saw when we introduced classes in the Understanding Classes section. Here, we have a class named News that has three members, two of which are public and one private. The News class also has a format method, which returns a string concatenated from the member variables. Then, we create an object of the News class in line 10 and assign values to public properties. In the last line, we call the format method to print the result. Now let's look at the JavaScript transpiled by TypeScript compiler for this class. ES6 JavaScript ES6, also known as ES 2015, is the latest version of JavaScript, which provides many new features on top of ES5. Classes are one such feature; JavaScript did not have classes prior to ES6. The following is the code generated from the TypeScript class, which we saw previously: If you compare the preceding code with TypeScript code, you will notice minor differences. This is because classes in TypeScript and JavaScript are similar, with just types and access modifiers additional in TypeScript. In JavaScript, we do not have the concept of declaring public members. The author variable, which was defined as private and was initialized at its declaration, is converted to a constructor initialization in JavaScript. If we had not have initialized author, then the produced JavaScript would not have added author in the constructor. ES5 JavaScript ES5 is the most popular JavaScript version supported in browsers, and if you are developing an application that has to support the majority of browser versions, then you need to transpile your code to the ES5 version. This version of JavaScript does not have classes, and hence the transpiled code converts classes to functions, and methods inside the classes are converted to prototypically defined methods on the functions. The following is the code transpiled when we have the target set as ES5 in the TypeScript compiler options: As discussed earlier, the basic difference is that the class is converted to a function. The interesting aspect of this conversion is that the News class is converted to an immediately invoked function expression (IIFE). An IIFE can be identified by the parenthesis at the end of the function declaration, as we see in line 9 in the preceding code snippet. IIFEs cause the function to be executed immediately and help to maintain the correct scope of a function rather than declaring the function in a global scope. Another difference was how we defined the method format in the ES5 JavaScript. The prototype keyword is used to add the additional behavior to the function, which we see here. A couple of other differences you may have noticed include the change of the let keyword to var, as let is not supported in ES5. All variables in ES5 are defined with the var keyword. Also, the format method now does not use a template string, but standard string concatenation to print the output. TypeScript does a good job of transpiling the code to JavaScript while following recommended practices. This helps in making sure we have a robust and reusable code with minimum error cases. If you found this tutorial useful, make sure you check out the book TypeScript 2.x By Example for more hands-on tutorials on how to effectively leverage the power of TypeScript to develop and deploy state-of-the-art web applications. How to install and configure TypeScript Understanding Patterns and Architectures in TypeScript Writing SOLID JavaScript code with TypeScript

0
0
30675

Packt

20 Jun 2017

12 min read

What are Microservices?

Packt

20 Jun 2017

12 min read

0
0
30666

article-image-python-design-patterns-depth-factory-pattern

Packt

15 Feb 2016

17 min read

Python Design Patterns in Depth: The Factory Pattern

Packt

15 Feb 2016

17 min read

Creational design patterns deal with an object creation [j.mp/wikicrea]. The aim of a creational design pattern is to provide better alternatives for situations where a direct object creation (which in Python happens by the __init__() function [j.mp/divefunc], [Lott14, page 26]) is not convenient. In the Factory design pattern, a client asks for an object without knowing where the object is coming from (that is, which class is used to generate it). The idea behind a factory is to simplify an object creation. It is easier to track which objects are created if this is done through a central function, in contrast to letting a client create objects using a direct class instantiation [Eckel08, page 187]. A factory reduces the complexity of maintaining an application by decoupling the code that creates an object from the code that uses it [Zlobin13, page 30]. Factories typically come in two forms: the Factory Method, which is a method (or in Pythonic terms, a function) that returns a different object per input parameter [j.mp/factorympat]; the Abstract Factory, which is a group of Factory Methods used to create a family of related products [GOF95, page 100], [j.mp/absfpat] (For more resources related to this topic, see here.) Factory Method In the Factory Method, we execute a single function, passing a parameter that provides information about what we want. We are not required to know any details about how the object is implemented and where it is coming from. A real-life example An example of the Factory Method pattern used in reality is in plastic toy construction. The molding powder used to construct plastic toys is the same, but different figures can be produced using different plastic molds. This is like having a Factory Method in which the input is the name of the figure that we want (soldier and dinosaur) and the output is the plastic figure that we requested. The toy construction case is shown in the following figure, which is provided by www.sourcemaking.com [j.mp/factorympat]. A software example The Django framework uses the Factory Method pattern for creating the fields of a form. The forms module of Django supports the creation of different kinds of fields (CharField, EmailField) and customizations (max_length, required) [j.mp/djangofacm]. Use cases If you realize that you cannot track the objects created by your application because the code that creates them is in many different places instead of a single function/method, you should consider using the Factory Method pattern [Eckel08, page 187]. The Factory Method centralizes an object creation and tracking your objects becomes much more easier. Note that it is absolutely fine to create more than one Factory Method, and this is how it is typically done in practice. Each Factory Method logically groups the creation of objects that have similarities. For example, one Factory Method might be responsible for connecting you to different databases (MySQL, SQLite), another Factory Method might be responsible for creating the geometrical object that you request (circle, triangle), and so on. The Factory Method is also useful when you want to decouple an object creation from an object usage. We are not coupled/bound to a specific class when creating an object, we just provide partial information about what we want by calling a function. This means that introducing changes to the function is easy without requiring any changes to the code that uses it [Zlobin13, page 30]. Another use case worth mentioning is related with improving the performance and memory usage of an application. A Factory Method can improve the performance and memory usage by creating new objects only if it is absolutely necessary [Zlobin13, page 28]. When we create objects using a direct class instantiation, extra memory is allocated every time a new object is created (unless the class uses caching internally, which is usually not the case). We can see that in practice in the following code (file id.py), it creates two instances of the same class A and uses the id() function to compare their memory addresses. The addresses are also printed in the output so that we can inspect them. The fact that the memory addresses are different means that two distinct objects are created as follows: class A(object): pass if __name__ == '__main__': a = A() b = A() print(id(a) == id(b)) print(a, b) Executing id.py on my computer gives the following output:>> python3 id.pyFalse<__main__.A object at 0x7f5771de8f60> <__main__.A object at 0x7f5771df2208> Note that the addresses that you see if you execute the file are not the same as I see because they depend on the current memory layout and allocation. But the result must be the same: the two addresses should be different. There's one exception that happens if you write and execute the code in the Python Read-Eval-Print Loop (REPL) (interactive prompt), but that's a REPL-specific optimization which is not happening normally. Implementation Data comes in many forms. There are two main file categories for storing/retrieving data: human-readable files and binary files. Examples of human-readable files are XML, Atom, YAML, and JSON. Examples of binary files are the .sq3 file format used by SQLite and the .mp3 file format used to listen to music. In this example, we will focus on two popular human-readable formats: XML and JSON. Although human-readable files are generally slower to parse than binary files, they make data exchange, inspection, and modification much more easier. For this reason, it is advised to prefer working with human-readable files, unless there are other restrictions that do not allow it (mainly unacceptable performance and proprietary binary formats). In this problem, we have some input data stored in an XML and a JSON file, and we want to parse them and retrieve some information. At the same time, we want to centralize the client's connection to those (and all future) external services. We will use the Factory Method to solve this problem. The example focuses only on XML and JSON, but adding support for more services should be straightforward. First, let's take a look at the data files. The XML file, person.xml, is based on the Wikipedia example [j.mp/wikijson] and contains information about individuals (firstName, lastName, gender, and so on) as follows: <persons> <person> <firstName>John</firstName> <lastName>Smith</lastName> <age>25</age> <address> <streetAddress>21 2nd Street</streetAddress> <city>New York</city> <state>NY</state> <postalCode>10021</postalCode> </address> <phoneNumbers> <phoneNumber type="home">212 555-1234</phoneNumber> <phoneNumber type="fax">646 555-4567</phoneNumber> </phoneNumbers> <gender> <type>male</type> </gender> </person> <person> <firstName>Jimy</firstName> <lastName>Liar</lastName> <age>19</age> <address> <streetAddress>18 2nd Street</streetAddress> <city>New York</city> <state>NY</state> <postalCode>10021</postalCode> </address> <phoneNumbers> <phoneNumber type="home">212 555-1234</phoneNumber> </phoneNumbers> <gender> <type>male</type> </gender> </person> <person> <firstName>Patty</firstName> <lastName>Liar</lastName> <age>20</age> <address> <streetAddress>18 2nd Street</streetAddress> <city>New York</city> <state>NY</state> <postalCode>10021</postalCode> </address> <phoneNumbers> <phoneNumber type="home">212 555-1234</phoneNumber> <phoneNumber type="mobile">001 452-8819</phoneNumber> </phoneNumbers> <gender> <type>female</type> </gender> </person> </persons> The JSON file, donut.json, comes from the GitHub account of Adobe [j.mp/adobejson] and contains donut information (type, price/unit i.e. ppu, topping, and so on) as follows: [ { "id": "0001", "type": "donut", "name": "Cake", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" }, { "id": "1002", "type": "Chocolate" }, { "id": "1003", "type": "Blueberry" }, { "id": "1004", "type": "Devil's Food" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5002", "type": "Glazed" }, { "id": "5005", "type": "Sugar" }, { "id": "5007", "type": "Powdered Sugar" }, { "id": "5006", "type": "Chocolate with Sprinkles" }, { "id": "5003", "type": "Chocolate" }, { "id": "5004", "type": "Maple" } ] }, { "id": "0002", "type": "donut", "name": "Raised", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5002", "type": "Glazed" }, { "id": "5005", "type": "Sugar" }, { "id": "5003", "type": "Chocolate" }, { "id": "5004", "type": "Maple" } ] }, { "id": "0003", "type": "donut", "name": "Old Fashioned", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" }, { "id": "1002", "type": "Chocolate" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5002", "type": "Glazed" }, { "id": "5003", "type": "Chocolate" }, { "id": "5004", "type": "Maple" } ] } ] We will use two libraries that are part of the Python distribution for working with XML and JSON: xml.etree.ElementTree and json as follows: import xml.etree.ElementTree as etree import json The JSONConnector class parses the JSON file and has a parsed_data() method that returns all data as a dictionary (dict). The property decorator is used to make parsed_data() appear as a normal variable instead of a method as follows: class JSONConnector: def __init__(self, filepath): self.data = dict() with open(filepath, mode='r', encoding='utf-8') as f: self.data = json.load(f) @property def parsed_data(self): return self.data The XMLConnector class parses the XML file and has a parsed_data() method that returns all data as a list of xml.etree.Element as follows: class XMLConnector: def __init__(self, filepath): self.tree = etree.parse(filepath) @property def parsed_data(self): return self.tree The connection_factory() function is a Factory Method. It returns an instance of JSONConnector or XMLConnector depending on the extension of the input file path as follows: def connection_factory(filepath): if filepath.endswith('json'): connector = JSONConnector elif filepath.endswith('xml'): connector = XMLConnector else: raise ValueError('Cannot connect to {}'.format(filepath)) return connector(filepath) The connect_to() function is a wrapper of connection_factory(). It adds exception handling as follows: def connect_to(filepath): factory = None try: factory = connection_factory(filepath) except ValueError as ve: print(ve) return factory The main() function demonstrates how the Factory Method design pattern can be used. The first part makes sure that exception handling is effective as follows: def main(): sqlite_factory = connect_to('data/person.sq3') The next part shows how to work with the XML files using the Factory Method. XPath is used to find all person elements that have the last name Liar. For each matched person, the basic name and phone number information are shown as follows: xml_factory = connect_to('data/person.xml') xml_data = xml_factory.parsed_data() liars = xml_data.findall (".//{person}[{lastName}='{}']".format('Liar')) print('found: {} persons'.format(len(liars))) for liar in liars: print('first name: {}'.format(liar.find('firstName').text)) print('last name: {}'.format(liar.find('lastName').text)) [print('phone number ({}):'.format(p.attrib['type']), p.text) for p in liar.find('phoneNumbers')] The final part shows how to work with the JSON files using the Factory Method. Here, there's no pattern matching, and therefore the name, price, and topping of all donuts are shown as follows: json_factory = connect_to('data/donut.json') json_data = json_factory.parsed_data print('found: {} donuts'.format(len(json_data))) for donut in json_data: print('name: {}'.format(donut['name'])) print('price: ${}'.format(donut['ppu'])) [print('topping: {} {}'.format(t['id'], t['type'])) for t in donut['topping']] For completeness, here is the complete code of the Factory Method implementation (factory_method.py) as follows: import xml.etree.ElementTree as etree import json class JSONConnector: def __init__(self, filepath): self.data = dict() with open(filepath, mode='r', encoding='utf-8') as f: self.data = json.load(f) @property def parsed_data(self): return self.data class XMLConnector: def __init__(self, filepath): self.tree = etree.parse(filepath) @property def parsed_data(self): return self.tree def connection_factory(filepath): if filepath.endswith('json'): connector = JSONConnector elif filepath.endswith('xml'): connector = XMLConnector else: raise ValueError('Cannot connect to {}'.format(filepath)) return connector(filepath) def connect_to(filepath): factory = None try: factory = connection_factory(filepath) except ValueError as ve: print(ve) return factory def main(): sqlite_factory = connect_to('data/person.sq3') print() xml_factory = connect_to('data/person.xml') xml_data = xml_factory.parsed_data liars = xml_data.findall(".//{}[{}='{}']".format('person', 'lastName', 'Liar')) print('found: {} persons'.format(len(liars))) for liar in liars: print('first name: {}'.format(liar.find('firstName').text)) print('last name: {}'.format(liar.find('lastName').text)) [print('phone number ({}):'.format(p.attrib['type']), p.text) for p in liar.find('phoneNumbers')] print() json_factory = connect_to('data/donut.json') json_data = json_factory.parsed_data print('found: {} donuts'.format(len(json_data))) for donut in json_data: print('name: {}'.format(donut['name'])) print('price: ${}'.format(donut['ppu'])) [print('topping: {} {}'.format(t['id'], t['type'])) for t in donut['topping']] if __name__ == '__main__': main() Here is the output of this program as follows: >>> python3 factory_method.pyCannot connect to data/person.sq3found: 2 personsfirst name: Jimylast name: Liarphone number (home): 212 555-1234first name: Pattylast name: Liarphone number (home): 212 555-1234phone number (mobile): 001 452-8819found: 3 donutsname: Cakeprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5005 Sugartopping: 5007 Powdered Sugartopping: 5006 Chocolate with Sprinklestopping: 5003 Chocolatetopping: 5004 Maplename: Raisedprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5005 Sugartopping: 5003 Chocolatetopping: 5004 Maplename: Old Fashionedprice: $0.55topping: 5001 Nonetopping: 5002 Glazedtopping: 5003 Chocolatetopping: 5004 Maple Notice that although JSONConnector and XMLConnector have the same interfaces, what is returned by parsed_data() is not handled in a uniform way. Different codes must be used to work with each connector. Although it would be nice to be able to use the same code for all connectors, this is at most times not realistic unless we use some kind of common mapping for the data which is very often provided by external data providers. Assuming that you can use exactly the same code for handling the XML and JSON files, what changes are required to support a third format, for example, SQLite? Find an SQLite file or create your own and try it. As is now, the code does not forbid a direct instantiation of a connector. Is it possible to do this? Try doing it (hint: functions in Python can have nested classes). Summary To learn more about design patterns in depth, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Learning Python Design Patterns (https://www.packtpub.com/application-development/learning-python-design-patterns) Learning Python Design Patterns – Second Edition (https://www.packtpub.com/application-development/learning-python-design-patterns-second-edition) Resources for Article: Further resources on this subject: Recommending Movies at Scale (Python) [article] An In-depth Look at Ansible Plugins [article] Elucidating the Game-changing Phenomenon of the Docker-inspired Containerization Paradigm [article]

0
0
30615

article-image-regular-expressions-awk-programming

Pavan Ramchandani

18 May 2018

8 min read

Regular expressions in AWK programming: What, Why, and How

Pavan Ramchandani

18 May 2018

8 min read

0
0
30605

article-image-firefox-nightly-browser-debugging-your-app-is-now-fun-with-mozillas-new-time-travel-feature

Natasha Mathur

30 Jul 2018

3 min read

Firefox Nightly browser: Debugging your app is now fun with Mozilla’s new ‘time travel’ feature

Natasha Mathur

30 Jul 2018

3 min read

Earlier this month, Mozilla announced a fancy new feature called “Time Travel debugging” for its Firefox Nightly web browser at the JSConf EU 2018. With time travel debugging, you can easily track the bugs in your code or app as it lets you pause and rewind to the exact time when your app broke down. Time travel debugging technology is particularly useful for local web development where it allows you to pause and step forward or backward, pause and rewind to a previous state, rewind to the time a console message was logged and rewind to the time where an element had a certain style. It is also great for times where you might want to save user recordings or view a test recording when the testing fails. With time travel debugging, you can record a tab on your browser and later replay it using WebReplay, an experimental project which allows you to record, rewind and replay the processes for the web. According to Jason Laster, a Senior Software Engineer at Mozilla,“ with time travel, we have a full recording of time, you can jump to any point in the path and see it immediately, you don’t have to refresh or re-click or pause or look at logs”. Here’s a video of Jason Laster talking about the potential of time travel debugging. JSConf He also mentioned how time travel is “not a new thing” and he was inspired by Dan Abramov, creator of Redux when he showcased Redux at JSConfEU saying how he wanted “time travel” to “reduce his action over time”. With Redux, you get a slider that shows you all the actions over time and as you’re moving, you get to see the UI update as well. In fact, Mozilla rebuilt the debugger in order to use React and redux for its time travel feature. Their debugger comes equipped with Redux dev tools, which shows a list of all the actions for the debugger. So, the dev tools show you the state of the app, sources, and the pause data. Finally, Laster added how “this is just the beginning” and that “they hope to pull this off well in the future”. To use this new time travel debugging feature, you must install the Firefox Nightly browser first. For more details on the new feature, check out the official documentation. Mozilla is building a bridge between Rust and JavaScript Firefox has made a password manager for your iPhone Firefox 61 builds on Firefox Quantum, adds Tab Warming, WebExtensions, and TLS 1.3

0
0
30438

article-image-gophercon-2019-go-2-update-open-source-go-library-for-gui-support-for-webassembly-tinygo-for-microcontrollers-and-more

Fatema Patrawala

30 Jul 2019

9 min read

GopherCon 2019: Go 2 update, open-source Go library for GUI, support for WebAssembly, TinyGo for microcontrollers and more

Fatema Patrawala

30 Jul 2019

9 min read

Last week Go programmers had a gala time learning, networking and programming at the Marriott Marquis San Diego Marina as the most awaited event GopherCon 2019 was held starting from 24th July till 27th July. GopherCon this year hit the road at San Diego with some exceptional conferences, and many exciting announcements for more than 1800 attendees from around the world. One of the attendees, Andrea Santillana Fernández, says the Go Community is growing, and doing quite well. She wrote on her blog post on the Source graph website that there are 1 million Go programmers around the world and month on month its membership keeps increasing. Indeed there is a significant growth in the Go community, so what did it have in store for the programmers at this year’s GopherCon 2019: On the road to Go 2 The major milestones for the journey to Go 2 were presented by Russ Coxx on Wednesday last week. He explained the main areas of focus for Go 2, which are as below: Error handling Russ notes that writing a program correctly without errors is hard. But writing a program correctly accounting for errors and external dependencies is much more difficult. He listed down a few errors which led in introducing error handling helpers like an optional Unwrap interface, errors.Is and errors.As in Go 1.13 version. Generics Russ spoke about Generics and said that they started exploring a new design since last year. They are working with programming language theory experts on the problem to help refine the proposal of generics code in Go. In a separate session, Ian Lance Taylor, introduced generics codes in Go. He briefly explained the need, implementation and benefits from generics for the Go language. Next, Taylor reviewed the Go contract design draft which included the addition of optional type parameters to types and functions. Taylor defined generics as “Generic programming which enables the representation of functions and data structures in a generic form, with types factored out.” Generic code is written using types, which are specified later. An unspecified type is called as type parameter. A type parameter offers support only when permitted by contracts. A generic code imparts strong basis for sharing codes and building programs. It can be compiled using an interface-based approach which optimizes time as the package is compiled only once. If a generic code is compiled multiple times, it can carry compile time cost. Ian showed a few sample codes written in Generics in Go. Dependency management In Go 2 the team wants to focus on Dependency management and explicitly refer to dependencies similar to Java. Russ explained this by giving a history of how in 2011 they introduced GOPATH to separate the distribution from the actual dependencies so that users could run multiple different distributions and to separate the concerns of the distribution from the external libraries. Then in 2015, they introduced the go vendor spec to formalize the vendor directory and simplify dependency management implementations. But in practice it did not work well. In 2016, they formed the dependency working group. This team started work on dep: a tool to reshape all the existing tools into one.The problem with dep and the vendor directory was multiple distinct incompatible versions of a dependency were represented by one import path. It is now called as the "Import Compatibility Rule". The team took what worked well and learned from VGo. VGo provides package uniqueness without breaking builds. VGo dictates different import paths for incompatible package versions. The team grouped similar packages and gave these groups a name: Modules. The VGo system is now go modules. It now integrates directly with the Go command. The challenge presented going forward is mostly around updating everything to use modules. Everything needs to be updated to work with the new conventions to work well. Tooling Finally, as a result of all these changes, they distilled and refined the Go toolchain. One of the examples of this is gopls or "Go Please". Gopls aims to create a smoother, standard interface to integrate with all editors, IDEs, continuous integration and others. Simple, portable and efficient graphical interfaces in Go Elias Naur presented Gio, a new open source Go library for writing immediate mode GUI programs that run on all the major platforms: Android, iOS/tvOS, macOS, Linux, Windows. The talk covered Gio's unusual design and how it achieves simplicity, portability and performance. Elias said, “I wanted to be able to write a GUI program in GO that I could implement only once and have it work on every platform. This, to me, is the most interesting feature of Gio.” https://twitter.com/rakyll/status/1154450455214190593 Elias also presented Scatter which is a Gio program for end-to-end encrypted messaging over email. Other features of Gio include: Immediate mode design UI state owned by program Only depends on lowest-level platform libraries Minimal dependency tree to keep things low level as possible GPU accelerated vector and text rendering It’s super efficient No garbage generated in drawing or layout code Cross platform (macOS, Linux, Windows, Android, iOS, tvOS, Webassembly) Core is 100% Go while OS-specific native interfaces are optional Gopls, new tool serves as a backend for Go editor Rebecca Stambler, mentioned in her presentation that the Go community has built many amazing tools to improve the Go developer experience. However, when a maintainer disappears or a new Go release wreaks havoc, the Go development experience becomes frustrating and complicated. To solve this issue, Rebecca revealed the details behind a new tool: gopls (pronounced as 'go please'). The tool is currently in development by the Go team and community, and it will ultimately serve as the backend for your Go editor. Below listed functionalities are expected from gopls: Show me errors, like unused variables or typos autocomplete would be nice function signature help, because we often forget While we're at it, hover-accessible "tooltip" documentation in general Help me jump to a variable that is needed to see An outline of package structure Get started with WebAssembly in Go WebAssembly in Go is here and ready to try! Although the landscape is evolving quickly, the opportunity is huge. The ability to deliver truly portable system binaries could potentially replace JavaScript in the browser. WebAssembly has the potential to finally realize the goal of being platform agnostic without having to rely on a JVM. In a session by Johan Brandhorst who introduces the technology, shows how to get started with WebAssembly and Go, discusses what is possible today and what will be possible tomorrow. As of Go 1.13, there is experimental support for WebAssembly using the JavaScript interface but as it is only experimental, using it in production is not recommended. Support for the WASI interface is not currently available but has been planned and may be available as early as in Go 1.14. Better x86 assembly generation from Go Michael McLoughlin in his presentation made the case for code generation techniques for writing x86 assembly from Go. Michael introduced assembly, assembly in Go, the use cases for when you would want to drop into assembly, and techniques for realizing speedups using assembly. He pointed out that most of the time, pure Go will be enough for 97% of programs, but there are those 3% of cases where it is warranted, and the examples he brought up were crypto, syscalls, and scientific computing. Michael then introduced a package called avo which makes high-performance Go assembly easier to write. He said that writing your assembly in Go will allow you to realize the benefits of a high level language such as code readability, the ability to create loops, variables, and functions, and parameterized code generation all while still realizing the benefits of writing assembly. Michael concluded the talk with his ideas for the future of avo. Use avo in projects specifically in large crypto implementations. More architecture support Possibly make avo an assembler itself (these kinds of techniques are used in JIT compilers) avo based libraries (avo/std/math/big, avo/std/crypto) The audience appreciated this talk on Twitter. https://twitter.com/darethas/status/1155336268076576768 The presentation slides for this are available on the blog. Miniature version of Golang, TinyGo for microcontrollers Ron Evans, creator of GoCV, GoBot and "technologist for hire" introduced TinyGo that can run directly on microcontrollers like Arduino and more. TinyGo uses the LLVM compiler toolchain to create native code that can run directly even on the smallest of computing devices. Ron demonstrated how Go code can be run on embedded systems using TinyGo, a compiler intended for use in microcontrollers, WebAssembly (WASM), and command-line tools. Evans began his presentation by countering the idea that Go, while fast, produces executables too large to run on the smallest computers. While that may be true of the standard Go compiler, TinyGo produces much smaller outputs. For example: "Hello World" program compiled using Go 1.12 => 1.1 MB Same program compiled using TinyGo 0.7.0 => 12 KB TinyGo currently lacks support for the full Go language and Go standard library. For example, TinyGo does not have support for the net package, although contributors have created implementations of interfaces that work with the WiFi chip built into Arduino chips. Support for Go Routines is also limited, although simple programs usually work. Evans demonstrated that despite some limitations, thanks to TinyGo, the Go language can still be run in embedded systems. Salvador Evans, son of Ron Evans, assisted him for this demonstration. At age 11, he has become the youngest GopherCon speaker so far. https://twitter.com/erikstmartin/status/1155223328329625600 There were talks by other speakers on topics like, improvements in VSCode for Golang, the first open source Golang interpreter with complete support of the language spec, Athens Project which is a proxy server in Go and how mobile development works in Go. https://twitter.com/ramyanexus/status/1155238591120805888 https://twitter.com/containous/status/1155191121938649091 https://twitter.com/hajimehoshi/status/1155184796386988035 Apart from these there were a whole lot of other talks which happened at the GopherCon 2019. There were live blogs posted by the attendees on various talks and till now more than 25 blogs are posted by the attendees on the Sourcegraph website. The Go team shares new proposals planned to be implemented in Go 1.13 and 1.14 Go introduces generic codes and a new contract draft design at GopherCon 2019 Is Golang truly community driven and does it really matter?

0
0
30379

article-image-gain-practical-expertise-latest-edition-software-architecture-with-c-sharp9-dotnet5

Expert Network

08 Jul 2021

3 min read

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

Expert Network

08 Jul 2021

3 min read

Software architecture is one of the most discussed topics in the software industry today, and its importance will certainly grow more in the future. But the speed at which new features are added to these software solutions keeps increasing, and new architectural opportunities keep emerging. To strengthen your command on this, Packt brings to you the Second Edition of Software Architecture with C# 9 and .NET 5 by Gabriel Baptista and Francesco Abbruzzese – a fully revised and expanded guide, featuring the latest features of .NET 5 and C# 9. This book covers the most common design patterns and frameworks involved in modern cloud-based and distributed software architectures. It discusses when and how to use each pattern, by providing you with practical real-world scenarios. This book also presents techniques and processes such as DevOps, microservices, Kubernetes, continuous integration, and cloud computing, so that you can have a best-in-class software solution developed and delivered for your customers. This book will help you to understand the product that your customer wants from you. It will guide you to deliver and solve the biggest problems you can face during development. It also covers the do's and don'ts that you need to follow when you manage your application in a cloud-based environment. You will learn about different architectural approaches, such as layered architectures, service-oriented architecture, microservices, Single Page Applications, and cloud architecture, and understand how to apply them to specific business requirements. Finally, you will deploy code in remote environments or on the cloud using Azure. All the concepts in this book will be explained with the help of real-world practical use cases where design principles make the difference when creating safe and robust applications. By the end of the book, you will be able to develop and deliver highly scalable and secure enterprise-ready applications that meet the end customers' business needs. It is worth mentioning that Software Architecture with C# 9 and .NET 5, Second Edition will not only cover the best practices that a software architect should follow for developing C# and .NET Core solutions, but it will also discuss all the environments that we need to master in order to develop a software product according to the latest trends. This second edition is improved in code, and adapted to the new opportunities offered by C# 9 and .Net 5. We added all new frameworks and technologies such as gRPC, and Blazor, and described Kubernetes in more detail in a dedicated chapter. To get the most out of this book, understand it as a guidance that you may want to revisit many times for different circumstances. Do not forget to have Visual Studio Community 2019 or higher installed and be sure that you understand C# .NET principles.

0
0
29869

article-image-2019-stack-overflow-survey-quick-overview

Sugandha Lahoti

10 Apr 2019

5 min read

2019 Stack Overflow survey: A quick overview

Sugandha Lahoti

10 Apr 2019

5 min read

The results of the 2019 Stack Overflow survey have just been published: 90,000 developers took the 20-minute survey this year. The survey shed light on some very interesting insights – from the developers’ preferred language for programming, to the development platform they hate the most, to the blockers to developer productivity. As the survey is quite detailed and comprehensive, here’s a quick look at the most important takeaways. Key highlights from the Stack Overflow Survey Programming languages Python again emerged as the fastest-growing programming language, a close second behind Rust. Interestingly, Python and Typescript achieved the same votes with almost 73% respondents saying it was their most loved language. Python was the most voted language developers wanted to learn next and JavaScript remains the most used programming language. The most dreaded languages were VBA and Objective C. Source: Stack Overflow Frameworks and databases in the Stack Overflow survey Developers preferred using React.js and Vue.js web frameworks while dreaded Drupal and jQuery. Redis was voted as the most loved database and MongoDB as the most wanted database. MongoDB’s inclusion in the list is surprising considering its controversial Server Side Public License. Over the last few months, Red Hat dropped support for MongoDB over this license, so did GNU Health Federation. Both of these organizations choose PostgreSQL over MongoDB, which is one of the reasons probably why PostgreSQL was the second most loved and wanted database of Stack Overflow Survey 2019. Source: Stack Overflow It’s interesting to see WebAssembly making its way in the popular technology segment as well as one of the top paying technologies. Respondents who use Clojure, F#, Elixir, and Rust earned the highest salaries Stackoverflow also did a new segment this year called "Blockchain in the real world" which gives insight into the adoption of Blockchain. Most respondents (80%) on the survey said that their organizations are not using or implementing blockchain technology. Source: Stack Overflow Developer lifestyles and learning About 80% of our respondents say that they code as a hobby outside of work and over half of respondents had written their first line of code by the time they were sixteen, although this experience varies by country and by gender. For instance, women wrote their first code later than men and non-binary respondents wrote code earlier than men. About one-quarter of respondents are enrolled in a formal college or university program full-time or part-time. Of professional developers who studied at the university level, over 60% said they majored in computer science, computer engineering, or software engineering. DevOps specialists and site reliability engineers are among the highest paid, most experienced developers most satisfied with their jobs, and are looking for new jobs at the lowest levels. The survey also noted that developers who are system admins or DevOps specialists are 25-30 times more likely to be men than women. Chinese developers are the most optimistic about the future while developers in Western European countries like France and Germany are among the least optimistic. Developers also overwhelmingly believe that Elon Musk will be the most influential person in tech in 2019. With more than 30,000 people responding to a free text question asking them who they think will be the most influential person this year, an amazing 30% named Tesla CEO Musk. For perspective, Jeff Bezos was in second place, being named by ‘only’ 7.2% of respondents. Although, this year the US survey respondents proportion of women, went up from 9% to 11%, it’s still a slow growth and points to problems with inclusion in the tech industry in general and on Stack Overflow in particular. When thinking about blockers to productivity, different kinds of developers report different challenges. Men are more likely to say that being tasked with non-development work is a problem for them, while gender minority respondents are more likely to say that toxic work environments are a problem. Stack Overflow survey demographics and diversity challenges This report is based on a survey of 88,883 software developers from 179 countries around the world. It was conducted between January 23 to February 14 and the median time spent on the survey for qualified responses was 23.3 minutes. The majority of survey respondents this year were people who said they are professional developers or who code sometimes as part of their work, or are students preparing for such a career. Majority of them were from the US, India, China and Europe. Stack Overflow acknowledged that their results did not represent racial disparities evenly and people of color continue to be underrepresented among developers. This year nearly 71% of respondents continued to be of White or European descent, a slight improvement from last year (74%). The survey notes that, “In the United States this year, 22% of respondents are people of color; last year 19% of United States respondents were people of color.” This clearly signifies that a lot of work is still needed to be done particularly for people of color, women, and underrepresented groups. Although, last year in August, Stack Overflow revamped its Code of Conduct to include more virtues around kindness, collaboration, and mutual respect. It also updated its developers salary calculator to include 8 new countries. Go through the full report to learn more about developer salaries, job priorities, career values, the best music to listen to while coding, and more. Developers believe Elon Musk will be the most influential person in tech in 2019, according to Stack Overflow survey results Creators of Python, Java, C#, and Perl discuss the evolution and future of programming language design at PuPPy Stack Overflow is looking for a new CEO as Joel Spolsky becomes Chairman

0
0
29753

Packt

16 Feb 2016

11 min read

Python Design Patterns in Depth – The Observer Pattern

Packt

16 Feb 2016

11 min read

0
0
29706

article-image-creating-graph-application-python-neo4j-gephi-linkuriousjs

Greg Roberts

12 Oct 2015

13 min read

Creating a graph application with Python, Neo4j, Gephi & Linkurious.js

Greg Roberts

12 Oct 2015

13 min read

I love Python, and to celebrate Packt Python week, I’ve spent some time developing an app using some of my favorite tools. The app is a graph visualization of Python and related topics, as well as showing where all our content fits in. The topics are all StackOverflow tags, related by their co-occurrence in questions on the site. The app is available to view at http://gregroberts.github.io/ and in this blog, I’m going to discuss some of the techniques I used to construct the underlying dataset, and how I turned it into an online application. Graphs, not charts Graphs are an incredibly powerful tool for analyzing and visualizing complex data. In recent years, many different graph database engines have been developed to make use of this novel manner of representing data. These databases offer many benefits over traditional, relational databases because of how the data is stored and accessed. Here at Packt, I use a Neo4j graph to store and analyze data about our business. Using the Cypher query language, it’s easy to express complicated relations between different nodes succinctly. It’s not just the technical aspect of graphs which make them appealing to work with. Seeing the connections between bits of data visualized explicitly as in a graph helps you to see the data in a different light, and make connections that you might not have spotted otherwise. This graph has many uses at Packt, from customer segmentation to product recommendations. In the next section, I describe the process I use to generate recommendations from the database. Make the connection For product recommendations, I use what’s known as a hybrid filter. This considers both content based filtering (product x and y are about the same topic) and collaborative filtering (people who bought x also bought y). Each of these methods has strengths and weaknesses, so combining them into one algorithm provides a more accurate signal. The collaborative aspect is straightforward to implement in Cypher. For a particular product, we want to find out which other product is most frequently bought alongside it. We have all our products and customers stored as nodes, and purchases are stored as edges. Thus, the Cypher query we want looks like this: MATCH (n:Product {title:’Learning Cypher’})-[r:purchased*2]-(m:Product) WITH m.title AS suggestion, count(distinct r)/(n.purchased+m.purchased) AS alsoBought WHERE m<>n RETURN* ORDER BY alsoBought DESC and will very efficiently return the most commonly also purchased product. When calculating the weight, we divide by the total units sold of both titles, so we get a proportion returned. We do this so we don’t just get the titles with the most units; we’re effectively calculating the size of the intersection of the two titles’ audiences relative to their overall audience size. The content side of the algorithm looks very similar: MATCH (n:Product {title:’Learning Cypher’})-[r:is_about*2]-(m:Product) WITH m.title AS suggestion, count(distinct r)/(length(n.topics)+length(m.topics)) AS alsoAbout WHERE m<>n RETURN * ORDER BY alsoAbout DESC Implicit in this algorithm is knowledge that a title is_about a topic of some kind. This could be done manually, but where’s the fun in that? In Packt’s domain there already exists a huge, well moderated corpus of technology concepts and their usage: StackOverflow. The tagging system on StackOverflow not only tells us about all the topics developers across the world are using, it also tells us how those topics are related, by looking at the co-occurrence of tags in questions. So in our graph, StackOverflow tags are nodes in their own right, which represent topics. These nodes are connected via edges, which are weighted to reflect their co-occurrence on StackOverflow: edge_weight(n,m) = (Number of questions tagged with both n & m)/(Number questions tagged with n or m) So, to find topics related to a given topic, we could execute a query like this: MATCH (n:StackOverflowTag {name:'Matplotlib'})-[r:related_to]-(m:StackOverflowTag) RETURN n.name, r.weight, m.name ORDER BY r.weight DESC LIMIT 10 Which would return the following: | n.name | r.weight | m.name ----+------------+----------+-------------------- 1 | Matplotlib | 0.065699 | Plot 2 | Matplotlib | 0.045678 | Numpy 3 | Matplotlib | 0.029667 | Pandas 4 | Matplotlib | 0.023623 | Python 5 | Matplotlib | 0.023051 | Scipy 6 | Matplotlib | 0.017413 | Histogram 7 | Matplotlib | 0.015618 | Ipython 8 | Matplotlib | 0.013761 | Matplotlib Basemap 9 | Matplotlib | 0.013207 | Python 2.7 10 | Matplotlib | 0.012982 | Legend There are many, more complex relationships you can define between topics like this, too. You can infer directionality in the relationship by looking at the local network, or you could start constructing Hyper graphs using the extensive StackExchange API. So we have our topics, but we still need to connect our content to topics. To do this, I’ve used a two stage process. Step 1 – Parsing out the topics We take all the copy (words) pertaining to a particular product as a document representing that product. This includes the title, chapter headings, and all the copy on the website. We use this because it’s already been optimized for search, and should thus carry a fair representation of what the title is about. We then parse this document and keep all the words which match the topics we’ve previously imported. #...code for fetching all the copy for all the products key_re = 'W(%s)W' % '|'.join(re.escape(i) for i in topic_keywords) for i in documents tags = re.findall(key_re, i[‘copy’]) i['tags'] = map(lambda x: tag_lookup[x],tags) Having done this for each product, we have a bag of words representing each product, where each word is a recognized topic. Step 2 – Finding the information From each of these documents, we want to know the topics which are most important for that document. To do this, we use the tf-idf algorithm. tf-idf stands for term frequency, inverse document frequency. The algorithm takes the number of times a term appears in a particular document, and divides it by the proportion of the documents that word appears in. The term frequency factor boosts terms which appear often in a document, whilst the inverse document frequency factor gets rid of terms which are overly common across the entire corpus (for example, the term ‘programming’ is common in our product copy, and whilst most of the documents ARE about programming, this doesn’t provide much discriminating information about each document). To do all of this, I use python (obviously) and the excellent scikit-learn library. Tf-idf is implemented in the class sklearn.feature_extraction.text.TfidfVectorizer. This class has lots of options you can fiddle with to get more informative results. import sklearn.feature_extraction.text as skt tagger = skt.TfidfVectorizer(input = 'content', encoding = 'utf-8', decode_error = 'replace', strip_accents = None, analyzer = lambda x: x, ngram_range = (1,1), max_df = 0.8, min_df = 0.0, norm = 'l2', sublinear_tf = False) It’s a good idea to use the min_df & max_df arguments of the constructor so as to cut out the most common/obtuse words, to get a more informative weighting. The ‘analyzer’ argument tells it how to get the words from each document, in our case, the documents are already lists of normalized words, so we don’t need anything additional done. #create vectors of all the documents vectors = tagger.fit_transform(map(lambda x: x['tags'],rows)).toarray() #get back the topic names to map to the graph t_map = tagger.get_feature_names() jobs = [] for ind, vec in enumerate(vectors): features = filter(lambda x: x[1]>0, zip(t_map,vec)) doc = documents[ind] for topic, weight in features: job = ‘’’MERGE (n:StackOverflowTag {name:’%s’}) MERGE (m:Product {id:’%s’}) CREATE UNIQUE (m)-[:is_about {source:’tf_idf’,weight:%d}]-(n) ’’’ % (topic, doc[‘id’], weight) jobs.append(job) We then execute all of the jobs using Py2neo’s Batch functionality. Having done all of this, we can now relate products to each other in terms of what topics they have in common: MATCH (n:Product {isbn10:'1783988363'})-[r:is_about]-(a)-[q:is_about]-(m:Product {isbn10:'1783289007'}) WITH a.name as topic, r.weight+q.weight AS weight RETURN topic ORDER BY weight DESC limit 6 Which returns: | topic ---+------------------ 1 | Machine Learning 2 | Image 3 | Models 4 | Algorithm 5 | Data 6 | Python Huzzah! I now have a graph into which I can throw any piece of content about programming or software, and it will fit nicely into the network of topics we’ve developed. Take a breath So, that’s how the graph came to be. To communicate with Neo4j from Python, I use the excellent py2neo module, developed by Nigel Small. This module has all sorts of handy abstractions to allow you to work with nodes and edges as native Python objects, and then update your Neo instance with any changes you’ve made. The graph I’ve spoken about is used for many purposes across the business, and has grown in size and scope significantly over the last year. For this project, I’ve taken from this graph everything relevant to Python. I started by getting all of our content which is_about Python, or about a topic related to python: titles = [i.n for i in graph.cypher.execute('''MATCH (n)-[r:is_about]-(m:StackOverflowTag {name:'Python'}) return distinct n''')] t2 = [i.n for i in graph.cypher.execute('''MATCH (n)-[r:is_about]-(m:StackOverflowTag)-[:related_to]-(m:StackOverflowTag {name:'Python'}) where has(n.name) return distinct n''')] titles.extend(t2) then hydrated this further by going one or two hops down each path in various directions, to get a large set of topics and content related to Python. Visualising the graph Since I started working with graphs, two visualisation tools I’ve always used are Gephi and Sigma.js. Gephi is a great solution for analysing and exploring graphical data, allowing you to apply a plethora of different layout options, find out more about the statistics of the network, and to filter and change how the graph is displayed. Sigma.js is a lightweight JavaScript library which allows you to publish beautiful graph visualizations in a browser, and it copes very well with even very large graphs. Gephi has a great plugin which allows you to export your graph straight into a web page which you can host, share and adapt. More recently, Linkurious have made it their mission to bring graph visualization to the masses. I highly advise trying the demo of their product. It really shows how much value it’s possible to get out of graph based data. Imagine if your Customer Relations team were able to do a single query to view the entire history of a case or customer, laid out as a beautiful graph, full of glyphs and annotations. Linkurious have built their product on top of Sigma.js, and they’ve made available much of the work they’ve done as the open source Linkurious.js. This is essentially Sigma.js, with a few changes to the API, and an even greater variety of plugins. On Github, each plugin has an API page in the wiki and a downloadable demo. It’s worth cloning the repository just to see the things it’s capable of! Publish It! So here’s the workflow I used to get the Python topic graph out of Neo4j and onto the web. Use Py2neo to graph the subgraph of content and topics pertinent to Python, as described above Add to this some other topics linked to the same books to give a fuller picture of the Python “world” Add in topic-topic edges and product-product edges to show the full breadth of connections observed in the data Export all the nodes and edges to csv files Import node and edge tables into Gephi. The reason I’m using Gephi as a middle step is so that I can fiddle with the visualisation in Gephi until it looks perfect. The layout plugin in Sigma is good, but this way the graph is presentable as soon as the page loads, the communities are much clearer, and I’m not putting undue strain on browsers across the world! The layout of the graph has been achieved using a number of plugins. Instead of using the pre-installed ForceAtlas layouts, I’ve used the OpenOrd layout, which I feel really shows off the communities of a large graph. There’s a really interesting and technical presentation about how this layout works here. Export the graph into gexf format, having applied some partition and ranking functions to make it more clear and appealing. Now it’s all down to Linkurious and its various plugins! You can explore the source code of the final page to see all the details, but here I’ll give an overview of the different plugins I’ve used for the different parts of the visualisation: First instantiate the graph object, pointing to a container (note the CSS of the container, without this, the graph won’t display properly: <style type="text/css"> #container { max-width: 1500px; height: 850px; margin: auto; background-color: #E5E5E5; } </style> … <div id="container"></div> … <script> s= new sigma({ container: 'container', renderer: { container: document.getElementById('container'), type: 'canvas' }, settings: { … } }); sigma.parsers.gexf - used for (trivially!) importing a gexf file into a sigma instance sigma.parsers.gexf( 'static/data/Graph1.gexf', s, function(s) { //callback executed once the data is loaded, use this to set up any aspects of the app which depend on the data }); sigma.plugins.filter - Adds the ability to very simply hide nodes/edges based on a callback function which returns a boolean. This powers the filtering widgets on the page. <input class="form-control" id="min-degree" type="range" min="0" max="0" value="0"> … function applyMinDegreeFilter(e) { var v = e.target.value; $('#min-degree-val').textContent = v; filter .undo('min-degree') .nodesBy( function(n, options) { return this.graph.degree(n.id) >= options.minDegreeVal; },{ minDegreeVal: +v }, 'min-degree' ) .apply(); }; $('#min-degree').change(applyMinDegreeFilter); sigma.plugins.locate - Adds the ability to zoom in on a single node or collection of nodes. Very useful if you’re filtering a very large initial graph function locateNode (nid) { if (nid == '') { locate.center(1); } else { locate.nodes(nid); } }; sigma.renderers.glyphs - Allows you to add custom glyphs to each node. Useful if you have many types of node. Outro This application has been a very fun little project to build. The improvements to Sigma wrought by Linkurious have resulted in an incredibly powerful toolkit to rapidly generate graph based applications with a great degree of flexibility and interaction potential. None of this would have been possible were it not for Python. Python is my right (left, I’m left handed) hand which I use for almost everything. Its versatility and expressiveness make it an incredibly robust Swiss army knife in any data-analysts toolkit.

0
0
29669

article-image-the-openjdk-transition-things-to-know-and-do

Rich Sharples

28 Feb 2019

5 min read

The OpenJDK Transition: Things to know and do

Rich Sharples

28 Feb 2019

5 min read

At this point, it should not be a surprise that a number of major changes were announced in the Java ecosystem, some of which have the potential to force a reassessment of Java roadmaps and even vendor selection for enterprise Java users. Some of the biggest changes are taking place in the Upstream OpenJDK (Open Java Development Kit), which means that users will need to have a backup plan in place for the transition. Read on to better understand exactly what the changes Oracle made to Oracle JDK are, why you should care about them and how to choose the best next steps for your organization. For some quick background, OpenJDK began as a free and open source implementation of the Java Platform, Standard Edition, through a 2006 Sun Microsystems initiative. They made the announcement at the 2006 JavaOne event that Java and the core Java platform would be open sourced. Over the next few years, major components of the Java Platform were released as free, open source software using the GPL. Other vendors - like Red Hat - then got involved with the project about a year later, through a broad contributor agreement which covers the participation of non-Sun engineers in the project. This piece is by Rich Sharples, Senior Director of Product Management at Red Hat , on what Oracle ending free lifecycle support means, and what steps users can take now to prepare for the change. So what is new in the world of Java and JDK? Why should you care? Okay, enough about the history. What are we anticipating for the future of Java and OpenJDK? At the end of January 2019, Oracle officially ended free public updates to Oracle JDK for non-Oracle customer commercial users. These users will no longer be able to get updates without an Oracle support contract. Additionally, Oracle has changed the Oracle JDK license (BCPL) so that commercial use for JDK 11 and beyond will require an Oracle subscription. They do also have a non-proprietary (GPLv2+CPE) distribution but the support policy around that is unclear at this time. This is a big deal because it means that users need to have strategies and plans in place to continue to get the support that Oracle had offered. You may be wondering whether it really is critical to run OpenJDK with support and if you can get away with running it without the support. The truth is that while the open source license and nature of OpenJDK means that you can technically run the software with no commercial support, it doesn’t mean that you necessarily should. There are too many risks associated with running OpenJDK unsupported, including numerous cases of critical vulnerabilities and security flaws. While there is nothing inherently insecure about open source software, when it is deployed without support or even a maintenance plan, it can open up your organization to threats that you may not be prepared to handle and resolve. Additionally, and probably the biggest reason why it is important to have commercial OpenJDK support, is that without a third party to help, you have to be entirely self-sufficient, meaning that you would have to ensure that you have specific engineering efforts that are permanently allocated to monitoring the upstream project and maintaining installations of OpenJDK. Not all organizations have the bandwidth or resources to be able to do this consistently. How can vendors help and what are other key technology considerations? Software vendors, including Red Hat, offer OpenJDK distributions and support to make sure that those who had been reliant on Oracle JDK have a seamless transition to take care of the above risks associated with not having OpenJDK support. It also allows customers and partners to focus on their business software, rather than needing to spend valuable engineering efforts on underlying Java runtimes. Additionally, Java SE 11 has introduced some significant new features, as well as deprecated others, and is the first significant update to the platform that will require users to think seriously about the impact of moving from it. With this, it is important to separate upgrading from Java SE 8 to Java SE 11 from moving from one vendor’s Java SE distribution to another vendor’s. In fact, moving from Java SE distributions - ones that are based in OpenJDK - without needing to change versions should be a fairly simple task. It is not recommended to change both the vendor and the version in one single step. Luckily also, the differences between Oracle JDK and OpenJDK are fairly minor and in fact, are slowly aligning to become more similar. Technology vendors can help in the transition that will inevitably come for OpenJDK - if it has not happened already. Having a proper regression test plan in place to ensure that applications run as they previously did, with a particular focus on performance and scalability is key and is something that vendors can help set up. Auditing and understanding if you need to make any code changes to applications is also a major undertaking that a third-party can help guide you on, that likely includes rulesets for the migrations. Finally, third-party vendors can help you deploy to the new OpenJDK solution and provide patches and security guidance as it comes up, to help keep the environment secure, updated and safe. While there are changes to Oracle JDK, once you find the alternative solution that is best suited for your organization, it should be fairly straightforward and cost-effective to make the transition, and third party vendors have the expertise, knowledge, and experience to help guide users through the OpenJDK transition. OpenJDK Project Valhalla is now in Phase III State of OpenJDK: Past, Present and Future with Oracle Mark Reinhold on the evolution of Java platform and OpenJDK

0
0
29638

article-image-adding-postgis-layers-using-qgis-tutorial

Pravin Dhandre

26 Jul 2018

5 min read

Adding PostGIS layers using QGIS [Tutorial]

Pravin Dhandre

26 Jul 2018

5 min read

Viewing tables as layers is great for creating maps or for simply working on a copy of the database outside the database. In this tutorial, we will establish a connection to our PostGIS database in order to add a table as a layer in QGIS (formerly known as Quantum GIS). Please navigate to the following site to install the latest version LTR of QGIS. On this page, click on Download Now and you will be able to choose a suitable operating system and the relevant settings. QGIS is available for Android, Linux, macOS X, and Windows. You might also be inclined to click on Discover QGIS to get an overview of basic information about the program along with features, screenshots, and case studies. This QGIS tutorial is an excerpt from a book written by Mayra Zurbaran,Pedro Wightman, Paolo Corti, Stephen Mather, Thomas Kraft and Bborie Park, titled PostGIS Cookbook - Second Edition. Loading Database... To begin, create the schema for this tutorial then, download data from the U.S. Census Bureau's FTP site: The shapefile is All Lines for Cuyahoga county in Ohio, which consist of roads and streams among other line features. Extract the ZIP file to your working directory and then load it into your database using shp2pgsql. Be sure to specify the spatial reference system, EPSG/SRID: 4269. When in doubt about using projections, use the service provided by the folks at OpenGeo at the following website: http://prj2epsg.org/search Use the following command to generate the SQL to load the shapefile: shp2pgsql -s 4269 -W LATIN1 -g the_geom -I tl_2012_39035_edges.shp chp11.tl_2012_39035_edges > tl_2012_39035_edges.sql How to do it... Now it's time to give the data we downloaded a look using QGIS. We must first create a connection to the database in order to access the table. Get connected and add the table as a layer by following the ensuing steps: Click on the Add PostGIS Layers icon: Click on the New button below the Connections drop-down menu. Create a new PostGIS connection. After the Add PostGIS Table(s) window opens, create a name for the connection and fill in a few parameters for your database, including Host, Port, Database, Username, and Password: Once you have entered all of the pertinent information for your database, click on the Test Connection button to verify that the connection is successful. If the connection is not successful, double-check for typos and errors. Additionally, make sure you are attempting to connect to a PostGIS-enabled database. If the connection is successful, go ahead and check the Save Username and Save Password checkboxes. This will prevent you from having to enter your login information multiple times throughout the exercise. Click on OK at the bottom of the menu to apply the connection settings. Now you can connect! Make sure the name of your PostGIS connection appears in the drop-down menu and then click on the Connect button. If you choose not to store your username and password, you will be asked to submit this information every time you try to access the database. Once connected, all schemas within the database will be shown and the tables will be made visible by expanding the target schema. Select the table(s) to be added as a layer by simply clicking on the table name or anywhere along its row. Selection(s) will be highlighted in blue. To deselect a table, click on it a second time and it will no longer be highlighted. Select the tl_2012_39035_edges table that was downloaded at the beginning of the tutorial and click on the Add button, as shown in the following screenshot: A subset of the table can also be added as a layer. This is accomplished by double-clicking on the desired table name. The Query Builder window will open, which aids in creating simple SQL WHERE clause statements. Add the roads by selecting the records where roadflg = Y. This can be done by typing a query or using the buttons within Query Builder: Click on the OK button followed by the Add button. A subset of the table is now loaded into QGIS as a layer. The layer is strictly a static, temporary copy of your database. You can make whatever changes you like to the layer and not affect the database table. The same holds true the other way around. Changes to the table in the database will have no effect on the layer in QGIS. If needed, you can save the temporary layer in a variety of formats, such as DXF, GeoJSON, KML, or SHP. Simply right-click on the layer name in the Layers panel and click on Save As. This will then create a file, which you can recall at a later time or share with others. The following screenshot shows the Cuyahoga county road network: You may also use the QGIS Browser Panel to navigate through the now connected PostGIS database and list the schemas and tables. This panel allows you to double-click to add spatial layers to the current project, providing a better user experience not only of connected databases, but on any directory of your machine: How it works... You have added a PostGIS layer into QGIS using the built-in Add PostGIS Table GUI. This was achieved by creating a new connection and entering your database parameters. Any number of database connections can be set up simultaneously. If working with multiple databases is more common for your workflows, saving all of the connections into one XML file and would save time and energy when returning to these projects in QGIS. To explore more on 3D capabilities of PostGIS, including LiDAR point clouds, read PostGIS Cookbook - Second Edition. Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data Top 7 libraries for geospatial analysis Learning R for Geospatial Analysis

0
0
29598

article-image-introducing-weld-a-runtime-written-in-rust-and-llvm-for-cross-library-optimizations

Bhagyashree R

24 Sep 2019

5 min read

Introducing Weld, a runtime written in Rust and LLVM for cross-library optimizations

Bhagyashree R

24 Sep 2019

5 min read

Weld is an open-source Rust project for improving the performance of data-intensive applications. It is an interface and runtime that can be integrated into existing frameworks including Spark, TensorFlow, Pandas, and NumPy without changing their user-facing APIs. The motivation behind Weld Data analytics applications today often require developers to combine various functions from different libraries and frameworks to accomplish a particular task. For instance, a typical Python ecosystem application selects some data using Spark SQL, transforms it using NumPy and Pandas, and trains a model with TensorFlow. This improves developers’ productivity as they are taking advantage of functions from high-quality libraries. However, these functions are usually optimized in isolation, which is not enough to achieve the best application performance. Weld aims to solve this problem by providing an interface and runtime that can optimize across data-intensive libraries and frameworks while preserving their user-facing APIs. In an interview with Federico Carrone, a Tech Lead at LambdaClass, Weld’s main contributor, Shoumik Palkar shared, “The motivation behind Weld is to provide bare-metal performance for applications that rely on existing high-level APIs such as NumPy and Pandas. The main problem it solves is enabling cross-function and cross-library optimizations that other libraries today don’t provide.” How Weld works Weld serves as a common runtime that allows libraries from different domains like SQL and machine learning to represent their computations in a common functional intermediate representation (IR). This IR is then optimized by a compiler optimizer and JIT’d to efficient machine code for diverse parallel hardware. It performs a wide range of optimizations on the IR including loop fusion, loop tiling, and vectorization. “Weld’s IR is natively parallel, so programs expressed in it can always be trivially parallelized,” said Palkar. When Weld was first introduced it was mainly used for cross-library optimizations. However, over time people have started to use it for other applications as well. It can be used to build JITs or new physical execution engines for databases or analytics frameworks, individual libraries, target new kinds of parallel hardware using the IR, and more. To evaluate Weld’s performance the team integrated it with popular data analytics frameworks including Spark, NumPy, and TensorFlow. This prototype showed up to 30x improvements over the native framework implementations. While cross library optimizations between Pandas and NumPy also improved performance by up to two orders of magnitude. Source: Weld Why Rust and LLVM were chosen for its implementation The first iteration of Weld was implemented in Scala because of its algebraic data types, powerful pattern matching, and large ecosystem. However, it did have some shortcomings. Palkar shared in the interview, “We moved away from Scala because it was too difficult to embed a JVM-based language into other runtimes and languages.” It had a managed runtime, clunky build system, and its JIT compilations were quite slow for larger programs. Because of these shortcomings the team wanted to redesign the JIT compiler, core API, and runtime from the ground up. They were in the search for a language that was fast, safe, didn’t have a managed runtime, provided a rich standard library, functional paradigms, good package manager, and great community background. So, they zeroed-in on Rust that happens to meet all these requirements. Rust provides a very minimal, no setup required runtime. It can be easily embedded into other languages such as Java and Python. To make development easier, it has high-quality packages, known as crates, and functional paradigms such as pattern matching. Lastly, it is backed by a great Rust Community. Read also: “Rust is the future of systems programming, C is the new Assembly”: Intel principal engineer, Josh Triplett Explaining the reason why they chose LLVM, Palkar said in the interview, “We chose LLVM because its an open-source compiler framework that has wide use and support; we generate LLVM directly instead of C/C++ so we don’t need to rely on the existence of a C compiler, and because it improves compilation times (we don’t need to parse C/C++ code).” In a discussion on Hacker News many users listed other Weld-like projects that developers may find useful. A user commented, “Also worth checking out OmniSci (formerly MapD), which features an LLVM query compiler to gain large speedups executing SQL on both CPU and GPU.” Users also talked about Numba, an open-source JIT compiler that translates Python functions to optimized machine code at runtime with the help of the LLVM compiler library. “Very bizarre there is no discussion of numba here, which has been around and used widely for many years, achieves faster speedups than this, and also emits an LLVM IR that is likely a much better starting point for developing a “universal” scientific computing IR than doing yet another thing that further complicates it with fairly needless involvement of Rust,” a user added. To know more about Weld, check out the full interview on Medium. Also, watch this RustConf 2019 talk by Shoumik Palkar: https://www.youtube.com/watch?v=AZsgdCEQjFo&t Other news in Programming Darklang available in private beta GNU community announces ‘Parallel GCC’ for parallelism in real-world compilers TextMate 2.0, the text editor for macOS releases

0
0
29549

How-To Tutorials - Programming

A capability model for microservices

Python 3.8 is now available with walrus operator, positional-only parameters support for Vectorcall, and more

How to work with classes in Typescript

What are Microservices?

Python Design Patterns in Depth: The Factory Pattern

Regular expressions in AWK programming: What, Why, and How

Firefox Nightly browser: Debugging your app is now fun with Mozilla’s new ‘time travel’ feature

GopherCon 2019: Go 2 update, open-source Go library for GUI, support for WebAssembly, TinyGo for microcontrollers and more

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

2019 Stack Overflow survey: A quick overview

Trending Topics

Python Design Patterns in Depth – The Observer Pattern

Creating a graph application with Python, Neo4j, Gephi & Linkurious.js

The OpenJDK Transition: Things to know and do

Adding PostGIS layers using QGIS [Tutorial]

Introducing Weld, a runtime written in Rust and LLVM for cross-library optimizations

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access