Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7007 Articles
article-image-understanding-result-type-in-swift-5-with-daniel-steinberg
Sugandha Lahoti
16 Dec 2019
4 min read
Save for later

Understanding Result Type in Swift 5 with Daniel Steinberg

Sugandha Lahoti
16 Dec 2019
4 min read
One of the first things many programmers add to their Swift projects is a Result type. From Swift 5 onwards, Swift included an official Result type. In his talk at iOS Cong SG 2019, Daniel Steinberg explained why developers would need a Result type, how and when to use it, and what map and flatMap bring for Result. Swift 5, released in March this year hosts a number of key features such as concurrency, generics, and memory management. If you want to learn and master Swift 5, you may like to go through Mastering Swift 5, a book by Packt Publishing. Inside this book, you'll find the key features of Swift 5 easily explained with complete sets of examples. Handle errors in Swift 5 easily with Result type Result type gives a simple clear way of handling errors in complex code such as asynchronous APIs. Daniel describes the Result type as a hybrid of optionals and errors. He says, “We've used it like optionals but we've got the power of errors we know what went wrong and we can pull that error out at any time that we need it. The idea was we have one return type whether we succeeded or failed. We get a record of our first error and we are able to keep going if there are no errors.” In Swift 5, Swift’s Result type is implemented as an enum that has two cases: success and failure. Both are implemented using generics so they can have an associated value of your choosing, but failure must be something that conforms to Swift’s Error type. Due to the addition of Standard Library, the Error protocol now conforms to itself and makes working with errors easier. Image taken from Daniel’s presentation Result type has four other methods namely map(), flatMap(), mapError(), and flatMapError(). These methods enables us to do many other kinds of transformations using inline closures and functions. The map() method looks inside the Result, and transforms the success value into a different kind of value using a closure specified. However, if it finds failure instead, it just uses that directly and ignores the transformation. Basically, it enables the automatic transformation of a value (error) through a closure, but only in case of success (failure), otherwise, the Result is left unmodified. flatMap() returns a new result, mapping any success value using the given transformation and unwrapping the produced result. Daniel says, “If I need recursion I'm often reaching for flat map.” Daniel adds, “Things that can’t fail use map() and things that can fail use flatmap().” mapError(_:) returns a new result, mapping any failure value using the given transformation and flatMapError(_:) returns a new result, mapping any failure value using the given transformation and unwrapping the produced result. flatMap() (flatMapError()) is useful when you want to transform your value (error) using a closure that returns itself a Result to handle the case when the transformation fails. Using a Result type can be a great way to reduce ambiguity when dealing with values and results of asynchronous operations. By adding convenience APIs using extensions we can also reduce boilerplate and make it easier to perform common operations when working with results, all while retaining full type safety. You can watch Daniel Steinberg’s full video on YouTube where he explains Result Type with detailed code examples and points out common mistakes. If you want to learn more about all the new features of Swift 5 programming language then check out our book, Mastering Swift 5 by Jon Hoffman. Swift 5 for Xcode 10.2 is here! Developers from the Swift for TensorFlow project propose adding first-class differentiable programming to Swift Apple releases native SwiftUI framework with declarative syntax, live editing, and support of Xcode 11 beta.
Read more
  • 0
  • 0
  • 31826

article-image-harrison-ferrone-why-c-preferred-programming-language-building-games-unity
Sugandha Lahoti
16 Dec 2019
6 min read
Save for later

Harrison Ferrone explains why C# is the preferred programming language for building games in Unity

Sugandha Lahoti
16 Dec 2019
6 min read
C# is one of the most popular programming languages which is used to create games in the Unity game engine. Experiences (games, AR/VR apps, etc) built with Unity have reached nearly 3 billion devices worldwide and were installed 24 billion times in the last 12 months. We spoke to Harrison Ferrone, software engineer, game developer, creative technologist and author of the book, “Learning C# by Developing Games with Unity 2019”. We talked about why C# is used for game designing, the recent Unity 2019.2 release, and some tips and tricks tips for those developing games with Unity. On C# and Game development Why is C# is widely-used to create games? How does it compare to C++? How is C# being used in other areas such as mobile and web development? I think Unity chose to move forward with C# instead of Javascript or Boo because of its learning curve and its history with Microsoft. [Boo was one of the three scripting languages for the Unity game engine until it was dropped in 2014]. In my experience, C# is easier to learn than languages like C++, and that accessibility is a huge draw for game designers and programmers in general. With Xamarin mobile development and ASP.NET web applications in the mix, there’s really no stopping the C# language any time soon. What are C# scripts? How are they useful for creating games with Unity? C# scripts are the code files that store behaviors in Unity, powering everything the engine does. While there are a lot of new tools that will allow a developer to make a game without them, scripts are still the best way to create custom actions and interactions within a game space. Editor’s Tip: To get started with how to create a C# script in Unity, you can go through Chapter 1 of Harrison Ferrone’s book Learning C# by Developing Games with Unity 2019. On why Harrison wrote his book, Learning C# by Developing Games with Unity 2019 Tell us the motivation behind writing your book Learning C# by Developing Games with Unity 2019. Why is developing Unity games a good way to learn the C# programming language? Why do you prefer Unity over other game engines? My main motivation for writing the book was two-fold. First, I always wanted to be a writer, so marrying my love for technology with a lifelong dream was a no-brainer. Second, I wanted to write a beginner’s book that would stay true to a beginner audience, always keeping them in mind. In terms of choosing games as a medium for learning, I’ve found that making something interesting and novel while learning a new skill-set leads to greater absorption of the material and more overall enjoyment. Unity has always been my go-to engine because its interface is highly intuitive and easy to get started with. You have 3 years of experience building iOS applications in Swift. You also have a number of articles and tutorials on the same on the Ray Wenderlich website. Recently, you started branching out into C++ and Unreal Engine 4. How did you get into game design and Unity development? What made you interested in building games?  I actually got into Game design and Unity development first, before all the iOS and Swift experience. It was my major in university, and even though I couldn’t find a job in the game industry right after I graduated, I still held onto it as a passion. On developing games The latest release of Unity, Unity 2019.2 has a number of interesting features such as ProBuilder, Shader Graph, and effects, 2D Animation, Burst Compiler, etc. What are some of your favorite features in this release? What are your expectations from Unity 2019.3?  I’m really excited about ProBuilder in this release, as it’s a huge time saver for someone as artistically challenged as I am. I think tools like this will level the playing field for independent developers who may not have access to the environment or level builders. What are some essential tips and tricks that a game developer must keep in mind when working in Unity? What are the do’s and don’ts? I’d say the biggest thing to keep in mind when working with Unity is the component architecture that it’s built on. When you’re writing your own scripts, think about how they can be separated into their individual functions and structure them like that - with purpose. There’s nothing worse than having a huge, bloated C# script that does everything under the sun and attaching it to a single game object in your project, then realizing it really needs to be separated into its component parts. What are the biggest challenges today in the field of game development? What is your advice for those developing games using C#? Reaching the right audience is always challenge number one in any industry, and game development is no different. This is especially true for indie game developers as they have to always be mindful of who they are making their game for and purposefully design and program their games accordingly. As far as advice goes, I always say the same thing - learn design patterns and agile development methodologies, they will open up new avenues for professional programming and project management. Rust has been touted as one of the successors of the C family of languages. The present state of game development in Rust is also quite encouraging. What are your thoughts on Rust for game dev? Do you think major game engines like Unity and Unreal will support Rust for game development in the future? I don’t have any experience with Rust, but major engines like Unity and Unreal are unlikely to adopt a new language because of the huge cost associated with a changeover of that magnitude. However, that also leaves the possibility open for another engine to be developed around Rust in the future that targets games, mobile, and/or web development. About the Author Harrison Ferrone was born in Chicago, IL, and raised all over. Most days, you can find him creating instructional content for LinkedIn Learning and Pluralsight, or tech editing for the Ray Wenderlich website. After a few years as an iOS developer at small start-ups, and one Fortune 500 company, he fell into a teaching career and never looked back. Throughout all this, he's bought many books, acquired a few cats, worked abroad, and continually wondered why Neuromancer isn't on more course syllabi. You can follow him on Linkedin, and GitHub.
Read more
  • 0
  • 0
  • 42463

article-image-ten-tips-to-successfully-migrate-from-on-premise-to-microsoft-azure
Savia Lobo
13 Dec 2019
11 min read
Save for later

Ten tips to successfully migrate from on-premise to Microsoft Azure 

Savia Lobo
13 Dec 2019
11 min read
The decision to start using Azure Cloud Services for your IT infrastructure seems simple. However, to succeed, a cloud migration requires hard work and good planning. At Microsoft Ignite 2018, Eric Berg, an Azure Lead Architect at COMPAREX, a Microsoft MVP Azure + Cloud and Data Center Management, shared ‘Ten tips for a successful migration from on-premises to Azure’, based on their day-to-day learnings. Eric shares known issues, common pitfalls, and best practices to get started. Further Reading To gain a deep understanding of various Azure services related to infrastructure, applications, and environments, you can check out our book Microsoft Azure Administrator – Exam Guide AZ-103 by Sjoukje Zaal. This book is also an effective guide for acquiring the skills needed to pass the Exam AZ-103, with effective mock tests and solutions so that you can confidently crack this exam. Tip #1: Have your Azure Governance Set One needs to have a basic plan of what they are going to do with Azure. Consider Azure Governance as the basis for Cloud Adoption. Berg says, “if you don't have a plan for what you do with Azure, it will hurt you.” To run something on Azure is good, but to keep it secure is the key thing. Here, Governance rule sets help users to audit and figure out if everything is running as expected. One of the key parts of Azure Governance is Networking. Hence one should consider a networking concept that suits both the company and the business. Microsoft is moving really fast; in 2018, to connect to the US and Europe you had to use a VPN then came global v-net peering, and now we have ESRI virtual WAN. Such advancements allow a concept to further grow and always use the top of the edge technologies while adoption of such a rule set enables customers to try a lot of things on their own. Tip #2: Think about different requirements From an IT perspective, every organization wants control, focus on its IT, and also to ensure that everything is compliant. Many organizations also want to write policies in place. On the other hand, the human resource department section wants to be totally agile and innovative and wants to consume services and self-service without feeling the need to communicate with IT. “I've seen so many human resource departments doing their own contracts with external partners building some fancy new hiring platforms and IT didn't know anything about it,” Berg points out. When it comes to Cloud, each and every member of the company should be aware and should be involved. It is simply not just an IT-dependent decision, but is company dependent. Tip #3: Assess your infrastructure Berg says organizations should assess their environment. Migrating your servers as they are to Azure is not the right thing to do. This is because in Azure the decision between 8 and 16 gigabytes of RAM is a decision between 100 and 200 percent of the cost. Hence, right scaling or a good assessment is extremely important and this cannot be achieved by running a script once for 10 minutes and you know what your VMs are doing. Instead, you should at least run an assessment for one month or even three months to see some peaks and some low times. This is like a good assessment where you know what you really need to migrate your systems better. Keep a check on your inventory and also on your contracts to check if you are allowed to migrate your ERP system or CRM system to Azure. As some contracts state that the “deployment of this solution outside of the premises of the company needs some extra contract and some extra cost,” Berg warns. Migrating to Azure is technically easy but difficult from a contract perspective. Also, you should define your needs for migration to a cloud platform. If you don't get value out of your migration don't do it. Berg advises, don't migrate to Azure because everybody does or because it's cool or fancy. Tip #4: Do not rebuild your on-premises structures on Cloud Cloud needs trust. Organizations often try to bring in the old stuff on the on-premises infrastructures such as the external DMZ, the internal DMZ, and also 15 security layers. Berg said they use intune, a cloud-based service in the enterprise mobility management (EMM) space that helps enable your workforce to be productive while keeping your corporate data protected, along with Office 365 on a cloud.  In tune doesn't stick to a DMZ; even if you want to deploy your application or use the latest tech such as BOTS, cognitive services, etc. It may not fit totally into a structured network design on the cloud. On the other hand, there will be disconnected subscriptions, i.e. there will be subscriptions with no connection to your on-premises network. This problem has to be dealt with on a security level. New services need new ways. If you are not agile your IT won't be agile. If you need 16 days or six weeks to deploy a server and you want to stick to those rules and processes, then Azure won't be beneficial for you as there will be no value in it for you. Tip #5: Azure consumption is billed If you spin up a VM that costs $25,000 a month you have to pay for it. The M-series VMs have 128 cores 4 terabytes of RAM and are simply amazing. If you deployed using Windows Server and SQL Server Enterprise, the cost goes up to $58,000 a month for just one VM. When you migrate to Azure and you start integrating new things you probably have to change your own business model. To implement tech such as facial recognition, and others you have to set up a cost management tool for usage tracking. There are many usage APIs and third-party tools available. Proper cost management into the Azure infrastructure helps to divide costs. If you put everything into one subscription, one resource group, where everyone is the owner. Here, the problem won’t be the functioning but you will not be able to figure out who's responsible for what. Instead, a good structure of subscriptions, a good role-based access control, a good tagging policy will help you to figure out cost better. Tip #6: Identity is the new perimeter Azure Ad is the center of everything. To access a user’s data center is not easy these days as it needs access within the premises, then into the data center, then log into the user’s own premises infrastructure. If anyone has a user’s login ID, they are inside the user’s Azure AD, the user’s visa VPN, and also on their on-premises data center. Hence identity is a key part of security. “So, don’t think about using MFA, use MFA. Don't think about using Privileged Identity Management, use it because that's the only way to secure your infrastructure probably and get an insight into who is using what in my infrastructure and how is it going,” Berg warns. In the modern workplace, one can work from anywhere. However, one needs to have proper security levels in place. Secure devices, secure identity, secure access ways to MFA, and so on. Stay cautious. Tip #7: Include your users Users are the most important part of any ecosystem. So, when you migrate servers or the entire on-premise architecture, inform them. What if you have a CRM system fully in the cloud and there's no local cache on the system anymore? This won't fit the needs of your customers or internal customers and this is why organizations should inform them of their plans. They should also ask them what they really need and this will, in turn, help the organizations. Berg illustrated this point with a project in Germany that includes a customer with a very specific project that wanted the product to decrease their response times. The client needs up to two days to answer a customer's email because the project product is very complex and they have a very spread documentation library and it's hard. Their internal goal is to bring down the product response to ten minutes--from two days to 10 minutes. Berg said they considered using a bot, some cognitive services and Azure search, and a plug-in an Outlook. So you get the mail you just search for your product and everything will be figured out. The documentation, the fact sheets, and the standard email template for answering such a thing. The solution proposed was good; both Berg and the IT liked it. However, when the sales team was asked, they said such a solution would steal their jobs. The mistake here was Sales was not included in the process of finding this solution. To rectify this, organizations should include all stakeholders. Focus on benefits, have some key users because they will help you to spread the word over. In the above case, explain and evangelize the sales teams as they are afraid because they don't know and don't understand what happens if you have a bot and some cognitive services to figure out which document is right. This won’t steal their job but instead, help to do better at their job with improved efficiency. Train and educate so they are able to use it, check processes and consider changes. Managed services can help you focus. Back up, monitoring, patching, this is something somebody can do for you. Instead, organizations can now focus on after the migration such as integrating new services, improving right scaling, optimizing cost, optimizing performance, staying up-to-date with all the changes in Azure, etc. Tip #8: Consider Transformation instead of Migration Consider a transformation instead of a migration. Build some logical blocks, don't move an ERP system without your database or the other way around. Berg suggests: To adopt technical and licensing showstoppers define your infrastructure requirements check your compatibility to migrate update helpdesk about SLAs Ask if Azure is really helping me (to figure out or to cover my assets or is it getting better or maybe worse). Tip #9: Keep up to date Continuous learning and continuous knowledge are key to growth. As Azure releases a lot of changes very often, users are notified of these latest updates via emails or via Azure news. Organizations should review their architecture on a regular basis, Berg says. VPN to global v-net peering to Global WAN so that you can change your infrastructure quite fast. Audit your governance not on a yearly basis may be monthly or quarterly. Consider changes fast; don't think two years about a change because then it will not be any more interesting. If there's a new opportunity, grab it, use it and three weeks later probably drop it away. But avoid thinking for two months or more else it will be too late. Tip #10: Plan for the future Do some end to end planning, think about the end-to-end solution; who's using it, what's my back end on this, and so on. Save money and forecast your costs. Keep an eye on resources that probably spread because someone runs the script without knowing what they are doing.  Simply migrating an IIS server with a static website to Azure is not actual cloud migration. Instead, customers should consider moving their servers to a static storage website, to a web app, etc. but not in the Windows VM. Berg concludes by saying that an important migration step is to move from infrastructure. Everybody migrates infrastructure to Azure because that's easy because it's just migrating from one VM to another VM. Customers should not ‘only’ migrate. They should also start an optimization, move forward to platform services, be more agile, think about new ways and most importantly get rid of all on-premise old stuff. Berg adds, “In five years probably nobody will talk about infrastructure as a service anymore because everybody has migrated and optimized it already.” To stay more compliant with corporate standards and SLAs, learn how to configure Azure subscription policies with “Microsoft Azure Administrator – Exam Guide AZ-103” by Packt Publishing. 5 reasons Node.js developers might actually love using Azure [Sponsored by Microsoft] Azure Functions 3.0 released with support for .NET Core 3.1! Microsoft announces Azure Quantum, an open cloud ecosystem to learn and build scalable quantum solutions
Read more
  • 0
  • 0
  • 47349

article-image-mongodbs-cto-eliot-horowitz-on-whats-new-in-mongodb-4-2-ops-manager-atlas-and-more
Bhagyashree R
13 Dec 2019
9 min read
Save for later

MongoDB’s CTO Eliot Horowitz on what’s new in MongoDB 4.2, Ops Manager, Atlas, and more

Bhagyashree R
13 Dec 2019
9 min read
At MongoDB.local London event that happened in September this year, Eliot Horowitz, the CTO and Co-Founder of MongoDB took to the stage to talk about the latest features in MongoDB 4.2. He also discussed the updates to Ops Manager and MongoDB Atlas, and new cloud services including integrated full-text search, the Realm development platform, and MongoDB Data Lake. MongoDB.local is a one-day educational conference that brings together people who develop MongoDB and its ecosystem, as well as fellow MongoDB users. This is where you can get a deeper knowledge of the latest in MongoDB, tools, and best practices directly from the MongoDB experts. [box type="shadow" align="" class="" width=""] Further Learning This article lists the various features that have landed in MongoDB 4.2. To get a practical understanding of administering database applications both on-premises and on the cloud, check out our book Mastering MongoDB 4.x - Second Edition by Alex Giamas. [/box] Exciting features in MongoDB 4.2 Distributed transactions MongoDB 4.0 came with support for multi-document transactions on replica sets. This support was extended in MongoDB 4.2 by introducing distributed transactions. These add support for multi-document transactions on sharded clusters and also include the existing support for multi-document transactions on replica sets. Distributed transactions have the same syntax and semantics as the replica set transactions. They are fully ACID compliant and have conversational syntax. Another important update is that there is now no limit to how big a transaction can be. “It is just a matter of how much hardware you have and what the hardware can handle,” Horowitz adds. Also, previously the sharding system did not allow changing the shard key as it often meant moving a document from one shard to another. Starting with MongoDB 4.2, you are allowed to change the shard key and that too very easily. Now, if you change the value of a shard key and a document is required to be moved from one shard to another, MongoDB will automatically wrap that update behind the scenes inside of a transaction. This is one step towards ensuring that there is no “difference between a sharded MongoDB cluster and a replica set,” Horowitz shared. Another function that Horowitz talked about was global cluster locale reassignment. For instance, suppose you have geo zone sharding with some data residing in Europe and some other data in the US. When the users move, you can just change the value of their location field and that data will be automatically moved from Europe to the US using a transaction. Retryable reads and writes Retryable reads and writes enable the MongoDB drivers to automatically retry certain transactions if they encounter network errors or if they were not able to find a healthy primary in the replica sets or sharded cluster. Starting with MongoDB 4.2, this feature is enabled by default. One of the main goals of this feature is ensuring that whenever there is some change in the infrastructure whether it is for planned maintenance or know crashes, the application code shouldn’t care or be affected. Explaining through an example, he shared, “You have got a web page that does 20 different database operations. Rather than having to reload the entire thing, rather than having to wrap the entire web page in some sort of loop the driver under the covers can just say this I am going to retry this operation.” He adds, “So if a write fails it will retry that write automatically and will have a contract with the server to guarantee that every write happens once and only once.” Much more expressive updates MongoDB’s query language is now much richer and expressive with the support for aggregations and other modern use-cases including geo-based search, graph search, and text search. You can do things like sums, handle arrays, and other math directly through an update statement. “Let’s imagine you’ve got a document and all you want to do is to set the value of A to the value of B+C in every document. Previously, you couldn’t do that and now you can do very simple arithmetic in MongoDB.” On-demand materialized views The MongoDB aggregation pipeline, a framework for data aggregation, consists of stages. Each stage is responsible for transforming a document as they pass through the pipeline. MongoDB 4.2 introduces a new stage called ‘$merge’ that allows you to create collections based on aggregation and update those created collections efficiently. The $out stage already allows creating collections based on an aggregation. It takes the results of an aggregation and puts it into a new collection. But the difference is that it replaces the collections entire contents with the new results. As it regenerates the entire collection every time, it ends up consuming a lot of CPU and IO. The new $merge feature can incorporate the pipeline results into an existing output collection rather than fully replacing the collection. This enables users to create on-demand materialized views, where the content of the output collection is perennially updated “maybe every minute, every hour, or maybe every day depending on the use case.” Wildcard indexes In MongoDB 4.2, we have wildcard indexes that let you index an entire document or a subset of a document. It is introduced to support queries against unknown or arbitrary fields. Horowitz explains, “Previously, you were required to either add an index for every attribute you care about or put these into an array...With wild card indexes, you can actually just say “hey index the entire document or index this entire subset of the document.” What will happen is we will actually index everything in there so you can just do any query that you want.” However, keep in mind that wildcard indexes are not really designed to replace workload-based index planning. It is suitable for cases when you have polymorphic patterns in your data. Examples of data containing polymorphic pattern include product catalogs, e-commerce, social data, and IoT applications. Modern operations Along with offering such great features, it is also important for a database to provide developers a great operational experience. It should have great availability, a powerful monitoring and alerting system, backup, self-service, and APIs. To manage MongoDB we have two options: MongoDB Ops Manager and MongoDB Atlas. MongoDB Ops Manager MongoDB Ops Manager is the “best way to run MongoDB on-premises.” Its backup system offers great features such as point-in-time restore and queryable snapshots. In previous versions, however, it was a complex system and in many cases expensive to run. Starting with MongoDB 4.2, it was completely overhauled to be much simpler. Now, there is no concept of “heads.” This release also introduces a new Kubernetes operator for Ops Manager. On-premise users are moving to private cloud and for that, they mainly rely on Kubernetes. This is why you now have the Kubernetes operator for Ops Manager. It will enable you to directly control the Ops Manager through your Kubernetes interfaces. MongoDB Atlas MongoDB Atlas is a fully-managed MongoDB as a service. It now has integration with Terraform, a tool used for building, changing, and versioning infrastructure. There is also a new feature called Atlas Auto Scaling for fully-automated capacity management. Once you enable the feature, Atlas will monitor resource utilization metrics in real-time and automatically scale up or down your VM. In terms of security, MongoDB Atlas is now ISO 27001 certified and PCI compliant. It also supports field-level encryption (FLE) beta. This enables applications to encrypt fields in documents before transmitting data to the server. This encryption happens on the client-side and is completely transparent to the developers. Another key update in this release is the introduction of MongoDB Atlas Full-Text Search (Beta). Atlas now has a rich-text search functionality against your fully managed MongoDB databases. Horowitz explains, “Today, you typically have to take in MongoDB and synchronize it to some other system (such as Elasticsearch) and under those systems is Apache Lucene.” The team decided to remove this “middleman” to let users go “straight from MongoDB to Lucene.” Horowitz also talked about MongoDB Atlas Data Lake that enables you to quickly query data in any format on Amazon S3 using the MongoDB Query Language (MQL). It lets you run regular MongoDB queries against data in Amazon S3. It supports any file format including JSON, BSON, CSV, TSV, Avro, and Parquet formats. MongoDB Realm In May this year, MongoDB acquired Realm, a database for mobile applications. Horowitz gave some insight into what future plans he has for Realm. “MongoDB is investing in a lot of the things that Realm users have been asking for a long time or taking a lot of the resources we have and making sure that we can accelerate the core realm roadmap as fast as possible.” Among the new features that RealmDB will get are new data types for unstructured data such as Dicts, Sets, Any/Mixed type for polymorphic data. It will have cascading deletes, inheritance, analytics and transformational queries, support for more platforms. Horowitz plans to integrate Realm more tightly with MongoDB and together they will be called MongoDB Realm.  It will be “the best way to build data-intensive applications anywhere.” This article walked you through the new features in MongoDB 4.2, Ops Manager, Atlas, and much more presented by Eliot Horowitz in his MongoDB.local talk. Check out our book Mastering MongoDB 4.x - Second Edition by Alex Giamas to become a successful MongoDB expert.  This book dives into niche areas of managing databases (such as modeling and querying databases) along with various administration techniques in MongoDB, and much more. MongoDB is partnering with Alibaba Homebrew removes MongoDB from core formulas MongoDB withdraws controversial Server Side Public License from the Open Source Initiative’s approval process
Read more
  • 0
  • 0
  • 20942

article-image-new-qgis-3d-capabilities-and-future-plans-presented-by-martin-dobias-a-core-qgis-developer
Bhagyashree R
13 Dec 2019
8 min read
Save for later

New QGIS 3D capabilities and future plans presented by Martin Dobias, a core QGIS developer

Bhagyashree R
13 Dec 2019
8 min read
In his talk titled QGIS 3D: current state and future at FOSS4G 2019, Martin Dobias, CTO of Lutra Consulting talked about the new features in QGIS 3D. He also shared a list of features that can be added to QGIS 3D to make 3D rendering in QGIS more powerful. Free and Open Source Software for Geospatial (FOSS4G) 2019 was a five-day event that happened from Aug 26-30 at Bucharest. FOSS4G is a conference where geospatial professionals, students, professors come together to discuss about free and open-source software for geospatial storage, processing, and visualization. [box type="shadow" align="" class="" width=""] Further Learning This article explores the new features in QGIS 3D native rendering support. If you are embarking on your QGIS journey, check out our book Learn QGIS - Fourth Edition by Andrew Cutts and Anita Graser. In this book, you will explore QGIS user interface, load your data, edit, and then create data. QGIS often surprises new users with its mapping capabilities; you will discover how easily you can style and create your first map. But that’s not all! In the final part of the book, you’ll learn about spatial analysis, powerful tools in QGIS, and conclude by looking at Python processing options. [/box] 3D visualization in QGIS QGIS 3D native rendering support was introduced in QGIS 3. Prior to that, developers had to rely on third-party tools like NVIZ from GRASS GIS, GVIZ, Globe plugin, Qgis2threejs plugin, and more. Though these worked, “the integration was never great with the rest of QGIS,” remarks Dobias. In 2017, the QGIS grand proposal was accepted to start the initial work on QGIS 3D. A year later, QGIS 3 was announced with an interactive, fully integrated interface for you to work in 3D. QGIS 3 has a separate interface dedicated to 3D data visualization called 3D map view, which you can access from the View context menu. After you select this option, a new window will open that you can dock to the main panel. In the new window you will see all the layers that are visible in the main map view and rendered digital elevation and vector data in 3D. With native QGIS 3D support you can render raster, vector, and mesh layers. It also provides various methods for visualizing and styling the 3D data depending on the data or geometry type. Here are some of the features that Dobias talked about: Point-based rendering Starting with QGIS 3, you have three ways to render points: Basic symbols: You can use symbols such as spheres, cylinders, boxes, or cubes, apply a color, and apply a few transformations. 3D models loaded from a file: You can use the Open Asset Import Library (Assimp) to load the 3D models. This library allows you to import and export a wide-range of 3D model file formats including Collada, Wavefront, and more. After loading the model you can do tweaks like changing the color. However, there are currently limitations like “you can only change the color of the whole model and not the individual components,” Dobias mentioned. Billboard rendering: This feature was contributed by Ismail Sunni as a part of the Google Summer of Code (GSoC) 2019 project, QGIS 3D Improvement. The billboard support, which was released in QGIS 3.10, will allow you to render points as a billboard in 3D map view. Line rendering For line rendering, you have two options: Simple lines: In this approach, you define the width of a line in pixels and it does not change when you zoom-in or zoom-out. This technique preserves Z coordinates. Buffered lines: In this approach, you define the line width in map units. So, as soon as you start zooming in the line will appear zoomed out. Buffered rendering ignores z-coordinates. Polygon rendering For polygon rendering, you have four different options: Planar 3D entity: QGIS 3 provides a method to draw polygon geometries as planar polygons. Extrusion: Extrusion is a way to create 3D symbology from 2D features by stretching it vertically. QGIS now supports extruding a planar polygon to make it look like a box. You can specify a constant height or you can write an expression that determines it. Polyhedral surfaces or PolygonZ: QGIS 3 has a provision for creating polyhedral surfaces. Polyhedron is simply a three-dimensional solid which consists of a collection of polygons, usually joined at their edges. Triangular mesh or MultiPatch: It is similar to polyhedral surfaces, the only difference is that it consists of individual triangles. 3D map tools Navigation: You can use mouse and keyboard to navigate the map. Now, with the latest QGIS release you can also perform navigation using on-screen controls. Dobias said, “This is good for beginners when they are not completely sure about other means of moving the map.” Identify tool: With this tool, you can interact with the map canvas and get information on features in a pop-up window. It works exactly like its 2D counterpart, the only difference being it will be on a 3D entity. Measurement tool: This tool was also built as part of the GSoC project. This will enable you to measure real distances between given points. Other 3D capabilities Print layout support QGIS already had support to save the 3D map view as an image file, but for print layouts you needed to perform multiple steps. You had to first save 3D scene images and then embed them within print layouts. Also, the resolution of the saved images was limited to the size of the 3D window. To simplify the use of 3D scenes for printing and allow high resolution scene exports, QGIS 3 supports a new type of layout item that is capable of high resolution exports of 3D map scenes. Camera animation support With the QGIS 3D support, now users can define keyframes on a timeline with camera positions and view directions for various points in time. The 3D engine will interpolate camera parameters between keyframes to create animations. These resulting animations can then be played within the 3D view or exported frame-by-frame to a series of images. Configuration of lights By default, the 3D view has a single white light placed above the centre of the 3D scene. Now, users can set up light source position, color, and intensity and even define multiple lights for some interesting effects. Rule-based 3D rendering Previously, it was only possible to define one 3D renderer per layer meaning all features appear the same. QGIS 3 features rule-based rendering for 3D to make it much easier to apply more complex styling in 3D without having to duplicate vector layers and apply filters. There are many other 3D capabilities that you can explore including terrain shading, better camera control, and more. Where you can find data for 3D maps Dobias shared a few great 3D city models that are free to use including CityGML and CityJSON. To easily load CityJSON datasets in QGIS you can use the CityJSON Loader plugin. OpenStreetMap (OSM) is another project that provides buildings data. You can also use the Google dataset search. Just type CityGML in a search box and find the data you need. QGIS 3D capabilities to expect in the future Dobias further talked about the future plans for QGIS 3D. Currently, the team is working on improving support for larger 3D scenes and also make them load faster. For the far future, Dobias shared a wishlist of features that can be implemented in QGIS to make its 3D support much more powerful: Enhancing the 3D rendering performance More rendering techniques like shadows, transparency New materials to show textured objects More styles for vector layers such as lines and 3D pipes More data types such as point cloud and 3D rasters Formats support like 3D tiles, Arc SceneLayer Animation of data in scenes Profile tool Blender export Rendering of point cloud You just read about some of the latest features in QGIS 3 for 3D rendering. If you are new to QGIS and want to grasp its fundamentals, check out our book Learn QGIS - Fourth Edition by Anita Graser and Andrew Cutts. In this book, you will explore various ways to load data into QGIS, understand how to style data and present it in a map, and create maps and explore ways to expand them. You will get acquainted with the new processing toolbox in QGIS 3.4, manipulate your geospatial data and gain quality insights, and work with QGIS 3.4 in 3D. Why geospatial analysis and GIS matters more than ever today Top 7 libraries for geospatial analysis Uber’s kepler.gl, an open source toolbox for GeoSpatial Analysis
Read more
  • 0
  • 0
  • 30225

article-image-elastic-marks-its-entry-in-security-analytics-market-with-elastic-siem-and-endgame-acquisition
Bhagyashree R
13 Dec 2019
6 min read
Save for later

Elastic marks its entry in security analytics market with Elastic SIEM and Endgame acquisition

Bhagyashree R
13 Dec 2019
6 min read
For many years, Elastic Stack has served as an open-source, simple yet powerful interface for security analysts to detect and mitigate malicious behavior. However, Elastic marked its official entry into the security analytics market with Elastic SIEM in June this year. Since its initial release, Elastic SIEM has seen a number of enhancements including machine learning-based anomaly detection, maps integration, and more. To further expand its presence in the security field, Elastic in early October, completed the acquisition of Endgame, a security company focused on endpoint prevention, detection, and response. Following this acquisition, Elastic introduced the Elastic Endpoint Security solution in October to help organizations “automatically and flexibly respond to threats in real-time.” The company has also eliminated per-endpoint pricing. In this article, we will look at what is Elastic SIEM, how it fits into the Elastic Stack, its components, and how a security operations team leverages Elastic SIEM to defend its data and infrastructure against attacks. [box type="shadow" align="" class="" width=""] Further learning This is a quick overview of the Elastic Stack. To learn more check out our book, Learning Elastic Stack 7.0 - Second Edition by Pranav Shukla and Sharath Kumar M N. This book will give you a fundamental understanding of what the stack is all about, and help you use it efficiently to build powerful real-time data processing. [/box] Introducing Elastic SIEM Elastic SIEM is not a standalone product but rather builds on the existing Elastic Stack capabilities used for security analytics including search, visualizations, dashboards, alerting, machine learning features, and more. The following diagram shows how Elastic SIEM fits into the Elastic Stack: Source: Elastic The beta version of Elastic SIEM was released in June this year with Elastic Stack 7.2. It includes a new set of data integrations for security use cases and a dedicated app in Kibana. It enables users to analyze host-related and network-related security events as part of alert investigations, threat hunting, initial investigations, and triaging of events. You can access Elastic SIEM through the Elastic Cloud or by downloading its default distribution. Elastic SIEM supports the recently introduced Elastic Common Schema (ECS), a uniform way to represent data across different sources. ECS defines a common set of fields and objects to ingest data into Elasticsearch enabling users to centrally analyze information like logs, flows, and contextual data from across environments. Features of Elastic SIEM Host-related security event analysis The Hosts view shows key metrics regarding host-related security events and a set of data tables that enable interaction with the Timeline Event Viewer. For further investigation, you can drag-and-drop items of interest from the Hosts view tables to Timeline. This gives you deeper insight into hosts, unique IPs, user authentications, uncommon processes, and events. We can filter the host view with the search bar at the top. To help you search faster, SIEM provides a search experience that combines traditional text-based search with the visual query builder that’s deeply integrated with drag-and-drop throughout the SIEM app and powered by the Elastic common schema. Network-related security event analysis The Network view provides analysts the key network activity metrics and event tables. You can drag-and-drop these tables to Timeline for further investigation to get deeper insight into the source and destination IP, top DNS domains, users, transport layer security certs, and more. Starting with Elastic Stack 7.4, you have Elastic Maps integrated right into Elastic SIEM. The interactive map is created based on live data that analysts can search, filter, and explore in real-time. The map gives analysts an overview of the network traffic. They can simply hover over source and destination points to uncover more details such as hostnames and IP addresses. They can also click a hostname to go to the SIEM Host view or an IP address to open the relevant network details. This integration lets Elastic SIEM leverage geospatial analytics and search capabilities of Elastic Maps. It also uses the new point-to-point line feature to easily visualize the connections in your data. Timeline Event Viewer The Timeline Event Viewer enables security analysts to gather and store evidence of an attack. They can pin and annotate relevant events, comment on and share their findings, and do everything within Kibana. It is a collaborative workspace for investigations or threat hunting where analysts can easily drag objects of interest from Network and Hosts view for further investigation. Anomaly detection with machine learning integration Cyber attacks today have become so sophisticated that it is hard to maintain an effective defense with just a set of static rules. Looking at the importance of automated analysis and detection, Elastic integrated machine learning capabilities right into the SIEM app in 7.3. This allowed security analysts to enable and run a set of machine learning anomaly detection jobs designed to detect specific cyber attack behaviors. The detected anomalies are then displayed on the Hosts and Network views in the SIEM app. However, in Elastic SIEM 7.3, there were only three built-in anomaly detection jobs. In the latest release (7.4), Elastic has added thirteen more anomaly detection jobs some of which are anomalous network activity, anomalous process, anomalous path activity, anomalous Powershell script, and more. This machine learning integration is extensible allowing users to add their own jobs to the SIEM job group. These were some of the key features in Elastic SIEM. Check out the Elastic SIEM 7.4 release announcement to know more. Also, to get a better understanding of how Elastic SIEM works, see the webinar Hands-on with Elastic SIEM: Defending your organization with the Elastic Stack by Elastic. To get started with Elastic Stack you can check out our book Learning Elastic Stack 7.0 - Second Edition. This book will help you learn how to use Elasticsearch for distributed searching and analytics, Logstash for logging, and Kibana for data visualization.  As you work through the book, you will discover the technique of creating custom plugins using Kibana and Beats. The book also touches upon Elastic X-Pack, a useful extension for effective security and monitoring.  You’ll also find helpful tips on how to use Elastic Cloud and deploy Elastic Stack in production environments. How to push Docker images to AWS’ Elastic Container Registry(ECR) [Tutorial] Core security features of Elastic Stack are now free! Elastic Stack 6.7 releases with Elastic Maps, Elastic Update and much more!
Read more
  • 0
  • 0
  • 31498
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-challenge-deep-learning-sustain-current-pace-innovation-ivan-vasilev-machine-learning-engineer
Sugandha Lahoti
13 Dec 2019
8 min read
Save for later

“The challenge in Deep Learning is to sustain the current pace of innovation”, explains Ivan Vasilev, machine learning engineer

Sugandha Lahoti
13 Dec 2019
8 min read
If we talk about recent breakthroughs in the software community, machine learning and deep learning is a major contender - the usage, adoption, and experimentation of deep learning has exponentially increased. Especially in the areas of computer vision, speech, natural language processing and understanding, deep learning has made unprecedented progress. GANs, variational autoencoders and deep reinforcement learning are also creating impressive AI results. To know more about the progress of deep learning, we interviewed Ivan Vasilev, a machine learning engineer and researcher based in Bulgaria. Ivan is also the author of the book Advanced Deep Learning with Python. In this book, he teaches advanced deep learning topics like attention mechanism, meta-learning, graph neural networks, memory augmented neural networks, and more using the Python ecosystem. In this interview, he shares his experiences working on this book, compares TensorFlow and PyTorch, as well as talks about computer vision, NLP, and GANs. On why he chose Computer Vision and NLP as two major focus areas of his book Computer Vision and Natural Language processing are two popular areas where a number of developments are ongoing. In his book, Advanced Deep Learning with Python, Ivan delves deep into these two broad application areas. “One of the reasons I emphasized computer vision and NLP”, he clarifies, “is that these fields have a broad range of real-world commercial applications, which makes them interesting for a large number of people.” The other reason for focusing on Computer Vision, he says “is because of the natural (or human-driven if you wish) progress of deep learning. One of the first modern breakthroughs was in 2012, when a solution based on convolutional network won the ImageNet competition of that year with a large margin compared to any previous algorithms. Thanks in part to this impressive result, the interest in the field was renewed and brought many other advances including solving complex tasks like object detection and new generative models like generative adversarial networks. In parallel, the NLP domain saw its own wave of innovation with things like word vector embeddings and the attention mechanism.” On the ongoing battle between TensorFlow and PyTorch There are two popular machine learning frameworks that are currently at par - TensorFlow and PyTorch (Both had new releases in the past month, TensorFlow 2.0 and PyTorch 1.3). There is an ongoing debate that pitches TensorFlow and PyTorch as rivaling tech and communities. Ivan does not think there is a clear winner between the two libraries and this is why he has included them both in the book. He explains, “On the one hand, it seems that the API of PyTorch is more streamlined and the library is more popular with the academic community. On the other hand, TensorFlow seems to have better cloud support and enterprise features. In any case, developers will only benefit from the competition. For example, PyTorch has demonstrated the importance of eager execution and TensorFlow 2.0 now has much better support for eager execution to the point that it is enabled by default. In the past, TensorFlow had internal competing APIs, whereas now Keras is promoted as its main high-level API. On the other hand, PyTorch 1.3 has introduced experimental support for iOS and Android devices and quantization (computation operations with reduced precision for increased efficiency).” Using Machine Learning in the stock trading process can make markets more efficient Ivan discusses his venture into the field of financial machine learning, being the author of an ML-oriented event-based algorithmic trading library. However, financial machine learning (and stock price prediction in particular) is usually not in the focus of mainstream deep learning research. “One reason”, Ivan states, “is that the field isn’t as appealing as, say, computer vision or NLP. At first glance, it might even appear gimmicky to predict stock prices.” He adds, “Another reason is that quality training data isn’t freely available and can be quite expensive to obtain. Even if you have such data, pre-processing it in an ML-friendly way is not a straightforward process, because the noise-to-signal ratio is a lot higher compared to images or text. Additionally, the data itself could have huge volume.” “However”, he counters, “using ML in finance could have benefits, besides the obvious (getting rich by trading stocks). The participation of ML algorithms in the stock trading process can make the markets more efficient. This efficiency will make it harder for market imbalances to stay unnoticed for long periods of time. Such imbalances will be corrected early, thus preventing painful market corrections, which could otherwise lead to economic recessions.” GANs can be used for nefarious purposes, but that doesn’t warrant discarding them Ivan has also given a special emphasis to Generative adversarial networks in his book. Although extremely useful, in recent times GANs have been used to generate high-dimensional fake data that look very convincing. Many researchers and developers have raised concerns about the negative repercussions of using GANs and wondered if it is even possible to prevent and counter its misuse/abuse. Ivan acknowledges that GANs may have unintended outcomes but that shouldn’t be the sole reason to discard them. He says, “Besides great entertainment value, GANs have some very useful applications and could help us better understand the inner workings of neural networks. But as you mentioned, they can be used for nefarious purposes as well. Still, we shouldn’t discard GANs (or any algorithm with similar purpose) because of this. If only because the bad actors won’t discard them. I think the solution to this problem lies beyond the realm of deep learning. We should strive to educate the public on the possible adverse effects of these algorithms, but also to their benefits. In this way we can raise the awareness of machine learning and spark an honest debate about its role in our society.” Machine learning can have both intentional and unintentional harmful effects Awareness and Ethics go in parallel. Ethics is one of the most important topics to emerge in machine learning and artificial intelligence over the last year. Ivan agrees that the ethics and algorithmic bias in machine learning are of extreme importance. He says, “We can view the potential harmful effects of machine learning as either intentional and unintentional. For example, the bad actors I mentioned when we discussed GANs fall into the intentional category. We can limit their influence by striving to keep the cutting edge of ML research publicly available, thus denying them any unfair advantage of potentially better algorithms. Fortunately, this is largely the case now and hopefully will remain that way in the future. “ “I don't think algorithmic bias is necessarily intentional,'' he says. “Instead, I believe that it is the result of the underlying injustices in our society, which creep into ML through either skewed training datasets or unconscious bias of the researchers. Although the bias might not be intentional, we still have a responsibility to put a conscious effort to eliminate it.” Challenges in the Machine learning ecosystem “The field of ML exploded (in a good sense) a few years ago,'' says Ivan, “thanks to a combination of algorithmic and computer hardware advances. Since then, the researches have introduced new smarter and more elegant deep learning algorithms. But history has shown that AI can generate such a great hype that even the impressive achievements of the last few years could fall short of the expectations of the general public.” “So, in a broader sense, the challenge in front of ML is to sustain the current pace of innovation. In particular, current deep learning algorithms fall short in some key intelligence areas, where humans excel. For example, neural networks have a hard time learning multiple unrelated tasks. They also tend to perform better when working with unstructured data (like images), compared to structured data (like graphs).” “Another issue is that neural networks sometimes struggle to remember long-distance dependencies in sequential data. Solving these problems might require new fundamental breakthroughs, and it’s hard to give an estimation of such one time events. But even at the current level, ML can fundamentally change our society (hopefully for the better). For instance, in the next 5 to 10 years, we can see the widespread introduction of fully autonomous vehicles, which have the potential to transform our lives.” This is just a snapshot of some of the important focus areas in the deep learning ecosystem. You can check out more of Ivan’s work in his book Advanced Deep Learning with Python. In this book you will investigate and train CNN models with GPU accelerated libraries like TensorFlow and PyTorch. You will also apply deep neural networks to state-of-the-art domains like computer vision problems, NLP, GANs, and more. Author Bio Ivan Vasilev started working on the first open source Java Deep Learning library with GPU support in 2013. The library was acquired by a German company, where he continued its development. He has also worked as a machine learning engineer and researcher in the area of medical image classification and segmentation with deep neural networks. Since 2017 he has focused on financial machine learning. He is working on a Python based platform, which provides the infrastructure to rapidly experiment with different ML algorithms for algorithmic trading. You can find him on Linkedin and GitHub. Kaggle’s Rachel Tatman on what to do when applying deep learning is overkill  Brad Miro talks TensorFlow 2.0 features and how Google is using it internally François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in deep learning and more
Read more
  • 0
  • 0
  • 24286

article-image-why-should-you-consider-becoming-aws-developer-associate-certified
Savia Lobo
12 Dec 2019
5 min read
Save for later

Why should you consider becoming ‘AWS Developer Associate’ certified?

Savia Lobo
12 Dec 2019
5 min read
Organizations both large and small are looking to automating their day-to-day processes and the best option they consider is moving to the cloud. However, they also fear certain challenges that can make cloud adoption difficult. The biggest challenge is the lack of resources or expertise to understand how different cloud services function or how they are built, to leverage its advantages to the fullest. Many developers use cloud computing services--either through the companies they work with or simply subscribe to it--without really knowing the intricacies. Their knowledge of how the internal processes work remains limited. Certifications, can, in fact, help you understand how cloud functions and what goes on within these gigantic data holders. To start with, enroll yourself into a basic certification by any of the popular cloud service providers. Once you know the basics, you can go ahead to master the other certifications available based on your job role or career aspirations. Why choose an AWS certification Amazon Web Services (AWS) is considered one of the top cloud services providers in the cloud computing market currently. According to Gartner’s Magic Quadrant 2019, AWS continues to lead in public cloud adoption. AWS also offers eleven certifications that include foundational and specialty cloud computing topics. If you are a developer or a professional who wants to pursue a career in Cloud computing, you should consider taking the ‘AWS Certified Developer - Associate’ certification. Do you wish to learn from the AWS subject-matter experts, explore real-world scenarios, and pass the AWS Certified Developer – Associate exam? We recommend you to explore the book, AWS Certified Developer - Associate Guide - Second Edition by Vipul Tankariya and Bhavin Parmar. Many organizations use AWS services and being certified can open various options for improved learning. Along with being popular among companies, AWS includes a host of cloud service options compared to other cloud service providers. While having a hands-on experience holds great value for developers, getting certified by one of the most popular cloud services will only have greater advantages for their better future. Starting with web developers, database admins, IoT or an AI developer, etc., AWS includes various certification options that delve into almost every aspect of technology. It is also constantly adding more offerings and innovating in a way that keeps one updated with cutting-edge technologies. Getting an AWS certification is definitely a difficult task but you do not have to quit your current job for this one. Unlike other vendors, Amazon offers a realistic certification path that does not require highly specialized (and expensive) training to start. AWS certifications validate a candidate’s familiarity and knowledge of best practices in cloud architecture, management, and security. Prerequisites for this certification The AWS Developer Associate certification will help you enhance your skills impacting your career growth. However, one needs to keep certain prerequisites in mind. A developer should have: Attended the AWS Essentials course or should have an equivalent experience Knowledge in developing applications with API interfaces Basic understanding of relational and non-relational databases. How the AWS Certified Developer - Associate level certification course helps a developer AWS Certified Developer Associate certification training will give you hands-on exposure to core AWS services through guided lectures, videos, labs and quizzes. You'll get trained in compute and storage fundamentals, architecture and security best practices that are relevant to the AWS certified developer exam. This associate-level course will help developers identify the appropriate AWS architecture and also learn to design, develop, and deploy optimum AWS cloud solutions. If one already has some existing knowledge of AWS, this course will help them identify and deploy secure procedures for optimal cloud deployment and maintenance. Developers will also learn to develop and maintain applications written for Amazon S3, DynamoDB, SQS, SNS, SWS, AWS Elastic Beanstalk, and AWS CloudFormation. After achieving this certification, you will be an asset to any organization. You can help them leverage best practices around advanced cloud-based solutions and migrate existing workloads to the cloud. This indirectly means a rise in your annual income and also career growth. However, getting certified alone is not enough, other factors such as skills, experience, geographic location, etc. are also important. This certification will help you become competent in using Amazon’s cloud services. This course is a part of the first tier (Associate level) of certifications that AWS offers. You could further improve your cloud computing skills by taking up certifications from the professional tier and later from the specialty tiers, whatever suits you the best. New AWS services and features are added every year. Certification alone is not enough, staying relevant is the key. To continually demonstrate expertise and knowledge of best practices for the most up to date AWS services, certification holders are required to re-certify every two years. You can either choose to take a professional-level exam for the same certification or pass the re-certification exam for your existing certification. To further gain valuable insights on how to design, develop, and deploy cloud-based solutions using AWS and also get familiar with Identity and Access Management (IAM) along with Virtual private cloud (VPC), you can check out the book, AWS Certified Developer - Associate Guide - Second Edition by Vipul Tankariya and Bhavin Parmar. How do AWS developers manage Web apps? Why AWS is the preferred cloud platform for developers working with big data How do you become a developer advocate?
Read more
  • 0
  • 0
  • 26938

article-image-abel-wang-explains-the-relationship-between-devops-and-cloud-native
Savia Lobo
12 Dec 2019
5 min read
Save for later

Abel Wang explains the relationship between DevOps and Cloud-Native

Savia Lobo
12 Dec 2019
5 min read
Cloud-native is microservices containers and serverless apps that run in multi-cloud environments and are managed by DevOps processes. However, the relationship between these is not always clearly defined. Shayne Boyer, Principal Cloud Advocate, in a conversation with Abel Wang, Principal cloud Advocate, and DevOps lead discussed the relationship between DevOps and Cloud-Native on the Microsoft developer channel. Do you wish to further learn how to implement DevOps using Azure DevOps, and also want to learn the entire serverless stack available in Azure including Azure Event Grid, Azure Functions, and Azure Logic Apps, you should check out the book Azure for Architects - Second Edition to know more. What is DevOps? Abel starts off by saying, DevOps at Microsoft is the union of people, processes, and products to enable the continuous delivery of value to our end-users. The reason one should care about DevOps when it comes to cloud native apps is that the key here is continuously delivering value. One of the powers of Cloud Native is because all of your infrastructures is out in the cloud, you're able to iterate it very quickly. This is what DevOps helps you do as well, continuously deliver value to your end-users. These days the speed of business is so quick that if we can't iterate quickly and give value quickly, our competitors will and once they do it the rest will be obsolete. Hence, it is extremely important to iterate quickly which Cloud-Native helps enable. The concept of continuously delivering value remains similar to the concept we carry out on our local machine during a standard deployment. Where Cloud-Native become completely unique is, All your infrastructures are out in the cloud. Hence, deploying to the cloud is easier to do than deploying on to like a mobile app. One of the most powerful things about cloud-native is that it is a microservice-based architecture. With these advantages, we're able to iterate quickly because instead of deploying this massive model if we make one tiny little change, we can just deploy that one service. This will simplify and speed up the process. With this every developer check-in can go through our gates, can go through our pipeline, reach production at a quicker rate, and so we're able to give value even faster and better. Key DevOps and Cloud-Native Apps concepts Wang says to aim for a CI/CD pipeline that can process code as soon as somebody adds it in. The pipeline should further make it easy to build and then finally deploy it to the infrastructure present. Wang demonstrates a cloud-native application with a slightly complicated infrastructure. The application consists of a static website that is held in Azure storage, it has a back-end written in .Net Core which is held in an Azure function and they both connect up to a Cosmos DB. If microservices are deployed independently, the services need to be smart enough to realize what version the other services are on so that the entire application is not disturbed if additional service is uploaded. He further demonstrates an instance for deploying the entire infrastructure all at once. You can check out the video to know more about the demonstration in detail. How to ensure quality across environments in a DevOps practice? In a cloud-native application, we need not worry about deploying similar infrastructures while moving from one environment to the next. This is because the dev environment will be exactly the same as the QA environment and further the same throughout all the way out into production. This will be cost-effective because we can just spin it up to run whatever we need to and as soon as it's done, tear it down so we're not paying for anything. Automating the Cloud Native processes include a few manual steps such as approving an email. Wang says one could technically automate everything, however, he prefers having manual approvers. Within Azure DevOps, there is a concept of an automated approval gate as well. So one can use automation to help decide if they should postpone approval or not. Wang says he uses an automated approval gate to conduct DNS checks that can inform him whether or not the DNS has propagated. Wang says, trying to keep quality in your pipelines is really difficult to do. You can do things like run all of the automated UI tests for a particular environment. “so by the time let's say I deployed this into a QA environment by the time my QA testers even look at it it could have run through like hundreds of thousands of automated UI tests already. So there's a lot less that a human needs to do,” Wang adds. To learn comprehensively how to develop Azure cloud architecture and a pipeline management system and also to know about some security best practices for your Azure deployment, you can check out the book, Azure for Architects - Second Edition by Ritesh Modi. Pivotal and Heroku team up to create Cloud Native Buildpacks for Kubernetes Can DevOps promote empathy in software engineering? Is DevOps really that different from Agile? No, says Viktor Farcic [Podcast]
Read more
  • 0
  • 0
  • 21855

article-image-why-secure-web-based-applications-with-kali-linux
Guest Contributor
12 Dec 2019
12 min read
Save for later

Why secure web-based applications with Kali Linux?

Guest Contributor
12 Dec 2019
12 min read
The security of web-based applications is of critical importance. The strength of an application is about more than the collection of features it provides. It includes essential (yet often overlooked) elements such as security. Kali Linux is a trusted critical component of a security professional’s toolkit for securing web applications. The official documentation says it is “is specifically geared to meet the requirements of professional penetration testing and security auditing.“ Incidences of security breaches in web-based applications can be largely contained through the deployment of Kali Linux’s suite of up-to-date software. Build secure systems with Kali Linux... If you wish to employ advanced pentesting techniques with Kali Linux to build highly secured systems, you should check out our recent book Mastering Kali Linux for Advanced Penetration Testing - Third Edition written by Vijay Kumar Velu and Robert Beggs. This book will help you discover concepts such as social engineering, attacking wireless networks, web services, and embedded devices. What it means to secure Web-based applications There is a branch of information security dealing with the security of websites and web services (such as APIs), the same area that deals with securing web-based applications. For web-based businesses, web application security is a central component. The Internet serves a global population and is used in almost every walk of life one may imagine. As such, web properties can be attacked from various locations and with variable levels of complexity and scale. It is therefore critical to have protection against a variety of security threats that take advantage of vulnerabilities in an application’s code. Common web-based application targets are SaaS (Software-as-a-Service) applications and content management systems like WordPress. A web-based application is a high-priority target if: the source code is complex enough to increase the possibility of vulnerabilities that are not contained and result in malicious code manipulation, the source code contains exploitable bugs, especially where code is not tested extensively, it can provide rewards of high value, including sensitive private data, after successful manipulation of source code, attacking it is easy to execute since most attacks are easy to automate and launch against multiple targets. Failing to secure its web-based application opens an organization up to attacks. Common consequences include information theft, damaged client relationships, legal proceedings, and revoked licenses. Common Web App Security Vulnerabilities A wide variety of attacks are available in the wild for web-based applications. These include targeted database manipulation and large-scale network disruption. Following are a few vectors or methods of attacks used by attackers: Data breaches A data breach differs from specific attack vectors. A data breach generally refers to the release of private or confidential information. It can stem from mistakes or due to malicious action. Data breaches cover a broad scope and could consist of several highly valuable records to millions of exposed user accounts. Common examples of data breaches include Cambridge Analytica and Ashley Madison. Cross-site scripting (XSS) It is a vulnerability that gives an attacker a way to inject client-side scripts into a webpage. The attacker can also directly access relevant information, impersonate a user, or trick them into divulging valuable information. A perpetrator could notice a vulnerability in an e-commerce site that permits embedding of HTML tags in the site’s comments section. The embedded tags feature permanently on the page, causing the browser to parse them along with other source code each time the page is accessed. SQL injection (SQLi) A method whereby a web security vulnerability allows an attacker to interfere with the queries that an application makes to its database. With this, an attacker can view data that they could normally not retrieve. Attackers may also modify or create fresh user permissions, manipulate or remove sensitive data. Such data could belong to other users, or be any data the application itself can access. In certain cases, an attacker can escalate the attack to exploit backend infrastructure like the underlying server. Common SQL injection examples include: Retrieving hidden data, thus modifying a SQL query to return enhanced results; Subverting application logic by essentially changing a query; UNION attacks, so as to retrieve data from different database tables; Examining the database, to retrieve information on the database’s version and structure; and Blind SQL injection, where you’re unable to retrieve application responses from queries you control. To illustrate subverting application logic, take an application that lets users log in with a username and password. If the user submits their username as donnie and their password as peddie, the application tests the credentials by performing this SQL query: SELECT * FROM users WHERE username = ‘donnie’ AND password = ‘donnie’ The login is successful where the query returns the user’s details. It is rejected, otherwise. An attacker can log in here as a regular user without a password, by merely using the SQL comment sequence -- to eliminate the password check from the WHERE clause of the query. An example is submitting the username admin’--along with a blank password in this query: SELECT * FROM users WHERE username = ‘admin’--’ AND password = ‘’ This query returns the user whose username is admin, successfully logging in the attacker in as that user. Memory corruption When a memory location is modified, leading to unexpected behavior in the software, memory corruption occurs. It is often not deliberate. Bad actors work hard to determine and exploit memory corruption using code injection or buffer overflow attacks. Hackers love memory vulnerabilities because it enables them to completely control a victim’s machine. Continuing the password example, let’s consider a simple password-validation C program. The code performs no validation on the length of the user input. It also does not ensure that sufficient memory is available to store the data coming from the user. Buffer overflow A buffer is a defined temporary storage in memory. When software writes data to a buffer, a buffer overflow might occur. Overflowing the buffer's capacity leads to overwriting adjacent memory locations with data. Attackers can exploit this to introduce malicious code in memory, with the possibility of developing a vulnerability within the target. In buffer overflow attacks, the extra data sometimes contains specific instructions for actions within the plan of a malicious user. An example is data that is able to trigger a response that changes data, reveals private information, or damages files. Heap-based buffer overflows are more difficult to execute than stack-based overflows. They are also less common, attacking an application by flooding the memory space dedicated for a program. Stack-based buffer overflows exploit applications by using a stack - a memory space for storing input. Cross-site request forgery (CSRF) Cross-site request forgery tricks a victim into supplying their authentication or authorization details in requests. The attacker now has the user's account details and proceeds to send a request by pretending as the user. Armed with a legitimate user account, the attacker can modify, exfiltrate, or destroy critical information. Vital accounts belonging to executives or administrators are typical targets. The attacker commonly requests the victim user to perform an action unintentionally. Changing an email address on their account, changing their password, or making a funds transfer are examples of such actions. The nature of the action could give the attacker full control over the user’s control. The attacker may even gain full control of the application’s data and security if the target user has high privileges within the application. Three vital conditions for a CSRF attack include: A relevant action within the application that the attacker has reason to induce. Modifying permissions for other users (privileged action) or acting on user-specific data (changing the user’s password, for example). Cookie-based session handling to identify who has made user requests. There is no other mechanism to track sessions or validate user requests. No unpredictable request parameters. When causing a user to change their password, for example, the function is not vulnerable if an attacker needs to know the value of the existing password. Let’s say an application contains a function that allows users to change the email address on their account. When a user performs this action, they make a request such as the following: POST /email/change HTTP/1.1 Host: target-site.com Content-Type: application/x-www-form-urlencoded Content-Length: 30 Cookie: session=yvthwsztyeQkAPzeQ5gHgTvlyxHfsAfE email=don@normal-user.com The attacker may then build a web page containing the following HTML: Where the victim visits the attacker’s web page, these will happen: The attacker’s page will trigger an HTTP request to the vulnerable website. If the user is logged in to the vulnerable site, their browser will automatically include their session cookie in the request. The vulnerable website will carry on as normal, processing the malicious request, and change the victim user’s email address. Mitigating Vulnerabilities with Kali Linux Securing web-based user accounts from exploitation includes essential steps, such as using up-to-date encryption. Tools are available in Kali that can help generate application crashes or scan for various other vulnerabilities. Fuzzers, as these tools are called, are a relatively easy way to generate malformed data to observe how applications handle them. Other measures include demanding proper authentication, continuously patching vulnerabilities, and exercising good software development hygiene. As part of their first line of defence, many companies take a proactive approach, engaging hackers to participate in bug bounty programs. A bug bounty rewards developers for finding critical flaws in software. Open source software like Kali Linux allow anyone to scour an application’s code for flaws. Monetary rewards are a typical incentive. White hat hackers can also come onboard with the sole assignment of finding internal vulnerabilities that may have been treated lightly. Smart attackers can find loopholes even in stable security environments, making a fool-proof security strategy a necessity. The security of web-based applications can be through protecting against Application Layer, DDoS, and DNS attacks. Kali Linux is a comprehensive tool for securing web-based applications Organizations curious about the state of security of their web-based application need not fear; especially when they are not prepared for a full-scale penetration test. Attackers are always on the prowl, scanning thousands of web-based applications for the low-hanging fruit. By ensuring a web-based application is resilient in the face of these overarching attacks, applications reduce any chances of experiencing an attack. The hackers will only migrate to more peaceful grounds. So, how do organizations or individuals stay secure from attackers? Regular pointers include using HTTPS, adding a Web Application, installing security plugins, hashing passwords, and ensuring all software is current. These significant recommendations lower the probability of finding vulnerabilities in application code. Security continues to evolve, so it's best to integrate it into the application development lifecycle. Security vulnerabilities within your app are almost impossible to avoid. To identify vulnerabilities, one must think like an attacker, and test the web-based application thoroughly. A Debian Linux derivative from Offensive Security Limited, Kali Linux, is primarily for digital forensics and penetration testing. It is a successor to the revered BackTrack Linux project. The BackTrack project was based on Knoppix and manually maintained. Offensive Security wanted a true Debian derivative, with all the necessary infrastructure and improved packaging techniques. The quality, stability, and wide software selection were key considerations in choosing Debian. While developers churn out web-based applications by the minute, the number of web-based application attacks grows alongside in an exponential order. Attackers are interested in exploiting flaws in the applications, just as organizations want the best way to detect attackers’ footprints in the web application firewall. Thus, it will be detecting and blocking the specific patterns on the web-based application. Key features of Kali Linux Kali Linux has 32-bit and 64-bit distributions for hosts relying on the x86 instruction set. There's also an image for the ARM architecture. The ARM architecture image is for the Beagle Board computer and the ARM Chromebook from Samsung. Kali Linux is available for other devices like the Asus Chromebook Flip C100P, HP Chromebook, CuBox, CubieBoard 2, Raspberry Pi, Odroid U2, EfikaMX, Odroid XU, Odroid XU3, Utilite Pro, SS808, Galaxy Note 10.1, and BeagleBone Black. There are plans to make distributions for more ARM devices. Android devices like Google's Nexus line, OnePlus One, and Galaxy models also have Kali Linux through Kali NetHunter. Kali NetHunter is Offensive Security’s project to ensure compatibility and porting to specific Android devices. Via the Windows Subsystem for Linux (WSL), Windows 10 users can use any of the more than 600 ethical hacking tools within Kali Linux to expose vulnerabilities in web applications. The official Windows distribution IS from the Microsoft Store, and there are tools for various other systems and platforms. Conclusion Despite a plethora of tools dedicated to web app security and a robust curation mechanism, Kali Linux is the distribution of choice to expose vulnerabilities in web-based applications. Other tool options include Kubuntu, Black Parrot OS, Cyborg Linux, BackBox Linux, and Wifislax. While being open source has helped its meteoric rise, Kali Linux is one of the better platforms for up-to-date security utilities. It remains the most advanced penetration testing platform out there, supporting a wide variety of devices and hardware platforms. Kali Linux also has decent documentation compared to numerous other open source projects. There is a large, active, and vibrant community and you can easily install Kali Linux in VirtualBox on Windows to begin your hacking exploits right away. To further discover various stealth techniques to remain undetected and defeat modern infrastructures and also to explore red teaming techniques to exploit secured environment, do check out the book Mastering Kali Linux for Advanced Penetration Testing - Third Edition written by Vijay Kumar Velu and Robert Beggs. Author Bio Chris is a growth marketing and cybersecurity expert writer. He has contributed to sites such as “Cyber Defense Magazine,” “Social Media News,” and “MTA.” He’s also contributed to several cybersecurity magazines. He enjoys freelancing and helping others learn more about protecting themselves online. He’s always curious and interested in learning about the latest developments in the field. He’s currently the Editor in Chief for EveryCloud’s media division. Glen Singh on why Kali Linux is an arsenal for any cybersecurity professional [Interview] 3 cybersecurity lessons for e-commerce website administrators Implementing Web application vulnerability scanners with Kali Linux [Tutorial]
Read more
  • 1
  • 0
  • 31631
article-image-master-the-art-of-face-swapping-with-opencv-and-python-by-sylwek-brzeczkowski-developer-at-truststamp
Vincy Davis
12 Dec 2019
8 min read
Save for later

Master the art of face swapping with OpenCV and Python by Sylwek Brzęczkowski, developer at TrustStamp

Vincy Davis
12 Dec 2019
8 min read
No discussion on image processing can be complete without talking about OpenCV. Its 2500+ algorithms, extensive documentation and sample code are considered world-class for exploring real-time computer vision. OpenCV supports a wide variety of programming languages such as C++, Python, Java, etc., and is also available on different platforms including Windows, Linux, OS X, Android, and iOS. OpenCV-Python, the Python API for OpenCV is one of the most popular libraries used to solve computer vision problems. It combines the best qualities of OpenCV, C++ API, and the Python language. The OpenCV-Python library uses Numpy, which is a highly optimized library for numerical operations with a MATLAB-style syntax. This makes it easier to integrate the Python API with other libraries that use Numpy such as SciPy and Matplotlib. This is the reason why it is used by many developers to execute different computer vision experiments. Want to know more about OpenCV with Python? [box type="shadow" align="" class="" width=""]If you are interested in developing your computer vision skills, you should definitely master the algorithms in OpenCV 4 and Python explained in our book ‘Mastering OpenCV 4 with Python’ written by Alberto Fernández Villán. This book will help you build complete projects in relation to image processing, motion detection, image segmentation, and many other tasks by exploring the deep learning Python libraries and also by learning the OpenCV deep learning capabilities.[/box] At the PyData Warsaw 2018 conference, Sylwek Brzęczkowski walked through how to implement a face swap using OpenCV and Python. Face swaps are used by apps like Snapchat to dispense various face filters. Brzęczkowski is a Python developer at TrustStamp. Steps to implement face swapping with OpenCV and Python #1 Face detection using histogram of oriented gradients (HOG) Histogram of oriented gradients (HOG) is a feature descriptor that is used to detect objects in computer vision and image processing. Brzęczkowski demonstrated the working of a HOG using square patches which when hovered over an array of images produces a histogram of oriented gradients feature vectors. These feature vectors are then passed to the classifier to generate a result having the highest matching samples. In order to implement face detection using HOG in Python, the image needs to be imported using import OpenCV. Next a frontal face detector object is created for the loaded image detector=dlib.get_frontal_face_detector(). The detector then produces the vector with the detected face. #2 Facial landmark detection aka face alignment Face landmark detection is the process of finding points of interest in an image of a human face. When dlib is used for facial landmark detection, it returns 68 unique fashion landmarks for the whole face. After the first iteration of the algorithm, the value of T equals 0. This value increases linearly such that at the end of the iteration, T gets the value 10. The image evolved at this stage produces the ‘ground truth’, which means that the iteration can stop now. Due to this working, this stage of the process is also called as face alignment. To implement this stage, Brzęczkowski showed how to add a predictor in the Python program with the values shape_predictor_68_face_landmarks.dat such that it produces a model of around 100 megabytes. This process generally takes up a long time as we tend to pick the biggest clearer image for detection. #3 Finding face border using convex hull The convex hull is a set of points defined as the smallest convex polygon, which encloses all of the points in the set. This means that for a given set of points, the convex hull is the subset of these points such that all the given points are inside the subset. To find the face border in an image, we need to change the structure a bit. The structure is first passed to the convex hull function with return points to false, this means that we get an output of indexes. Brzęczkowski then exhibited the face border in the image in blue color using the find_convex_hull.py function. #4 Approximating nonlinear operations with linear operations In a linear filtering of an image, the value of an output pixel is a linear combination of the values of the pixels. Brzęczkowski put forth the example of Affine transformation which is a type of linear mapping method and is used to preserve points, straight lines, and planes. On the other hand, a non-linear filtering produces an output which is not a linear function of its input. He then goes on to unveil both the transitions using his own image. Brzęczkowski then advised users to check the website learnOpenCV.com to learn how to create a nonlinear operation with a linear one. #5 Finding triangles in an image using Delaunay triangulation A Delaunay triangulation subdivides a set of points in a plane into triangles such that the points become vertices of the triangles. This means that this method subdivides the space or the surface into triangles in such a way that if you look at any triangle on the image, it will not have another point inside the triangle. Brzęczkowski then demonstrates how the image developed in the previous stage contained “face points from which you can identify my teeth and then create sub div to the object, insert all these points that I created or all detected.” Next, he deploys Delaunay triangulation to produce a list of two angles. This list is then used to obtain the triangles in the image. Post this step, he uses the delaunay_triangulation.py function to generate these triangles on the images. #6 Blending one face into another To recap, we started from detecting a face using HOG and finding its border using convex hull, followed it by adding mouth points to indicate specific indexes. Next, Delaunay triangulation was implemented to obtain all the triangles on the images. Next, Brzęczkowski begins the blending of images using seamless cloning. A seamless cloning combines the attributes of other cloning methods to create a unique solution to allow “sequence-independent and scarless insertion of one or more fragments of DNA into a plasmid vector.” This cloning method also provides a variety of skin colors to choose from. Brzęczkowski then explains a feature called ‘pass on edit image’ in the Poisson image editing which uses the value of the gradients instead of the identities or the values of the pixels of the image. To implement the same method in OpenCV, he further demonstrates how information like source, destination, source image destination, mask and center (which is the location where the cloned part should be placed) is required to blend the two faces. Brzęczkowski then depicts a  string of illustrations to transform his image with the images of popular artists like Jamie Foxx, Clint Eastwood, and others. #7 Stabilization using optical flow with the Lucas-Kanade method In computer vision, the Lucas-Kanade method is a widely used differential method for optical flow estimation. It assumes that the flow is essentially constant in a local neighborhood of the pixel under consideration, and solves the basic optical flow equations for all the pixels in that neighborhood, by the least-squares criterion. Thus by combining information from several nearby pixels, the Lucas–Kanade method resolves the inherent ambiguity of the optical flow equation. This method is also less sensitive to noises in an image. By using this method to implement the stabilization of the face swapped image, it is assumed that the optical flow is essentially constant in a local neighborhood of the pixel under consideration in human language. This means that “if we have a red point in the center we assume that all the points around, let's say in this example is three on three pixels we assume that all of them have the same optical flow and thanks to that assumption we have nine equations and only two unknowns.” This makes the computation fairly easy to solve. By using this assumption the optical flow works smoothly if we have the previous gray position of the image. This means that for face swapping images using OpenCV, a user needs to have details of the previous points of the image along with the current points of the image. By combining all this information, the actual point becomes a combination of the detected landmark and the predicted landmark. Thus by implementing the Lucas-Kanade method for stabilizing the image, Brzęczkowski implements a non-shaky version of his face-swapped image. Watch Brzęczkowski’s full video to see a step-by-step implementation of a face-swapping task. You can learn advanced applications like facial recognition, target tracking, or augmented reality from our book, ‘Mastering OpenCV 4 with Python’ written by Alberto Fernández Villán. This book will also help you understand the application of artificial intelligence and deep learning techniques using popular Python libraries like TensorFlow and Keras. Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in Python How to perform exception handling in Python with ‘try, catch and finally’ Implementing color and shape-based object detection and tracking with OpenCV and CUDA [Tutorial] OpenCV 4.0 releases with experimental Vulcan, G-API module and QR-code detector among others
Read more
  • 0
  • 0
  • 34232

article-image-getting-to-know-pymc3-a-probabilistic-programming-framework-for-bayesian-analysis-in-python
Vincy Davis
11 Dec 2019
5 min read
Save for later

Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in Python

Vincy Davis
11 Dec 2019
5 min read
Bayes' theorem, named after 18th-century British mathematician Thomas Bayes, is a mathematical formula for determining conditional probability. This theorem is used to revise or update existing predictions or theories using new or additional evidence. Bayes theorem is also used in the field of data science as it provides a rule for moving from a prior probability to a posterior probability.  In Bayesian statistics, a prior probability is the probability of an event before a new data is collected and a posterior probability is a conditional probability that is allotted after the relevant evidence is acquired. Hence, the Bayes algorithm is one of the most popular machine learning techniques in the field of data science.  In this post, we are going to discuss a specific Bayesian implementation called probabilistic programming (PP) in Python, considering that modern Bayesian statistics is mainly done by writing code. The probabilistic programming enables flexible specification of complex Bayesian statistical models, thus giving users the ability to focus more on model design, evaluation, and interpretation, and less on mathematical or computational details. Further Reading [box type="shadow" align="" class="" width=""]To know more about Bayesian data analysis techniques using PyMC3 and ArviZ, read our book ‘Bayesian Analysis with Python’, written by Osvaldo Martin. This book will help you acquire skills for a practical and computational approach towards Bayesian statistical modeling. The book also lists the best practices in Bayesian Analysis with the help of sample problems and practice exercises.[/box] A group of researchers have published a paper “Probabilistic Programming in Python using PyMC” exhibiting a primer on the use of PyMC3 for solving general Bayesian statistical inference and prediction problems. PyMC3 is a popular open-source PP framework in Python with an intuitive and powerful syntax closer to the natural syntax statisticians. The PyMC3 installation depends on several third-party Python packages which are automatically installed when installing via pip. It requires four dependencies: Theano, NumPy, SciPy, and Matplotlib. To undertake the full advantage of PyMC3, the researchers suggest, the optional dependencies Pandas and Patsy should also be installed using: pip install patsy pandas. How to use PyMC3 in probabilistic programming? In the paper, the researchers have utilized a simple Bayesian linear regression model with normal priors for the parameters. The unknown variables in the model are also assigned a prior distribution. The artificial data in the model are then simulated using NumPy’s random module, followed by the PyMC3 model to retrieve the corresponding parameters. The straightforward PyMC3 model structure is used to generate the unknown data as it is close to the statistical notation.  Firstly, the necessary components are imported from PyMC to build the required model. It is represented in the full format initially and then explained partly. The paper states, “Following instantiation of the model, the subsequent specification of the model components is performed inside a with statement: with basic_model: This creates a context manager, with our basic model as the context, that includes all statements until the indented block ends.” This means that all the PyMC3 objects introduced in the indented code block below the with statements are added to the model behind the scenes. In the absence of this context manager idiom, users would be forced to manually associate each of the variables with the basic model immediately after we create them. Also, if a user tries to create a new random variable without a with model: statement, it will cause an error due to the absence of an obvious model for the variable to be added to.  Next, to obtain posterior estimates for the unknown variables in the model, the posterior estimates are calculated analytically. The researchers have explained two approaches to obtain posterior estimates, users can choose either of them depending on the structure of the model and the goals of the analysis. The first approach is called finding the maximum a posteriori (MAP) point using optimization methods and the second approach is computing summaries based on samples drawn from the posterior distribution using Markov Chain Monte Carlo (MCMC) sampling methods. For producing a posterior analysis of the required model, PyMC3 provides plotting and summarization functions for inspecting the sampling output.  A simple posterior plot can be created using traceplot. In the traceplot, the left column consists of the smoothed histogram while the right column contains the samples of the Markov chain plotted in sequential order. In addition, the summary function of PyMC3 also provides a text-based output of common posterior statistics. You can also learn more about the practical implementation of PyMC3 and its loss functions in the book ‘Bayesian Analysis with Python’ by Packt Publishing. How Facebook data scientists use Bayesian optimization for tuning their online systems How to perform exception handling in Python with ‘try, catch and finally’ Fake Python libraries removed from PyPi when caught stealing SSH and GPG keys, reports ZDNet Netflix open-sources Metaflow, its Python framework for building and managing data science projects ActiveState adds thousands of curated Python packages to its platform
Read more
  • 0
  • 0
  • 45584

article-image-kaggles-rachel-tatman-on-what-to-do-when-applying-deep-learning-is-overkill
Vincy Davis
11 Dec 2019
8 min read
Save for later

Kaggle's Rachel Tatman on what to do when applying deep learning is overkill 

Vincy Davis
11 Dec 2019
8 min read
Deep learning, an emerging branch of machine learning, has garnered a lot of recognition in the field of technology over the last decade. It is regarded as a game-changer in AI, with distinct progress in computer vision, natural language processing (NLP), speech and other areas of machine learning. This year an Indeed survey found ‘deep learning engineer’ to be the best job in a tech position in the USA. Though deep learning has many benefits and a very appealing track record, not everybody can afford deep learning. It has some downsides like large data requirements, being excessively expensive, and has a high computing time. Below is a breakdown of Rachael Tatman’s talk “Put down the deep learning: When not to use neural networks and what to do instead” at the PyCon 2019 conference that delved into the problems with deep learning. Tatman is a data science advocate at Kaggle. Deep learning models require a very large amount of data in order to perform better than other techniques. Also, according to Tatman, just the compute of a simple image generation model in deep learning can cost around $60,000. This cost will increase with the complexity of the data models. It additionally requires expensive GPUs and hundreds of machines which will again deepen the cost to the user. Many less skilled people also find it difficult to adopt deep learning, as there is no standard theory available for learning about deep learning tools. The choice of a deep learning tool depends on the user’s knowledge of topology, training method, and other parameters. Next, deep learning also takes a lot of time for training large models. As the talk progresses, Tatman provides a list of three different types of models that can be used instead of deep learning. The three proposed models are regression-based models, tree-based models, and distance-based models.  The three proposed models instead of deep learning The most interpretable: Regression-based models The biggest advantage of a regression-based model is that it has a “well-principled” understanding of problems and provides many kinds of regression models, unlike deep learning. Users can simply work through the flowchart and decide on the best type of regression model for their data.  Some other advantages of regression models include its “fast to fit” feature. This means that it is much faster to fit when compared to a neural network, especially “if you're working with a well-optimized library the Python regression libraries tend to vary wildly so you might want to do a little bit of shopping around”. It also works well with small data as Tatman affirmed that she has worked on eight dozen data points. She added that since regression models are easy to interpret, she was able to learn many useful and interesting things from the data.  A few drawbacks of regression models are that a bit more data preparation is needed than for some other methods. They also require validation as regression models are based on strong assumptions about the distribution of the data points or the distribution of the errors.  Tatman also proclaimed that if she were to use a single machine learning model for the rest of her life, it would be a mixed-effects regression model. Mixed-effects models are extensions of linear regression models for data that are collected and summarized in groups. It is mainly used to determine the expected or mean values of the subject population. She believes, “you need to do a little bit more hands-on stuff, you need to do your validation, you probably need to do some additional data cleaning,” however, it only takes some time to do a lot of computing in less money and data. Want to know more about Regression? [box type="shadow" align="" class="" width=""]With so many benefits in regression-based models, you should definitely give Regression models a try. Read our book ‘Python Machine Learning By Example’ written by Yuxi (Hayden) Liu, to learn about regression algorithms and their evaluation. You can also master the art of building your own machine learning systems using other models such as Support Vector Machines and Text Analysis Algorithms with this example-based practical guide.[/box] The user-friendliest: Tree-based models  The next model which has the ability to replace deep learning models is called the tree-based models works that similar to a decision tree. It checks each node for a feature and depending on the value of that feature, the user can decide the path to be followed. When going down a particular path, it again checks for nodes with a feature. In this way, it works recursively to cut down a decision region into smaller chunks. Tatman also notified that developers generally opt for a forests model, instead of a tree-based model. A random forest is an ensemble model that combines many different decision trees together into a single model.  Per Tatman, “If you're in the machine learning community you might actually associate random forests with Kaggle and from 2010 to 2016, about two-thirds of all Kaggle competition winners used random forests.” On the other hand, “less than half use some form of deep learning, also random forests continue to do very well today.”  In the case of classification of data, random forests deliver better performance than logistic regression. It also does not need a lot of data cleaning or model validation. Random forests also do not require a user to convert the categorical variables, it simply undertakes the values and provides a corresponding output. It also supports many easy to use packages like XG boost, LightGBM, CatBoost, and others. In short, regression trees are the most user-friendly model, especially when doing classification. The drawbacks of trees/random forests are that they can easily overfit, it is also more sensitive to differences between datasets. It is less interpretable and requires more compute and training time when compared to regression models. Thus, tree-based models require little money but do need some data and time to train big data sets. The most lightweight: Distance-based models The last model, which according to Tatman can replace deep learning models is a common notation to group together a large group of methods like K-nearest neighbors, Gaussian Mixture models, and Support Vector machine. These models work with the basic idea that “points closer together to each other in a particular feature space are more likely to be in the same group.” The K-nearest neighbor model decides the value of a point based on the nearest majority neighbors. The Gaussian mixture models utilizes any distribution of distribution points that are a mixture of different Gaussians. The support vector model tries to be as far away from all the data points as possible. Distance-based models, particularly support vector models work very well with small data sets. They also tend to train 10 times faster than a regression model on the same data. In terms of accuracy, distance-based models lag behind other models, but in case of quick and dirty modeling, they perform better. They are good at data classification but are a little slower when compared to regression-based models. Consequently, distance-based models take very little time, requires very little money and are extremely lightweight. To conclude, Tatman says that the choice of one’s model should depend on the kind of time and money, the individual or organization possesses. Also, the most vital point to choose a model depends on its performance. Tatman adds, “based on empirical evidence right now it looks like deep learning will perform the best on a given data set given sufficient time money and compute.” Watch Tatman’s full talk for a detailed comparison of the three models. You can learn more about all the above machine learning models from our book, ‘Python Machine Learning By Example’ written by Yuxi (Hayden) Liu. The book will help you in implementing machine learning classification and regression algorithms from scratch in Python. Also, learn how to optimize the performance of a machine learning model for your application from our book. François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in Deep Learning, and more Baidu adds Paddle Lite 2.0, new development kits, EasyDL Pro, and other upgrades to its PaddlePaddle deep learning platform Why use JVM (Java Virtual Machine) for deep learning Prof. Rowel Atienza discusses the intuition behind deep learning, advances in GANs & techniques to create cutting-edge AI models Why Intel is betting on BFLOAT16 to be a game changer for deep learning training? Hint: Range trumps Precision.
Read more
  • 0
  • 0
  • 46464
article-image-why-decision-trees-are-more-flexible-than-linear-models-explains-stephen-klosterman
Guest Contributor
11 Dec 2019
7 min read
Save for later

Why decision trees are more flexible than linear models, explains Stephen Klosterman

Guest Contributor
11 Dec 2019
7 min read
This blog post will examine a hypothetical dataset of website visits and customer conversion, to illustrate how decision trees are a more flexible mathematical model than linear models such as logistic regression. Imagine you are monitoring the webpage of one of your products. You are keeping track of how many times individual customers visit this page, the total amount of time they've spent on the page across all their visits, and whether or not they bought the product. Your goal is to be able to predict, for future visitors, how likely they are to buy the product, based on the page visit data. You are considering presenting a discount, or some other kind of offer, to customers you think are likely to buy the product but haven't yet. Get to know more about decision trees and linear models! [box type="shadow" align="" class="" width=""]If you are interested in building your knowledge to prepare data for regularized logistic regression and random forest algorithms, read our book Data Science Projects with Python written by Stephen Klosterman. This book will give you practical guidance on industry-standard data analysis and machine learning tools in Python, with the help of realistic data. You will also learn how to use pandas and Matplotlib to critically examine a dataset with summary statistics and graphs and extract the insights you seek to derive. [/box] After logging the data on many customers, you visualize them and see the following, including some jitter to help see all the data points: There are several interesting patterns visible here. We see that in general, the longer someone spends on the page, the more likely they are to purchase the item. However, this effect seems to depend on the number of visits, in a complex way. Someone who visited the page once and spent at least two minutes there (i.e. two minutes per visit) seems likely to buy, at least up until 18 or so minutes. But someone who visited 10 times as much as this seems likely to buy after only 12 minutes cumulative time (1.2 minutes per visit). Additionally, there is a phenomenon of customers who spend a relatively long time (at least 18 or 19 minutes) over a relatively small number of visits (just one or two), who don't buy. Maybe they opened the page, but then walked away from their computer, and closed the page as soon as they came back. Whatever the reason, the patterns in this data set are interesting and complicated. If you want to create a predictive model of these data, you should consider the likely success of non-linear models, such as decision trees, versus linear models, such as logistic regression (for more information see chapter 3 of my book, Data Science Projects with Python). Logistic Regression as a linear model At a high level, linear models will take the feature space (the two-dimensional space where time is on the x-axis and number of visits is on the y-axis, as in the graph above), and seek to draw a straight line somewhere that creates an accurate division of the two classes of the response variable ("Bought" or "Did not buy"). Consider how likely this is to work. Where would you draw a straight line on the graph above, so that the two regions on either side of the line would primarily contain responses of only one class? It should be apparent that this is not likely to be an entirely successful task. The best you could probably do would be to draw a line that isolates non-buying customers who spent relatively little time on the page, represented by the region of dots to the left of the graph, from the blue dots representing buying customers to the right. While this would basically ignore the little group of customers to the lower right, it's probably the best you could do overall for most customers, using the straight-line approach. In fact, this is essentially what a logistic regression classifier looks like when the model is calibrated to these data. The above graph shows the regions of prediction ("Unlikely to buy" and "Likely to buy") as red or blue shading in the background. Deeper colors indicate a higher likelihood for either class. The conceptual straight-line decision boundary that divides the two regions mentioned above, would run right through the white portion of the background, where the probability of belonging to either class is very low. In other words, the model is "uncertain" about what prediction to make in this region. From the above graph, it can be seen that in addition to ignoring the small group of non-buying customers in the lower right, a straight line is also not a great model for isolating the non-buying customers on the left of the graph. While you can imagine that a curve might be able to define this boundary, a straight line cannot. Decision Trees as a non-linear model How can we do better? Enter non-linear models. Decision trees are a prime example of non-linear models. Decision trees work by dividing the data up into regions based on the "if-then" type of questions. For example, if a user spends less than three minutes over two or fewer visits, how likely are they to buy? Graphically, by asking many of these types of questions, a decision tree can divide up the feature space using little segments of vertical and horizontal lines. This approach can create a much more complex decision boundary, as shown below. It should be clear that decision trees can be used with more success, to model this data set. Given this, you would have a better model for the likelihood of customer conversion and could then proceed to design offers to increase conversion (for more information see chapter 5 of my book, Data Science Projects with Python). In conclusion, this post has shown how non-linear models, such as decision trees, can more effectively describe relationships in complex data sets than linear models, such as logistic regression. It should be noted that linear models can be extended to non-linearity by various means including feature engineering. On the other hand, non-linear models may suffer from overfitting, since they are so flexible. Nonetheless, approaches to prevent decision trees from overfitting have been formulated using ensemble models such as random forests and gradient boosted trees, which are among the most successful machine learning techniques in use today. As a final caveat, note this blog post presents a hypothetical, synthetic data set, which can be modeled almost perfectly with decision trees. Real-world data is messier, but the same principles hold. I hope you found this conceptual discussion helpful. For a more detailed explanation of how decision trees and logistic regression work "under the hood" with real-world data, and the python code for a similar hypothetical example to that shown here, check out my book Data Science Projects with Python. Author Bio Stephen Klosterman is a machine learning data scientist and the author of the book Data Science Projects with Python. He enjoys helping to frame problems in a data science context and delivering machine learning solutions that business stakeholders understand and value. His education includes a Ph.D. in biology from Harvard University, where he was an assistant teacher of the data science course. About the Book This book Data Science Projects with Python will help you understand the working and output of machine learning algorithms and gain insight into not only the predictive capabilities of the models but also their reasons for making these predictions. The book also provides detailed insight on how to build a classification model and how to conduct a financial analysis to find the optimal threshold for binary classification. This will help you with financial budgeting and operational strategy for a well-optimized usage model. At the end of this book, you will be able to confidently use various machine learning algorithms to perform detailed data analysis. Netflix open-sources Metaflow, its Python framework for building and managing data science projects What does a data science team look like? Get Ready for Open Data Science Conference 2019 in Europe and California How to learn data science: from data mining to machine learning Dr.Brandon explains Decision Trees to Jon
Read more
  • 0
  • 0
  • 44780

article-image-teaching-gans-a-few-tricks-a-bird-is-a-bird-is-a-bird-robots-holding-on-to-things-and-bots-imitating-human-behavior
Savia Lobo
11 Dec 2019
7 min read
Save for later

Teaching GANs a few tricks: a bird is a bird is a bird, robots holding on to things and bots imitating human behavior

Savia Lobo
11 Dec 2019
7 min read
Generative adversarial networks (GANs) have been at the forefront of research on generative models in the last couple of years. GANs have been used for image generation, image processing, image synthesis from captions, image editing, visual domain adaptation, data generation for visual recognition, and many other applications, often leading to state of the art results. One of the tutorials titled, ‘Generative Adversarial Networks’ conducted at the CVPR 2018 (a Conference on Computer Vision and Pattern Recognition held at Salt Lake City, USA) provides a broad overview of generative adversarial networks and how GANs can be trained to perform different purposes.  The tutorial involved various speakers sharing basic concepts, best practices of the current state-of-the-art GAN including network architectures, objective functions, other training tricks, and much more. Let us look at how GANs are trained for different use cases.  There’s more to GANs….. If you further want to explore different examples of modern GAN implementations, including CycleGAN, simGAN, DCGAN, and 2D image to 3D model generation, you can explore the book, Generative Adversarial Networks Cookbook written by Josh Kalin. The recipes given in this cookbook will help you build on a common architecture in Python, TensorFlow and Keras to explore increasingly difficult GAN architectures in an easy-to-read format. Training GANs for object detection using Adversarial Learning Xialong Wang, from Carnegie Mellon University talked about object detection in computer vision as well as from the context of taking actions in robots. He also explained how to use adversarial learning for instances beyond image generation. To train a GAN, the key idea is to find the adversarial tasks for your target tasks to improve your target by fighting against these adversarial tasks. In computer vision if your target task is to recognize a bird using object detection, one adversarial task is adding occlusions by generating a mask to accrue the bird’s head and its leg which will make it difficult for the detector to recognize. The detector will further try to conquer these difficult tasks and from then on it will become robust to Occlusions. Another adversarial task for object detection can be Deformations. Here the image can be slightly rotated to make the detection difficult.  For training robots to grasp objects, one of the adversaries would be the Shaking test. If the robot arm is stable enough the object it grasps should not fall even with a rigourous shake. Another example is snatching. If another arm can snatch easily, it means it is not completely trained to resist snatching or stealing. Wang said the CMU research team tried generating images using DCGAN on the COCO dataset. However, the images generated could not assist in training the detector as the detectors could easily detect them as false images. Next, the team generated images using Conditional GANs on COCO but these didn’t help either. Hence, the team generated hard positive examples in feed by adding real world occlusions or real world deformations to challenge the detectors. He then talked about a Standard Fast R-CNN Detector which takes an image input in the convolutional neural network language model. After taking the input, the detector extracts features for the whole image, and later you can crop the features according to the proposal bounding box. These cropped features are resized to channel (C*6*6); here 6*6 is interred spatial dimensions. These features are the object features you want to focus on and can also use them to perform classification or regression for detections. The team has added a small network in the middle that would input the extracted features and generate a mask. The mask will assist which spatial locations to chop out certain features that would make it hard for the detectors to recognize. He also shared the benchmark results of the tests using different datasets like the AlexNet, VGG16, FRCN, and so on. The ASTN and the ASDN model showed improved output over the other networks.   Understanding Generative Adversarial Imitation Learning (GAIL) for training a machine to imitate human behaviours Stefano Ermon from Stanford University explained how to use Generative modeling ideas and GAN training to imitate human behaviours in complex environments.  A lot of progress in reinforcement learning has been made with successes in playing board games such as Chess, video games, and so on. However, Reinforcement Learning has one limitation. If you want to use it to solve a new task you have to specify a cost signal / a reward signal to provide some supervision to your reinforcement learning algorithm. You also need to specify what kind of behaviors are desirable and which are not.   In a game scenario the cost signal is whether you win or you lose. However, in further complex tasks like driving an autonomous vehicles to specify a cost signal becomes difficult as there are different objective functions like going off road, not moving above the speed limit, avoiding a road crash, and much more.  The simplest method one can use is Behavioural cloning where you can use your trajectories and your demonstrations to construct a training set of states with the corresponding action that the expert took in those states. You can further use your favorite supervised learning method classification or regression if the actions are continuous. However, this has some limitations: Small errors may compound over time as the learning algorithm will make certain mistakes initially and these mistakes will lead towards never seen before states or objects. It is like a Black box approach where every decision requires initial planning. Ermon suggests an alternative to imitation could be an Inverse RL (IRL) approachHe also demonstrates the similarities between RL and IRL. For the complete demonstration, you can check out the video.  The main difference between a GAIL and GANs is that in GANs the generator is taking inputs, random noise and maps them to the neural network producing some samples for the detector. However, in GAIL, the generator is more complex as it includes two components, a policy P which you can train and an environment (Black Box simulator) that can’t be controlled. What matters is the distribution over states and actions that you encounter when you navigate the environment using the policy that can be tuned. As the environment is difficult to control, training the GAIL model is harder than the simple GANs model. On the other hand, in a GANs model, training the policy is challenging such that the discriminator goes into the direction of fooling.  However, GAIL is the easier generative modelling task because you don’t have to learn the whole thing end to end and neither do you have to come up with a large neural network that maps noise into behaviours as some part of the input is given by the environment. But it is harder to train because you don't really know how the black box works. Ermon further explains how using Generative Adversarial Imitation Learning, one can not only imitate complex behaviors, but also learn interpretable and meaningful representations of complex behavioral data, including visual demonstrations with a method named as InfoGAN, a method, built on top of GAIL.   He also explained a new framework for multi-agent imitation learning for general Markov games by integrating multi-agent RL with a suitable extension of multi-agent inverse RL. This method will generalize Generative Adversarial Imitation Learning (GAIL) in the single agent case. This method will successfully imitate complex behaviors in high-dimensional environments with multiple cooperative or competing agents. To know more about further demonstrations on GAIL, InfoGAIL, and Multi-agent GAIL, watch the complete video on YouTube. Knowing the basics isn’t enough, putting them to practice is necessary. If you want to use GANs practically and experiment with them, Generative Adversarial Networks Cookbook by Josh Kalin is your go-to guide. With this cookbook, you will work with use cases involving DCGAN, Pix2Pix, and so on. To understand these complex applications, you will take different real-world data sets and put them to use. Prof. Rowel Atienza discusses the intuition behind deep learning, advances in GANs & techniques to create cutting edge AI- models Now there is a Deepfake that can animate your face with just your voice and a picture using temporal GANs Now there’s a CycleGAN to visualize the effects of climate change. But is this enough to mobilize action?
Read more
  • 0
  • 0
  • 39676
Modal Close icon
Modal Close icon