How-To Tutorials

7009 Articles
Expert Network
28 Jun 2021
7 min read

Top 6 Cybersecurity Books from Packt to Accelerate Your Career

With new technology threats, rising international tensions, and state-sponsored cyber-attacks, cybersecurity is more important than ever. In organizations worldwide, there is not only a dire need for cybersecurity analysts, engineers, and consultants but the senior management executives and leaders are expected to be cognizant of the possible threats and risk management. The era of cyberwarfare is now upon us. What we do now and how we determine what we will do in the future is the difference between whether our businesses live or die and whether our digital self-survives the digital battlefield.  In this article, we'll discuss 6 titles from Packt’s bank of cybersecurity resources for everyone from an aspiring cybersecurity professional to an expert. Adversarial Tradecraft in Cybersecurity  A comprehensive guide that helps you master cutting-edge techniques and countermeasures to protect your organization from live hackers. It enables you to leverage cyber deception in your operations to gain an edge over the competition.  Little has been written about how to act when live hackers attack your system and run amok. Even experienced hackers sometimes tend to struggle when they realize the network defender has caught them and is zoning in on their implants in real-time. This book provides tips and tricks all along the kill chain of an attack, showing where hackers can have the upper hand in a live conflict and how defenders can outsmart them in this adversarial game of computer cat and mouse.  This book contains two subsections in each chapter, specifically focusing on the offensive and defensive teams. Pentesters to red teamers, SOC analysis to incident response, attackers, defenders, general hackers, advanced computer users, and security engineers should gain a lot from this book. This book will also be beneficial to those getting into purple teaming or adversarial simulations, as it includes processes for gaining an advantage over the other team.  
The author, Dan Borges, is a passionate programmer and security researcher who has worked in security positions for companies such as Uber, Mandiant, and CrowdStrike. Dan has been programming various devices for >20 years, with 14+ years in the security industry.  Cybersecurity – Attack and Defense Strategies, Second Edition  A book that enables you to counter modern threats and employ state-of-the-art tools and techniques to protect your organization against cybercriminals. It is a completely revised new edition of the bestselling book, covering the very latest security threats and defense mechanisms including a detailed overview of Cloud Security Posture Management (CSPM) and an assessment of the current threat landscape, with additional focus on new IoT threats and cryptomining.  This book is for IT professionals venturing into the IT security domain, IT pentesters, security consultants, or those looking to perform ethical hacking. Prior knowledge of penetration testing is beneficial.  This book is authored by Yuri Diogenes and Dr. Erdal Ozkaya. Yuri Diogenes is a professor at EC-Council University for their master's degree in cybersecurity and a Senior Program Manager at Microsoft for Azure Security Center. Dr. Erdal Ozkaya is a leading Cybersecurity Professional with business development, management, and academic skills who focuses on securing Cyber Space and sharing his real-life skills as a Security Advisor, Speaker, Lecturer, and Author.  Cyber Minds  This book comprises insights on cybersecurity across the cloud, data, artificial intelligence, blockchain, and IoT to keep you cyber safe. Shira Rubinoff's Cyber Minds brings together the top authorities in cybersecurity to discuss the emergent threats that face industries, societies, militaries, and governments today. Cyber Minds serves as a strategic briefing on cybersecurity and data safety, collecting expert insights from sector security leaders. 
This book will help you to arm and inform yourself of what you need to know to keep your business – or your country – safe.  This book is essential reading for business leaders, the C-Suite, board members, IT decision-makers within an organization, and anyone with a responsibility for cybersecurity.  The author, Shira Rubinoff is a recognized cybersecurity executive, cybersecurity and blockchain advisor, global keynote speaker, and influencer who has built two cybersecurity product companies and led multiple women-in-technology efforts.  Cyber Warfare – Truth, Tactics, and Strategies  Cyber Warfare – Truth, Tactics, and Strategies is as real-life and up-to-date as cyber can possibly be, with examples of actual attacks and defense techniques, tools, and strategies presented for you to learn how to think about defending your own systems and data.  This book introduces you to strategic concepts and truths to help you and your organization survive on the battleground of cyber warfare. The book not only covers cyber warfare, but also looks at the political, cultural, and geographical influences that pertain to these attack methods and helps you understand the motivation and impacts that are likely in each scenario.  This book is for any engineer, leader, or professional with either responsibility for cybersecurity within their organizations, or an interest in working in this ever-growing field.  The author, Dr. Chase Cunningham holds a Ph.D. and M.S. in computer science from Colorado Technical University and a B.S. from American Military University focused on counter-terrorism operations in cyberspace.  Incident Response in the Age of Cloud  This book is a comprehensive guide for organizations on how to prepare for cyber-attacks and control cyber threats and network security breaches in a way that decreases damage, recovery time, and costs, facilitating the adaptation of existing strategies to cloud-based environments.  
It is aimed at first-time incident responders, cybersecurity enthusiasts who want to get into IR, and anyone who is responsible for maintaining business security. This book will also interest CIOs, CISOs, and members of IR, SOC, and CSIRT teams. However, IR is not just about information technology or security teams, and anyone with legal, HR, media, or other active business roles would benefit from this book.   The book assumes you have some admin experience. No prior DFIR experience is required. Some infosec knowledge will be a plus but isn’t mandatory.  The author, Dr. Erdal Ozkaya, is a technically sophisticated executive leader with a solid education and strong business acumen. Over the course of his progressive career, he has developed a keen aptitude for facilitating the integration of standard operating procedures that ensure the optimal functionality of all technical functions and systems.  Cybersecurity Threats, Malware Trends, and Strategies   This book trains you to mitigate exploits, malware, phishing, and other social engineering attacks. After scrutinizing numerous cybersecurity strategies, Microsoft's former Global Chief Security Advisor provides unique insights on the evolution of the threat landscape and how enterprises can address modern cybersecurity challenges.    The book will provide you with an evaluation of the various cybersecurity strategies that have ultimately failed over the past twenty years, along with one or two that have actually worked. It will help executives and security and compliance professionals understand how cloud computing is a game-changer for them.  
This book is designed to benefit senior management at commercial sector and public sector organizations, including Chief Information Security Officers (CISOs) and other senior managers of cybersecurity groups, Chief Information Officers (CIOs), Chief Technology Officers (CTOs), and senior IT managers who want to explore the entire spectrum of cybersecurity, from threat hunting and security risk management to malware analysis.  The author, Tim Rains worked at Microsoft for the better part of two decades where he held a number of roles including Global Chief Security Advisor, Director of Security, Identity and Enterprise Mobility, Director of Trustworthy Computing, and was a founding technical leader of Microsoft's customer-facing Security Incident Response team.  Summary  If you aspire to become a cybersecurity expert, any good study/reference material is as important as hands-on training and practical understanding. By choosing a suitable guide, one can drastically accelerate the learning graph and carve out one’s own successful career trajectory. 

Expert Network
02 Jun 2021
10 min read

Exploring the Strategy Behavioral Design Pattern in Node.js

A design pattern is a reusable solution to a recurring problem. The term is really broad in its definition and can span multiple domains of an application. However, the term is often associated with a well-known set of object-oriented patterns that were popularized in the 90s by the book, Design Patterns: Elements of Reusable Object-Oriented Software, Pearson Education, by the almost legendary Gang of Four (GoF): Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

This article is an excerpt from the book Node.js Design Patterns, Third Edition by Mario Casciaro and Luciano Mammino, a comprehensive guide for learning proven patterns, techniques, and tricks to take full advantage of the Node.js platform.

In this article, we'll look at the behavior of components in software design. We'll learn how to combine objects and how to define the way they communicate so that the behavior of the resulting structure becomes extensible, modular, reusable, and adaptable. After introducing all the behavioral design patterns, we will dive deep into the details of the Strategy pattern.

Now, it's time to roll up your sleeves and get your hands dirty with some behavioral design patterns.

Types of Behavioral Design Patterns

  • The Strategy pattern allows us to extract the common parts of a family of closely related components into a component called the context and allows us to define strategy objects that the context can use to implement specific behaviors.
  • The State pattern is a variation of the Strategy pattern where the strategies are used to model the behavior of a component when under different states.
  • The Template pattern, instead, can be considered the "static" version of the Strategy pattern, where the different specific behaviors are implemented as subclasses of the template class, which models the common parts of the algorithm.
  • The Iterator pattern provides us with a common interface to iterate over a collection. It has now become a core pattern in Node.js. JavaScript offers native support for the pattern (with the iterator and iterable protocols). Iterators can be used as an alternative to complex async iteration patterns and even to Node.js streams.
  • The Middleware pattern allows us to define a modular chain of processing steps. This is a very distinctive pattern born from within the Node.js ecosystem. It can be used to preprocess and postprocess data and requests.
  • The Command pattern materializes the information required to execute a routine, allowing such information to be easily transferred, stored, and processed.

The Strategy Pattern

The Strategy pattern enables an object, called the context, to support variations in its logic by extracting the variable parts into separate, interchangeable objects called strategies. The context implements the common logic of a family of algorithms, while a strategy implements the mutable parts, allowing the context to adapt its behavior depending on different factors, such as an input value, a system configuration, or user preferences.

Strategies are usually part of a family of solutions and all of them implement the same interface expected by the context. The following figure shows the situation we just described:

Figure 1: General structure of the Strategy pattern

Figure 1 shows you how the context object can plug different strategies into its structure as if they were replaceable parts of a piece of machinery. Imagine a car; its tires can be considered its strategy for adapting to different road conditions. We can fit winter tires to go on snowy roads thanks to their studs, while we can decide to fit high-performance tires for traveling mainly on motorways for a long trip. On the one hand, we don't want to change the entire car for this to be possible, and on the other, we don't want a car with eight wheels so that it can go on every possible road.
The Strategy pattern is particularly useful in all those situations where supporting variations in the behavior of a component requires complex conditional logic (lots of if...else or switch statements) or mixing different components of the same family.

Imagine an object called Order that represents an online order on an e-commerce website. The object has a method called pay() that, as it says, finalizes the order and transfers the funds from the user to the online store. To support different payment systems, we have a couple of options:

  • Use an if...else statement in the pay() method to complete the operation based on the chosen payment option
  • Delegate the logic of the payment to a strategy object that implements the logic for the specific payment gateway selected by the user

In the first solution, our Order object cannot support other payment methods unless its code is modified. Also, this can become quite complex when the number of payment options grows. Instead, using the Strategy pattern enables the Order object to support a virtually unlimited number of payment methods and keeps its scope limited to only managing the details of the user, the purchased items, and the relative price while delegating the job of completing the payment to another object.

Let's now demonstrate this pattern with a simple, realistic example.

Multi-format configuration objects

Let's consider an object called Config that holds a set of configuration parameters used by an application, such as the database URL, the listening port of the server, and so on. The Config object should be able to provide a simple interface to access these parameters, but also a way to import and export the configuration using persistent storage, such as a file. We want to be able to support different formats to store the configuration, for example, JSON, INI, or YAML.
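Before moving on to Config, the Order scenario described above can be sketched in a few lines. This is only an illustration under stated assumptions: cardStrategy, walletStrategy, and the pay(amount) interface are hypothetical names invented for this sketch, and the "payment" just returns a message instead of calling a real gateway.

```javascript
// Hypothetical payment strategies: each one exposes the same pay(amount) method.
const cardStrategy = {
  pay: amount => `Charged ${amount.toFixed(2)} to card`
}
const walletStrategy = {
  pay: amount => `Debited ${amount.toFixed(2)} from wallet`
}

// The context: it knows about items and totals, not about payment gateways.
class Order {
  constructor (items, paymentStrategy) {
    this.items = items
    this.paymentStrategy = paymentStrategy
  }

  total () {
    return this.items.reduce((sum, item) => sum + item.price, 0)
  }

  // The variable part of the logic is delegated to the strategy.
  pay () {
    return this.paymentStrategy.pay(this.total())
  }
}

const items = [{ name: 'book', price: 30 }, { name: 'video', price: 12.5 }]
const receipt = new Order(items, cardStrategy).pay()
console.log(receipt)
```

Supporting another payment method in this sketch means writing one more object with a pay() method; the Order class itself stays untouched, which is the property the second option in the list is after.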
By applying what we learned about the Strategy pattern, we can immediately identify the variable part of the Config object, which is the functionality that allows us to serialize and deserialize the configuration. This is going to be our strategy.

Creating a new module

Let's create a new module called config.js, and let's define the generic part of our configuration manager:

    import { promises as fs } from 'fs'
    import objectPath from 'object-path'

    export class Config {
      constructor (formatStrategy) {                 // (1)
        this.data = {}
        this.formatStrategy = formatStrategy
      }

      get (configPath) {                             // (2)
        return objectPath.get(this.data, configPath)
      }

      set (configPath, value) {                      // (2)
        return objectPath.set(this.data, configPath, value)
      }

      async load (filePath) {                        // (3)
        console.log(`Deserializing from ${filePath}`)
        this.data = this.formatStrategy.deserialize(
          await fs.readFile(filePath, 'utf-8')
        )
      }

      async save (filePath) {                        // (3)
        console.log(`Serializing to ${filePath}`)
        await fs.writeFile(filePath,
          this.formatStrategy.serialize(this.data))
      }
    }

This is what's happening in the preceding code:

1. In the constructor, we create an instance variable called data to hold the configuration data. Then we also store formatStrategy, which represents the component that we will use to parse and serialize the data.
2. We provide two methods, set() and get(), to access the configuration properties using a dotted path notation (for example, property.subProperty) by leveraging a library called object-path (nodejsdp.link/object-path).
3. The load() and save() methods are where we delegate, respectively, the deserialization and serialization of the data to our strategy. This is where the logic of the Config class is altered based on the formatStrategy passed as an input in the constructor.
As we can see, this very simple and neat design allows the Config object to seamlessly support different file formats when loading and saving its data. The best part is that the logic to support those various formats is not hardcoded anywhere, so the Config class can adapt without any modification to virtually any file format, given the right strategy.

Creating format strategies

To demonstrate this characteristic, let's now create a couple of format strategies in a file called strategies.js. Let's start with a strategy for parsing and serializing data using the INI file format, which is a widely used configuration format (more info about it here: nodejsdp.link/ini-format). For the task, we will use an npm package called ini (nodejsdp.link/ini):

    import ini from 'ini'

    export const iniStrategy = {
      deserialize: data => ini.parse(data),
      serialize: data => ini.stringify(data)
    }

Nothing really complicated! Our strategy simply implements the agreed interface, so that it can be used by the Config object.
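Because the agreed interface is just a pair of functions, deserialize() and serialize(), a strategy does not need a library behind it. The following sketch shows a hand-written strategy for a flat "key=value" text format. The name keyValueStrategy and the format rules (one pair per line, lines starting with # treated as comments, all values kept as strings) are assumptions made for this illustration, not part of the original example, and the strategy only handles flat objects, so it would not round-trip nested paths such as book.nodejs.

```javascript
// Hypothetical strategy for flat "key=value" text, one pair per line.
// Assumption: values are plain strings and keys contain no "=".
const keyValueStrategy = {
  deserialize: text => {
    const data = {}
    for (const line of text.split('\n')) {
      const trimmed = line.trim()
      // Skip blank lines and comment lines.
      if (trimmed === '' || trimmed.startsWith('#')) continue
      const separator = trimmed.indexOf('=')
      // Ignore malformed lines that have no "=".
      if (separator === -1) continue
      const key = trimmed.slice(0, separator).trim()
      data[key] = trimmed.slice(separator + 1).trim()
    }
    return data
  },
  serialize: data =>
    Object.entries(data)
      .map(([key, value]) => `${key}=${value}`)
      .join('\n')
}

const parsed = keyValueStrategy.deserialize('# sample\nport=8080\nhost = localhost\n')
console.log(parsed)
console.log(keyValueStrategy.serialize(parsed))
```

An object shaped like this could be passed to the Config constructor in the same way as any other format strategy, since the context only ever calls the two interface methods.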
Similarly, the next strategy that we are going to create allows us to support the JSON file format, widely used in JavaScript and in the web development ecosystem in general:

    export const jsonStrategy = {
      deserialize: data => JSON.parse(data),
      serialize: data => JSON.stringify(data, null, '  ')
    }

Now, to show you how everything comes together, let's create a file named index.js, and let's try to load and save a sample configuration using different formats:

    import { Config } from './config.js'
    import { jsonStrategy, iniStrategy } from './strategies.js'

    async function main () {
      const iniConfig = new Config(iniStrategy)
      await iniConfig.load('samples/conf.ini')
      iniConfig.set('book.nodejs', 'design patterns')
      await iniConfig.save('samples/conf_mod.ini')

      const jsonConfig = new Config(jsonStrategy)
      await jsonConfig.load('samples/conf.json')
      jsonConfig.set('book.nodejs', 'design patterns')
      await jsonConfig.save('samples/conf_mod.json')
    }

    main()

Our test module reveals the core properties of the Strategy pattern. We defined only one Config class, which implements the common parts of our configuration manager, then, by using different strategies for serializing and deserializing data, we created different Config class instances supporting different file formats.

The example we've just seen shows us only one of the possible alternatives that we had for selecting a strategy. Other valid approaches might have been the following:

  • Creating two different strategy families: One for the deserialization and the other for the serialization. This would have allowed reading from a format and saving to another.
  • Dynamically selecting the strategy: Depending on the extension of the file provided, the Config object could have maintained a map extension → strategy and used it to select the right algorithm for the given extension.
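The second alternative, selecting the strategy from the file extension, could look roughly like the sketch below. The names strategiesByExtension, strategyFor, and linesStrategy are hypothetical and chosen for this illustration; linesStrategy is a stand-in second format so that the sketch runs without any npm dependency.

```javascript
// Same deserialize/serialize interface as the strategies in the article.
const jsonStrategy = {
  deserialize: data => JSON.parse(data),
  serialize: data => JSON.stringify(data, null, '  ')
}
// Hypothetical stand-in for a second format (one entry per line).
const linesStrategy = {
  deserialize: data => data.split('\n').filter(line => line !== ''),
  serialize: data => data.join('\n')
}

// Map of file extension -> strategy. The registered extensions are assumptions.
const strategiesByExtension = new Map([
  ['.json', jsonStrategy],
  ['.txt', linesStrategy]
])

// Picks the strategy registered for the extension of filePath,
// and fails loudly when the format is unknown.
function strategyFor (filePath) {
  const dot = filePath.lastIndexOf('.')
  const extension = dot === -1 ? '' : filePath.slice(dot).toLowerCase()
  const strategy = strategiesByExtension.get(extension)
  if (!strategy) {
    throw new Error(`No format strategy registered for "${extension}"`)
  }
  return strategy
}

console.log(strategyFor('samples/conf.json') === jsonStrategy)
```

With a helper like this, a context could choose its strategy inside load() and save() instead of receiving it in the constructor. The trade-off is that the context then has to know about the registry, which couples it a little more tightly to the set of supported formats.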
As we can see, we have several options for selecting the strategy to use, and the right one depends only on your requirements and on the tradeoff between features and simplicity that you want to obtain.

Furthermore, the implementation of the pattern itself can vary a lot as well. For example, in its simplest form, the context and the strategy can both be simple functions:

    function context(strategy) {...}

Even though this may seem insignificant, it should not be underestimated in a programming language such as JavaScript, where functions are first-class citizens and are used as much as fully-fledged objects.

Across all these variations, though, what does not change is the idea behind the pattern; as always, the implementation can change slightly, but the core concepts that drive the pattern are always the same.

Summary

In this article, we dived deep into the details of the Strategy pattern, one of the behavioral design patterns in Node.js. Learn more in the book, Node.js Design Patterns, Third Edition by Mario Casciaro and Luciano Mammino.

About the Authors

Mario Casciaro is a software engineer and entrepreneur. Mario worked at IBM for a number of years, first in Rome, then in the Dublin Software Lab. He currently splits his time between Var7 Technologies, his own software company, and his role as lead engineer at D4H Technologies, where he creates software for emergency response teams.

Luciano Mammino wrote his first line of code at the age of 12 on his father's old i386. Since then he has never stopped coding. He is currently working at FabFitFun as principal software engineer, where he builds microservices to serve millions of users every day.

Expert Network
19 May 2021
7 min read

Scientific Analysis of Donald Trump’s Tweets on COVID-19 with Transformers

It takes time and effort to figure out what is fake news and what isn't. Like children, we have to work our way through something we perceive as fake news. This article is an excerpt from the book Transformers for Natural Language Processing by Denis Rothman – A comprehensive guide for deep learning & NLP practitioners, data analysts and data scientists who want an introduction to AI language understanding to process the increasing amounts of language-driven functions.  In this article, we will focus on the logic of fake news. We will run the BERT model on SRL and visualize the results on AllenNLP.org. Now, let's go through some presidential tweets on COVID-19.  Our goal is certainly not to judge anybody or anything. Fake news involves both opinion and facts. News often depends on the perception of facts by local culture. We will provide ideas and tools to help others gather more information on a topic and find their way in the jungle of information we receive every day. Semantic Role Labeling (SRL)   SRL is an excellent educational tool for all of us. We tend just to read Tweets passively and listen to what others say about them. Breaking messages down with SRL is a good way to develop social media analytical skills to distinguish fake from accurate information.   I recommend using SRL transformers for educational purposes in class. A young student can enter a Tweet and analyze each verb and its arguments. It could help younger generations become active readers on social media. We will first analyze a relatively undivided Tweet and then a conflictual Tweet. Analyzing the undivided Tweet  Let's analyze the latest Tweet found on July 4 while writing the book, Transformers for Natural Language Processing. I took the name of the person who is referred to as a "Black American" out and paraphrased some of the former President's text:   "X is a great American, is hospitalized with coronavirus, and has requested prayer. 
Would you join me in praying for him today, as well as all those who are suffering from COVID-19?"    Let's go to AllenNLP.org, visualize our SRL using https://demo.allennlp.org/semantic-role-labeling, run the sentence, and look at the result. The verb "hospitalized" shows the member is staying close to the facts:   Figure: SRL arguments of the verb "hospitalized"   The message is simple: "X" + "hospitalized" + "coronavirus."   The verb "requested" shows that the message is becoming political:   Figure: SRL arguments of the verb "requested"   We don't know if the person requested the former President to pray or he decided he would be the center of the request.   A good exercise would be to display an HTML page and ask the users what they think. For example, the users could be asked to look at the results of the SRL task and answer the two following questions:   "Was former President Trump asked to pray, or did he deviate a request made to others for political reasons?"   "Is the fact that former President Trump states that he was indirectly asked to pray for X fake news or not?"  You can think about it and decide for yourself!   Analyzing the Banned Tweet Let's have a look at one that was banned from Twitter. I took the names out and paraphrased it and toned it down. Still, when we run it on AllenNLP.org and visualize the results, we get some surprising SRL outputs.   Here is the toned-down and paraphrased Tweet:   These thugs are dishonoring the memory of X.   When the looting starts, actions must be taken.   Although I suppressed the main part of the original Tweet, we can see that the SRL task shows the bad associations made in the Tweet:   Figure: SRL arguments of the verb "dishonoring"   An educational approach to this would be to explain that we should not associate the arguments "thugs" and "memory" and "looting." They do not fit together at all.   An important exercise would be to ask a user why the SRL arguments do not fit together.   
I recommend many such exercises so that the transformer model users develop SRL skills to have a critical view of any topic presented to them.

Critical thinking is the best way to stop the propagation of the fake news pandemic!

We have gone through rational approaches to fake news with transformers, heuristics, and instructive websites. However, in the end, a lot of the heat in fake news debates boils down to emotional and irrational reactions.

In a world of opinion, you will never find an entirely objective transformer model that detects fake news since opposing sides never agree on what the truth is in the first place! One side will agree with the transformer model's output. Another will say that the model is biased and built by enemies of their opinion!

The best approach is to listen to others and try to keep the heat down!

Looking for the silver bullet

Looking for a silver bullet transformer model can be time-consuming or rewarding, depending on how much time and money you want to spend on continually changing models.

For example, a new approach to transformers can be found through disentanglement. Disentanglement in AI allows you to separate the features of a representation to make the training process more flexible. Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen designed DeBERTa, a disentangled version of a transformer, and described the model in an interesting article:

DeBERTa: Decoding-enhanced BERT with Disentangled Attention, https://arxiv.org/abs/2006.03654

The two main ideas implemented in DeBERTa are:

  • Disentangle the content and position in the transformer model to train the two vectors separately.
  • Use an absolute position in the decoder to predict masked tokens in the pretraining process.

The authors provide the code on GitHub: https://github.com/microsoft/DeBERTa

DeBERTa exceeded the human baseline on the SuperGLUE leaderboard in December 2020 using 1.5B parameters.
Should you stop everything you are doing on transformers and rush to this model, integrate your data, train the model, test it, and implement it?   It is very probable that by the end of 2021, another model will beat this one and so on. Should you change models all of the time in production? That will be your decision.   You can also choose to design better training methods.   Looking for reliable training methods   Looking for reliable training methods with smaller models such as the PET designed by Timo Schick can also be a solution.   Why? Being in a good position on the SuperGLUE leaderboard does not mean that the model will provide a high quality of decision-making for medical, legal, and other critical areas for sequence predications.   Looking for customized training solutions for a specific topic could be more productive than trying all the best transformers on the SuperGLUE leaderboard.   Take your time to think about implementing transformers to find the best approach for your project.   We will now conclude the article.   Summary   Fake news begins deep inside our emotional history as humans. When an event occurs, emotions take over to help us react quickly to a situation. We are hardwired to react strongly when we are threatened.   We went through raging conflicts over COVID-19, former President Trump, and climate change. In each case, we saw that emotional reactions are the fastest ones to build up into conflicts.   We then designed a roadmap to take the emotional perception of fake news to a rational level. We showed that it is possible to find key information in Tweets, Facebook messages, and other media. The news used in this article is perceived by some as real news and others as fake news to create a rationale for teachers, parents, friends, co-workers, or just people talking.  About the Author Denis Rothman graduated from Sorbonne University and Paris-Diderot University, patenting one of the very first word2matrix embedding solutions. 
Denis Rothman is the author of three cutting-edge AI solutions: one of the first AI cognitive chatbots more than 30 years ago; a profit-orientated AI resource optimizing system; and an AI APS (Advanced Planning and Scheduling) solution based on cognitive patterns used worldwide in aerospace, rail, energy, apparel, and many other fields. Designed initially as a cognitive AI bot for IBM, it then went on to become a robust APS solution used to this day. 

Expert Network
30 Apr 2021
7 min read

Distributed training in TensorFlow 2.x

TensorFlow 2 is a rich development ecosystem composed of two main parts: Training and Serving. Training consists of a set of libraries for dealing with datasets (tf.data), a set of libraries for building models, including high-level libraries (tf.keras and Estimators), low-level libraries (tf.*), and a collection of pretrained models (TensorFlow Hub). Training can happen on CPUs, GPUs, and TPUs via distribution strategies and the result can be saved using the appropriate libraries.

This article is an excerpt from the book, Deep Learning with TensorFlow 2 and Keras, Second Edition by Antonio Gulli, Amita Kapoor, and Sujit Pal. This book teaches deep learning techniques alongside TensorFlow (TF) and Keras. In this article, we'll review a powerful new feature in TensorFlow 2.x: distributed training.

One very useful addition to TensorFlow 2.x is the possibility to train models using distributed GPUs, multiple machines, and TPUs in a very simple way with very few additional lines of code. tf.distribute.Strategy is the TensorFlow API used in this case and it supports both the tf.keras and tf.estimator APIs and eager execution. You can switch between GPUs, TPUs, and multiple machines by just changing the strategy instance. Strategies can be synchronous, where all workers train over different slices of input data in a form of sync data parallel computation, or asynchronous, where updates from the optimizers are not happening in sync. All strategies require that data is loaded in batches via the tf.data.Dataset API.

Note that the distributed training support is still experimental. A roadmap is given in Figure 1:

Figure 1: Distributed training support for different strategies and APIs

Let's discuss in detail all the different strategies reported in Figure 1.

Multiple GPUs

TensorFlow 2.x can utilize multiple GPUs.
If we want to have synchronous distributed training on multiple GPUs on one machine, there are two things that we need to do: (1) load the data in a way that can be distributed to the GPUs, and (2) distribute some computations to the GPUs too.

In order to load our data in a way that can be distributed to the GPUs, we simply need a tf.data.Dataset (which has already been discussed in the previous paragraphs). If we do not have a tf.data.Dataset but we have a normal tensor, then we can easily convert the latter into the former using tf.data.Dataset.from_tensor_slices(). This takes a tensor in memory and returns a source dataset, the elements of which are slices of the given tensor. In our toy example, we use NumPy to generate training data x and labels y, and we transform them into a tf.data.Dataset with tf.data.Dataset.from_tensor_slices(). Then we apply a shuffle to avoid bias in training across GPUs and group the examples into batches of size SIZE_BATCHES:

    import tensorflow as tf
    import numpy as np
    from tensorflow import keras

    N_TRAIN_EXAMPLES = 1024*1024
    N_FEATURES = 10
    SIZE_BATCHES = 256

    # Each example has N_FEATURES random floats in the half-open interval [0.0, 1.0).
    x = np.random.random((N_TRAIN_EXAMPLES, N_FEATURES))
    y = np.random.randint(2, size=(N_TRAIN_EXAMPLES, 1))
    x = tf.dtypes.cast(x, tf.float32)
    print(x)
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=N_TRAIN_EXAMPLES).batch(SIZE_BATCHES)

In order to distribute some computations to GPUs, we instantiate a tf.distribute.MirroredStrategy() object (named distribution in our example), which supports synchronous distributed training on multiple GPUs on one machine. Then, we move the creation and compilation of the Keras model inside distribution.scope(). Note that each variable in the model is mirrored across all the replicas.
Let’s see it in our toy example:

    # this is the distribution strategy
    distribution = tf.distribute.MirroredStrategy()

    # this piece of code is distributed to multiple GPUs
    with distribution.scope():
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(N_FEATURES,)))
        model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
        optimizer = tf.keras.optimizers.SGD(0.2)
        model.compile(loss='binary_crossentropy', optimizer=optimizer)

    model.summary()

    # Optimize in the usual way but in reality you are using GPUs.
    model.fit(dataset, epochs=5, steps_per_epoch=10)

Note that each batch of the given input is divided equally among the multiple GPUs. For instance, if using MirroredStrategy() with two GPUs, each batch of size 256 will be divided among the two GPUs, with each of them receiving 128 input examples for each step. In addition, note that each GPU will optimize on the received batches and the TensorFlow backend will combine all these independent optimizations on our behalf. In short, using multiple GPUs is very easy and requires minimal changes to the tf.keras code used for a single server.

MultiWorkerMirroredStrategy

This strategy implements synchronous distributed training across multiple workers, each one with potentially multiple GPUs. As of September 2019, the strategy works only with Estimators and has experimental support for tf.keras. This strategy should be used if you are aiming at scaling beyond a single machine with high performance. Data must be loaded with tf.data.Dataset and sharded across workers so that each worker can read a unique subset.

TPUStrategy

This strategy implements synchronous distributed training on TPUs. TPUs are Google’s specialized ASIC chips designed to significantly accelerate machine learning workloads in a way often more efficient than GPUs.
According to this public information (https://github.com/tensorflow/tensorflow/issues/24412):

“the gist is that we intend to announce support for TPUStrategy alongside Tensorflow 2.1. Tensorflow 2.0 will work under limited use-cases but has many improvements (bug fixes, performance improvements) that we’re including in Tensorflow 2.1, so we don’t consider it ready yet.”

ParameterServerStrategy

This strategy implements either multi-GPU synchronous local training or asynchronous multi-machine training. For local training on one machine, the variables of the models are placed on the CPU and operations are replicated across all local GPUs. For multi-machine training, some machines are designated as workers and some as parameter servers, with the variables of the model placed on the parameter servers. Computation is replicated across all GPUs of all workers. Multiple workers can be set up with the environment variable TF_CONFIG as in the following example:

    import json
    import os

    os.environ["TF_CONFIG"] = json.dumps({
        "cluster": {
            "worker": ["host1:port", "host2:port", "host3:port"],
            "ps": ["host4:port", "host5:port"]
        },
        "task": {"type": "worker", "index": 1}
    })

In this article, we have seen how it is possible to train models using distributed GPUs, multiple machines, and TPUs in a very simple way with very few additional lines of code. Learn how to build machine and deep learning systems with the newly released TensorFlow 2 and Keras for the lab, production, and mobile devices with Deep Learning with TensorFlow 2 and Keras, Second Edition by Antonio Gulli, Amita Kapoor and Sujit Pal.

About the Authors

Antonio Gulli is a software executive and business leader with a passion for establishing and managing global technological talent, innovation, and execution. He is an expert in search engines, online services, machine learning, information retrieval, analytics, and cloud computing.
Amita Kapoor is an Associate Professor in the Department of Electronics, SRCASW, University of Delhi and has been actively teaching neural networks and artificial intelligence for the last 20 years. She is an active member of ACM, AAAI, IEEE, and INNS. She has co-authored two books.   Sujit Pal is a technology research director at Elsevier Labs, working on building intelligent systems around research content and metadata. His primary interests are information retrieval, ontologies, natural language processing, machine learning, and distributed processing. He is currently working on image classification and similarity using deep learning models. He writes about technology on his blog at Salmon Run. 
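As a follow-up to the TF_CONFIG example in the ParameterServerStrategy section, here is a minimal sketch of how the per-machine configuration could be generated for every task in the cluster. It assumes the same placeholder host names used in the article; the helper name make_tf_config is hypothetical and is not part of TensorFlow. Only the Python standard library is used.

```python
import json

# Placeholder cluster layout mirroring the article's example; the
# "hostN:port" strings are stand-ins for real addresses.
CLUSTER = {
    "worker": ["host1:port", "host2:port", "host3:port"],
    "ps": ["host4:port", "host5:port"],
}


def make_tf_config(cluster, task_type, index):
    """Return the JSON string one task would store in TF_CONFIG.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    if task_type not in cluster:
        raise ValueError(f"unknown task type: {task_type!r}")
    if not 0 <= index < len(cluster[task_type]):
        raise ValueError(f"index {index} out of range for {task_type!r}")
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": index},
    })


# One configuration per machine: every task sees the same cluster
# description and differs only in its own "task" entry.
configs = {
    (task_type, index): make_tf_config(CLUSTER, task_type, index)
    for task_type, hosts in CLUSTER.items()
    for index in range(len(hosts))
}

# On the second worker you would then set, before creating the strategy:
#   os.environ["TF_CONFIG"] = configs[("worker", 1)]
print(configs[("worker", 1)])
```

The point of the sketch is that the "cluster" part is identical on every machine, while "task" identifies which role and position the current process holds.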

Expert Network
20 Apr 2021
6 min read

How to Create Tensors in PyTorch

A tensor is the fundamental building block of all DL toolkits. The name sounds rather mystical, but the underlying idea is that a tensor is a multi-dimensional array. To build an analogy with school math, a single number is like a point, which is zero-dimensional, while a vector is one-dimensional like a line segment, and a matrix is a two-dimensional object. Three-dimensional number collections can be represented by a parallelepiped of numbers, but they don't have a separate name in the same way as a matrix. We can keep this term for collections of higher dimensions, which are named multi-dimensional arrays.

Another thing to note about tensors used in DL is that they are only partially related to tensors used in tensor calculus or tensor algebra. In DL, a tensor is any multi-dimensional array, but in mathematics, a tensor is a mapping between vector spaces, which might be represented as a multi-dimensional array in some cases but has a much richer semantic payload behind it. Mathematicians usually frown at everybody who uses well-established mathematical terms to name different things, so be warned.

Figure 1: Going from a single number to an n-dimensional tensor

This article is an excerpt from the book Deep Reinforcement Learning Hands-On - Second Edition by Maxim Lapan. This book is an updated and expanded version of the bestselling guide to the very latest RL tools and techniques. In this article, we’ll discuss the fundamental building block of all DL toolkits, the tensor.

Creation of tensors

If you're familiar with the NumPy library, then you already know that its central purpose is the handling of multi-dimensional arrays in a generic way. In NumPy, such arrays aren't called tensors, but they are in fact tensors. Tensors are used very widely in scientific computations as generic storage for data. For example, a color image could be encoded as a 3D tensor with dimensions of width, height, and color plane.
Apart from dimensions, a tensor is characterized by the type of its elements. There are eight types supported by PyTorch: three float types (16-bit, 32-bit, and 64-bit) and five integer types (8-bit signed, 8-bit unsigned, 16-bit, 32-bit, and 64-bit). Tensors of different types are represented by different classes, with the most commonly used being torch.FloatTensor (corresponding to a 32-bit float), torch.ByteTensor (an 8-bit unsigned integer), and torch.LongTensor (a 64-bit signed integer). The rest can be found in the PyTorch documentation.

There are three ways to create a tensor in PyTorch:

By calling a constructor of the required type.

By converting a NumPy array or a Python list into a tensor. In this case, the type will be taken from the array's type.

By asking PyTorch to create a tensor with specific data for you. For example, you can use the torch.zeros() function to create a tensor filled with zero values.

To give you examples of these methods, let's look at a simple session:

    >>> import torch
    >>> import numpy as np
    >>> a = torch.FloatTensor(3, 2)
    >>> a
    tensor([[4.1521e+09, 4.5796e-41],
            [1.9949e-20, 3.0774e-41],
            [4.4842e-44, 0.0000e+00]])

Here, we imported both PyTorch and NumPy and created an uninitialized tensor of size 3×2. By default, PyTorch allocates memory for the tensor, but doesn't initialize it with anything. To clear the tensor's content, we need to use its zero_() operation:

    >>> a.zero_()
    tensor([[0., 0.],
            [0., 0.],
            [0., 0.]])

There are two types of operation for tensors: inplace and functional. Inplace operations have an underscore appended to their name and operate on the tensor's content. After this, the object itself is returned. The functional equivalent creates a copy of the tensor with the performed modification, leaving the original tensor untouched. Inplace operations are usually more efficient from a performance and memory point of view.
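The difference between inplace and functional operations described above can be sketched as follows. This is a minimal illustration rather than an excerpt from the book: it assumes PyTorch is installed and uses torch.zeros() together with the add() / add_() pair, which are not part of the original session.

```python
import torch

# Start from a known state instead of uninitialized memory.
a = torch.zeros(3, 2)

# Functional form: returns a new tensor and leaves `a` untouched.
b = a.add(1.0)

# Inplace form (trailing underscore): modifies `a` and returns `a` itself.
c = a.add_(2.0)

print(b)        # the copy produced by the functional call
print(a)        # the original tensor, now modified in place
print(c is a)   # the inplace call hands back the same object
```

Because the inplace call returns the tensor itself, calls such as a.zero_().add_(1.0) can be chained without allocating intermediate tensors.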
Another way to create a tensor by its constructor is to provide a Python iterable (for example, a list or tuple), which will be used as the contents of the newly created tensor:

    >>> torch.FloatTensor([[1,2,3],[3,2,1]])
    tensor([[1., 2., 3.],
            [3., 2., 1.]])

Here we are creating the same tensor with zeroes using NumPy:

    >>> n = np.zeros(shape=(3, 2))
    >>> n
    array([[0., 0.],
           [0., 0.],
           [0., 0.]])
    >>> b = torch.tensor(n)
    >>> b
    tensor([[0., 0.],
            [0., 0.],
            [0., 0.]], dtype=torch.float64)

The torch.tensor method accepts the NumPy array as an argument and creates a tensor of appropriate shape from it. In the preceding example, we created a NumPy array initialized by zeros, which created a double (64-bit float) array by default. So, the resulting tensor has the DoubleTensor type (which is shown in the preceding example with the dtype value). Usually, in DL, double precision is not required and it adds an extra memory and performance overhead. The common practice is to use the 32-bit float type, or even the 16-bit float type, which is more than enough. To create such a tensor, you need to explicitly specify the type of the NumPy array:

    >>> n = np.zeros(shape=(3, 2), dtype=np.float32)
    >>> torch.tensor(n)
    tensor([[0., 0.],
            [0., 0.],
            [0., 0.]])

As an option, the type of the desired tensor could be provided to the torch.tensor function in the dtype argument. However, be careful, since this argument expects to get a PyTorch type specification, not the NumPy one. PyTorch types are kept in the torch package, for example, torch.float32 and torch.uint8.

    >>> n = np.zeros(shape=(3,2))
    >>> torch.tensor(n, dtype=torch.float32)
    tensor([[0., 0.],
            [0., 0.],
            [0., 0.]])

In this article, we saw a quick overview of the tensor, the fundamental building block of all DL toolkits. We talked about what a tensor is and how to create one in the PyTorch library.
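Returning to the point about double precision adding memory overhead, the arithmetic can be made concrete with a small sketch that uses only Python's standard library. The helper tensor_bytes and the example batch shape are hypothetical, chosen for illustration; the sketch counts raw element storage only and ignores any bookkeeping a framework adds.

```python
import struct

# Bytes per element for the three float widths mentioned in the article.
# struct format codes: 'e' = 16-bit, 'f' = 32-bit, 'd' = 64-bit IEEE 754 floats.
BYTES_PER_ELEMENT = {
    "float16": struct.calcsize("<e"),
    "float32": struct.calcsize("<f"),
    "float64": struct.calcsize("<d"),
}


def tensor_bytes(shape, dtype):
    """Raw storage for a dense tensor of the given shape (element data only)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_ELEMENT[dtype]


# The 3x2 tensor from the session above.
for dtype in ("float64", "float32", "float16"):
    print(dtype, tensor_bytes((3, 2), dtype), "bytes")

# A hypothetical batch of 64 RGB images of 224x224 pixels.
batch_shape = (64, 3, 224, 224)
for dtype in ("float64", "float32", "float16"):
    print(dtype, tensor_bytes(batch_shape, dtype) / 2**20, "MiB")
```

Each step down in precision halves the storage, which is why the default float64 produced by np.zeros() is usually converted before training.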
Discover ways to increase the efficiency of RL methods from both theoretical and engineering perspectives with the book Deep Reinforcement Learning Hands-On, Second Edition by Maxim Lapan.

About the Author

Maxim Lapan is a deep learning enthusiast and independent researcher. He has spent 15 years working as a software developer and systems architect. His projects have ranged from low-level Linux kernel driver development to performance optimization and the design of distributed applications working on thousands of servers.

With his areas of expertise including big data, machine learning, and large parallel distributed HPC and non-HPC systems, Maxim is able to explain complicated concepts using simple words and vivid examples. His current areas of interest are practical applications of deep learning, such as deep natural language processing and deep reinforcement learning. Maxim lives in Moscow, Russian Federation, with his family.

Richard Gall
20 Feb 2020
3 min read

Rookout and AppDynamics team up to help enterprise engineering teams debug at speed with Deep Code Insights

It's not acknowledged enough that the real headache when it comes to software faults and performance problems isn't so much the problems themselves, but instead the process of actually identifying those problems. Sure, problems might slow you down, but wading through your application code to actually understand what's happened can sometimes bring engineering teams to a grinding halt. For enterprise engineering teams, this can be particularly fatal. Agility is hard enough when you're dealing with complex applications and the burden of legacy software; but when things go wrong, any notion of velocity can be summarily discarded.

However, a new partnership between debugging platform Rookout and APM company AppDynamics, announced at AppDynamics' Transform 2020 event, might just change that. The two organizations have teamed up, with Rookout's impressive debugging capabilities now available to AppDynamics customers in the form of a new product called Deep Code Insights.

Figure: Live debugging of an application in production in Deep Code Insights

What is Deep Code Insights?

Deep Code Insights is a new product for AppDynamics customers that combines the live-code debugging capabilities offered by Rookout with AppDynamics' APM platform. The advantage for developers could be substantial. Jerrie Pineda, Enterprise Software Architect at Maverik, says that "Rookout helps me get the debugging data I need in seconds instead of waiting for several hours." This means, he explains, "[Maverik's] mean time to resolution (MTTR) for most issues is slashed up to 80%."

What does Deep Code Insights mean for AppDynamics?

For AppDynamics, Deep Code Insights allows the organization to go one step further in its mission to "make it easier for businesses to understand their own software." At least that's how AppDynamics' VP of corporate development and strategy, Kevin Wagner, puts it.
"Together [with Rookout], we are narrowing the gaps between indicating a code-related problem impacting performance, pinpointing the direct issue within the line of code, and deploying a solution quickly for a seamless customer experience," he says. What does Deep Code Insights mean for Rookout? For Rookout, meanwhile, the partnership with AppDynamics is a great way for the company to reach out to a wider audience of users working at large enterprise organizations. The company received $8,000,000 in Series A funding back in August. This has provided a solid platform on which it is clearly looking to build and grow. Rookout's Co-Founder and CEO Or Weis describes the partnership as "obvious." "We want to bring the next-gen developer workflow to enterprise customers and help them increase product velocity," he says. Learn more about Rookout: www.rookout.com Learn more about AppDynamics: www.appdynamics.com  
Packt Editorial Staff
03 Feb 2020
8 min read

How to implement data validation with Xamarin.Forms

In software, data validation is a process that ensures the validity and integrity of user input and usually involves checking that the data is in the correct format and contains an acceptable value. In this Xamarin tutorial, you'll learn how to implement it with Xamarin.Forms.

This article is an excerpt from the book Mastering Xamarin.Forms, Third Edition by Ed Snider. The book walks you through the creation of a simple app, explaining at every step why you're doing the things you're doing, so that you gain the skills you need to use Xamarin.Forms to create your own high-quality apps.

Types of data validation in mobile application development

There are typically two types of validation when building apps: server-side and client-side. Both play an important role in the lifecycle of an app's data. Server-side validation is critical when it comes to security, making sure malicious data or code doesn't make its way into the server or backend infrastructure. Client-side validation is usually more about user experience than security. A mobile app should always validate its data before sending it to a backend (such as a web API) for several reasons, including the following:

To provide real-time feedback to the user about any issues instead of waiting on a response from the backend.

To support saving data in offline scenarios where the backend is not available.

To prevent encoding issues when sending the data to the backend.

Just as a backend server should never assume all incoming data has been validated by the client-side before being received, a mobile app should also never assume the backend will do its own server-side validation, even though it's a good security practice. For this reason, mobile apps should perform as much client-side validation as possible.

When adding validation to a mobile app, the actual validation logic can go in a few areas of the app architecture.
It could go directly in the UI code (the View layer of a Model-View-ViewModel (MVVM) architecture), it could go in the business logic or controller code (the ViewModel layer of an MVVM architecture), or it could even go in the HTTP code. In most cases when implementing the MVVM pattern it will make the most sense to include validation in the ViewModels for the following reasons:

The validation rules can be checked as the individual properties of the ViewModel are changed.

The validation rules are often part of or dependent on some business logic that exists in the ViewModel.

Most importantly, having the validation rules implemented in the ViewModel makes them easy to test.

Adding a base validation ViewModel in Xamarin.Forms

Validation makes the most sense in the ViewModel. To do this we will start by creating a new base ViewModel that will provide some base level methods, properties, and events for subclassed ViewModels to leverage. This new base ViewModel will be called BaseValidationViewModel and will subclass the BaseViewModel. It will also implement an interface called INotifyDataErrorInfo from the System.ComponentModel namespace. INotifyDataErrorInfo works a lot like INotifyPropertyChanged: it specifies some properties about what errors have occurred as well as an event for when the error state of a particular property changes.

1. Create a new class in the ViewModels folder named BaseValidationViewModel that subclasses BaseViewModel:

    public class BaseValidationViewModel : BaseViewModel
    {
        public BaseValidationViewModel(INavService navService)
            : base(navService)
        {
        }
    }

2.
Update BaseValidationViewModel to implement INotifyDataErrorInfo as follows:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Linq;

    public class BaseValidationViewModel : BaseViewModel, INotifyDataErrorInfo
    {
        // Error messages keyed by property name.
        readonly IDictionary<string, List<string>> _errors = new Dictionary<string, List<string>>();

        public BaseValidationViewModel(INavService navService)
            : base(navService)
        {
        }

        public event EventHandler<DataErrorsChangedEventArgs> ErrorsChanged;

        public bool HasErrors => _errors?.Any(x => x.Value?.Any() == true) == true;

        public IEnumerable GetErrors(string propertyName)
        {
            // No property name: return every error on the ViewModel.
            if (string.IsNullOrWhiteSpace(propertyName))
            {
                return _errors.SelectMany(x => x.Value);
            }

            if (_errors.ContainsKey(propertyName) && _errors[propertyName].Any())
            {
                return _errors[propertyName];
            }

            return new List<string>();
        }
    }

3. In addition to implementing the required members of INotifyDataErrorInfo (ErrorsChanged, HasErrors, and GetErrors()), we also need to add a method that actually handles validating ViewModel properties. This method needs a validation rule parameter in the form of a Func<bool> and an error message to be used if the validation rule fails. Add a protected method named Validate to BaseValidationViewModel as follows:

    // Requires: using System.Runtime.CompilerServices; (for CallerMemberName)
    public class BaseValidationViewModel : BaseViewModel, INotifyDataErrorInfo
    {
        // ...

        protected void Validate(Func<bool> rule, string error, [CallerMemberName] string propertyName = "")
        {
            if (string.IsNullOrWhiteSpace(propertyName)) return;

            // Clear any previous error for this property before re-checking the rule.
            if (_errors.ContainsKey(propertyName))
            {
                _errors.Remove(propertyName);
            }

            if (rule() == false)
            {
                _errors.Add(propertyName, new List<string> { error });
            }

            OnPropertyChanged(nameof(HasErrors));
            ErrorsChanged?.Invoke(this, new DataErrorsChangedEventArgs(propertyName));
        }
    }

If the validation rule Func<bool> returns false, the error message that is provided is added to a private list of errors (used by HasErrors and GetErrors()) mapped to the specific property that called into this Validate() method.
Lastly, the Validate() method invokes the ErrorsChanged event with the caller property's name included in the event arguments. Now any ViewModel that needs to perform validation can subclass BaseValidationViewModel and call the Validate() method to check if individual properties are valid. In the next section, we will use BaseValidationViewModel to add validation to the new entry page and its supporting ViewModel.

Adding validation to the new entry page in Xamarin.Forms

In this section we will add some simple client-side validation to a couple of the entry fields on the new entry page.

1. First, update NewEntryViewModel to subclass BaseValidationViewModel instead of BaseViewModel:

    public class NewEntryViewModel : BaseValidationViewModel
    {
        // ...
    }

Because BaseValidationViewModel subclasses BaseViewModel, NewEntryViewModel is still able to leverage everything in BaseViewModel as well.

2. Next, add a call to Validate() in the Title property setter that includes a validation rule specifying that the field cannot be left blank:

    public string Title
    {
        get => _title;
        set
        {
            _title = value;
            Validate(() => !string.IsNullOrWhiteSpace(_title), "Title must be provided.");
            OnPropertyChanged();
            SaveCommand.ChangeCanExecute();
        }
    }

3. Next, add a call to Validate() in the Rating property setter that includes a validation rule specifying that the field's value must be between 1 and 5:

    public int Rating
    {
        get => _rating;
        set
        {
            _rating = value;
            Validate(() => _rating >= 1 && _rating <= 5, "Rating must be between 1 and 5.");
            OnPropertyChanged();
            SaveCommand.ChangeCanExecute();
        }
    }

Notice we also added SaveCommand.ChangeCanExecute() to the setter as well. This is because we want to update the SaveCommand's canExecute value when this value is changed, since it will now impact the return value of CanSave(), which we will update in the next step.

4.
Next, update CanSave() (the method used for the SaveCommand's canExecute function) to prevent saving if the ViewModel has any errors:

    bool CanSave() => !string.IsNullOrWhiteSpace(Title) && !HasErrors;

5. Finally, update the new entry page to reflect any errors by highlighting the field's label color in red:

    <!-- NewEntryPage.xaml -->
    <EntryCell x:Name="title" Label="Title" Text="{Binding Title}" />
    <!-- ... -->
    <EntryCell x:Name="rating" Label="Rating" Keyboard="Numeric" Text="{Binding Rating}" />

    // NewEntryPage.xaml.cs
    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Linq;
    using Xamarin.Forms;
    using TripLog.ViewModels;

    public partial class NewEntryPage : ContentPage
    {
        NewEntryViewModel ViewModel => BindingContext as NewEntryViewModel;

        public NewEntryPage()
        {
            InitializeComponent();
            BindingContextChanged += Page_BindingContextChanged;
            BindingContext = new NewEntryViewModel();
        }

        void Page_BindingContextChanged(object sender, EventArgs e)
        {
            ViewModel.ErrorsChanged += ViewModel_ErrorsChanged;
        }

        void ViewModel_ErrorsChanged(object sender, DataErrorsChangedEventArgs e)
        {
            var propHasErrors = (ViewModel.GetErrors(e.PropertyName) as List<string>)?.Any() == true;

            // Turn the label of the affected field red while it has errors.
            switch (e.PropertyName)
            {
                case nameof(ViewModel.Title):
                    title.LabelColor = propHasErrors ? Color.Red : Color.Black;
                    break;
                case nameof(ViewModel.Rating):
                    rating.LabelColor = propHasErrors ? Color.Red : Color.Black;
                    break;
                default:
                    break;
            }
        }
    }

Now when we run the app we will see the following screenshots:

Figure: The TripLog new entry page with client-side validation

If we navigate to the new entry page and enter an invalid value in either the Title or Rating field, we will see the field label turn red and the Save button will be disabled. Once the error has been corrected, the field label color returns to black and the Save button is re-enabled.
Learn more about mobile application development with Xamarin and the open source Xamarin.Forms toolkit with the third edition of Mastering Xamarin.Forms.

About Ed Snider

Ed Snider is a senior software developer, speaker, author, and Microsoft MVP based in the Washington D.C./Northern Virginia area. He has a passion for mobile design and development and regularly speaks about Xamarin and Windows app development in the community. Ed works at InfernoRed Technology, where his primary role is working with clients and partners to build mobile and media focused products on iOS, Android, and Windows. He started working with .NET in 2005 when .NET 2.0 came out and has been building mobile apps with .NET since 2011. Ed was recognized as a Xamarin MVP in 2015 and as a Microsoft MVP in 2017. Find him on Twitter: @edsnider

Amey Varangaonkar
28 Jan 2020
7 min read

5 reasons why you should use an open-source data analytics stack in 2020

Today, almost every company is trying to be data-driven in some sense or the other. Businesses across all the major verticals such as healthcare, telecommunications, banking, insurance, retail, education, etc. make use of data to better understand their customers, optimize their business processes and, ultimately, maximize their profits.

This is a guest post sponsored by our friends at RudderStack.

When it comes to using data for analytics, companies face two major challenges:

Data tracking: Tracking the required data from a multitude of sources in order to get insights out of it. As an example, tracking customer activity data such as logins, signups, purchases, and even clicks such as bookmarks from platforms such as mobile apps and websites becomes an issue for many eCommerce businesses.

Building a link between the Data and Business Intelligence: Once data is acquired, transforming it and making it compatible with a BI tool can often prove to be a substantial challenge.

A well-designed data analytics stack is essential in combating these challenges. It will ensure you're well-placed to use the data at your disposal in more intelligent ways. It will help you drive more value.

What does a data analytics stack do?

A data analytics stack is a combination of tools which, when put together, allows you to bring all of your data together in one platform and use it to get actionable insights that help in better decision-making. As the diagram above illustrates, a data analytics stack is built upon three fundamental steps:

Data Integration: This step involves collecting and blending data from multiple sources and transforming it into a compatible format for storage. The sources could be as varied as a database (e.g. MySQL), an organization’s log files, or event data such as clicks, logins, bookmarks, etc. from mobile apps or websites. A data analytics stack allows you to bring all such data together and use it to perform meaningful analytics.
Data Warehousing: This next step involves storing the data for the purpose of analytics. As the complexity of data grows, it is feasible to consolidate all the data in a single data warehouse. Some of the popular modern data warehouses include Amazon Redshift, Google BigQuery, and platforms such as Snowflake and MarkLogic.

Data Analytics: In this final step, we use a visualization tool to load the data from the warehouse and use it to extract meaningful insights and patterns from the data, in the form of charts, graphs and reports.

Choosing a data analytics stack - proprietary or open-source?

When it comes to choosing a data analytics stack, businesses are often left with two choices: buy it or build it. On one hand, there are proprietary tools such as Google Analytics, Amplitude, Mixpanel, etc., where the vendors alone are responsible for their configuration and management to suit your needs. With the best-in-class features and services that come along with the tools, your primary focus can just be project management, rather than technology management. While using proprietary tools has its advantages, there are also some major cons that revolve mainly around cost, data sharing, privacy concerns, and more. As a result, businesses today are increasingly exploring the open-source alternatives to build their data analytics stack.

The advantages of open source analytics tools

Let's now look at the 5 main advantages that open-source tools have over these proprietary tools.

Open source analytics tools are cost effective

Proprietary analytics products can cost hundreds of thousands of dollars beyond their free tier. For small to medium-sized businesses, the return on investment does not often justify these costs. Open-source tools are free to use and even their enterprise versions are reasonably priced compared to their proprietary counterparts.
So, with lower up-front costs, reasonable expenses for training, maintenance and support, and no cost for licensing, open-source analytics tools are much more affordable. More importantly, they're better value for money.

Open source analytics tools provide flexibility

Proprietary SaaS analytics products will invariably set restrictions on the ways in which they can be used. This is especially the case with the trial or the lite versions of the tools, which are free. For example, full SQL is not supported by some tools. This makes it hard to combine and query external data alongside internal data. You'll also often find that warehouse dumps provide no support either. And when they do, they'll probably cost more and still have limited functionality. Data dumps from Google Analytics, for instance, can only be loaded into Google BigQuery. Also, these dumps are time-delayed. That means the loading process can be very slow.

With open-source software, you get complete flexibility: from the way you use your tools, to how you combine them to build your stack, and even how you use your data. If your requirements change - which, let's face it, they probably will - you can make the necessary changes without paying extra for customized solutions.

Avoid vendor lock-in

Vendor lock-in, also known as proprietary lock-in, is essentially a state where a customer becomes completely dependent on the vendor for their products and services. The customer is unable to switch to another vendor without paying a significant switching cost. Some organizations spend a considerable amount of money on proprietary tools and services that they heavily rely on. If these tools aren't updated and properly maintained, the organization using them is putting itself at a real competitive disadvantage. This is almost never the case with open-source tools. Constant innovation and change is the norm.
Even if the individual or the organization handling the tool moves on, the community can take over the project and maintain it. With open-source, you can rest assured that your tools will always be up-to-date without heavy reliance on anyone. Improved data security and privacy Privacy has become a talking point in many data-related discussions of late. This is thanks, in part, to data protection laws such as the GDPR and CCPA coming into force. High-profile data leaks have also kept the issue high on the agenda. An open-source analytics stack running inside your cloud or on-prem environment gives you complete control of your data. This lets you decide which data is to be used when, and how. It lets you dictate how third parties can access and use your data, if at all. Open-source is the present It's hard to counter the fact that open-source is now mainstream. Companies like Microsoft, Apple, and IBM are now not only actively participating in the open-source community, they're also contributing to it. Open-source puts you on the front foot when it comes to innovation. With it, you'll be able to leverage the power of a vibrant developer community to develop better products in more efficient ways. How RudderStack helps you build an ideal open-source data analytics stack RudderStack is a completely open-source, enterprise-ready platform to simplify data management in the most secure and reliable way. It works as a perfect data integration platform by routing your event data from data sources such as websites, mobile apps and servers, to multiple destinations of your choice - thus helping you save time and effort. RudderStack integrates effortlessly with a multitude of destinations such as Google Analytics, Amplitude, MixPanel, Salesforce, HubSpot, Facebook Ads, and more, as well as popular data warehouses such as Amazon Redshift or S3. 
If performing efficient clickstream analytics is your goal, RudderStack offers you the perfect data pipeline to collect and route your data securely. Learn more about Rudderstack by visiting the RudderStack website, or check out its GitHub page to find out how it works.

Packt Editorial Staff
08 Jan 2020
6 min read

3 different types of generative adversarial networks (GANs) and how they work

Generative adversarial networks (GANs) have been greeted with real excitement since their creation back in 2014 by Ian Goodfellow and his research team. Yann LeCun, Facebook's Director of AI Research, went as far as describing GANs as "the most interesting idea in the last 10 years in ML." With all this excitement, however, it can be easy to miss the subtle diversity of GANs; there are a number of different types of generative adversarial networks, each one working in slightly different ways and helping engineers to achieve slightly different results. To give you a deeper insight into GANs, in this article we'll look at three different generative adversarial networks: SRGANs, CycleGANs, and InfoGANs. We'll explore how these different GANs work and how they can be used. This should give you a solid foundation to explore GANs in more depth and begin to apply them in your own experiments and projects. This article is an excerpt from the book, Deep Learning with TensorFlow 2 and Keras, Second Edition by Antonio Gulli, Amita Kapoor, and Sujit Pal.  SRGAN - Super Resolution GANs Remember seeing a crime-thriller where our hero asks the computer guy to magnify the faded image of the crime scene? With the zoom we are able to see the criminal’s face in detail, including the weapon used and anything engraved upon it! Well, SRGAN can perform similar magic. Here a GAN is trained in such a way that it can generate a photorealistic high-resolution image when given a low-resolution image. The SRGAN architecture consists of three neural networks: a very deep generator network, a discriminator network, and a pretrained VGG-19 network. How do SRGANs work? SRGANs use the perceptual loss function (developed by Johnson et al, Perceptual Losses for Real-Time Style Transfer and Super-Resolution). The perceptual loss function is built from the difference in the feature map activations, in the high layers of a VGG network, between the network output and the high-resolution original. 
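The feature-space comparison described above can be sketched in a few lines of plain Python. This is a minimal illustration under loud assumptions, not the SRGAN implementation: `toy_features` is a hypothetical stand-in for the VGG feature maps (it simply averages 2x2 pixel blocks), and the two images are tiny hand-written grids. What it shows is only the idea that the loss compares images through their feature activations rather than pixel by pixel.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def toy_features(img: Image) -> List[float]:
    """Hypothetical stand-in for a pretrained VGG feature map.

    Averages each non-overlapping 2x2 block of pixels. A real SRGAN
    would use activations from a pretrained VGG network instead.
    """
    feats = []
    for r in range(0, len(img) - 1, 2):
        for c in range(0, len(img[0]) - 1, 2):
            block = img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]
            feats.append(block / 4.0)
    return feats


def feature_space_loss(generated: Image, reference: Image) -> float:
    """Mean squared difference between the feature activations of two images."""
    f_gen = toy_features(generated)
    f_ref = toy_features(reference)
    return sum((a - b) ** 2 for a, b in zip(f_gen, f_ref)) / len(f_gen)


# Illustrative data: a sharp 4x4 "high-resolution" image and a flat, blurry guess.
high_res = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
blurry = [[0.5] * 4 for _ in range(4)]

print(feature_space_loss(high_res, high_res))  # 0.0: identical images, identical features
print(feature_space_loss(blurry, high_res))    # 0.25: features differ by 0.5 everywhere
```

Swapping `toy_features` for activations taken from a pretrained network would give the kind of loss the article describes; the comparison step itself would stay the same.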
Building on this idea, the SRGAN authors combine a content loss and an adversarial loss so that images generated look more natural and the finer details more artistic. The perceptual loss is defined as the weighted sum of content loss and adversarial loss: $l^{SR} = l_X^{SR} + 10^{-3} \times l_{Gen}^{SR}$ The first term on the right-hand side is the content loss, obtained using the feature maps generated by the pretrained VGG-19. Mathematically it is the Euclidean distance between the feature map of the reconstructed image (that is, the one generated by the generator) and that of the original high-resolution reference image. The second term on the right-hand side is the adversarial loss. It is the standard generative loss term, designed to ensure that images generated by the generator are able to fool the discriminator. The figures in the original paper show that the image generated by SRGAN is much closer to the original high-resolution image (image via https://arxiv.org/pdf/1609.04802.pdf). CycleGAN Another noteworthy architecture is CycleGAN; proposed in 2017, it can perform the task of image translation. Once trained you can translate an image from one domain to another domain. For example, when trained on a horse and zebra dataset, if you give it an image with horses in the foreground, the CycleGAN can convert the horses to zebras while keeping the same background. How does CycleGAN work? Have you ever imagined how a scenery would look if Van Gogh or Manet had painted it? We have many sceneries, and many landscapes painted by Van Gogh/Manet, but we do not have any collection of input-output pairs. CycleGAN performs the image translation, that is, transfers an image given in one domain (scenery for example) to another domain (Van Gogh painting of the same scene, for instance) in the absence of paired training examples. CycleGAN’s ability to perform image translation in the absence of training pairs is what makes it unique. 
To achieve image translation the authors of CycleGAN used a very simple and yet effective procedure. They made use of two GANs, the generator of each GAN performing the image translation from one domain to another. To elaborate, let us say the input is X, then the generator of the first GAN performs a mapping G: X → Y, thus its output would be Y = G(X). The generator of the second GAN performs an inverse mapping F: Y → X, resulting in X = F(Y). Each discriminator is trained to distinguish between real images and synthesized images. To train the combined GANs, the authors added, besides the conventional GAN adversarial loss, a forward cycle consistency loss and a backward cycle consistency loss. The forward loss ensures that if an image X is given as input, then after the two translations the obtained image is the same X, that is, F(G(X)) ~ X (similarly, the backward cycle consistency loss ensures that G(F(Y)) ~ Y). Successful image translations by CycleGAN include the translation of seasons (summer → winter), photo → painting and vice versa, and horses → zebras. InfoGAN The GAN architectures that we have considered up to now provide us with little or no control over the generated images. InfoGAN changes this; it provides control over various attributes of the images generated. The InfoGAN uses concepts from information theory such that the noise term is transformed into latent codes which provide predictable and systematic control over the output. How does InfoGAN work? The generator in InfoGAN takes two inputs: the latent space Z and a latent code c, thus the output of the generator is G(Z,c). The GAN is trained such that it maximizes the mutual information between the latent code c and the generated image G(Z,c). The concatenated vector (Z,c) is fed to the Generator. 
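To make the concatenation step concrete, here is a small sketch of how a generator input (Z,c) might be assembled. It is an illustration only: the function names, the choice of a one-hot categorical code, and the dimensions (8 noise values, 10 categories) are assumptions made for this example, not values taken from the article or the book.

```python
import random
from typing import List


def one_hot(index: int, size: int) -> List[float]:
    """Encode a categorical latent code c as a one-hot vector."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def generator_input(noise_dim: int, category: int, num_categories: int,
                    rng: random.Random) -> List[float]:
    """Build the concatenated vector (Z, c) that would be fed to the generator.

    Z is unstructured Gaussian noise; c is the latent code we want to control.
    Both sizes are illustrative assumptions.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    c = one_hot(category, num_categories)
    return z + c  # list concatenation gives the (Z, c) vector


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x = generator_input(noise_dim=8, category=3, num_categories=10, rng=rng)

print(len(x))  # 18: 8 noise values followed by a 10-way one-hot code
print(x[8:])   # the latent code part, with a single 1.0 at position 3
```

Because the code c occupies a known slice of the input, changing only that slice while holding Z fixed is what would let a trained InfoGAN vary one attribute of the output at a time.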
Q(c|X) is also a neural network. Combined with the generator, it works to form a mapping between random noise Z and its latent code c_hat; it aims to estimate c given X. This is achieved by adding a regularization term to the objective function of the conventional GAN: $\min_{G} \max_{D} V_I(D,G) = V_G(D,G) - \lambda I(c; G(Z,c))$ The term V_G(D,G) is the loss function of the conventional GAN, and the second term is the regularization term, where λ is a constant (its value was set to 1 in the paper) and I(c;G(Z,c)) is the mutual information between the latent code c and the generated image G(Z,c). That concludes our brief look at three different types of generative adversarial networks. You can find the book from which this article was taken on the Packt store or you can read the first chapter for free on the Packt subscription platform.

Richard Gall
30 Dec 2019
10 min read

10 tech startups for 2020 that will help the world build more resilient, secure, and observable software

The Datadog IPO in September marked an important moment for the tech industry. This wasn’t just because the company was the fourth tech startup to reach a $10 billion market cap in 2019, but also because it announced something that many people, particularly those in and around Silicon Valley, have been aware of for some time: the most valuable software products in the world aren’t just those that offer speed, and efficiency, they’re those that provide visibility, and security across our software systems. It shouldn’t come as a surprise. As software infrastructure becomes more complex, constantly shifting and changing according to the needs of users and businesses, the ability to assume some degree of control emerges as particularly precious. Indeed, the idea of control and stability might feel at odds with a decade or so that has prized innovation at speed. The mantra ‘move fast and break things’ is arguably one of the defining ones of the last decade. And while that lust for change might never disappear, it’s nevertheless the case that we’re starting to see a mindset shift in how business leaders think about technology. If everyone really is a tech company now, there’s now a growing acceptance that software needs to be treated with more respect and care. The Datadog IPO, then, is just the tip of an iceberg in which monitoring, observability, security, and resiliency tools have started to capture the imagination of technology leaders. While what follows is far from exhaustive, it does underline some of the key players in a growing field. Whether you're an investor or technology decision maker, here are ten tech startups you should watch out for in 2020 from across the cloud and DevOps space. Honeycomb Honeycomb has been at the center of the growing conversation around observability. 
Designed to help you “own production in hi-res,” what makes it unique in the market is that it allows you to understand and visualize your systems through high-cardinality dimensions (eg. at a user by user level, rather than, say, browser type or continent). The driving force behind Honeycomb is Charity Majors, its co-founder and former CEO. I was lucky enough to speak to her at the start of the year, and it was clear that she has an acute understanding of the challenges facing engineering teams. What was particularly striking in our conversation is how she sees Honeycomb as a tool for empowering developers. It gives them ownership over the code they write and the systems they build. “Ownership gives you the power to fix the thing you know you need to fix and the power to do a good job…” she told me. “People who find ownership is something to be avoided – that’s a terrible sign of a toxic culture.” Honeycomb’s investment status At the time of writing, Honeycomb has received $26.9 million in funding, with $11.4 million series A back in September. Firehydrant “You just got paged. Now what?” That’s the first line that greets you on the FireHydrant website. We think it sums up many of the companies on this list pretty well; many of the best tools in the DevOps space are designed to help tackle the challenges on-call developers face. FireHydrant isn't a tech startup with the profile of Honeycomb. However, as an incident management tool that integrates very neatly into a massive range of workflow tools, we’re likely to see it gain traction in 2020. We particularly like the one-click post mortem feature - it’s clear the product has been built in a way that allows developers to focus on the hard stuff and minimize the things that can just suck up time. FireHydrant’s investment status FireHydrant has raised $1.5 million in seed funding. NS1 Managing application traffic can be business-critical. 
That’s why NS1 exists; with DNS, DHCP and IP address management capabilities, it’s arguably one of the leading tools on the planet for dealing with the diverse and extensive challenges that come with managing massive amounts of traffic across complex interlocking software applications and systems. The company boasts an impressive roster of clients, including Dropbox, The Guardian and LinkedIn, which makes it hard to bet against NS1 going from strength to strength in 2020. Like all software adoption, it might take some time to move beyond the realms of the largest and most technically forward-thinking organizations, but it’s surely only a matter of time until the importance of smarter and more efficient traffic management becomes clear to even the smallest businesses. NS1’s investment status NS1 has raised an impressive $78.4 million in funding from investors (although it’s important to note that it’s one of the oldest companies on this list, founded all the way back in 2013). It received $33 million in series C funding at the beginning of October. Rookout “It’s time to liberate your data” Rookout implores us. For too long, the startup’s argument goes, data has been buried inside our applications where it’s useless for developers and engineers. Once it has been freed, it can help inform how we go about debugging and monitoring our systems. Designed to work for modern architectural and deployment patterns such as Kubernetes and serverless, Rookout is a tool that not only brings simplicity in the midst of complexity, it can also save engineering teams a serious amount of time when it comes to debugging and logging - by 80%, the company claims. Like FireHydrant, this means engineers can focus on other areas of application performance and resilience. Rookout’s investment status Back in August, Rookout raised $8 million in Series A funding, taking its total funding amount to $12.2 million. 
LaunchDarkly Feature flags or toggles are a concept that has started to gain traction in engineering teams in the last couple of years or so. They allow engineering teams to “modify system behavior without changing code” (thank you Martin Fowler). LaunchDarkly is a platform specifically built to allow engineers to use feature flags. At a fundamental level, the product allows DevOps teams to deploy code (i.e. change features) quickly and with minimal risk. This allows for testing in production and experimentation on a large scale. With support for just about every programming language, it’s not surprising to see LaunchDarkly boast a wealth of global enterprises on its list of customers. This includes IBM and NBC. LaunchDarkly’s investment status LaunchDarkly raised $44 million in series C funding early in 2019. To date, it has raised $76.3 million. It’s certainly one to watch closely in 2020; its ability to help teams walk the delicate line between innovation and instability is well-suited to the reality of engineering today. Gremlin Gremlin is a chaos engineering platform designed to help engineers to ‘stress test’ their software systems. This is important in today’s technology landscape. With system complexity making unpredictability a day-to-day reality, Gremlin lets you identify weaknesses before they impact customers and revenue. Gremlin’s mission is to “help build a more reliable internet.” That’s not just a noble aim, it’s an urgent one too. What’s more, you can see that the business is really living out its mission. With Gremlin Free launching at the start of 2019, and the second ChaosConf taking place in the fall, it’s clear that the company is thinking beyond the core product: they want to make chaos engineering more accessible to a world where resilience can feel impossible in the face of increasing complexity. 
Gremlin’s investment status Since being founded back in 2016 by CTO Matt Fornaciari and CEO Kolton Andrus, Gremlin has raised $26.8 million in funding from Redpoint Ventures, Index Ventures, and Amplify Partners. Cockroach Labs Cockroach Labs is the organization behind CockroachDB, the cloud-native distributed SQL database. CockroachDB’s popularity comes from two things: its ability to scale from a single instance to thousands, and its impressive resilience. Indeed, its resilience is where it takes its name from. Like a cockroach, CockroachDB is built to keep going even after everything else has burned to the ground. It’s been an interesting year for Cockroach Labs and CockroachDB - in June the company changed the CockroachDB core license from the open source Apache license to the Business Source License (BSL), developed by the MariaDB team. The reason for this was ultimately to protect the product as it seeks to grow. The BSL still means the source code is accessible for any use other than for a DBaaS (you’ll need an enterprise license for that). A few months later, the company took another step in pushing forward in the market with $55 million series C funding. Both stories were evidence that Cockroach Labs is setting itself up for a big 2020. Although the database market will always be seriously competitive, with resilience as a core USP it’s hard to bet against Cockroach Labs. Cockroaches find a way, right? Cockroach Labs’ investment status Cockroach Labs’ total investment, following on from that impressive round of series C funding, is now $108.5 million. Logz.io Logz.io is another platform in the observability space that you really need to watch out for in 2020. Built on the ELK stack (ElasticSearch, Logstash, and Kibana), what makes Logz.io really stand out is the use of machine learning to help identify issues across thousands and thousands of logs. Logz.io has been on ‘ones to watch’ lists for a number of years now. 
This was, we think, largely down to the rising wave of AI hype. And while we wouldn’t want to underplay its machine learning capabilities, it’s perhaps with the increasing awareness of the need for more observable software systems that we’ll see it really pack a punch across the tech industry. Logz.io’s investment status To date, Logz.io has raised $98.9 million. FaunaDB Fauna is the organization behind FaunaDB. It describes itself as “a global serverless database that gives you ubiquitous, low latency access to app data, without sacrificing data correctness and scale.” The database could be big in 2020. With serverless likely to go from strength to strength, and JAMstack increasing as a dominant approach for web developers, everything the Fauna team have been doing looks as though it will be a great fit for the shape of the engineering landscape in the future. Fauna’s investment status In total, Fauna has raised $32.6 million in funding from investors. Clubhouse One thing that gets overlooked when talking about DevOps and other issues in software development processes is simple project management. That’s why Clubhouse is such a welcome entry on this list. Of course, there are a massive range of project management tools available at the moment. But one of the reasons Clubhouse is such an interesting product is that it’s very deliberately built with engineers in mind. And more importantly, it appears it’s been built with an acute sense of the importance of enjoyment in a project management product. Clubhouse’s investment status Clubhouse has, to date, raised $16 million. As we see a continuing emphasis on developer experience, the tool is definitely one to watch in a tough marketplace. Conclusion: embrace the unpredictable The tech industry feels as unpredictable as the software systems we're building and managing. But while there will undoubtedly be some surprises in 2020, the need for greater security and resilience are themes that no one should overlook. 
Similarly, the need to gain more transparency and build for observability is critical. Whether you're an investor, business leader, or even an engineer, then, exploring the products that are shaping and defining the space is vital.
Savia Lobo
30 Dec 2019
11 min read

Emmanuel Tsukerman on why a malware solution must include a machine learning component

Machine learning is indeed the tech of present times! Security is a growing concern for many organizations today, and machine learning is one of the solutions to deal with it. ML can help cybersecurity systems analyze patterns and learn from them to help prevent similar attacks and respond to changing behavior. To know more about machine learning and its application in Cybersecurity, we had a chat with Emmanuel Tsukerman, a Cybersecurity Data Scientist and the author of Machine Learning for Cybersecurity Cookbook. The book also includes modern AI to create powerful cybersecurity solutions for malware, pentesting, social engineering, data privacy, and intrusion detection. In 2017, Tsukerman's anti-ransomware product was listed in the Top 10 ransomware products of 2018 by PC Magazine. In his interview, Emmanuel talked about how ML algorithms help in solving problems related to cybersecurity, and also gave a brief tour through a few chapters of his book. He also touched upon the rise of deepfakes and malware classifiers. On using machine learning for cybersecurity Using machine learning in cybersecurity scenarios will enable systems to identify different types of attacks across security layers and also help to take the correct plan of action. Can you share some examples of the successful use of ML for cybersecurity you have seen recently? A recent and interesting development in cybersecurity is that the bad guys have started to catch up with technology; in particular, they have started utilizing Deepfake tech to commit crime; for example, they have used AI to imitate the voice of a CEO in order to defraud a company of $243,000. On the other hand, the use of ML in malware classifiers is rapidly becoming an industry standard, due to the incredible number of never-before-seen samples (over 15,000,000) that are generated each year. 
On staying updated with developments in technology to defend against attacks Machine learning technology is not only used by ethical humans, but also by cybercriminals who use ML for ML-based intrusions. How can organizations counter such scenarios and ensure the safety of confidential organizational/personal data? The main tools that organizations have at their disposal to defend against attacks are to stay current and to pentest. Staying current, of course, requires getting educated on the latest developments in technology and its applications. For example, it’s important to know that hackers can now use AI-based voice imitation to impersonate anyone they would like. This knowledge should be propagated in the organization so that individuals aren’t caught off-guard. The other way to improve one’s security is by performing regular pen tests using the latest attack methodology; be it by attempting to avoid the organization’s antivirus, sending phishing communications, or attempting to infiltrate the network. In all cases, it is important to utilize the most dangerous techniques, which are often ML-based. On how ML algorithms and GANs help in solving cybersecurity problems In your book, you have mentioned various algorithms such as clustering, gradient boosting, random forests, and XGBoost. How do these algorithms help in solving problems related to cybersecurity? Unless a machine learning model is limited in some way (e.g., in computation, in time or in training data), there are 5 types of algorithms that have historically performed best: neural networks, tree-based methods, clustering, anomaly detection and reinforcement learning (RL). These are not necessarily disjoint, as one can, for example, perform anomaly detection via neural networks. Nonetheless, to keep it simple, let’s stick to these 5 classes. Neural networks shine with large amounts of data on visual, auditory or textual problems. 
For that reason, they are used in Deepfakes and their detection, lie detection and speech recognition. Many other applications exist as well. But one of the most interesting applications of neural networks (and deep learning) is in creating data via Generative adversarial networks (GANs). GANs can be used to generate password guesses and evasive malware. For more details, I’ll refer you to the Machine Learning for Cybersecurity Cookbook. The next class of models that perform well are tree-based. These include Random Forests and gradient boosting trees. These perform well on structured data with many features. For example, the PE header of PE files (including malware) can be featurized, yielding ~70 numerical features. It is convenient and effective to construct an XGBoost model (a gradient-boosting model) or a Random Forest model on this data, and the odds are good that performance will be unbeatable by other algorithms. Next there is clustering. Clustering shines when you would like to segment a population automatically. For example, you might have a large collection of malware samples, and you would like to classify them into families. Clustering is a natural choice for this problem. Anomaly detection lets you fight off unseen and unknown threats. For instance, when a hacker utilizes a new tactic to intrude on your network, an anomaly detection algorithm can protect you even if this new tactic has not been documented. Finally, RL algorithms perform well on dynamic problems. The situation can be, for example, a penetration test on a network. The DeepExploit framework, covered in the book, utilizes an RL agent on top of metasploit to learn from prior pen tests and becomes better and better at finding vulnerabilities. Generative Adversarial Networks (GANs) are a popular branch of ML used to train systems against counterfeit data. How can these help in malware detection and safeguarding systems to identify correct intrusion? 
A good way to think about GANs is as a pair of neural networks, pitted against each other. The loss of one is the objective of the other. As the two networks are trained, each becomes better and better at its job. We can then take whichever side of the “tug of war” battle, separate it from its rival, and use it. In other cases, we might choose to “freeze” one of the networks, meaning that we do not train it, but only use it for scoring. In the case of malware, the book covers how to use MalGAN, which is a GAN for malware evasion. One network, the detector, is frozen. In this case, it is an implementation of MalConv. The other network, the adversarial network, is being trained to modify malware until the detection score of MalConv drops to zero. As it trains, it becomes better and better at this. In a practical situation, we would want to unfreeze both networks. Then we can take the trained detector, and use it as part of our anti-malware solution. We would then be confident knowing that it is very good at detecting evasive malware. The same ideas can be applied in a range of cybersecurity contexts, such as intrusion and deepfakes. On how Machine Learning for Cybersecurity Cookbook can help with easy implementation of ML for Cybersecurity problems What are some of the tools/ recipes mentioned in your book that can help cybersecurity professionals to easily implement machine learning and make it a part of their day-to-day activities? The Machine Learning for Cybersecurity Cookbook offers an astounding 80+ recipes. The most applicable recipes will vary between individual professionals, and even for each individual different recipes will be applicable at different times in their careers. For a cybersecurity professional beginning to work with malware, the fundamentals chapter, Chapter 2: ML-based Malware Detection, provides a solid and excellent start to creating a malware classifier. 
For more advanced malware analysts, Chapter 3:Advanced Malware Detection will offer more sophisticated and specialized techniques, such as dealing with obfuscation and script malware. Every cybersecurity professional would benefit from getting a firm grasp of chapter 4, “ML for Social Engineering”. In fact, anyone at all should have an understanding of how ML can be used to trick unsuspecting users, as part of their cybersecurity education. This chapter really shows that you have to be cautious because machines are becoming better at imitating humans. On the other hand, ML also provides the tools to know when such an attack is being performed. Chapter 5, “Penetration Testing Using ML” is a technical chapter, and is most appropriate to cybersecurity professionals that are concerned with pen testing. It covers 10 ways in which pen testing can be improved by using ML, including neural network-assisted fuzzing and DeepExploit, a framework that utilizes a reinforcement learning (RL) agent on top of metasploit to perform automatic pen testing. Chapter 6, “Automatic Intrusion Detection” has a wider appeal, as a lot of cybersecurity professionals have to know how to defend a network from intruders. They would benefit from seeing how to leverage ML to stop zero-day attacks on their network. In addition, the chapter covers many other use cases, such as spam filtering, Botnet detection and Insider Threat detection, which are more useful to some than to others. Chapter 7, “Securing and Attacking Data with ML” provides great content to cybersecurity professionals interested in utilizing ML for improving their password security and other forms of data security. Chapter 8, “Secure and Private AI”, is invaluable to data scientists in the field of cybersecurity. 
Recipes in this chapter include Federated Learning and differential privacy (which allow to train an ML model on clients’ data without compromising their privacy) and testing adversarial robustness (which allows to improve the robustness of ML models to adversarial attacks). Your book talks about using machine learning to generate custom malware to pentest security. Can you elaborate on how this works and why this matters? As a general rule, you want to find out your vulnerabilities before someone else does (who might be up to no-good). For that reason, pen testing has always been an important step in providing security. To pen test your Antivirus well, it is important to use the latest techniques in malware evasion, as the bad guys will certainly try them, and these are deep learning-based techniques for modifying malware. On Emmanuel’s personal achievements in the Cybersecurity domain Dr. Tsukerman, in 2017, your anti-ransomware product was listed in the ‘Top 10 ransomware products of 2018’ by PC Magazine. In your experience, why are ransomware attacks on the rise and what makes an effective anti-ransomware product? Also, in 2018,  you designed an ML-based, instant-verdict malware detection system for Palo Alto Networks' WildFire service of over 30,000 customers. Can you tell us more about this project? If you monitor cybersecurity news, you would see that ransomware continues to be a huge threat. The reason is that ransomware offers cybercriminals an extremely attractive weapon. First, it is very difficult to trace the culprit from the malware or from the crypto wallet address. Second, the payoffs can be massive, be it from hitting the right target (e.g., a HIPAA compliant healthcare organization) or a large number of targets (e.g., all traffic to an e-commerce web page). Thirdly, ransomware is offered as a service, which effectively democratizes it! On the flip side, a lot of the risk of ransomware can be mitigated through common sense tactics. 
First, backing up one’s data. Second, having an anti-ransomware solution that provides guarantees. A generic antivirus can provide no guarantee - it either catches the ransomware or it doesn’t. If it doesn’t, your data is toast. However, certain anti-ransomware solutions, such as the one I have developed, do offer guarantees (e.g., no more than 0.1% of your files lost). Finally, since millions of new ransomware samples are developed each year, the malware solution must include a machine learning component, to catch the zero-day samples, which is another component of the anti-ransomware solution I developed.

The project at Palo Alto Networks is a similar implementation of ML for malware detection. The one difference is that unlike the anti-ransomware service, which is an endpoint security tool, it offers protection services from the cloud. Since Palo Alto Networks is a firewall-service provider, that makes a lot of sense: ideally, the malicious sample will be stopped at the firewall, and never even reach the endpoint.

To learn how to implement the techniques discussed in this interview, grab your copy of the Machine Learning for Cybersecurity Cookbook. Don’t wait - the bad guys aren’t waiting.

Author Bio

Emmanuel Tsukerman graduated from Stanford University and obtained his Ph.D. from UC Berkeley. In 2017, Dr. Tsukerman's anti-ransomware product was listed in the Top 10 ransomware products of 2018 by PC Magazine. In 2018, he designed an ML-based, instant-verdict malware detection system for Palo Alto Networks' WildFire service of over 30,000 customers. In 2019, Dr. Tsukerman launched the first cybersecurity data science course.

About the book

Machine Learning for Cybersecurity Cookbook will guide you through constructing classifiers and features for malware, which you'll train and test on real samples. 
You will also learn to build self-learning, reliant systems to handle cybersecurity tasks such as identifying malicious URLs, spam email detection, intrusion detection, network protection, tracking user and process behavior, and much more!

DevSecOps and the shift left in security: how Semmle is supporting software developers [Podcast]
Elastic marks its entry in security analytics market with Elastic SIEM and Endgame acquisition
Businesses are confident in their cybersecurity efforts, but weaknesses prevail

Richard Gall
20 Dec 2019
5 min read

Beyond Kubernetes: Key skills for infrastructure and ops engineers in 2020

For systems engineers and those working in operations, the move to cloud and the rise of containers in recent years have drastically changed working practices and even the nature of job roles. But that doesn’t mean you can just learn Kubernetes and then rest on your laurels. To a certain extent, the broad industry changes we’ve seen haven’t stabilised into some sort of consensus but rather created a field where change is only more likely - and where things are arguably even less stable. This isn’t to say that you have anything to fear as an engineer. But you should be open-minded about the skills you learn in 2020. Here’s a list of 5 skills you should consider spending some time developing in the new year.

Scripting and scripting languages

Scripting is a well-established part of many engineers’ skill set. The reasons for this are obvious: scripts allow you to automate tasks and get things done quickly. If you don’t know scripting, then of course you should learn it. But even if you do, it’s worth exploring some new programming languages. You might find that a fresh approach - like learning, for example, Go if you mainly use Python - will make you more productive, or will help you to tackle problems with greater ease than you have in the past.

Learn Linux shell scripting with Learn Linux Shell Scripting: the Fundamentals of Bash 4.4. Find out how to script with Python in Mastering Python Scripting for System Administrators.

Infrastructure automation tools and platforms

With the rise of hybrid and multi-cloud, infrastructure automation platforms like Ansible and Puppet have been growing more and more important to many companies. While Kubernetes has perhaps dented their position in the wider DevOps tooling marketplace (if, indeed, that’s such a thing), they nevertheless remain relevant in a world where managing complexity appears to be a key engineering theme. 
With Puppet looking to continually evolve and Ansible retaining a strong position on the market, they remain two of the most important platforms to explore and learn. However, there is a wealth of other options too - Terraform in particular appears to be growing at an alarming pace even if it hasn’t reached critical mass, and Salt and Chef are also well worth learning.

Get started with Ansible, fast - learn with Ansible Quick Start Guide.

Cloud architecture and design

Gone are the days when cloud was just a rented server. Gone are the days when it offered a simple (or relatively simple, at least) solution to storage and compute problems. With trends like multi and hybrid cloud becoming the norm, and serverless starting to gain traction at the cutting edge of software development, being able to piece together various different elements is absolutely crucial. Indeed, this isn’t a straightforward skill that you can just learn with some documentation and training materials. Of course those help, but it also requires sensitivity to business needs, an awareness of how developers work, as well as an eye for financial management. However, if you can develop the broad range of skills needed to architect cloud solutions, you will be a very valuable asset to a business.

Become a certified cloud architect with Packt's new Professional Cloud Architect – Google Cloud Certification Guide.

Security and resilience

With the increase in architectural complexity, the ability to ensure security and resilience is now both vital and incredibly challenging. Fortunately, there are many different tools and techniques available for doing this, each one relevant to different job roles - from service meshes to monitoring platforms, to chaos engineering, there are many ways that engineers can take on stability and security challenges head on. Whatever platforms you’re working with, make it your mission to learn what you need to improve the security and resilience of your systems. 
Learn how to automate cloud security with Cloud Security Automation.

Pushing DevOps forward

No one wants to hear about the growth of DevOps - we get that. It’s been growing for almost a decade now; it certainly doesn’t need to be treated to another wave of platitudes as the year ends. So, instead of telling you to simply embrace DevOps, a smarter thing to do would be to think about how you can do DevOps better. What do your development teams need in terms of support? And how could they help you? In theory the divide between dev and ops should now be well and truly broken - the question that remains is how things should evolve once that silo has been broken down. Okay, so maybe this isn’t necessarily a single skill you can learn. But it’s something that starts with conversation - so make sure you and those around you are having better conversations in 2020.

Search the latest DevOps eBooks and videos on the Packt store.

Richard Gall
20 Dec 2019
5 min read

5 Key skills for web and app developers to learn in 2020

Web and application development can change quickly. Much of this is driven by user behavior and user needs - and if you can’t keep up with it, it’s going to be impossible to keep your products and projects relevant. The only way to do that, of course, is to ensure your skills are up to date and that you are constantly looking forward to what might be coming. You can’t predict the future, but you can prepare yourself. Here are 5 key skill areas that we think web and app developers should focus on in 2020.

Artificial intelligence

It’s impossible to overstate the importance of AI in application development at the moment. Yes, it’s massively hyped, but that’s largely because it’s so ubiquitous. Indeed, to a certain extent many users won’t even realise they’re interacting with AI or machine learning systems. You might even say that that’s when it’s used best. The ways in which AI can be used by web and app developers are extensive and constantly growing. Perhaps the most obvious is personal recommendations, but it’s chatbots and augmented reality that are really pushing the boundaries of what’s possible with AI in the development field. Artificial intelligence might sound daunting if you’re primarily a web developer. But it shouldn’t - you don’t need a computer science or math degree to use it effectively. There are now many platforms and tools available to use machine learning technology out of the box, from Azure’s Cognitive Services and Amazon’s Rekognition to ML Kit, built by Google for mobile developers.

Learn how to build smart, AI-backed applications with Azure Cognitive Services in Azure Cognitive Services for Developers [Video]. 
New programming languages

Earlier this year I wrote about how polyglot programming (being able to use more than one language) “allows developers to choose the right language to solve tough engineering problems.” For web and app developers, who are responsible for building increasingly complex applications and websites, in as elegant and as clean a manner as possible, this is particularly true. The emergence of languages like TypeScript and Kotlin attests to the importance of keeping your programming proficiency up to date. Moreover, you could even say that they highlight that, however popular core languages like JavaScript and Java are, there are now some tasks that they’re just not capable of dealing with. This doesn’t mean you should just ditch your favored programming languages in 2020. But it does mean that learning a new language is a great way to build your skill set.

Explore new programming languages with eBook and video bundles here.

Accessibility

Web accessibility is a topic that has been overlooked for too long. That needs to change in 2020. It’s not hard to see how it gets ignored. When the pressure to deliver software is high, thinking about the consequences of specific design decisions on different types of users is almost certainly going to be pushed to the bottom of developers’ priorities. But if anything this means we need a two-pronged approach - on the one hand developers need to commit to learning web accessibility themselves, and on the other they need to be evangelists in communicating its importance to non-technical team members. The benefits of this will be significant: it could be a big step towards a more inclusive digital world, and from a personal perspective, it will also help developers to become more aware and well-rounded in their design decisions. And insofar as no one’s taking real leadership for this at the moment, it’s the perfect opportunity for developers to prove their leadership chops. 
Read next: It’s a win for Web accessibility as courts can now order companies to make their sites WCAG 2.0 compliant

JAMstack and (sort of) static websites

Traditional CMSes like WordPress can be a pain for developers if you want to build something that is more customized than what you get out of the box. This is one of the reasons why JAMstack (a term coined by Netlify) is so popular - combining JavaScript, APIs, and markup, it offers web developers a way to build secure, performant websites very quickly. To a certain extent, JAMstack is the next generation of static websites; but JAMstack sites aren’t exactly static, as they call data from the server side through APIs. Developers then call on the help of templated markup - usually in the form of static site generators (like Gatsby.js) or build tools - to act as a pre-built front end. The benefits of JAMstack as an approach are well-documented. Perhaps the most important, though, is that it offers a really great developer experience. It allows you to build with the tools that you want to use, integrate with services you might already be using, and minimize the level of complexity that can come with some development approaches.

Get started with Gatsby.js and find out how to use it in JAMstack with The Gatsby Masterclass video.

State management

We’ve talked about state management recently - in fact, lots of people have been talking about it. We won’t go into detail about what it involves, but the issue has grown as increasing app complexity has made it harder to gain a single source of truth on what’s actually happening inside our applications. If you haven’t already, it’s essential to learn some of the design patterns and approaches for managing application state that have emerged over the last couple of years. The two most popular - Flux and Redux - are very closely associated with React, but for Vue developers Vuex is well worth learning. Thinking about state management can feel brain-wrenching at times. 
However, getting to grips with it can really help you to feel more in control of your projects. Get up and running with Redux quickly with the Redux Quick Start Guide.
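
To make the "single source of truth" idea a little more concrete, here is a minimal sketch of a Redux-style store in TypeScript. It is not the Redux library itself: the names CartState, CartAction, cartReducer, and createStore are assumptions made up for this illustration, and a real project would normally reach for Redux, Vuex, or something similar rather than a hand-rolled store. The point is only to show the pattern: one state object, changed exclusively by dispatching actions through a pure reducer function.

```typescript
// Illustrative sketch only: these names are hypothetical, not the Redux API.

// The whole application state lives in one object.
type CartState = { items: string[] };

// Every possible change is described by an action.
type CartAction =
  | { type: "add"; item: string }
  | { type: "remove"; item: string }
  | { type: "clear" };

// A reducer is a pure function: (current state, action) -> next state.
// It never mutates the state it receives.
function cartReducer(state: CartState, action: CartAction): CartState {
  switch (action.type) {
    case "add":
      return { items: [...state.items, action.item] };
    case "remove":
      return { items: state.items.filter((i) => i !== action.item) };
    case "clear":
      return { items: [] };
    default:
      return state;
  }
}

type Listener = () => void;

// A tiny store: holds the state, applies the reducer, notifies subscribers.
function createStore<S, A>(reducer: (state: S, action: A) => S, initial: S) {
  let state = initial;
  const listeners: Listener[] = [];
  return {
    getState: (): S => state,
    dispatch(action: A): void {
      state = reducer(state, action);
      listeners.forEach((l) => l());
    },
    subscribe(listener: Listener): () => void {
      listeners.push(listener);
      // Return a function that removes this listener again.
      return () => {
        const idx = listeners.indexOf(listener);
        if (idx >= 0) listeners.splice(idx, 1);
      };
    },
  };
}

// Usage: the UI would re-render inside the subscriber; here we just count calls.
const initialState: CartState = { items: [] };
const store = createStore<CartState, CartAction>(cartReducer, initialState);

let renderCount = 0;
const unsubscribe = store.subscribe(() => {
  renderCount += 1;
});

store.dispatch({ type: "add", item: "book" });
store.dispatch({ type: "add", item: "video" });
store.dispatch({ type: "remove", item: "book" });
unsubscribe();
store.dispatch({ type: "clear" });

console.log(store.getState().items, renderCount);
```

Because every change goes through dispatch and the reducer is a pure function, you can log or replay the actions to see how the application reached its current state, which is a large part of why this pattern makes complex apps feel more manageable.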
Richard Gall
20 Dec 2019
6 min read

Key skills for data professionals to learn in 2020

It’s easy to fall into the trap of thinking about your next job, or even the job after that. It’s far more useful, however, to think more about the skills you want and need to learn now. This will focus your mind and ensure that you don’t waste time learning things that simply aren’t helpful. It also means you can make use of the things you’re learning almost immediately. This will make you more productive and effective - and who knows, maybe it will make the pathway to your future that little bit clearer. So, to help you focus, here are some of the things you should focus on learning as a data professional.

Reinforcement learning

Reinforcement learning is one of the most exciting and cutting-edge areas of machine learning. Although the area itself is relatively broad, the concept itself is fundamentally about getting systems to ‘learn’ through a process of reward. Because reinforcement learning focuses on making the best possible decision at a given moment, it naturally finds many applications where decision making is important. This includes things like robotics, digital ad-bidding, configuring software systems, and even something as prosaic as traffic light control. Of course, the list of potential applications for reinforcement learning could be endless. To a certain extent, the real challenge with it is finding new use cases that are relevant to you. But to do that, you need to learn and master it - so make 2020 the year you do just that.

Get to grips with reinforcement learning with Reinforcement Learning Algorithms with Python.

Learn neural networks

Neural networks are closely related to reinforcement learning - they’re essentially another element within machine learning. However, neural networks are even more closely aligned with what we think of as typical artificial intelligence. Indeed, even the name itself hints at the fact that these systems are supposed to in some way mimic the human brain. 
As with reinforcement learning, there are a number of different applications for neural networks. These include image and language processing, as well as forecasting. The complexity of relationships that can be figured inside neural network systems is useful for handling data with many different variables and intricacies that would otherwise be difficult to capture. If you want to find out how artificial intelligence really works under the hood, make sure you learn neural networks in 2020.

Learn how to build real-world neural network projects with Neural Network Projects with Python.

Meta-learning

Meta-learning is another area of machine learning. It’s designed to help engineers and analysts to use the right machine learning algorithms for specific problems - it’s particularly important in automated machine learning, where removing human agency from the analytical process can lead to the wrong systems being used on data. Meta-learning does this by being applied to metadata about machine learning projects. This metadata will include information about the data, such as algorithm features, performance measures, and patterns identified previously. Once meta-learning algorithms have ‘learned’ from this data, they should, in theory, be well optimized to run on other sets of data. It has been said that meta-learning is important in the move towards generalized artificial intelligence, or AGI (intelligence that is more akin to human intelligence). This is because getting machines to learn about learning allows systems to move between different problems - something that is incredibly difficult with even the most sophisticated neural networks. Whether it will actually get us any closer to AGI is certainly open to debate, but if you want to be a part of the cutting edge of AI development, getting stuck into meta-learning is a good place to begin in 2020.

Find out how meta learning works in Hands-on Meta Learning with Python. 
Learn a new programming language

Python is now the undisputed language of data. But that’s far from the end of the story - R still remains relevant in the field, and there are even reasons to use other languages for machine learning. It might not be immediately obvious - especially if you’re content to use R or Python for analytics and algorithmic projects - but because machine learning is shifting into many different fields, from mobile development to cybersecurity, learning how other programming languages can be used to build machine learning algorithms could be incredibly valuable. From the perspective of your skill set, it gives you a level of flexibility that will not only help you to solve a wider range of problems, but also stand out from the crowd when it comes to the job market. The most obvious non-obvious languages to learn for machine learning practitioners and other data professionals are Java and Julia. But even new and emerging languages are finding their way into machine learning - Go and Swift, for example, could be interesting routes to explore, particularly if you’re thinking about machine learning in production software and systems.

Find out how to use Go for machine learning with Go Machine Learning Projects.

Learn new frameworks

For data professionals there are probably few things more important than learning new frameworks. While it’s useful to become a polyglot, it’s nevertheless true that learning new frameworks and ecosystem tools is going to have a more immediate impact on your work. PyTorch and TensorFlow should almost certainly be on your list for 2020. But we’ve mentioned them a lot recently, so it’s probably worth highlighting other frameworks worth your focus: pandas for data wrangling and manipulation, Apache Kafka for stream processing, scikit-learn for machine learning, and Matplotlib for data visualization. 
The list could be much, much longer: however, the best way to approach learning a new framework is to start with your immediate problems. What’s causing issues? What would you like to be able to do but can’t? What would you like to be able to do faster?

Explore TensorFlow eBooks and videos on the Packt store.

Learn how to develop and communicate a strategy

It’s easy to just roll your eyes when someone talks about how important ‘soft skills’ are for data professionals. Except it’s true - being able to strategize, communicate, and influence are what mark you out as a great data pro rather than a merely competent one. The phrase ‘soft skills’ is often what puts people off - ironically, despite the name they’re often even more difficult to master than technical skills. This is because, of course, soft skills involve working with humans in all their complexity. However, while learning these sorts of skills can be tough, that doesn’t mean it’s impossible. It largely requires a level of self-awareness and reflexivity, as well as a sensitivity to wider business and organizational problems. A good way of doing this is to step back and think of how problems are defined, and how they relate to other parts of the business.

Find out how to deliver impactful data science projects with Managing Data Science.

If you can master these skills, you’ll undoubtedly be in a great place to push your career forward as the year continues.

Guest Contributor
20 Dec 2019
6 min read

Why should you use Unreal Engine 4 to build Augmented and Virtual Reality projects

This is an exciting time to be a game developer. New technologies like Virtual Reality (VR) and Augmented Reality (AR) are here and growing in popularity, and a whole new generation of game consoles is just around the corner. Right now everyone wants to jump onto these bandwagons and create successful games using AR, VR and other technologies (for more detailed information see Chapter 15, Virtual Reality and Beyond, of my book, Learning C++ by Building Games with Unreal Engine 4 – Second Edition). But no one really wants to create everything from scratch (reinventing the wheel is just too much work). Fortunately, you don’t have to. Unreal Engine 4 (UE4) can help! Not only does Epic Games use their engine to develop their own games (and keep it constantly updated for that purpose), but many other game companies, both AAA and indie, also use the engine, and Epic is constantly adding new features for them too. These companies can also update the engine themselves, and they can make some of those changes available to the general public as well. UE4 also has a robust system for addons and plugins that many other developers contribute to. Some are free, while other, more advanced ones are available for a price. These can be extremely specialized, and the developer may release regular updates that adjust to changes in Unreal and add new features that could make your life even easier. So how does UE4 help with new technologies? Here are some examples:

Unreal Engine 4 for Virtual Reality

Virtual Reality (VR) is one of the most exciting technologies around, and many people are trying to get through that particular door. VR headsets from companies like Oculus, HTC, and Sony are becoming cheaper, more common, and more powerful. If you were creating a game yourself from scratch you would need an extremely powerful graphics engine. Fortunately, UE4 already has one with VR functionality. If you already have a project you want to convert to VR, UE4 makes this easy for you. 
If you have an Oculus Rift or HTC Vive installed on your computer, viewing your game in VR is as easy as launching it in VR Preview mode and viewing it in your headset. While controls might take more work, UE4 has a Motion Controller you can add to your controller to help you get started quickly. You can even edit your project in VR Mode, allowing you to see the editor view in your VR headset, which can help with positioning things in your game. If you’re starting a new project, UE4 now has VR-specific templates for new projects. You also have plenty of online documentation and a large community of other users working with VR in Unreal Engine 4 who can help you out.

Unreal Engine 4 for Augmented Reality

Augmented Reality (AR) is another new technology that’s extremely popular right now. Pokémon Go is extremely popular, and many companies are trying to do something similar. There are also AR headsets and possibly other new ways to view AR information. Every platform has its own way of handling Augmented Reality right now. On mobile devices, iOS has ARKit to support AR programming and Android has ARCore. Fortunately, the Unreal website has a whole section on AR and how to support these in UE4 to develop AR games at https://docs.unrealengine.com/en-US/Platforms/AR/index.html. It also has information on using Magic Leap, Microsoft HoloLens, and Microsoft HoloLens 2. So by using UE4, you get a big head start on this type of development.

Working with Other New Technologies

If you want to use a new technology, chances are UE4 supports it (and if not, just wait and it will). Whether you’re trying to do procedural programming or just use the latest AI techniques (for more information see chapters 11 and 12 of my book, Learning C++ by Building Games with Unreal Engine 4 – Second Edition), chances are you can find something to help you get a head start in that technology that already works in UE4. 
And with so many people using the engine, it is likely to continue to be a great way to get support for new technologies.

Support for New Platforms

UE4 already supports numerous platforms such as PC, Mac, Mobile, web, Xbox One, PS4, Switch, and probably any other recent platform you can think of. With the next-gen consoles coming out in 2020, chances are Epic is already working on support for them. For the consoles, you do generally need to be a registered developer with Microsoft, Sony, and/or Nintendo to have access to the tools to develop for those platforms (and you need expensive devkits). But as more indie games are showing up on these platforms you don’t necessarily have to be working at a AAA studio to do this anymore. What is amazing when you develop in UE4 is that publishing for another platform should basically just work. You may need to change the controls and the screen size. A AAA 3D title might be too slow to be playable if you try to just run it on a mobile device without any changes, but the basic game functionality will be there and you can make changes from that point.

The Future

It’s hard to tell what new technologies may come in the future, as new devices, game types, and methods of programming are developed. Regardless of what the future holds, there’s a strong chance that UE4 will support them. So learning UE4 now is a great investment of your time. If you’re interested in learning more, see my book, Learning C++ by Building Games with Unreal Engine 4 – Second Edition.

Author Bio

Sharan Volin has been programming games for more than a decade. She has worked on AAA titles for Behavior Interactive, Blind Squirrel Games, Sony Online Entertainment/Daybreak Games, Electronic Arts (Danger Close Games), 7 Studios (Activision), and more, as well as numerous smaller games. She has primarily been a UI Programmer but is also interested in Audio, AI, and other areas. 
She also taught Game Programming for a year at the Art Institute of California and is the author of Learning C++ by Building Games with Unreal Engine 4 – Second Edition.