How-To Tutorials

Yuri Shkuro on Observability challenges in microservices and cloud-native applications

Packt Editorial Staff
05 Apr 2019
11 min read
In the last decade, we saw a significant shift in how modern, internet-scale applications are being built. Cloud computing (infrastructure as a service) and containerization technologies (popularized by Docker) enabled a new breed of distributed system designs commonly referred to as microservices (and their next incarnation, FaaS). Successful companies like Twitter and Netflix have been able to leverage them to build highly scalable, efficient, and reliable systems, and to deliver more features faster to their customers.

In this article, we explain the concept of observability in microservices, its challenges, and the limits of traditional monitoring tools in microservices. This article is an extract from the book Mastering Distributed Tracing, written by Yuri Shkuro. The book will equip you to operate and enhance your own tracing infrastructure. Through practical exercises and code examples, you will learn how end-to-end tracing can be used as a powerful application performance management and comprehension tool.

While there is no official definition of microservices, a certain consensus has evolved over time in the industry. Martin Fowler, the author of many books on software design, argues that microservices architectures exhibit the following common characteristics:

• Componentization via (micro)services
• Smart endpoints and dumb pipes
• Organized around business capabilities
• Decentralized governance
• Decentralized data management
• Infrastructure automation
• Design for failure
• Evolutionary design

Because of the large number of microservices involved in building modern applications, rapid provisioning, rapid deployment via decentralized continuous delivery, strict DevOps practices, and holistic service monitoring are necessary to effectively develop, maintain, and operate such applications. The infrastructure requirements imposed by microservices architectures spawned a whole new area of development of infrastructure platforms and tools for managing these complex cloud-native applications. In 2015, the Cloud Native Computing Foundation (CNCF) was created as a vendor-neutral home for many emerging open source projects in this area, such as Kubernetes, Prometheus, and Linkerd, with a mission to "make cloud-native computing ubiquitous."

Read more: Honeycomb CEO Charity Majors discusses observability and dealing with "the coming armageddon of complexity" [Interview]

What is observability?

In control theory, a system is called observable if its internal states, and accordingly its behavior, can be determined by only looking at its inputs and outputs. At the 2018 Observability Practitioners Summit, Bryan Cantrill, the CTO of Joyent and one of the creators of the tool dtrace, argued that this definition is not practical to apply to software systems because they are so complex that we can never know their complete internal state, and therefore the control theory's binary measure of observability is always zero (I highly recommend watching his talk on YouTube: https://youtu.be/U4E0QxzswQc). Instead, a more useful definition of observability for a software system is its "capability to allow a human to ask and answer questions". The more questions we can ask and answer about the system, the more observable it is.

Figure 1: The Twitter debate

There are also many debates and Twitter zingers about the difference between monitoring and observability. Traditionally, the term monitoring was used to describe metrics collection and alerting.
Sometimes it is used more generally to include other tools, such as "using distributed tracing to monitor distributed transactions." The Oxford dictionaries define the verb "monitor" as "observe and check the progress or quality of (something) over a period of time; keep under systematic review." However, the term is better scoped to describing the process of observing certain a priori defined performance indicators of our software system, such as those measuring an impact on the end-user experience, like latency or error counts, and using their values to alert us when these signals indicate an abnormal behavior of the system. Metrics, logs, and traces can all be used as a means to extract those signals from the application. We can then reserve the term "observability" for situations when we have a human operator proactively asking questions that were not predefined. As Bryan Cantrill put it in his talk, this process is debugging, and we need to "use our brains when debugging." Monitoring does not require a human operator; it can and should be fully automated.

"If you want to talk about (metrics, logs, and traces) as pillars of observability–great. The human is the foundation of observability!" -- Bryan Cantrill

In the end, the so-called "three pillars of observability" (metrics, logs, and traces) are just tools, or more precisely, different ways of extracting sensor data from the applications. Even with metrics, modern time series solutions like Prometheus, InfluxDB, or Uber's M3 are capable of capturing time series with many labels, such as which host emitted a particular value of a counter. Not all labels may be useful for monitoring, since a single misbehaving service instance in a cluster of thousands does not warrant an alert that wakes up an engineer. But when we are investigating an outage and trying to narrow down the scope of the problem, the labels can be very useful as observability signals.

The observability challenge of microservices

By adopting microservices architectures, organizations are expecting to reap many benefits, from better scalability of components to higher developer productivity. There are many books, articles, and blog posts written on this topic, so I will not go into that. Despite the benefits and eager adoption by companies large and small, microservices come with their own challenges and complexity. Companies like Twitter and Netflix were successful in adopting microservices because they found efficient ways of managing that complexity. Vijay Gill, Senior VP of Engineering at Databricks, goes as far as saying that the only good reason to adopt microservices is to be able to scale your engineering organization and to "ship the org chart".

So, what are the challenges of this design? There are quite a few.

In order to run these microservices in production, we need an advanced orchestration platform that can schedule resources, deploy containers, autoscale, and so on. Operating an architecture of this scale manually is simply not feasible, which is why projects like Kubernetes became so popular.

In order to communicate, microservices need to know how to find each other on the network, how to route around problematic areas, how to perform load balancing, how to apply rate limiting, and so on. These functions are delegated to advanced RPC frameworks or external components like network proxies and service meshes.

Splitting a monolith into many microservices may actually decrease reliability.
Suppose we have 20 components in the application and all of them are required to produce a response to a single request. When we run them in a monolith, our failure modes are restricted to bugs and potentially a crash of the whole server running the monolith. But if we run the same components as microservices, on different hosts and separated by a network, we introduce many more potential failure points, from network hiccups to resource constraints due to noisy neighbors.

The latency may also increase. Assume each microservice has 1 ms average latency, but the 99th percentile is 1 s. A transaction touching just one of these services has a 1% chance of taking ≥ 1 s. A transaction touching 100 of these services has a 1 − (1 − 0.01)^100 ≈ 63% chance of taking ≥ 1 s.

Finally, the observability of the system is dramatically reduced if we try to use traditional monitoring tools. When we see that some requests to our system are failing or slow, we want our observability tools to tell us the story of what happens to that request.

Traditional monitoring tools

Traditional monitoring tools were designed for monolith systems, observing the health and behavior of a single application instance. They may be able to tell us a story about that single instance, but they know almost nothing about the distributed transaction that passed through it. These tools "lack the context" of the request.

Metrics

It goes like this: "Once upon a time…something bad happened. The end." How do you like this story? This is what the chart in Figure 2 tells us. It's not completely useless; we do see a spike and we could define an alert to fire when this happens. But can we explain or troubleshoot the problem?

Figure 2: A graph of two time series representing (hypothetically) the volume of traffic to a service

Metrics, or stats, are numerical measures recorded by the application, such as counters, gauges, or timers. Metrics are very cheap to collect, since numeric values can be easily aggregated to reduce the overhead of transmitting that data to the monitoring system. They are also fairly accurate, which is why they are very useful for the actual monitoring (as the dictionary defines it) and alerting. Yet the same capacity for aggregation is what makes metrics ill-suited for explaining the pathological behavior of the application. By aggregating data, we are throwing away all the context we had about the individual transactions.

Logs

Logging is an even more basic observability tool than metrics. Every programmer learns their first programming language by writing a program that prints (that is, logs) "Hello, World!" Similar to metrics, logs struggle with microservices because each log stream only tells us about a single instance of a service. However, evolving programming paradigms create other problems for logs as a debugging tool. Ben Sigelman, who built Google's distributed tracing system Dapper, explained it in his KubeCon 2016 keynote talk as four types of concurrency (Figure 3):

Figure 3: Evolution of concurrency

Years ago, applications like early versions of Apache HTTP Server handled concurrency by forking child processes and having each process handle a single request at a time. Logs collected from that single process could do a good job of describing what happened inside the application. Then came multi-threaded applications and basic concurrency.
A single request would typically be executed by a single thread sequentially, so as long as we included the thread name in the logs and filtered by that name, we could still get a reasonably accurate picture of the request execution. Then came asynchronous concurrency, with asynchronous and actor-based programming, executor pools, futures, promises, and event-loop-based frameworks. The execution of a single request may start on one thread, then continue on another, then finish on a third. In the case of event loop systems like Node.js, all requests are processed on a single thread, but when the execution tries to make an I/O call, it is put in a wait state, and when the I/O is done, the execution resumes after waiting its turn in the queue. Both of these asynchronous concurrency models result in each thread switching between multiple different requests that are all in flight. Observing the behavior of such a system from the logs is very difficult, unless we annotate all logs with some kind of unique id representing the request rather than the thread, a technique that actually gets us close to how distributed tracing works.

Finally, microservices introduced what we can call "distributed concurrency." Not only can the execution of a single request jump between threads, but it can also jump between processes, when one microservice makes a network call to another. Trying to troubleshoot request execution from such logs is like debugging without a stack trace: we get small pieces, but no big picture. In order to reconstruct the flight of the request from the many log streams, we need powerful log aggregation technology and a distributed context propagation capability to tag all those logs in different processes with a unique request id that we can use to stitch those requests together. We might as well be using the real distributed tracing infrastructure at this point! Yet even after tagging the logs with a unique request id, we still cannot assemble them into an accurate sequence, because the timestamps from different servers are generally not comparable due to clock skews.

In this article, we looked at the concept of observability and some challenges one has to face in microservices. We also discussed traditional monitoring tools for microservices. Applying distributed tracing to microservices-based architectures will be easy with Mastering Distributed Tracing, written by Yuri Shkuro.

• 6 Ways to blow up your Microservices!
• Have Microservices killed the monolithic architecture? Maybe not!
• How to build Dockers with microservices

Tech regulation heats up: from Australia's 'Sharing of Abhorrent Violent Material Bill' to Warren's 'Corporate Executive Accountability Act'

Fatema Patrawala
04 Apr 2019
6 min read
Businesses in powerful economies like the USA, the UK, and Australia are arguably as powerful as politics, or more so. Especially now that we inhabit a global economy where an intricate web of connections can show the appalling employment conditions of Chinese workers who assemble the Apple smartphones we depend on. Amazon's revenue is bigger than Kenya's GDP, and according to Business Insider, 25 major American corporations have revenues greater than the GDP of countries around the world. Because corporations create millions of jobs and control vast amounts of money and resources, their sheer economic power dwarfs governments' ability to regulate and oversee them.

With the recent global-scale scandals that the tech industry has found itself in, some resulting in the deaths of groups of people, governments are waking up to the urgency of holding tech companies responsible. While some government laws are reactionary, others take a more cautious approach. One thing is for sure: 2019 will see a lot of tech regulation come into play. How effective it is, what intended and unintended consequences it bears, and how masterfully big tech wields its lobbying prowess, we'll have to wait and see.

Holding tech platforms that enable hate and violence accountable

Australian govt passes law that criminalizes companies and execs for hosting abhorrent violent content

Today, the Australian parliament passed legislation to crack down on violent videos on social media. The bill, described by the attorney general, Christian Porter, as "most likely a world first", was drafted in the wake of the Christchurch terrorist attack by a white supremacist Australian, when video of the perpetrator's violent attack spread on social media faster than it could be removed.

The Sharing of Abhorrent Violent Material bill creates new offences for content service providers and hosting services that fail to notify the Australian federal police about, or fail to expeditiously remove, videos depicting "abhorrent violent conduct". That conduct is defined as videos depicting terrorist acts, murders, attempted murders, torture, rape, or kidnap. The bill creates a regime for the eSafety Commissioner to notify social media companies that they are deemed to be aware they are hosting abhorrent violent material, triggering an obligation to take it down.

The Digital Industry Group, which consists of Google, Facebook, Twitter, Amazon, and Verizon Media in Australia, has warned that the bill was passed without meaningful consultation and threatens penalties against content created by users. Sunita Bose, the group's managing director, says, "with the vast volumes of content uploaded to the internet every second, this is a highly complex problem". She further argues that "this pass it now, change it later approach to legislation creates immediate uncertainty to the Australia's tech industry".

The Chief Executive of Atlassian, Scott Farquhar, said that the legislation fails to define how "expeditiously" violent material should be removed, and does not specify who in a social media company should be punished.

https://twitter.com/scottfarkas/status/1113391831784480768

The Law Council of Australia president, Arthur Moses, said criminalising social media companies and executives was a "serious step" and should not be legislated as a "knee-jerk reaction to a tragic event" because of the potential for unintended consequences.
Contrasting Australia's knee-jerk legislation, the US House Judiciary Committee has organized a hearing on white nationalism and hate speech and their spread online. It has invited social media platform execs and civil rights organizations to participate.

Holding companies accountable for reckless corporate behavior

Facebook has undergone scandal after scandal with impunity in recent years, given the lack of legislation in this space. It has repeatedly come under the public scanner, from data privacy breaches to disinformation campaigns and beyond. Adding to its ever-growing list of data scandals, yesterday CNN Business uncovered that hundreds of millions of Facebook records were stored on Amazon cloud servers in a way that allowed them to be downloaded by the public.

Earlier this month, on 8th March, Sen. Warren proposed to build strong anti-trust laws and break up big tech companies like Amazon, Google, Facebook, and Apple. Yesterday, she introduced the Corporate Executive Accountability Act and also reintroduced the "too big to fail" bill, a new piece of legislation that would make it easier to criminally charge company executives when Americans' personal data is breached, among other corporate negligent behaviors.

"When a criminal on the street steals money from your wallet, they go to jail. When small-business owners cheat their customers, they go to jail," Warren wrote in a Washington Post op-ed published on Wednesday morning. "But when corporate executives at big companies oversee huge frauds that hurt tens of thousands of people, they often get to walk away with multimillion-dollar payouts."

https://twitter.com/SenWarren/status/1113448794912382977
https://twitter.com/SenWarren/status/1113448583771185153

According to Warren, just one banker went to jail after the 2008 financial crisis. The CEO of Wells Fargo and his successor walked away from the megabank with multimillion-dollar pay packages after it was discovered employees had created millions of fake accounts. The same goes for the Equifax CEO after its data breach.

The new legislation Warren introduced would make it easier to hold corporate executives accountable for their companies' wrongdoing. Typically, it has been hard to prove a case against individual executives for turning a blind eye toward risky or questionable activity, because prosecutors have to prove intent, basically, that they meant to do it. This legislation would change that, Heather Slavkin Corzo, a senior fellow at the progressive nonprofit Americans for Financial Reform, told Vox. "It's easier to show a lack of due care than it is to show the mental state of the individual at the time the action was committed," she said.

A summary of the legislation released by Warren's office explains that it would "expand criminal liability to negligent executives of corporations with over $1 billion annual revenue" who:

• Are found guilty, plead guilty, or enter into a deferred or non-prosecution agreement for any crime.
• Are found liable or enter a settlement with any state or Federal regulator for the violation of any civil law if that violation affects the health, safety, finances, or personal data of 1% of the American population or 1% of the population of any state.
• Are found liable or guilty of a second civil or criminal violation for a different activity while operating under a civil or criminal judgment of any court, a deferred prosecution or non-prosecution agreement, or settlement with any state or Federal agency.
Executives found guilty of these violations could get up to a year in jail, and a second violation could mean up to three years. The Corporate Executive Accountability Act is yet another push from Warren, who has focused much of her presidential campaign on holding corporations and their leaders responsible for both their market dominance and perceived corruption.

• Elizabeth Warren wants to break up tech giants like Amazon, Google, Facebook, and Apple and build strong antitrust laws
• Zuckerberg wants to set the agenda for tech regulation in yet another "digital gangster" move
• Facebook under criminal investigations for data sharing deals: NYT report

Over 30 AI experts join shareholders in calling on Amazon to stop selling Rekognition, its facial recognition tech, for government surveillance

Natasha Mathur
04 Apr 2019
6 min read
Update, 12th April 2019: Amazon shareholders will now vote, at the 2019 Annual Meeting of Shareholders of Amazon, on whether the company board should prohibit sales of facial recognition tech to the government. The meeting will be held at 9:00 a.m. Pacific Time, on Wednesday, May 22, 2019, at Fremont Studios, Seattle, Washington.

Over 30 researchers from top tech firms (Google, Microsoft, et al.), academic institutions, and civil rights groups signed an open letter last week calling on Amazon to stop selling Amazon Rekognition to law enforcement. The letter, published on Medium, has been signed by the likes of this year's Turing Award winner, Yoshua Bengio, and Anima Anandkumar, a Caltech professor, director of Machine Learning research at NVIDIA, and former principal scientist at AWS, among others.

https://twitter.com/rajiinio/status/1113480353308651520

Amazon Rekognition is a deep-learning-based service that is capable of storing and searching tens of millions of faces at a time. It allows detection of objects, scenes, activities, and inappropriate content. However, Amazon Rekognition has long been a bone of contention between the public and rights groups. This is due to the inaccuracies in its face recognition capability and to concerns that selling Rekognition to law enforcement can hamper public privacy.

For instance, an anonymous Amazon employee spoke out against Amazon selling its facial recognition technology to the police last year, calling it a "flawed technology". Also, a group of seven House Democrats sent a letter to the Amazon CEO last November over Amazon Rekognition, raising concerns and questions about its accuracy and its possible effects. Moreover, a group of over 85 coalition groups sent a letter to Amazon earlier this year, urging the company not to sell its facial surveillance technology to the government.

Researchers argue against unregulated Amazon Rekognition use

Researchers state in the letter that a study conducted by Inioluwa Deborah Raji and Joy Buolamwini shows that Rekognition has much higher error rates and is less precise in classifying the gender of darker-skinned women than lighter-skinned men. However, Dr. Matthew Wood, general manager, AI, AWS, and Michael Punke, vice president of global public policy, AWS, dismissed the research, labeling it "misleading". Dr. Wood also stated that "facial analysis and facial recognition are completely different in terms of the underlying technology and the data used to train them. Trying to use facial analysis to gauge the accuracy of facial recognition is ill-advised". Researchers in the letter have called out that statement, saying that it is "problematic on multiple fronts".

The letter also sheds light on the real-world implications of the misuse of face recognition tools. It talks about Clare Garvie, Alvaro Bedoya, and Jonathan Frankle of the Center on Privacy & Technology at Georgetown Law, who study law enforcement's use of face recognition. According to them, using face recognition tech can put the wrong people on trial due to cases of mistaken identity. Also, it is quite common that law enforcement operators are neither aware of the parameters of these tools, nor do they know how to interpret some of their results. Relying on decisions from automated tools can lead to "automation bias".

Another argument Dr. Wood makes to defend the technology is that "To date (over two years after releasing the service), we have had no reported law enforcement misuses of Amazon Rekognition." However, the letter states that this is unfair, as there are currently no laws in place to audit Rekognition's use. Moreover, Amazon has not disclosed any information about its customers or any details about the error rates of Rekognition across different intersectional demographics. "How can we then ensure that this tool is not improperly being used as Dr. Wood states? What we can rely on are the audits by independent researchers, such as Raji and Buolamwini... that demonstrate the types of biases that exist in these products", reads the letter. The researchers say that they find Dr. Wood and Mr. Punke's response to the peer-reviewed research "disappointing" and hope Amazon will dive deeper into examining all of its products before deciding to make them available for use by the police.

More trouble for Amazon: SEC approves shareholders' proposal on the need to release more information on Rekognition

Just earlier this week, the U.S. Securities and Exchange Commission (SEC) announced a ruling that considers Amazon shareholders' proposal, which demands that Amazon provide more information about the company's use and sale of biometric facial recognition technology, to be appropriate. The shareholders said that they are worried about the use of Rekognition and consider it a significant risk to human rights and shareholder value. Shareholders put forward two new proposals regarding Rekognition and requested their inclusion in the company's proxy materials:

• The first proposal calls on the Board of Directors to prohibit the selling of Rekognition to the government unless it has been evaluated that the tech does not violate human and civil rights.
• The second proposal urges the Board to commission an independent study of Rekognition. This would further help examine the risks of Rekognition to immigrants, activists, people of color, and the general public of the United States. The study would also help analyze how such tech is marketed and sold to foreign governments that may be "repressive", along with other financial risks associated with human rights issues.

Amazon criticized the proposals and claimed that both should be discarded under the subsections of Rule 14a-8, as they relate to the company's "ordinary business and operations that are not economically significant". But the SEC's Division of Corporation Finance countered Amazon's arguments. It told Amazon that it is unable to conclude that the "proposals are not otherwise significantly related to the Company's business" and approved their inclusion in the company's proxy materials, reports Compliance Week. "The Board of Directors did not provide an opinion or evidence needed to support the claim that the issues raised by the Proposals are 'an insignificant public policy issue for the Company'", states the division. "The controversy surrounding the technology threatens the relationship of trust between the Company and its consumers, employees, and the public at large".

The SEC ruling, however, only expresses informal views, and whether Amazon is obligated to accept the proposals can only be decided by the U.S. District Court, should the shareholders further legally pursue these proposals. For more information, check out the detailed coverage in the Compliance Week report.
• AWS updates the face detection, analysis and recognition capabilities in Amazon Rekognition
• AWS makes Amazon Rekognition, its image recognition AI, available for Asia-Pacific developers
• Amazon Rekognition can now 'recognize' faces in a crowd at real-time

Using Genetic Algorithms for optimizing your models [Tutorial]

Natasha Mathur
04 Apr 2019
14 min read
While, at present, deep learning (DL) is on top in terms of both application and employability, it has close competition from evolutionary algorithms. These algorithms are inspired by the natural process of evolution, the world's best optimizer. In this article, we will explore what a genetic algorithm is, the advantages of genetic algorithms, and various uses of genetic algorithms in optimizing your models. This article is an excerpt from the book 'Hands-On Artificial Intelligence for IoT' written by Amita Kapoor. The book explores building smarter systems by combining artificial intelligence and the Internet of Things, two of the most talked-about topics today. Let's now learn how we can implement a genetic algorithm.

The genetic algorithm was developed by John Holland in 1975. His student Goldberg showed that it can be used to solve optimization problems, using genetic algorithms to control gas pipeline transmission. Since then, genetic algorithms have remained popular, and have inspired various other evolutionary programs.

To apply genetic algorithms to solving optimization problems on a computer, as the first step we need to encode the problem variables into genes. The genes can be a string of real numbers or a binary bit string (a series of 0s and 1s). This represents a potential solution (individual), and many such solutions together form the population at time t. For instance, consider a problem where we need to find two variables, a and b, such that the two lie in the range (0, 255). For binary gene representation, these two variables can be represented by a 16-bit chromosome, with the higher 8 bits representing gene a and the lower 8 bits representing gene b. The encoding will later need to be decoded to get the real values of the variables a and b.

The second important requirement for genetic algorithms is defining a proper fitness function, which calculates the fitness score of any potential solution (in the preceding example, it should calculate the fitness value of the encoded chromosome). This is the function that we want to optimize by finding the optimum set of parameters of the system or the problem at hand. The fitness function is problem-dependent. For example, in the natural process of evolution, the fitness function represents the organism's ability to operate and to survive in its environment.

Pros and cons of genetic algorithms

Genetic algorithms sound cool, right! Now, before we try and build code around them, let's point out certain advantages and disadvantages of genetic algorithms.

Advantages

Genetic algorithms offer some intriguing advantages and can produce results when traditional gradient-based approaches fail:

• They can be used to optimize either continuous or discrete variables.
• Unlike gradient descent, we do not require derivative information, which also means that there is no need for the fitness function to be continuous and differentiable.
• They can simultaneously search from a wide sampling of the cost surface.
• We can deal with a large number of variables without a significant increase in computation time.
• The generation of the population and the calculation of their fitness values can be performed in parallel, and hence genetic algorithms are well suited for parallel computers.
• They can work even when the topological surface is extremely complex, because crossover and mutation operators help them jump out of a local minimum.
• They can provide more than one optimum solution.
• We can use them with numerically generated data, experimental data, or even analytical functions.
• They work specifically well for large-scale optimization problems.

Disadvantages

Despite the previously mentioned advantages, we still do not find genetic algorithms to be a ubiquitous solution to all optimization problems. This is for the following reasons:

• If the optimization function is a well-behaved convex function, then gradient-based methods will give faster convergence.
• The large population of solutions that helps genetic algorithms cover the search space more extensively also results in slow convergence.
• Designing a fitness function can be a daunting task.

Coding genetic algorithms using Distributed Evolutionary Algorithms in Python

Now that we understand how genetic algorithms work, let's try solving some problems with them. They have been used to solve NP-hard problems such as the traveling salesman problem. To make the tasks of generating a population, performing the crossover, and performing mutation operations easy, we will make use of Distributed Evolutionary Algorithms in Python (DEAP). It supports multiprocessing, and we can use it for other evolutionary algorithms as well. You can download DEAP directly from PyPI using this:

pip install deap

It is compatible with Python 3. To learn more about DEAP, you can refer to its GitHub repository and its user's guide.

Guess the word

In this program, we use genetic algorithms to guess a word. The genetic algorithm will know the number of letters in the word and will guess those letters until it finds the right answer. We decide to represent the genes as single alphanumeric characters; strings of these characters thus constitute a chromosome. Our fitness function is the sum of the characters matching in the individual and the right word.

As the first step, we import the modules we will need. We use the string module and the random module to generate random characters from (a—z, A—Z, and 0—9). From the DEAP module, we use creator, base, and tools:

import string
import random
from deap import base, creator, tools

In DEAP, we start with creating a class that inherits from the deap.base module. We need to tell it whether we are going to minimize or maximize the function; this is done using the weights parameter. A value of +1.0 means we are maximizing (for minimizing, we give the value -1.0). The following code line will create a class, FitnessMax, that will maximize the function:

creator.create("FitnessMax", base.Fitness, weights=(1.0,))

We also define an Individual class, which will inherit the class list, and tell the DEAP creator module to assign FitnessMax as its fitness attribute:

creator.create("Individual", list, fitness=creator.FitnessMax)

Now, with the Individual class defined, we use the toolbox of DEAP defined in the base module. We will use it to create a population and define our gene pool. All the objects that we will need from now onward—an individual, the population, the functions, the operators, and the arguments—are stored in a container called toolbox. We can add or remove content to/from the toolbox container using the register() and unregister() methods:

toolbox = base.Toolbox()

# Gene Pool
toolbox.register("attr_string", random.choice, \
    string.ascii_letters + string.digits)

Now that we have defined how the gene pool will be created, we create an individual and then a population by repeatedly using the Individual class.
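Before looking at the DEAP registrations for this, here is a minimal plain-Python sketch (not from the book) of what those two ideas boil down to: an individual is just a fixed-length list of genes drawn from the gene pool, and a population is a list of such individuals. The helper names (GENE_POOL, make_individual, make_population) are illustrative only.

# Illustrative sketch only (not from the book): individuals and populations without DEAP.
import random
import string

GENE_POOL = string.ascii_letters + string.digits  # same gene pool as above

def make_individual(num_genes):
    # An individual is a fixed-length list of genes drawn from the gene pool.
    return [random.choice(GENE_POOL) for _ in range(num_genes)]

def make_population(size, num_genes):
    # A population is simply a list of individuals.
    return [make_individual(num_genes) for _ in range(size)]

population = make_population(size=300, num_genes=5)  # 5 genes, as for 'hello'
print(population[0])  # e.g. ['k', '3', 'Q', 'a', 'Z']

The DEAP registrations in the next step wrap essentially this pattern with tools.initRepeat, while also attaching the fitness attribute to each individual via the Individual class.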
We will pass the class to the toolbox responsible for creating it, along with an N parameter telling it how many genes to produce:

# Number of characters in word
# The word to be guessed
word = list('hello')
N = len(word)

# Initialize population
toolbox.register("individual", tools.initRepeat, \
    creator.Individual, toolbox.attr_string, N)
toolbox.register("population", tools.initRepeat, list, \
    toolbox.individual)

We define the fitness function. Note the comma in the return statement. This is because the fitness function in DEAP is returned as a tuple, to allow multi-objective fitness functions:

def evalWord(individual, word):
    return sum(individual[i] == word[i] for i in \
        range(len(individual))),

Add the fitness function to the container. Also, add the crossover operator, the mutation operator, and the parent selector operator. You can see that, for this, we are using the register function. In the first statement, we register the fitness function that we have defined, along with the additional arguments it will take. The next statement registers the crossover operation; it specifies that here we are using a two-point crossover (cxTwoPoint). Next, we register the mutation operator; we choose the mutShuffleIndexes option, which shuffles the attributes of the input individual with a probability indpb=0.05. And finally, we define how the parents are selected; here, we have defined the method of selection as tournament selection with a tournament size of 3:

toolbox.register("evaluate", evalWord, word)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutShuffleIndexes, indpb=0.05)
toolbox.register("select", tools.selTournament, tournsize=3)

Now we have all the ingredients, so we will write down the code of the genetic algorithm, which will perform the steps we mentioned earlier in a repetitive manner:

def main():
    random.seed(64)

    # create an initial population of 300 individuals
    pop = toolbox.population(n=300)

    # CXPB is the crossover probability
    # MUTPB is the probability for mutating an individual
    CXPB, MUTPB = 0.5, 0.2

    print("Start of evolution")

    # Evaluate the entire population
    fitnesses = list(map(toolbox.evaluate, pop))
    for ind, fit in zip(pop, fitnesses):
        ind.fitness.values = fit

    print(" Evaluated %i individuals" % len(pop))

    # Extracting all the fitnesses of individuals in a list
    fits = [ind.fitness.values[0] for ind in pop]

    # Variable keeping track of the number of generations
    g = 0

    # Begin the evolution
    while max(fits) < 5 and g < 1000:
        # A new generation
        g += 1
        print("-- Generation %i --" % g)

        # Select the next generation individuals
        offspring = toolbox.select(pop, len(pop))
        # Clone the selected individuals
        offspring = list(map(toolbox.clone, offspring))

        # Apply crossover and mutation on the offspring
        for child1, child2 in zip(offspring[::2], offspring[1::2]):
            # cross two individuals with probability CXPB
            if random.random() < CXPB:
                toolbox.mate(child1, child2)
                # fitness values of the children
                # must be recalculated later
                del child1.fitness.values
                del child2.fitness.values

        for mutant in offspring:
            # mutate an individual with probability MUTPB
            if random.random() < MUTPB:
                toolbox.mutate(mutant)
                del mutant.fitness.values

        # Evaluate the individuals with an invalid fitness
        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        fitnesses = map(toolbox.evaluate, invalid_ind)
        for ind, fit in zip(invalid_ind, fitnesses):
            ind.fitness.values = fit

        print(" Evaluated %i individuals" % len(invalid_ind))

        # The population is entirely replaced by the offspring
        pop[:] = offspring

        # Gather all the fitnesses in one list and print the stats
        fits = [ind.fitness.values[0] for ind in pop]

        length = len(pop)
        mean = sum(fits) / length
        sum2 = sum(x*x for x in fits)
        std = abs(sum2 / length - mean**2)**0.5

        print(" Min %s" % min(fits))
        print(" Max %s" % max(fits))
        print(" Avg %s" % mean)
        print(" Std %s" % std)

    print("-- End of (successful) evolution --")

    best_ind = tools.selBest(pop, 1)[0]
    print("Best individual is %s, %s" % (''.join(best_ind), \
        best_ind.fitness.values))

Here, you can see the result of this genetic algorithm. In seven generations, we reached the right word.

Genetic algorithm for LSTM optimization

In a genetic CNN, we use genetic algorithms to estimate the optimum CNN architecture; in a genetic RNN, we will now use a genetic algorithm to find the optimum hyperparameters of the RNN, the window size, and the number of hidden units. We will find the parameters that reduce the root-mean-square error (RMSE) of the model. The hyperparameters window size and number of units are again encoded in a binary string, with 6 bits for the window size and 4 bits for the number of units. Thus, the complete encoded chromosome will be 10 bits. The LSTM is implemented using Keras. The code we implement is taken from https://github.com/aqibsaeed/Genetic-Algorithm-RNN.

The necessary modules are imported. This time, we are using Keras to implement the LSTM model:

import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split as split
from keras.layers import LSTM, Input, Dense
from keras.models import Model
from deap import base, creator, tools, algorithms
from scipy.stats import bernoulli
from bitstring import BitArray

np.random.seed(1120)

The dataset we need for the LSTM has to be time series data; we use the wind-power forecasting data from Kaggle (https://www.kaggle.com/c/GEF2012-wind-forecasting/data):

data = pd.read_csv('train.csv')
data = np.reshape(np.array(data['wp1']), (len(data['wp1']), 1))

train_data = data[0:17257]
test_data = data[17257:]

Define a function to prepare the dataset depending upon the chosen window_size:

def prepare_dataset(data, window_size):
    X, Y = np.empty((0, window_size)), np.empty((0))
    for i in range(len(data)-window_size-1):
        X = np.vstack([X, data[i:(i + window_size), 0]])
        Y = np.append(Y, data[i + window_size, 0])
    X = np.reshape(X, (len(X), window_size, 1))
    Y = np.reshape(Y, (len(Y), 1))
    return X, Y

The train_evaluate function creates the LSTM network for a given individual and returns its RMSE value (fitness function):

def train_evaluate(ga_individual_solution):
    # Decode genetic algorithm solution to integer for window_size and num_units
    window_size_bits = BitArray(ga_individual_solution[0:6])
    num_units_bits = BitArray(ga_individual_solution[6:])
    window_size = window_size_bits.uint
    num_units = num_units_bits.uint
    print('\nWindow Size: ', window_size, ', Num of Units: ', num_units)

    # Return fitness score of 100 if window_size or num_unit is zero
    if window_size == 0 or num_units == 0:
        return 100,

    # Segment the train_data based on new window_size; split into train and validation (80/20)
    X, Y = prepare_dataset(train_data, window_size)
    X_train, X_val, y_train, y_val = split(X, Y, test_size=0.20, random_state=1120)

    # Train LSTM model and predict on validation set
    inputs = Input(shape=(window_size, 1))
    x = LSTM(num_units, input_shape=(window_size, 1))(inputs)
    predictions = Dense(1, activation='linear')(x)
    model = Model(inputs=inputs, outputs=predictions)
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(X_train, y_train, epochs=5, batch_size=10, shuffle=True)
    y_pred = model.predict(X_val)

    # Calculate the RMSE score as fitness score for GA
    rmse = np.sqrt(mean_squared_error(y_val, y_pred))
    print('Validation RMSE: ', rmse, '\n')
    return rmse,

Next, we use DEAP tools to define Individual (again, since the chromosome is represented by a binary encoded string (10 bits), we use Bernoulli's distribution), create the population, use ordered crossover, use mutShuffleIndexes mutation, and use roulette wheel selection for selecting the parents:

population_size = 4
num_generations = 4
gene_length = 10

# As we are trying to minimize the RMSE score, we use -1.0.
# When you want to maximize, for instance accuracy, use 1.0
creator.create('FitnessMax', base.Fitness, weights=(-1.0,))
creator.create('Individual', list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register('binary', bernoulli.rvs, 0.5)
toolbox.register('individual', tools.initRepeat, creator.Individual, toolbox.binary, n=gene_length)
toolbox.register('population', tools.initRepeat, list, toolbox.individual)

toolbox.register('mate', tools.cxOrdered)
toolbox.register('mutate', tools.mutShuffleIndexes, indpb=0.6)
toolbox.register('select', tools.selRoulette)
toolbox.register('evaluate', train_evaluate)

population = toolbox.population(n=population_size)
r = algorithms.eaSimple(population, toolbox, cxpb=0.4, mutpb=0.1, ngen=num_generations, verbose=False)

We get the best solution, as follows:

best_individuals = tools.selBest(population, k=1)
best_window_size = None
best_num_units = None

for bi in best_individuals:
    window_size_bits = BitArray(bi[0:6])
    num_units_bits = BitArray(bi[6:])
    best_window_size = window_size_bits.uint
    best_num_units = num_units_bits.uint
    print('\nWindow Size: ', best_window_size, ', Num of Units: ', best_num_units)

And finally, we implement the best LSTM solution:

X_train, y_train = prepare_dataset(train_data, best_window_size)
X_test, y_test = prepare_dataset(test_data, best_window_size)

inputs = Input(shape=(best_window_size, 1))
x = LSTM(best_num_units, input_shape=(best_window_size, 1))(inputs)
predictions = Dense(1, activation='linear')(x)
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=5, batch_size=10, shuffle=True)
y_pred = model.predict(X_test)

rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print('Test RMSE: ', rmse)

Yay! Now you have the best LSTM network for predicting wind power.

In this article, we looked at an interesting nature-inspired algorithm family: genetic algorithms. We learned how to convert our optimization problems into a form suitable for genetic algorithms. Crossover and mutation, two very crucial operations in genetic algorithms, were explained. We applied what we learned to two very different optimization problems: guessing a word and finding the optimum hyperparameters for an LSTM network. If you want to explore more topics related to genetic algorithms, be sure to check out the book 'Hands-On Artificial Intelligence for IoT'.

• How AI is transforming the Smart Cities IoT? [Tutorial]
• How to stop hackers from messing with your home network (IoT)
• Defending your business from the next wave of cyberwar: IoT Threats

Microsoft announces the general availability of Live Share and brings it to Visual Studio 2019

Bhagyashree R
03 Apr 2019
2 min read
Microsoft yesterday announced that Live Share is now generally available and comes included in Visual Studio 2019. This release brings a lot of updates based on the feedback the team has received since the public preview of Live Share started, including a read-only mode, support for C++ and Python, and more.

What is Live Share?

Microsoft first introduced Live Share at Connect 2017 and launched its public preview in May 2018. It is a tool that enables you to collaborate with your team on the same codebase without needing to synchronize code or to configure the same development tools, settings, or environment. You can edit and debug your code with others in real time, regardless of which programming languages you are using or the type of app you are building. It allows you to do a bunch of different things, like instantly sharing your project with a teammate, and sharing debugging sessions, terminal instances, localhost web apps, voice calls, and more. With Live Share, you do not have to leave the comfort of your favorite tools. You can take advantage of collaboration while retaining your personal editor preferences. It also provides each developer their own cursor, to enable seamless transitions between following one another.

What's new in Live Share?

This release includes features like a read-only mode, support for more languages like C++ and Python, and the ability for guests to start debugging sessions. Now, you can use Live Share while pair programming, conducting code reviews, giving lectures and presenting to students and colleagues, or even mob programming during hackathons.

This release also comes with support for a few third-party extensions to improve the overall experience when working with Live Share. The two extensions are OzCode and CodeStream. OzCode offers a suite of visualizations, like datatips to see how items are passed through a LINQ query. It provides a heads-up display to show how a set of boolean expressions evaluates. With CodeStream, you can create discussions about your codebase, which serve as an integrated chat feature within a Live Share session.

To read about all the updates in Live Share, check out the official announcement.

• Microsoft brings PostgreSQL extension and SQL Notebooks functionality to Azure Data Studio
• Microsoft open-sources Project Zipline, its data compression algorithm and hardware for the cloud
• Microsoft announces Game stack with Xbox Live integration to Android and iOS

UN Global Working Group on Big Data publishes a handbook on privacy-preserving computation techniques

Bhagyashree R
03 Apr 2019
4 min read
On Monday, the UN Global Working Group (GWG) on Big Data published the UN Handbook on Privacy-Preserving Computation Techniques. The handbook discusses emerging privacy-preserving computation techniques and also outlines the key challenges in making these techniques more mainstream.

https://twitter.com/UNBigData/status/1112739047066255360

Motivation behind writing this handbook

In recent years, we have come across several data breaches. Companies collect users' personal data without their consent to show them targeted content. Aggregated personal data can be misused to identify individuals and localize their whereabouts. Individuals can be singled out with the help of just a small set of attributes. These large collections of data are very often an easy target for cybercriminals.

Previously, when cyber threats were not that advanced, people focused mostly on protecting the privacy of data at rest. This led to the development of technologies like symmetric key encryption. Later, when sharing data over unprotected networks became common, technologies like Transport Layer Security (TLS) came into the picture. Today, when attackers are capable of penetrating servers worldwide, it is important to be aware of the technologies that help ensure data privacy during computation. This handbook focuses on technologies that protect the privacy of data during and after computation, which are called privacy-preserving computation techniques.

Privacy Enhancing Technologies (PET) for statistics

The handbook lists five privacy enhancing technologies for statistics that will help reduce the risk of data leakage. I say "reduce" because there is, in fact, no known technique that can give a complete solution to the privacy question.

#1 Secure multi-party computation

Secure multi-party computation is also known as secure computation, multi-party computation (MPC), or privacy-preserving computation. A subfield of cryptography, this technology deals with scenarios where multiple parties jointly compute a function. It aims to prevent any participant from learning anything about the input provided by other parties. MPC is based on secret sharing, in which data is divided into shares that are random in themselves, but when combined give the original data. Each data input is split into two or more shares and distributed among the parties involved. These, when combined, produce the correct output of the computed function.

#2 Homomorphic encryption

Homomorphic encryption is an encryption technique with which you can perform computations on encrypted data without the need for a decryption key. The advantage of this encryption scheme is that it enables computation on encrypted data without revealing the input data or the result to the computing party. The result can only be decrypted by a specific party that has access to the secret key, typically the owner of the input data.

#3 Differential Privacy (DP)

DP is a statistical technique that makes it possible to collect and share aggregate information about users, while also ensuring that the privacy of individual users is maintained. This technique was designed to address the pitfalls of previous attempts to define privacy, especially in the context of multiple releases and when adversaries have access to side knowledge.

#4 Zero-knowledge proofs

Zero-knowledge proofs involve two parties: a prover and a verifier. The prover has to prove statements to the verifier based on secret information known only to the prover.
ZKP allows you to prove that you know a secret (or secrets) to the other party without actually revealing it. This is why the technology is called "zero knowledge": "zero" information about the secret is revealed, yet the verifier is convinced that the prover knows the secret in question.

#5 Trusted Execution Environments (TEEs)

This last technique on the list is different from the above four, as it uses both hardware and software to protect data and code. It provides users with secure computation capability by combining special-purpose hardware and software built to use those hardware features. In this technique, a process is run on a processor without its memory or execution state being exposed to any other process on the processor.

This free 50-page handbook is targeted at statisticians and data scientists, data curators and architects, IT specialists, and security and information assurance specialists. So, go ahead and have a read: the UN Handbook on Privacy-Preserving Computation Techniques!

• Google employees filed petition to remove anti-trans, anti-LGBTQ and anti-immigrant Kay Coles James from the AI council
• Ahead of Indian elections, Facebook removes hundreds of assets spreading fake news and hate speech, but are they too late?
• Researchers successfully trick Tesla autopilot into driving into opposing traffic via "small stickers as interference patches on the ground"
Chef goes open source, ditching the Loose Open Core model

Richard Gall
02 Apr 2019
5 min read
Chef, the infrastructure automation tool, has today revealed that it is going completely open source. In doing so, the project has ditched the loose open core model. The news is particularly intriguing as it comes at a time when the traditional open source model appears to be facing challenges around its future sustainability. However, it would appear that from Chef's perspective the switch to a full open source license is being driven by a crowded marketplace where automation tools are finding it hard to gain a foothold inside organizations trying to automate their infrastructure. A further challenge for this market is what Chef has identified as 'The Coded Enterprise' - essentially technologically progressive organizations driven by an engineering culture where infrastructure is primarily viewed as code. Read next: Key trends in software infrastructure in 2019: observability, chaos, and cloud complexity Why is Chef going open source? As you might expect,  there's actually more to Chef's decision than pure commercialism. To get a good understanding, it's worth picking apart Chef's open core model and how this was limiting the project. The limitations of Open Core The Loose Open Core model has open source software at its center but is wrapped in proprietary software. So, it's open at its core, but is largely proprietary in how it is deployed and used by businesses. While at first glance this might make it easier to monetize the project, it also severely limits the projects ability to evolve and develop according to the needs of people that matter - the people that use it. Indeed, one way of thinking about it is that the open core model positions your software as a product - something that is defined by product managers and lives and dies by its stickiness with customers. By going open source, your software becomes a project, something that is shared and owned by a community of people that believe in it. Speaking to TechCrunch, Chef Co-Founder Adam Jacob said "in the open core model, you’re saying that the value is in this proprietary sliver. The part you pay me for is this sliver of its value. And I think that’s incorrect... the value was always in the totality of the product." Read next: Chef Language and Style Removing the friction between product and project Jacob published an article on Medium expressing his delight at the news. It's an instructive look at how Chef has been thinking about itself and the challenges it faces. "Deciding what’s in, and what’s out, or where to focus, was the hardest part of the job at Chef," Jacob wrote. "I’m stoked nobody has to do it anymore. I’m stoked we can have the entire company participating in the open source community, rather than burning out a few dedicated heroes. I’m stoked we no longer have to justify the value of what we do in terms of what we hold back from collaborating with people on." So, what's the deal with the Chef Enterprise Automation Stack? As well as announcing that Chef will be open sourcing its code, the organization also revealed that it was bringing together Chef Automate, Chef Infra, Chef InSpec, Chef Habitat and Chef Workstation under one single solution: the Chef Enterprise Automation Stack. The point here is to simplify Chef's offering to its customers to make it easier for them to do everything they can to properly build and automate reliable infrastructure. Corey Scobie, SVP of Product and Engineering said that "the introduction of the Chef Enterprise Automation Stack builds on [the switch to open source]... 
aligning our business model with our customers' stated needs through Chef software distribution, services, assurances and direct engagement. Moving forward, the best, fastest, most reliable way to get Chef products and content will be through our commercial distributions."

So, essentially, the Chef Enterprise Automation Stack will be the primary Chef distribution that's available commercially, sitting alongside the open source project.

What does all this mean for Chef customers and users?

If you're a Chef user or have any questions or concerns, the team have put together a very helpful FAQ. You can read it here.

The key points for Chef users

Existing commercial and non-commercial users don't need to do anything - everything will continue as normal. However, anyone else using current releases should be aware that support will be removed from those releases in 12 months' time. The team have clarified that "customers who choose to use our new software versions will be subject to the new license terms and will have an opportunity to create a commercial relationship with Chef, with all of the accompanying benefits that provides."

A big step for Chef - could it help determine the evolution of open source?

This is a significant step for Chef and it will be of particular interest to its users. But even for those who have no interest in Chef, it's nevertheless a story that indicates that there's a lot of life in open source despite the challenges it faces. It'll certainly be interesting to see whether Chef makes it work and what impact it has on the configuration management marketplace.

article-image-optimizing-graphql-with-apollo-engine-tutorial
Bhagyashree R
02 Apr 2019
13 min read

Optimizing GraphQL with Apollo Engine [Tutorial]

Apollo Engine is a commercial product produced by MDG, the Meteor Development Group, the company behind Apollo. It provides many great features, which we'll explore in this article. We will also answer these questions using Apollo Engine: How is our GraphQL API performing, are there any errors, and how can we improve the GraphQL schema?

This article is taken from the book Hands-on Full-Stack Web Development with GraphQL and React by Sebastian Grebe. This book will guide you in implementing applications by using React, Apollo, Node.js, and SQL. By the end of the book, you will be proficient in using GraphQL and React for your full-stack development requirements. To follow along with the examples implemented in this article, you can download the code from the book's GitHub repository.

Setting up Apollo Engine

First, you need to sign up for an Apollo Engine account. At the time of writing, they offer three different plans, which you can find by going to their plans page. When signing up, you get a two-week trial of the Team plan, which is one of the paid plans. Afterward, you'll be downgraded to the free plan. You should compare all three plans to understand how they differ—they're all worth checking out.

To sign up, go to its login page. Currently, you can only sign up using a GitHub account. If you don't have one already, create a GitHub account. After logging in, you will see a dashboard that looks as follows:

The next step is to add a service with the NEW SERVICE button in the top-right corner. The first thing you need to enter is a unique id for your service across all Apollo Engine services. This id will be auto-generated through the organization you select, but can be customized. Secondly, you will be asked to publish your GraphQL schema to Apollo Engine. Publishing your GraphQL schema means that you upload your schema to Apollo Engine so that it can be processed. It won't get publicized to external users. You can do this using the command provided by Apollo Engine. For me, this command looked as follows:

npx apollo service:push --endpoint="http://localhost:8000/graphql" --key="YOUR_KEY"

The preceding endpoint must match your GraphQL route. The key comes from Apollo Engine itself, so you don't generate it on your own. Before running the preceding command, you have to start the server, otherwise the GraphQL schema isn't accessible. Once you've uploaded the schema, Apollo Engine will redirect you to the service you just set up.

Notice that the GraphQL introspection feature needs to be enabled. Introspection means that you can ask your GraphQL API which operations it supports. Introspection is only enabled when you run your Apollo Server in a development environment, or if you explicitly enable introspection in production. I highly discourage this because it involves giving away information about the queries and mutations that are accepted by your back end. However, if you want to enable it, you can do this by setting the introspection field when initializing Apollo Server. It can be added inside the index.js file of the graphql folder:

const server = new ApolloServer({
  schema: executableSchema,
  introspection: true,
});

Ensure that you remove the introspection field when deploying your application. If you aren't able to run the GraphQL server, you also have the ability to specify a schema file. Once you publish the GraphQL schema, the setup process for your Apollo Engine service should be done. We'll explore the features that we can now use in the following sections of this article.
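A safer pattern than toggling introspection by hand is to derive it from the environment, so it can never ship enabled by accident. The following is a minimal, illustrative sketch rather than the book's exact code: the tiny stand-in schema and the 'apollo-server-express' import are assumptions, and in the real project you would pass the existing executableSchema instead.

// Sketch only: gate introspection on NODE_ENV instead of hardcoding it.
// 'apollo-server-express' is an assumption; use whichever Apollo Server package the project already imports.
const { ApolloServer, gql } = require('apollo-server-express');

// Tiny stand-in schema so this sketch is self-contained; the book's project would use executableSchema here.
const typeDefs = gql`
  type Query {
    ping: String
  }
`;
const resolvers = { Query: { ping: () => 'pong' } };

const server = new ApolloServer({
  typeDefs,
  resolvers,
  // Introspection stays on for local development and is switched off automatically in production builds.
  introspection: process.env.NODE_ENV !== 'production',
});

With a setup along these lines, publishing the schema with apollo service:push still works against the local development server, while a production deployment refuses introspection queries without any further code changes.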
Before doing this, however, we have to change one thing on the back end to get Apollo Engine working with our back end. We already used our API key to upload our GraphQL schema to Apollo Engine. Everything, such as error tracking and performance analysis, relies on this key. We also have to insert it in our GraphQL server. If you enter a valid API key, all requests will be collected in Apollo Engine.

Open index.js in the server's graphql folder and add the following object to the ApolloServer initialization:

engine: {
  apiKey: ENGINE_KEY
}

The ENGINE_KEY variable should be extracted from the environment variables at the top of the file. We also need to extract JWT_SECRET with the following line:

const { JWT_SECRET, ENGINE_KEY } = process.env;

Verify that everything is working by running some GraphQL requests. You can view all past requests by clicking on the Clients tab in Apollo Engine. You should see that a number of requests happened, under the Activity in the last hour panel. If this isn't the case, there must be a problem with the Apollo Server configuration.

Analyzing schemas with Apollo Engine

The Community plan of Apollo Engine offers schema registry and explorer tools. You can find them by clicking on the Explorer tab in the left-hand panel. If your setup has gone well, the page should look as follows:

Let's take a closer look at this screenshot:

On the page, you see the last GraphQL schema that you have published. Each schema you publish has a unique version, as long as the schema includes changes.
Beneath the version number, you can see your entire GraphQL schema. You can inspect all operations and types. All relations between types and operations are directly linked to each other.
You can directly see the number of clients and various usage statistics next to each operation, type, and field.
You can search through your GraphQL schema in the top bar and filter the usage statistics in the panel on the right.

You can also switch to the Deprecation tab at the top. This page gives you a list of fields that are deprecated. We won't use this page because we are using the latest field definitions, but it's vital if you're running an application for a longer time.

Having an overview of our schema is beneficial. In production, every new release of our application is likely to also bring changes to the GraphQL schema. With Apollo Engine, you can track those changes easily. This feature is called schema-change validation and is only included in the paid Team plan of Apollo Engine. It's worth the extra money because it allows you to track schema changes and also to compare how those fields are used. It allows us to draw conclusions about which clients and versions are being used at the moment. I have created an example for you in the following screenshot:

Here, I published an initial version of our current GraphQL schema. Afterward, I added a demonstration type with one field, called example. On the right-hand side, you can see the schema difference between the initial and second releases of the GraphQL schema. Viewing your schema inside Apollo Engine, including the history of all previous schemas, is very useful.

Performance metrics with Apollo Engine

When your application is live and heavily used, you can't check the status of every feature yourself; it would lead to an impossible amount of work. Apollo Engine can tell you how your GraphQL API is performing by collecting statistics with each request that's received.
You always have an overview of the general usage of your application, the number of requests it receives, the request latency, the time taken to process each operation, the type, and also each field that is returned. Apollo Server can provide these precise analytics since each field is represented in a resolver function. The time elapsed to resolve each field is then collected and stored inside Apollo Engine.

At the top of the Metrics page, you have four tabs. The first tab will look as follows:

If your GraphQL API is running for more than a day, you'll receive an overview that looks like the one here. The left-hand graph shows you the request rate over the last day. The graph in the middle shows the service time, which sums up the processing time of all requests. The right-hand graph gives you the number of errors, along with the queries that caused them. Under the overview, you'll find details about the current day, including the requests per minute, the request latency over time, and the request latency distribution:

Requests Per Minute (rpm): It is useful when your API is used very often. It indicates which requests are sent more often than others.
Latency over time: It is useful when the requests to your API take too long to process. You can use this information to look for a correlation between the number of requests and increasing latency.
Request-latency distribution: It shows you the processing time and the number of requests. You can compare the number of slow requests with the number of fast requests in this chart.

In the right-hand panel of Apollo Engine, under Metrics, you'll see all your GraphQL operations. If you select one of these, you can get even more detailed statistics. Now, switch to the Traces tab at the top. The first chart on this page looks as follows:

The latency distribution chart shows all the different latencies for the currently-selected operation, including the number of sent requests with that latency. In the preceding example, I used the postsFeed query. Each request latency has its own execution timetable. You can see it by clicking on any column in the preceding chart. The table should look like the following screenshot:

The execution timetable is a big foldable tree. It starts at the top with the root query, postsFeed, in this case. You can also see the overall time it took to process the operation. Each resolver function has got its own latency, which might include, for example, the time taken for each post and user to be queried from the database. All the times from within the tree are summed up and result in a total time of about 90 milliseconds. It's obvious that you should always check all operations and their latencies to identify performance breakdowns. Your users should always have responsive access to your API. This can easily be monitored with Apollo Engine.

Error tracking with Apollo Engine

We've already looked at how to inspect single operations using Apollo Engine. Under the Clients tab, you will find a separate view that covers all client types and their requests:

In this tab, you can directly see the percentage of errors that happened during each operation. In the currentUser query, there were 37.14% errors out of the total currentUser requests. If you take a closer look at the left-hand side of the image, you will see that it says Unidentified clients. Since version 2.2.3 of Apollo Server, client awareness is supported. It allows you to identify the client and track how consumers use your API.
Apollo automatically extracts an extensions field inside each GraphQL operation, which can hold a name and version. Both fields—Name and Version—are then directly transferred to Apollo Engine. We can filter by these fields in Apollo Engine. We will have a look at how to implement this in our back end next.

In this example, we'll use HTTP header fields to track the client type. There will be two header fields: apollo-client-name and apollo-client-version. We'll use these to set custom values to filter requests later in the Clients page. Open the index.js file from the graphql folder. Add the following function to the engine property of the ApolloServer initialization:

engine: {
  apiKey: ENGINE_KEY,
  generateClientInfo: ({ request }) => {
    const headers = request.http.headers;
    const clientName = headers.get('apollo-client-name');
    const clientVersion = headers.get('apollo-client-version');
    if (clientName && clientVersion) {
      return { clientName, clientVersion };
    } else {
      return {
        clientName: "Unknown Client",
        clientVersion: "Unversioned",
      };
    }
  },
},

The generateClientInfo function is executed with every request. We extract the two fields from the header. If they exist, we return an object with the clientName and clientVersion properties that have the values from the headers. Otherwise, we return a static "Unknown Client" text.

To get both of our clients – the front end and back end – set up, we have to add these fields. Perform the following steps:

Open the index.js file of the client's apollo folder. Add a new InfoLink to the file to set the two new header fields:

const InfoLink = (operation, next) => {
  operation.setContext(context => ({
    ...context,
    headers: {
      ...context.headers,
      'apollo-client-name': 'Apollo Frontend Client',
      'apollo-client-version': '1'
    },
  }));
  return next(operation);
};

Like AuthLink, this link will add the two new header fields next to the authorization header. It sets the version header to '1' and the name of the client to 'Apollo Frontend Client'. We will see both in Apollo Engine soon. Add InfoLink in front of AuthLink in the ApolloLink.from function.

On the back end, we need to edit the apollo.js file in the ssr folder:

const InfoLink = (operation, next) => {
  operation.setContext(context => ({
    ...context,
    headers: {
      ...context.headers,
      'apollo-client-name': 'Apollo Backend Client',
      'apollo-client-version': '1'
    },
  }));
  return next(operation);
};

The link is almost the same as the one for the front end, except that we set another apollo-client-name header. Add it just before AuthLink in the ApolloLink.from function. The client name differs between the front end and back end code so you can compare both clients inside Apollo Engine.

If you execute some requests from the back end and front end, you can see the result of these changes directly in Apollo Engine. Here, you can see an example of how that result should look:

At the top of the screenshot, we see the number of requests the back end has made. In the middle, all the clients that we have no further information on are listed, while at the bottom, we can see all requests that have been made by the client-side code. Unknown clients might be external applications that are accessing your API. When releasing a new version of your application, you can increase the version number of the client. The version number represents another comparable field. We now know which clients have accessed our API from the information provided by Apollo Engine. Let's take a look at what Apollo Engine can tell us about errors.
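Before we do, here is a minimal sketch of how the front-end InfoLink above might be wired into the client's link chain. It is illustrative only: the createHttpLink URI and the pass-through AuthLink below are assumptions standing in for the project's existing links, not code taken from the book.

import { ApolloClient } from 'apollo-client';
import { InMemoryCache } from 'apollo-cache-inmemory';
import { ApolloLink } from 'apollo-link';
import { createHttpLink } from 'apollo-link-http';

// Adds the client identification headers, exactly as defined in the steps above.
const InfoLink = (operation, next) => {
  operation.setContext(context => ({
    ...context,
    headers: {
      ...context.headers,
      'apollo-client-name': 'Apollo Frontend Client',
      'apollo-client-version': '1',
    },
  }));
  return next(operation);
};

// Stand-in for the project's real AuthLink, which would attach the authorization header here.
const AuthLink = (operation, next) => next(operation);

// Assumed endpoint; replace with the project's actual GraphQL URL.
const HttpLink = createHttpLink({ uri: 'http://localhost:8000/graphql' });

const client = new ApolloClient({
  // InfoLink runs before AuthLink, so every outgoing operation carries both sets of headers.
  link: ApolloLink.from([InfoLink, AuthLink, HttpLink]),
  cache: new InMemoryCache(),
});

Because the terminating HttpLink comes last, the extra headers are already set on the operation's context by the time the request is sent, which is what lets Apollo Engine attribute each request to a named, versioned client.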
When you visit the Error tab, you will be presented with a screen that looks like the following screenshot:

The first chart shows the number of errors over a timeline. Under the graph, you can see each error with a timestamp and the stack trace. You can follow the link to see the trace in detail, with the location of the error. If you paid for the Team plan, you can also set alerts for when the number of errors increases or the latency time goes up. You can find these alerts under the Integrations tab.

This article walked you through how to sign up to and set up Apollo Engine, analyze schemas, check how our GraphQL API is performing, and track errors using Apollo Engine. If you found this post useful, do check out the book, Hands-on Full-Stack Web Development with GraphQL and React. This book teaches you how to build scalable full-stack applications while learning to solve complex problems with GraphQL.

Applying Modern CSS to Create React App Projects [Tutorial]
Keeping animations running at 60 FPS in a React Native app [Tutorial]
React Native development tools: Expo, React Native CLI, CocoaPods [Tutorial]

article-image-zuckerberg-agenda-for-tech-regulation-yet-another-digital-gangster-move
Sugandha Lahoti
01 Apr 2019
7 min read

Zuckerberg wants to set the agenda for tech regulation in yet another “digital gangster” move

Facebook has probably made the biggest April Fool's joke of this year. Over the weekend, Mark Zuckerberg, CEO of Facebook, penned a post detailing the need to have tech regulation in four major areas: "harmful content, election integrity, privacy, and data portability". However, privacy advocates and tech experts were frustrated rather than pleased with this announcement, stating that, given recent privacy scandals, Facebook's CEO shouldn't be the one making the rules.

The term 'digital gangster' was first coined by the Guardian, when the Digital, Culture, Media and Sport Committee published its final report on Facebook's disinformation and 'fake news' practices. Per the publication, "Facebook behaves like a 'digital gangster' destroying democracy. It considers itself to be 'ahead of and beyond the law'. It 'misled' parliament. It gave statements that were 'not true'". Last week, Facebook rolled out a new Ad Library to provide more stringent transparency for preventing interference in worldwide elections. It also rolled out a policy to ban white nationalist content from its platforms.

Zuckerberg's four new regulation ideas

"I believe we need a more active role for governments and regulators. By updating the rules for the internet, we can preserve what's best about it — the freedom for people to express themselves and for entrepreneurs to build new things — while also protecting society from broader harms.", writes Zuckerberg.

Reducing harmful content

For harmful content, Zuckerberg talks about having a certain set of rules that govern what types of content tech companies should consider harmful. According to him, governments should set "baselines" for online content that require filtering. He suggests that third-party organizations should also set standards governing the distribution of harmful content and measure companies against those standards. "Internet companies should be accountable for enforcing standards on harmful content," he writes. "Regulation could set baselines for what's prohibited and require companies to build systems for keeping harmful content to a bare minimum."

Ironically, over the weekend, Facebook was accused of enabling the spread of anti-Semitic propaganda after its refusal to take down repeatedly flagged hate posts. Facebook stated that it will not remove the posts as they do not breach its hate speech rules and are not against UK law.

Preserving election integrity

The second tech regulation area revolves around election integrity. Facebook has been taking steps in this direction by making significant changes to its advertising policies. Facebook's new Ad Library, which was released last week, now provides advertising transparency on all active ads running on a Facebook page, including politics or issue ads. Ahead of the European Parliamentary election in May 2019, Facebook is also introducing ads transparency tools in the EU. He advises other tech companies to build a searchable ad archive as well. "Deciding whether an ad is political isn't always straightforward. Our systems would be more effective if regulation created common standards for verifying political actors," Zuckerberg says. He also talks about improving online political advertising laws for political issues rather than primarily focussing on candidates and elections.
"I believe," he says, "legislation should be updated to reflect the reality of the threats and set standards for the whole industry."

What is surprising is that just 24 hours after Zuckerberg published his post committing to preserve election integrity, Facebook took down over 700 pages, groups, and accounts that were engaged in "coordinated inauthentic behavior" on Indian politics ahead of the country's national elections. According to DFRLab, who analyzed these pages, Facebook was in fact quite late to take action against these pages. Per DFRLab, "Last year, AltNews, an open-source fact-checking outlet, reported that a related website called theindiaeye.com was hosted on Silver Touch servers. Silver Touch managers denied having anything to do with the website or the Facebook page, but Facebook's statement attributed the page to 'individuals associated with' Silver Touch. The page was created in 2016. Even after several regional media outlets reported that the page was spreading false information related to Indian politics, the engagements on posts kept increasing, with a significant uptick from June 2018 onward."

Adhering to privacy and data portability

For privacy, Zuckerberg talks about the need to develop a "globally harmonized framework" along the lines of the European Union's GDPR rules for the US and other countries. "I believe a common global framework — rather than regulation that varies significantly by country and state — will ensure that the internet does not get fractured, entrepreneurs can build products that serve everyone, and everyone gets the same protections.", he writes. Which makes us wonder what is stopping him from implementing EU-style GDPR on Facebook globally until a common framework is agreed upon by countries?

Lastly, he adds, "regulation should guarantee the principle of data portability", allowing people to freely port their data across different services. "True data portability should look more like the way people use our platform to sign into an app than the existing ways you can download an archive of your information. But this requires clear rules about who's responsible for protecting information when it moves between services." He also endorses the need for a standard data transfer format by supporting the open source Data Transfer Project.

Why this call for regulation now?

Zuckerberg's post comes at a strategic point in time when Facebook is battling a large number of investigations. The most recent of these is the housing discrimination charge by the U.S. Department of Housing and Urban Development (HUD), which alleged that Facebook is using its advertising tools to violate the Fair Housing Act. Also worth noting is the fact that Zuckerberg's blog post comes weeks after Senator Elizabeth Warren stated that if elected president in 2020, her administration will break up Facebook. Facebook was quick to remove and then restore several ads placed by Warren that called for the breakup of Facebook and other tech giants.

A possible explanation for Zuckerberg's post is that Facebook will now be able to say that it's actually pro-government regulation. This means it can lobby governments to make decisions that would be the most beneficial for the company. It may also set up its own work around political advertising and content moderation as the standard for other industries. By blaming decisions on third parties, it may also possibly reduce scrutiny from lawmakers.
According to a report by Business Insider, just as Zuckerberg posted this news, a large number of Zuckerberg's previous posts and announcements were found to have been deleted from the FB blog. Reached for comment, a Facebook spokesperson told Business Insider that the posts were "mistakenly deleted" due to "technical errors." Whether this was a deliberate mistake or an unintentional one, we don't know.

Zuckerberg's post sparked a huge discussion on Hacker News, with most people drawing negative conclusions from Zuckerberg's writeup. Here are some of the views:

"I think Zuckerberg's intent is to dilute the real issue (privacy) with these other three points. FB has a bad record when it comes to privacy and they are actively taking measures against it. For example, they lobby against privacy laws. They create shadow profiles and they make it difficult or impossible to delete your account."

"harmful content, election integrity, privacy, data portability
Shut down Facebook as a company and three of those four problems are solved."

"By now it's pretty clear, to me at least, that Zuckerberg simply doesn't get it. He could have fixed the issues for over a decade. And even in 2019, after all the evidence of mismanagement and public distrust, he still refuses to relinquish any control of the company. This is a tone-deaf opinion piece."

The Twitterati also shared the same sentiment.

https://twitter.com/futureidentity/status/1112455687169327105
https://twitter.com/BrendanCarrFCC/status/1112150281066819584
https://twitter.com/davidcicilline/status/1112085338342727680
https://twitter.com/DamianCollins/status/1112082926232092672
https://twitter.com/MaggieL/status/1112152675699834880

Ahead of EU 2019 elections, Facebook expands its Ad Library to provide advertising transparency in all active ads
Facebook will ban white nationalism, and separatism content in addition to white supremacy content.
Are the lawmakers and media being really critical towards Facebook?

article-image-installing-a-blockchain-network-using-hyperledger-fabric-and-composertutorial
Savia Lobo
01 Apr 2019
6 min read

Installing a blockchain network using Hyperledger Fabric and Composer[Tutorial]

This article is an excerpt taken from the book Hands-On IoT Solutions with Blockchain written by Maximiliano Santos and Enio Moura. In this book, you'll learn how to work with problem statements and learn how to design your solution architecture so that you can create your own integrated Blockchain and IoT solution. In this article, you will learn how to install your own blockchain network using Hyperledger Fabric and Composer.

We can install the blockchain network using Hyperledger Fabric by many means, including local servers, Kubernetes, IBM Cloud, and Docker. To begin with, we'll explore Docker and Kubernetes.

Setting up Docker

Docker can be installed using information provided on https://www.docker.com/get-started. Hyperledger Composer works with two versions of Docker:

Docker Compose version 1.8 or higher
Docker Engine version 17.03 or higher

If you already have Docker installed but you're not sure about the version, you can find out what the version is by using the following command in the terminal or command prompt:

docker --version

Be careful: many Linux-based operating systems, such as Ubuntu, come with the most recent version of Python (Python 3.5.1). In this case, it's important to get Python version 2.7. You can get it here: https://www.python.org/download/releases/2.7/.

Installing Hyperledger Composer

We're now going to set up Hyperledger Composer and gain access to its development tools, which are mainly used to create business networks. We'll also set up Hyperledger Fabric, which can be used to run or deploy business networks locally. These business networks can be run on Hyperledger Fabric runtimes in some alternative places as well, for example, on a cloud platform. Make sure that you've not installed the tools and used them before. If you have, you'll need to update them using this guide.

Components

To successfully install Hyperledger Composer, you'll need these components ready:

CLI Tools
Playground
Hyperledger Fabric
An IDE

Once these are set up, you can begin with the steps given here.

Step 1 – Setting up CLI Tools

CLI Tools, composer-cli, is a library with the most important operations, such as administrative, operational, and developmental tasks. We'll also install the following tools during this step:

Yeoman: Frontend tool for generating applications
Library generator: For generating application assets
REST server: Utility for running a REST server (local)

Let's start our setup of CLI Tools:

Install CLI Tools:
npm install -g composer-cli@0.20

Install the library generator:
npm install -g generator-hyperledger-composer@0.20

Install the REST server:
npm install -g composer-rest-server@0.20

This will allow for integration with a local REST server to expose your business networks as RESTful APIs.

Install Yeoman:
npm install -g yo

Don't use the su or sudo commands with npm, to ensure that the current user has all permissions necessary to run the environment by itself.

Step 2 – Setting up Playground

Playground gives you a UI on your local machine, running in your browser. This will allow you to display your business networks, browse apps, and edit and test your business networks. Use the following command to install Playground:

npm install -g composer-playground@0.20

Now we can run Hyperledger Fabric.

Step 3 – Hyperledger Fabric

This step will allow you to run a Hyperledger Fabric runtime locally and deploy your business networks:

Choose a directory, such as ~/fabric-dev-servers.
Now get the .tar.gz file, which contains the tools for installing Hyperledger Fabric:

mkdir ~/fabric-dev-servers && cd ~/fabric-dev-servers
curl -O https://raw.githubusercontent.com/hyperledger/composer-tools/master/packages/fabric-dev-servers/fabric-dev-servers.tar.gz
tar -xvf fabric-dev-servers.tar.gz

You've downloaded some scripts that will allow the installation of a local Hyperledger Fabric v1.2 runtime. To download the actual environment Docker images, run the following commands in your user home directory:

cd ~/fabric-dev-servers
export FABRIC_VERSION=hlfv12
./downloadFabric.sh

Well done! Now you have everything required for a typical developer environment.

Step 4 – IDE

Hyperledger Composer allows you to work with many IDEs. Two well-known ones are Atom and VS Code, which both have good extensions for working with Hyperledger Composer. Atom lets you use the composer-atom plugin (https://github.com/hyperledger/composer-atom-plugin) for syntax highlighting of the Hyperledger Composer Modeling Language. You can download this IDE at the following link: https://atom.io/. Also, you can download VS Code at the following link: https://code.visualstudio.com/download.

Installing Hyperledger Fabric 1.3 using Docker

There are many ways to download the Hyperledger Fabric platform; Docker is the most used method. You can use an official repository. If you're using Windows, you'll want to use the Docker Quickstart Terminal for the upcoming terminal commands. If you're using Docker for Windows, follow these instructions: consult the Docker documentation for shared drives, which can be found at https://docs.docker.com/docker-for-windows/#shared-drives, and use a location under one of the shared drives.

Create a directory where the sample files will be cloned from the Hyperledger GitHub repository, and run the following commands:

$ git clone -b master https://github.com/hyperledger/fabric-samples.git

To download and install Hyperledger Fabric on your local machine, you have to download the platform-specific binaries by running the following command:

$ curl -sSL https://goo.gl/6wtTN5 | bash -s 1.1.0

The complete installation guide can be found on the Hyperledger site.

Deploying Hyperledger Fabric 1.3 to a Kubernetes environment

This step is recommended for those of you who have the experience and skills to work with Kubernetes, a cloud environment, and networks, and would like an in-depth exploration of Hyperledger Fabric 1.3. Kubernetes is a container orchestration platform and is available on major cloud providers such as Amazon Web Services, Google Cloud Platform, IBM, and Azure. Marcelo Feitoza Parisi, one of IBM's brilliant cloud architects, has created and published a guide on GitHub on how to set up a Hyperledger Fabric production-level environment on Kubernetes. The guide is available at https://github.com/feitnomore/hyperledger-fabric-kubernetes.

If you've enjoyed reading this post, head over to the book, Hands-On IoT Solutions with Blockchain, to understand how IoT and blockchain technology can help to solve the modern food chain and its current challenges.

IBM announces the launch of Blockchain World Wire, a global blockchain network for cross-border payments
Google expands its Blockchain search tools, adds six new cryptocurrencies in BigQuery Public Datasets
Blockchain governance and uses beyond finance – Carnegie Mellon university podcast
article-image-understanding-the-cost-of-a-cybersecurity-attack-the-losses-organizations-face
Savia Lobo
31 Mar 2019
15 min read

Understanding the cost of a cybersecurity attack: The losses organizations face

The average cost of a cybersecurity attack has been increasing over time. The rewards to hackers in cyberheists have also been increasing, and this has been motivating them to come up with even better tools and techniques in order to allow them to steal more money and data. Several cybersecurity companies have listed their estimates for the average costs of cyber attacks in 2017/2018.

This article is an excerpt taken from the book, Hands-On Cybersecurity for Finance written by Dr. Erdal Ozkaya and Milad Aslaner. In this book you will learn how to successfully defend your system against common cyber threats, making sure your financial services are a step ahead in terms of security. In this article, you will learn about the different losses an organization faces following a cyber attack.

According to IBM—a tech giant both in hardware and software products—the average cost of a cybersecurity breach has been increasing and is now at $3,860,000. This is a 6.4% increase over their estimate for 2017. The company also estimates that the cost of each stolen record that has sensitive information in 2018 is at $148, which is a rise of 4.8% compared to their estimate for 2017. The following is IBM's report on the cost of a cyber breach in 2018: This year's study reports the global average cost of a data breach is up 6.4% over the previous year to $3,860,000. The average cost for each lost or stolen record containing sensitive and confidential information also increased by 4.8% year over year to $148.

The cost of different cyber attacks

While it might be easy to say that the average cost of a hack is $3,000,000, not all types of attacks will be around that figure. Some attacks are more costly than others. Costs also differ with the frequency of an attack against an organization. Consequently, it's good to look at how costs vary among common cyber attacks. The following screenshot is Accenture's graphical representation of the costs of the most common attacks based on their frequency in 2016 and 2017. This data was collected from 254 companies around the world:

To interpret this data, one should note that frequency was taken into consideration. Consequently, the most frequent attacks had higher averages. As can be seen from the graph, insider threats are the most frequent and costly threats to an organization. Attacks related to malicious insiders led to losses averaging $173,516 in 2017. The reason for this high cost is the amount of information that insider threats possess when carrying out an attack. Since they've worked with the victim company for some time, they know exactly what to target and are familiar with which security loopholes to exploit. This isn't a guessing game but an assured attack with a clear aim and a preplanned execution. According to the graph by Accenture, malicious insiders were followed by denial of service (DoS) attacks at an annual cost of $129,450, and then malicious code at an annual cost of $112,419.

However, when frequency is not considered, there are several changes to the report, as can be seen from the following graphical representation: This graph is representative of the situation in the real world. As can be seen, malware attacks are collectively the costliest. Organizations hit by malware lose an average of $2,400,000 per attack. This is because of the establishment of an underground market that supports the quick purchase of new malware and the huge number of unpatched systems.
Malware has also become more sophisticated due to highly skilled black hats selling their malware on the dark web at affordable prices. Therefore, script kiddies have been getting highly effective malware that they can deploy in attacks. Web-based attacks come in second at $2,000,000, while DoS attacks are ranked third at $1,565,000. DoS attacks are ranked high due to the losses that they can cause a company to incur.

Breakdown of the costs of a cyber attack

The direct financial losses that have been discussed are not solely a result of money stolen during an attack or records copied and advertised for sale on the deep web. All cyber attacks come bundled with other losses to the company, some of which are felt even years after the attack has happened. This is why some attacks that do not involve the direct theft of money have been ranked among the most costly attacks. For instance, DoS does not involve the theft of money from an organization, yet each DDoS attack is said to average about $1,500,000. This is due to the other costs that come with the attacks. The following is a breakdown of the costs that come with a cyber attack.

Production loss

During a cyber attack, productive processes in some organizations will come to a halt. For instance, an e-commerce shop will be unable to keep its business processes running once it's attacked by a DDoS attack or a web-based attack. Organizations have also had their entire networks taken down during attacks, preventing any form of electronic communication from taking place. In various industries, cyber attacks can take a toll on production systems. Weaponized cyber attacks can even destroy industrial machines by messing with hardware controls. For instance, the Stuxnet cyberattack against Iran's nuclear facility led to the partial destruction of the facility. This shows the effect that an attack can have even behind highly secured facilities.

With the looming cyber warfare and worsening political tensions between countries, it can only be feared that there will be a wave of cyber attacks targeted at key players in the industrial sector. There has been a radical shift in hacking tendencies in that hackers are no longer just looking to embezzle funds or extort money from companies. Instead, they are causing maximum damage by attacking automated processes and systems that control production machines. Cyber attacks are heading into a dangerous phase where they are able to be weaponized by competitors or enemy states, enabling them to cause physical damage and even the loss of life. There are fears that some states already have the capabilities to take over smart grids and traffic lights in US cities. ISIS, a terrorist group, was once also reported to be trying to hack into the US energy grid. In any case, production losses are moving to new heights and are becoming more costly. A ransomware attack called WannaCry was able to encrypt many computers used in industrial processes. Some hospitals were affected and critical computers, such as those used to maintain life support systems or schedule operations in the healthcare facilities, were no longer usable. This led to the ultimate loss: human life. Other far-reaching impacts are environmental impacts, regulatory risks, and criminal liability on the side of the victim.

Economic losses

Cybercrime has become an economic disaster in many countries. It is estimated that at least $600,000,000,000 is drained from the global economy through cybercrime annually.
This is an enormous figure and its impact is already being felt; the loss has affected many areas, including jobs. Cybercrime is hurting the economy and, in turn, hurting the job market (https://www.zdnet.com/article/cybercrime-drains-600-billion-a-year-from-the-global-economy-says-report/): Global businesses are losing the equivalent of nearly 1% of global Gross Domestic Product (GDP) a year to cybercrime, and it's impacting job creation, innovation, and economic growth. So says a report from cybersecurity firm McAfee and the Center for Strategic and International Studies (CSIS), which estimates that cybercrime costs the global economy $600,000,000,000 a year—up from a 2014 study which put the figure at $445,000,000,000.

Companies are being targeted with industrial espionage and their business secrets are being stolen by overseas competitors. In the long run, companies have been facing losses due to the flooding of markets with similar but cheap and substandard products. This has forced companies that were once growing fast, opening multiple branches, and hiring thousands, to start downsizing and retrenching their employees. In the US, it's estimated that cybercrime has already caused the loss of over 200,000 jobs. The loss of jobs and the drainage of money from a country's economy has made cyber crime a major concern globally. However, it might be too late for the loss to be averted. It's said that many industries have already had their business secrets stolen. In the US, it's estimated that a large number of organizations are not aware that they have been breached and had their business secrets stolen. Therefore, the economic loss might continue for a while.

In 2015, then US president Barack Obama agreed to a digital truce to put an end to the hacking of companies for trade secrets because US companies were losing too much data. The following is a snippet from the BBC (https://www.bbc.co.uk/news/world-asia-china-34360934) about the agreement between Xi and Obama: US President Barack Obama and Chinese President Xi Jinping have said they will take new steps to address cybercrime. Speaking at a joint news conference at the White House, Mr. Obama said they had agreed that neither country would engage in cyber economic espionage.

Political tensions with China due to Donald Trump's presidency are threatening this truce, and an increase in hacking could occur against US companies if these tensions run too high. Unlike Obama, Trump is taking on China head-on and has been hinting at retaliatory moves, such as cutting off China's tech companies, such as Huawei, from the US market. The US arrests of Huawei employees are likely to cause retaliatory attacks from the Chinese; China may hack more US companies and, ultimately, the two countries might enter into a cyber war.

Damaged brand and reputation

An organization will spend a lot of money on building its brand in order to keep a certain market share and also to keep investors satisfied. Without trusted brand names, some companies could fall into oblivion. Cyber attacks tend to attract negative press and this leads to damaging a company's brand and reputation. Investors are put in a frenzy of selling their shares to prevent further loss in value. Shareholders that are left holding onto their shares are unsure whether they will ever recover the money trapped in their shares. Consequently, customers stop trusting the victim company's goods and services.
Competitors then take advantage of the situation and intensify marketing in order to win over the customers and investors of the victim company. This could happen within a day or a week due to an unpreventable cyber attack. Investors will always want to keep their money with companies that they trust, and customers will always want to buy from companies that they trust. When a cyber attack breaks this trust, both investors and customers run away.

Damage to a brand is very costly. A good example is Yahoo, where, after three breaches, Verizon purchased the company for $4,000,000,000 less than the amount offered in the previous year, before the hacks were public knowledge. Therefore, in a single company, almost $4,000,000,000 was lost due to the brand-damaging effects of a cyber attack. The class-action lawsuits against Yahoo also contributed to its lower valuation.

Loss of data

Despite the benefits, organizations are said to have been sluggishly adopting cloud-based services due to security fears. Those that have bought into the idea of the cloud have mostly done this halfway, not risking their mission-critical data to cloud vendors. Many organizations spend a lot of resources on protecting their systems and networks from the potential loss of data. The reason that they go through all of this trouble is so that they don't lose their valuable data, such as business secrets. If a hacker were to discover the secret code used to securely unlock iPhones, they could make a lot of money selling that code to underground markets. This is because such information is of high value, to the point where Apple was unwilling to give authorities a code to compromise the lock protection and aid with the investigations of terrorists. It wasn't because Apple isn't supportive of the war against terrorism; it was instead a decision made to protect all Apple users. Here is a snippet from an article (https://www.theguardian.com/technology/2016/feb/22/tim-cook-apple-refusal-unlock-iphone-fbi-civil-liberties) on Apple's refusal to unlock an iPhone for the FBI:

"Apple boss Tim Cook told his employees on Monday that the company's refusal to cooperate with a US government to unlock an iPhone used by Syed Farook, one of the two shooters in the San Bernardino attack, was a defense of civil liberties."

No company will trust a third party with such sensitive information. With Apple, if a hacker were to steal documentation relating to the safety measures in Apple devices and their shortcomings, the company would face a fall in share prices and a loss of customers. The loss of data is even more sensitive in institutions that offer far more sensitive services. For instance, in June 2018, it was reported that a US Navy contractor lost a large amount of data to hackers. Among the sensitive data stolen were details about undersea warfare, plans of supersonic anti-ship missiles, and other armament and defense details of US ships and submarines.

Fines, penalties, and litigations

The loss of data in any cyber attack is felt by all organizations, particularly if the data lost is sensitive in nature. The loss of these types of data comes with many more losses, in the form of fines, penalties, and litigations. If a company is hacked, instead of receiving consolation, it's dragged into court cases and slapped with heavy fines and penalties.
Several regulations have been put in place to ensure the protection of sensitive, personally-identifiable information (PII) by the organizations that collect it. This is due to the impact of the theft of such information. The demand for PII is on the rise on the dark web. This is because PII is valuable in different ways. If, for instance, hackers were to discover that some of the data stolen from a hospital included the health information of a politician, they could use this data to extort huge amounts of money from the politician. In another scenario, hackers can use PII to social engineer the owners. Armed with personal details, such as name, date of birth, real physical address and current contact details, it's very easy for a skilled social engineer to scam a target. This is part of the reason why governments have ensured that there are very tough laws to protect PII.

Losses due to recovery techniques

After an attack, an organization will have to do everything it can to salvage itself. The aftermath of a serious attack is not pretty, and lots of funds have to be used to clean up the mess created by the hackers. Some companies prefer to do a complete audit of their information systems to find out the exact causes or influential factors in the attack. Post-breach activities, such as IT audits, can unearth important information that can be used to prevent the same type of attack from being executed. Some companies prefer to pay for digital forensics experts to identify the cause of an attack as well as track the hackers or the data and money stolen. Digital forensics is sometimes even able to recover some of the lost assets or funds. For instance, Ubiquiti Networks was hacked in 2015 and $46,000,000 was stolen through social engineering. Using digital forensics, $8,000,000 was recovered in one of the overseas accounts that the hackers requested the money be sent to. Sometimes all the stolen money can be recovered, but in most instances, that's not the case. The following is an article on the recovery of $8,100,000 by Ubiquiti Networks after an attack that stole $46,000,000:

"The incident involved employee impersonation and fraudulent requests from an outside entity targeting the Company's finance department. This fraud resulted in transfers of funds aggregating $46,700,000 held by a Company subsidiary incorporated in Hong Kong to other overseas accounts held by third parties.

"Ubiquiti says it has so far managed to recover $8,100,000 of the lost funds, and it expects to regain control of another $6,800,000. The rest? Uncertain."

In short, the costs associated with a cyber attack are high and charges can even continue for several years after the actual attack happens. The current estimate of each attack being around $3,000,000 per victim organization is a mere statistic. Individual companies suffer huge losses. However, the costs associated with cybersecurity are not solely tied to the negative aftermath of an attack. Cybersecurity products are an added, but necessary, expenditure for organizations. Analysts say that 75% of cyber attacks happen to people or organizations that don't have any cybersecurity products.

If you've enjoyed reading this article, head over to the book Hands-On Cybersecurity for Finance to know more about the different types of threat actor groups and their motivations.
Defensive Strategies Industrial Organizations Can Use Against Cyber Attacks
Hydro cyber attack shuts down several metal extrusion plants
5 nation joint Activity Alert Report finds most threat actors use publicly available tools for cyber attacks

article-image-knowing-the-threat-actors-behind-a-cyber-attack
Savia Lobo
30 Mar 2019
7 min read

Knowing the threat actors behind a cyber attack

This article is an excerpt taken from the book, Hands-On Cybersecurity for Finance written by Dr. Erdal Ozkaya and Milad Aslaner. In this book you will learn how to successfully defend your system against common cyber threats, making sure your financial services are a step ahead in terms of security. In this article, you will understand the different types of threat actor groups and their motivations.

The attackers behind cyber attacks can be classified into the following categories:

Cybercriminals
Cyber terrorists
Hacktivists

"What really concerns me is the sophistication of the capability, which is becoming good enough to really threaten parts of our critical infrastructure, certainly in the financial, banking sector." – Director of Europol Robert Wainwright

Hacktivism

Hacktivism, as defined by the Industrial Control Systems Cyber Emergency Response Team (ICS-CERT), refers to threat actors that depend on propaganda rather than damage to critical infrastructures. Their goal is to support their own political agenda, which can vary between anti-corruption, religious, environmental, or anti-establishment concerns. Their sub-goals are propaganda and causing damage to achieve notoriety for their cause. One of the most prominent hacktivist threat actor groups is Anonymous. Anonymous is known primarily for their distributed denial of service (DDoS) attacks on governments and the Church of Scientology. The following screenshot shows "the man without a head," which is commonly used by Anonymous as their emblem:

Hacktivists target companies and governments based on the organization's mission statement or ethics. Given that the financial services industry is responsible for economic wealth, they tend to be a popular target for hacktivists. The ideologies held by hacktivists can vary, but at their core, they focus on bringing attention to social issues such as warfare or what they consider to be illegal activities. To spread their beliefs, they choose targets that allow them to spread their message as quickly as possible. The primary reason why hacktivists choose organizations in the financial services industry is that these organizations typically have a large user base, allowing them to raise the profile of their beliefs very quickly once they have successfully breached the organization's security controls.

Case study – Dakota Access Pipeline

The Dakota Access Pipeline (DAPL) was a 2016 construction project for a 1,172-mile-long pipeline that spanned three states in the US. Native American tribes were protesting against the DAPL because of the fear that it would damage sacred grounds and drinking water. Shortly after the protests began, the hacktivist group Anonymous publicly announced their support under the name OpNoDAPL. During the construction, Anonymous launched numerous DDoS attacks against the organizations involved in the DAPL. Anonymous leaked the personal information of employees that were responsible for the DAPL and threatened that this would continue if they did not quit. The following screenshot shows how this attack spread on Twitter:

Case study – Panama Papers

In 2015, an offshore law firm called Mossack Fonseca had 11.5 million of their documents leaked. These documents contained confidential financial information for more than 214,488 offshore entities under what was later known as the Panama Papers. In the leaked documents, several national leaders, politicians, and industry leaders were identified, including a trail to Vladimir Putin.
The following diagram shows how much was exposed as part of this attack:

While there is not much information available on how the cyber attack occurred, various security researchers have analyzed the operation. According to the WikiLeaks post, which claims to show a client communication from Mossack Fonseca, they confirm that there was a breach of their "email server". Considering the size of the data leak, it is believed that a direct attack occurred on the email servers.

Cyber terrorists

Extremist and terrorist organizations such as Al Qaeda and Islamic State of Iraq and Syria (ISIS) are using the internet to distribute their propaganda, recruit new terrorists, and communicate via this medium. An example of this is the 2008 attack in Mumbai, in which one of the gunmen confirmed that they used Google Earth to familiarize themselves with the locations of buildings. Cyber terrorism is an extension of traditional terrorism in cyber space.

Case study – Operation Ababil

In 2012, the Islamic group Izz ad-Din al-Qassam Cyber Fighters—which is a military wing of Hamas—attacked a series of American financial institutions. On September 18th 2012, this threat actor group confirmed that they were behind the cyber attack and justified it due to the relationship of the United States government with Israel. They also claimed that this was a response to the Innocence of Muslims video released by the American pastor Terry Jones. As part of a DDoS attack, they targeted the New York Stock Exchange as well as banks such as J.P. Morgan Chase.

Cyber criminals

Cyber criminals are either individuals or groups of hackers who use technology to commit crimes in the digital world. The primary driver of cyber criminals is financial gain and/or service disruption. Cyber criminals use computers in three broad ways:

Select computers as their target: These criminals attack other people's computers to perform malicious activities, such as spreading viruses, data theft, identity theft, and more.
Use computers as their weapon: They use computers to carry out "conventional crime", such as spam, fraud, illegal gambling, and more.
Use computers as an accessory: They use computers to save stolen or illegal data.

The following provides the larger picture so we can understand how cyber criminals have penetrated the finance sector and wreaked havoc: Becky Pinkard, vice president of service delivery and intelligence at Digital Shadows Ltd, states that "Attackers can harm the bank by adding or subtracting a zero with every balance, or even by deleting entire accounts".

Case study – FIN7

On August 1st 2018, the United States District Attorney's Office for the Western District of Washington announced the arrest of several members of the cyber criminal organization FIN7, which had been tracked since 2015. To this date, security researchers believe that FIN7 is one of the largest threat actor groups in the financial services industry. Combi Security is a FIN7 shelf company. The screenshot presented here shows a phishing email sent by FIN7 to victims claiming it was sent by the US Food and Drug Administration (FDA).

Case study – Carbanak APT Attack

Carbanak is an advanced persistent threat (APT) attack that is believed to have been executed by the threat actor group Cobalt Strike Group in 2014. In this operation, the threat actor group was able to generate a total financial loss for victims of more than 1 billion US dollars.
The following depicts how the Carbanak cyber-gang stole $1bn by targeting a bank:

Case study – OurMine operation

In 2016, the threat actor group OurMine, suspected to operate out of Saudi Arabia, conducted a DDoS attack against HSBC's websites hosted in the USA and UK. The following screenshot shows the communication by the threat actor:

The result of the DDoS attack was that the HSBC websites for the US and the UK were unavailable. The following screenshot shows the HSBC USA website after the DDoS attack:

Summary

The financial services industry is one of the most frequently targeted industries for cybercrime. In this article, you learned about the different threat actor groups and their motivations. Understanding them is important in order to build and execute a successful cybersecurity strategy. Head over to the book, Hands-On Cybersecurity for Finance, to learn more about the costs associated with cyber attacks and cybersecurity.

RSA Conference 2019 Highlights: Top 5 cybersecurity products announced
Security experts, Wolf Halton and Bo Weaver, discuss pentesting and cybersecurity [Interview]
Hackers are our society's immune system – Keren Elazari on the future of Cybersecurity

article-image-why-did-mcdonalds-acqui-hire-300-million-machine-learning-startup-dynamic-yield
Fatema Patrawala
29 Mar 2019
7 min read
Save for later

Why did McDonalds acqui-hire $300 million machine learning startup, Dynamic Yield?

Mention McDonald's to someone today, and they're more likely to think of the Big Mac than of big data. But that could soon change, as the fast-food giant embraces machine learning with plans to become a tech innovator in a fittingly super-sized way.

McDonald's stunned a lot of people when it announced its biggest acquisition in 20 years, one that reportedly cost it over $300 million: it plans to acquire Dynamic Yield, a New York-based startup that provides retailers with algorithmically driven "decision logic" technology. When you add an item to an online shopping cart, "decision logic" is the tech that nudges you about what other customers bought as well. Dynamic Yield's client list includes blue-chip retail clients like Ikea, Sephora, and Urban Outfitters.

McDonald's vetted around 30 firms offering similar personalization engine services and landed on Dynamic Yield, which has recently been valued in the hundreds of millions of dollars; people familiar with the details of the McDonald's offer put it at over $300 million. This makes it the company's largest purchase, as confirmed in a tweet by McDonald's CEO Steve Easterbrook.

https://twitter.com/SteveEasterbrk/status/1110313531398860800

The burger giant can certainly afford it; in 2018 alone it tallied nearly $6 billion of net income, and it ended the year with free cash flow of $4.2 billion.

McDonald's, a food-tech innovator from the start

Over the last several years, McDonald's has invested heavily in technology, bringing stores up to date with self-serve kiosks. The company also launched an app and partnered with Uber Eats in that time, in addition to making a number of infrastructure improvements. It even relocated its headquarters less than a year ago from the suburbs to Chicago's vibrant West Town neighborhood, in a bid to attract young talent.

Collectively, McDonald's serves around 68 million customers every single day. The majority of them order at the drive-thru and never get out of their cars; they place and pick up their orders at the window. And that's where McDonald's is planning to deploy Dynamic Yield's tech first.

"What we hadn't done is begun to connect the technology together, and get the various pieces talking to each other," says Easterbrook. "How do you transition from mass marketing to mass personalization? To do that, you've really got to unlock the data within that ecosystem in a way that's useful to a customer."

Here's what that looks like in practice: when you drive up to place your order at a McDonald's today, a digital display greets you with a handful of banner items or promotions. As you inch up toward the ordering area, you eventually get to the full menu. Both of these, as currently implemented, are largely static, aside from obvious changes like rotating in new offers or switching over from breakfast to lunch. But in a pilot program at a McDonald's restaurant in Miami, powered by Dynamic Yield, those displays have taken on new dexterity.

In the new McDonald's machine-learning paradigm, that particular display screen will show customers which other items have been popular at that location and prompt them with potential upsells. Thanks for your Happy Meal order; maybe you'd like a Sprite to go with it.

"We've never had an issue in this business with a lack of data," says Easterbrook.
"It's drawing the insight and the intelligence out of it."

Revenue likely to get a boost from the acquisition

McDonald's hasn't shared any specific insights gleaned so far, or numbers around the personalization engine's effect on sales. But it's not hard to imagine some of the possible scenarios. If someone orders two Happy Meals at 5 o'clock, for instance, that's probably a parent ordering for their kids; highlight a coffee or snack for them, and they might decide to treat themselves to a pick-me-up. And as with any machine-learning system, the real benefits will likely come from the unexpected. While customer satisfaction may be the goal, the avenues McDonald's takes to get there will increase revenues along the way.

Customer personalization is another goal to achieve

As you might expect, McDonald's didn't spend over $300 million on a machine-learning company only to juice up its drive-thru sales. An important part of the plan is figuring out how to leverage the "personalization" part of a personalization engine. Fine-tuned insights at the store level are one thing, but Easterbrook envisions something even more granular. "If customers are willing to identify themselves—there's all sorts of ways you can do that—we can be even more useful to them, because now we call up their favorites," according to Easterbrook, who stresses that privacy is paramount. As for what form that might ultimately take, Easterbrook raises a handful of possibilities. McDonald's already uses geofencing around its stores to know when a mobile app customer is approaching, so it can prepare their order accordingly.

The downside of this tech integration

When you know you have to change so much in your company, it's easy to forget some of the consequences. You race to implement all the new tech and don't adequately think about what your employees might make of it all. This seems to be happening to McDonald's. As the fast-food chain tries to catch up with food trends that have been established for some time, its employees don't seem happy about it. As Bloomberg reports, the more McDonald's introduces fresh beef, touchscreen ordering, and delivery, the more its employees are thinking: "This is all too much work."

One McDonald's franchisee put it bluntly at the beginning of this year: "Employee turnover is at an all-time high for us," he said, adding, "Our restaurants are way too stressful, and people do not want to work in them." Workers are walking away rather than dealing with new technologies and menu options. The result: customers will wait longer. Already, drive-through times at McDonald's slowed to 239 seconds last year -- more than 30 seconds slower than in 2016, according to QSR magazine. Turnover at U.S. fast-food restaurants jumped to 150%, meaning a store employing 20 workers would go through 30 in one year.

Against this backdrop, it is not surprising that McDonald's announced to the National Restaurant Association on Tuesday that it will no longer participate in lobbying efforts against minimum-wage hikes at the federal, state, or local level. The move makes sense when the company is already paying low wages and an all-time-high attrition rate looms as a bigger problem. Of course, technology is supposed to solve all the world's problems while simultaneously eliminating the need for many people. It looks like McDonald's has put all its eggs in the machine learning and automation basket.
Would it not be a rich irony if people saw the technology being introduced and walked out, deciding it was all too much trouble for just a burger?

25 Startups using machine learning differently in 2018: From farming to brewing beer to elder care
An AI startup now wants to monitor your kids' activities to help them grow 'securly'
Microsoft acquires AI startup Lobe, a no code visual interface tool to build deep learning models easily
article-image-brett-lantz-on-implementing-a-decision-tree-using-c5-0-algorithm-in-r
Packt Editorial Staff
29 Mar 2019
9 min read
Save for later

Brett Lantz on implementing a decision tree using C5.0 algorithm in R

Decision tree learners are powerful classifiers that utilize a tree structure to model the relationships among the features and the potential outcomes. This structure earned its name because it mirrors the way a literal tree begins at a wide trunk and splits into narrower and narrower branches as it is followed upward. In much the same way, a decision tree classifier uses a structure of branching decisions that channel examples into a final predicted class value. In this article, we demonstrate the implementation of a decision tree using the C5.0 algorithm in R.

This article is taken from the book Machine Learning with R, Fourth Edition, written by Brett Lantz. This 10th Anniversary Edition of the classic R data science book is updated to R 4.0.0 with newer and better libraries. It features several new chapters that reflect the progress of machine learning in the last few years and helps you build your data science skills and tackle more challenging problems.

There are numerous implementations of decision trees, but the most well-known is the C5.0 algorithm. This algorithm was developed by computer scientist J. Ross Quinlan as an improved version of his prior algorithm, C4.5 (C4.5 itself is an improvement over his Iterative Dichotomiser 3 (ID3) algorithm). Although Quinlan markets C5.0 to commercial clients (see http://www.rulequest.com/ for details), the source code for a single-threaded version of the algorithm was made public, and it has therefore been incorporated into programs such as R.

The C5.0 decision tree algorithm

The C5.0 algorithm has become the industry standard for producing decision trees because it does well for most types of problems directly out of the box. Compared to other advanced machine learning models, the decision trees built by C5.0 generally perform nearly as well but are much easier to understand and deploy. Additionally, as shown in the following lists, the algorithm's weaknesses are relatively minor and can be largely avoided.

Strengths:
An all-purpose classifier that does well on many types of problems.
Highly automatic learning process, which can handle numeric or nominal features, as well as missing data.
Excludes unimportant features.
Can be used on both small and large datasets.
Results in a model that can be interpreted without a mathematical background (for relatively small trees).
More efficient than other complex models.

Weaknesses:
Decision tree models are often biased toward splits on features having a large number of levels.
It is easy to overfit or underfit the model.
Can have trouble modeling some relationships due to reliance on axis-parallel splits.
Small changes in training data can result in large changes to decision logic.
Large trees can be difficult to interpret and the decisions they make may seem counterintuitive.

To keep things simple, our earlier decision tree example ignored the mathematics involved with how a machine would employ a divide and conquer strategy. Let's explore this in more detail to examine how this heuristic works in practice.

Choosing the best split

The first challenge that a decision tree will face is to identify which feature to split upon. In the previous example, we looked for a way to split the data such that the resulting partitions contained examples primarily of a single class. The degree to which a subset of examples contains only a single class is known as purity, and any subset composed of only a single class is called pure.
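To make the idea of purity concrete, here is a minimal R sketch (not from the book; the function name and the example labels are illustrative assumptions) that reports the class proportions of a partition and flags whether it is pure:

# Report the class proportions of a partition and flag whether it is pure,
# that is, whether it contains examples of only a single class.
partition_purity <- function(classes) {
  props <- prop.table(table(classes))   # proportion of each class level
  list(proportions = props,
       majority = max(props),           # share of the most common class
       pure = length(props) == 1)       # TRUE only when a single class remains
}

# Illustrative partition: 60 percent "red" and 40 percent "white" examples
partition_purity(c(rep("red", 6), rep("white", 4)))

A pure partition would return a single proportion of 1 and pure = TRUE.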
There are various measurements of purity that can be used to identify the best decision tree splitting candidate. C5.0 uses entropy, a concept borrowed from information theory that quantifies the randomness, or disorder, within a set of class values. Sets with high entropy are very diverse and provide little information about other items that may also belong in the set, as there is no apparent commonality. The decision tree hopes to find splits that reduce entropy, ultimately increasing homogeneity within the groups.

Typically, entropy is measured in bits. If there are only two possible classes, entropy values can range from 0 to 1. For n classes, entropy ranges from 0 to log2(n). In each case, the minimum value indicates that the sample is completely homogeneous, while the maximum value indicates that the data are as diverse as possible, and no group has even a small plurality. In mathematical notation, entropy is specified as:

Entropy(S) = -\sum_{i=1}^{c} p_i \log_2(p_i)

In this formula, for a given segment of data (S), the term c refers to the number of class levels, and p_i refers to the proportion of values falling into class level i. For example, suppose we have a partition of data with two classes: red (60 percent) and white (40 percent). We can calculate the entropy as:

> -0.60 * log2(0.60) - 0.40 * log2(0.40)
[1] 0.9709506

We can visualize the entropy for all possible two-class arrangements. If we know the proportion of examples in one class is x, then the proportion in the other class is (1 - x). Using the curve() function, we can then plot the entropy for all possible values of x:

> curve(-x * log2(x) - (1 - x) * log2(1 - x),
    col = "red", xlab = "x", ylab = "Entropy", lwd = 4)

This results in the following figure:

The total entropy as the proportion of one class varies in a two-class outcome

As illustrated by the peak in entropy at x = 0.50, a 50-50 split results in the maximum entropy. As one class increasingly dominates the other, the entropy reduces to zero.

To use entropy to determine the optimal feature to split upon, the algorithm calculates the change in homogeneity that would result from a split on each possible feature, a measure known as information gain. The information gain for a feature F is calculated as the difference between the entropy in the segment before the split (S1) and the entropy in the partitions resulting from the split (S2):

InfoGain(F) = Entropy(S_1) - Entropy(S_2)

One complication is that after a split, the data is divided into more than one partition. Therefore, the function to calculate Entropy(S2) needs to consider the total entropy across all of the partitions. It does this by weighting each partition's entropy according to the proportion of all records falling into that partition. This can be stated in a formula as:

Entropy(S) = \sum_{i=1}^{n} w_i \times Entropy(P_i)

In simple terms, the total entropy resulting from a split is the sum of the entropies of each of the n partitions, weighted by the proportion of examples falling into that partition (w_i).

The higher the information gain, the better a feature is at creating homogeneous groups after a split on that feature. If the information gain is zero, there is no reduction in entropy from splitting on the feature. On the other hand, the maximum information gain is equal to the entropy prior to the split, which would imply that the entropy after the split is zero and that the split results in completely homogeneous groups. The previous formulas assume nominal features, but decision trees use information gain for splitting on numeric features as well.
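To tie the formulas above to working code, here is a small R sketch (an illustrative example, not code from the book; the helper names, the label vector, and the candidate feature are assumptions) that computes entropy and the information gain of a split on a nominal feature:

# Entropy of a vector of class labels, measured in bits
entropy <- function(classes) {
  p <- prop.table(table(classes))       # class proportions p_i
  -sum(p * log2(p))                     # -sum of p_i * log2(p_i)
}

# Information gain of splitting the labels by a nominal feature:
# the parent entropy minus the weighted entropy of the resulting partitions.
info_gain <- function(classes, feature) {
  parts <- split(classes, feature)               # partitions induced by the split
  w <- sapply(parts, length) / length(classes)   # weights w_i
  entropy(classes) - sum(w * sapply(parts, entropy))
}

# Illustrative data: the 60/40 red-and-white partition from above,
# split by a made-up two-level feature
labels  <- c(rep("red", 6), rep("white", 4))
feature <- c(rep("A", 5), rep("B", 5))
entropy(labels)              # roughly 0.97 bits, matching the earlier calculation
info_gain(labels, feature)   # the reduction in entropy achieved by the split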
To split on a numeric feature, a common practice is to test various splits that divide the values into groups greater than or less than a threshold. This reduces the numeric feature to a two-level categorical feature, which allows information gain to be calculated as usual. The numeric cut point yielding the largest information gain is chosen for the split.

Note: Though it is used by C5.0, information gain is not the only splitting criterion that can be used to build decision trees. Other commonly used criteria are the Gini index, the chi-squared statistic, and the gain ratio. For a review of these (and many more) criteria, refer to An Empirical Comparison of Selection Measures for Decision-Tree Induction, Mingers, J, Machine Learning, 1989, Vol. 3, pp. 319-342.

Pruning the decision tree

As mentioned earlier, a decision tree can continue to grow indefinitely, choosing splitting features and dividing into smaller and smaller partitions until each example is perfectly classified or the algorithm runs out of features to split on. However, if the tree grows overly large, many of the decisions it makes will be overly specific and the model will be overfitted to the training data. The process of pruning a decision tree involves reducing its size so that it generalizes better to unseen data.

One solution to this problem is to stop the tree from growing once it reaches a certain number of decisions or when the decision nodes contain only a small number of examples. This is called early stopping or pre-pruning the decision tree. As the tree avoids doing needless work, this is an appealing strategy. However, one downside to this approach is that there is no way to know whether the tree will miss subtle but important patterns that it would have learned had it grown to a larger size.

An alternative, called post-pruning, involves growing a tree that is intentionally too large and pruning leaf nodes to reduce the size of the tree to a more appropriate level. This is often a more effective approach than pre-pruning because it is quite difficult to determine the optimal depth of a decision tree without growing it first. Pruning the tree later on allows the algorithm to be certain that all of the important data structures were discovered.

Note: The implementation details of pruning operations are very technical and beyond the scope of this book. For a comparison of some of the available methods, see A Comparative Analysis of Methods for Pruning Decision Trees, Esposito, F, Malerba, D, Semeraro, G, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, Vol. 19, pp. 476-491.

One of the benefits of the C5.0 algorithm is that it is opinionated about pruning: it takes care of many of the decisions automatically using fairly reasonable defaults. Its overall strategy is to post-prune the tree. It first grows a large tree that overfits the training data. Later, the nodes and branches that have little effect on the classification errors are removed. In some cases, entire branches are moved further up the tree or replaced by simpler decisions. These processes of grafting branches are known as subtree raising and subtree replacement, respectively.

Getting the right balance of overfitting and underfitting is a bit of an art, but if model accuracy is vital, it may be worth investing some time with various pruning options to see if they improve performance on the test dataset.
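For readers who want to experiment with those pruning options, the following is a minimal sketch using the C50 package, which implements the C5.0 algorithm in R; the credit and credit_test data frames and the default outcome column are hypothetical stand-ins for your own training and test data, and the option values shown are only starting points:

library(C50)    # implementation of the C5.0 algorithm for R

# Pruning-related options are collected in C5.0Control():
# CF is the confidence factor (smaller values prune more aggressively),
# and minCases is the minimum number of examples allowed in a leaf.
ctrl <- C5.0Control(CF = 0.25, minCases = 2)

# Hypothetical data: a data frame `credit` with a factor column `default`,
# plus a held-out `credit_test` set for evaluation.
model <- C5.0(default ~ ., data = credit, control = ctrl)

summary(model)                                # inspect the tree and training errors
pred <- predict(model, newdata = credit_test)
mean(pred == credit_test$default)             # accuracy on the held-out data

Trying a few values of CF and minCases and comparing the held-out accuracy is a simple way to see how much pruning helps on your data.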
To summarize, decision trees are widely used due to their high accuracy and their ability to formulate a statistical model in plain language. Here, we looked at C5.0, a highly popular and easily configurable decision tree algorithm. The major strength of the C5.0 algorithm over other decision tree implementations is that it is very easy to adjust the training options. Harness the power of R to build flexible, effective, and transparent machine learning models with Brett Lantz's latest book, Machine Learning with R, Fourth Edition.

Dr.Brandon explains Decision Trees to Jon
Building a classification system with Decision Trees in Apache Spark 2.0
Implementing Decision Trees

article-image-amazon-joins-nsf-funding-fairness-ai-public-outcry-big-tech-ethicswashing
Sugandha Lahoti
27 Mar 2019
5 min read
Save for later

Amazon joins NSF in funding research exploring fairness in AI amidst public outcry over big tech #ethicswashing

On the heels of Stanford's HCAI Institute (which, mind you, received public backlash for its non-representative faculty makeup), Amazon is collaborating with the National Science Foundation (NSF) to develop systems based on fairness in AI. Amazon and the NSF will each be investing $10M in artificial intelligence research grants over a three-year period.

The official announcement was made by Prem Natarajan, VP of natural understanding in the Alexa AI group, who wrote in a blog post: "With the increasing use of AI in everyday life, fairness in artificial intelligence is a topic of increasing importance across academia, government, and industry. Here at Amazon, the fairness of the machine learning systems we build to support our businesses is critical to establishing and maintaining our customers' trust."

Per the blog post, Amazon will be collaborating with the NSF to build trustworthy AI systems that address modern challenges. The program will explore topics of transparency, explainability, accountability, potential adverse biases and effects, mitigation strategies, validation of fairness, and considerations of inclusivity. Proposals will be accepted from March 26 until May 10, and are expected to result in new open source tools, publicly available data sets, and publications. The two organizations plan to continue the program with calls for additional proposals in 2020 and 2021. There will be 6 to 9 awards of type Standard Grant or Continuing Grant. Award sizes will range from $750,000 up to a maximum of $1,250,000 for periods of up to 3 years, and the anticipated total funding amount is $7,600,000.

"We are excited to announce this new collaboration with Amazon to fund research focused on fairness in AI," said Jim Kurose, NSF's head for Computer and Information Science and Engineering. "This program will support research related to the development and implementation of trustworthy AI systems that incorporate transparency, fairness, and accountability into the design from the beginning."

The insidious nexus of private funding in public research: what does Amazon gain from collaborating with the NSF?

Amazon's foray into fairness systems looks more like a publicity stunt than a serious effort to eliminate AI bias. For starters, Amazon said that it will not be making the award determinations for this project; the NSF will award grants solely in accordance with its merit review process. However, Amazon said that its researchers may be involved with the projects as advisors, but only at the request of an awardee, or of the NSF with the awardee's consent. As advisors, Amazon may host student interns who wish to gain further industry experience, which seems a bit dicey.

There was also the question of who exactly is funding what, since section VII.B of the solicitation states: "Individual awards selected for joint funding by NSF and Amazon will be funded through separate NSF and Amazon funding instruments."
https://twitter.com/nniiicc/status/1110335108634951680
https://twitter.com/nniiicc/status/1110335004989521920

Nic Weber, the author of the above tweets and Assistant Professor at the UW iSchool, also raises another important question: "Why does Amazon get to put its logo on a national solicitation (for a paltry $7.6 million dollars in basic research) when it profits in the multi-billions off of AI that is demonstrably unfair and harmful."

Twitter was abuzz with people working in tech questioning Amazon's collaboration.

https://twitter.com/mer__edith/status/1110560653872373760
https://twitter.com/patrickshafto/status/1110748217887649793
https://twitter.com/smunson/status/1110657292549029888
https://twitter.com/haldaume3/status/1110697325251448833

Amazon has already come under fire for its controversial decisions in the recent past. In June last year, when the US Immigration and Customs Enforcement agency (ICE) began separating migrant children from their parents, Amazon was criticized as one of the tech companies that aided ICE with the software required to do so. Amazon has also faced constant criticism since the news broke that it had sold its facial recognition product, Rekognition, to a number of law enforcement agencies in the U.S. in the first half of 2018.

Amazon faced further backlash in January after a study by the Massachusetts Institute of Technology found Amazon Rekognition incapable of reliably determining the sex of female and darker-skinned faces in certain scenarios. Amazon is yet to fix this bias, and yet it has now started a new collaboration with the NSF that, ironically, focuses on building bias-free AI systems.

Amazon's Ring (a smart doorbell company) also came under public scrutiny in January after it gave employees access to live footage from customers' cameras.

In other news, yesterday Google also formed an external AI advisory council to help advance the responsible development of AI. More details here.

Amazon won't be opening its HQ2 in New York due to public protests
Amazon admits that facial recognition technology needs to be regulated
Amazon's Ring gave access to its employees to watch live footage of the customers, The Intercept reports