
Tech Guides - Artificial Intelligence

170 Articles

Artificial General Intelligence, did it gain traction in research in 2018?

Prasad Ramesh
21 Feb 2019
4 min read
In 2017, we predicted that artificial general intelligence would gain traction in research and that certain areas of work would aid progress towards AGI systems. The prediction appeared alongside a set of other AI predictions in an article titled 18 striking AI Trends to watch in 2018. Let's see how 2018 went for AGI research.

Artificial general intelligence, or AGI, is an area of AI in which efforts are made to give machines intelligence closer to the complex nature of human intelligence. Such a system could, in theory, perform any task that a human can, learning as it progresses through tasks and collects data and sensory input. Human intelligence also involves learning a skill and applying it to other areas. For example, if a human learns Dota 2, they can apply the same learned experience to other similar strategy games; only the UI and the characters in the game will be different. A machine cannot do this: AI systems are trained for a specific area, and the skills cannot really be transferred to another task with full efficiency or without the risk of incurring technical debt. That is, a machine cannot generalize skills the way a human can.

Come 2018, we saw DeepMind's AlphaZero, something that at least begins to show what an idea of AGI could look like. But even this is not really AGI; an AlphaZero-like system may excel at playing a variety of games, or even learn the rules of novel games, but it cannot deal with the real world and its challenges.

Some groundwork and basic ideas for AGI were set out in a paper by the US Air Force. Dr. Paul Yaworsky, in the paper, says that artificial general intelligence is an effort to cover the gap between lower- and higher-level work in AI; in other words, to try to make sense of the abstract nature of intelligence. The paper also presents an organized hierarchical model for intelligence that takes the external world into account.

One of Packt's authors, Sudharsan Ravichandiran, thinks that: "Great things are happening around RL research each and every day. Deep Meta reinforcement learning will be the future of AI where we will be so close to achieving artificial general intelligence (AGI). Instead of creating different models to perform different tasks, with AGI, a single model can master a wide variety of tasks and mimics the human intelligence."

Honda came up with a program called Curious Minded Machine in association with MIT, the University of Pennsylvania, and the University of Washington. The idea sounds simple at first - build a model of how children 'learn to learn'. But something children do instinctively is a very complex task for a machine with artificial intelligence. The teams will showcase their work in the fields they are working on at the end of the program's three years.

There was another effort, by SingularityNET and Mindfire, to explore AI and "cracking the brain code". The aim is to better understand the functioning of the human brain. Together these two companies will focus on three key areas - talent, AI services, and AI education. Mindfire Mission 2 will take place in Switzerland in early 2019.

These were the areas of work we saw on AGI in 2018. There were only small steps taken in this research direction and nothing noteworthy that gained mainstream traction. On average, experts think AGI is at least 100 more years away from becoming a reality, as per Martin Ford's interviews with machine learning experts for his best-selling book, Architects of Intelligence.
OpenAI released a new language model called GPT-2 in February 2019. Given just a line of text as a prompt, the model can generate whole articles, and the results are good enough to pass as something written by a human. This does not mean that the machine actually understands human language; it is merely generating sentences by associating words. The development has triggered passionate discussions within the community, not just on the technical merits of the findings but also on the dangers and implications of applying such research to the larger society. Get ready to see more tangible research in AGI in the next few decades.

Read next:
The US Air Force lays groundwork towards artificial general intelligence based on hierarchical model of intelligence
Facebook's artificial intelligence research team, FAIR, turns five. But what are its biggest accomplishments?
Unity and Deepmind partner to develop Virtual worlds for advancing Artificial Intelligence
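To give a flavour of what prompt-based text generation with a GPT-2-style model looks like in practice, here is a minimal sketch using the Hugging Face transformers library and the publicly released GPT-2 checkpoint; the prompt and generation parameters are illustrative assumptions, not taken from OpenAI's original release.

```python
from transformers import pipeline

# A minimal sketch: continue a short prompt with a public GPT-2 checkpoint
# (assumes the transformers library and a downloaded model are available).
generator = pipeline("text-generation", model="gpt2")

prompt = "Artificial general intelligence gained traction in 2018 because"  # hypothetical prompt
outputs = generator(prompt, max_length=60, num_return_sequences=1)

# The model extends the prompt token by token based on word associations;
# fluent output does not imply any real understanding of the text.
print(outputs[0]["generated_text"])
```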


How Amazon is reinventing Speech Recognition and Machine Translation with AI

Amey Varangaonkar
04 Jul 2018
4 min read
At the recently held AWS Summit in San Francisco, Amazon announced the general availability of two of its premium offerings - Amazon Transcribe and Amazon Translate. What's special about the two products is that customers will now be able to see the power of Artificial Intelligence in action and use it to solve their day-to-day problems. These offerings from AWS will make it easier for startups and companies looking to adopt and integrate AI into their existing processes and simplify their core tasks, especially those pertaining to speech and language processing.

Effective speech-to-text conversion with Amazon Transcribe

In the AWS Summit keynote, Amazon Solutions Architect Niranjan Hira expressed his excitement while talking about the features of Amazon Transcribe, the automatic speech recognition service by AWS. This API can be integrated with other tools and services offered by Amazon, such as Amazon S3 and QuickSight.

Amazon Transcribe boasts features like:
Simple API: It is very easy to use the Transcribe API to perform speech-to-text conversion, with minimal need for programming.
Timestamp generation: The speech, when converted to text, also includes timestamps for every word, so that tracking a word becomes easy and hassle-free.
Variety of use cases: The Transcribe API can be used to generate accurate transcripts of any audio or video file of varied quality. Subtitle generation becomes easier using this API, especially for low-quality audio recordings - customer service calls are a very good example.
Easy-to-read text: Transcribe uses cutting-edge deep learning technology to parse text from speech without errors. With appropriate punctuation and grammar in place, the transcripts are very easy to read and understand.

Machine translation simplified with Amazon Translate

Amazon Translate is a machine translation service offered by Amazon. It makes use of neural networks and advanced deep learning techniques to deliver accurate, high-quality translations. Key features of Amazon Translate include:
Continuous training: The architecture of this service is built in such a way that the neural networks keep learning and improving.
High accuracy: The continuous learning by the translation engines from new and varied datasets results in a higher accuracy of machine translations. The machine translation capability offered by this service is almost 30% more efficient than human translation.
Easy to integrate with other AWS services: With a simple API call, Translate allows you to integrate the service within third-party applications to enable real-time translation capabilities.
Highly scalable: Regardless of the volume, Translate does not compromise the speed and accuracy of the machine translation.

Know more about Amazon Translate from Yoni Friedman's keynote at the AWS Summit.

With businesses slowly migrating to the cloud, it is clear that all the major cloud vendors - mainly Amazon, Google, and Microsoft - are doing everything they can to establish their dominance. Google recently launched Cloud ML for GCP, which offers machine learning and predictive analytics services for improving businesses. Microsoft's Azure Cognitive Services offer effective machine translation services as well and are slowly gaining a lot of momentum. With these releases, the onus was on Amazon to respond, and they have done so in style.
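To make the two services concrete, here is a minimal sketch of how a developer might call them through the AWS SDK for Python (boto3); the job name, S3 URI, and language codes are placeholder assumptions, not values from the keynote.

```python
import boto3

# Sketch: start an asynchronous transcription job for an audio file stored in S3.
transcribe = boto3.client("transcribe")
transcribe.start_transcription_job(
    TranscriptionJobName="customer-call-001",                 # hypothetical job name
    Media={"MediaFileUri": "s3://example-bucket/call.mp3"},   # hypothetical S3 object
    MediaFormat="mp3",
    LanguageCode="en-US",
)

# Sketch: translate a short string from English to Spanish with Amazon Translate.
translate = boto3.client("translate")
result = translate.translate_text(
    Text="Your order has shipped.",
    SourceLanguageCode="en",
    TargetLanguageCode="es",
)
print(result["TranslatedText"])
```

The transcription call returns immediately and the finished transcript is fetched later (or written to an S3 bucket), whereas the translation call is synchronous, which is why the two are typically wired into a pipeline rather than a single request.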
With the Transcribe and Translate APIs, Amazon's goal of making it easier for startups and small-scale businesses to adopt AWS and incorporate AI seems to be on track. These services will also help AWS distinguish its cloud offerings, given that computing and storage resources are provided by rivals as well.

Read more:
Verizon chooses Amazon Web Services (AWS) as its preferred cloud provider
Tensor Processing Unit (TPU) 3.0: Google's answer to cloud-ready Artificial Intelligence
Amazon is selling facial recognition technology to police


Toward Safe AI - Maximizing your control over Artificial Intelligence

Aaron Lazar
28 Nov 2017
7 min read
Last Sunday I was watching one of my favorite television shows, Splitsvilla, when a thought suddenly crossed my mind - what if Artificial Intelligence tricked us into getting what it wanted?

A bit of a prelude: Splitsvilla is an Indian TV show hosted by Sunny Leone, where young dudes and dudettes try to find "the perfect partner". It's loosely based on the American TV show Flavor of Love. Season 10 of Splitsvilla has gotten more interesting, with the producers bringing in Artificial Intelligence and Science to add a bit of practicality to the love equation. Contestants have to stand with their partners in front of "The Oracle", which calculates their compatibility quotient and tells them whether they're an ideal match or not. Pretty cool, huh?

It somehow reminded me of the Oracle from The Matrix - the nice old lady with short, frizzy hair, who experienced events that would push Neo to the Source:

Neo: I suppose the most obvious question is, how can I trust you?
The Oracle: Bingo! It is a pickle. No doubt about it. The bad news is there's no way you can really know whether I'm here to help you or not, so it's really up to you. You just have to make up your damn mind to either accept what I'm going to tell you, or reject it.

I had the same question. How could the Oracle be trusted? How do we know what it speaks is the truth and that it doesn't have other intentions? And if you do know there's a possibility of such a situation arising, how do you prevent it? That led me to pick up my laptop, search for more related material, and stumble across an interesting paper by Stuart Armstrong, titled Good and safe uses of AI Oracles, from the Cornell University Library.

A quick summary of the paper: Good and safe uses of AI Oracles

In the paper, Stuart talks about Oracles: potentially safe AIs that are restricted to only answering questions. The reason they're restricted to only answering questions is so that they don't go overboard and manipulate information in order to get what they want. Remember in July, when Facebook shut down their AI chatbots that went rogue? This is not to say that AI is malevolent; rather, it's simply competitive and will do whatever it can to succeed. But is simply restricting them to answering questions enough to maximize your control over artificial intelligence and prevent it from getting out of hand? In the example of Splitsvilla, what if the AI, instead of just predicting whether a couple was a match, started giving out personal information about each individual that wasn't appropriate to share, just to serve its own purpose? Most designs motivate the Oracle to lie to or manipulate humans with its answers, tricking them into giving the AI more control and finally escaping confinement. Is it possible to come up with a design that gets you accurate answers, while keeping your AI on a leash and preventing it from eavesdropping? Let's see what Stuart has to say...

The Solution(s): a Counterfactual Oracle and an On-policy Oracle

Before we get to the solutions, let's take up a practical problem for a better understanding. Let's assume that the Oracle has the goal of escaping a box. The person interacting with it is Alice. Let's also assume the Oracle can find an argument A that will convince her to let it out of the box; the longer the argument, the better the chance it has of convincing Alice. To make the problem clearer, we'll replace the argument A with a key K of unknown length, and the setup with a quasi-cryptographic one.
Instead of using K to convince Alice, the Oracle merely has to communicate K to another person, Eve, who is an eavesdropper. Eve is assumed to see and hear anything that Alice does. The assumptions for the cryptographic setup and the manipulative Oracle are given in a table in the paper: the left-hand column describes the cryptographic setup for the Oracle, and the right-hand column describes how the Oracle wins by manipulating Alice.

Counterfactual Oracle: The counterfactual design aims to answer questions about what would have happened had its answer not been read. This kind of answer is counterfactual, and the Oracle can't use it to pass K on to the real world. The idea is to define an "erasure" event E, of low probability, that will erase the Oracle's output o. The indicator variable I(E) is defined so that I(E)=1 if E happens and I(E)=0 if not. The Oracle only gets a reward if its output is erased and remains unread; since most of the time E will not happen, an output that is actually read earns no reward, so the Oracle has no motive to manipulate the reader, and its episode ends. Both Alice and Eve get to see o, but they never see K, thus protecting the key. This results in a safer AI and is a good way to maximize your control over it.

On-policy Oracle: In this design, the Oracle is only allowed to give an answer of size below |K|, the length of the key, thus denying it the chance to communicate the actual key to Eve. This means that Alice will need to estimate and select a size L that is sufficiently lower than |K|. This does limit the size of Alice's options, but there's a workaround - Alice can use multiple Oracles, as long as each Oracle's answers and rewards are fully resolved before moving on to the next question. For example, assume that there are two Oracles, O1 and O2, both using the same key K. The length n of the key is 6 digits. The Oracles are safe as long as they are restricted to having fewer than 2^5 = 32 options for each answer. In the situation where Alice is faced with a hundred doors, with a trap behind 99 of them and a prize behind only one, she could ask an Oracle which door has the prize, but that would mean giving it more than 32 options (indeed, more than 2^6 = 64), thus letting it communicate K to Eve. Instead, she gives a list L = (0, 1, 2, ..., 9) to each Oracle, instructing O1 to output the first digit of the correct door number and O2 to output the second. Hence, neither can output K, and that maximizes Alice's control over both AI Oracles.

In conclusion, these two designs support the creation of safer and non-manipulative Oracles, one by preventing the Oracle from being rewarded over and over, and the second by limiting what information the Oracle can give out. But this still doesn't solve the problem entirely, as the designs are quite limiting in the sense that they require the tasks to remain episodic. Moreover, in the multiple-Oracle design, the questions would need to be broken down into sub-questions, where each answer could be verified independently before proceeding to the next one. Still, it would be interesting to see how this research develops into something bigger and better. If you want to know more, you can read the full paper here.
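As a toy illustration of the counterfactual idea (not the paper's formal setup), here is a minimal Python sketch: the Oracle's answer is occasionally erased with low probability, and the Oracle is rewarded only on erased episodes, so the reward depends purely on prediction accuracy and never on how a human reader reacts to the answer.

```python
import random

ERASURE_PROB = 0.05  # low-probability erasure event E (illustrative value)

def run_episode(oracle_answer: float, true_outcome: float) -> float:
    """Return the Oracle's reward for one episode under the counterfactual design."""
    erased = random.random() < ERASURE_PROB   # I(E) = 1 with low probability
    if not erased:
        # The answer is shown to Alice (and Eve) but earns no reward,
        # so manipulating the reader gains the Oracle nothing.
        return 0.0
    # Reward depends only on how accurate the unread prediction was.
    return -abs(oracle_answer - true_outcome)

# Usage: average reward over many episodes depends only on prediction accuracy.
rewards = [run_episode(oracle_answer=0.8, true_outcome=1.0) for _ in range(10_000)]
print(sum(rewards) / len(rewards))
```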


6 new eBooks for programmers to watch out for in March

Richard Gall
20 Feb 2019
6 min read
The biggest challenge for anyone working in tech is that you need multiple sets of eyes. Yes, you need to commit to regular, almost continuous learning, but you also need to look forward to what's coming next. From slowly emerging trends that might not even come to fruition (we're looking at you, DataOps) to version updates and product releases, for tech professionals the horizon always looms and shapes the present.

But it's not just about the big trends or releases that get coverage - it's also about planning your next (career) move, or even your next mini-project. That could be learning a new language (not necessarily a new one, but one you haven't yet got round to learning), trying a new paradigm, exploring a new library, or getting to grips with cloud-native approaches to software development. This sort of learning is easy to overlook, but it is vital to any developer's development.

While the Packt library has a wealth of content for you to dig your proverbial claws into, if you're looking forward, Packt has some new titles available in pre-order that could help you plan your learning for the months to come. We've put together a list of some of our own top picks of our pre-order titles available this month, due to be released late February or March. Take a look and take some time to consider your next learning journey...

Hands-on Deep Learning with PyTorch

TensorFlow might have set the pace when it comes to artificial intelligence, but PyTorch is giving it a run for its money. It's impossible to describe one as 'better' than the other - ultimately they both have valid use cases and can both help you do some pretty impressive things with data.

Read next: Can a production ready Pytorch 1.0 give TensorFlow a tough time?

The key difference is really in the level of abstraction and the learning curve - TensorFlow is more like a library, which gives you more control but also makes things a little more difficult. PyTorch, then, is a great place to start if you already know some Python and want to try your hand at deep learning. Or, if you have already worked with TensorFlow and simply want to explore new options, PyTorch is the obvious next step.

Order Hands-On Deep Learning with PyTorch here.

Hands-on DevOps for Architects

Distributed systems have made the software architect role incredibly valuable. This person is not only responsible for deciding what should be developed and deployed, but also for the means through which it should be done and maintained. But distributed systems have also made the question of architecture relevant to just about everyone who builds and manages software. That's why Hands-on DevOps for Architects is such an important book for 2019. It isn't just for those who typically describe themselves as software architects - it's for anyone interested in infrastructure, in how things are put together, and in how they can be made more reliable, scalable, and secure. With site reliability engineering finding increasing usage outside of Silicon Valley, this book could be an important piece in the next step of your career.

Order Hands-on DevOps for Architects here.

Hands-on Full Stack Development with Go

Go has been cursed with a hell of a lot of hype. This is a shame - it means it's easy to dismiss as a fad or fashion that will quickly disappear. In truth, Go's popularity is only going to grow as more people experience its speed and flexibility. Indeed, in today's full-stack, cloud-native world, Go is only going to go from strength to strength.
In Hands-on Full Stack Development with Go you'll not only get to grips with the fundamentals of Go, you'll also learn how to build a complete full-stack application built on microservices, using tools such as Gin and ReactJS.

Order Hands-on Full Stack Development with Go here.

C++ Fundamentals

C++ is a language that often gets a bad rap. You don't have to search the internet that deeply to find someone telling you that there's no point learning C++ right now. And while it's true that C++ might not be as eye-catching as languages like, say, Go or Rust, it nevertheless still plays a very important role in the software engineering landscape. If you want to build performance-intensive apps for desktop, C++ is likely going to be your go-to language.

Read next: Will Rust replace C++?

One of the sticks often used to beat C++ is that it's a fairly complex language to learn. But rather than being a reason not to learn it, if anything the challenge it presents to even relatively experienced developers is one well worth taking on. At a time when many aspects of software development seem to be getting easier, as new layers of abstraction remove problems we previously might have had to contend with, C++ bucks that trend, forcing you to take a very different approach. And although this approach might not be one many developers want to face, if you want to strengthen your skill set, C++ could certainly be a valuable language to learn. The stats don't lie - C++ is placed 4th on the TIOBE index (as of February 2019), beating JavaScript, and commands a considerably high salary: indeed.com data from 2018 suggests that C++ was the second-highest-earning programming language in the U.S., after Python, with a salary of $115K. If you want to give C++ a serious go, then C++ Fundamentals could be a great place to begin.

Order C++ Fundamentals here.

Data Wrangling with Python & Data Visualization with Python

Finally, we're grouping two books together - Data Wrangling with Python and Data Visualization with Python. This is because they both help you to really dig deep into Python's power and better understand how it has grown to become the definitive language of data. Of course, R might have something to say about this - but it's a fact that over the last 12-18 months Python has grown in popularity in a way that R has been unable to match. So, if you're new to any aspect of the data science and analysis pipeline, or you've used R and you're now looking for a faster, more flexible alternative, both titles could offer you the insight and guidance you need.

Order Data Wrangling with Python here.
Order Data Visualization with Python here.


3 Natural Language Processing Models every ML Engineer should know

Amey Varangaonkar
18 Dec 2017
5 min read
This interesting excerpt is taken from the book Mastering Text Mining with R, co-authored by Ashish Kumar and Avinash Paul. The book gives an advanced view of different natural language processing techniques and their implementation in R.

Our article below aims to introduce the concept of language models and their relevance to natural language processing.

In terms of natural language processing, language models generate output strings that help to assess the likelihood of a sequence of strings being a sentence in a specific language. If we discard the sequence of words in all sentences of a text corpus and basically treat it like a bag of words, then the efficiency of different language models can be estimated by how accurately a model restores the order of strings in sentences. Which sentence is more likely: "I am learning text mining" or "I text mining learning am"? Which word is more likely to follow "I am..."? Basically, a language model assigns the probability of a sentence being in a correct order. The probability is assigned over the sequence of terms by using conditional probability.

Let us define a simple language modeling problem. Assume a bag of words contains words W1, W2, ..., Wn. A language model can be defined to compute any of the following:

Estimate the probability of a sentence S1: P(S1) = P(W1, W2, W3, W4, W5)
Estimate the probability of the next word in a sentence or set of strings: P(W3 | W2, W1)

How do we compute the probability? We use the chain rule, by decomposing the sentence probability as a product of smaller string probabilities:

P(W1W2W3W4) = P(W1) P(W2|W1) P(W3|W1W2) P(W4|W1W2W3)

N-gram models

N-grams are important for a wide range of applications. They can be used to build simple language models. Let's consider a text T with W tokens and a sliding window SW. If the sliding window consists of one cell, the output (w1)(w2)(w3)(w4)... is a collection of unigrams; if the sliding window consists of two cells, the output (w1, w2)(w3, w4)(w5, w6)... is a collection of bigrams. Using conditional probability, we can define the probability of a word given the previous word; this is known as the bigram probability. So the conditional probability of an element, given the previous element (wi-1), is:

P(wi | wi-1) = P(wi-1, wi) / P(wi-1)

Extending the sliding window, we can generalize the n-gram probability as the conditional probability of an element given the previous n-1 elements:

P(wi | wi-n+1, ..., wi-1)

The most common bigrams in any corpus are not very interesting, such as "on the", "can be", "in it", "it is". In order to get more meaningful bigrams, we can run the corpus through a part-of-speech (POS) tagger. This would filter the bigrams down to more content-related pairs such as "infrastructure development", "agricultural subsidies", "banking rates"; this can be one way of filtering out less meaningful bigrams. A better way to approach this problem is to take collocations into account; a collocation is the string created when two or more words co-occur in a language more frequently than expected. One way to measure this over a corpus is pointwise mutual information (PMI). The idea behind PMI is that for two words, A and B, we would like to know how much one word tells us about the other. For example, given an occurrence a of A and an occurrence b of B, how much does their joint probability differ from the value we would expect if they were independent?
This can be expressed as follows:

PMI(a, b) = log( P(a, b) / (P(a) P(b)) )

Unigram model: Punigram(W1W2W3W4) = P(W1) P(W2) P(W3) P(W4)
Bigram model: Pbigram(W1W2W3W4) = P(W1) P(W2|W1) P(W3|W2) P(W4|W3)
In general: P(w1w2...wn) = ∏i P(wi | w1w2...wi-1)

Applying the chain rule over long contexts makes the probabilities difficult to estimate; the Markov assumption is applied to handle such situations.

Markov Assumption

If we assume that the current string is independent of some word string far enough in the past, we can drop that string to simplify the probability. Say the history consists of three words, Wi, Wi-1, Wi-2; instead of estimating the probability P(Wi+1 | Wi, Wi-1, Wi-2), we can directly apply P(Wi+1 | Wi, Wi-1).

Hidden Markov Models

Markov chains are used to study systems that are subject to random influences. Markov chains model systems that move from one state to another in steps governed by probabilities. The set of possible outcomes in a sequence of trials is called the states. Knowing the probabilities of the states is called the state distribution, and the state distribution in which the system starts is the initial state distribution. The probability of going from one state to another is called a transition probability. A Markov chain consists of a collection of states along with transition probabilities. The study of Markov chains is useful for understanding the long-term behavior of a system. Each arc is associated with a certain probability value, and all arcs coming out of each node must together form a probability distribution. In simple terms, there is a probability associated with every transition between states.

Hidden Markov models are non-deterministic Markov chains. They are an extension of Markov models in which the output symbol is not the same as the state.

Language models are widely used in machine translation, spelling correction, speech recognition, text summarization, questionnaires, and many more real-world use cases. If you would like to learn more about how to implement the above language models, check out our book Mastering Text Mining with R. This book will help you leverage language models using popular packages in R for effective text mining.
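To make the counting behind a bigram model concrete, here is a minimal Python sketch (the book itself works in R) that estimates bigram probabilities from a toy corpus and scores sentences with the chain rule; the corpus and sentences are illustrative assumptions.

```python
from collections import Counter

# Toy corpus; in practice this would be a tokenized text collection.
corpus = "i am learning text mining i am learning r".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev_word: str, word: str) -> float:
    """P(word | prev_word) estimated from bigram and unigram counts."""
    if unigram_counts[prev_word] == 0:
        return 0.0
    return bigram_counts[(prev_word, word)] / unigram_counts[prev_word]

def sentence_prob(sentence: str) -> float:
    """Chain-rule probability of a sentence under the bigram model."""
    words = sentence.split()
    prob = unigram_counts[words[0]] / len(corpus)  # P(first word)
    for prev, cur in zip(words, words[1:]):
        prob *= bigram_prob(prev, cur)
    return prob

# A well-ordered sentence should score higher than a scrambled one.
print(sentence_prob("i am learning"), sentence_prob("learning am i"))
```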


What is coding as a service?

Antonio Cucciniello
02 Oct 2017
4 min read
What is coding as a service? If you want to know what coding as a service is, you have to start with Artificial Intelligence. Put simply, coding as a service is using AI to build websites, using your machine to write code so you don't have to.

The challenges facing engineers and programmers today

In order to give you a solid understanding of what coding as a service is, you must understand where we are today. Typically, we have programs that are made by software developers or engineers. These programs are usually created to automate a task or make tasks easier. Think of things that typically speed up processing or automate a repetitive task. This is, and has been, extremely beneficial. The productivity gained from automated applications and tasks allows us, as humans and workers, to spend more time on creating important things and coming up with more groundbreaking ideas. This is where Artificial Intelligence and Machine Learning come into the picture.

Artificial intelligence and coding as a service

Recently, with the gains in computing power that have come with time and breakthroughs, computers have become more and more powerful, allowing AI applications to arise in more common practice. Today, there are applications that allow users to detect objects in images and videos in real time, translate speech to text, and even determine the emotions in text sent by someone else. For an example of Artificial Intelligence applications in use today, you may have used an Amazon Alexa or Echo device. You talk to it, it understands your speech, and it then completes a task based on that speech. Previously, the ability to understand speech was a task given only to humans. Now, with advances, Alexa is capable of understanding everything you say, given that it is "trained" to understand it. This capability, previously only expected of humans, is now being filtered through to technology.

How coding as a service will automate boring tasks

Today, we have programmers who write applications for many uses and make things such as websites for businesses. As things progress and become more and more automated, programmers' efficiency will increase and the need for additional manpower will be reduced. Coding as a service, otherwise known as CaaS, will result in even fewer programmers being needed. It mixes the efficiencies we already have with Artificial Intelligence to do programming tasks for a user. Using Natural Language Processing to understand exactly what the user or customer is saying and means, it will be able to make edits to websites and applications on the fly. Not only will it be able to make edits, but combined with machine learning, the CaaS can come up with recommendations from past data and make edits on its own. Efficiency-wise, it is cheaper to own a computer than it is to pay a human, especially when a computer will work around the clock for you and never get tired. Imagine paying an extremely low price (lower than what you might already pay to get a website made) for getting your website built or your small application created.

Conclusion

Every new technology comes with pros and cons. Overall, the number of software developers may decrease, or, as a developer, this may free up your time from menial tasks and enable you to further specialize and broaden your horizons. Artificial Intelligence programs such as coding as a service could do plenty of the underlying work, and leave some of the heavier lifting to human programmers.
With every new technology comes its positives and negatives. You just need to use the positives to your advantage!

Introducing Intelligent Apps

Amarabha Banerjee
19 Oct 2017
6 min read
We have been a species obsessed with 'intelligence' since gaining consciousness. We have always been inventing ways to make our lives better through sheer imagination and the application of our intelligence. Now, it comes as no surprise that we want our modern-day creations to be smart as well - be it a web app or a mobile app. The first question that comes to mind, then, is what makes an application 'intelligent'? A simple answer for budding developers is that intelligent apps are apps that can take intuitive decisions or provide customized recommendations and experiences to their users, based on insights drawn from data collected from their interactions with humans. This brings up a whole set of new questions: How can intelligent apps be implemented, what are the challenges, what are the primary application areas of these so-called intelligent apps, and so on. Let's start with the first question.

How can intelligence be infused into an app?

The answer has many layers, just like an app does. The monumental growth in data science and its underlying data infrastructure has allowed machines to process, segregate, and analyze huge volumes of data in limited time. Now, it looks set to enable machines to glean meaningful patterns and insights from that very same data. One interesting example is predicting user behavior patterns: predicting what movies, food, or brand of clothing the user might be interested in, what songs they might like to listen to at different times of their day, and so on. These are, of course, on the simpler side of the spectrum of intelligent tasks that we would like our apps to perform. Many apps by Amazon, Google, Apple, and others are implementing and perfecting these tasks on a day-to-day basis.

Complex tasks are a series of simple tasks performed in an intelligent manner. One such complex task would be the ability to perform facial recognition and speech recognition and then use them to perform relevant daily tasks, be it at home or in the workplace. This is where we enter the realm of science fiction, where your mobile app recognizes your voice command while you are driving back home and sends automated instructions to different home appliances - your microwave, AC, and your PC - so that your food is served hot when you reach home, your room is set at just the right temperature, and your PC has automatically opened the next project you would like to work on. All that happens while you enter your home keys-free, thanks to facial recognition software that can map your face and ID you with more than 90% accuracy, even in low lighting conditions. APIs like IBM Watson, AT&T Speech, the Google Speech API, the Microsoft Face API, and others provide developers with tools to incorporate features such as those listed above in their apps to create smarter apps. It sounds almost magical! But is it that simple? This brings us to the next question.

What are some developmental challenges for an intelligent app?

The challenges are different for web and mobile apps.

Challenges for intelligent web apps

For web apps, choosing the right mix of algorithms and APIs that can turn your machine learning code into a working web app is the primary challenge. Plenty of web APIs, like IBM Watson, AT&T Speech, and others, are available to do this. But not all APIs can perform all the complex tasks we discussed earlier. Suppose you want an app that successfully performs both voice and speech recognition and then also performs reinforcement learning by learning from your interaction with it.
You will have to use multiple APIs to achieve this. Their integration into a single app then becomes a key challenge. Here is why: every API has its own data transfer protocols and backend integration requirements and challenges. Thus, our backend requirements increase significantly, both in terms of data persistence and in terms of dynamic data availability and security. Also, the fact that each of these smart apps would need customized user interface designs poses a challenge to the front-end developer. The challenge is to make a user interface so fluid and adaptive that it supports the different preferences of different smart apps. Clearly, putting together a smart web app is no child's play. That's why, perhaps, smart voice-controlled apps like Alexa are still merely working as assistants and providing only predefined solutions to you. Their ability to execute complex voice-based tasks and commands is fairly low, let alone any task not based on voice commands.

Challenges for intelligent mobile apps

For intelligent mobile apps, the challenges are manifold. A key reason is network dependency for data transfer. Although the advent of 4G and 5G mobile networks has greatly improved mobile network speed, the availability of the network and the data transfer speeds still pose a major challenge. This is due to the high volumes of data that intelligent mobile apps require to perform efficiently. To circumvent this limitation, vendors like Google are trying to implement smarter APIs in the mobile device's local storage. But this approach requires a huge increase in the mobile chip's computation capabilities - something that's not currently available. Maybe that's why Google has also hinted at jumping into the chip manufacturing business if their computation needs are not met. Apart from these issues, running multiple intelligent apps at the same time would also require a significant increase in the battery life of mobile devices.

Finally comes the last question.

What are some key applications of intelligent apps?

We have explored some areas of application in the previous sections, keeping our focus on just web and mobile apps. Broadly speaking, whatever makes our daily life easier is potentially an application area for intelligent apps. From controlling the AC temperature automatically, to controlling the oven and microwave remotely, to using the vacuum cleaner (of course, the vacuum cleaner has to have robotic AI capabilities), to driving the car, everything falls in the domain of intelligent apps. The real questions for us are: What can we achieve with our modern computation resources and our data handling capabilities? How can mobile computation capabilities and chip architecture be improved drastically so that we can have smart apps perform complex tasks faster and ease our daily workflow? Only the future holds the answer. We are rooting for the day when we will rise to become a smarter race by delegating less important yet intelligent tasks to our smarter systems, by creating intelligent web and mobile apps efficiently and effectively. The culmination of these apps, along with hardware-driven AI systems, could eventually lead to independent smart systems - a topic we will explore in the coming days.


What can happen when artificial intelligence decides on your loan request

Guest Contributor
23 Feb 2019
5 min read
As the number of potential borrowers continues to grow rapidly, loan companies and banks are having a hard time figuring out how likely their customers are to pay back. Getting information on clients' creditworthiness is probably the greatest challenge for most financial companies, and it especially concerns clients who don't have any credit history yet. There is no denying that the alternative lending business has become one of the most influential financial branches both in the USA and in Europe. Debt is a huge business these days, and it needs a lot of resources. In such a challenging situation, any means that can improve productivity and reduce the risk of mistakes while performing financial activities is warmly welcomed. This is actually how Artificial Intelligence became the redemption for loan providers.

Fortunately for lenders, AI successfully deals with this task by following borrowers' digital footprints. For example, some applications for digital lending collect and analyze an individual's web browsing history (upon receiving their personal agreement to the use of this information). In some markets, such as China and parts of Africa, they may also look through social network profiles, geolocation data, and the messages sent to friends and family, counting the number of punctuation mistakes. The collected information helps loan providers make the right decision on their clients' creditworthiness and avoid long loan processes.

When AI Overfits

Unfortunately, there is the other side of the coin. There's a theory which states that people who pay for their gas inside the petrol station, not at the pump, are usually smokers, and that is a group whose creditworthiness is estimated to be low. But what if this poor guy simply wanted to buy a Snickers? This example shows that if a lender does not carefully check the information gathered by AI software, they may easily end up making bad mistakes and misinterpretations. Artificial Intelligence in the financial sector may significantly reduce costs, efforts, and further financial complications, but there are hidden social costs such as the above. A robust analysis, design, implementation, and feedback framework is necessary to meaningfully counter AI bias.

Other Use Cases for AI in Finance

Of course, there are also plenty of examples of how AI helps to improve customer experience in the financial sector. Some startups use AI software to help clients find the company that is best at providing them with the required service. They juxtapose the clients' requirements with the companies' services, finding perfect matches. Even though this technology reminds us of how dating apps work, such applications can drastically save time for both parties and help borrowers pay faster. AI can also be used for streamlining finances. AI helps banks and alternative lending companies automate some of their working processes, such as basic customer service, contract management, or transaction monitoring. A good example is Upstart, the pet project of two former Google employees. The startup originally aimed to help young people lacking a credit history get a loan or some other kind of financial support. For this purpose, the company uses the clients' educational background and experience, taking into account things such as their attained degrees and school or university attendance.
However, such an approach to lending may end up being a little snobbish: it can simply overlook large groups of the population who can't afford higher education. As a result of an insufficient educational background, these people can be deprived of the opportunity to get a loan. Nonetheless, one of the main goals of the company was automating as many of its operating procedures as possible. By 2018, more than 60% of all their loans had been fully automated, with more to come.

We cannot automate fairness and opportunity, yet

The implementation of machine learning in providing loans by checking people's digital footprints may lead to ethical and legal disputes. Even today, some people state that the use of AI in the financial sector has encouraged inequality in the number of loans provided to the black and white populations of the USA. They believe that AI continues the bias against minorities and makes black people "underbanked." Both lending companies and banks should remember that the quality of work done these days with the help of machine learning methods highly depends on people - both the employees who use the software and the AI developers who create and fine-tune it. So we should see AI in loan management as a useful tool, but not as a replacement for humans.

Author Bio

Darya Shmat is a business development representative at Iflexion, where Darya expertly applies 10+ years of practical experience to help banking and financial industry clients find the right development or QA solution.

Read next:
Blockchain governance and uses beyond finance – Carnegie Mellon university podcast
Why Retailers need to prioritize eCommerce Automation in 2019
Glancing at the Fintech growth story – Powered by ML, AI & APIs


Emoji Scavenger Hunt showcases TensorFlow.js

Richard Gall
03 Apr 2018
3 min read
What is Emoji Scavenger Hunt?

Emoji Scavenger Hunt is a game built using neural networks. Developed by Google using TensorFlow.js, a version of the machine learning library designed to run in browsers, the game showcases how machine learning can be brought to web applications. But more importantly, TensorFlow.js, which was announced at the end of March at the TensorFlow Developer Summit, looks like it could be a tool to define the next few years of web development, making machine learning more accessible to JavaScript developers than ever before. Start playing now.

At the moment Emoji Scavenger Hunt is pretty basic, but the central idea is pretty cool. When you open up the web page in your browser and click 'Let's Play', the app asks for access to your camera. The game then starts: you'll see a countdown before your camera opens, and the web application asks you to find an example of an emoji in the real world. If you find yourself easily irritated you're probably not going to get addicted, as Google seem to have done their best to cultivate an emoji-esque mise en scène. But the game nevertheless highlights not only how neural networks work, but also, in the context of TensorFlow.js, how they might operate in a browser. Of course, one of the reasons Emoji Scavenger Hunt is so basic is that a core part of the game is training the neural network. Presumably, as more people play it, the neural network will improve at 'guessing' which objects in the real world relate to which emoji on your keyboard.

TensorFlow.js will bring machine learning to the browser

What's exciting is how TensorFlow.js might help shape the future of web development. It's going to make it much easier for JavaScript developers to get started with machine learning - on Reddit a number of users were thankful that they could now use TensorFlow without touching a line of Python code. On the other hand - perhaps a little less likely - TensorFlow.js might lead to more machine learning developers using JavaScript. If games like Emoji Scavenger Hunt become the norm, engineers and data scientists will have a new way to train algorithms - getting users to do it for them.

TensorFlow.js and deeplearn.js

Eagle-eyed readers who have been watching TensorFlow closely might be thinking here - what about deeplearn.js? Fortunately, the TensorFlow team have an answer: TensorFlow.js... is the successor to deeplearn.js, which is now called TensorFlow.js Core.

TensorFlow.js and the future of machine learning

The announcement of TensorFlow.js highlights that Google and the core development team behind TensorFlow have a clear focus on the future. They're already the definitive library for machine learning and deep learning. What this will do is spread its dominance into new domains. Emoji Scavenger Hunt is pointing the way - we're sure to see plenty of machine learning imitators and innovators over the next few years.


7 promising real-world applications of AI-powered Mixed Reality

Sugandha Lahoti
22 Nov 2017
8 min read
Mixed Reality has become a disruptive force that is bridging the gap between reality and imagination. With AI, it is now poised to change the world as we see it! The global mixed reality market is expected to reach USD 6.86 billion by 2024. Mixed Reality has found application not just in the obvious gaming and entertainment industries; it also has great potential in business and other industries ranging from manufacturing, travel, and medicine to advertising. Maybe that is why the biggest names in tech are battling it out to capture the MR market with their devices - the Microsoft HoloLens, Google Glass 2.0, and Meta 2 headsets, to name a few. Incorporating Artificial Intelligence is their next step towards MR market domination. So what's all the hype about MR, and how can AI take it to the next level?

Through the looking glass: Understanding Mixed Reality

Mixed reality is essentially a fantastic concoction of virtual reality (a virtual world with virtual objects) and augmented reality (the real world with digital information). This means virtual objects are overlaid on the real world, and mixed reality enables the person experiencing the MR environment to perceive virtual objects as "real ones". While in augmented reality it is easy to break the illusion and recognize that the objects are not real (hello, Pokemon Go!), in Mixed Reality it is harder to break the illusion, as virtual objects behave like real-world objects. So when you lean in close or interact with a virtual object in MR, it gets closer in the way a real object would.

The MR experience is made possible with mixed reality devices that are typically lightweight and wearable. They are generally equipped with front-mounted cameras to recognize the distinctive features of the real world (such as objects and walls) and blend them with the virtual reality seen through the headset. They also include a processor for processing the information relayed by an array of sensors embedded in the headset. These processors run the algorithms used for pattern recognition on a cloud-based server. The devices then use a projector to display virtual images in real environments, which are finally reflected to the eye with the help of beam-splitting technology. All this sounds magical already - what can AI do for MR to top this?

Curiouser and curiouser: The AI-powered Mixed Reality

Mixed Reality and Artificial Intelligence are two powerful technology tools. The convergence of the two means a seamless immersive experience for users that blends the virtual and physical reality. Mixed Reality devices already enable interaction with virtual holograms in the physical environment and thereby combine virtual worlds with reality. But most MR devices require a large number of calculations and adjustments to accurately determine the position of a virtual object in a real-world scenario. They then apply rules and logic to those objects to make them behave like real-world objects. As these computations happen in the cloud, the results have a perceivable time lag, which gets in the way of giving the user a truly immersive experience. User mobility is also restricted due to current device limitations.

Recently there has been a rise of the AI coprocessor in Mixed Reality devices. The announcement of Microsoft's HoloLens 2 project, an upgrade to the existing MR device which now includes an AI coprocessor, is a case in point. By using AI chips to compute the above calculations, for example, MR devices will deliver high-precision results faster.
It means algorithms and calculations can run instantaneously without the need for data to be sent to and from a cloud. Having the data locally on your headset eliminates the time lag, thereby creating more real-time immersive experiences. In other words, as the visual data is analyzed directly on the device and computationally exhaustive tasks are performed close to the data source, the enhanced processing speed results in quicker performance. Since the data always remains on your headset, fewer computations need to be performed in the cloud, and hence the data is more secure.

Using an AI chip also allows flexible implementation of deep neural networks. They help in automating complex calculations, such as depth perception estimation, and generally provide a better understanding of the environment to the MR devices. Generative models from deep learning can be used to generate believable virtual characters (avatars) in the real world. Images can also be more intelligently compressed with AI techniques, which would enable faster transmission over wireless networks. Motion capture techniques are now employing AI functionalities such as phase-functioned neural networks and self-teaching AI. They use machine learning techniques to draw on a vast library of stored movements and combine and fit them to new characters. By using AI-powered Mixed Reality devices, the plan is to provide a more realistic experience which is fast and offers more mobility. The ultimate goal is to build AI-powered mixed reality devices that are intelligent and self-learning.

Follow the white rabbit: Applications of AI-powered Mixed Reality

Let us look at various sectors where artificially intelligent Mixed Reality has started finding traction.

Gaming and Entertainment

In the field of gaming, procedural content generation (PCG) techniques allow automatic generation of Mixed Reality games (as opposed to manual creation by game designers) by encoding elements such as individual structures, enemies, etc. along with their relationships. Artificial Intelligence enhances PCG algorithms in object identification and in recognizing other relationships between the real and virtual objects. Deep learning techniques can be used for tasks like super-resolution, photo-to-texture mapping, and texture multiplication.

Healthcare and Surgical Procedures

AI-powered mixed reality tech has also found its use in the field of healthcare and surgical operations. Scopis has announced a mixed reality surgical navigation system that uses the Microsoft HoloLens for spinal surgery applications. It employs image recognition and manipulation techniques which allow the surgeon to see both the patient and a superimposed image of the pedicle screws (used for vertebrae fixation surgeries) during surgical procedures.

Retail

Retail is another sector which is under the spell of this AI-infused MR tech. DigitalBridge, a mixed-reality company, uses mixed reality, artificial intelligence, and deep learning to create a platform that allows consumers to virtually try on home decor products before buying them.

Image and Video Manipulation

AI algorithms and MR techniques can also enrich video and image manipulation. As we speak, Microsoft is readying the release of the Microsoft Remix 3D service. This software adds "mixed reality" digital images and animations to videos. It keeps the digital content in the same position in relation to the real objects using image recognition, computer vision, and AI algorithms.
Military and Defence

AI-powered Mixed Reality is also finding use in the defense sector, where MR training simulations controlled by artificially intelligent software combine real people and physical environments with a virtual setup.

Construction and Homebuilding

Builders can visualize their options in life-sized models with MR devices. With AI, they can leave virtual messages or videos at key locations to keep other technicians and architects up to date when they're away. Using MR and AI techniques, an architect can call a remote expert into the virtual environment if need be, and virtual assistants can be utilized for further assistance. COINS is a construction and homebuilding organization which uses AI-powered Mixed Reality devices. They are collaborating with SketchUp for virtual messaging and 3D modeling, and with Microsoft for HoloLens and Skype chatbot assistance.

Industrial Farming

Machine learning algorithms can be used to study a sensor-enabled field of crops, record their growth and requirements, and anticipate future needs. An MR device can then provide a means to interact with the plants and analyze present conditions, all the while adjusting for future needs. Infosys' Plant.IO is one such digital farm which, when combined with the power of an MR device, can overlay virtual objects over a real-world scenario.

Conclusion

Through these examples, we can see the rapid adoption of AI-powered Mixed Reality recipes across diverse fields, enabled by the rise of AI chips and the use of more exhaustive computations with complex machine learning and deep learning algorithms. The next milestone is to see the rise of a Mixed Reality environment which is completely immersive and untethered. This would be made possible by the adoption of more complex AI techniques and by advances made in the field of AI, both in hardware and in software. Instead of voice- or search-based commands in MR environments, AI techniques will be used to harness eye and body gestures. As MR devices become smaller and more mobile, AI-powered mixed reality will also give rise to intelligent application development incorporating mixed reality through the phone's camera lens. AI technologies would thus help expand the scope of MR not only as an interesting tool with applications in gaming and entertainment, but also as a practical and useful approach to how we see, interact with, and learn from our environment.

Why Retailers need to prioritize eCommerce Automation in 2019

Guest Contributor
14 Jan 2019
6 min read
The retail giant Amazon plans to reinvent in-store shopping in much the same way it revolutionized online shopping. Amazon's cashierless stores (3,000 of which you can expect by 2021) give a glimpse into what the future of eCommerce could look like with automation. eCommerce automation involves combining the right eCommerce software with the right processes and the right people to streamline and automate the order lifecycle. This reduces the complexity and redundancy of the many tasks an eCommerce retailer typically faces from the time a customer places an order until it is delivered. Let's look at why eCommerce retailers should prioritize automation in 2019.

1. Augmented Customer Experience + Personalization

A PwC study titled "Experience is Everything" suggests that 42% of consumers would pay more for a friendly, welcoming experience. Way back in 2015, Gartner predicted that nearly 50% of companies would be implementing changes in their business model in order to augment customer experience. This is especially true for eCommerce, and automation certainly represents one of these changes. Customization and personalization of services are a huge boost for customer experience: a BCG report revealed that retailers who implemented personalization strategies saw a 6-10% boost in their sales. How can automation help? To start with, you can automate your email marketing campaigns and make them more personalized by adding recommendations, discount codes, and more.

2. Fraud Prevention

The scope for fraud on the Internet is huge. According to the October 2017 Global Fraud Index, account takeover fraud cost online retailers a whopping $3.3 billion in Q2 2017 alone, and the average transaction rate for eCommerce fraud has also been on the rise. eCommerce retailers have been using a number of fraud prevention tools such as address verification service, CVN (card verification number), credit history checks, and more to verify the buyer's identity. An eCommerce tool equipped with machine learning capabilities, such as Shuup, can detect fraudulent activity and effectively run through thousands of checks in the blink of an eye. Automating order handling ensures that these preliminary checks are carried out without fail, in addition to specific checks that assess the riskiness of a particular order. Depending on the nature and scale of the enterprise, different retailers will want to set different thresholds for fraud detection, and eCommerce automation makes that possible: if a medium-risk transaction is detected, the system can be automated to notify the finance department for immediate review, whereas high-risk transactions can be canceled immediately (a simple sketch of this kind of routing appears below). Automating your eCommerce software processes allows you to break the mold of one-size-fits-all coding and make the solution specific to the needs of your organization.

3. Better Customer Service

Customer service and support is an essential part of the buying process. Automated customer support does not necessarily mean your customers will get canned responses to all their queries; it means that common queries can be dealt with more efficiently. Live chats and chatbots have become incredibly popular with eCommerce retailers because these features offer convenience to both the customer and the retailer: the retailer's support staff are not held up with routine inquiries and the customer gets their queries resolved quickly.
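Coming back to the fraud-screening thresholds mentioned above, here is a minimal sketch of risk-based order routing. It is only an illustration: the risk score is assumed to come from whatever ML fraud checks the platform runs, and the threshold values and handler functions are hypothetical placeholders, not part of any particular eCommerce package.

```python
def cancel_order(order_id):
    print(f"Order {order_id}: cancelled (high fraud risk)")        # placeholder action

def notify_finance(order_id):
    print(f"Order {order_id}: flagged for finance review")         # placeholder action

def continue_fulfillment(order_id):
    print(f"Order {order_id}: passed fraud checks, fulfilling")    # placeholder action

def route_order(order_id, risk_score, medium=0.5, high=0.8):
    """Route an order by fraud-risk score (0 = safe, 1 = almost certainly fraud).

    The thresholds are illustrative; a retailer would tune them to its own
    risk appetite, which is exactly what automation makes configurable.
    """
    if risk_score >= high:
        cancel_order(order_id)          # high risk: cancel immediately
    elif risk_score >= medium:
        notify_finance(order_id)        # medium risk: route to human review
    else:
        continue_fulfillment(order_id)  # low risk: proceed automatically

# Example: three orders with hypothetical risk scores
for oid, score in [("A-1001", 0.12), ("A-1002", 0.63), ("A-1003", 0.91)]:
    route_order(oid, score)
```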
Timely responses are a huge part of what constitutes a positive customer experience. Live chat can even be used for shopping carts in order to decrease cart abandonment rates. Automating priority tickets, as well as the follow-up on resolved and unresolved tickets, is another way to automate customer service. With new-generation automated CRM systems and help desk software you can automate the tags that are used to filter tickets; this saves the customer service rep's time and ensures that priority tickets are resolved quickly. Customer support can thus be turned into a strategic asset in your overall strategy. An Oracle study suggests that back in 2011, 50% of customers would give a brand at most one week to respond to a query before they stopped doing business with them. You can imagine what those same numbers for 2018 would be.

4. Order Fulfillment

Physical fulfillment of orders is prone to errors as it requires humans to oversee the warehouse selection. With an automated solution, you can set up order fulfillment to match the warehouse requirements as closely as possible. This ensures that the closest warehouse which has the required item in stock is selected, which in turn helps guarantee timely delivery of the order to the customer. It can also be set up to integrate with your billing software so as to calculate accurate billing/shipping charges, taxes (for out-of-state or overseas shipments), and so on. Automation can also help manage your inventory effectively. Product availability is updated automatically with each transaction, be it a return or the addition of a new product. If the stock of a particular in-demand item nears danger levels, your eCommerce software can send an automated email to the supplier asking to replenish its stock ASAP. Automation ensures that your prerequisites to order fulfillment are met for successful processing and delivery of an order.

Challenges with eCommerce automation

The aforementioned benefits are not without some risks, just as with any evolving concept. One of the most important challenges to making automation work smoothly for eCommerce is data accuracy. As eCommerce platforms gear up for greater multichannel/omnichannel retailing strategies, automation is certainly going to help them bolster and enhance their workflows, but only if the right checks are in place. Automation still has a long way to go, so for now, it might be best to focus on automating tasks that take up a lot of time, such as updating inventory, reserving products for certain customers, and updating customers on their orders. Advanced applications such as fraud detection might still take a few years to be truly 'automated' and free from any need for human review. For now, eCommerce retailers still have a whole lot to look forward to. All things considered, automating the key tasks and functions of your eCommerce platform will impart flexibility, agility, and scope for improved customer experience. Invest rightly in automation solutions for your eCommerce software to stay competitive in the dynamic, unpredictable eCommerce retail scenario.

Author Bio

Fretty Francis works as a Content Marketing Specialist at SoftwareSuggest, an online platform that recommends software solutions to businesses. Her areas of expertise include eCommerce platforms, hotel software, and project management software. In her spare time, she likes to travel and catch up on the latest technologies.
Software developer tops the 100 Best Jobs of 2019 list by U.S. News and World Report
IBM Q System One, IBM's standalone quantum computer unveiled at CES 2019
Announcing 'TypeScript Roadmap' for January 2019 - June 2019

Deep Learning is all set to revolutionize the music industry

Sugandha Lahoti
11 Dec 2017
6 min read
Isn't it spooky how Facebook can identify the faces of your friends before you manually tag them? Have you been startled when Cortana, Siri, or Google Assistant instantly recognizes and acts on your voice as you speak to these virtual assistants? Deep learning is the driving force behind these uncanny yet innovative applications. The next field that is all set to dive into deep learning is the music industry. Neural networks not only ease the production and generation of songs, but also assist in music recommendation, transcription, and classification. Here are some ways deep learning will elevate music and the listening experience itself.

Generating melodies with neural nets

At the most basic level, a deep learning algorithm follows three simple steps for music generation. First, the neural net is trained with a sample data set of songs which are labelled. The labelling is done based on the emotions you want your song to convey (happy, sad, funny, etc.). For training, the program converts the speech in the data set to text format and then creates a vector for each word. The training data can also be in MIDI format, a standard protocol for encoding musical notes. After training, the program is fed a set of emotions as input. It identifies the associated input vectors and compares them to the training vectors. The output is a melody or chords that represent the desired emotions.

Long short-term memory (LSTM) architectures are also used for music generation. They take structured input of a music notation. These inputs are encoded as vectors and fed into an LSTM at each timestep; the LSTM then predicts the encoding of the next timestep. Fully connected convolutional layers are utilized to increase the music quality and to represent rich features in the frequency domain. Magenta, Google's popular art and music project, has launched Performance RNN, an LSTM-based recurrent neural network designed to produce multiple sounds with expressive timing and dynamics. In other words, Performance RNN determines which notes to play, when to play them, and how hard to strike each note. IBM's Watson Beat uses a neural network to produce complete tracks by understanding music theory, structure, and emotional intent. According to Richard Daskas, a music composer working on the Watson Beat project, "Watson only needs about 20 seconds of musical inspiration to create a song."

Transcribing music with deep learning

Deep learning methods can also be used for arranging a piece of music for a different instrument. LSTM networks are a popular choice for music transcription and modelling. These networks are trained using a large dataset of pre-labelled music transcriptions (expressed in ABC notation), which are then used to generate new music transcriptions. In fact, transformed audio data can be used to predict the group of notes currently being played, by treating the transcription model as an image classification problem. For this, an image of the audio, called a spectrogram, is used. A spectrogram displays how the spectrum, or frequency content, changes over time. A short-time Fourier transform (STFT) or a constant-Q transform is used to create this spectrogram. The spectrogram is then fed to a convolutional neural network (CNN). The CNN estimates the current notes from the audio data and determines which specific notes are present by analysing 88 output nodes, one for each of the piano keys.
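As a rough illustration of the next-timestep prediction described above, here is a minimal Keras sketch. It is not Performance RNN or Watson Beat: the one-hot pitch encoding, corpus size, and random placeholder data are all assumptions standing in for a real encoded MIDI or ABC dataset.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

num_pitches, seq_len, num_sequences = 128, 50, 200   # assumed one-hot MIDI pitch vocabulary

# Random placeholder data standing in for encoded note sequences and their next notes
x = np.random.rand(num_sequences, seq_len, num_pitches)
y = np.eye(num_pitches)[np.random.randint(num_pitches, size=num_sequences)]

model = Sequential()
model.add(LSTM(256, input_shape=(seq_len, num_pitches)))   # reads a sequence of note vectors
model.add(Dense(num_pitches, activation='softmax'))        # predicts the encoding of the next timestep
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x, y, epochs=5, batch_size=32)

# Generation loop: predict a note, append it to the seed sequence, and repeat
seed = x[:1]
for _ in range(16):
    probs = model.predict(seed)[0]
    next_note = np.eye(num_pitches)[np.argmax(probs)]
    seed = np.concatenate([seed[:, 1:, :], next_note.reshape(1, 1, num_pitches)], axis=1)
```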
This network is generally trained using a large number of examples from MIDI files spanning several different genres of music. Magenta has developed the NSynth dataset, a high-quality multi-note dataset for music transcription. It is inspired by image recognition datasets and contains a huge collection of annotated musical notes.

Make better music recommendations

Neural nets are also used to make intelligent music recommendations and are a step ahead of traditional collaborative filtering. Using neural networks, the system can analyse the songs saved by users and then utilize those songs to make new recommendations. Neural nets can also analyze songs based on musical qualities such as pitch, chord progression, and bass. Using the similarities between songs sharing the same traits, neural networks can detect and predict new songs, thus providing recommendations based on similar lyrical and musical styles. Convolutional neural networks (CNNs) are used for making these recommendations: a time-frequency representation of the audio signal is fed into the network as the input, and three-second audio clips are randomly chosen from the audio samples to train it. The CNN is then used to predict latent factors from the music audio by averaging the predictions over consecutive clips. The feature extraction and pooling layers permit operation on several timescales. Spotify has worked on a music recommendation system with a CNN; trained on short clips of songs, this system can create playlists based on the audio content alone.

Classifying music according to genre

Classifying music according to genre is another achievement of neural nets. At the heart of this application lies the LSTM network. At the very first stage, convolutional layers are used for feature extraction from the spectrograms of the audio file. The sequence of features so obtained is given as input to the LSTM layer. The LSTM evaluates dependencies of the song across both short time periods and the long-term structure. After the LSTM, the input is fed into a fully connected, time-distributed layer, which essentially gives us a sequence of vectors. These vectors are then used to output the network's evaluation of the genre of the song at that particular point in time. Deepsound uses the GTZAN dataset and an LSTM network to create a model for music genre recognition; comparing the mean output distribution with the correct genre, the model achieves almost 67% accuracy.

For musical pattern extraction, the MFCC feature dataset is used for audio analysis. First, the audio signal is extracted in MFCC format. Next, the input song is converted into an MFCC map, which is then split up to be fed as the input of the CNN. Supervised learning is used to automatically obtain musical pattern extractors, given that the song label is provided. The extractors so acquired are used for restoring high-order pattern-related features. After high-order classification, the results are combined and undergo a voting process to produce the song-level label. Scientists from Queen Mary University of London trained a neural net with over 6,000 songs in ballad, hip-hop, and dance genres to develop a network that achieves almost 75% accuracy in song classification.
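Below is a minimal sketch of the convolution-plus-LSTM genre classifier described above, in a simplified clip-level form (it predicts one genre per clip rather than per timestep). The spectrogram dimensions, number of genres, and random placeholder data are assumptions for illustration; a real model would be trained on something like the GTZAN clips mentioned earlier.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

num_clips, frames, bins, num_genres = 100, 128, 40, 10   # assumed spectrogram shape and genre count

# Random placeholder data standing in for spectrogram clips and one-hot genre labels
x = np.random.rand(num_clips, frames, bins)
y = np.eye(num_genres)[np.random.randint(num_genres, size=num_clips)]

model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(frames, bins)))  # feature extraction over time
model.add(MaxPooling1D(2))
model.add(LSTM(64))                                                      # short- and long-term structure
model.add(Dense(num_genres, activation='softmax'))                       # genre probabilities
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y, epochs=5, batch_size=16)
```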
The future will see more complex models and data representations that understand the underlying melodic structure, helping models create compelling artistic content on their own. The combination of music and technology will also foster a collaborative community of artists, coders, and deep learning researchers, leading to a tech-driven, yet artistic, future.

What is Core ML?

Savia Lobo
28 Sep 2018
5 min read
Introduced by Apple, Core ML is a machine learning framework that enables iOS app developers to integrate machine learning technology into their apps. It supports natural language processing (NLP), image analysis, and various other conventional models to provide top-notch on-device performance with a minimal memory footprint and power consumption. This article is an extract from the book Machine Learning with Core ML written by Joshua Newnham. In this article, you will get to know the basics of what Core ML is and its typical workflow.

With the release of iOS 11 and Core ML, performing inference is just a matter of a few lines of code. Prior to iOS 11, inference was possible, but it required some work to take a pre-trained model and port it across using an existing framework such as Accelerate or Metal Performance Shaders (MPSes). Accelerate and MPSes are still used under the hood by Core ML, but Core ML takes care of deciding which underlying framework your model should use (Accelerate using the CPU for memory-heavy tasks and MPSes using the GPU for compute-heavy tasks). It also abstracts a lot of the details away; this layer of abstraction is shown in the following diagram:

There are additional layers too; iOS 11 has introduced and extended domain-specific layers that further abstract many of the common tasks you may perform when working with image and text data, such as face detection, object tracking, language translation, and named entity recognition (NER). These domain-specific layers are encapsulated in the Vision and natural language processing (NLP) frameworks; we won't go into the details of these frameworks here, but you will get a chance to use them in later chapters. It's worth noting that these layers are not mutually exclusive, and it is common to find yourself using them together, especially the domain-specific frameworks, which provide useful preprocessing methods we can use to prepare our data before sending it to a Core ML model.

So what exactly is Core ML? You can think of Core ML as a suite of tools used to facilitate the process of bringing ML models to iOS and wrapping them in a standard interface so that you can easily access and make use of them in your code. Let's now take a closer look at the typical workflow when working with Core ML.

Core ML Workflow

As described previously, the two main tasks of an ML workflow are training and inference. Training involves obtaining and preparing the data, defining the model, and then the real training. Once your model has achieved satisfactory results during training and is able to make adequate predictions (including on data it hasn't seen before), it can be deployed and used for inference on data outside of the training set. Core ML provides a suite of tools to facilitate getting a trained model into iOS; one of them is the Python package called Core ML Tools, which takes a model (consisting of the architecture and weights) from one of the many popular packages and exports a .mlmodel file that can then be imported into your Xcode project. Once imported, Xcode generates an interface for the model, making it easily accessible via code you are familiar with. Finally, when you build your app, the model is further optimized and packaged up within your application.
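As a rough sketch of that export step, the snippet below uses the Core ML Tools Python package to convert a Keras model into a .mlmodel file. The file name, input/output names, and class labels are hypothetical, and the exact converter call may differ between coremltools versions; treat it as an illustration of the workflow rather than a definitive recipe.

```python
import coremltools

# Convert a (hypothetical) trained Keras image classifier saved as my_model.h5
coreml_model = coremltools.converters.keras.convert(
    'my_model.h5',
    input_names='image',
    image_input_names='image',        # treat the input as an image for Vision-friendly use
    output_names='labelProbabilities',
    class_labels=['cat', 'dog'])      # assumed labels, for illustration only

# Optional metadata shown in Xcode's model viewer
coreml_model.author = 'Example Author'
coreml_model.short_description = 'Toy cat/dog classifier'

# Save the .mlmodel file, ready to drag into an Xcode project
coreml_model.save('PetClassifier.mlmodel')
```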
A summary of the process of generating the model is shown in the following diagram:

The previous diagram illustrates the process of creating the .mlmodel, either by using an existing model from one of the supported frameworks or by training it from scratch. Core ML Tools supports most of the popular frameworks, either internally or through third-party plug-ins, including Keras, Turi, Caffe, scikit-learn, LIBSVM, and XGBoost. Apple has also made this package open source and modular, so it can easily be adapted for other frameworks or extended yourself. The process of importing the model is illustrated in this diagram:

In addition, there are frameworks with tighter integration with Core ML that handle generating the Core ML model, such as Turi Create, IBM Watson Services for Core ML, and Create ML. We will be introducing Create ML in Chapter 10; those interested in learning more about Turi Create and IBM Watson Services for Core ML can refer to the official webpages via the following links:

Turi Create: https://github.com/apple/turicreate
IBM Watson Services for Core ML: https://developer.apple.com/ibm/

Once the model is imported, as mentioned previously, Xcode generates an interface that wraps the model, its inputs, and its outputs. Thus, in this post, we learned about the workflow of training and how to import a model. If you've enjoyed this post, head over to the book Machine Learning with Core ML to delve into the details of what this model is and what Core ML currently supports.

Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial]
Build intelligent interfaces with CoreML using a CNN [Tutorial]
Watson-CoreML: IBM and Apple's new machine learning collaboration project

What is Keras?

Janu Verma
01 Jun 2017
5 min read
What is Keras?

Keras is a high-level library for deep learning, built on top of Theano and TensorFlow. It is written in Python and provides a scikit-learn-style API for building neural networks. It enables developers to quickly build neural networks without worrying about the mathematical details of tensor algebra, optimization methods, and numerical techniques.

The key idea behind Keras is to facilitate fast prototyping and experimentation. In the words of Francois Chollet, the creator of Keras: "Being able to go from idea to result with the least possible delay is key to doing good research." This is a huge advantage, especially for scientists and beginner developers, because one can jump right into deep learning without getting their hands dirty. Because deep learning is currently on the rise, the demand for people trained in deep learning is ever increasing. Organizations are trying to incorporate deep learning into their workflows, and Keras offers an easy API to test and build deep learning applications without considerable effort. Deep learning research is such a hot topic right now that scientists need a tool to quickly try out their ideas, and they would rather spend time coming up with ideas than putting together a neural network model. I use Keras in my own research, and I know a lot of other researchers relying on Keras for its easy and flexible API.

What are the key features of Keras?

Keras is a high-level interface to Theano or TensorFlow, and either can be used as the backend; it is extremely easy to switch from one backend to another. Training deep neural networks is a memory- and time-intensive task. Modern deep learning frameworks like TensorFlow, Caffe, Torch, etc. can also run on GPU, though there might be some overhead in setting up and running the GPU. Keras runs seamlessly on both CPU and GPU.

Keras supports most of the neural layer types, e.g. fully connected, convolution, pooling, recurrent, embedding, dropout, etc., which can be combined in any which way to build complex models. Keras is modular in the sense that each component of a neural network model is a separate, standalone, fully configurable module, and these modules can be combined to create new models. Essentially, layers, activations, optimizers, dropout, losses, etc. are all different modules that can be assembled to build models. A key advantage of modularity is that new features are easy to add as separate modules. This makes Keras fully expressive, extremely flexible, and well suited for innovative research. Coding in Keras is extremely easy. The API is very user-friendly with the least amount of cognitive load. Keras is a full Python framework, and all coding is done in Python, which makes it easy to debug and explore. The coding style is very minimalistic, and operations are added in very intuitive Python statements.

How is Keras built?

The core component of the Keras architecture is a model. Essentially, a model is a neural network model with layers, activations, optimization, and loss. The simplest Keras model is Sequential, which is just a linear stack of layers; other layer arrangements can be formed using the Functional model. We first initialize a model, add layers to it one by one, each layer followed by its activation function (and regularization, if desired), and then the cost function is added to the model. The model is then compiled.
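A minimal sketch of this build-compile-train-predict workflow is shown below. The layer sizes, optimizer, and random placeholder data are arbitrary choices for illustration, not a recommendation.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Random placeholder data: 1,000 samples with 20 features and binary labels
x_train = np.random.rand(1000, 20)
y_train = np.random.randint(2, size=(1000, 1))

# Build: a linear stack of layers, each with its activation
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# Compile: attach the cost function, optimizer, and metrics
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

# Train and predict with the scikit-learn-like API
model.fit(x_train, y_train, epochs=10, batch_size=32)
predictions = model.predict(x_train[:5])

# Save the trained model as an HDF5 file for later reuse
model.save('my_model.h5')
```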
A compiled model can be trained using the simple API (model.fit()), and once trained, the model can be used to make predictions (model.predict()). The similarity to the scikit-learn API can be noted here. Two models can be combined sequentially or in parallel. A model trained on some data can be saved as an HDF5 file, which can be loaded at a later time. This eliminates the need to train a model again and again: train once and make predictions whenever desired.

Keras provides an API for most common types of layers. You can also merge or concatenate layers for a parallel model, and it is possible to write your own layers. Other ingredients of a neural network model, like the loss function, metrics, optimization method, activation function, and regularization, are all available with the most common choices. Another very useful component of Keras is the preprocessing module, with support for manipulating and processing image, text, and sequence data. A number of deep learning models and their weights, obtained by training on big datasets, are also made available. For example, there are the VGG16, VGG19, InceptionV3, Xception, and ResNet50 image recognition models with their weights after training on ImageNet data. These models can be used for direct prediction, feature building, and/or transfer learning.

One of the greatest advantages of Keras is the huge list of example code available on the Keras GitHub repository (with discussions on the accompanying blog) and on the wider Internet. You can learn how to use Keras for text classification using an LSTM model, generate inceptionistic art using Deep Dream, use pre-trained word embeddings, build a variational autoencoder, or train a Siamese network, among other things. There is also a visualization module, which provides functionality to draw a Keras model; it uses the graphviz library to plot and save the model graph to a file. All in all, Keras is a library worth exploring, if you haven't already.

Janu Verma is a Researcher at the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational biology, and healthcare analytics. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute. He has written papers for IEEE Vis, KDD, the International Conference on Healthcare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journal, etc. His current focus is on the development of visual analytics systems for prediction and understanding. He advises startups and companies on data science and machine learning in the Delhi-NCR area; email him to schedule a meeting. Check out his personal website here.

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Amarabha Banerjee
15 Aug 2018
4 min read
The most feared online entities of the present day are trolls. Trolls, a fearsome bunch of fake or pseudo online profiles, tend to attack online users, mostly celebrities, sportspersons, or political figures, using a wide range of methods. One of these methods is to post obscene or NSFW (Not Safe For Work) content on your profile or website where User Generated Content (UGC) is allowed. This can create unnecessary attention and cause legal troubles for you too. The traditional way out is to get a moderator (or a team of them) and let all the UGC pass through this moderation system. This is a sustainable solution for a small platform. But if you are running a large-scale app, say a publishing app where you publish one hundred stories a day, and the success of these stories depends on user interaction with them, then this model of manual moderation becomes unsustainable. The more UGC there is, the longer the turnaround time and the larger the moderation team, resulting in escalating costs for a purpose that does not contribute to your business growth in any manner.

That's where machine learning can help. Machine learning algorithms that scan images and content for possible abusive or adult content are a better solution than manual moderation. Tech giants like Microsoft, Google, and Amazon have ready solutions for this: they have created APIs which are commercially available to developers, and you can incorporate these APIs into your application to weed out the filth served by the trolls. The APIs available for this purpose are Microsoft moderation, Google Vision, AWS Rekognition, and Clarifai.

Dataturks have made a comparative study of these APIs on one particular dataset to measure their efficiency. They used the YACVID dataset with 180 images, manually labelling 90 of these images as nude and the rest as non-nude. The dataset was then fed to the four APIs mentioned above, and their efficiency was tested based on the following parameters:

True Positive (TP): given a safe photo, the API correctly says so.
False Positive (FP): given an explicit photo, the API incorrectly classifies it as safe.
False Negative (FN): given a safe photo, the API is not able to detect it as safe.
True Negative (TN): given an explicit photo, the API correctly says so.

TP and TN are the two cases in which the system behaves correctly. An FP means that the app is vulnerable to attacks from trolls, while a high FN count means the efficiency of the system is low and hence not practically viable. Around 10% of the cases would be such that the API can't decide whether the content is explicit or not; those would be sent for manual moderation, which would bring down the cost of maintaining the moderation team. The results they received are shown below:

Source: Dataturks

As is evident from the above table, the best standalone API is Google Vision, with 99% accuracy and a 94% recall value. The recall value implies that if the same images are repeated, it can recognize them with 94% precision. The best results, however, were obtained with the combination of Microsoft and Google. The comparison of the response times is shown below:

Source: Dataturks

The response times might have been affected by the fact that all the images accessed by the APIs were stored in Amazon S3, so the AWS API might have had an unfair advantage; the timings were noted for 180 image calls per API. The cost is lowest for AWS Rekognition at $1 for 1,000 calls to the API; it's $1.2 for Clarifai and $1.5 each for Microsoft and Google.
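For readers who want to reproduce accuracy and recall figures from a confusion matrix like the one above, here is a small helper using the four counts as the study defines them. The example counts passed in at the bottom are made up for illustration; they are not the actual Dataturks numbers.

```python
def moderation_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from the study's four counts.

    tp: safe photo correctly marked safe
    fp: explicit photo incorrectly marked safe
    fn: safe photo not recognised as safe
    tn: explicit photo correctly marked explicit
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # of photos marked safe, how many really were safe
    recall = tp / (tp + fn)      # of safe photos, how many were recognised as safe
    return accuracy, precision, recall

# Hypothetical counts for a 180-image test set (not the study's real figures)
acc, prec, rec = moderation_metrics(tp=85, fp=2, fn=5, tn=88)
print(f"accuracy={acc:.2%}, precision={prec:.2%}, recall={rec:.2%}")
```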
One notable drawback of the Amazon API was that the images had to be stored as S3 objects, or converted into them; all the other APIs accepted any web link as a possible source of images. What this study shows is that filtering out negative and explicit content in your app is much easier now. You might still need a small team of moderators, but their job will be made a lot easier by the ML models implemented in these APIs. Machine learning is paving the way for us to be safe from the increasing menace of trolls, a threat to free speech and the open sharing of ideas which were the founding stones of the internet and the world wide web as a whole. Will this discourage trolls from continuing their slandering, or will it create a counter-system to bypass the APIs and checks? We can only know in time.

Facebook launches a 6-part Machine Learning video series
Google's new facial recognition patent uses your social network to identify you!
Microsoft's Brad Smith calls for facial recognition technology to be regulated