You're reading from Machine Learning with Apache Spark Quick Start Guide

Product type: Book
Published in: Dec 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789346565
Edition: 1st
Author: Jillur Quddus

Jillur Quddus is a lead technical architect, polyglot software engineer, and data scientist with over 10 years of hands-on experience in architecting and engineering distributed, scalable, high-performance, and secure solutions used to combat serious organized crime, cybercrime, and fraud. Jillur has extensive experience working within central government, intelligence, law enforcement, and banking, and has worked across the world, including in Japan, Singapore, Malaysia, Hong Kong, and New Zealand. Jillur is both the founder of Keisan, a UK-based company specializing in open source distributed technologies and machine learning, and the lead technical architect at Methods, the leading digital transformation partner for the UK public sector.

Summary

In this chapter, we have studied, implemented, and evaluated common algorithms that are used in natural language processing. We have preprocessed a corpus of documents using feature transformers and generated feature vectors from the resulting processed corpus using feature extractors. We have also applied these common NLP algorithms to machine learning. We trained and tested a sentiment analysis model that we used to predict the underlying sentiment of tweets so that organizations may improve their product and service offerings. In Chapter 8, Real-Time Machine Learning Using Apache Spark, we will extend our sentiment analysis model to operate in real time using Spark Streaming and Apache Kafka.
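
The preprocessing and feature-extraction steps summarized above can be illustrated with a minimal sketch. It uses plain Python rather than Spark's MLlib API, so it is only an analogy for what feature transformers and feature extractors do. The stop-word list, the function names, and the two sample sentences are all assumptions made for illustration and do not come from the chapter.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "or", "to", "of", "in"}


def tokenize(text):
    """Transformer step: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Transformer step: drop tokens that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(tokens, vocabulary):
    """Extractor step: count how often each vocabulary term occurs."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Made-up sample documents standing in for a corpus of tweets.
corpus = [
    "The flight was great and the crew was friendly",
    "The flight was late and the service was terrible",
]

# Preprocess every document, then build a shared, sorted vocabulary.
processed = [remove_stop_words(tokenize(doc)) for doc in corpus]
vocabulary = sorted({term for doc in processed for term in doc})

# One term-frequency feature vector per document.
vectors = [term_frequencies(doc, vocabulary) for doc in processed]

for doc, vec in zip(processed, vectors):
    print(doc, vec)
```

Each resulting vector has one position per vocabulary term, which is the kind of numeric input a classifier such as a sentiment analysis model can be trained on.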

In the next chapter, we will explore the exciting and cutting-edge world of deep learning through hands-on examples!
