• Optimize resources and costs by utilizing Spark's speed
• Troubleshoot the Spark execution DAG by exploring Spark logical and physical query plans to perform the same logic on fewer executors and machines
• Solve the problem of slow-running jobs and speed up feedback loops by creating efficient transformations and joins with the Spark APIs
Description
Apache Spark has been around for quite some time, but do you really know how to solve the development issues and problems you face with it? This course opens up new possibilities and covers many aspects of Apache Spark; some you may know and some you probably never knew existed. If learning and performing tasks on Spark takes you a lot of time, you cannot leverage Apache Spark's full capabilities and features, and you hit a roadblock in your development journey. Common problems and bugs keep you from optimizing your development process, so you'll be looking for techniques that help you avoid pitfalls and common errors during development. With this course, you'll learn to implement practical and proven techniques, backed by proper research, to improve particular aspects of Apache Spark.
You need to understand the common problems and issues Spark developers face, collate them, and build simple solutions for them. One way to find common issues is to look at the questions asked on Stack Overflow. This is a high-quality troubleshooting course: it highlights issues developers face at different stages of application development and provides simple, practical solutions to them. Beyond supplying solutions to these problems and challenges, the course also focuses on discovering new possibilities with Apache Spark. By the end of this course, you will be able to solve your Spark problems without any hassle.
All the code and supporting files for this course are available on GitHub at https://github.com/PacktPublishing/Troubleshooting-Apache-Spark
What you will learn
• Solve long-running computation problems by leveraging lazy evaluation in Spark
• Avoid memory leaks by understanding the internal memory management of Apache Spark
• Fix pipelines that fail to scale out by using partitions
• Debug and create user-defined functions that enrich the Spark API
• Choose a proper join strategy depending on the characteristics of your input data
• Troubleshoot the APIs for joins: DataFrames or Datasets
• Write code that minimizes object creation using the proper API
• Troubleshoot real-time pipelines written in Spark Streaming
What do you get with a video?
Download this video in MP4 format
Access this title in our online reader with advanced features
DRM FREE - Read whenever, wherever and however you want
Tomasz Lelek is a Software Engineer who programs mostly in Java and Scala. He is a fan of microservice architectures and functional programming. He dedicates considerable time and effort to being better every day. Recently, he's been delving into big data technologies such as Apache Spark and Hadoop. He is passionate about nearly everything associated with software development.
Tomasz thinks that we should always try to consider different solutions and approaches before solving a problem. Recently, he was a speaker at several conferences in Poland, including Confitura and JDD (Java Developer's Day), and also at the Krakow Scala User Group. You can find the video of his JDD talk on ML with Spark here: https://www.youtube.com/watch?v=BnORjQbnZNQ&t
He also conducted a live coding session at Geecon Conference. He is currently working on this website using ML: http://www.allegro.pl
How can I download a video package for offline viewing?
Log in to your account at Packtpub.com.
Click on "My Account" and then click on the "My Videos" tab to access your videos.
Click on the "Download Now" link to start your video download.
How can I extract my video file?
All modern operating systems ship with ZIP file extraction built in. If you'd prefer to use a dedicated compression application, we've tested WinRAR / 7-Zip for Windows, Zipeg / iZip / UnRarX for Mac and 7-Zip / PeaZip for Linux. These applications support all common archive file extensions.
How can I get help and support around my video package?
If your video course doesn't give you what you were expecting, either because of functionality problems or because the content isn't up to scratch, please email customercare@packt.com with details of the problem. In addition, so that we can best provide the support you need, please include the following information for our support team:
Video
Format watched (HTML, MP4, streaming)
Chapter or section that issue relates to (if relevant)
System being played on
Browser used (if relevant)
Details of support
Why can’t I download my video package?
In the event that you are having issues downloading your video package, please follow these instructions:
Disable all your browser plugins and extensions: Some security and download manager extensions can cause issues during the download.
Download the video course using a different browser: We've tested that downloads operate correctly in current versions of Chrome, Firefox, Internet Explorer, and Safari.