As we mentioned in Chapter 8, Spark SQL, the big data compute stack does not work in isolation; integration points across multiple stacks and technologies are essential. In this chapter, we will look at how Spark works with some of the big data technologies in the Hadoop ecosystem. We will cover the following topics:
Parquet: An efficient columnar storage format
HBase: A distributed, column-oriented database in the Hadoop ecosystem