Apache Spark 2 – Data Science Education with Apache Spark 2
Spark is one of the most widely used engines for large-scale data processing, and it is fast. It is a framework whose tools are equally useful to application developers and data scientists.
This Learning Path begins with an introduction to Apache Spark. We first cover the basics of Spark and introduce SparkR, then look at the charting and plotting features of Python in conjunction with Spark data processing, and finally turn to Spark’s data processing libraries. We then develop a real-world Spark application. Next, to help you become comfortable and confident working with Spark for data science, we explore Spark’s data science libraries on a dataset of tweets.
Table of Contents:
Chapter 1: Apache Spark 2 for Beginners
Chapter 2: Data Science with Spark
Publisher: Packt Publishing
Language of instruction: English
Instructor: Taabish Khan – Curator
Level: Beginner, Intermediate
Duration: 8 hours 30 minutes
File size: 1690 MB