Intro to Spark for Data Science San Francisco


Course Details

Are you ready to take your data engineering skills to the next level with Spark? In this class, you’ll learn to batch process data, build data pipelines and process data in near real time with Spark.


Why Spark?

Originally created at the University of California, Berkeley, Spark is a powerful, open source processing engine for data distributed across large clusters. Spark is optimized for speed and ease of use; by caching data in memory, it can run some distributed algorithms up to 100x faster than Hadoop MapReduce. Spark can be used for batch processing and for processing data in near real time. This workshop series is for anyone who wants to master Spark to analyze data at scale.

What You’ll Learn

In this four-week, hands-on Spark training, you’ll learn to:

  • Use Spark to solve real-world problems and use cases
  • Process terabytes of data using Spark
  • Write idiomatic, Pythonic code
  • Build near-real-time big data applications using Spark Streaming
  • Optimize Spark applications
*Course completion will empower you to use Spark on projects but does not guarantee a job in big data engineering.

Have More Questions?

Read our full FAQ or get in touch with someone on the Galvanize team.
