Spark By Examples | Learn Spark Tutorial with Examples.

What is Apache Spark?

Apache Spark is an open-source analytical processing engine for large-scale distributed data processing and machine learning applications. Spark was originally developed at the University of California, Berkeley, and later donated to the Apache Software Foundation. In February 2014, Spark became a Top-Level Apache Project. Thousands of engineers have since contributed to it, making it one of the most active open-source projects at Apache.

Apache Spark Features

  • In-memory computation

  • Distributed processing using parallelize

  • Can be used with many cluster managers (Spark standalone, YARN, Mesos, etc.)

  • Fault-tolerant

  • Immutable datasets (transformations return a new dataset instead of modifying the existing one)

  • Lazy evaluation

  • Cache & persistence

  • Built-in optimization when using DataFrames

  • Supports ANSI SQL
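
Two of the features above, immutability and lazy evaluation, are easier to see in code. The following is a minimal plain-Python sketch of the idea, not Spark code: the class LazyDataset and the helper double are hypothetical names invented for this illustration. It only borrows the method names map, filter, and collect, which Spark RDDs also provide, to show that transformations are recorded and nothing is computed until an action asks for a result.

```python
from typing import Callable, Iterable, List, Tuple


class LazyDataset:
    """Toy stand-in for an immutable, lazily evaluated dataset (not a Spark class)."""

    def __init__(self, source: Iterable[int],
                 steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = tuple(source)  # frozen copy: the input is never mutated
        self._steps = steps           # recorded transformations, not yet run

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: return a NEW dataset with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also only recorded, never applied here.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self) -> List[int]:
        # Action: only now are the recorded steps applied, in order.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []  # records every element that double() actually processes


def double(x: int) -> int:
    calls.append(x)
    return x * 2


base = LazyDataset(range(1, 6))
evens_doubled = base.filter(lambda x: x % 2 == 0).map(double)

calls_before_action = list(calls)  # still empty: nothing has run yet
result = evens_doubled.collect()   # the action triggers the computation
print(calls_before_action, result, base.collect())
# prints: [] [4, 8] [1, 2, 3, 4, 5]
```

The original dataset is unchanged after the chain of transformations, and double runs only when collect is called. Real Spark additionally distributes the work across a cluster and can optimize the recorded plan, which this sketch does not attempt.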