The following are some important components of Spark:

  1. Cluster Manager
    1. Allocates cluster resources and runs the Spark Application in Cluster Mode (e.g., Standalone, YARN, Mesos, Kubernetes)
  2. Application
    • User program built on Spark. Consists of:
    • Driver Program
      • The program that creates the SparkContext. Acts as the coordinator for the Application
    • Executors
      • Run computations & store Application Data
      • Are launched at the beginning of an Application & run for its entire lifetime
      • Each Application gets its own Executors
      • An Application can have multiple Executors
      • An Executor is not shared by Multiple Applications
      • Provides in-memory storage for RDDs
      • In Standalone mode, by default no more than one Executor per Application runs on the same Node
  3. Task
    1. Represents a unit of work in Spark
    2. Gets executed on an Executor
  4. Job 
    1. Parallel computation consisting of multiple Tasks that gets spawned in response to a Spark action (e.g., save, collect); see the sketch after this list
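
To tie these components together, below is a minimal sketch of a complete Application in Scala, assuming spark-core is on the classpath and a local file named input.txt exists (both illustrative). The Driver Program creates the SparkContext; the collect() action at the end spawns a Job, which is split into Tasks that run on the Executors.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Driver Program: creates the SparkContext, which coordinates the Application.
object WordCount {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs everything in-process; in Cluster Mode this would point
    // at a Cluster Manager instead (e.g., "yarn" or "spark://host:7077").
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Transformations only build the RDD lineage; no Tasks run yet.
    val counts = sc.textFile("input.txt")            // path is illustrative
      .flatMap(line => line.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // collect() is an action: it spawns a Job consisting of multiple Tasks,
    // which execute on the Executors and return results to the Driver.
    counts.collect().foreach(println)

    sc.stop()
  }
}
```

Calling counts.cache() before the action would exercise the Executors' in-memory RDD storage mentioned above.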