In Apache Spark, scheduling is the process of managing how jobs, stages, and tasks are executed across the cluster. Spark supports various levels of scheduling, from high-level job scheduling to low-level task execution
Job Scheduling
At the highest level, Spark schedules jobs. A job is triggered when an action (e.g., collect, save, or count) is called on an RDD, DataFrame, or Dataset. Spark provides two primary job scheduling modes:
- FIFO Scheduler (First In, First Out):
This is the default scheduling mechanism in Spark. Jobs are queued and executed sequentially in the order they are submitted. This method is straightforward and works well for workloads where jobs do not need to share cluster resources. - Fair Scheduler:
The Fair Scheduler allows multiple jobs to share cluster resources equitably. Jobs are assigned to pools, and resources are distributed fairly among them. Pools can also have different weights to prioritize certain jobs over others. The Fair Scheduler is configured using afair-scheduler.xmlfile, which specifies pool settings and priorities.
Task Scheduling
Once a job is broken into stages (via the DAGScheduler), Spark schedules individual tasks within each stage. The TaskScheduler is responsible for distributing these tasks to the cluster’s executors. It ensures data locality preferences (i.e., running tasks close to the data they operate on) and handles retries for failed tasks.
Scheduling and Executors
Job and Task scheduling are strongly related concept to Spark Executors
Speculative Execution
To address the issue of slow-running tasks (stragglers), Spark supports speculative execution. When enabled (spark.speculation=true), Spark launches duplicate instances of slow tasks, and the result from the first completed instance is used. This is especially helpful for reducing the impact of stragglers on job completion times.
Advanced Scheduling Features
- Barrier Execution Mode:
This mode is designed for tasks that require global coordination. All tasks in a barrier stage must complete before the next stage begins. It is useful for workloads like distributed deep learning. - Resource Profiles (introduced in Spark 3.0):
Resource profiles allow different tasks within the same application to request varying resource configurations. This is helpful for handling heterogeneous workloads.
Streaming workloads
While the underlying scheduling mechanisms remain the same, scheduling is adapted for processing micro-batches of data in real-time.