What the PySpark progress bar means
If you've used Spark, you will have seen a progress bar in the console when you submit jobs:
[Stage 2: ============> (2+2) / 960]
What do the numbers on the right-hand side tell us?
(2+2) / 960
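: (numCompletedTasks + numActiveTasks) / totalNumOfTasksInThisStage. In this example, 2 tasks have completed, 2 are currently running, and the stage has 960 tasks in total.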
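
To make the breakdown concrete, here is a minimal sketch in plain Python that pulls the three numbers out of a progress line like the one above. The names `PROGRESS_RE` and `parse_progress` are hypothetical helpers for illustration, not part of Spark, and the `pending` figure is simply derived as total minus completed minus active.

```python
import re

# Matches "[Stage <id>: ... (<completed>+<active>) / <total>]".
# Whitespace around "+" and "/" is optional, so "(2+2)" and "(2 + 2)" both match.
PROGRESS_RE = re.compile(
    r"\[Stage (\d+):.*?\((\d+)\s*\+\s*(\d+)\)\s*/\s*(\d+)\]"
)


def parse_progress(line):
    """Return the task counts from a progress line, or None if it doesn't match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    stage_id, completed, active, total = map(int, match.groups())
    return {
        "stage_id": stage_id,
        "completed": completed,   # numCompletedTasks
        "active": active,         # numActiveTasks
        "total": total,           # totalNumOfTasksInThisStage
        "pending": total - completed - active,  # derived: not yet started
    }


info = parse_progress("[Stage 2: ============> (2+2) / 960]")
print(info)
```

For the example line, this reports stage 2 with 2 completed, 2 active, and 956 of the 960 tasks still waiting to start.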