Hadoop Interview questions

Total available count: 27
Subject - Apache
Subsubject - Hadoop

What is DAGSchedular and how it performs?

DAGScheduler is the scheduling layer of Apache Spark that implements stage-oriented scheduling, i.e. post an RDD action has been called it becomes a job that is then transformed into a set of steps that are submitted as TaskSets for execution. 

DAGScheduler uses an event queue architecture in which a thread can post DAGSchedulerEvent events, e.g. a new/fresh job or stage being submitted, that DAGScheduler reads and executes sequentially.

Next 5 interview question(s)

Please define executors in detail?
Please explain, how workers work, when a new Job submitted to them?
What are the workers?
What is the purpose of Driver in Spark Architecture?
Define Spark architecture?