In industry, there is often a requirement to run multiple instances of Hadoop (production, test) on the same data set. The Mesos framework fulfills this kind of requirement. Mesos makes it easy to share cluster resources such as CPU, RAM, or hard disk space across distributed frameworks.
As shown in the figure below, Mesos acts as a common resource-sharing layer for all frameworks.
Short overview of Mesos design
The Mesos master allocates resources on slave nodes to cluster frameworks by making resource offers. Mesos allocates resources at the level of tasks using an allocation module. A framework can accept or reject the resource offers it receives. Slave nodes also provide performance isolation between frameworks using platform isolation modules (e.g., Linux containers). To run on Mesos, a cluster framework needs to implement the Executor and Scheduler interfaces.
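The offer cycle described above can be sketched as a small simulation. This is not the Mesos API: the names ToyMaster, ToyScheduler, Offer, Task and resource_offer are made up for illustration, and the allocation policy (offer each slave's free resources to the frameworks in a fixed order) is an assumption. Isolation and executors are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Free resources on one slave node (hypothetical structure)."""
    slave: str
    cpus: float
    mem_mb: int


@dataclass(frozen=True)
class Task:
    """A unit of work a framework wants to launch (hypothetical structure)."""
    name: str
    cpus: float
    mem_mb: int


class ToyScheduler:
    """Framework-side scheduler: decides which offers to use."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)  # tasks still waiting for resources

    def resource_offer(self, offer):
        """Return the tasks to launch on this offer; an empty list rejects it."""
        launched = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_mb <= mem:
                launched.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem_mb
        return launched


class ToyMaster:
    """Master: tracks free resources per slave and offers them to frameworks."""

    def __init__(self, slaves):
        self.free = dict(slaves)  # slave name -> (cpus, mem_mb)
        self.placements = []      # (framework name, slave name, task name)

    def offer_round(self, schedulers):
        """Offer each slave's free resources to every framework in turn."""
        for slave in sorted(self.free):
            for sched in schedulers:
                cpus, mem = self.free[slave]
                if cpus <= 0 or mem <= 0:
                    break  # nothing left to offer on this slave
                for task in sched.resource_offer(Offer(slave, cpus, mem)):
                    cpus -= task.cpus
                    mem -= task.mem_mb
                    self.placements.append((sched.name, slave, task.name))
                self.free[slave] = (cpus, mem)


def demo():
    """Two frameworks sharing two slaves through one offer round."""
    master = ToyMaster({"slave1": (4.0, 8192), "slave2": (2.0, 4096)})
    hadoop = ToyScheduler(
        "hadoop", [Task("map-0", 2.0, 2048), Task("map-1", 2.0, 2048)]
    )
    mpi = ToyScheduler("mpi", [Task("rank-0", 2.0, 4096)])
    master.offer_round([hadoop, mpi])
    return master
```

Calling demo() places both Hadoop tasks on slave1 and the MPI task on slave2. The point of the sketch is that the master only decides who gets offered what, while each framework decides what to run on the resources it accepts.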
- This framework enables efficient use of cluster resources.
- The design of Mesos is flexible enough to accommodate new frameworks.
1) The performance of a job can't be guaranteed, as resource allocation may delay the whole execution.
2) Mesos provides resource isolation using external OS mechanisms (e.g., Linux containers). This may add some overhead.
3) It would have been better if Mesos handled resource allocation efficiently itself instead of leaving each framework to work it out.
4) Mesos could do much more to provide common concerns such as security across frameworks, but the paper does not discuss any details.
5) Synchronization issues may arise when data is shared across different frameworks. The paper hasn't addressed this issue.
Current state of Mesos
The Mesos project is being incubated at Apache. Mesos has gained recognition from big companies like Twitter, Facebook, and Yahoo.
- Twitter uses Apache Mesos to run analytics jobs.
- Facebook uses Mesos to run production and test Hadoop instances on the same data.
NextGen MapReduce handles process execution and resource scheduling in a generic layer (this system is called YARN), and MapReduce is being implemented on top of it.
In one sentence, the difference between YARN and Mesos is that Mesos is a meta-framework scheduler whereas YARN is an application scheduler.