When running our MapReduce program on Hadoop, we observed that the reducers had a large shuffle time. We concluded that this happened because the number of map containers required exceeded the containers available on the cluster. This caused the reducers to 'block' in the shuffle phase until the mappers were done. Reducers that started after the mappers had completed did not show this behaviour. The effect can be seen in the first image below.
We disabled slow start by setting mapreduce.job.reduce.slowstart.completedmaps to 1. This resulted in a lower shuffle time, confirming our hypothesis. The result can be seen in the second image below.
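To make the hypothesis concrete, here is a rough toy model of it in Python. It is only a sketch under simplifying assumptions: every map takes the same time, maps run in full waves, and the containers held by the reducers themselves are ignored. The function name and all numbers are made up for illustration and are not taken from our job.

```python
import math

def shuffle_wait(num_maps, map_slots, map_secs, slowstart):
    """Seconds a reducer idles in shuffle until the last map finishes (toy model)."""
    # Number of maps that must complete before reducers are launched.
    needed = math.ceil(slowstart * num_maps)
    # Maps run in waves of at most `map_slots` tasks; find the wave in
    # which the `needed`-th map completes.
    start_wave = math.ceil(needed / map_slots)
    total_waves = math.ceil(num_maps / map_slots)
    reducer_start = start_wave * map_secs
    maps_done = total_waves * map_secs
    return maps_done - reducer_start

# Hypothetical job: 100 maps, 10 map containers, 60 s per map.
print(shuffle_wait(100, 10, 60, 0.05))  # slow start at Hadoop's default of 0.05
print(shuffle_wait(100, 10, 60, 1.0))   # slow start disabled
```

In this model the default setting leaves each early reducer waiting 540 seconds inside the shuffle phase, while a value of 1 brings the wait down to 0, which matches the direction of the change we saw.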
One would expect to see a lower CPU time for the second job. Strangely, this is not the case: the CPU time is the same. One reason could be that some things are not counted in the shuffle, merge, and reduce timers.
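One thing worth ruling out is that elapsed time and CPU time simply measure different things. The sketch below is plain Python, not Hadoop code, and the helper names are hypothetical. It assumes that a reducer blocked in shuffle behaves like a sleeping process, which we have not verified for our job.

```python
import time

def measure(task):
    """Return (elapsed_seconds, cpu_seconds) for one call of `task`."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process only
    task()
    elapsed = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return elapsed, cpu

def blocked_fetch():
    # Stand-in for a reducer waiting on map output: idle, no CPU used.
    time.sleep(0.2)
    # Stand-in for the small amount of real work done afterwards.
    sum(range(10_000))

elapsed, cpu = measure(blocked_fetch)
print(f"elapsed={elapsed:.3f}s cpu={cpu:.3f}s")
```

If the shuffle timer is a wall-clock measurement, waiting time would inflate it without adding to the CPU counter, so removing the wait would shorten the shuffle time and leave the CPU time unchanged. This is a guess at one possible contribution, not a confirmed explanation.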
What causes the mismatch between the total measured map and reduce (elapsed) times and the total CPU milliseconds?
Additional question: what causes the mappers in the second job to behave differently from those in the first job?
Thanks for any insights.
