hadoop - YARN: How to run MapReduce jobs with many mappers compared to cluster size -


I have a 1-node Hadoop test setup with a MapReduce job which starts 96 mappers and 6 reducers. Before migrating to YARN this job was stable, but with plain YARN it started to hang 100% of the time, with the majority of mappers stuck in the 'pending' state.

The job is actually 6 sub-jobs (16 mappers + 1 reducer each), all run under a single JobControl; this configuration mirrors the production processing sequence. Is there any configuration that should be examined for jobs that are relatively big compared to the node and cluster size, as in such a small setup? In the worst case I could split the job into groups of sub-jobs, but I would rather not: there is no reason to do it in production, and I must keep the test and production sequences the same.
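For reference, the structure looks roughly like the sketch below, using the standard org.apache.hadoop.mapreduce.lib.jobcontrol API. The createSubJob() helper, the job names, and the polling loop are hypothetical placeholders, not the actual production code:

    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
    import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

    public class PipelineDriver {
        public static void main(String[] args) throws Exception {
            JobControl control = new JobControl("production-sequence");
            ControlledJob previous = null;
            for (int i = 0; i < 6; i++) {
                Job job = createSubJob(i); // each sub-job: 16 mappers + 1 reducer
                ControlledJob cj = new ControlledJob(job.getConfiguration());
                cj.setJob(job);
                if (previous != null) {
                    cj.addDependingJob(previous); // each stage waits for the previous one
                }
                control.addJob(cj);
                previous = cj;
            }
            // JobControl is a Runnable: run it in its own thread and poll for completion
            Thread runner = new Thread(control);
            runner.start();
            while (!control.allFinished()) {
                Thread.sleep(5000);
            }
            control.stop();
        }

        private static Job createSubJob(int stage) throws Exception {
            // placeholder: real code would set mapper/reducer classes and I/O paths here
            Job job = Job.getInstance();
            job.setJobName("stage-" + stage);
            job.setNumReduceTasks(1);
            return job;
        }
    }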

When I migrated to YARN the scheduler changed to the Fair Scheduler, and it is currently the only option: I run Cloudera, and Cloudera strongly recommends the Fair Scheduler. Therefore switching to the FIFO scheduler is not an option.

Are there any other options besides 'redefining the job'?

Disabling the current 'queue per user' logic (switching to a single queue) and assigning an allocation file that limits the number of concurrently running applications solved my problem:

  • yarn.scheduler.fair.user-as-default-queue was set to false.
  • The Dynamic Resource Pools configuration in Cloudera Manager was changed so that the 'default' queue does not allow more than 2 running applications; that is good enough for a 1-node test setup. The allocation file will be refined later (see the config sketch after this list).
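In plain-config terms, the two changes amount to roughly the following. Cloudera Manager generates the allocation file from the Dynamic Resource Pools page, so this hand-written equivalent is only a sketch of the effective settings:

    <!-- yarn-site.xml: send all jobs to the single 'default' queue -->
    <property>
      <name>yarn.scheduler.fair.user-as-default-queue</name>
      <value>false</value>
    </property>

    <!-- fair-scheduler.xml (the allocation file): cap concurrent applications -->
    <allocations>
      <queue name="default">
        <maxRunningApps>2</maxRunningApps>
      </queue>
    </allocations>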

So far it works as needed. Everything else, including the scheduling policy, was left at its defaults.

