Memory Consumption of Hadoop NameNode

Each file, directory, or block occupies about 150 bytes in NameNode memory. A file also takes up at least one block, so each file costs about 300 bytes in effect. Assuming 3x replication, and counting that as tripling the cost, each file takes up about 900 bytes. On that basis, a cluster whose NameNode has 32 GB of RAM can support at most about 38 million files (32 GB / 900 bytes), assuming the NameNode is the bottleneck.

In practice, however, the number will be considerably lower, because not all of the 32 GB will be available to the NameNode for keeping this mapping. You can raise the limit by allocating more heap space to the NameNode on that machine.
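The capacity estimate above can be sketched as a small calculation. This is only an illustration of the arithmetic in this post: the function name `max_files` and the `usable_fraction` parameter are hypothetical, and the fraction of RAM actually available for metadata is an assumption you would need to measure on your own cluster.

```python
# Rough upper bound on the number of files a NameNode can track.
# Assumes ~150 bytes per file object and per block object, and
# (as in this post) multiplies the cost by the replication factor.
BYTES_PER_OBJECT = 150


def max_files(ram_bytes, replication=3, blocks_per_file=1, usable_fraction=1.0):
    """Estimate the maximum file count for a given amount of NameNode RAM.

    usable_fraction is a hypothetical knob for the share of RAM that is
    really available for metadata; 1.0 reproduces the ideal case.
    """
    bytes_per_file = BYTES_PER_OBJECT * (1 + blocks_per_file) * replication
    return int(ram_bytes * usable_fraction) // bytes_per_file


if __name__ == "__main__":
    ram = 32 * 1024**3  # 32 GB
    print(max_files(ram))                       # ideal case, about 38 million
    print(max_files(ram, usable_fraction=0.5))  # if only half the RAM is usable
```

Lowering `usable_fraction` shows how quickly the ceiling drops once the rest of the JVM and operating system take their share.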

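Replication also affects this, though to a lesser degree: each additional replica adds about 16 bytes to the memory requirement. The tripling for 3x replication used in the other estimates in this post is therefore a conservative upper bound.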
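As a rough sketch of this finer-grained view, the per-file cost can be computed from the 150-byte and 16-byte figures quoted here. The function name is hypothetical, and charging the 16 bytes once per block for each extra replica is an assumption about how the figure applies.

```python
# Per-file NameNode memory when each additional replica costs ~16 bytes.
BYTES_PER_OBJECT = 150        # one file object or one block object
BYTES_PER_EXTRA_REPLICA = 16  # assumed to apply per block, per extra replica


def bytes_per_file(blocks_per_file=1, replication=3):
    """Approximate NameNode bytes needed for one file."""
    per_block = BYTES_PER_OBJECT + BYTES_PER_EXTRA_REPLICA * (replication - 1)
    return BYTES_PER_OBJECT + blocks_per_file * per_block


if __name__ == "__main__":
    print(bytes_per_file(replication=1))  # single-block file, one replica
    print(bytes_per_file(replication=3))  # single-block file, three replicas
```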

(file metadata = 150 bytes) + (block metadata for the file = 150 bytes) = 300 bytes per file. So 1 million files, each with 1 block, will consume 300 * 1,000,000 = 300,000,000 bytes = 300 MB at a replication factor of 1. With a replication factor of 3, this comes to 900 MB.
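The same arithmetic can be written as a short function for trying other file counts. This is a minimal sketch of the calculation above, not a sizing tool: the function name is hypothetical, and it follows this post's simplification of multiplying the cost by the replication factor.

```python
# NameNode memory estimate following the arithmetic in this post:
# 150 bytes per file object + 150 bytes per block object,
# multiplied by the replication factor.
def namenode_memory_bytes(num_files, blocks_per_file=1, replication=1):
    """Return the estimated NameNode memory in bytes."""
    objects_per_file = 1 + blocks_per_file  # the file itself plus its blocks
    return num_files * objects_per_file * 150 * replication


if __name__ == "__main__":
    for r in (1, 3):
        mb = namenode_memory_bytes(1_000_000, replication=r) / 10**6
        print(f"replication={r}: {mb:.0f} MB")
```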

So, as a rule of thumb, every 1 GB of NameNode memory lets you store about 1 million files.

There are several technical limits on the NameNode (NN), and hitting any of them will limit your scalability.

  1. Memory – The NameNode consumes about 150 bytes per block.
  2. IO – The NN performs one IO operation for each change to the filesystem (such as creating or deleting a block), so your local IO must be able to keep up. It is harder to estimate how much you need. Given that memory already limits the number of blocks, you will not hit this limit unless your cluster is very big.
  3. CPU – The NameNode carries a considerable load keeping track of the health of all blocks on all DataNodes. Each DataNode periodically reports the state of all its blocks. Again, unless the cluster is very big, this should not be a problem.

 


Naveen P.N

12+ years of experience in IT, with extensive experience executing complex projects using Java, Microservices, Big Data and Cloud Platforms. I founded NPN Training Pvt Ltd, an India-based startup, to provide high-quality training for IT professionals. I have trained more than 3000 IT professionals and helped them succeed in their careers across different technologies. I am very passionate about Technology and Training. I have spent 12 years at Siemens, Yahoo, Amazon and Cisco, developing and managing technology.