Monday, August 16, 2010

HDFS and MapReduce Architecture


The master server controls all activity in the cluster, and the slave nodes do work on the master's behalf.
In a Hadoop environment, the master node server is called the NameNode and each slave node is called a DataNode.
1. NameNode server
2. DataNode server
NameNode: The NameNode manages the file system metadata and also provides control services to the Hadoop cluster. Only one NameNode process runs per Hadoop file system in a cluster.
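
To make "file system metadata" more concrete, here is a minimal Python sketch of the kind of bookkeeping a NameNode does. It assumes the metadata maps each file to a list of blocks and each block to the DataNodes that hold it. The class ToyNameNode and its methods are hypothetical names made up for this illustration; they are not Hadoop APIs.

```python
# Toy model of NameNode-style metadata. ToyNameNode, add_file and locate
# are hypothetical names for illustration, not part of Hadoop.
class ToyNameNode:
    def __init__(self):
        self.file_to_blocks = {}      # file path -> ordered list of block IDs
        self.block_to_datanodes = {}  # block ID -> list of DataNode names

    def add_file(self, path, block_locations):
        """Record a file as an ordered list of (block_id, [datanodes]) pairs."""
        self.file_to_blocks[path] = [block_id for block_id, _ in block_locations]
        for block_id, datanodes in block_locations:
            self.block_to_datanodes[block_id] = list(datanodes)

    def locate(self, path):
        """Return [(block_id, [datanodes])] so a client knows where to read."""
        return [(b, self.block_to_datanodes[b]) for b in self.file_to_blocks[path]]


namenode = ToyNameNode()
namenode.add_file(
    "/logs/app.log",
    [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn2", "dn3"])],
)
print(namenode.locate("/logs/app.log"))
```

Note that the toy NameNode holds only names and locations, never the file contents themselves.
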
Backup node: The NameNode is a single point of failure in a Hadoop file system. To reduce this risk, a Backup node copies the file system metadata from the NameNode at frequent intervals.
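
The idea of copying metadata at intervals can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the metadata is a plain dictionary, and take_checkpoint is a hypothetical helper, not a Hadoop function. It shows that anything changed after a copy is missing from the backup until the next copy is taken.

```python
import copy

# Hypothetical in-memory stand-in for the NameNode metadata (not a Hadoop structure).
namenode_metadata = {"/data/a.txt": ["blk_1", "blk_2"]}


def take_checkpoint(metadata):
    """Return an independent copy of the metadata, as a backup node might hold."""
    return copy.deepcopy(metadata)


backup_metadata = take_checkpoint(namenode_metadata)

# A file added after the checkpoint is not in the backup yet.
namenode_metadata["/data/b.txt"] = ["blk_3"]
missing_before_next_copy = "/data/b.txt" not in backup_metadata

# The next periodic copy picks it up.
backup_metadata = take_checkpoint(namenode_metadata)
present_after_next_copy = "/data/b.txt" in backup_metadata

print(missing_before_next_copy, present_after_next_copy)
```
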
DataNode: The DataNode stores and retrieves the data. Multiple DataNode processes run in a cluster, typically one per storage node.
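
As a rough sketch of storing and retrieving data across several DataNodes, the Python below splits some bytes into fixed-size blocks and spreads them over three toy nodes. The class ToyDataNode, the 8-byte block size, and the round-robin placement are all assumptions chosen for the example; real HDFS uses much larger blocks and its own placement and replication rules.

```python
# Toy DataNode: ToyDataNode, store and retrieve are hypothetical names,
# not Hadoop APIs.
class ToyDataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> raw bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def retrieve(self, block_id):
        return self.blocks[block_id]


def split_into_blocks(data, block_size):
    """Cut the data into consecutive chunks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# One toy DataNode per storage node.
datanodes = [ToyDataNode(f"dn{i}") for i in range(1, 4)]

content = b"hello hadoop cluster"
placement = []  # ordered list of (block_id, node), kept by the caller
for index, chunk in enumerate(split_into_blocks(content, 8)):
    node = datanodes[index % len(datanodes)]  # round-robin placement (assumption)
    block_id = f"blk_{index}"
    node.store(block_id, chunk)
    placement.append((block_id, node))

# Reading the file back means fetching each block from its node, in order.
restored = b"".join(node.retrieve(block_id) for block_id, node in placement)
print(restored == content)
```
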
JobTracker: The JobTracker accepts job submissions in the cluster. It also distributes and controls the jobs across the cluster.
The distributed work is handed over to the TaskTracker process on a DataNode.
TaskTracker: The TaskTracker manages the execution of the individual map and reduce tasks on its DataNode.
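
The split of work between the JobTracker and the TaskTrackers can be sketched with a small word-count job in Python. Everything here is a simplified assumption: ToyJobTracker and ToyTaskTracker are hypothetical classes, tasks are handed out round-robin, and everything runs in one process, whereas a real cluster runs these as separate processes on different machines.

```python
from collections import defaultdict


# Hypothetical stand-ins for illustration; not Hadoop classes.
class ToyTaskTracker:
    def __init__(self, name):
        self.name = name

    def run_map(self, text):
        """Map task: emit a (word, 1) pair for every word in the split."""
        return [(word, 1) for word in text.split()]

    def run_reduce(self, word, counts):
        """Reduce task: add up all the counts collected for one word."""
        return word, sum(counts)


class ToyJobTracker:
    def __init__(self, task_trackers):
        self.task_trackers = task_trackers

    def submit(self, splits):
        """Accept a job, hand its tasks to TaskTrackers, and collect the result."""
        # Map phase: one map task per input split, assigned round-robin.
        grouped = defaultdict(list)
        for i, split in enumerate(splits):
            tracker = self.task_trackers[i % len(self.task_trackers)]
            for word, count in tracker.run_map(split):
                grouped[word].append(count)

        # Reduce phase: one reduce task per word, also assigned round-robin.
        result = {}
        for i, (word, counts) in enumerate(sorted(grouped.items())):
            tracker = self.task_trackers[i % len(self.task_trackers)]
            key, total = tracker.run_reduce(word, counts)
            result[key] = total
        return result


job_tracker = ToyJobTracker([ToyTaskTracker("tt1"), ToyTaskTracker("tt2")])
word_counts = job_tracker.submit(["map reduce map", "reduce hadoop"])
print(word_counts)
```

The JobTracker in this sketch never counts words itself; it only decides which TaskTracker runs which task and gathers what they return.
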


