Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.
When setting up a Hadoop cluster, you’ll need to designate one specific node as the master node. As shown in figure 2.3, this server will typically host the NameNode and JobTracker daemons. It’ll also serve as the base station contacting and activating the DataNode and TaskTracker daemons on all of the slave nodes. As such, we need to define a means for the master node to remotely access every node in your cluster.
Hadoop uses passphraseless SSH for this purpose. SSH utilizes standard public key cryptography to create a pair of keys for user verification—one public, one private. The public key is stored locally on every node in the cluster, and the master node sends the private key when attempting to access a remote machine. With both pieces of information, the target machine can validate the login attempt.