
Reading and Writing Data

Clients can read and write to HDFS using various tools and APIs (see Access and Integration), but all of them follow the same process. The client always, at some level, uses a Hadoop library that is aware of HDFS and its semantics. This library encapsulates most of the gory details related to communicating with the namenode and datanodes when necessary, as well as dealing with the numerous failure cases that can occur when working with a distributed filesystem.

The Read Path

First, let’s walk through the logic of performing an HDFS read operation. For this, we’ll assume there’s a file /user/esammer/foo.txt already in HDFS. In addition to using Hadoop’s client library (usually a Java JAR file), each client must also have a copy of the cluster configuration data that specifies the location of the namenode (see Chapter 5).

As shown in Figure 2-2, the client begins by contacting the namenode, indicating which file it would like to read. The client identity is first validated, either by trusting the client and allowing it to specify a username or by using a strong authentication mechanism such as Kerberos (see Chapter 6), and then checked against the owner and permissions of the file.

If the file exists and the user has access to it, the namenode responds to the client with the first block ID and the list of datanodes on which a copy of the block can be found, sorted by their distance to the client. Distance to the client is measured according to Hadoop’s rack topology: configuration data that indicates which hosts are located in which racks. (More on rack topology configuration is available in Rack Topology.)
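To make the replica ordering concrete, here is a minimal Java sketch of sorting datanodes by their distance to the client. It is not Hadoop's own code: the class name ReplicaOrderingSketch, its methods, and the /rack/host location strings are hypothetical, and it assumes distance is counted as the number of hops from each node up to their closest common ancestor in the topology tree (so 0 for the same host, 2 for the same rack, 4 for different racks).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative model only: these names are hypothetical and are not part of
// Hadoop's client library or namenode code.
class ReplicaOrderingSketch {

    // Assumed distance rule: count the hops from each node up to the closest
    // common ancestor. Locations are written as "/rack/host" paths.
    static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    // Returns a copy of the replica list with the closest datanodes first.
    // The sort is stable, so equally distant replicas keep their input order.
    static List<String> sortByDistance(String client, List<String> replicas) {
        List<String> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingInt(r -> distance(client, r)));
        return sorted;
    }

    public static void main(String[] args) {
        String client = "/rack1/host1";
        List<String> replicas =
                Arrays.asList("/rack2/host7", "/rack1/host3", "/rack1/host1");
        // prints [/rack1/host1, /rack1/host3, /rack2/host7]
        System.out.println(sortByDistance(client, replicas));
    }
}
```

In this model a client that is itself a datanode holding a replica sees its own host first, then hosts in the same rack, then hosts in other racks.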


  
