Tying It Together

With an ever-growing ecosystem of projects forming around Hadoop, each with its own requirements, it’s hard to understand how these pieces come together to form a single data processing platform. Beyond simply making these tools function, administrators are responsible for ensuring that a single identification, authentication, and authorization scheme is applied consistently and in accordance with data handling policies. This is a significant challenge, to use a well-worn, albeit applicable, cliché. However, there are a few things one can do to reduce the pain of building a secure, shared Hadoop platform.

To secure or not to secure

Do your homework and make an informed decision about whether you need Hadoop security. It is not as simple as setting hadoop.security.authentication to kerberos and getting on with life. Kerberizing a cluster has significant repercussions: every client of the cluster must be able to handle authentication properly. That is easy to say and difficult to make a reality. Can you trust clients to identify themselves truthfully? If not, you have your work cut out for you. Create a map of all access points to the cluster, understand which must support multiple users through a single process, and create a strategy to handle authentication. There’s no universal answer, although there are some established patterns you can use.
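
One way to start the map of access points is a small inventory that records, for each entry, the two questions raised above: does a single process act for multiple users, and can the client be trusted to identify itself truthfully? The sketch below is only an illustration under stated assumptions. The AccessPoint type, the needs_strategy helper, and the three inventory entries are hypothetical names invented for this example; none of them come from Hadoop or from a real cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPoint:
    """One way into the cluster (hypothetical record, not a Hadoop type)."""
    name: str                  # label for the client or gateway
    multi_user_process: bool   # True if one process acts for many end users
    trusted_identity: bool     # True if the client identifies itself truthfully


def needs_strategy(points):
    """Return the access points that need an explicit authentication plan.

    An entry qualifies if it serves multiple users through a single
    process or if its claimed identity cannot be trusted.
    """
    return [p for p in points if p.multi_user_process or not p.trusted_identity]


# Hypothetical inventory; replace with the real access points of your cluster.
INVENTORY = [
    AccessPoint("analyst workstation CLI", multi_user_process=False, trusted_identity=False),
    AccessPoint("web query gateway", multi_user_process=True, trusted_identity=True),
    AccessPoint("nightly batch job", multi_user_process=False, trusted_identity=True),
]

if __name__ == "__main__":
    for point in needs_strategy(INVENTORY):
        reason = "shared process" if point.multi_user_process else "untrusted identity"
        print(f"{point.name}: plan authentication ({reason})")
```

The inventory does not choose a strategy for you. It only makes visible which access points cannot be left to default behavior, which is the input you need before picking a pattern for each one.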


  
