Pig runs as a client-side application. Even if you want to run Pig on a Hadoop cluster, there is nothing extra to install on the cluster: Pig launches jobs and interacts with HDFS (or other Hadoop filesystems) from your workstation.
Installation is straightforward. Java 6 is a prerequisite (and on Windows, you will need Cygwin). Download a stable release from http://pig.apache.org/releases.html, and unpack the tarball in a suitable place on your workstation:
% tar xzf pig-x.y.z.tar.gz
It’s convenient to add Pig’s binary directory to your command-line path. For example:
% export PIG_INSTALL=/home/tom/pig-x.y.z
% export PATH=$PATH:$PIG_INSTALL/bin
You also need to set the JAVA_HOME environment variable to point to a suitable Java installation.
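As a quick sanity check of the settings above, a small shell function along the following lines can report what is still missing. This is only a sketch: the function name check_pig_env is hypothetical and not part of Pig, and it assumes a POSIX-style shell with the java executable located at $JAVA_HOME/bin/java.

```shell
# Hypothetical helper (not part of Pig): checks the environment described above.
# Assumes the Java executable lives at $JAVA_HOME/bin/java.
check_pig_env() {
  status=0
  # JAVA_HOME must be set and must contain an executable bin/java.
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME is not set to a Java installation"
    status=1
  fi
  # The pig launcher must be reachable through PATH.
  if ! command -v pig >/dev/null 2>&1; then
    echo "pig is not on PATH"
    status=1
  fi
  return $status
}
```

Running check_pig_env after the export commands prints nothing and returns 0 when both settings are in place; otherwise it prints one line per problem found and returns 1.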