It is time to turn our attention to how you can extend Pig. So far we have looked at the operators and functions Pig provides. But Pig also makes it easy for you to add your own processing logic via User Defined Functions (UDFs). UDFs can be written in Java and, starting with version 0.8, in Python.[24] This chapter will walk through how you can build evaluation functions, which are UDFs that operate on single elements of data or collections of data. It will also cover how to write filter functions, which are UDFs that can be used as part of filter statements.
UDFs are powerful tools, and their interfaces are correspondingly somewhat complex. In designing Pig, our goal was to make easy things easy and hard things possible. So the simplest UDFs can be implemented in a single method, but you will have to implement a few more methods to take advantage of more advanced features. We will cover both cases in this chapter.
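To give a rough sense of how small the simple case can be, here is a minimal sketch of a Python UDF that uppercases a string. The function name `to_upper` and the schema string are hypothetical, chosen only for illustration. The sketch assumes that Pig supplies the `outputSchema` decorator when the script is registered; the fallback definition is a stand-in, not Pig's own code, so that the file can also be run and tested as plain Python.

```python
# Assumption: when this script is registered with Pig, an outputSchema
# decorator is already available. The fallback below is a stand-in so the
# file also runs outside Pig; it is not Pig's implementation.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            # Record the declared schema on the function for inspection.
            func.outputSchema = schema
            return func
        return wrap


# Hypothetical UDF: the whole function is a single small body.
@outputSchema("word:chararray")
def to_upper(word):
    # Pass nulls through unchanged instead of raising an error.
    if word is None:
        return None
    return word.upper()
```

Under Pig, a script like this would typically be registered with a statement along the lines of `register 'udfs.py' using jython as myudfs;` and then called as `myudfs.to_upper(name)`. The file name and the `myudfs` namespace are hypothetical.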