
Part 2: Regular Expressions

Perl lets you search huge collections of data with its built-in support for pattern-matching. To match a pattern, you create a regular expression, and test whether a string contains the pattern with the =~ and !~ operators, like so:

	print "Match!" if $greeting =~ /Hello/;

If $greeting contains the five-character sequence Hello, this code snippet prints Match!.
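The !~ operator is the negation of =~: it is true when the pattern matches nowhere in the string. A minimal sketch of both operators side by side follows; the sample strings and the @report array are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @greetings = ("Hello, world", "Goodbye, world");

my @report;
for my $greeting (@greetings) {
    # =~ is true when the pattern matches somewhere in the string.
    push @report, "Match: $greeting"    if $greeting =~ /Hello/;
    # !~ is the negation: true when the pattern matches nowhere.
    push @report, "No match: $greeting" if $greeting !~ /Hello/;
}
print "$_\n" for @report;
```

Each string lands in exactly one of the two branches, since !~ always returns the opposite of =~ for the same pattern.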

Regular expressions constitute a little language inside Perl, with their own peculiar syntax inherited from history. From a computer scientist’s point of view, Perl’s regular expressions aren’t even regular; features like \1 and \2, which allow you to refer to portions of already-matched text inside a pattern, aren’t permitted in “traditional” regular expressions.
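To make the point about \1 concrete, here is a small sketch that uses a backreference to find accidentally doubled words. The sample text and variable names are assumptions chosen for illustration, not code from the chapters that follow.

```perl
use strict;
use warnings;

# Hypothetical sample text containing two accidentally doubled words.
my $text = "Perl is is a language with with regexes";

# (\w+) captures a word; \1 then demands the very same text again.
# The pattern has to remember what it already matched, which is what
# takes it beyond the regular expressions of formal language theory.
my @doubled;
while ($text =~ /\b(\w+)\s+\1\b/g) {
    push @doubled, $1;
}
print "Doubled word: $_\n" for @doubled;
```

The \b anchors keep the pattern from matching inside longer words, so only whole repeated words are reported.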

This section won’t give you a tour of all of Perl’s regular expression metacharacters and assertions. For that, consult the perlre online documentation, or another Perl book. Instead, the five chapters here help you understand how Perl’s regular expressions work under the hood.

Four of the five articles are written by the world’s foremost regex guru: Jeffrey Friedl, author of O’Reilly’s Mastering Regular Expressions. His often-cited article Understanding Regular Expressions, Part I explains backtracking, one of the key concepts in regular expressions and one that cleanly separates beginners from experts. In Understanding Regular Expressions, Part II, Jeffrey takes a simple problem (matching two substrings without regard to ordering) and shows you ten different solutions, explaining which is best and why, so that you can apply the reasoning to other problems. Understanding Regular Expressions, Part III dives more deeply into backtracking and the behavior of greedy quantifiers. (“Greediness” refers to the behavior of * and +, which match as much text as they can; for instance, a+ means “match one or more a characters, as many as possible.”) In Nibbling Strings, he demonstrates a powerful technique for speeding up huge regular expression matches that he’s put into practice at Yahoo!. Every time you see a Yahoo! article mentioning a publicly traded company, you see a link to the company’s stock information; that link is provided by the code Jeffrey shows you in the article.
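The greediness described above can be seen in a short sketch. The sample strings are assumptions for illustration, and the non-greedy +? quantifier is included only as a contrast; it is a standard Perl feature but is not discussed in this introduction.

```perl
use strict;
use warnings;

# a+ is greedy: it takes every consecutive "a" it can find.
my ($run) = "caaat" =~ /(a+)/;

# Hypothetical sample string for illustration.
my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ first grabs everything to the end of the string, then
# backtracks one character at a time until the final > can match.
my ($greedy) = $html =~ /<(.+)>/;

# Non-greedy: .+? takes one character, then extends only as needed.
my ($lazy) = $html =~ /<(.+?)>/;

print "run:    $run\n";
print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy capture runs from the first < to the last >, while the non-greedy one stops at the first > it can reach.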

The fifth article, How Regexes Work, is a more theoretical piece by Mark Jason Dominus showing you how to build your own regular expression engine. There’s little reason to do that when you have Perl, of course, but his explanation of the processes involved will help you understand why Perl’s regular expression engine behaves the way it does. The theoretically minded may wish to read Mark Jason’s chapter immediately after Understanding Regular Expressions, Part I. The practically minded should read the chapters in the order presented.



  
