
Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.


7.3. Individual Tokens

Now that you know the composition of the various types of tokens, let's see how to use HTML::TokeParser to write useful programs. Many problems are simple enough to be solved by examining one token at a time. Programs that solve these problems consist of a loop over all the tokens, with an if statement in the body of the loop that identifies the interesting parts of the HTML:

use HTML::TokeParser;
my $stream = HTML::TokeParser->new($filename)
  || die "Couldn't read HTML file $filename: $!";
# For a string: HTML::TokeParser->new( \$string_of_html );

while (my $token = $stream->get_token) {
   if ($token->[0] eq 'T') { # text
     # process the text in $token->[1]

   } elsif ($token->[0] eq 'S') { # start-tag
     my($tagname, $attr) = @$token[1,2];
     # consider this start-tag...

   } elsif ($token->[0] eq 'E') { # end-tag
     my $tagname = $token->[1];
     # consider this end-tag
   }

   # ignoring comments, declarations, and PIs
}
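
As a concrete illustration of the one-token-at-a-time pattern, the following minimal sketch collects the href value of every a start-tag. The sample HTML string and the @links array are assumptions made up for this example; only new and get_token come from HTML::TokeParser itself.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input, parsed from a string reference
my $html = '<p>See <a href="http://example.com/">Example</a> '
         . 'and <a name="x">an anchor with no link</a>.</p>';

my $stream = HTML::TokeParser->new(\$html)
  or die "Couldn't parse HTML string";

my @links;
while (my $token = $stream->get_token) {
  next unless $token->[0] eq 'S';        # skip everything but start-tags
  my ($tagname, $attr) = @$token[1, 2];  # tag name and attribute hash
  next unless $tagname eq 'a' and defined $attr->{href};
  push @links, $attr->{href};            # remember this link target
}

print "$_\n" for @links;
```

Because the loop cares only about start-tags, the text and end-tag branches of the skeleton are simply skipped with next. The second a element is ignored because it has no href attribute.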


  

You are currently reading a PREVIEW of this book.

                                                                                                                    
