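The following example shows how HTML::Parser can be subclassed, and its methods overridden, to produce meaningful output. It simply echoes the original HTML file: text, comments, start tags, and end tags are printed exactly as they were read. Markup for which no method is overridden here, such as a DOCTYPE declaration, is not echoed.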
#!/usr/bin/perl -w

use strict;

# Define the subclass
package IdentityParse;
use base "HTML::Parser";

# Called for each chunk of plain text
sub text {
    my ($self, $text) = @_;
    # Just print out the original text
    print $text;
}

# Called for each comment; $comment excludes the <!-- --> markers
sub comment {
    my ($self, $comment) = @_;
    # Print out original text with comment marker
    print "<!--", $comment, "-->";
}

# Called for each start tag; $origtext is the tag as it appeared in the file
sub start {
    my ($self, $tag, $attr, $attrseq, $origtext) = @_;
    # Print out original text
    print $origtext;
}

# Called for each end tag
sub end {
    my ($self, $tag, $origtext) = @_;
    # Print out original text
    print $origtext;
}

# Back in the main package: create the parser and process the file
package main;

my $p = IdentityParse->new;
$p->parse_file("index.html")
    or die "Cannot open index.html: $!";
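As a rough sketch of how the same overrides can be exercised without a file on disk, the variant below collects the output in the parser object and feeds it an inline string through parse and eof. The class name IdentityCollect, the out hash key, and the sample HTML string are assumptions made for illustration; they are not part of the original example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of IdentityParse: the callbacks append to
# $self->{out} instead of printing, so the result can be inspected.
package IdentityCollect;
use base "HTML::Parser";

sub text {
    my ($self, $text) = @_;
    $self->{out} .= $text;
}

sub comment {
    my ($self, $comment) = @_;
    # The comment arrives without its markers, so add them back
    $self->{out} .= "<!--" . $comment . "-->";
}

sub start {
    my ($self, $tag, $attr, $attrseq, $origtext) = @_;
    $self->{out} .= $origtext;
}

sub end {
    my ($self, $tag, $origtext) = @_;
    $self->{out} .= $origtext;
}

package main;

# Illustrative input only; any HTML fragment could be used here
my $html = '<p class="x">Hello <!-- note --><b>world</b></p>';

my $p = IdentityCollect->new;
$p->{out} = "";
$p->parse($html);   # feed the document (may be called repeatedly with chunks)
$p->eof;            # signal end of input so buffered text is flushed

print $p->{out}, "\n";
```

For input like this, which contains only text, tags, and comments, the collected string should match the input. Storing the result in the object keeps the subclass free of global state; the key name out was chosen only to stay clear of the parser's own internal hash keys.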