Writing Apache Modules with Perl and C
By:   Lincoln Stein and Doug MacEachern
Published:   O'Reilly & Associates, Inc.  - March 1999

Copyright © 1999 by O'Reilly & Associates, Inc.


 


   Show Contents   Previous Page   Next Page

Chapter 4 - Content Handlers / Apache::Registry
A Useful Apache::Registry Application

All the Apache::Registry examples that we've seen so far have been short and, frankly, silly. Now let's look at an example of a real-world script that actually does something useful. The guestbook script (Example 4-15), as its name implies, manages a typical site guestbook, where visitors can enter their names, email addresses, and comments. It works well as both a standalone CGI script and a mod_perl Apache::Registry script, automatically detecting when it is running under the Apache Perl API in order to take advantage of mod_perl's features. In addition to showing you how to generate a series of fill-out forms to handle a moderately complex user interaction, this script demonstrates how to read and update a file without the risk of several instances of the script trying to do so simultaneously.

Unlike some other guestbook programs, this one doesn't append users' names to a growing HTML document. Instead, it maintains a flat file in which each user's entry is represented as a single line in the file. Tabs separate the five fields, which are the date of the entry, the user's name, the user's email address, the user's location (e.g., city of residence), and comments. Nonalphanumeric characters are URL-escaped to prevent the format from getting messed up if the user enters newlines or tabs in the fields, giving records that look like:

05/07/98  JR  jr_ewing%40dallas.com  Dallas,%20TX  Like%20the%20hat

When the script is first called, it presents the user with the option of signing the guestbook file or looking at previous entries (Figure 4-6).

Figure 4-6. The Apache::Registry guestbook script generates its own fill-out form.

If the user presses the button labeled "Sign Guestbook," a confirmation page appears, which echoes the entry and prompts the user to edit or confirm it (Figure 4-7).

Figure 4-7. The confirmation page generated by guestbook

Pressing the "Change Entry" button takes the user back to the previous page with the fields filled in and waiting for the user's changes. Pressing "Confirm Entry" appends the user's entry to the guestbook file and displays the whole file (Figure 4-8).

Figure 4-8. The listing of previous guestbook entries generated by guestbook

Turning to the source code, the script begins by importing functions from a variety of modules, including CGI.pm, IO::File, Fcntl, and POSIX:

use strict;
use CGI qw(:standard :html3 escape unescape escapeHTML);
use IO::File ();
use Fcntl qw(:flock);
use POSIX qw(strftime);
use vars qw(@FIELDS %REQUIRED %BIG $GUESTBOOKFILE);
@FIELDS = qw(name e-mail location comments);
%REQUIRED = ('name' => 1, 'e-mail' => 1);
%BIG = ('comments' => 1);

The script then defines some constants. @FIELDS is an array of all the fields known to the guestbook. By changing the contents of this array you can generate different fill-out forms. %REQUIRED is a hash that designates certain fields as required, in this case name and e-mail. The script will refuse to add an entry to the guestbook until these fields are filled out (however, no error checking on the contents of the fields is done). %BIG is a hash containing the names of fields that are displayed as large text areas, in this case comments. Other fields are displayed as one-line text entries.

if ($ENV{MOD_PERL}) {
   $GUESTBOOKFILE = Apache->request->dir_config('GuestbookFile');
}
$GUESTBOOKFILE ||= "/usr/tmp/guestbookfile.txt";

Next the script checks if it is running under mod_perl by checking for the MOD_PERL environment variable. If the script finds that it is running under mod_perl, it fetches the Apache request object and queries the object for a per-directory configuration variable named GuestbookFile. This contains the physical pathname of the file where the guestbook entries are stored. If the script is a standalone CGI script, or if no GuestbookFile configuration variable is defined, the script defaults to a hardcoded file path. In the case of Apache::Registry scripts, the PerlSetVar directive used to set per-directory configuration variables must be located in a .htaccess file in the same directory as the script.

print header,
   start_html(-title => 'Guestbook', -bgcolor => 'silver'),
   h1("Guestbook");

The script now begins to generate the document by calling shortcut functions defined in the CGI module to generate the HTTP header, the HTML header and title, and a level 1 heading of "Guestbook."

 CASE: {
    $_ = param('action');
    /^sign/i and do    { sign_guestbook(); last CASE; };
    /^confirm/i and do { write_guestbook() and view_guestbook(); last CASE; };
    /^view/i and do    { view_guestbook(1); last CASE; };
    generate_form();
}

We now enter the variable part of the script. Depending on what phase of the transaction the user is in, we either want to prompt the user to fill out the guestbook form, confirm an entered entry, or view the entire guestbook. We distinguish between the phases by looking at the contents of a script parameter named action. If action equals sign, we know that the user has just completed the fill-out form and pressed the "Sign Guestbook" button, so we jump to the routine responsible for this part of the transaction. Similarly, we look for action values of confirm and view, and jump to the appropriate routines for these actions. If action is missing, or if it has some value we don't expect, we take the default action of generating the fill-out form.

print end_html;
exit 0;

Having done its work, the script prints out the </HTML> tag and exits.

sub generate_form {
   my @rows;
   for my $field (@FIELDS) {
      my $title = "Your $field";
      $title .= " (optional)" if !$REQUIRED{$field};
      my $element = $BIG{$field} ?
          textarea(-name => $field,
                   -rows => 4,
                   -columns => 50,
                   -wrap => 1)
              : textfield(-name => $field, -size => 50);
      push @rows, th($title) . td($element);
   }
   print start_form,
   table(TR{-align => 'LEFT'}, \@rows),
   br,
   submit(-name => 'action', -value => 'View Guestbook'),
   submit(-name => 'action', -value => 'Sign Guestbook'),
   end_form;
}

The subroutine responsible for generating the form is named, appropriately enough, generate_form(). It iterates over @FIELDS and dynamically generates a text label and a form element for each field, modifying the format somewhat based on whether the field is marked optional or big. Each label/field pair is pushed onto a list named @rows. When the loop is finished, @rows is turned into a nicely formatted table using CGI.pm's table-generation shortcuts. The "View Guestbook" and "Sign Guestbook" buttons are added to the form, and the routine finishes.

sub sign_guestbook {
   my @missing = check_missing(@FIELDS);
   if (@missing) {
      print_warning(@missing);
      generate_form();
      return;
   }

sign_guestbook() has a slightly more complex job. Its first task is to check the submitted form for missing required fields by calling the internal subroutine check_missing(). If any are missing, it displays the missing fields by calling another internal subroutine, print_warning(), and then invokes generate_form() to redisplay the form with its current values. No particular hocus-pocus is required to display the partially completed form correctly; this is just one of the beneficial side effects of CGI.pm's "sticky forms" feature.

    my @rows;
   foreach (@FIELDS) {
      push @rows, TR(th({-align=>'LEFT'},$_),
                     td(escapeHTML(param($_))));
   }
   print "Here is your guestbook entry.  Press ",
   em('Confirm')," to save it, or ",em('Change'),
   " to change it.",
   hr,
   table(@rows),
   hr;

If all the required fields are filled in, sign_guestbook() generates an HTML table to display the user's entries. The technique for generating the form is similar to that used in the previous subroutine, except that no special cases are needed for different types of fields. We do, however, have to be careful to call escapeHTML() (a function imported from CGI.pm) in order to prevent HTML entities and other funny characters that the user might have entered from messing up the page.

    print start_form;
   foreach (@FIELDS) {
      print hidden(-name => $_);
   }
   print submit(-name => 'action',
               -value => 'Change Entry'),
   submit(-name => 'action',
         -value => 'Confirm Entry'),
   end_form;
}

We end the routine by creating a short fill-out form. This form contains the contents of the user's guestbook entry stashed into a series of hidden fields, and push buttons labeled "Change Entry" and "Confirm Entry." We hide the guestbook entry information in this way in order to carry the information forward to the next set of pages.

sub check_missing {
   my %p;
   for (@_) { ++$p{$_} if param($_) }
   return grep !$p{$_}, keys %REQUIRED;
}
sub print_warning {
   print font({-color => 'red'},
             'Please fill in the following fields: ',
             em(join ', ', @_),
             '.');
}

The check_missing() and print_warning() subroutines are short and sweet. The first routine uses the Perl grep() function to check the list of provided fields against the list of required fields and returns a list of the truants, if any. The second routine accepts a list of missing fields and turns it into a warning of the form, "Please fill in the following fields: e-mail." For emphasis, the message is rendered in a red font (under browsers that understand the <FONT> extension).

The write_guestbook() and view_guestbook() subroutines are the most complex of the bunch. The main complication is that, on an active site, there's a pretty good chance that a second instance of the script may be invoked by another user before the first instance has completed opening and updating the guestbook file. If the writes overlap, the file could be corrupted and a guestbook entry lost or scrambled. For this reason, it's important for the script to lock the file before working with it.

POSIX-compliant systems (which include both Unix and Windows systems) offer a simple form of advisory file locking through the flock() system call. When a process opens a file and flock()s it, no other process can flock() it until the first process either closes the file or manually relinquishes the lock. There are actually two types of lock. A "shared" lock can be held by many processes simultaneously. An "exclusive" lock can only be held by one process at a time and prevents any other program from locking the file. Typically, a program that wants to read from a file will obtain a shared lock, while a program that wants to write to the file asks the system for an exclusive lock. A shared lock allows multiple programs to read from a file without worrying that some other process will change the file while they are still reading it. A program that wants to write to a file will call flock() to obtain an exclusive lock; the call will then block until all other processes have released their locks. After an exclusive lock is granted, no other program can lock the file until the writing process has finished its work and released the lock.

It's important to realize that the flock() locking mechanism is advisory. Nothing prevents a program from ignoring the flock() call and reading from or writing to a file without seeking to obtain a lock first. However, as long as only the programs you've written yourself attempt to access the file and you're always careful to call flock() before working with it, the system works just fine.

sub lock {
   my $path = shift;
   my $for_writing = shift;
    my ($lock_type, $path_name, $description);
   if ($for_writing) {
      $lock_type = LOCK_EX;
      $path_name = ">>$path";
      $description = 'writing';
   }
   else {
      $lock_type = LOCK_SH;
      $path_name = $path;
      $description = 'reading';
   }
    my $fh = IO::File->new($path_name) or
      warn "Couldn't open $path for $description: $!", return;
# now try to lock it
   my $success;
   my $tries = 0;
   while ($tries++ < 10) {
      last if $success = flock($fh, $lock_type|LOCK_NB);
      print p("Waiting for $description lock on guestbook file...");
      sleep(1);               # wait a second
   }
   unless ($success) {
      warn("Couldn't get lock for $description");
      return;
   }
   return $fh;
}

To make life a little simpler, the guestbook script defines a utility function named lock() that takes care of opening and locking the guestbook file (you'll find the definition at the bottom of the source listing). lock() takes two arguments: the name of the file to open and a flag indicating whether the file should be opened for writing. If the write flag is true, the function opens the file in append mode and then attempts to obtain an exclusive lock. Otherwise, it opens the file read only and tries to obtain a shared lock. If successful, the opened filehandle is returned to the caller.

The flock() function is used to obtain both types of lock. The first argument is the opened filehandle; the second is a constant indicating the type of lock to obtain. The constants for exclusive and shared locks are LOCK_EX and LOCK_SH, respectively. Both constants are imported from the Fcntl module using the :flock tag. We combine these constants with the LOCK_NB (nonblocking) constant, also obtained from Fcntl, in order to tell flock() to return if a lock cannot be obtained immediately. Otherwise, flock() will block indefinitely until the file is available. In order to avoid a long wait in which the script appears to be hung, we call flock() in a polling loop. If a lock cannot immediately be obtained, we print a warning message to the browser screen and sleep for 1 second. After 10 consecutive failed tries, we give up and exit the script. If the lock is successful, we return the file-handle.

sub write_guestbook {
   my $fh = lock($GUESTBOOKFILE, 1);
   unless ($fh) {
      print strong('An error occurred: unable to open guestbook file.'),p();
      Delete('action');
      print a({-href => self_url}, 'Try again');
      return;
   }
   seek ($fh,0,2);  # paranoia: seek to end of file
   my $date = strftime('%D',localtime);
   print $fh join("\t", $date, map {escape(param($_))} (@FIELDS)),"\n";
   print "Thank you, ", param('name'),", for signing the guestbook.\n";
   $fh->close;
   1;
}

To write a new entry into the guestbook, the write_guestbook() function calls lock() with the path to the guestbook file and a flag indicating we want write access. If the call fails, we display an appropriate error message and return. Otherwise, we seek to the end of the file, just in case someone else wrote to the file while we were waiting for the lock. We then join together the current date (obtained from the POSIX strftime() function) with the current values of the guestbook fields and write them out to the guestbook filehandle. To avoid the possibility of the user messing up our tab-delimited field scheme by entering tabs or newlines in the fill-out form, we're careful to escape the fields before writing them to the file. To do this, we use the map operator to pass the fields through CGI.pm's escape() function. This function is ordinarily used to make text safe for use in URIs, but it works just as well here.

After writing to the file, we're careful to close the filehandle. This releases the lock on the file and gives other processes access to it.

sub view_guestbook {
   my $show_sign_button = shift;
   print start_form,
   submit(-name => 'Sign Guestbook'),
   end_form
      if $show_sign_button;
   my $fh = lock($GUESTBOOKFILE, 0);
    my @rows;
   unless ($fh) {
      print strong('An error occurred: unable to open guestbook file.'),br;
      Delete('action');
      print a({-href => self_url},'Try again');
      return;
   }

The view_guestbook() subroutine looks a lot like the one we just looked at but in reverse. It starts by creating a tiny fill-out form containing a single button labeled "Sign Guestbook." This button is only displayed when someone views the guestbook without signing it first and is controlled by the $show_sign_button flag. Next we obtain a read-only filehandle on the guestbook file by calling lock() with a false second argument. If lock() returns an undefined result, we print an error message and exit. Otherwise, we read the contents of the guestbook file line by line and split out the fields.

    while (<$fh>) {
      chomp;
      my @data = map {escapeHTML($_)} map {unescape($_)} split("\t");
      unshift @rows, td(\@data);
   }
   unshift @rows, th(['Date',@FIELDS]);
   print p(
          table({-border => ''},
                caption(strong('Previous Guests')),
                TR(\@rows)));
   $fh->close;
   print a({-href => '/'}, 'Home');
   1;
}

The fields are then processed through map() twice: once to unescape the URL escape characters using the CGI.pm unescape() function and once again to make them safe to display on an HTML page using CGI.pm's escapeHTML() function. The second round of escaping is to avoid problems with values that contain the <, >, and & symbols. The processed lines are turned into HTML table cells, and unshifted onto a list named @rows. The purpose of the unshift is to reverse the order of the lines, so that more recent guestbook entries appear at the top of the list. We add the headings for the table and turn the whole thing into an HTML table using the appropriate CGI.pm shortcuts. We close the filehandle and exit.

If we were not interested in running this script under standard CGI, we could increase performance slightly and reduce memory consumption substantially by replacing a few functions with their Apache:: equivalents:

IO::File        --> Apache::File
CGI::escape     --> Apache::Util::escape_uri
CGI::unescape   --> Apache::Util::unescape_uri
CGI::escapeHTML --> Apache::Util::escape_html
POSIX::strftime --> Apache::Util::ht_time

See the reference listings in Chapter 9 for the proper syntax for these replacements. You'll also find a version of the guestbook script that uses these lightweight replacements on this book's companion web site, http://www.modperl.com.

Example 4-15. A Guestbook Script

#!/usr/local/bin/perl
# guestbook.cgi
use strict;
use CGI qw(:standard :html3 escape unescape escapeHTML);
use IO::File ();
use Fcntl qw(:flock);
use POSIX qw(strftime);
use vars qw(@FIELDS %REQUIRED %BIG $GUESTBOOKFILE);
@FIELDS = qw(name e-mail location comments);
%REQUIRED = ('name' => 1, 'e-mail' => 1);
%BIG = ('comments' => 1);
if ($ENV{MOD_PERL}) {
   $GUESTBOOKFILE = Apache->request->dir_config('GuestbookFile');
}
$GUESTBOOKFILE ||= "/usr/tmp/guestbookfile.txt";
print header,
   start_html(-title => 'Guestbook', -bgcolor => 'silver'),
   h1("Guestbook");
 CASE: {
    $_ = param('action');
    /^sign/i and do    { sign_guestbook(); last CASE; };
    /^confirm/i and do { write_guestbook() and view_guestbook(); last CASE; };
    /^view/i and do    { view_guestbook(1); last CASE; };
    generate_form();
}
print end_html;
exit 0;
sub generate_form {
   my @rows;
   for my $field (@FIELDS) {
      my $title = "Your $field";
      $title .= " (optional)" if !$REQUIRED{$field};
      my $element = $BIG{$field} ?
          textarea(-name => $field,
                   -rows => 4,
                   -columns => 50,
                   -wrap => 1)
              : textfield(-name => $field, -size => 50);
      push @rows, th($title) . td($element);
   }
   print start_form,
   table(TR{-align => 'LEFT'}, \@rows),
   br,
   submit(-name => 'action', -value => 'View Guestbook'),
   submit(-name => 'action', -value => 'Sign Guestbook'),
   end_form;
}
sub sign_guestbook {
   my @missing = check_missing(@FIELDS);
   if (@missing) {
      print_warning(@missing);
      generate_form();
      return;
   }
   my @rows;
   foreach (@FIELDS) {
      push @rows, TR(th({-align=>'LEFT'},$_),
                     td(escapeHTML(param($_))));
   }
   print "Here is your guestbook entry.  Press ",
   em('Confirm')," to save it, or ",em('Change'),
   " to change it.",
   hr,
   table(@rows),
   hr;
    print start_form;
   foreach (@FIELDS) {
      print hidden(-name => $_);
   }
   print submit(-name => 'action',
               -value => 'Change Entry'),
   submit(-name => 'action',
         -value => 'Confirm Entry'),
   end_form;
}
sub check_missing {
   my %p;
   for (@_) { ++$p{$_} if param($_) }
   return grep !$p{$_}, keys %REQUIRED;
}
sub print_warning {
   print font({-color => 'red'},
             'Please fill in the following fields: ',
             em(join ', ', @_),
             '.');
}
sub write_guestbook {
   my $fh = lock($GUESTBOOKFILE, 1);
   unless ($fh) {
      print strong('An error occurred: unable to open guestbook file.'),p();
      Delete('action');
      print a({-href => self_url}, 'Try again');
      return;
   }
   seek ($fh,0,2);  # paranoia: seek to end of file
   my $date = strftime('%D',localtime);
   print $fh join("\t", $date, map {escape(param($_))} (@FIELDS)),"\n";
   print "Thank you, ", param('name'),", for signing the guestbook.\n";
   $fh->close;
   1;
}
sub view_guestbook {
   my $show_sign_button = shift;
   print start_form,
   submit(-name => 'Sign Guestbook'),
   end_form
      if $show_sign_button;
   my $fh = lock($GUESTBOOKFILE, 0);
    my @rows;
   unless ($fh) {
      print strong('An error occurred: unable to open guestbook file.'),br;
      Delete('action');
      print a({-href => self_url},'Try again');
      return;
   }
   while (<$fh>) {
      chomp;
      my @data = map {escapeHTML($_)} map {unescape($_)} split("\t");
      unshift @rows, td(\@data);
   }
   unshift @rows, th(['Date',@FIELDS]);
   print p(
          table({-border => ''},
                caption(strong('Previous Guests')),
                TR(\@rows)));
   $fh->close;
   print a({-href => '/'}, 'Home');
   1;
}
sub lock {
   my $path = shift;
   my $for_writing = shift;
    my ($lock_type, $path_name, $description);
   if ($for_writing) {
      $lock_type = LOCK_EX;
      $path_name = ">>$path";
      $description = 'writing';
   }
   else {
      $lock_type = LOCK_SH;
      $path_name = $path;
      $description = 'reading';
   }
    my $fh = IO::File->new($path_name) or
      warn "Couldn't open $path for $description: $!", return;
# now try to lock it
   my $success;
   my $tries = 0;
   while ($tries++ < 10) {
      last if $success = flock($fh, $lock_type|LOCK_NB);
      print p("Waiting for $description lock on guestbook file...");
      sleep(1);               # wait a second
   }
   unless ($success) {
      warn("Couldn't get lock for $description");
      return;
   }
   return $fh;
}

A .htaccess file to go with the guestbook script might be:

PerlSetVar GuestbookFile /home/www/etc/guests.txt
   Show Contents   Previous Page   Next Page
Copyright © 1999 by O'Reilly & Associates, Inc.

HIVE: All information for read only. Please respect copyright!
Hosted by hive КГБ: Киевская городская библиотека