Writing Apache Modules with Perl and C

By:	Lincoln Stein and Doug MacEachern
Published:	O'Reilly & Associates, Inc. - March 1999

Chapter 4 - Content Handlers / Content Handlers as File Processors
Converting Image Formats

Another useful application of Apache content handlers is converting file formats on the fly. For example, with a little help from the Aladdin Ghostscript interpreter, you can dynamically convert Adobe Acrobat (PDF) files into GIF images when dealing with a browser that doesn't have the Acrobat plug-in installed.

In this section, we show a content handler that converts image files on the fly. It takes advantage of Kyle Shorter's Image::Magick package, the Perl interface to John Cristy's ImageMagick library. Image::Magick interconverts a large number of image formats, including JPEG, PNG, TIFF, GIF, MPEG, PPM, and even PostScript. It can also transform images in various ways, such as cropping, rotating, solarizing, sharpening, sampling, and blurring.

The Apache::Magick content handler accepts URIs in this form:

/path/to/image.ext/Filter1/Filter2?arg=value&arg=value...

In its simplest form, the handler can be used to perform image format conversions on the fly. For example, if the actual file is named bluebird.gif and you request bluebird.jpg, the content handler automatically converts the GIF into a JPEG file and returns it. You can also pass arguments to the converter in the query string. For example, to specify a progressive JPEG image (interlace = "Line") with a quality of 50 percent, you can fetch the file by requesting a URI like this one:

/images/bluebird.jpg?interlace=Line&quality=50

You can also run one or more filters on the image prior to the conversion. For example, to apply the "Charcoal" filter (which makes the image look like a charcoal sketch) and then put a decorative border around it (the "Frame" filter), you can request the image like this:

/images/bluebird.jpg/Charcoal/Frame?quality=75

Any named arguments that need to be passed to the filter can be appended to the query string, along with the conversion arguments. In the last example, we can specify a gold-colored frame this way:

/images/bluebird.jpg/Charcoal/Frame?quality=75&color=gold

This API doesn't allow you to direct arguments to specific filters. Fortunately, most of the filters that you might want to apply together don't have overlapping argument names, and filters ignore any arguments that don't apply to them. The full list of filters and conversion operations can be found at the PerlMagick web site, located at http://www.wizards.dupont.com/cristy/www/perl.html. You'll find pointers to the latest ImageMagick code library there as well.
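To make the URI layout concrete, here is a minimal standalone sketch that splits such a request into its three parts. The helper name split_magick_uri() is hypothetical and not part of Apache::Magick. Inside the real handler Apache does this work itself and hands the pieces over through filename(), path_info(), and args(); the sketch also assumes that the image is the first path segment carrying an extension, whereas Apache decides by looking at the filesystem.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: split a request URI into the
# image path, the list of filter names, and the query-string arguments.
sub split_magick_uri {
    my $uri = shift;
    my ($path, $query) = split /\?/, $uri, 2;

    # Assumption: the image is the first path segment with an extension;
    # whatever follows it is the additional path information.
    my ($image, $path_info) = $path =~ m{^(.*?\.\w+)(/.*)?$};
    return unless defined $image;
    $path_info = '' unless defined $path_info;
    $query     = '' unless defined $query;

    my @filters = grep { length } split m{/}, $path_info;
    my %args    = map  { my ($k, $v) = split /=/, $_, 2; ($k => $v) }
                  grep { /=/ } split /&/, $query;

    return ($image, \@filters, \%args);
}

my ($image, $filters, $args) =
    split_magick_uri('/images/bluebird.jpg/Charcoal/Frame?quality=75&color=gold');

print "image:   $image\n";
print "filters: @$filters\n";
print "$_ = $args->{$_}\n" for sort keys %$args;
```

Because every filter receives the same argument hash, quality and color in this request are offered to both Charcoal and Frame.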

One warning before you use this Apache module on your system: some of the operations can be very CPU-intensive, particularly when converting an image with many colors, such as JPEG, to one that has few colors, such as GIF. You should also be prepared for Image::Magick's memory consumption, which is nothing short of voracious.

Example 4-5 shows the code for Apache::Magick.

package Apache::Magick;

use strict;
use Apache::Constants qw(:common);
use Image::Magick ();
use Apache::File ();
use File::Basename qw(fileparse); 
use DirHandle ();

We begin as usual by bringing in the modules we need: Apache::Constants, Apache::File for its temporary file support, File::Basename for its file path parsing utilities, DirHandle for an object-oriented interface to the directory reading functions, and the Image::Magick module itself.

my %LegalArguments = map { $_ => 1 }
qw (adjoin background bordercolor colormap colorspace
    colors compress density dispose delay dither
    display font format iterations interlace
    loop magick mattecolor monochrome page pointsize
    preview_type quality scene subimage subrange
    size tile texture treedepth undercolor);
my %LegalFilters = map { $_ => 1 }
qw(AddNoise Blur Border Charcoal Chop
  Contrast Crop Colorize Comment CycleColormap
  Despeckle Draw Edge Emboss Enhance Equalize Flip Flop
  Frame Gamma Implode Label Layer Magnify Map Minify
  Modulate Negate Normalize OilPaint Opaque Quantize
  Raise ReduceNoise Rotate Sample Scale Segment Shade
  Sharpen Shear Solarize Spread Swirl Texture Transparent
  Threshold Trim Wave Zoom);

We then define two hashes, one for all the filter and conversion arguments recognized by Image::Magick and the other for the various filter operations that are available. These lists were cut and pasted from the Image::Magick documentation. We tried to exclude the ones that were not relevant to this module, such as ones that create multiframe animations, but a few may have slipped through.

sub handler {
    my $r = shift;

    # get the name of the requested file
    my $file = $r->filename;

    # If the file exists and there are no transformation arguments
    # just decline the transaction.  It will be handled as usual.
    return DECLINED unless $r->args || $r->path_info || !-r $r->finfo;

The handler() routine begins as usual by fetching the name of the requested file. We decline to handle the transaction if the file exists, the query string is empty, and the additional path information is empty as well. This is just the common case of the browser trying to fetch an unmodified existing file.

    my $source;
    my ($base, $directory, $extension) = fileparse($file, '\.\w+');
    if (-r $r->finfo) { # file exists, so it becomes the source
        $source = $file;
    }
    else {              # file doesn't exist, so we search for it
        return DECLINED unless -r $directory;
        $source = find_image($r, $directory, $base);
    }

    unless ($source) {
        $r->log_error("Couldn't find a replacement for $file");
        return NOT_FOUND;
    }

We now use File::Basename's fileparse() function to parse the requested file into its basename (the filename without the extension), the directory name, and the extension. We check again whether we can read the file, and if so it becomes the source for the conversion. Otherwise, we search the directory for another image file to convert into the format of the requested file. For example, if the URI requested is bluebird.jpeg and we find a file named bluebird.gif, we invoke Image::Magick to do the conversion. The search is done by an internal subroutine named find_image(), which we'll examine later. If successful, the name of the source image is stored in $source. If unsuccessful, we log the error with the log_error() function and return a NOT_FOUND result code.
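As a quick standalone illustration of what fileparse() hands back with the suffix pattern used above, consider the following sketch. The pathname is made up for the example.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical physical path, as $r->filename might report it.
my $file = '/usr/local/apache/htdocs/images/bluebird.jpg';

# The second argument is a regular expression describing the suffix to
# strip from the end of the filename.
my ($base, $directory, $extension) = fileparse($file, '\.\w+');

print "base:      $base\n";        # bluebird
print "directory: $directory\n";   # /usr/local/apache/htdocs/images/
print "extension: $extension\n";   # .jpg
```

Note that the directory comes back with its trailing slash and the extension with its leading dot. This is why find_image() can later build a full pathname with a plain join '', and why the handler strips the dot before handing the extension to Image::Magick.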

    $r->send_http_header;
    return OK if $r->header_only;

At this point, we send the HTTP header using send_http_header(). The next line represents an optimization that we haven't seen before. It may be that the client isn't interested in the content of the image file, but just in its meta-information, such as its length and MIME type. In this case, the browser sends an HTTP HEAD request rather than the usual GET. When Apache receives a HEAD request, it sets header_only() to true. If we see that this has happened, we return from the handler immediately with an OK status code. Although it wouldn't hurt to send the document body anyway, respecting the HEAD request results in a slight savings in processing efficiency and makes the module compliant with the HTTP protocol.

    my $q = Image::Magick->new;
    my $err = $q->Read($source);

Otherwise, it's time to read the source image into memory. We create a new Image::Magick object, store it in a variable named $q, and then load the source image file by calling its Read() method. Any error message returned by Read() is stored into a variable called $err.

    my %arguments = $r->args;

    # Run the filters
    for (split '/', $r->path_info) {
        my $filter = ucfirst $_;
        next unless $LegalFilters{$filter};
        $err ||= $q->$filter(%arguments);
    }

    # Remove invalid arguments before the conversion
    for (keys %arguments) {
        delete $arguments{$_} unless $LegalArguments{$_};
    }

The next phase of the process is to prepare for the image manipulation. The first thing we do is tidy up the input parameters. We retrieve the query string parameters by calling the request object's args() method and store them in a hash named %arguments.

We then call the request object's path_info() method to retrieve the additional path information. We split the path info into a series of filter names and canonicalize them by capitalizing their initial letters using the Perl built-in operator ucfirst(). Each of the filters is applied in turn, skipping over any that aren't on the list of filters that Image::Magick accepts. We do an OR assignment into $err, so that we maintain the first non-null error message, if any. Having run the filters, we remove from the %arguments hash any arguments that aren't valid in Image::Magick's file format conversion calls.
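The whitelisting step can be tried out without Apache or Image::Magick. The sketch below uses abbreviated versions of the two hashes and a made-up request. Instead of calling $q->$filter(%arguments), it simply records which filters would have run.

```perl
use strict;
use warnings;

# Abbreviated stand-ins for the module's two lookup tables.
my %LegalFilters   = map { $_ => 1 } qw(Blur Charcoal Frame);
my %LegalArguments = map { $_ => 1 } qw(interlace quality);

# Made-up request data, as path_info() and args() might return it.
my $path_info = '/charcoal/Frame/unlink';
my %arguments = (quality => 75, color => 'gold', bogus => 1);

my @applied;
for (split '/', $path_info) {
    my $filter = ucfirst $_;              # "charcoal" becomes "Charcoal"
    next unless $LegalFilters{$filter};   # drops "" and "Unlink"
    push @applied, $filter;               # the handler calls $q->$filter here
}

# Only conversion arguments survive for the final Write() call.
for (keys %arguments) {
    delete $arguments{$_} unless $LegalArguments{$_};
}

print "filters applied: @applied\n";
print "arguments kept:  ", join(', ', sort keys %arguments), "\n";
```

The filter whitelist is more than a convenience. Because $q->$filter is a method call whose name comes straight from the URI, the lookup in %LegalFilters is what keeps a remote user from invoking arbitrary methods on the Image::Magick object.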

    # Create a temporary file name to use for conversion
    my($tmpnam, $fh) = Apache::File->tmpfile;

Image::Magick needs to write the image to a temporary file. We call the Apache::File tmpfile() method to create a suitable temporary file name. If successful, tmpfile() returns the name of the temporary file, which we store in the variable $tmpnam, and a filehandle open for writing into the file, which we store in the variable $fh. The tmpfile() method is specially written to avoid a "race condition" in which the temporary file name appears to be unused when the module first checks for it but is created by someone else before it can be opened.
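One common way to close this kind of race, sketched below, is to let the operating system check for the name and create the file in a single step by opening it with the O_CREAT and O_EXCL flags. This is an illustration of the general technique, not a description of how Apache::File is implemented. The helper name exclusive_tmpfile() and the "magick-" naming scheme are invented for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Spec ();

# Hypothetical helper: return (name, filehandle) for a brand-new file.
sub exclusive_tmpfile {
    my $dir = File::Spec->tmpdir;
    for my $try (1 .. 100) {
        my $name = File::Spec->catfile(
            $dir, sprintf("magick-%d-%d-%d", $$, time, $try));

        # O_CREAT|O_EXCL creates the file but fails if the name is already
        # taken. The test and the creation happen in one system call, so
        # no other process can slip in between them.
        sysopen(my $fh, $name, O_WRONLY | O_CREAT | O_EXCL, 0600) or next;
        return ($name, $fh);
    }
    return;   # every candidate name was already in use
}

my ($tmpnam, $fh) = exclusive_tmpfile() or die "No temporary file available";
print "writing to $tmpnam\n";
close $fh;
unlink $tmpnam;
```

Returning the open filehandle together with the name matters: the caller never has to reopen the file by name, which is the moment at which a race would otherwise be possible.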

    # Write out the modified image
    open(STDOUT, ">&=" . fileno($fh));

The next task is to have Image::Magick perform the requested conversion and write it to the temporary file. The safest way to do this would be to pass it the temporary file's already opened filehandle. Unfortunately, Image::Magick doesn't accept filehandles; its Write() method expects a filename, or the special filename - to write to standard output. However, we can trick it into writing to the filehandle by reopening standard output on the filehandle, which we do by passing the filehandle's numeric file descriptor to open() using the rarely seen >&= notation. See the open() entry in the perlfunc manual page for complete details.

Since STDOUT gets reset before every Perl API transaction, there's no need to save and restore its original value.
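The redirection trick can be tried in an ordinary Perl script. In the sketch below, File::Temp stands in for Apache::File->tmpfile (note that it returns the filehandle first and the name second), and a plain print stands in for Image::Magick's Write() call. Because nothing resets STDOUT for us outside mod_perl, the script saves and restores it explicitly.

```perl
use strict;
use warnings;
use IO::Handle ();
use File::Temp qw(tempfile);

# Stand-in for Apache::File->tmpfile; the return order is (handle, name).
my ($fh, $tmpnam) = tempfile(UNLINK => 1);

# Outside mod_perl we must keep a duplicate of the real STDOUT ourselves.
open(my $saved, '>&', \*STDOUT) or die "Can't save STDOUT: $!";

# Reopen STDOUT on the temporary file's numeric file descriptor.
open(STDOUT, '>&=', fileno($fh)) or die "Can't redirect STDOUT: $!";

# Anything written to standard output now lands in the temporary file.
# In the handler, this is where $q->Write('filename' => 'JPG:-') runs.
print STDOUT "pretend image data\n";
STDOUT->flush;

# Put the original STDOUT back and release the temporary filehandle.
open(STDOUT, '>&', $saved) or die "Can't restore STDOUT: $!";
close $fh;

# Read the file back to confirm where the output went.
open(my $in, '<', $tmpnam) or die "Can't reopen $tmpnam: $!";
my $content = do { local $/; <$in> };
close $in;
print "temp file holds: $content";
```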

    $extension =~ s/^\.//;
    $err ||= $q->Write('filename' => "\U$extension\L:-", %arguments);
    if ($err) {
        unlink $tmpnam;
        $r->log_error($err);
        return SERVER_ERROR;
    }
    close $fh;

We now call Image::Magick's Write() method with the argument 'filename'=>EXTENSION:- where EXTENSION is the uppercased extension of the document that the remote user requested. We also tack on any conversion arguments that were requested. For example, if the remote user requested bluebird.jpg?quality=75, the call to Write() ends up looking like this:

$q->Write('filename'=>'JPG:-','quality'=>75);

If any errors occurred during this step or the previous ones, we delete the temporary file, log the errors, and return a SERVER_ERROR status code.
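The string interpolation that builds the filename argument is compact enough to be worth unpacking. This small sketch assembles the argument list on its own, without calling Image::Magick, using the extension and query argument from the example above.

```perl
use strict;
use warnings;

my $extension = '.jpg';             # as returned by fileparse()
my %arguments = (quality => 75);    # what is left after whitelisting

$extension =~ s/^\.//;              # strip the leading dot: "jpg"

# \U uppercases everything up to the \L, so the extension becomes "JPG".
# The ":-" that follows contains no letters and passes through unchanged.
my @write_args = ('filename' => "\U$extension\L:-", %arguments);

print "@write_args\n";              # filename JPG:- quality 75
```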

    # At this point the conversion is all done!
    # reopen for reading
    $fh = Apache::File->new($tmpnam);
    unless ($fh) {
        $r->log_error("Couldn't open $tmpnam: $!");
        return SERVER_ERROR;
    }

    # send the file
    $r->send_fd($fh);

    # clean up and go
    unlink $tmpnam;
    return OK;
}

If the call to Write() was successful, we need to send the contents of the temporary file to the waiting browser. We could open the file, read its contents, and send it off using a series of print() calls, as we've done previously, but in this case there's a slightly easier way. After reopening the file with Apache::File's new() method, we call the request object's send_fd() method to transmit the contents of the filehandle in one step. The send_fd() method accepts all the same filehandle data types as the Perl built-in I/O operators. After sending off the file, we clean up by unlinking the temporary file and returning an OK status.

We'll now turn our attention to the find_image() subroutine, which is responsible for searching the directory for a suitable file to use as the image source if the requested file can't be found:

sub find_image {
    my ($r, $directory, $base) = @_;
    my $dh = DirHandle->new($directory) or return;

The find_image() utility subroutine is straightforward. It takes the request object, the parsed directory name, and the basename of the requested file and attempts to search this directory for an image file that shares the same basename. The routine opens a directory handle with DirHandle->new() and iterates over its entries.

    my $source;
    for my $entry ($dh->read) {
        my $candidate = fileparse($entry, '\.\w+');
        if ($base eq $candidate) {
            # determine whether this is an image file
            $source = join '', $directory, $entry;
            my $subr = $r->lookup_file($source);
            last if $subr->content_type =~ m:^image/:;
            undef $source;
        }
    }

For each entry in the directory listing, we parse out the basename using fileparse(). If the basename is identical to the one we're searching for, we call the request object's lookup_file() method to activate an Apache subrequest. lookup_file() is similar to lookup_uri(), which we saw earlier in the context of server-side includes, except that it accepts a physical pathname rather than a URI. Because of this, lookup_file() will skip the URI translation phase, but it will still cause Apache to trigger all the various handlers up to, but not including, the content handler.

In this case, we're using the subrequest for the sole purpose of getting at the MIME type of the file. If the file is indeed an image of one sort or another, then we keep its pathname in the lexical variable $source and exit the loop. Otherwise, we undefine $source and keep searching.

    $dh->close;
    return $source;
}
}

At the end of the loop, $source will be undefined if no suitable image file was found, or it will contain the full pathname to the image file if we were successful. We close the directory handle, and return $source.
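The directory search can be exercised outside Apache with a small variation. The sketch below is not the module's code: since no request object is available, it replaces the lookup_file() subrequest with a hypothetical table of image extensions, and it takes only the directory and basename as arguments. The function name find_image_by_ext() and the test files are invented for the example.

```perl
use strict;
use warnings;
use DirHandle ();
use File::Basename qw(fileparse);
use File::Temp qw(tempdir);

# Assumption: stands in for the MIME type check done via lookup_file().
my %IsImageExt = map { $_ => 1 } qw(.gif .jpg .jpeg .png .tif .tiff);

sub find_image_by_ext {
    my ($directory, $base) = @_;
    my $dh = DirHandle->new($directory) or return;
    my $source;
    for my $entry ($dh->read) {
        my ($candidate, undef, $ext) = fileparse($entry, '\.\w+');
        next unless $candidate eq $base && $IsImageExt{lc $ext};
        $source = $directory . $entry;   # $directory ends with a slash
        last;
    }
    $dh->close;
    return $source;
}

# Build a scratch directory holding one image and one non-image "bluebird".
my $dir = tempdir(CLEANUP => 1) . '/';
for my $name (qw(bluebird.gif bluebird.txt robin.png)) {
    open(my $fh, '>', $dir . $name) or die "Can't create $name: $!";
    close $fh;
}

my $found = find_image_by_ext($dir, 'bluebird');
print "source: $found\n";
```

Guessing from the extension is cruder than the subrequest used by Apache::Magick, which asks Apache for the type using whatever MIME configuration the server has for that directory.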

Example 4-5. Apache::Magick Converts Image Formats on the Fly

package Apache::Magick;
# file: Apache/Magick.pm

use strict;
use Apache::Constants qw(:common);
use Image::Magick ();
use Apache::File ();
use File::Basename qw(fileparse);
use DirHandle ();

my %LegalArguments = map { $_ => 1 }
qw (adjoin background bordercolor colormap colorspace
    colors compress density dispose delay dither
    display font format iterations interlace
    loop magick mattecolor monochrome page pointsize
    preview_type quality scene subimage subrange
    size tile texture treedepth undercolor);
my %LegalFilters = map { $_ => 1 }
qw(AddNoise Blur Border Charcoal Chop
   Contrast Crop Colorize Comment CycleColormap
   Despeckle Draw Edge Emboss Enhance Equalize Flip Flop
   Frame Gamma Implode Label Layer Magnify Map Minify
   Modulate Negate Normalize OilPaint Opaque Quantize
   Raise ReduceNoise Rotate Sample Scale Segment Shade
   Sharpen Shear Solarize Spread Swirl Texture Transparent
   Threshold Trim Wave Zoom);

sub handler {
    my $r = shift;

    # get the name of the requested file
    my $file = $r->filename;

    # If the file exists and there are no transformation arguments
    # just decline the transaction.  It will be handled as usual.
    return DECLINED unless $r->args || $r->path_info || !-r $r->finfo;

    my $source;
    my ($base, $directory, $extension) = fileparse($file, '\.\w+');
    if (-r $r->finfo) { # file exists, so it becomes the source
        $source = $file;
    }
    else {              # file doesn't exist, so we search for it
        return DECLINED unless -r $directory;
        $source = find_image($r, $directory, $base);
    }

    unless ($source) {
        $r->log_error("Couldn't find a replacement for $file");
        return NOT_FOUND;
    }

    $r->send_http_header;
    return OK if $r->header_only;

    # Read the image
    my $q = Image::Magick->new;
    my $err = $q->Read($source);

    # Conversion arguments are kept in the query string, and the
    # image filter operations are kept in the path info
    my %arguments = $r->args;

    # Run the filters
    for (split '/', $r->path_info) {
        my $filter = ucfirst $_;
        next unless $LegalFilters{$filter};
        $err ||= $q->$filter(%arguments);
    }

    # Remove invalid arguments before the conversion
    for (keys %arguments) {
        delete $arguments{$_} unless $LegalArguments{$_};
    }

    # Create a temporary file name to use for conversion
    my($tmpnam, $fh) = Apache::File->tmpfile;

    # Write out the modified image
    open(STDOUT, ">&=" . fileno($fh));
    $extension =~ s/^\.//;
    $err ||= $q->Write('filename' => "\U$extension\L:-", %arguments);
    if ($err) {
        unlink $tmpnam;
        $r->log_error($err);
        return SERVER_ERROR;
    }
    close $fh;

    # At this point the conversion is all done!
    # reopen for reading
    $fh = Apache::File->new($tmpnam);
    unless ($fh) {
        $r->log_error("Couldn't open $tmpnam: $!");
        return SERVER_ERROR;
    }

    # send the file
    $r->send_fd($fh);

    # clean up and go
    unlink $tmpnam;
    return OK;
}

sub find_image {
    my ($r, $directory, $base) = @_;
    my $dh = DirHandle->new($directory) or return;
    my $source;
    for my $entry ($dh->read) {
        my $candidate = fileparse($entry, '\.\w+');
        if ($base eq $candidate) {
            # determine whether this is an image file
            $source = join '', $directory, $entry;
            my $subr = $r->lookup_file($source);
            last if $subr->content_type =~ m:^image/:;
            undef $source;
        }
    }
    $dh->close;
    return $source;
}

1;
__END__

Here is a perl.conf entry to go with Apache::Magick:

<Location /images>
 SetHandler perl-script
 PerlHandler Apache::Magick
</Location>
