Writing Apache Modules with Perl and C
By:   Lincoln Stein and Doug MacEachern
Published:   O'Reilly & Associates, Inc.  - March 1999

Copyright © 1999 by O'Reilly & Associates, Inc.


 



Chapter 7 - Other Request Phases
Customizing the Type Checking Phase

In this section...

Introduction
A DBI-Based Type Checker

Introduction


Following the successful completion of the access control and authentication steps (if configured), Apache tries to determine the MIME type (e.g., image/gif) and encoding type (e.g., x-gzip) of the requested document. The types and encodings are usually determined by filename extensions. (The term "suffix" is used interchangeably with "extension" in the Apache source code and documentation.) Table 7-1 lists a few common examples.

Table 7-1. MIME Types and Encodings for Common File Extensions

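MIME types

+-------------+-----------------+
| extension   | type            |
+-------------+-----------------+
| .txt        | text/plain      |
| .html, .htm | text/html       |
| .gif        | image/gif       |
| .jpg, .jpeg | image/jpeg      |
| .mpeg, .mpg | video/mpeg      |
| .pdf        | application/pdf |
+-------------+-----------------+

Encodings

+-----------+------------+
| extension | encoding   |
+-----------+------------+
| .gz       | x-gzip     |
| .Z        | x-compress |
+-----------+------------+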

By default, Apache's type checking phase is handled by the standard mod_mime module, which combines the information stored in the server's conf/mime.types file with AddType and AddEncoding directives to map file extensions onto MIME types and encodings.
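As a small illustration of the directives mentioned above, the following configuration fragment maps one extension onto a MIME type and another onto an encoding. The particular mappings are chosen only as examples and repeat entries from Table 7-1.

```apache
# Map the .gif extension onto the image/gif MIME type
AddType image/gif .gif
# Mark files ending in .gz as gzip-encoded
AddEncoding x-gzip .gz
```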

The contents of the request record's content_type field are used to set the default outgoing Content-Type header, which the client uses to decide how to render the document. However, as we've seen, content handlers can, and often do, change the content type during the later response phase.

In addition to its responsibility for choosing MIME and encoding types for the requested document, the type checking phase handler also performs the crucial task of selecting the content handler for the document. mod_mime looks first for a SetHandler directive in the current directory or location. If one is set, it uses that handler for the requested document. Otherwise, it dispatches the request based on the MIME type of the document. This process was described in more detail at the beginning of Chapter 4. Also see "Reimplementing mod_mime in Perl," in Chapter 8, Customizing the Apache Configuration Process, where we reproduce all of mod_mime's functionality with a Perl module.

A DBI-Based Type Checker


In this section, we'll show you a simple type checking handler that determines the MIME type of the document on the basis of a DBI database lookup. Each record of the database table will contain the name of the file, its MIME type, and its encoding.[6] If no type is registered in the database, we fall through to the default mod_mime handler.

This module, Apache::MimeDBI, makes use of the simple Tie::DBI class that was introduced in the previous chapter. Briefly, this class lets you tie a hash to a relational database table. The tied variable appears as a hash of hashes in which the outer hash is a list of table records indexed by the table's primary key and the inner hash contains the columns of that record, indexed by column name. To give a concrete example, for the purposes of this module we'll set up a database table named doc_types having this structure:

+----------+------------+------------+
| filename | mime_type  | encoding   |
+----------+------------+------------+
| test1    | text/plain | NULL       |
| test2    | text/html  | NULL       |
| test3    | text/html  | x-compress |
| test4    | text/html  | x-gzip     |
| test5    | image/gif  | NULL       |
+----------+------------+------------+

Assuming that a hash named %DB is tied to this table, we'll be able to access its columns in this way:

$type     = $DB{'test2'}{'mime_type'};
$encoding = $DB{'test2'}{'encoding'};
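To make the hash-of-hashes shape concrete without a running database, here is a minimal sketch that uses an ordinary in-memory hash in place of the tied one. The sample rows come from the table above; describe() is a hypothetical helper introduced only for this illustration and is not part of Tie::DBI.

```perl
use strict;
use warnings;

# In-memory stand-in for the tied %DB hash (an assumption made so the
# sketch runs without a database); keys are the doc_types primary key.
my %DB = (
    test2 => { filename => 'test2', mime_type => 'text/html', encoding => undef },
    test3 => { filename => 'test3', mime_type => 'text/html', encoding => 'x-compress' },
);

# describe() is a hypothetical helper, not part of Tie::DBI.
# It looks up one record by filename and summarizes its columns.
sub describe {
    my ($db, $file) = @_;
    my $record = $db->{$file}
        or return "$file: no record";
    my $encoding = defined $record->{encoding} ? $record->{encoding} : 'none';
    return "$file: $record->{mime_type} (encoding: $encoding)";
}

print describe(\%DB, $_), "\n" for qw(test2 test3 test9);
```

With a real tied hash the access syntax is the same; only the source of the data differs.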

Example 7-6 gives the source for Apache::MimeDBI.

package Apache::MimeDBI;
# file Apache/MimeDBI.pm
use strict;
use Apache::Constants qw(:common);
use Tie::DBI ();
use File::Basename qw(basename);
use constant DEFAULT_DSN    => 'mysql:test_www';
use constant DEFAULT_LOGIN  => ':';
use constant DEFAULT_TABLE  => 'doc_types';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';

The module starts by pulling in necessary Perl libraries, including Tie::DBI and the File::Basename filename parser. It also defines a series of default configuration constants. DEFAULT_DSN is the default DBI data source to use, in the format driver:database:host:port. DEFAULT_LOGIN is the username and password for the web server to use to log into the database, separated by a : character. Both fields are blank by default, indicating that no username or password needs to be provided. DEFAULT_TABLE is the name of the table in which to look for the MIME type and encoding information. DEFAULT_FIELDS are the names of the filename, MIME type, and encoding columns, again separated by the : character. These default values can be overridden with the per-directory Perl configuration variables MIMEDatabase, MIMELogin, MIMETable, and MIMEFields.

sub handler {
    my $r = shift;
    # get filename
    my $file = basename $r->filename;
    # get configuration information
    my $dsn        = $r->dir_config('MIMEDatabase') || DEFAULT_DSN;
    my $table      = $r->dir_config('MIMETable')    || DEFAULT_TABLE;
    my($filefield, $mimefield, $encodingfield) =
        split ':', $r->dir_config('MIMEFields') || DEFAULT_FIELDS;
    my($user, $pass) =
        split ':', $r->dir_config('MIMELogin') || DEFAULT_LOGIN;

The handler() subroutine begins by shifting the request object off the subroutine call stack and using it to recover the requested document's filename. The directory part of the filename is then stripped away using the basename() routine imported from File::Basename. Next, we fetch the values of our four configuration variables. If any are undefined, we default to the values defined by the previously declared constants.
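The fallback-and-split idiom can be tried outside the server. The sketch below is only an illustration: parse_setting() is a hypothetical helper standing in for the "split ':', $r->dir_config(NAME) || DEFAULT" expressions, and its first argument plays the role of whatever dir_config() returned (undef when the variable is not set). The file path is likewise made up.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

use constant DEFAULT_LOGIN  => ':';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';

# parse_setting() is a hypothetical helper: use the configured value
# if it is true, otherwise the default, then split on the colon.
sub parse_setting {
    my ($value, $default) = @_;
    return split ':', $value || $default;
}

# basename() strips the directory part of the (made-up) path.
my $file   = basename('/usr/local/apache/htdocs/mimedbi/test2');
my @fields = parse_setting(undef, DEFAULT_FIELDS);
my @login  = parse_setting(undef, DEFAULT_LOGIN);

print "file:   $file\n";
print "fields: @fields\n";
print "login fields: ", scalar(@login), "\n";
```

Note that splitting the default login string ':' yields an empty list, because split discards trailing empty fields. The username and password therefore end up undefined, which matches the "both fields are blank" behavior described earlier.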

    tie my %DB, 'Tie::DBI', {
        'db' => $dsn, 'table' => $table, 'key' => $filefield,
        'user' => $user, 'password' => $pass,
    };
    my $record;

We now tie a hash named %DB to the indicated database by calling the tie() operator. If the hash is successfully tied to the database, this routine will return a true value (actually, an object reference to the underlying Tie::DBI object itself). Otherwise, we return a value of DECLINED and allow other modules their chance at the MIME checking phase.

    return DECLINED unless tied %DB and $record = $DB{$file};

The next step is to check the tied hash to see if there is a record corresponding to the current filename. If there is, we store the record in a variable named $record. Otherwise, we again return DECLINED. This allows files that are not specifically named in the database to fall through to the standard file extension-based MIME type determination.

    $r->content_type($record->{$mimefield});
    $r->content_encoding($record->{$encodingfield})
        if $record->{$encodingfield};

Since the file is listed in the database, we fetch the values of the MIME type and encoding columns and write them into the request record by calling the request object's content_type() and content_encoding() methods, respectively. Since most documents do not have an encoding type, we only call content_encoding() if the encoding column holds a value.

    return OK;
}

Our work is done, so we exit the handler subroutine with an OK status code.
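The lookup-then-set flow of the handler can be exercised without mod_perl or a database. The following is a rough sketch under several assumptions: MockRequest is a hypothetical class that merely records what is set on it, the OK and DECLINED constants are local stand-ins for the ones exported by Apache::Constants, check_type() is an illustrative name, and a plain hash reference replaces the tied %DB.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Local stand-ins for the Apache::Constants status codes, so the
# sketch runs outside the server.
use constant OK       => 0;
use constant DECLINED => -1;

# MockRequest is hypothetical: it only stores the values it is given.
package MockRequest;
sub new              { my ($class, $path) = @_; return bless { filename => $path }, $class }
sub filename         { return $_[0]{filename} }
sub content_type     { my $self = shift; $self->{type}     = shift if @_; return $self->{type} }
sub content_encoding { my $self = shift; $self->{encoding} = shift if @_; return $self->{encoding} }

package main;

# check_type() mirrors the handler's decision: decline when there is
# no record, otherwise set the type (and the encoding, if any).
sub check_type {
    my ($r, $db) = @_;
    my $file   = basename($r->filename);
    my $record = $db->{$file} or return DECLINED;
    $r->content_type($record->{mime_type});
    $r->content_encoding($record->{encoding}) if $record->{encoding};
    return OK;
}

my $db = { test4 => { mime_type => 'text/html', encoding => 'x-gzip' } };
for my $path ('/htdocs/mimedbi/test4', '/htdocs/mimedbi/test7') {
    my $r = MockRequest->new($path);
    if (check_type($r, $db) == OK) {
        print "$path: OK, type ", $r->content_type, "\n";
    }
    else {
        print "$path: DECLINED\n";
    }
}
```

A file with no record is declined, which in the real server leaves the decision to mod_mime.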

At the end of the code listing is a short shell script which you can use to initialize a test database named test_www. It will create the table shown in this example.

To install this module, add a PerlTypeHandler directive like this one to one of the configuration files or a .htaccess file:

<Location /mimedbi>
 PerlTypeHandler Apache::MimeDBI
</Location>

If you need to change the name of the database, the login information, or the table structure, be sure to include the appropriate PerlSetVar directives as well.
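For instance, a configuration that overrides all four defaults might look like the fragment below. The variable names are the ones read by the handler; every value shown (database name, login, table, and column names) is a made-up placeholder, not something taken from a real installation.

```apache
<Location /mimedbi>
 PerlTypeHandler Apache::MimeDBI
 # All values below are hypothetical placeholders
 PerlSetVar MIMEDatabase mysql:www_docs
 PerlSetVar MIMELogin    www:secret
 PerlSetVar MIMETable    file_types
 PerlSetVar MIMEFields   name:type:enc
</Location>
```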

Figure 7-2 shows the automatic listing of a directory under the control of Apache::MimeDBI. The directory contains several files. test1 through test5 are listed in the database with the MIME types and encodings shown in the previous table. Their icons reflect the MIME types and encodings returned by the handler subroutine. This MIME type will also be passed to the browser when it loads and renders the document. test6.html doesn't have an entry in the database, so it falls through to the standard MIME checking module, which figures out its type through its file extension. test7 has neither an entry in the database nor a recognized file extension, so it is displayed with the "unknown document" icon. Without help from Apache::MimeDBI, all the files without extensions would end up as unknown MIME types.

Figure 7-2. An automatic listing of a directory controlled by Apache::MimeDBI

If you use this module, you should be sure to install and load Apache::DBI during the server startup phase, as described in Chapter 5. This will make the underlying database connections persistent, dramatically decreasing the time necessary for the handler to do its work.
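One common way to do this, shown here as a sketch rather than the only option, is to preload Apache::DBI from the server configuration file ahead of any module that uses DBI:

```apache
# Load Apache::DBI before any other module that uses DBI
PerlModule Apache::DBI
PerlModule Apache::MimeDBI
```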

Example 7-6. A DBI-Based MIME Type Checker

package Apache::MimeDBI;
# file Apache/MimeDBI.pm
use strict;
use Apache::Constants qw(:common);
use Tie::DBI ();
use File::Basename qw(basename);
use constant DEFAULT_DSN    => 'mysql:test_www';
use constant DEFAULT_LOGIN  => ':';
use constant DEFAULT_TABLE  => 'doc_types';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';
sub handler {
    my $r = shift;
    # get filename
    my $file = basename $r->filename;
    # get configuration information
    my $dsn        = $r->dir_config('MIMEDatabase') || DEFAULT_DSN;
    my $table      = $r->dir_config('MIMETable')    || DEFAULT_TABLE;
    my($filefield, $mimefield, $encodingfield) =
        split ':', $r->dir_config('MIMEFields') || DEFAULT_FIELDS;
    my($user, $pass) =
        split ':', $r->dir_config('MIMELogin') || DEFAULT_LOGIN;
    # pull information out of the database
    tie my %DB, 'Tie::DBI', {
        'db' => $dsn, 'table' => $table, 'key' => $filefield,
        'user' => $user, 'password' => $pass,
    };
    my $record;
    return DECLINED unless tied %DB and $record = $DB{$file};
    # set the content type and encoding
    $r->content_type($record->{$mimefield});
    $r->content_encoding($record->{$encodingfield})
        if $record->{$encodingfield};
    return OK;
}
1;
__END__
# Here's a shell script to add the test data:
#!/bin/sh
mysql test_www <<END
DROP TABLE IF EXISTS doc_types;
CREATE TABLE doc_types (
      filename        char(127) primary key,
      mime_type       char(30)  not null,
      encoding        char(30)
);
INSERT into doc_types values ('test1','text/plain',null);
INSERT into doc_types values ('test2','text/html',null);
INSERT into doc_types values ('test3','text/html','x-compress');
INSERT into doc_types values ('test4','text/html','x-gzip');
INSERT into doc_types values ('test5','image/gif',null);
END

Footnotes

[6] An obvious limitation of this module is that it can't distinguish between similarly named files in different directories.