Show Contents Previous Page Next Page Chapter 7 - Other Request Phases In this section...
The very last phase of the transaction before the cleanup at the end is the logging phase. At this point, the request record contains everything there is to know about the transaction, including the content handler's final status code and the number of bytes transferred from the server to the client. Apache's built-in logging module mod_log_config ordinarily handles this phase by writing a line of summary information to the transfer log. As its name implies, this module is highly configurable. You can give it printf()-like format strings to customize the appearance of the transfer log to your requirements, have it open multiple log files, or even have it pipe the log information to an external process for special processing. By handling the logging phase yourself, you can perform special processing at the end of each transaction. For example, you can update a database of cumulative hits, bump up a set of hit count files, or notify the owner of a document that his page has been viewed. There are a number of log handlers on CPAN, including Apache::DBILogger, which sends log information to a relational database, and Apache::Traffic, which keeps summaries of bytes transferred on a per-user basis. Sending Email When a Page Is Hit Show Contents Go to Top Previous Page Next PageThe first example of a log handler that we'll show is Apache::LogMail. It sends email to a designated address whenever a particular page is hit and can be used in low-volume applications, such as the vanity home pages of ISP customers. A typical configuration directive would look like this: <Location /~kryan> PerlLogHandler Apache::LogMail PerlSetVar LogMailto kryan@public.com PerlSetVar LogPattern \.(html|txt)$ </Location> With this configuration in place, hits on pages in the /~kryan directory will generate email messages. The LogMailto Perl configuration variable specifies kryan@public.com as the lucky recipient of these messages, and LogPattern specifies that only files ending with .html or .txt will generate messages (thus eliminating noise caused by hits on inline images). Example 7-8 shows the code. After the usual preliminaries, we define the logging phase's handler() routine: sub handler { my $r = shift; my $mailto = $r->dir_config('LogMailto'); return DECLINED unless $mailto; my $filepattern = $r->dir_config('LogPattern'); return DECLINED if $filepattern && $r->filename !~ /$filepattern/; The subroutine begins by fetching the contents of the LogMailto configuration variable. If none is defined, it declines the transaction. Next it fetches the contents of LogPattern. If it finds a pattern, it compares it to the requested document's filename and again declines the transaction if no match is found. my $request = $r->the_request; my $uri = $r->uri; my $agent = $r->header_in("User-agent"); my $bytes = $r->bytes_sent; my $remote = $r->get_remote_host; my $status = $r->status_line; my $date = localtime; Now the subroutine gathers up various fields of interest from the request object, including the requested URI, the User-Agent header, the name of the remote host, and the number of bytes sent (method bytes_sent()). local $ENV{PATH}; #keep -T happy unless (open MAIL, "|/usr/lib/sendmail -oi -t") { $r->log_error("Couldn't open mail: $!"); return DECLINED; } We open a pipe to the sendmail program and use it to send a message to the designated user with the information we've gathered.7 The flags used to open up the sendmail pipe instruct it to take the recipient's address from the header rather than the command line and prevent it from terminating prematurely if it sees a line consisting of a dot. print MAIL <<END; To: $mailto From: mod_perl httpd <$from> Subject: Somebody looked at $uri At $date, a user at $remote looked at $uri using the $agent browser. The request was $request, which resulted returned a code of $status. $bytes bytes were transferred. END close MAIL; return OK; }
All text that we print to the The final email message will look something like this: From: Mod Perl <webmaster@public.com> To: kryan@public.com Subject: Somebody looked at /~kryan/guestbook.txt Date: Thu, 27 Aug 1998 08:14:23 -0400 At Thu Aug 27 08:14:23 1998, a user at 192.168.2.1 looked at The request was GET /~kryan/guestbook.txt HTTP/1.0, which resulted returned a code of 200 OK. 462 bytes were transferred. Example 7-8. A Logging Module to Notify of Hits via Email package Apache::LogMail; # File: Apache/LogMail.pm use strict; use Apache::Constants qw(:common); sub handler { my $r = shift; my $mailto = $r->dir_config('LogMailto'); return DECLINED unless $mailto; my $filepattern = $r->dir_config('LogPattern'); return DECLINED if $filepattern && $r->filename !~ /$filepattern/; my $request = $r->the_request; my $uri = $r->uri; my $agent = $r->header_in("User-agent"); my $bytes = $r->bytes_sent; my $remote = $r->get_remote_host; my $status = $r->status_line; my $date = localtime; my $from = $r->server->server_admin || "webmaster"; print MAIL <<END; To: $mailto From: mod_perl httpd <$from> Subject: Somebody looked at $uri At $date, a user at $remote looked at $uri using the $agent browser. The request was $request, which resulted returned a code of $status. $bytes bytes were transferred. END close MAIL; return OK; } 1; __END__Show Contents Go to Top Previous Page Next Page The second example of a log phase handler is a DBI database logger. The information from the transaction is sent to a relational database using the DBI interface. The record of each transaction is appended to the end of a relational table, which can be queried and summarized in a myriad of ways using SQL. This is a skeletal version of the much more complete Apache::DBILog and Apache::DBILogConfig modules, which you should consult before rolling your own. In preparation to use this module you'll need to set up a database with the appropriate table definition. A suitable MySQL table named access_log is shown here: +---------+--------------+------+-----+---------------------+-------+ | Field | Type | Null | Key | Default | Extra | +---------+--------------+------+-----+---------------------+-------+ | when | datetime | | | 0000-00-00 00:00:00 | | | host | char(255) | | | | | | method | char(4) | | | | | | url | char(255) | | | | | | auth | char(50) | YES | | NULL | | | browser | char(50) | YES | | NULL | | | referer | char(255) | YES | | NULL | | | status | int(3) | | | 0 | | | bytes | int(8) | YES | | 0 | | +---------+--------------+------+-----+---------------------+-------+ This table can be created with the following script: #!/bin/sh mysql -B test_www <<END create table access_log ( when datetime not null, host varchar(255) not null, method varchar(4) not null, url varchar(255) not null, auth varchar(50), browser varchar(50), referer varchar(255), status smallint(3) default 0, bytes int(8) ); END The database must be writable by the web server, which should be provided with the appropriate username and password to log in. The code (Example 7-9) is short and very similar to the previous example, so we won't reproduce it inline. We begin by bringing in modules that we need, including DBI and the ht_time() function from Apache::Util. Next we declare some constants defining the database, table, and database login information. Since this is just a skeleton of a module, we have hardcoded these values instead of taking them from PerlSetVar configuration directives. You can follow the model of Apache::MimeDBI if you wish to make this module more configurable.
The handler() subroutine recovers the request object and uses it to fetch all the information we're interested in recording, which we store in locals. We also call ht_time() to produce a nicely formatted representation of the request_time() in a format that SQL accepts. We connect to the database and create a statement handle containing a SQL INSERT statement. We invoke the statement handler's execute() statement to write the information into the database, and return with a status code of
The only trick to this handler, which we left out of Apache::LogMail, is the use of the last() to recover the request object. last() returns the final request object in a chain of internal redirects and other subrequests. Usually there are no subrequests, and last() just returns the main (first) request object, in which case the Example 7-9. A DBI Database Log Handler package Apache::LogDBI; # file: Apache/LogDBI.pm use Apache::Constants qw(:common); use strict; use DBI (); use Apache::Util qw(ht_time); use constant DSN => 'dbi:mysql:test_www'; use constant DB_TABLE => 'access_log'; use constant DB_AUTH => ':'; sub handler { my $orig = shift; my $r = $orig->last; my $date = ht_time($orig->request_time, '%Y-%m-%d %H:%M:%S', 0); my $host = $r->get_remote_host; my $method = $r->method; my $url = $orig->uri; my $user = $r->connection->user; my $referer = $r->header_in('Referer'); my $browser = $r->header_in('User-agent'); my $status = $orig->status; my $bytes = $r->bytes_sent; my $dbh = DBI->connect(DSN, split ':', DB_AUTH) || die $DBI::errstr; my $sth = $dbh->prepare("INSERT INTO ${\DB_TABLE} VALUES(?,?,?,?,?,?,?,?,?)") || die $dbh->errstr; $sth->execute($date,$host,$method,$url,$user, $browser,$referer,$status,$bytes) || die $dbh->errstr; $sth->finish; return OK; } 1; __END__ This handler can be installed with the following configuration file directive: PerlLogHandler Apache::LogDBI
You can place this directive in the main part of the configuration file in order to log all accesses, or place it in a directory section if you're interested in logging a particular section of the site only. An alternative is to install Having web transactions logged to a relational database gives you the ability to pose questions of great complexity. Just to give you a taste of what's possible, here are a few useful SQL queries to try:
SELECT count(*),sum(bytes) FROM access_log;
SELECT count(*) FROM access_log WHERE to_days(when)=to_days(now())-2;
SELECT date_format(when,'H') as hour,count(*) FROM access_log GROUP BY hour;
SELECT url,referer,count(url) FROM access_log WHERE status=404 GROUP BY url;
SELECT url,count(*) as count FROM access_log GROUP BY url ORDER BY count desc LIMIT 10;
SELECT date_format(when,'H') as hour, sum(bytes)/(60*60) as bytes_per_min FROM access_log GROUP BY hour; Footnotes 7 sendmail is only available on Unix systems. If you are using Windows or Windows NT, you would be best served by replacing the piped open with the appropriate calls to the Perl Net::SMTP module. You can find this module on CPAN. Show Contents Go to Top Previous Page Next PageCopyright © 1999 by O'Reilly & Associates, Inc. |
HIVE: All information for read only. Please respect copyright! |