Chapter 7 - Other Request Phases / The Header Parser Phase

One nontrivial use for the header parser phase is to implement an unsupported HTTP request method. The Apache server handles the most common HTTP methods, such as GET, HEAD, and POST. Apache also provides hooks for managing the less commonly used PUT and DELETE methods, but the work of processing the method is left to third-party modules to implement. In addition to these methods, there are certain methods that are part of the HTTP/1.1 draft that are not supported by Apache at this time. One such method is PATCH, which is used to change the contents of a document on the server side by applying a "diff" file provided by the client.[2] This section will show how to extend the Apache server to support the PATCH method. The same techniques can be used to experiment with other parts of HTTP drafts or customize the HTTP protocol for special applications.

If you've never worked with patch files, you'll be surprised at how insanely useful they are. Say you have two versions of a large file, an older version named file.1.html and a newer version named file.2.html. You can use the Unix diff command to compute the difference between the two, like this:

    % diff file.1.html file.2.html > file.diff

When diff is finished, the output file, file.diff, will contain only the lines that have changed between the two files, along with information indicating the positions of the changed lines in the files. You can examine a diff file in a text editor to see how the two files differ. More interestingly, however, you can use Larry Wall's patch program to apply the diff to file.1.html, transforming it into a new file identical to file.2.html.

    % patch file.1.html < file.diff

Because two versions of the same file tend to be more similar than they are different, diff files are usually short, making it much more efficient to send the diff file around than the entire new version. This is the rationale for the HTTP/1.1 PATCH method. It complements PUT, which is used to transmit a whole new document to the server, by sending what should be changed between an existing document and a new one. When a client requests a document with the PATCH method, the URI it provides corresponds to the file to be patched, and the request's content is the diff file to be applied.

Example 7-5 gives the code for the PATCH handler, appropriately named Apache::PATCH. It defines both the server-side routines for accepting PATCH documents, and a small client-side program to use for submitting patch files to the server.

    package Apache::PATCH;
    # file: Apache/PATCH.pm

    use strict;
    use vars qw($VERSION @EXPORT @ISA);
    use Apache::Constants qw(:common BAD_REQUEST);
    use Apache::File ();
    use File::Basename 'dirname';

    @ISA = qw(Exporter);
    @EXPORT = qw(PATCH);
    $VERSION = '1.00';

    use constant PATCH_TYPE => 'application/diff';
    my $PATCH_CMD = "/usr/local/bin/patch";

We begin by pulling in required modules, including Apache::File and File::Basename. We also bring in the Exporter module. This is not used by the server-side routines but is needed by the client-side library to export the PATCH() subroutine.
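
The role of Exporter can be seen in isolation. What follows is a minimal sketch, not part of the Apache::PATCH listing: the package name My::Demo and the body of its PATCH() subroutine are hypothetical stand-ins, and the package is defined inline so that the example runs without mod_perl or LWP installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Apache::PATCH, defined inline for illustration.
package My::Demo;
use vars qw(@ISA @EXPORT);
require Exporter;
@ISA    = qw(Exporter);
@EXPORT = qw(PATCH);    # names copied into the caller's namespace by import()

sub PATCH {
    my @args = @_;
    return "PATCH called with: @args";
}

package main;

# "use My::Demo" would call import() automatically; we call it by hand
# because the package lives in this same file rather than in My/Demo.pm.
My::Demo->import;

# PATCH() is now callable from main without a package prefix.
my $message = PATCH('-C', 'user:secret', 'http://localhost/index.html');
print "$message\n";
```

With a real module file, a "use" statement or the -M command-line switch performs the same import() call, which is what allows a one-liner to invoke an exported subroutine by its bare name.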

We now declare a constant for the MIME type of the submitted patch files and a variable that holds the location of the patch program on our system.

The main entry point to server-side routines is through a header parsing phase handler named handler(). It detects whether the request uses the PATCH method and, if so, installs a custom response handler to deal with it. This means we install the patch routines with this configuration directive:

    PerlHeaderParserHandler Apache::PATCH

The rationale for installing the patch handler with the PerlHeaderParserHandler directive rather than PerlTransHandler is that we can use the former directive within directory sections and .htaccess files, allowing us to make the PATCH method active only for certain parts of the document tree.

The definition of handler() is simple:

    sub handler {
        my $r = shift;
        return DECLINED unless $r->method eq 'PATCH';
        unless ($r->some_auth_required) {
            $r->log_reason("Apache::PATCH requires access control");
            return FORBIDDEN;
        }
        $r->handler("perl-script");
        $r->push_handlers(PerlHandler => \&patch_handler);
        return OK;
    }

We recover the request object and call method() to determine whether the request method equals PATCH. If it does not, we return DECLINED. We also insist that the requested URI be under access control: unless some_auth_required() returns true, we log an error and return FORBIDDEN. If the request passes the checks, we adjust the content handler to be the patch_handler() subroutine by calling the request object's handler() and push_handlers() methods. This done, we return OK.

The true work of the module is done in the patch_handler() subroutine, which is called during the response phase:

    sub patch_handler {
        my $r = shift;

        return BAD_REQUEST
            unless lc($r->header_in("Content-type")) eq PATCH_TYPE;

This subroutine recovers the request object and immediately checks the content type of the submitted data. Unless the submitted data has MIME type application/diff, indicating a diff file, we return a result code of BAD_REQUEST.

        # get file to patch
        my $filename = $r->filename;
        my $dirname = dirname($filename);
        my $reason;
        # a bare block counts as a loop that runs once, so "last" can leave
        # it early; "last" cannot be used to leave a do {} block
        {
            -e $r->finfo or $reason = "$filename does not exist", last;
            -w _         or $reason = "$filename is not writable", last;
            -w $dirname  or $reason = "$filename directory is not writable", last;
        }
        if ($reason) {
            $r->log_reason($reason);
            return FORBIDDEN;
        }

Next we check whether the patch operation is likely to succeed. In order for the patch to be applied, the file must already exist, and both the file and the directory that contains it must be writable by the web server process.[3] If any of these tests fails, we log the reason and return FORBIDDEN.

        # get patch data
        my $patch;
        $r->read($patch, $r->header_in("Content-length"));

        # new temporary file to hold output of patch command
        my($tmpname, $patch_out) = Apache::File->tmpfile;
        unless($patch_out) {
            $r->log_reason("can't create temporary output file: $!");
            return FORBIDDEN;
        }

The next job is to retrieve the patch data from the request. We do this using the request object's read() method to copy Content-length bytes of patch data from the request to a local variable named $patch. We also call Apache::File's tmpfile() method to create a temporary file that will hold the output of the patch command. If the file can't be created, we log the error and return FORBIDDEN.

        # redirect child processes stdout and stderr to temporary file
        open STDOUT, ">&=" . fileno($patch_out);

We want the output from patch to go to the temporary file rather than to standard output (which was closed by the parent server long, long ago). So we reopen STDOUT, using the ">&=" notation to attach it to the file descriptor of $patch_out.[4]

        # open a pipe to the patch command
        local $ENV{PATH}; #keep -T happy
        my $patch_in = Apache::File->new("| $PATCH_CMD $filename 2>&1");
        unless ($patch_in) {
            $r->log_reason("can't open pipe to $PATCH_CMD: $!");
            return FORBIDDEN;
        }
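
The two fragments above, redirecting STDOUT and then opening a pipe to a child command, can be tried outside the server. This is a rough sketch under several assumptions, not the module's code: it uses the core File::Temp module instead of Apache::File, the duplicating ">&" form of open() rather than ">&=", a list-form pipe open in place of a shell command string, and the running Perl interpreter as a stand-in for patch. It also saves and restores STDOUT, which an ordinary script needs but the handler does not. The helper name capture_child_output() is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper: feed $input to a child command on its standard input
# and return whatever the child wrote to its standard output.
sub capture_child_output {
    my ($input, @command) = @_;

    # Temporary file that will collect the child's output.
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);

    # Keep a copy of the real STDOUT so that it can be restored later.
    STDOUT->flush;
    open(my $saved_stdout, '>&', \*STDOUT) or die "can't save STDOUT: $!";

    # Point STDOUT at the temporary file; the child inherits this descriptor.
    open(STDOUT, '>&', $tmp_fh) or die "can't redirect STDOUT: $!";

    # Open a pipe to the child and write the input to it.
    my $opened     = open(my $pipe, '|-', @command);
    my $pipe_error = $!;
    if ($opened) {
        print {$pipe} $input;
        close $pipe;    # waits for the child to finish
    }

    # Put the real STDOUT back before reporting anything.
    open(STDOUT, '>&', $saved_stdout) or die "can't restore STDOUT: $!";
    die "can't open pipe to $command[0]: $pipe_error" unless $opened;

    # Read back what the child wrote.
    close $tmp_fh;
    open(my $result_fh, '<', $tmp_name) or die "can't reopen $tmp_name: $!";
    local $/;    # slurp the whole file in one read
    my $output = <$result_fh>;
    close $result_fh;
    return $output;
}

# Stand-in for patch: a child perl that upper-cases its input.
my $output = capture_child_output("hello\n", $^X, '-pe', '$_ = uc');
print $output;
```

The list form of the pipe open passes each argument to the child directly, without involving a shell. That is a deliberate departure from the command string used in the listing.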

Having redirected STDOUT, we open a pipe to the patch command and store the pipe in a new filehandle named $patch_in.

        # write data to the patch command
        print $patch_in $patch;
        close $patch_in;
        close $patch_out;

We now print the diff file to the patch pipe. patch will process the diff file and write its output to the temporary file. After printing, we close the command pipe and the temporary filehandle.

        $patch_out = Apache::File->new($tmpname);

        # send the result to the user
        $r->send_http_header("text/plain");
        $r->send_fd($patch_out);
        close $patch_out;

        return OK;
    }

The last task is to send the patch output back to the client. After reopening the temporary file for reading, we send the HTTP header, using the convenient form that allows us to set the MIME type in a single step. We now send the contents of the temporary file using the request object's send_fd() method. Our work done, we close the temporary filehandle and return OK.

Example 7-5. Implementing the PATCH Method

    package Apache::PATCH;
    # file: Apache/PATCH.pm

    use strict;
    use vars qw($VERSION @EXPORT @ISA);
    use Apache::Constants qw(:common BAD_REQUEST);
    use Apache::File ();
    use File::Basename 'dirname';

    @ISA = qw(Exporter);
    @EXPORT = qw(PATCH);
    $VERSION = '1.00';

    use constant PATCH_TYPE => 'application/diff';
    my $PATCH_CMD = "/usr/local/bin/patch";

    sub handler {
        my $r = shift;
        return DECLINED unless $r->method eq 'PATCH';
        unless ($r->some_auth_required) {
            $r->log_reason("Apache::PATCH requires access control");
            return FORBIDDEN;
        }
        $r->handler("perl-script");
        $r->push_handlers(PerlHandler => \&patch_handler);
        return OK;
    }

    sub patch_handler {
        my $r = shift;

        return BAD_REQUEST
            unless lc($r->header_in("Content-type")) eq PATCH_TYPE;

        # get file to patch
        my $filename = $r->filename;
        my $dirname = dirname($filename);
        my $reason;
        # a bare block counts as a loop that runs once, so "last" can leave
        # it early; "last" cannot be used to leave a do {} block
        {
            -e $r->finfo or $reason = "$filename does not exist", last;
            -w _         or $reason = "$filename is not writable", last;
            -w $dirname  or $reason = "$filename directory is not writable", last;
        }
        if ($reason) {
            $r->log_reason($reason);
            return FORBIDDEN;
        }

        # get patch data
        my $patch;
        $r->read($patch, $r->header_in("Content-length"));

        # new temporary file to hold output of patch command
        my($tmpname, $patch_out) = Apache::File->tmpfile;
        unless($patch_out) {
            $r->log_reason("can't create temporary output file: $!");
            return FORBIDDEN;
        }

        # redirect child processes stdout and stderr to temporary file
        open STDOUT, ">&=" . fileno($patch_out);

        # open a pipe to the patch command
        local $ENV{PATH}; #keep -T happy
        my $patch_in = Apache::File->new("| $PATCH_CMD $filename 2>&1");
        unless ($patch_in) {
            $r->log_reason("can't open pipe to $PATCH_CMD: $!");
            return FORBIDDEN;
        }

        # write data to the patch command
        print $patch_in $patch;
        close $patch_in;
        close $patch_out;

        $patch_out = Apache::File->new($tmpname);

        # send the result to the user
        $r->send_http_header("text/plain");
        $r->send_fd($patch_out);
        close $patch_out;

        return OK;
    }

    # This part is for command-line invocation only.
    my $opt_C;

    sub PATCH {
        require LWP::UserAgent;
        @Apache::PATCH::ISA = qw(LWP::UserAgent);

        my $ua = __PACKAGE__->new;
        my $url;
        my $args = @_ ? \@_ : \@ARGV;

        while (my $arg = shift @$args) {
            $opt_C = shift @$args, next if $arg eq "-C";
            $url = $arg;
        }

        my $req = HTTP::Request->new('PATCH' => $url);

        my $patch = join '', <STDIN>;
        $req->content(\$patch);
        $req->header('Content-length' => length $patch);
        $req->header('Content-type' => PATCH_TYPE);

        my $res = $ua->request($req);

        if($res->is_success) {
            print $res->content;
        }
        else {
            print $res->as_string;
        }
    }

    sub get_basic_credentials {
        my($self, $realm, $uri) = @_;
        return split ':', $opt_C, 2;
    }

    1;
    __END__

At the time this chapter was written, no web browser or publishing system had actually implemented the PATCH method. The remainder of the listing contains code for implementing a PATCH client. You can use this code from the command line to send patch files to servers that have the PATCH handler installed and watch the documents change in front of your eyes.

The PATCH client is simple, thanks to the LWP library. Its main entry point is an exported subroutine named PATCH():

    sub PATCH {
        require LWP::UserAgent;
        @Apache::PATCH::ISA = qw(LWP::UserAgent);

        my $ua = __PACKAGE__->new;
        my $url;
        my $args = @_ ? \@_ : \@ARGV;

        while (my $arg = shift @$args) {
            $opt_C = shift @$args, next if $arg eq "-C";
            $url = $arg;
        }

PATCH() starts by creating a new LWP user agent using the subclassing technique discussed later in the Apache::AdBlocker module (see "Handling Proxy Requests" in this chapter). It recovers the authentication username and password from the command line by looking for a -C (credentials) switch, which is then stored into a package lexical named $opt_C. Any other argument is taken to be the URL of the document to patch and is stored in $url.

        my $req = HTTP::Request->new('PATCH' => $url);

        my $patch = join '', <STDIN>;
        $req->content(\$patch);
        $req->header('Content-length' => length $patch);
        $req->header('Content-type' => PATCH_TYPE);

        my $res = $ua->request($req);

The subroutine now creates a new HTTP::Request object that specifies PATCH as its request method and sets its content to the diff file read in from STDIN. It also sets the Content-length and Content-type HTTP headers to the length of the diff file and application/diff, respectively. Having set up the request, the subroutine sends the request to the remote server by calling the user agent's request() method.

        if($res->is_success) {
            print $res->content;
        }
        else {
            print $res->as_string;
        }
    }

If the response indicates success (is_success() returns true), we print out the text of the server's response. Otherwise, the routine prints the full response, including the error message, as returned by the response object's as_string() method.

    sub get_basic_credentials {
        my($self, $realm, $uri) = @_;
        return split ':', $opt_C, 2;
    }
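
The third argument to split() in get_basic_credentials() matters when a password itself contains a colon. A minimal illustration follows, using made-up credentials and a hypothetical helper named parse_credentials() that mirrors the split call in the listing:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the split call in get_basic_credentials().
sub parse_credentials {
    my $credentials = shift;
    # A limit of 2 stops splitting after the first colon, so any further
    # colons remain part of the password.
    return split ':', $credentials, 2;
}

# Made-up credentials whose password contains a colon.
my ($user, $password) = parse_credentials('fred:se:cret');
print "user=$user password=$password\n";
```

Without the limit, the same input would split into three fields and the password would be truncated at its first colon.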

The get_basic_credentials() method, defined at the bottom of the source listing, is actually an override of an LWP::UserAgent method. When LWP::UserAgent tries to access a document that is password-protected, it invokes this method to return the username and password required to fetch the resource. By subclassing LWP::UserAgent into our own package and then defining a get_basic_credentials() method, we're able to provide our parent class with the contents of the $opt_C variable.

To run the client from the command line, invoke it like this:

    % perl -MApache::PATCH -e PATCH -- -C username:password \
        http://www.modperl.com/index.html < index.html.diff
    Hmm... Looks like a new-style context diff to me...
    The text leading up to this was:
    --------------------------
    |*** index.html.new Mon Aug 24 21:52:29 1998
    |--- index.html Mon Aug 24 21:51:06 1998
    --------------------------
    Patching file /home/httpd/htdocs/index.html using Plan A...
    Hunk #1 succeeded at 8.
    done

A tiny script named PATCH that uses the module can save some typing:

    #!/usr/local/bin/perl

    use Apache::PATCH;
    PATCH;

    __END__

Now the command looks like this:

    % PATCH -C username:password \
        http://www.modperl.com/index.html < index.html.diff

Footnotes

[2] Just two weeks prior to the production stage of this book, Script support for the PATCH method was added in Apache 1.3.4-dev.

[3] In order for the PATCH method to work you will have to make the files and directories to be patched writable by the web server process. You can do this either by making the directories world-writable, or by changing their user or group ownerships so that the web server has write permission. This has security implications, as it allows buggy CGI scripts and other web server security holes to alter the document tree. A more secure solution would be to implement PATCH using a conventional CGI script running under the standard Apache suexec extension, or the sbox CGI wrapper (http://stein.cshl.org/WWW/software/sbox).

[4] Why not just redirect the output of patch to the temporary file by invoking patch with the

Copyright © 1999 by O'Reilly & Associates, Inc.