Chapter 9 - Perl API Reference Guide / The Apache Request Object
Client Request Methods
This section covers the request object methods that are used to query or modify the incoming client request. These methods allow you to retrieve such information as the URI the client has requested, the request method in use, the content of any submitted HTML forms, and various items of information about the remote host.
args()
The args() method returns the contents of the URI query string
(that part of the request URI that follows the ? character,
if any). When called in a scalar context, args() returns the
entire string. When called in a list context, the method returns a list
of parsed key/value pairs:
my $query = $r->args;
my %in = $r->args;
One trap to be wary of: if the same argument name is present several
times (as can happen with a selection list in a fill-out form), assignment
of args() to a hash will discard all but the last argument. To
avoid this, you'll need to use the more complex argument processing scheme
described in Chapter 4, Content
Handlers.
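Chapter 4 has the full treatment. As a rough illustration of the idea only, the sketch below folds a flat key/value list, of the kind args() returns in a list context, into a hash of array references so that repeated names keep all of their values. The collect_args() helper and the sample data are hypothetical, and the hard-coded list stands in for a real call to $r->args.

```perl
use strict;
use warnings;

# Fold a flat key/value list into a hash of array references so that
# repeated keys keep every value instead of only the last one.
sub collect_args {
    my @pairs = @_;
    my %in;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @{ $in{$key} }, $value;
    }
    return \%in;
}

# Hypothetical data standing in for "my @pairs = $r->args;" in a handler.
my @pairs = (color => 'red', color => 'blue', size => 'large');
my $in = collect_args(@pairs);
print join(',', @{ $in->{color} }), "\n";   # prints "red,blue"
```

Every value is stored in an array reference, even for names that occur once, so callers can treat all fields the same way.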
connection()
This method returns an object blessed into the Apache::Connection
class. See "The Apache::Connection Class" later in this chapter for information
on what you can do with this object once you get it.
my $c = $r->connection;
content()
When the client request method is POST, which generally occurs when
the remote client is submitting the contents of a fill-out form, the $r->content
method returns the submitted information but only if the request content
type is application/x-www-form-urlencoded. When called in a
scalar context, the entire string is returned. When called in a list context,
a list of parsed name=value pairs is returned.
To handle other types of PUT or POSTed content, you'll need to use a
module such as CGI.pm or Apache::Request or use the read()
method and parse the data yourself.
Note that you can only call content() once. If you call the method
more than once, it will return undef (or an empty list) after the
first try.
filename()
The filename() method sets or returns the result of the URI
translation phase. During the URI translation phase, your handler will
call this method with the physical path to a file in order to set the
filename. During later phases of the transaction, calling this method
with no arguments returns its current value. Examples:
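# in a later phase: fetch the translated filename
my $fname = $r->filename;
unless (open(FH, $fname)) {
    die "can't open $fname: $!";
}
# in a URI translation handler: set the filename
my $fname = do_translation($r->uri);
$r->filename($fname);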
finfo()
Immediately following the translation phase, Apache walks along the
components of the requested URI trying to determine where the physical
file path ends and the additional path information begins (this is described
at greater length at the beginning of Chapter 4).
In the course of this walk, Apache makes the system stat() call
one or more times to read the directory information along the path. When
the walk is finished, the stat() information for the translated
filename is cached in the request record, where it can be recovered using
the finfo() method. If you need to stat() the file,
you can take advantage of this cached stat structure rather than repeating
the system call. When finfo() is called, it moves the cached stat information
into the special filehandle _ that Perl uses to cache its
own stat operations. You can then perform file test operations directly
on this filehandle rather than on the file itself, which would incur the
penalty of another stat() system call. For convenience, finfo()
returns a reference to the _ filehandle, so file tests can
be done directly on the return value of finfo(). The following
three examples all produce the same value for $size.
However, the first two avoid the overhead of the implicit stat()
performed by the last.
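# 1. test the return value of finfo() directly
my $size = -s $r->finfo;
# 2. call finfo(), then test the _ filehandle
$r->finfo;
my $size = -s _;
# 3. test the filename itself
my $size = -s $r->filename; # slower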
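The _ filehandle is a general Perl feature, so the caching behavior can be tried outside of Apache. The following sketch is only an illustration under that assumption: it creates a scratch file, makes one explicit stat() call in place of finfo(), and then runs several file tests against the cached result. The scratch file and variable names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to test against.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh or die "can't close $path: $!";

# One explicit stat() call fills Perl's cached stat buffer
# (inside a handler, calling $r->finfo plays this role)...
stat($path) or die "can't stat $path: $!";

# ...and these tests read that buffer through "_" without
# issuing another stat() system call.
my $size     = -s _;
my $is_plain = -f _ ? 1 : 0;
my $mtime    = (stat _)[9];

print "size=$size plain=$is_plain\n";   # prints "size=5 plain=1"
```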
It is possible for a module to be called upon to process a URL that
does not correspond to a physical file. In this case, the stat()
structure will contain the result of testing for a nonexistent file, and
Perl's various file test operations will all return false.
The Apache::Util package contains a number of routines that
are useful for manipulating the contents of the stat structure. For example,
the ht_time() routine turns Unix timestamps into HTTP-compatible
human-readable strings. See the Apache::Util manpage and the
section "The Apache::Util Class"
later in this chapter for more details.
use Apache::Util qw(ht_time);
if (-d $r->finfo) {
    printf "%s is a directory\n", $r->filename;
}
else {
    printf "Last Modified: %s\n", ht_time((stat _)[9]);
}
get_client_block() setup_client_block() should_client_block()
The get_, setup_, and should_client_block
methods are lower-level ways to read the data sent by the client in POST
and PUT requests. This protocol exactly mirrors the C-language API described
in Chapter 10, C API Reference
Guide, Part I, and provides for timeouts and other niceties.
Although the Perl API supports them, Perl programmers should generally
use the simpler read() method instead.
get_remote_host()
This method can be used to look up the remote client's DNS hostname
or simply return its IP address. When a DNS lookup is successful, its
result is cached and returned on subsequent calls to get_remote_host()
to avoid costly multiple lookups. This cached value can also be retrieved
with the Apache::Connection object's remote_host() method.
This method takes an optional argument. The type of lookup performed
by this method is affected by this argument, as well as the value of the
HostNameLookups directive. Possible arguments to this method,
whose symbolic names can be imported from the Apache::Constants
module using the :remotehost import tag, are the following:
REMOTE_HOST
If this argument is specified, Apache will try to look up the DNS name
of the remote host. This lookup will fail if the Apache configuration
directive HostNameLookups is set to Off or if the
hostname cannot be determined by a DNS lookup, in which case the function
will return undef.
REMOTE_NAME
When called with this argument, the method will return the DNS name
of the remote host if possible, or the dotted decimal representation of
the client's IP address otherwise. This is the default lookup type when
no argument is specified.
REMOTE_NOLOOKUP
When this argument is specified, get_remote_host() will not
perform a new DNS lookup (even if the HostNameLookups directive
says so). If a successful lookup was done earlier in the request, the
cached hostname will be returned. Otherwise, the method returns the dotted
decimal representation of the client's IP address.
REMOTE_DOUBLE_REV
This argument will trigger a double-reverse DNS lookup regardless of
the setting of the HostNameLookups directive. Apache will first
call the DNS to return the hostname that maps to the IP number of the
remote host. It will then make another call to map the returned hostname
back to an IP address. If the returned IP address matches the original
one, then the method returns the hostname. Otherwise, it returns undef.
The reason for this baroque procedure is that standard DNS lookups are
susceptible to DNS spoofing in which a remote machine temporarily assumes
the apparent identity of a trusted host. Double-reverse DNS lookups make
spoofing much harder and are recommended if you are using the hostname
to distinguish between trusted clients and untrusted ones. However,
double-reverse DNS lookups are also twice as expensive.
In recent versions of Apache, double-reverse name lookups are always
performed for the name-based access checking implemented by mod_access.
Here are some examples:
my $remote_host = $r->get_remote_host;
# same as above
use Apache::Constants qw(:remotehost);
my $remote_host = $r->get_remote_host(REMOTE_NAME);
# double-reverse DNS lookup
use Apache::Constants qw(:remotehost);
my $remote_host = $r->get_remote_host(REMOTE_DOUBLE_REV) || "nohost";
get_remote_logname()
This method returns the login name of the remote user or undef
if the user's login could not be determined. Generally, this only works
if the remote user is logged into a Unix or VMS host and that machine
is running the identd daemon (which implements the identification
protocol described in RFC 1413).
The success of the call also depends on the IdentityCheck configuration
directive being turned on. Since identity checks can adversely impact
Apache's performance, this directive is off by default.
my $remote_logname = $r->get_remote_logname;
headers_in()
When called in a list context, the headers_in() method returns
a list of key/value pairs corresponding to the client request headers. When
called in a scalar context, it returns a hash reference tied to the Apache::Table
class. This class provides methods for manipulating several of Apache's internal
key/value table structures and, for all intents and purposes, acts just like
an ordinary hash table. However, it also provides object methods for dealing
correctly with multivalued entries. See "The Apache::Table Class" later in
this chapter for details.
my %headers_in = $r->headers_in;
my $headers_in = $r->headers_in;
Once you have copied the headers to a hash, you can refer to them by
name. See Table 9-1 for a list of incoming
headers that you may need to use. For example, you can view the length
of the data that the client is sending by retrieving the key Content-length:
%headers_in = $r->headers_in;
my $cl = $headers_in{'Content-length'};
You'll need to be aware that browsers are not required to be consistent
in their capitalization of header field names. For example, some may refer
to Content-Type and others to Content-type. The Perl
API copies the field names into the hash as is, and like any other Perl
hash, the keys are case-sensitive. This is a potential trap.
For these reasons it's better to call headers_in() in a scalar
context and use the returned tied hash. Since Apache::Table sits
on top of the C table API, lookup comparisons are performed in a case-insensitive
manner. The tied interface also allows you to add or change the value
of a header field, in case you want to modify the request headers seen
by handlers downstream. This code fragment shows the tied hash being used
to get and set fields:
my $headers_in = $r->headers_in;
my $cl = $headers_in->{'Content-Length'};
$headers_in->{'User-Agent'} = 'Block this robot';
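If a plain copy of the headers is still wanted, one possible way around the capitalization trap is to fold the field names to lowercase while copying. This is only a sketch: the header values are made up, and the hard-coded hash stands in for the list that headers_in() returns in a list context.

```perl
use strict;
use warnings;

# Hypothetical headers, standing in for "my %headers_in = $r->headers_in;".
my %headers_in = (
    'Content-type' => 'text/html',
    'USER-AGENT'   => 'ExampleBot/1.0',
);

# Copy the headers into a second hash keyed by lowercased field name.
my %by_name = map { (lc($_) => $headers_in{$_}) } keys %headers_in;

print $by_name{'content-type'}, "\n";   # prints "text/html"
print $by_name{'user-agent'}, "\n";     # prints "ExampleBot/1.0"
```

Note that if the same field arrived under two different capitalizations, only one of the values would survive in the folded copy.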
It is often convenient to refer to header fields without creating an
intermediate hash or assigning a variable to the Apache::Table
reference. This is the usual idiom:
my $cl = $r->headers_in->{'Content-Length'};
Certain request header fields such as Accept, Cookie,
and several other request fields are multivalued. When you retrieve their
values, they will be packed together into one long string separated by
commas. You will need to parse the individual values out yourself. Individual
values can include parameters which will be separated by semicolons. Cookies
are common examples of this:
Set-Cookie: SESSION=1A91933A; domain=acme.com; expires=Wed, 21-Oct-1998 20:46:07 GMT
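As a rough sketch of the parsing described above, the following code splits a comma-separated header value into entries and gathers each entry's semicolon-separated parameters into a hash. The parse_multivalued() helper and the sample Accept value are hypothetical, the string stands in for a value fetched from headers_in(), and the simple split does not handle quoted strings that themselves contain commas or semicolons.

```perl
use strict;
use warnings;

# Split a comma-separated header value into entries, each with its
# semicolon-separated parameters gathered into a hash.
sub parse_multivalued {
    my ($raw) = @_;
    my @entries;
    for my $item (split /\s*,\s*/, $raw) {
        my ($value, @params) = split /\s*;\s*/, $item;
        my %param;
        for my $p (@params) {
            my ($name, $val) = split /\s*=\s*/, $p, 2;
            $param{$name} = $val;
        }
        push @entries, { value => $value, params => \%param };
    }
    return @entries;
}

# Hypothetical value standing in for $r->headers_in->{'Accept'}.
my $accept = 'text/html, image/png;q=0.8, */*;q=0.1';
for my $entry (parse_multivalued($accept)) {
    my $q = exists $entry->{params}{q} ? $entry->{params}{q} : 1;
    print "$entry->{value} (q=$q)\n";
}
```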
A few clients send headers with the same key on multiple lines. In this
case, you can use the Apache::Table::get() method to retrieve
all of the values at once.
For full details on the various incoming headers, see the documents at
http://www.w3.org/Protocols. Nonstandard headers, such as those transmitted
by experimental browsers, can also be retrieved with this method call.
Table 9-1. Incoming HTTP Request Headers
Field | Description
---|---
Accept | MIME types that the client accepts
Accept-encoding | Compression methods that the client accepts
Accept-language | Languages that the client accepts
Authorization | Used by various authorization/authentication schemes
Connection | Connection options, such as Keep-alive
Content-length | Length, in bytes, of data to follow
Content-type | MIME type of data to follow
Cookie | Client-side data
From | Email address of the requesting user (deprecated)
Host | Virtual host to retrieve data from
If-modified-since | Return document only if modified since the date specified
If-none-match | Return document if it has changed
Referer | URL of document that linked to the requested one
User-agent | Name and version of the client software
header_in()
The header_in() method (singular, not plural) is used to get
or set the value of a client incoming request field. If the given value
is undef, the header will be removed from the list of header
fields:
my $cl = $r->header_in('Content-length');
$r->header_in($key, $val); #set the value of header '$key'
$r->header_in('Content-length' => undef); #remove the header
The key lookup is done in a case-insensitive manner. The header_in()
method predates the Apache::Table class but remains for backward
compatibility and as a bit of a shortcut to using the headers_in()
method.
header_only()
If the client issues a HEAD request, it wants to receive the HTTP response
headers only. Content handlers should check for this by calling header_only()
before generating the document body. The method will return true in the
case of a HEAD request and false in the case of other requests. Alternatively,
you could examine the string value returned by method() directly,
although this would be less portable if the HTTP protocol were some day
expanded to support more than one header-only request method.
# generate the header & send it
$r->send_http_header;
return OK if $r->header_only;
# now generate the document...
Do not try to check the numeric value returned by method_number()
to identify a header request. Internally, Apache uses the M_GET
number for both HEAD and GET methods.
method()
This method will return the string version of the request method, such
as GET, HEAD, or POST. Passing an argument will change the method, which
is occasionally useful for internal redirects (Chapter 4)
and for testing authorization restriction masks (Chapter 6,
Authentication and Authorization).
my $method = $r->method;
$r->method('GET');
If you update the method, you probably want to update the method number
accordingly as well.
method_number()
This method will return the request method number, which refers to internal
constants defined by the Apache API. The method numbers are available
to Perl programmers from the Apache::Constants module by importing
the :methods set. The relevant constants include M_GET,
M_POST, M_PUT, and M_DELETE. Passing
an argument will set this value, which is mainly useful for internal
redirects and for testing authorization restriction masks. If you update
the method number, you probably want to update the method accordingly as
well.
Note that there isn't an M_HEAD constant. This is because
when Apache receives a HEAD request, it sets the method number to M_GET
and sets header_only() to return true.
use Apache::Constants qw(:methods);
if ($r->method_number == M_POST) {
    # change the request method
    $r->method_number(M_GET);
    $r->method("GET");
    $r->internal_redirect('/new/place');
}
For Perl programmers, there is no particular advantage to using
method_number() over method(), other than its being slightly more
efficient.
parsed_uri()
When Apache parses the incoming request, it will turn the request URI into
a predigested uri_components structure. The parsed_uri()
method will return an object blessed into the Apache::URI class,
which provides methods for fetching and setting various parts of the URI.
See "The Apache::URI Class" later in this chapter for details.
use Apache::URI ();
my $uri = $r->parsed_uri;
my $host = $uri->hostname;
path_info()
The path_info() method will return what is left in the path
after the URI translation phase. Apache's default translation method,
described at the beginning of Chapter 4,
uses a simple directory-walking algorithm to decide what part of the URI
is the file and what part is the additional path information.
You can provide an argument to path_info() in order to change
its value:
my $path_info = $r->path_info;
$r->path_info("/some/additional/information");
Note that in most cases, changing the path_info() requires
you to sync the uri() with the update. In the following example,
we calculate the original URI minus any path info, change the existing
path info, then properly update the URI:
my $path_info = $r->path_info;
my $uri = $r->uri;
my $orig_uri = substr $uri, 0, length($uri) - length($path_info);
$r->path_info($new_path_info);
$r->uri($orig_uri . $r->path_info);
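The string arithmetic in that example can be checked on its own, without a request object. The sketch below is an illustration only: replace_path_info() is a hypothetical helper, and the sample strings stand in for the values that $r->uri and $r->path_info would return.

```perl
use strict;
use warnings;

# Rebuild a URI after swapping its trailing path info for a new value.
sub replace_path_info {
    my ($uri, $path_info, $new_path_info) = @_;
    # Strip the old path info from the end of the URI...
    my $base = substr $uri, 0, length($uri) - length($path_info);
    # ...and append the replacement.
    return $base . $new_path_info;
}

# Hypothetical values standing in for $r->uri and $r->path_info.
my $new_uri = replace_path_info('/cgi-bin/search/by/title',
                                '/by/title', '/by/author');
print "$new_uri\n";   # prints "/cgi-bin/search/by/author"
```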
protocol()
The $r->protocol method will return a string identifying
the protocol that the client speaks. Typical values will be HTTP/1.0
or HTTP/1.1.
my $protocol = $r->protocol;
This method is read-only.
proxyreq()
The proxyreq() method returns true if the current HTTP request
is for a proxy URI, that is, if the actual document resides on a foreign
server somewhere and the client wishes Apache to fetch the document on
its behalf. This method is mainly intended for use during the filename
translation phase of the request.
use Apache::Constants qw(DECLINED);
sub handler {
    my $r = shift;
    return DECLINED unless $r->proxyreq;
    # do something interesting...
}
See Chapter 7 for examples.
read()
The read() method provides Perl API programmers with a simple
way to get at the data submitted by the browser in POST and PUT requests.
It should be used when the information submitted by the browser is not
in the application/x-www-form-urlencoded format that the content()
method knows how to handle.
Call read() with a scalar variable to hold the read data and
the length of the data to read. Generally, you will want to ask for the
entire data sent by the client, which can be recovered from the incoming
Content-length field:
my $buff;
$r->read($buff, $r->header_in('Content-length'));
Internally, Perl sets up a timeout in case the client breaks the connection
prematurely. The exact value of the timeout is set by the Timeout
directive in the server configuration file. If a timeout does occur, the
script will be aborted.
Within a handler you may also recover client data by simply reading from
STDIN using Perl's read(), getc(), and readline (<>)
functions. This works because the Perl API ties STDIN to Apache::read()
before entering handlers.
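To illustrate reading a body of known length from a filehandle, here is a minimal sketch that uses Perl's built-in read() in a loop. It is an assumption-laden stand-in rather than handler code: read_body() is a hypothetical helper, an in-memory filehandle takes the place of STDIN, and the payload length takes the place of the Content-length header.

```perl
use strict;
use warnings;

# Read up to $length bytes from a filehandle, looping because a single
# read() may return fewer bytes than requested.
sub read_body {
    my ($fh, $length) = @_;
    my $body = '';
    while ($length > 0) {
        my $got = read($fh, my $chunk, $length);
        die "read failed: $!" unless defined $got;
        last if $got == 0;      # input ended before $length bytes arrived
        $body   .= $chunk;
        $length -= $got;
    }
    return $body;
}

# An in-memory filehandle stands in for STDIN here.
my $payload = '<note>hello</note>';
open my $fh, '<', \$payload or die "can't open in-memory handle: $!";
my $body = read_body($fh, length $payload);
close $fh;
print "$body\n";   # prints "<note>hello</note>"
```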
server()
This method returns a reference to an Apache::Server object, from
which you can retrieve all sorts of information about low-level aspects of
the server's configuration. See "The Apache::Server Class" for details.
my $s = $r->server;
the_request()
This method returns the unparsed request line sent by the client. the_request()
is primarily used by log handlers, since other handlers will find it more
convenient to use methods that return the information in preparsed form.
This method is read-only.
my $request_line = $r->the_request;
print LOGFILE $request_line;
Note that the_request() is roughly equivalent to this code fragment
(the raw request line also carries any query string, which uri() omits):
my $request_line = join ' ', $r->method, $r->uri, $r->protocol;
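Going the other way, a log analysis script might need to take a recorded request line apart again. The following sketch shows one way to do that under simple assumptions: split_request_line() is a hypothetical helper, the sample line is made up, and the code expects a well-formed line with three space-separated parts.

```perl
use strict;
use warnings;

# Split a raw request line into method, path, query string, and protocol.
sub split_request_line {
    my ($line) = @_;
    my ($method, $target, $protocol) = split ' ', $line, 3;
    # Separate the path from the query string, if there is one.
    my ($path, $query) = split /\?/, $target, 2;
    return ($method, $path, $query, $protocol);
}

# Hypothetical request line of the kind the_request() returns.
my ($method, $path, $query, $protocol) =
    split_request_line('GET /search/by/title?q=perl HTTP/1.0');
print "$method $path $protocol (query: $query)\n";
```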
uri()
The uri() method returns the URI requested by the browser.
You may also pass this method a string argument in order to set the URI
seen by handlers further down the line, which is something that a translation
handler might want to do.
my $uri = $r->uri;
$r->uri("/something/else");
Copyright © 1999 by O'Reilly & Associates, Inc.