Chapter 4 - Content Handlers / Apache::Registry

There are a number of traps and pitfalls that you can fall into when using Apache::Registry. This section warns you about them.

It helps to know how Apache::Registry works in order to understand why the traps are there. When the server is asked to return a file that is handled by the Apache::Registry content handler (in other words, a script!), Apache::Registry first looks in an internal cache of compiled subroutines that it maintains. If it doesn't find a subroutine that corresponds to the script file, it reads the contents of the file and repackages it into a block of code that looks something like this:

    package $;
    use Apache qw(exit);
    sub handler {
      #line 1 $
    }
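The repackaging step described above can be imitated outside the server. What follows is only a minimal sketch of the idea, not Apache::Registry's actual code: the `%cache` hash, the `package_for()` naming scheme, the `Sketch::Registry` namespace, and `compile_script()` are all hypothetical names invented for illustration, and the script source is passed in as a string rather than read from disk.

```perl
use strict;
use warnings;

# Hypothetical cache: package name => compiled handler (a code reference).
my %cache;

# Turn a script path into a legal package name (illustrative scheme only).
sub package_for {
    my ($path) = @_;
    (my $name = $path) =~ s/[^A-Za-z0-9_]/_/g;
    return "Sketch::Registry::$name";
}

# Wrap the script source in "package ...; sub handler { ... }", compile it
# once, and hand back the cached code reference on later calls.
sub compile_script {
    my ($path, $source) = @_;
    my $package = package_for($path);
    return $cache{$package} if $cache{$package};    # cache hit

    my $code = "package $package;\n"
             . "sub handler {\n"
             . "#line 1 $path\n"      # make errors report the script's own lines
             . $source
             . "\n}\n"
             . "1;\n";
    eval $code or die "compile failed for $path: $@";

    no strict 'refs';                 # needed to look up the sub by name
    return $cache{$package} = \&{"${package}::handler"};
}

my $handler = compile_script('/perl/hello.pl', 'return "Hello, world";');
print $handler->(), "\n";
```

Because the whole script body ends up inside a named subroutine, any named subroutine the script defines becomes an inner subroutine of `handler`, and any package global it sets lives on in the cached package between calls.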
Before Apache::Registry even comes into play,

In addition to caching the compiled script, Apache::Registry also stores the script's last modification time. It checks the stored time against the current modification time before executing the cached code. If it detects that the script has been modified more recently than the last time it was compiled, it discards the cached code and recompiles the script.

The first and most common pitfall when using Apache::Registry is to forget that the code will be persistent across many sessions. Perl CGI programmers commonly make profligate use of globals, allocate mammoth memory structures without disposing of them, and open filehandles and never close them. They get away with this because CGI scripts are short-lived. When the CGI transaction is done, the script exits, and everything is cleaned up automatically.

Not so with Apache::Registry scripts (or any other Apache Perl module, for that matter). Globals persist from invocation to invocation, big data structures will remain in memory, and open files will remain open until the Apache child process has exited or the server itself is shut down.

Therefore, it is vital to code cleanly. You should never depend on a global variable being uninitialized in order to determine when a subroutine is being called for the first time. In fact, you should reduce your dependency on globals in general. Close filehandles when you are finished with them, and make sure to kill (or at least wait on) any child processes you may have launched.

Perl provides two useful tools for writing clean code. use strict turns on checks that make it harder to use global variables unintentionally. Variables must either be lexically scoped (with my) or qualified with their complete package names. The only way around these restrictions is to declare variables you intend to use as globals at the top of the script with use vars.
This code snippet shows how:

    use strict;
    use vars qw{$INIT $DEBUG @NAMES %HANDLES};

We have used strict in many of the examples in the preceding sections, and we strongly recommend it for any Perl script you write.

The other tool is Perl runtime warnings, which can be turned on in

-w will catch a variety of errors, dubious programming constructs, typos, and other sins. Among other things, it will warn when a bareword (a string without surrounding quotation marks) conflicts with a subroutine name, when a variable is used only once, and when a lexical variable is inappropriately shared between an outer and an inner scope (a horrible problem which we expose in all its gory details a few paragraphs later).

-w may also generate hundreds of "Use of uninitialized value" messages at runtime, which will fill up your server error log. Many of these warnings can be hard to track down. If there is no line number reported with the warning, or if the reported line number is incorrect,[2] try using Perl's

It may also be helpful to see a full stack trace of the code which triggered the warning. The cluck() function found in the standard Carp module will give you this functionality. Here is an example:

    use Carp ();
    local $SIG{__WARN__} = \&Carp::cluck;
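To see what such a trace contains without digging through the server error log, the same hook can be tried in a standalone script. This is a sketch under stated assumptions: `inner()` and `outer()` are hypothetical subroutines that exist only to build a two-level call stack, and the handler calls Carp::longmess() (which returns the text that cluck() would print) so that the trace can be collected in the hypothetical `@traces` array and inspected.

```perl
use strict;
use warnings;
use Carp ();

my @traces;    # collected backtraces, one per warning

# Hypothetical subroutines whose only job is to produce a call stack.
sub inner { my $undef; return "value: " . $undef }    # warns: uninitialized value
sub outer { return inner() }

{
    # Same idea as installing Carp::cluck, except that the message and
    # backtrace are kept as a string instead of being written to STDERR.
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        chomp $msg;
        push @traces, Carp::longmess($msg);
    };
    outer();
}

print STDERR $_ for @traces;
```

Each collected trace lists the chain of calls that led to the warning, which is usually enough to locate the offending statement even when the reported line number is unhelpful.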
Note that -w checks are done at runtime, which may slow down script execution. In production mode, you may wish to turn warnings off altogether or localize warnings using the
Another subtle

    #!/usr/local/bin/perl -w

    for (0..3) {
       bump_and_print();
    }

    sub bump_and_print {
       my $a = 1;
       sub bump {
          $a++;
          print "In the inner scope, \$a is $a\n";
       }
       print "In the outer scope, \$a is $a\n";
       bump();
    }

When you run this script, it generates the following inexplicable output:

    Variable "$a" will not stay shared at ./test.pl line 12.
    In the outer scope, $a is 1
    In the inner scope, $a is 2
    In the outer scope, $a is 1
    In the inner scope, $a is 3
    In the outer scope, $a is 1
    In the inner scope, $a is 4
    In the outer scope, $a is 1
    In the inner scope, $a is 5
For some reason the variable

The rationale for the peculiar behavior of lexical variables and ways to avoid it in conventional scripts are explained in the perldiag manual page. When using Apache::Registry this bug can bite you when you least expect it. Because Apache::Registry works by wrapping the contents of a script inside a handler() function, inner named subroutines are created whether you want them or not. Hence, this piece of code will not do what you expect:

    #!/usr/local/bin/perl
    use CGI qw/param header/;

    my $name = param('name');
    print header('text/plain');
    print_body();
    exit 0;

    sub print_body {
       print "The contents of \$name is $name.\n";
    }
The first time you run it, it will run correctly, printing the value of the name CGI parameter. However, on subsequent invocations the script will appear to get "stuck" and remember the values of previous invocations. This is because the lexically scoped

Perl may be fixed someday to do the right thing with inner subroutines. In the meantime, there are several ways to avoid this problem. Instead of making the outer variable lexically scoped, you can declare it to be a package global, as this snippet shows:

    use strict;
    use vars '$name';
    $name = param('name');

Because globals are global, they aren't subject to weird scoping rules.

Alternatively, you can pass the variable to the subroutine as an argument and avoid sharing variables between scopes altogether. This example shows that variant:

    my $name = param('name');
    print_body($name);

    sub print_body {
       my $name = shift;
       print "The contents of \$name is $name.\n";
    }

Finally, you can put the guts of your application into a library and use or require it. The Apache::Registry script then becomes only a hook that invokes the library:

    #!/usr/local/bin/perl
    require "my_application_guts";
    do_everything();

The shared lexical variable problem is a good reason to use the -w switch during Apache::Registry script development and debugging. If you see warnings about a variable not remaining shared, you have a problem, even if the ill effects don't immediately manifest themselves.

Another problem that you will certainly run into involves the use of custom libraries by Apache::Registry scripts. When you make an editing change to a script, Apache::Registry notices the recent modification time and reloads the script. However, the same isn't true of any library file that you load into the script with use or require. If you make a change to a required file, the script will continue to run the old version of the file until the script itself is recompiled for some reason. This can lead to confusion and much hair-tearing during development!
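To make the stale-library problem concrete, here is a hand-rolled sketch of the kind of check that is needed: walk Perl's %INC hash of loaded files, compare each file's modification time with the one seen earlier, and require the file again if it has changed. This is an illustration under assumptions, not the implementation of any existing module; `%last_mtime` and `reload_changed_libraries()` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical bookkeeping: library path => modification time when last seen.
my %last_mtime;

# Re-require every file in %INC whose modification time has advanced since
# the previous call. The first call only records the times. Returns the
# %INC keys (for example "My/Lib.pm") that were reloaded.
sub reload_changed_libraries {
    my @reloaded;
    for my $key (sort keys %INC) {       # sorted copy, so %INC may change below
        my $file = $INC{$key};
        next unless defined $file && !ref $file && -f $file;

        my $mtime = (stat $file)[9];
        my $seen  = $last_mtime{$file};
        $last_mtime{$file} = $mtime;
        next unless defined $seen && $mtime > $seen;

        delete $INC{$key};               # otherwise require() would be a no-op
        require $key;                    # recompiles the file, replacing its subs
        push @reloaded, $key;
    }
    return @reloaded;
}
```

Calling such a routine at the start of every request trades one stat per loaded file for always-fresh library code. Reloading a file that enables warnings will typically produce "Subroutine redefined" messages, which are harmless in this context.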
You can avoid going bald by using Apache::StatINC, a standard part of the

    Alias /perl/ /usr/local/apache/perl/
    <Location /perl>
      SetHandler       perl-script
      PerlHandler      Apache::Registry
      PerlInitHandler  Apache::StatINC
      PerlSendHeader   On
      Options          +ExecCGI
    </Location>

Because Apache::StatINC operates at a level above the level of individual scripts, any nonstandard library locations added by the script with use lib or by directly manipulating the contents of

When you use Apache::StatINC, there is a slight overhead for performing a stat on each included file every time a script is run. This overhead is usually immeasurable, but it will become noticeable on a heavily loaded server. In this case, you may want to forgo it and instead manually force the embedded Perl interpreter to reload all its compiled scripts by restarting the server with apachectl. In order for this to work, the PerlFreshRestart directive must be turned on in the Apache configuration file. If you haven't done so already, add this line to perl.conf or one of the other configuration files:

    PerlFreshRestart On

You can try reloading compiled scripts in this way whenever things seem to have gotten themselves into a weird state. This will reset all scripts to known initial settings and allow you to investigate problems systematically. You might also want to stop the server completely and restart it using the -X switch. This forces the server to run as a single process in the foreground. Interacting with a single process rather than multiple ones makes it easier to debug misbehaving scripts. In a production environment, you'll want to do this on a test server in order to avoid disrupting web services.

Footnotes

[2] Certain uses of the eval operator and "here" documents are known to throw off Perl's line numbering.

Copyright © 1999 by O'Reilly & Associates, Inc.