The Apache HTTP Server is a «heavy duty» network server that Subversion can leverage. Via a custom module, httpd makes Subversion repositories available to clients via the WebDAV/DeltaV protocol, which is an extension to HTTP 1.1 (see http://www.webdav.org/ for more information). This protocol takes the ubiquitous HTTP protocol that is the core of the World Wide Web, and adds writing—specifically, versioned writing—capabilities. The result is a standardized, robust system that is conveniently packaged as part of the Apache 2.0 software, is supported by numerous operating systems and third-party products, and doesn't require network administrators to open up yet another custom port. [43] While an Apache-Subversion server has more features than svnserve, it's also a bit more difficult to set up. With flexibility often comes more complexity.
Much of the following discussion includes references to Apache configuration directives. While some examples are given of the use of these directives, describing them in full is outside the scope of this chapter. The Apache team maintains excellent documentation, publicly available on their website at http://httpd.apache.org. For example, a general reference for the configuration directives is located at http://httpd.apache.org/docs-2.0/mod/directives.html.
Also, as you make changes to your Apache setup, it is likely
that somewhere along the way a mistake will be made. If you are
not already familiar with Apache's logging subsystem, you should
become aware of it. In your httpd.conf
file are directives that specify the on-disk locations of the
access and error logs generated by Apache (the
CustomLog
and ErrorLog
directives, respectively). Subversion's mod_dav_svn uses
Apache's error logging interface as well. You can always browse
the contents of those files for information that might reveal
the source of a problem that is not clearly noticeable
otherwise.
To network your repository over HTTP, you basically need four components, available in two packages. You'll need Apache httpd 2.0, the mod_dav DAV module that comes with it, Subversion, and the mod_dav_svn filesystem provider module distributed with Subversion. Once you have all of those components, the process of networking your repository is as simple as:
getting httpd 2.0 up and running with the mod_dav module,
installing the mod_dav_svn plugin to mod_dav, which uses Subversion's libraries to access the repository, and
configuring your httpd.conf
file to export (or expose) the repository.
You can accomplish the first two items either by
compiling httpd and Subversion from
source code, or by installing pre-built binary packages of
them on your system. For the most up-to-date information on
how to compile Subversion for use with the Apache HTTP Server,
as well as how to compile and configure Apache itself for
this purpose, see the INSTALL
file in
the top level of the Subversion source code tree.
Once you have all the necessary components installed on
your system, all that remains is the configuration of Apache
via its httpd.conf
file. Instruct Apache
to load the mod_dav_svn module using the
LoadModule
directive. This directive must
precede any other Subversion-related configuration items. If
your Apache was installed using the default layout, your
mod_dav_svn module should have been
installed in the modules
subdirectory of
the Apache install location (often
/usr/local/apache2
). The
LoadModule
directive has a simple syntax,
mapping a named module to the location of a shared library on
disk:
LoadModule dav_svn_module modules/mod_dav_svn.so
Note that if mod_dav was compiled as a
shared object (instead of statically linked directly to the
httpd binary), you'll need a similar
LoadModule
statement for it, too. Be sure
that it comes before the mod_dav_svn line:
LoadModule dav_module modules/mod_dav.so LoadModule dav_svn_module modules/mod_dav_svn.so
At a later location in your configuration file, you now
need to tell Apache where you keep your Subversion repository
(or repositories). The Location
directive
has an XML-like notation, starting with an opening tag, and
ending with a closing tag, with various other configuration
directives in the middle. The purpose of the
Location
directive is to instruct Apache to
do something special when handling requests that are directed
at a given URL or one of its children. In the case of
Subversion, you want Apache to simply hand off support for
URLs that point at versioned resources to the DAV layer. You
can instruct Apache to delegate the handling of all URLs whose
path portions (the part of the URL that follows the server's
name and the optional port number) begin with
/repos/
to a DAV provider whose
repository is located at
/absolute/path/to/repository
using the
following httpd.conf
syntax:
<Location /repos> DAV svn SVNPath /absolute/path/to/repository </Location>
If you plan to support multiple Subversion repositories
that will reside in the same parent directory on your local
disk, you can use an alternative directive, the
SVNParentPath
directive, to indicate that
common parent directory. For example, if you know you will be
creating multiple Subversion repositories in a directory
/usr/local/svn
that would be accessed via
URLs like http://my.server.com/svn/repos1
,
http://my.server.com/svn/repos2
, and
so on, you could use the httpd.conf
configuration syntax in the following example:
<Location /svn> DAV svn # any "/svn/foo" URL will map to a repository /usr/local/svn/foo SVNParentPath /usr/local/svn </Location>
Using the previous syntax, Apache will delegate the
handling of all URLs whose path portions begin with
/svn/
to the Subversion DAV provider,
which will then assume that any items in the directory
specified by the SVNParentPath
directive
are actually Subversion repositories. This is a particularly
convenient syntax in that, unlike the use of the
SVNPath
directive, you don't have to
restart Apache in order to create and network new
repositories.
Be sure that when you define your new
Location
, it doesn't overlap with other
exported Locations. For example, if your main
DocumentRoot
is exported to
/www
, do not export a Subversion
repository in <Location /www/repos>
.
If a request comes in for the URI
/www/repos/foo.c
, Apache won't know
whether to look for a file repos/foo.c
in
the DocumentRoot
, or whether to delegate
mod_dav_svn to return
foo.c
from the Subversion
repository.
At this stage, you should strongly consider the question of permissions. If you've been running Apache for some time now as your regular web server, you probably already have a collection of content—web pages, scripts and such. These items have already been configured with a set of permissions that allows them to work with Apache, or more appropriately, that allows Apache to work with those files. Apache, when used as a Subversion server, will also need the correct permissions to read and write to your Subversion repository.
You will need to determine a permission system setup that
satisfies Subversion's requirements without messing up any
previously existing web page or script installations. This
might mean changing the permissions on your Subversion
repository to match those in use by other things that Apache
serves for you, or it could mean using the
User
and Group
directives in httpd.conf
to specify that
Apache should run as the user and group that owns your
Subversion repository. There is no single correct way to set
up your permissions, and each administrator will have
different reasons for doing things a certain way. Just be
aware that permission-related problems are perhaps the most
common oversight when configuring a Subversion repository for
use with Apache.
At this point, if you configured
httpd.conf
to contain something like
<Location /svn> DAV svn SVNParentPath /usr/local/svn </Location>
…then your repository is «anonymously»
accessible to the world. Until you configure some
authentication and authorization policies, the Subversion
repositories you make available via the
Location
directive will be generally
accessible to everyone. In other words,
anyone can use their Subversion client to checkout a working copy of a repository URL (or any of its subdirectories),
anyone can interactively browse the repository's latest revision simply by pointing their web browser to the repository URL, and
anyone can commit to the repository.
The easiest way to authenticate a client is via the HTTP Basic authentication mechanism, which simply uses a username and password to verify that a user is who she says she is. Apache provides an htpasswd utility for managing the list of acceptable usernames and passwords, those to whom you wish to grant special access to your Subversion repository. Let's grant commit access to Sally and Harry. First, we need to add them to the password file.
$ ### First time: use -c to create the file $ ### Use -m to use MD5 encryption of the password, which is more secure $ htpasswd -cm /etc/svn-auth-file harry New password: ***** Re-type new password: ***** Adding password for user harry $ htpasswd -m /etc/svn-auth-file sally New password: ******* Re-type new password: ******* Adding password for user sally $
Next, you need to add some more
httpd.conf
directives inside your
Location
block to tell Apache what to do
with your new password file. The
AuthType
directive specifies the type of
authentication system to use. In this case, we want to
specify the Basic
authentication system.
AuthName
is an arbitrary name that you
give for the authentication domain. Most browsers will
display this name in the pop-up dialog box when the browser
is querying the user for his name and password. Finally,
use the AuthUserFile
directive to specify
the location of the password file you created using
htpasswd.
After adding these three directives, your
<Location>
block should look
something like this:
<Location /svn> DAV svn SVNParentPath /usr/local/svn AuthType Basic AuthName "Subversion repository" AuthUserFile /etc/svn-auth-file </Location>
This <Location>
block is not
yet complete, and will not do anything useful. It's merely
telling Apache that whenever authorization is required,
Apache should harvest a username and password from the
Subversion client. What's missing here, however, are
directives that tell Apache which sorts
of client requests require authorization. Wherever
authorization is required, Apache will demand
authentication as well. The simplest thing to do is protect
all requests. Adding Require valid-user
tells Apache that all requests require an authenticated
user:
<Location /svn> DAV svn SVNParentPath /usr/local/svn AuthType Basic AuthName "Subversion repository" AuthUserFile /etc/svn-auth-file Require valid-user </Location>
Be sure to read the next section («Authorization Options») for more detail on the
Require
directive and other ways to set
authorization policies.
One word of warning: HTTP Basic Auth passwords pass in
very nearly plain-text over the network, and thus are
extremely insecure. If you're worried about password
snooping, it may be best to use some sort of SSL encryption,
so that clients authenticate via https://
instead of http://
; at a bare minimum,
you can configure Apache to use a self-signed server
certificate.
[44]
Consult Apache's documentation (and OpenSSL documentation)
about how to do that.
Businesses that need to expose their repositories for access outside the company firewall should be conscious of the possibility that unauthorized parties could be «sniffing» their network traffic. SSL makes that kind of unwanted attention less likely to result in sensitive data leaks.
If a Subversion client is compiled to use OpenSSL, then
it gains the ability to speak to an Apache server via
https://
URLs. The Neon library used by
the Subversion client is not only able to verify server
certificates, but can also supply client certificates when
challenged. When the client and server have exchanged SSL
certificates and successfully authenticated one another, all
further communication is encrypted via a session key.
It's beyond the scope of this book to describe how to generate client and server certificates, and how to configure Apache to use them. Many other books, including Apache's own documentation, describe this task. But what can be covered here is how to manage server and client certificates from an ordinary Subversion client.
When speaking to Apache via https://
,
a Subversion client can receive two different types of
information:
a server certificate
a demand for a client certificate
If the client receives a server certificate, it needs to verify that it trusts the certificate: is the server really who it claims to be? The OpenSSL library does this by examining the signer of the server certificate, or certifying authority (CA). If OpenSSL is unable to automatically trust the CA, or if some other problem occurs (such as an expired certificate or hostname mismatch), the Subversion command-line client will ask you whether you want to trust the server certificate anyway:
$ svn list https://host.example.com/repos/project Error validating server certificate for 'https://host.example.com:443': - The certificate is not issued by a trusted authority. Use the fingerprint to validate the certificate manually! Certificate information: - Hostname: host.example.com - Valid: from Jan 30 19:23:56 2004 GMT until Jan 30 19:23:56 2006 GMT - Issuer: CA, example.com, Sometown, California, US - Fingerprint: 7d:e1:a9:34:33:39:ba:6a:e9:a5:c4:22:98:7b:76:5c:92:a0:9c:7b (R)eject, accept (t)emporarily or accept (p)ermanently?
This dialogue should look familiar; it's essentially the
same question you've probably seen coming from your web
browser (which is just another HTTP client like Subversion!).
If you choose the (p)ermanent option, the server certificate
will be cached in your private run-time
auth/
area in just the same way your
username and password are cached (see «Кэширование клиентской идентификационной информации»). If cached,
Subversion will automatically remember to trust this certificate
in future negotiations.
Your run-time servers
file also gives
you the ability to make your Subversion client automatically
trust specific CAs, either globally or on a per-host basis.
Simply set the ssl-authority-files
variable to a semicolon-separated list of PEM-encoded CA
certificates:
[global] ssl-authority-files = /path/to/CAcert1.pem;/path/to/CAcert2.pem
Many OpenSSL installations also have a pre-defined set
of «default» CAs that are nearly universally
trusted. To make the Subversion client automatically trust
these standard authorities, set the
ssl-trust-default-ca
variable to
true
.
When talking to Apache, a Subversion client might also receive a challenge for a client certificate. Apache is asking the client to identify itself: is the client really who it says it is? If all goes correctly, the Subversion client sends back a private certificate signed by a CA that Apache trusts. A client certificate is usually stored on disk in encrypted format, protected by a local password. When Subversion receives this challenge, it will ask you for both a path to the certificate and the password which protects it:
$ svn list https://host.example.com/repos/project Authentication realm: https://host.example.com:443 Client certificate filename: /path/to/my/cert.p12 Passphrase for '/path/to/my/cert.p12': ******** …
Notice that the client certificate is a «p12» file. To use a client certificate with Subversion, it must be in PKCS#12 format, which is a portable standard. Most web browsers are already able to import and export certificates in that format. Another option is to use the OpenSSL command-line tools to convert existing certificates into PKCS#12.
Again, the runtime servers
file
allows you to automate this challenge on a per-host basis.
Either or both pieces of information can be described in
runtime variables:
[groups] examplehost = host.example.com [examplehost] ssl-client-cert-file = /path/to/my/cert.p12 ssl-client-cert-password = somepassword
Once you've set the
ssl-client-cert-file
and
ssl-client-cert-password
variables, the
Subversion client can automatically respond to a client
certificate challenge without prompting you.
[45]
At this point, you've configured authentication, but not authorization. Apache is able to challenge clients and confirm identities, but it has not been told how to allow or restrict access to the clients bearing those identities. This section describes two strategies for controlling access to your repositories.
The simplest form of access control is to authorize certain users for either read-only access to a repository, or read/write access to a repository.
You can restrict access on all repository operations by
adding the Require valid-user
directive
to your <Location>
block. Using
our previous example, this would mean that only clients that
claimed to be either harry
or
sally
, and provided the correct
password for their respective username, would be allowed to
do anything with the Subversion repository:
<Location /svn> DAV svn SVNParentPath /usr/local/svn # how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file # only authenticated users may access the repository Require valid-user </Location>
Sometimes you don't need to run such a tight ship. For
example, Subversion's own source code repository at
http://svn.collab.net/repos/svn allows anyone
in the world to perform read-only repository tasks (like
checking out working copies and browsing the repository with
a web browser), but restricts all write operations to
authenticated users. To do this type of selective
restriction, you can use the Limit
and
LimitExcept
configuration directives.
Like the Location
directive, these blocks
have starting and ending tags, and you would nest them
inside your <Location>
block.
The parameters present on the Limit
and LimitExcept
directives are HTTP
request types that are affected by that block. For example,
if you wanted to disallow all access to your repository
except the currently supported read-only operations, you
would use the LimitExcept
directive,
passing the GET
,
PROPFIND
, OPTIONS
, and
REPORT
request type parameters. Then the
previously mentioned Require valid-user
directive would be placed inside the
<LimitExcept>
block instead of just
inside the <Location>
block.
<Location /svn> DAV svn SVNParentPath /usr/local/svn # how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file # For any operations other than these, require an authenticated user. <LimitExcept GET PROPFIND OPTIONS REPORT> Require valid-user </LimitExcept> </Location>
These are only a few simple examples. For more in-depth
information about Apache access control and the
Require
directive, take a look at the
Security
section of the Apache
documentation's tutorials collection at http://httpd.apache.org/docs-2.0/misc/tutorials.html.
It's possible to set up finer-grained permissions using a second Apache httpd module, mod_authz_svn. This module grabs the various opaque URLs passing from client to server, asks mod_dav_svn to decode them, and then possibly vetoes requests based on access policies defined in a configuration file.
If you've built Subversion from source code,
mod_authz_svn is automatically built
and installed alongside mod_dav_svn.
Many binary distributions install it automatically as well.
To verify that it's installed correctly, make sure it comes
right after mod_dav_svn's
LoadModule
directive in
httpd.conf
:
LoadModule dav_module modules/mod_dav.so LoadModule dav_svn_module modules/mod_dav_svn.so LoadModule authz_svn_module modules/mod_authz_svn.so
To activate this module, you need to configure your
Location
block to use the
AuthzSVNAccessFile
directive, which
specifies a file containing the permissions policy for paths
within your repositories. (In a moment, we'll discuss the
format of that file.)
Apache is flexible, so you have the option to configure your block in one of three general patterns. To begin, choose one of these basic configuration patterns. (The examples below are very simple; look at Apache's own documentation for much more detail on Apache authentication and authorization options.)
The simplest block is to allow open access to everyone. In this scenario, Apache never sends authentication challenges, so all users are treated as «anonymous».
Пример 6.1. A sample configuration for anonymous access.
<Location /repos> DAV svn SVNParentPath /usr/local/svn # our access control policy AuthzSVNAccessFile /path/to/access/file </Location>
On the opposite end of the paranoia scale, you can
configure your block to demand authentication from everyone.
All clients must supply credentials to identify themselves.
Your block unconditionally requires authentication via the
Require valid-user
directive, and defines
a means to authenticate.
Пример 6.2. A sample configuration for authenticated access.
<Location /repos> DAV svn SVNParentPath /usr/local/svn # our access control policy AuthzSVNAccessFile /path/to/access/file # only authenticated users may access the repository Require valid-user # how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file </Location>
A third very popular pattern is to allow a combination
of authenticated and anonymous access. For example, many
administrators want to allow anonymous users to read certain
repository directories, but want only authenticated users to
read (or write) more sensitive areas. In this setup, all
users start out accessing the repository anonymously. If
your access control policy demands a real username at any
point, Apache will demand authentication from the client.
To do this, you use both the Satisfy Any
and Require valid-user
directives
together.
Пример 6.3. A sample configuration for mixed authenticated/anonymous access.
<Location /repos> DAV svn SVNParentPath /usr/local/svn # our access control policy AuthzSVNAccessFile /path/to/access/file # try anonymous access first, resort to real # authentication if necessary. Satisfy Any Require valid-user # how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file </Location>
Once you've settled on one of these three
basic httpd.conf
templates, you need to
create your file containing access rules for particular
paths within the repository. This is described in
«Path-Based Authorization».
The mod_dav_svn module goes through a lot of work to make sure that data you've marked «unreadable» doesn't get accidentally leaked. This means that it needs to closely monitor all of the paths and file-contents returned by commands like svn checkout or svn update commands. If these commands encounter a path that isn't readable according to some authorization policy, then the path is typically omitted altogether. In the case of history or rename tracing—e.g. running a command like svn cat -r OLD foo.c on a file that was renamed long ago—the rename tracking will simply halt if one of the object's former names is determined to be read-restricted.
All of this path-checking can sometimes be quite
expensive, especially in the case of svn
log. When retrieving a list of revisions, the server
looks at every changed path in each revision and checks it
for readability. If an unreadable path is discovered, then
it's omitted from the list of the revision's changed paths
(normally seen with the --verbose
option),
and the whole log message is suppressed. Needless to say,
this can be time-consuming on revisions that affect a large
number of files. This is the cost of security: even if you
haven't configured a module like
mod_authz_svn at all, the
mod_dav_svn module is still asking Apache
httpd to run authorization checks on
every path. The mod_dav_svn module has
no idea what authorization modules have been installed, so
all it can do is ask Apache to invoke whatever might be
present.
On the other hand, there's also an escape-hatch of
sorts, one which allows you to trade security features for
speed. If you're not enforcing any sort of per-directory
authorization (i.e. not using
mod_authz_svn or similar module), then
you can disable all of this path-checking. In your
httpd.conf
file, use the
SVNPathAuthz
directive:
Пример 6.4. Disabling path checks altogether
<Location /repos> DAV svn SVNParentPath /usr/local/svn SVNPathAuthz off </Location>
The SVNPathAuthz
directive is «on» by
default. When set «off», all path-based
authorization checking is disabled;
mod_dav_svn stops invoking authorization
checks on every path it discovers.
We've covered most of the authentication and authorization options for Apache and mod_dav_svn. But there are a few other nice features that Apache provides.
One of the most useful benefits of an Apache/WebDAV
configuration for your Subversion repository is that the
youngest revisions of your versioned files and directories
are immediately available for viewing via a regular web
browser. Since Subversion uses URLs to identify versioned
resources, those URLs used for HTTP-based repository access
can be typed directly into a Web browser. Your browser will
issue a GET
request for that URL, and
based on whether that URL represents a versioned directory
or file, mod_dav_svn will respond with a directory listing
or with file contents.
Since the URLs do not contain any information about which version of the resource you wish to see, mod_dav_svn will always answer with the youngest version. This functionality has the wonderful side-effect that you can pass around Subversion URLs to your peers as references to documents, and those URLs will always point at the latest manifestation of that document. Of course, you can even use the URLs as hyperlinks from other web sites, too.
You generally will get more use out of URLs to versioned
files—after all, that's where the interesting content
tends to lie. But you might have occasion to browse a
Subversion directory listing, where you'll quickly note that
the generated HTML used to display that listing is very
basic, and certainly not intended to be aesthetically
pleasing (or even interesting). To enable customization of
these directory displays, Subversion provides an XML index
feature. A single SVNIndexXSLT
directive
in your repository's Location
block of
httpd.conf
will instruct mod_dav_svn to
generate XML output when displaying a directory listing, and
to reference the XSLT stylesheet of your choice:
<Location /svn> DAV svn SVNParentPath /usr/local/svn SVNIndexXSLT "/svnindex.xsl" … </Location>
Using the SVNIndexXSLT
directive and
a creative XSLT stylesheet, you can make your directory
listings match the color schemes and imagery used in other
parts of your website. Or, if you'd prefer, you can use the
sample stylesheets provided in the Subversion source
distribution's tools/xslt/
directory.
Keep in mind that the path provided to the
SVNIndexXSLT
directory is actually a URL
path—browsers need to be able to read your stylesheets
in order to make use of them!
Several of the features already provided by Apache in its role as a robust Web server can be leveraged for increased functionality or security in Subversion as well. Subversion communicates with Apache using Neon, which is a generic HTTP/WebDAV library with support for such mechanisms as SSL (the Secure Socket Layer, discussed earlier) and Deflate compression (the same algorithm used by the gzip and PKZIP programs to «shrink» files into smaller chunks of data). You need only to compile support for the features you desire into Subversion and Apache, and properly configure the programs to use those features.
Deflate compression places a small burden on the client and server to compress and decompress network transmissions as a way to minimize the size of the actual transmission. In cases where network bandwidth is in short supply, this kind of compression can greatly increase the speed at which communications between server and client can be sent. In extreme cases, this minimized network transmission could be the difference between an operation timing out or completing successfully.
Less interesting, but equally useful, are other features of the Apache and Subversion relationship, such as the ability to specify a custom port (instead of the default HTTP port 80) or a virtual domain name by which the Subversion repository should be accessed, or the ability to access the repository through a proxy. These things are all supported by Neon, so Subversion gets that support for free.
Finally, because mod_dav_svn is speaking a semi-complete dialect of WebDAV/DeltaV, it's possible to access the repository via third-party DAV clients. Most modern operating systems (Win32, OS X, and Linux) have the built-in ability to mount a DAV server as a standard network «share». This is a complicated topic; for details, read Приложение C, WebDAV и автоматическое управление версиями.
[43] They really hate doing that.
[44] While self-signed server certificates are still vulnerable to a «man in the middle» attack, such an attack is still much more difficult for a casual observer to pull off, compared to sniffing unprotected passwords.
[45] More security-conscious folk might not want to store
the client certificate password in the runtime
servers
file.
[46] Back then, it was called «ViewCVS».