A Tragedy of Errors
In the early days of the Web, when most content consisted of static HTML
pages, the errors that we encountered were generally
HTTP status codes, most famously "404 – Not Found". Web servers today still use the same status codes (HTTP has changed little) and – out of the box – produce error pages that are less than helpful to the average human visitor.
Whilst these status pages (not all the codes are indicative of an actual error) might be less than helpful for us humans, the
status codes contain important information which can be used by
software user agents such as search engines. Indeed, the user
agents with which we humans usually interact – Web browsers –
could be coded to respond to HTTP status codes in a more
helpful manner.
There is something far worse than a terse "500 – Software Error": receiving a page with a status code of 200, which implies that the page has been found and is OK, when all that the page says is "Could not connect to database". With the increasing use of the PHP language, which tries its utmost to send HTML content, sloppy programming is liable to result in errors – often meaningless to the site visitor – being served as good pages.
To make our sites more accessible and usable, we need a nicer
way of saying "whoops!"
Catching Errors, And What To Do With Them
Before dynamic content (that generated by software rather than
static HTML pages) is released to the public, it is hoped that
a certain amount of testing will be done to make sure that it
works. This testing, however, often fails to address runtime
errors which may occur when something breaks in operation – most
often a database connection.
For us as developers, seeing a database connection error can be very useful.
The general public, however, is not interested; all it knows is
that the site at which it is looking is broken. Worse still, I
have seen error messages on sites that reveal far too much about
the inner workings of the software, possibly exposing
vulnerabilities to any visiting villains.
How should we address this? I would suggest that we should take
the following steps:
- Ensure that anything that can throw an error is tested when
called and an error-handling routine called if the test fails.
- Provide a means by which we can easily switch our software in
and out of debug mode. (A variable at the top of the programme
along the lines of $debug=TRUE is a good way.)
- Error handling routines should look at whether debug mode is on
or not and either serve an informative error page for the
developer, or pass an appropriate HTTP status code to the user.
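The three steps above might be sketched in PHP as follows. This is only a minimal illustration under my own assumptions: the function names buildErrorResponse() and failWith(), the wording of the visitor page and the placeholder connectToDatabase() are all invented for the example, and sending a 500 status in debug mode as well is a choice made here rather than a requirement.

```php
<?php
// Hypothetical debug switch; set to false before deployment.
$debug = false;

/**
 * Decide what to send when something fails at runtime.
 * Returns [HTTP status code, page body] so that the decision can be
 * tested separately from the code that actually sends the response.
 */
function buildErrorResponse(bool $debug, string $detail): array
{
    if ($debug) {
        // Developers see the detail; the status is still 500, never 200.
        $body = '<h1>Debug Mode</h1><pre>' . htmlspecialchars($detail) . '</pre>';
    } else {
        // Visitors learn that something is wrong, but not what or why.
        $body = '<h1>Sorry, something went wrong</h1>'
              . '<p>The site is temporarily unavailable. Please try again later.</p>';
    }
    return [500, $body];
}

/** Send the response built above and stop the script. */
function failWith(bool $debug, string $detail): void
{
    [$status, $body] = buildErrorResponse($debug, $detail);
    http_response_code($status);
    echo $body;
    exit;
}

// Usage sketch: test the call that can fail, then hand over to the handler.
// connectToDatabase() is a placeholder for whatever the application uses.
// $db = connectToDatabase();
// if ($db === false) {
//     failWith($debug, 'Could not connect to database');
// }
```

Keeping the decision in one function and the sending in another means that every part of the software reports failure in the same way, and that the visitor-facing wording lives in one place.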
Making "Whoops!" More Helpful
Those who only write static Web content and who fell asleep in the last section can wake up now – this part is relevant to you.
In the previous section, I explained how we can work around the
recent problem of software presenting errors as though they were not errors. That takes us back to our original issue: HTTP status codes, and how they can be less than helpful to the general public. Let us now have a look at how we can make those codes work for us, inform our visitors and make our sites more accessible.
Know Your Web Server
Even if we are not involved in programming, some knowledge of how the Web server on which our content is hosted is configured can be important. As I am an inveterate user of the Apache HTTPD server software, the examples I present here refer to that software.
Those using other Web servers should read the documentation or use
Google to search for specifics.
The Apache HTTPD allows us to specify our own pages, programmes or
scripts to handle errors, using the ErrorDocument directive. The
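following example will direct 404 (not found) errors to Google. This is
not a helpful thing to do and is given for illustration only. (When the
target is a full URL, Apache sends the client a redirect, so the original
404 status is lost.)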
ErrorDocument 404 http://www.google.com
Custom error pages can be a good way to make our sites more
accessible when things go wrong.
Custom Error Pages
Now that we know that we can tell our Web server to call a custom page
when we get an error, what should we actually do? If we are serving
static HTML, we can provide informative pages suggesting what the
visitor might want to do next – whom they can contact for further
information or how to search for what they want. (A Google site search
form could be incorporated into a static 404 error document, for example.)
For those of us writing dynamic content, custom error pages are
where we can really start to make things happen. We do not need to write
a separate programme/script for every possible error – we can just use
one and feed the error code through the query string:
ErrorDocument 404 /errors.php?e=404
ErrorDocument 500 /errors.php?e=500
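A minimal sketch of what such an errors.php might contain is given below. The list of codes, the page wording and the function name describeError() are assumptions made for this example. The script treats the "e" value as untrusted input, because anybody can request /errors.php?e=anything directly, and it sets the status code explicitly rather than relying on the server to preserve it.

```php
<?php
// errors.php: one script for every ErrorDocument line.

/** Status codes this sketch knows about, with assumed visitor-facing wording. */
const ERROR_PAGES = [
    404 => ['Page not found', 'We could not find that page. Try the site map or the search form.'],
    500 => ['Temporarily unavailable', 'The site has a problem. Please try again shortly.'],
];

/**
 * Turn the raw "e" query-string value into [status, title, message].
 * Anything unrecognised is treated as a 500 rather than trusted.
 */
function describeError(?string $raw): array
{
    $code = (int) $raw;
    if (!array_key_exists($code, ERROR_PAGES)) {
        $code = 500;
    }
    [$title, $message] = ERROR_PAGES[$code];
    return [$code, $title, $message];
}

[$code, $title, $message] = describeError($_GET['e'] ?? null);
http_response_code($code);   // the page must not go out as a 200
echo '<h1>' . htmlspecialchars("$code - $title") . '</h1>';
echo '<p>' . htmlspecialchars($message) . '</p>';
```

Supporting another status code is then a matter of one more ErrorDocument line and one more entry in the list.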
Here are a few things that we might consider doing with an error handler:
- When handling a 404 error, look at the requested URI and suggest
pages that the visitor may have been looking for.
- When handling a 404 error, suggest that the visitor might like
to visit a site map or site search facility.
- When handling a 404 error, examine the requested URI for
patterns that might suggest a cross-site scripting (XSS) attack and
warn somebody.
- If we cannot connect to a database, send a 500 error page, but
tell the visitor that the site is temporarily unavailable (do not say
why – it can look bad), and provide contact information should they
need something urgently. At the same time, send an e-mail and/or
an SMS, using an online
provider, to the site administrator rather than relying on a visitor
to report the problem. The same could apply to any other failure of
an external service upon which our site relies. If we are acting
on the failure of an external service such as a database connection,
we must not make any calls to that service from our error handling
programme! Error handlers must be able to stand alone.
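Two of the ideas in the list above, suggesting likely pages and spotting suspicious requests, can be sketched with nothing but PHP's built-in functions, so the handler still stands alone. The names suggestPaths() and looksSuspicious(), the distance threshold of 3 and the e-mail address are hypothetical, and the pattern check is deliberately crude: a match is a reason to alert a human, not proof of an attack.

```php
<?php
/**
 * Suggest the known paths closest to the one requested, using PHP's
 * built-in levenshtein() edit distance. $knownPaths is assumed to come
 * from a static list or site map file, not from the database.
 */
function suggestPaths(string $requested, array $knownPaths, int $maxDistance = 3): array
{
    // Older PHP versions cannot compare strings longer than 255 characters.
    if (strlen($requested) > 255) {
        return [];
    }
    $scored = [];
    foreach ($knownPaths as $path) {
        if (strlen($path) > 255) {
            continue;
        }
        $distance = levenshtein(strtolower($requested), strtolower($path));
        if ($distance <= $maxDistance) {
            $scored[$path] = $distance;
        }
    }
    asort($scored);               // closest match first
    return array_keys($scored);
}

/** Rough check for script fragments in a requested URI. */
function looksSuspicious(string $requestUri): bool
{
    $decoded = rawurldecode($requestUri);
    return preg_match('/<\s*script|javascript:/i', $decoded) === 1;
}

// Usage sketch inside a 404 handler; the address is a placeholder.
// if (looksSuspicious($_SERVER['REQUEST_URI'])) {
//     mail('admin@example.org', 'Suspicious request', $_SERVER['REQUEST_URI']);
// }
```

For example, with a list containing /about.html and /contact.html, a request for /abuot.html would be offered /about.html only.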
A Final Word on 404s
A frequent cause of 404 errors (and thus having to handle them) is the removal of pages, or conversions to different URI schemes. In the words of Sir Tim Berners-Lee,
Cool URIs don’t change.
If pages must be moved or removed, be certain to set up rewrites/redirects (another reason we should Know Our Web Server)
so that the visitor, at the very least, is directed to a page explaining
where a document has gone and why. It is suggested that readers acquaint
themselves with the HTTP status codes referenced at the beginning of this
article, in particular with the ones beginning with the digit 3.
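Redirects are usually best configured in the Web server itself, but where we have no access to the server configuration, the same idea can be sketched inside a 404 handler. The function name resolveOldPath() and the two lists are assumptions made for this illustration; the status codes used are 301 (Moved Permanently) and 410 (Gone).

```php
<?php
/**
 * Look up a requested path in two hypothetical lists:
 * $moved maps old paths to new ones, $removed lists paths deleted on purpose.
 * Returns [status code, new location or null].
 */
function resolveOldPath(string $path, array $moved, array $removed): array
{
    if (isset($moved[$path])) {
        return [301, $moved[$path]];   // Moved Permanently
    }
    if (in_array($path, $removed, true)) {
        return [410, null];            // Gone: removed deliberately
    }
    return [404, null];                // genuinely not found
}

// Usage sketch inside a 404 handler:
// [$status, $location] = resolveOldPath($path, $moved, $removed);
// if ($location !== null) {
//     header('Location: ' . $location, true, $status);
//     exit;
// }
// http_response_code($status);
```

A 410 page is a good place to explain where a document has gone and why.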
By whatever means we are serving our content, let us ensure that when things go wrong, we provide information that is helpful to the site visitor – not to developers or hackers.
Smiffy, as always, is available to consult for those who
require help in saying "whoops", nicely.
- In my early days of writing Web applications, I used Perl (and still do when PHP is not specified). Whilst PHP will send content even when there is something fairly major wrong, errors in Perl code will, as often as not, break the programme completely, causing the server to send a 500 status code – a fairly obvious sign that something is broken. I am not criticising PHP, merely pointing out that it requires a little more care to code.
- If using debug mode in a visual environment, it is a good idea to have some code that writes "Debug Mode" in large letters at the top of each page to make it harder to forget to turn it off before deployment, which can make one look rather silly. Calling debug mode from the query string is another option, although there may be certain security implications.