A Tragedy of Errors
In the early days of the Web, when most content consisted of static HTML pages, the errors that we encountered were generally HTTP status codes, most famously "404 - Not Found". Web servers today still use the same status codes (HTTP has changed little) and - out of the box - produce error pages that are less than helpful to the average human visitor.
Whilst these status pages (not all the codes indicate an actual error) might be less than helpful for us humans, the status codes contain important information which can be used by software user agents such as search engines. Indeed, the user agents with which we humans usually interact - Web browsers - could be coded to respond to HTTP status codes in a more informative way.
There is something far worse than a terse "500 - Internal Server Error": a page served with a status code of 200, which tells the user agent that the page has been found and is OK, when all it actually says is "Could not connect to database". With the increasing use of the PHP language, which tries its utmost to send HTML content, sloppy programming is liable to result in errors - often meaningless to the site visitor - being served as good pages.[1]
To make our sites more accessible and usable, we need a nicer way of saying "whoops!"
Catching Errors, And What To Do With Them
Before dynamic content (that generated by software rather than static HTML pages) is released to the public, it is hoped that a certain amount of testing will be done to make sure that it works. This testing, however, often fails to address runtime errors which may occur when something breaks in operation - most often a database connection.
For us as developers, seeing a database connection error can be very useful. The general public, however, is not interested; all it knows is that the site at which it is looking is broken. Worse still, I have seen error messages on sites that reveal far too much about the inner workings of the software, possibly exposing vulnerabilities to any visiting villains.
How should we address this? I would suggest that we should take the following steps:
- Ensure that every call that can fail is tested when it is made, and that an error-handling routine is called if the test fails.
- Provide a means by which we can easily switch our software in and out of debug mode. (A variable at the top of the programme along the lines of $debug=TRUE is a good way.) [2]
- Error handling routines should check whether debug mode is on and either serve an informative error page for the developer or pass an appropriate HTTP status code to the user.
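The three steps above can be sketched in PHP as follows. This is only a minimal illustration, not a finished handler: the function names build_error_response() and fail(), the wording of the messages and the PDO connection in the usage comment are all assumptions of mine. The decision about what to send is kept apart from the sending so that it can be checked on its own.

```php
<?php
// Hypothetical switch: TRUE while developing, FALSE before deployment.
$debug = FALSE;

/**
 * Decide what to send when something breaks.
 * Returns [status code, body] without sending anything.
 */
function build_error_response(bool $debug, string $detail): array
{
    if ($debug) {
        // Developers see the detail, but still with an error status.
        return [500, 'Debug Mode: ' . $detail];
    }
    // Visitors get a polite message that reveals nothing internal.
    return [500, 'Sorry, something has gone wrong. Please try again later.'];
}

/** Send the response and stop, so no half-built page goes out as a 200. */
function fail(bool $debug, string $detail): void
{
    [$status, $body] = build_error_response($debug, $detail);
    http_response_code($status);
    header('Content-Type: text/plain; charset=utf-8');
    echo $body, "\n";
    exit;
}

// Usage sketch ($dsn, $user and $password are placeholders):
// try {
//     $db = new PDO($dsn, $user, $password);
// } catch (PDOException $e) {
//     fail($debug, 'Could not connect to database: ' . $e->getMessage());
// }
```

Whichever branch is taken, the status code is never 200, so search engines and other software user agents are told the truth even when the human visitor is spared the detail.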
Making "Whoops!" More Helpful
Those who only write static Web content and who fell asleep in the last section can wake up now - this part is relevant to you.
In the previous section, I explained how we can work around the recent problem of software presenting errors as though they were good pages, which takes us back to our original issue: HTTP status codes and how they can be less than helpful to the general public. Let us now have a look at how we can make those codes work for us, inform our visitors and make our sites more accessible.
Know Your Web Server
Even if we are not involved in programming, some knowledge of how the Web server on which our content is hosted is configured can be important. I am an inveterate user of the Apache HTTPD server software, so the examples I present here refer to that software. Those using other Web servers should read the documentation or use Google to search for specifics.
The Apache HTTPD allows us to specify our own pages, programmes or scripts to handle errors, using the ErrorDocument directive. The following example will direct 404 (not found) errors to Google. This is not a helpful thing to do (with a full URL like this, Apache redirects the browser, so the original 404 status never reaches the client) and is given for illustration only.
ErrorDocument 404 http://www.google.com
Custom error pages can be a good way to make our sites more accessible when things go wrong.
Custom Error Pages
Now that we know that we can tell our Web server to call a custom page when we get an error, what should we actually do? If we are serving static HTML, we can provide informative pages suggesting what the visitor might want to do next - whom they can contact for further information or how to search for what they want. (A Google site search form could be incorporated into a static 404 error document, for example.)
For those of us writing dynamic content, custom error pages are where we can really start to make things happen. We do not need to write a separate programme/script for every possible error - we can just use one and feed the error code through the query string:
ErrorDocument 404 /errors.php?e=404
ErrorDocument 500 /errors.php?e=500
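As a rough idea of what errors.php might contain, here is a minimal sketch. The function names describe_error() and send_error_page() and the message wording are my own inventions; the only thing taken from the configuration above is the query-string parameter e. Codes that we have not configured fall back to a generic 500, so that the query string cannot be used to produce arbitrary statuses.

```php
<?php
// errors.php: one handler for several status codes (a sketch).

/** Map the "e" query-string value to a status code and a visitor message. */
function describe_error(string $e): array
{
    $known = [
        '404' => [404, 'Sorry, we could not find that page.'],
        '500' => [500, 'Sorry, the site is temporarily unavailable.'],
    ];
    // Anything unrecognised is treated as a generic server error.
    return $known[$e] ?? [500, 'Sorry, something has gone wrong.'];
}

/** Send the status code and a small HTML page. */
function send_error_page(string $e): void
{
    [$status, $message] = describe_error($e);
    http_response_code($status);
    header('Content-Type: text/html; charset=utf-8');
    echo '<!DOCTYPE html><html lang="en"><head><title>', $status, '</title></head>';
    echo '<body><h1>', htmlspecialchars($message), '</h1>';
    echo '<p><a href="/">Return to the home page</a></p></body></html>';
}

// errors.php itself would finish with something like:
// $e = $_GET['e'] ?? '';
// send_error_page(is_string($e) ? $e : '');
```

Setting the status code explicitly in the script is a belt-and-braces choice: whatever the server does on our behalf, the page never goes out claiming to be a 200.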
Here are a few things that we might consider doing with an error handling programme:
- When handling a 404 error, look at the requested URI and suggest possible alternatives.
- When handling a 404 error, suggest that the visitor might like to visit a site map or site search facility.
- When handling a 404 error, examine the requested URI for patterns that might suggest an XSS attack and warn somebody.
- If we cannot connect to a database, send a 500 error page, but tell the visitor that the site is temporarily unavailable (do not say why - it can look bad), and provide contact information should they need something urgently. At the same time, send an e-mail and/or an SMS, using an online provider, to the site administrator rather than relying on a visitor to report the problem. The same could apply to the failure of any other external service upon which our site relies. If we are acting on the failure of an external service such as a database connection, we must not make any calls to that service from our error handling programme! Error handlers must be able to stand alone.
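Two of the 404 ideas above might be sketched like this. Everything here is illustrative: suggest_alternatives() and looks_suspicious() are hypothetical names, the list of known paths is assumed to be maintained by hand (so the handler does not depend on a database), and the pattern check is a crude heuristic for deciding when to warn somebody, not a defence against XSS.

```php
<?php
/**
 * Suggest known pages whose paths are close to the one requested.
 * $knownPaths is a hypothetical, hand-maintained list of site paths.
 */
function suggest_alternatives(string $requestedPath, array $knownPaths, int $maxDistance = 3): array
{
    $matches = [];
    foreach ($knownPaths as $path) {
        // levenshtein() counts the single-character edits between two strings.
        $distance = levenshtein(strtolower($requestedPath), strtolower($path));
        // Older PHP versions return -1 for over-long strings; skip those.
        if ($distance >= 0 && $distance <= $maxDistance) {
            $matches[$path] = $distance;
        }
    }
    asort($matches); // closest match first
    return array_keys($matches);
}

/** Rough check for script or markup fragments in a requested URI. */
function looks_suspicious(string $requestUri): bool
{
    $decoded = rawurldecode($requestUri);
    $pattern = '/<\s*script|javascript\s*:|<[^>]+\bon\w+\s*=/i';
    return preg_match($pattern, $decoded) === 1;
}

// Usage sketch inside a 404 handler (the address is a placeholder):
// $uri = $_SERVER['REQUEST_URI'] ?? '';
// if (looks_suspicious($uri)) {
//     mail('admin@example.com', 'Suspicious 404', $uri);
// }
```

A mistyped request for /abuot.html, for example, would be offered /about.html from the list, since the two paths differ by only two characters.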
A Final Word on 404s
A frequent cause of 404 errors (and thus having to handle them) is the removal of pages, or conversions to different URI schemes. In the words of Sir Tim Berners-Lee, "Cool URIs don't change."
If pages must be moved or removed, be certain to set up rewrites/redirects (another reason we should Know Our Web Server) so that the visitor, at the very least, is directed to a page explaining where a document has gone and why. It is suggested that readers acquaint themselves with the HTTP status codes referenced at the beginning of this article, in particular with the ones beginning with the digit 3.
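For Apache users, redirects of this kind can be set up with the Redirect and RedirectMatch directives provided by mod_alias. The fragment below is a sketch only; the paths and the host name www.example.com are placeholders, not taken from any real site.

```apache
# 301 Moved Permanently: the page has a new home.
Redirect permanent /old-page.html http://www.example.com/new-page.html

# 410 Gone: the page was removed on purpose and has no replacement.
Redirect gone /withdrawn.html

# A whole directory converted to a different URI scheme.
RedirectMatch permanent ^/articles/(.*)\.html$ http://www.example.com/posts/$1
```

A 410 can be given its own ErrorDocument in the same way as a 404, which is one place to explain where a document has gone and why.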
Conclusion
By whatever means we are serving our content, let us ensure that when things go wrong, we provide information that is helpful to the site visitor - not to developers or hackers.
Smiffy, as always, is available to consult for those who require help in saying "whoops", nicely.
Notes
[1] In my early days of writing Web applications, I used Perl (and still do when PHP is not specified). Whilst PHP will send content even when there is something fairly major wrong, errors in Perl code more often than not break the programme completely, causing the server to send a 500 status code - a fairly obvious sign that something is broken. I am not criticising PHP, merely pointing out that it requires a little more care to code.
[2] If using debug mode in a visual environment, it is a good idea to have some code that writes "Debug Mode" in large letters at the top of each page. This makes it harder to forget to turn debug mode off before deployment, which can make one look rather silly. Calling debug mode from the query string is another option, although it has security implications: anyone who guesses the parameter can see our debugging output.