Accessibility by Numbers
Whilst those of us who care about Web Accessibility tend to take care to make our own content as accessible as possible, there are times when we discover that we may not have been doing things as well as we could.
Over the last couple of weeks, I have been thinking about how we represent numbers in text. I was taught in school that any cardinal number (the ones we count with) less than 100 should be written out in full, for instance, one, two, three, forty-two. I can’t remember what we were supposed to do with ordinal numbers and fractions. I probably wasn’t listening at the time. ("Smith Minor does not pay attention in class.")
If we look at Checkpoint 14.1 of WCAG 1.0, we are told that we should "Use the clearest and simplest language appropriate for a site’s content". If our audience is not very good at reading English, the clearest and simplest way to represent a number must be in numeric form – 42 rather than forty-two or, even worse, two score and two. The problem is more acute still in French, where the humble 99 becomes quatre-vingt-dix-neuf.
I would say, therefore, that we should probably be writing our numbers as numbers, not words. For me, this is merely a case of breaking a quarter-century habit, and then getting detention for it.
Consider the number 123,456 – what number is it? There are two correct answers that I know of, depending on one’s cultural background. Most of the English-speaking world would say 123456 (one hundred and twenty-three thousand, four hundred and fifty-six). In continental Europe, however, this number would signify 123 and 456 thousandths (one hundred and twenty-three decimal four five six). Diversity may be all very well, but does the world really need two conventions in which the thousands and decimal separators swap roles?
Here lies another accessibility issue, for both humans and software agents that are trying to make sense of numbers. Every good document on the Web declares its language, but can that language be taken to imply a locale as well? In software terms, it is locale, not language, that tends to define how we treat punctuation when parsing a number.
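To make the ambiguity concrete, here is a minimal Python sketch. The function name parse_number and its two separator arguments are my own invention for illustration; real software would normally get the separators from a locale database rather than have them passed in by hand.

```python
from decimal import Decimal


def parse_number(text: str, group_sep: str, decimal_sep: str) -> Decimal:
    """Parse text using explicitly supplied separators.

    Hypothetical helper: no system locale is consulted, the caller
    states which character groups thousands and which marks decimals.
    """
    # Drop the grouping character, then normalise the decimal mark to "."
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# The same six digits and one comma, read under two conventions.
english = parse_number("123,456", group_sep=",", decimal_sep=".")
continental = parse_number("123,456", group_sep=".", decimal_sep=",")

print(english)      # 123456
print(continental)  # 123.456
```

The string itself carries no hint as to which reading is intended; the answer depends entirely on the convention supplied from outside.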
I really cannot see an answer to this, so I present a couple of my own tongue-in-cheek possibilities and one a little more practical, though imperfect. These suggestions will probably come back to me with a lot of red ink on them and a stern injunction to "See Me".
Smiffy Solves the Numbers Problem
- All data must be typed, as it is in the better classes of programming languages, with the exception of Perl. (Perl is just too cool to warrant cluttering beautiful code with type definitions.) For example, the phrase "I sat there for 3 hours" should be expressed as "I sat there for <integer>3</integer> hours." Any separators found in an integer must be thousands separators and may be ignored as formatting, as an integer does not need a decimal separator. When declaring a float value, at least one decimal place must be specified: <float>234,567.0</float>. The decimal separator may then be deduced from the right-most piece of punctuation; anything else is just formatting.
- Return to our numeric roots. If we are using Arabic numerals, surely we should be using Arabic numeric punctuation. The Arabic decimal separator looks suspiciously like a comma. Uh-oh, does that mean that the Europeans have got it right? Not entirely, as the Arabic thousands separator looks like a single right quote that has slipped down the page – not a full stop (period). So, all we need to do is to mark up our numbers with ٫ and ٬, and we’re laughing. And we are all using Unicode now, aren’t we children?
- Use full language plus dialect (like en-GB) to imply locale, and work from that. This all falls apart for those of us with our own custom locales: en-US keyboard, en-GB spellcheck, en-AU currency and ISO 8601 times and dates. Can three, sorry, 3 uses of ‘en’ be guaranteed to imply a specific set of numeric separators? Not with 100% confidence.
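For the curious, the first suggestion's parsing rules can be sketched in a few lines of Python. This is only an illustration of the tongue-in-cheek proposal, not an existing standard: the function names parse_integer and parse_float are hypothetical, and I am assuming that any non-digit character inside the number counts as "punctuation", which conveniently covers the Arabic separators from the second suggestion too.

```python
import re
from decimal import Decimal


def parse_integer(text: str) -> int:
    """Contents of a hypothetical <integer> element.

    Every separator is treated as formatting and thrown away.
    """
    body = text.strip()
    sign = -1 if body.startswith("-") else 1
    digits = re.sub(r"\D", "", body)  # keep digits only
    return sign * int(digits)


def parse_float(text: str) -> Decimal:
    """Contents of a hypothetical <float> element.

    The right-most piece of punctuation is the decimal separator;
    anything to its left is formatting.
    """
    body = text.strip()
    sign = "-" if body.startswith("-") else ""
    body = body.lstrip("+-")
    marks = list(re.finditer(r"\D", body))
    if not marks or marks[-1].end() == len(body):
        raise ValueError("a float must specify at least one decimal place")
    last = marks[-1].start()
    whole = re.sub(r"\D", "", body[:last]) or "0"
    fraction = body[last + 1:]
    return Decimal(f"{sign}{whole}.{fraction}")


print(parse_integer("1,234,567"))   # 1234567
print(parse_float("234,567.0"))     # 234567.0
print(parse_float("234.567,0"))     # 234567.0
```

Note that under these rules <float>123,456</float> would be read as 123.456, which is exactly why the proposal insists that a float always carries an explicit decimal place.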
If anyone has a real solution to the way that we present and interpret numbers, please do tell.