Recently there has been a discussion on the WebAIM listserv regarding an apparent new “service” being developed by the Google research people. For those who have not heard about Research at Google, this is the section of Google that is devoted to expanding the Googleverse with new and interesting stuff. And if it makes a few extra bucks for Google’s stockholders, hey, that’s not bad either.
The latest great idea was recently discussed in a blog post entitled “A picture is worth a thousand (coherent) words: building a natural description of images,” which describes an effort to develop a system that is – to use the author’s words:
“…a machine-learning system that can automatically produce captions … to accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images.”
The project has raised a great deal of discussion on the WebAIM listserv with most of the folks noting that Alternative Descriptions (a.k.a., Alternative text, ALT Tags, Alternative Attributes) can only be done by actual humans because they are so “subjective.”
Personally, I pride myself on being a “big picture” kinda guy; someone who tends to look at things broadly and see the large-scale implications. Rather than jumping into the discussion and espousing my opinion on the merits or demerits of this new service, I was inspired to remember a discussion that took place on this very listserv a few years back, where the question of what makes a “good” Alternative Description was debated. I’ve used a variation of this “conversation” many times during presentations and training programs about accessible design, but I am not sure I’ve ever put it down in written form. So here is the summation:
A number of years ago a discussion was generated on one of my listservs about what exactly makes a good Alternative Description for images on a website. The “conversation,” perhaps better described as a debate, quickly got hot and heavy, with two primary respondents going at it fast and furious. Both of these folks were individuals with visual impairments who used screen readers as their assistive technology. Both had strong opinions.
On one side was a person I perceived to be a young woman who insisted that all images required ALT descriptions and that each needed to provide a great deal of detail. “I want to know the color, the shape, size. I want to know what’s in the background, the expressions on people’s faces,” she argued. On the other side was an individual whom I perceived to be an older man. He was equally insistent that ALT descriptions were basically a waste of time. “Unless the graphic has words on it that tell me something, or if the image is a link, I don’t care what it is. Just use the ‘null alt’ so my screen reader will just ignore it. Nearly all of the images on websites are just ‘pretty pictures’ and are a waste of my time,” he lamented.
Of course, I am paraphrasing the conversation here, as it was so long ago and I no longer have the actual transcript. But you get the point. Here are two screen reader users who cannot seem to agree on what should be included in Alternative Descriptions for images. Indeed, they couldn’t even agree on whether ALT descriptions were relevant at all.
When I use this story in my training, I ask the participants why they think the two opinions are so different. I ask: is it a “boy-girl” thing? Are women naturally more interested in “colorful” details and men not so? Or is it perhaps an “age” thing? Are young people simply more interested in details and old curmudgeonly types simply dispassionate?
In recent times, my students have arrived at the answer to this question. They correctly surmise that the older man was born blind and that the young woman had acquired blindness later in life. They reason that, to the man born blind, there was no “visual world,” so any reference to things visual was irrelevant. The woman, on the other hand, fondly remembered the visual world she had lost and, perhaps longing for it, requested as much detail as possible to help her “see” what was on the page.
I share this reflection because it provides a reminder that no two people are the same, let alone two blind people. For those of us struggling with the question of what makes a “good” Alternative Description for an image on a web page or in a digital document, there is no correct answer.
It is hard to imagine that a machine, no matter how powerful, could somehow come up with a description of any random image that would please both of our subjects. Even if the two could agree on what constitutes a “good” ALT description, articulating it in a way that is complete and accurate seems almost impossible.
I don’t know if the new Google service will ever come to fruition, but like so many other things in the Googleverse, it is something we will just have to wait and see.