In reading Joe Dolson’s recent piece on the intersection of AI and accessibility, I absolutely appreciated the skepticism that he has for AI in general as well as for the ways that many have been using it. In fact, I’m very skeptical of AI myself, despite my role at Microsoft as an accessibility innovation strategist who helps run the AI for Accessibility grant program. As with any tool, AI can be used in very constructive, inclusive, and accessible ways; and it can also be used in destructive, exclusive, and harmful ones. And there are a ton of uses somewhere in the mediocre middle as well.
I’d like you to consider this a “yes… and” piece to complement Joe’s post. I’m not trying to refute any of what he’s saying but rather provide some visibility to projects and opportunities where AI can make meaningful differences for people with disabilities. To be clear, I’m not saying that there aren’t real risks or pressing issues with AI that need to be addressed—there are, and we’ve needed to address them, like, yesterday—but I want to take a little time to talk about what’s possible in hopes that we’ll get there one day.
Alternative text
Joe’s piece spends a lot of time talking about computer-vision models generating alternative text. He highlights a ton of valid issues with the current state of things. And while computer-vision models continue to improve in the quality and richness of detail in their descriptions, their results aren’t great. As he rightly points out, the current state of image analysis is pretty poor—especially for certain image types—in large part because current AI systems examine images in isolation rather than within the contexts that they’re in (which is a consequence of having separate “foundation” models for text analysis and image analysis). Today’s models aren’t trained to distinguish between images that are contextually relevant (that should probably have descriptions) and those that are purely decorative (which might not need a description) either. Still, I still think there’s potential in this space.
As Joe mentions, human-in-the-loop authoring of alt text should absolutely be a thing. And if AI can pop in to offer a starting point for alt text—even if that starting point might be a prompt saying What is this BS? That’s not right at all… Let me try to offer a starting point—I think that’s a win.
Taking things a step further, if we can specifically train a model to analyze image usage in context, it could help us more quickly identify which images are likely to be decorative and which ones likely require a desc
Leave a Reply