3.5. Articulation barriers

As we discussed in Chapter 2, the latest surge in generative AI is driven by the use of prompts—instructions or queries that you feed into a model to guide its responses. As multi-modal AIs gain traction, the role of prompts is expanding beyond text to encompass vision and voice. Even so, virtually everything remains driven by text. Take DALL-E 3, for example, which produces images based on the messages you exchange with ChatGPT.

This approach has its advantages. For one, it fosters a more conversational interaction with the product, making the technology more approachable. Using verbal prompts is often more intuitive than navigating a complex user interface, too—after all, you already know how to express yourself. You learned it in school!

Or did you, really?

Language proficiency

The answer isn’t as obvious as it seems. To start, users need to be eloquent enough to craft effective textual prompts. And since most of these models are primarily trained on English data, their performance in other languages can be subpar, which puts non-English speakers at a disadvantage. Prominent UX researcher Jakob Nielsen calls this demand that users articulate their intent in well-formed prose the “articulation barrier.”

While it’s true that a GUI may not always be available in your native language, translating one doesn’t degrade the quality of what the software produces. Nor does translating a user interface cost millions of dollars—unlike retraining a machine learning model.

Language proficiency isn’t the sole barrier to effective use of language models. Research indicates that in affluent countries like the United States and Germany, up to half of the population are considered low-literacy users. Although literacy rates may be higher in countries like Japan and potentially other Asian nations, the situation deteriorates significantly in middle-income and likely even more so in developing countries.

Even for those with high levels of literacy, conveying your requirements in written form can be challenging. In my book, “Writing Great Specifications,” I talk in depth about the complexities of drafting specifications for software development teams. This task is not unlike instructing LLMs like GPT-4 to create an app for you. In particular, two major pitfalls I discuss are information asymmetry and the under-documentation pattern.

Information asymmetry A situation that arises when one party possesses more or better information than the other, leading to an imbalance in understanding.

Under-documentation A common mistake that involves neglecting to provide adequate information, whether through error, miscommunication, or even laziness.

It’s not hard to see how these issues often intersect: we may have a clear vision of what we want the app to do, but fail to communicate this adequately to the model. These pitfalls are not theoretical; they manifest in real-world scenarios every day, even among well-educated, well-intentioned professionals—and with intelligent humans on both ends of the process.

Communication complexity—for algorithm designers

That covers the challenges of prompting, but there’s also the generated content to consider. Analyses suggest that the output from these models is typically written at a 12th-grade reading level or above, making it problematic for low-literacy users. Usability research focusing on such users has long recommended that online text be written at an 8th-grade level to be inclusive of a broader consumer base.
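Claims about reading level are usually grounded in readability formulas. One common choice is the Flesch-Kincaid grade level, which estimates a U.S. school grade from average sentence length and average syllables per word. Here’s a minimal Python sketch; the vowel-group syllable counter is a deliberately naive stand-in for the pronunciation dictionaries that real readability tools use:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per group of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # 0.39 * (words/sentence) + 11.8 * (syllables/word) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

sample = "The cat sat on the mat."
print(f"{flesch_kincaid_grade(sample):.1f}")
```

A short, monosyllabic sentence like the sample scores well below grade level zero, while typical LLM output, with its long sentences and polysyllabic vocabulary, lands far higher on the same scale.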

As measured by Nielsen:

Intriguingly, both of the applications Nielsen measured are built on the same foundational model: GPT-4. This implies that these models can be prompted to produce simpler responses through system prompts, or fine-tuned for that purpose. Each development team needs to decide what level of complexity they and their target audience are comfortable with.
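To make the system-prompt idea concrete, here’s a minimal sketch. It assumes the OpenAI Python SDK’s v1-style client; both the prompt wording and the `build_messages` helper are illustrative, not something taken from Nielsen’s measurements:

```python
def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt with a system prompt that caps output complexity."""
    # Illustrative system prompt: the exact wording is an assumption,
    # and teams would tune it for their own audience.
    system = (
        "You are a helpful assistant. Write every answer at an 8th-grade "
        "reading level: short sentences, common words, no jargon."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

# The actual API call (requires an API key) would look like:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(
#     model="gpt-4",
#     messages=build_messages("Explain how DNS works."),
# )

print(build_messages("Explain how DNS works.")[0]["content"])
```

Fine-tuning is the heavier-weight alternative: instead of instructing the model at request time, you train it on examples of the simpler style you want every response to follow.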

My own experience aligns with this perspective. I often rely on GPT-4 to help me edit my newsletter. Although I’m proficient in English, it’s not my native language—polishing a newsletter issue to a high standard on my own is time-consuming. Crafting an article like this one used to take me one or two days before I started using ChatGPT. I’d complete the initial draft fairly quickly, but then spend a considerable amount of time fine-tuning the text—agonizing over idiomatic expressions, searching for synonyms, and the like.

GPT-4 has dramatically cut my editing time to just 30 minutes to an hour per article, allowing me to concentrate more on articulating my thoughts rather than perfecting their presentation. The trade-off? I often find myself having to simplify the model’s language choices. It just loves these complex, four- or five-syllable words. Yuck!