2.10. Beyond prompts

Having covered all these techniques, I must confess something to you, my reader, with a bit of embarrassment: I don’t even like prompts.

It’s true that the basics are straightforward and open to everyone. However, when a task involves edge cases, demands that vague preferences be expressed clearly, or requires a precise understanding of LLM behavior, writing prompts can be difficult. Take the tree-of-thought technique we discussed earlier: that wasn’t an easy prompt to write.

What about images, though? Here is a prompt for a generated image that I found on my feed in Playground, an AI image app:

Digital realistic art of a Plymouth Road Runner 426 Hemi, tinted windows, speeding through the city streets of Miami, the sun shining on the bright orange body and chrome parts. Professional digital painting made with alcohol inks and acrylic, in the style of WLOP, RHADS, APK, vibrant colors, sharp focus, vanishing point, three-point perspective. High quality.

The author uses very specific phrases to guide the model to their desired outcome. I wonder how much experimentation it took to achieve that. I doubt I could produce something similar without further training, even though I’ve been working with generative AI for the past year.

It seems likely to me that prompts will become less and less important over time. The trend is already visible in cutting-edge tools. OpenAI’s DALL-E 3 integrates natively with ChatGPT, letting you use ChatGPT as a collaborator to refine your prompts. You describe what you want to see in the chat, anywhere from a single sentence to a detailed paragraph.

However, ChatGPT doesn’t pass your description to DALL-E verbatim. Instead, it writes its own prompt for the image model: a case of two AI systems interacting with each other on your behalf.

Let me show you an example. This is an image prompt created from a very simple command I gave to ChatGPT:

Kamil to ChatGPT: Draw a cat, please.

ChatGPT to DALL-E 3: A realistic image of a domestic cat sitting peacefully. The cat has a fluffy coat, with distinct tabby markings in shades of grey and black. Its eyes are bright green, and it is gazing curiously at something out of frame. The background is simple, with a soft focus to keep the emphasis on the cat. There’s a hint of a comfortable living room setting, with a cozy couch and a warm, soft light that enhances the cat’s fur texture.
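If you want to experiment with this two-step arrangement yourself, a minimal sketch using the OpenAI Python SDK might look like the following. The system instruction and the model names are my own choices for illustration; they are not what ChatGPT uses internally.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def draw(user_request: str) -> str:
    # Step 1: ask a chat model to expand the user's simple request
    # into a detailed image prompt, much as ChatGPT does for DALL-E 3.
    chat = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice for this sketch
        messages=[
            {
                "role": "system",
                "content": (
                    "Rewrite the user's request as a detailed, vivid prompt "
                    "for an image generation model. Reply with the prompt only."
                ),
            },
            {"role": "user", "content": user_request},
        ],
    )
    image_prompt = chat.choices[0].message.content

    # Step 2: hand the generated prompt to the image model.
    image = client.images.generate(model="dall-e-3", prompt=image_prompt)
    return image.data[0].url


print(draw("Draw a cat, please."))
```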

Research shows that GPT-4 is, in fact, more effective at eliciting user preferences by asking questions than users are at conveying their needs through prompts, and that this approach requires less mental effort.

I’ve crafted a simple prompt demonstrating how to get a similar outcome:

As a language model, your role involves examining and refining prompts for systems similar to yourself. When a user presents a prompt, break down its purpose, considering the user’s objectives and motivations. Reflect on the constraints and optimal strategies for engaging with large language models in scenarios like the user’s. Your task is to reshape the prompt into a directive for another neural network to maximize effectiveness. Recognize that the user may not have expertise in prompt engineering, so your assistance is key in enhancing their interaction.

The prompt is: “Write a short story about knights.”
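In an application, this refinement step can sit in front of the model that does the actual work: one call rewrites the casual prompt, and a second call runs the rewritten prompt. Here is a minimal sketch along those lines, again with the OpenAI Python SDK; the model name and the two-call structure are assumptions for illustration.

```python
from openai import OpenAI

client = OpenAI()

# The refinement instructions from above, used as a system message.
REFINER_SYSTEM_PROMPT = (
    "As a language model, your role involves examining and refining prompts "
    "for systems similar to yourself. When a user presents a prompt, break down "
    "its purpose, considering the user's objectives and motivations. Reflect on "
    "the constraints and optimal strategies for engaging with large language "
    "models in scenarios like the user's. Your task is to reshape the prompt "
    "into a directive for another neural network to maximize effectiveness. "
    "Recognize that the user may not have expertise in prompt engineering, "
    "so your assistance is key in enhancing their interaction."
)


def refine_and_run(raw_prompt: str) -> str:
    # Step 1: let one model turn the casual prompt into a sharper directive.
    refined = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice for this sketch
        messages=[
            {"role": "system", "content": REFINER_SYSTEM_PROMPT},
            {"role": "user", "content": f'The prompt is: "{raw_prompt}"'},
        ],
    ).choices[0].message.content

    # Step 2: send the refined prompt to the model that does the actual work.
    answer = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": refined}],
    ).choices[0].message.content
    return answer


print(refine_and_run("Write a short story about knights."))
```

In practice you would also constrain the refiner’s output, for example by asking it to reply with the rewritten prompt and nothing else, so the second call doesn’t receive stray commentary.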

I believe this approach will become the norm: experts in various fields will create task elicitation systems specific to their domains. These systems will then help casual users engage with generative AI more easily and achieve better results, all without needing to learn prompt engineering.

As models take over the simpler aspects of prompt engineering before the field has fully matured, AI engineers will focus on the more complex tasks: versioning, testing, fine-tuning, and deploying prompts and models, along with the other advanced skills needed to build applications on this new generation of models.