2.4. Chain-of-thought

Let’s start with chain-of-thought. This strategy prompts the language model to articulate its reasoning before it gives a final answer. When the prompt includes a handful of examples where the reasoning is clearly spelled out, the model tends to outline its own reasoning in the same way when responding to a new question.

To illustrate the chain-of-thought approach, let’s consider two scenarios: one involving a bookstore and another involving a school’s sports equipment inventory. The first serves as the worked example in the prompt, and the second is the new question the model must answer.

Model input:

Question: A bookstore has 200 novels in stock. It receives 4 more boxes of novels, each box containing 25 novels. How many novels are in stock now?

Answer: Initially, the bookstore has 200 novels. Each of the 4 boxes contains 25 novels, so 4 boxes total 4 × 25 = 100 novels. Adding these to the original stock, 200 novels + 100 novels = 300 novels. The answer is 300 novels.

Question: The school’s sports department has 15 basketballs and 10 volleyballs. If they purchase 3 new sets of sports equipment, each set containing 5 basketballs and 4 volleyballs, how many of each type of ball do they have now?

Model output:

Answer: The sports department starts with 15 basketballs and 10 volleyballs. Each set of new equipment contains 5 basketballs and 4 volleyballs. So, 3 sets would have 3 × 5 = 15 basketballs and 3 × 4 = 12 volleyballs. Adding these to the original numbers, the department now has 15 + 15 = 30 basketballs and 10 + 12 = 22 volleyballs. The answer is 30 basketballs and 22 volleyballs.
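The few-shot prompt above can also be assembled programmatically. The following is a minimal sketch in Python, not a prescribed implementation: the names `EXAMPLES`, `NEW_QUESTION`, and `build_few_shot_prompt` are hypothetical, and the sketch only builds the prompt text. Sending it to a model is left out, since that depends on whichever client you use.

```python
# Worked (question, answer) pairs whose answers spell out the reasoning.
EXAMPLES = [
    (
        "A bookstore has 200 novels in stock. It receives 4 more boxes of "
        "novels, each box containing 25 novels. How many novels are in stock now?",
        "Initially, the bookstore has 200 novels. Each of the 4 boxes contains "
        "25 novels, so 4 boxes total 4 × 25 = 100 novels. Adding these to the "
        "original stock, 200 novels + 100 novels = 300 novels. "
        "The answer is 300 novels.",
    ),
]

NEW_QUESTION = (
    "The school’s sports department has 15 basketballs and 10 volleyballs. "
    "If they purchase 3 new sets of sports equipment, each set containing "
    "5 basketballs and 4 volleyballs, how many of each type of ball do they "
    "have now?"
)


def build_few_shot_prompt(examples, new_question):
    """Join the worked examples, then append the unanswered question."""
    parts = []
    for question, worked_answer in examples:
        parts.append(f"Question: {question}\n\nAnswer: {worked_answer}")
    # The prompt ends with the new question, so the model continues
    # with its own "Answer:" in the style of the examples.
    parts.append(f"Question: {new_question}")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt(EXAMPLES, NEW_QUESTION)
print(prompt)
```

Adding more worked examples only requires appending further pairs to the list; the layout of the prompt stays the same.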

As models became more capable, it became less crucial to provide examples of reasoning, because their training datasets already contain a vast array of reasoning examples. With a sufficiently large model, you can trigger chain-of-thought reasoning in almost any prompt by appending a magic phrase: “Let’s think step by step.”

Model input:

Q: The school’s sports department has 15 basketballs and 10 volleyballs. If they purchase 3 new sets of sports equipment, each set containing 5 basketballs and 4 volleyballs, how many of each type of ball do they have now? Let's think step by step.

Model output:

A: The sports department starts with 15 basketballs and 10 volleyballs. Each set of new equipment contains 5 basketballs and 4 volleyballs. So, 3 sets would have 3 × 5 = 15 basketballs and 3 × 4 = 12 volleyballs. Adding these to the original numbers, the department now has 15 + 15 = 30 basketballs and 10 + 12 = 22 volleyballs. The answer is 30 basketballs and 22 volleyballs.
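The zero-shot variant above amounts to appending the trigger phrase to the question. The sketch below shows that step in Python, along with a small helper that pulls out the final answer. The helper assumes the output closes with a sentence of the form "The answer is ...", as the sample outputs here do; real model output may be phrased differently. Both function names are hypothetical.

```python
import re

TRIGGER = "Let's think step by step."


def make_zero_shot_prompt(question, trigger=TRIGGER):
    """Append the chain-of-thought trigger phrase to a plain question."""
    return f"Q: {question.strip()} {trigger}"


def extract_final_answer(model_output):
    """Return the text after the last 'The answer is', or None if absent.

    Assumption: the model ends with a sentence such as
    'The answer is 300 novels.' This is a convention, not a guarantee.
    """
    matches = re.findall(r"The answer is\s+(.+?)\.(?:\s|$)", model_output)
    return matches[-1] if matches else None


sample_output = (
    "A: The sports department starts with 15 basketballs and 10 volleyballs. "
    "So, 3 sets would have 3 × 5 = 15 basketballs and 3 × 4 = 12 volleyballs. "
    "The answer is 30 basketballs and 22 volleyballs."
)
print(make_zero_shot_prompt("How many of each type of ball do they have now?"))
print(extract_final_answer(sample_output))
```

Separating the final answer from the reasoning lets a program use the result while keeping the full chain available for inspection.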

Outlining the thought process makes the AI’s responses more transparent and easier to inspect. Users can see how the AI arrived at its conclusion, which helps in verifying the accuracy and reliability of the response. With the intermediate steps visible, users can also more easily identify where the AI might have gone wrong and provide targeted feedback for improvement.
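Because the intermediate steps are written out, some of them can even be checked mechanically. The sketch below is one illustrative way to do that, under a narrow assumption: each step appears as "a × b = c" or "a + b = c" with whole numbers and no units between the operands. The name `check_arithmetic` is hypothetical, and steps written in other forms (for example "200 novels + 100 novels = 300 novels") are simply not detected.

```python
import re

# Matches steps such as "3 × 5 = 15" or "10 + 12 = 22".
STEP = re.compile(r"(\d+)\s*([×+])\s*(\d+)\s*=\s*(\d+)")


def check_arithmetic(reasoning):
    """Return a list of (step_text, is_correct) for each detected step."""
    results = []
    for left, op, right, claimed in STEP.findall(reasoning):
        a, b = int(left), int(right)
        actual = a * b if op == "×" else a + b
        results.append((f"{a} {op} {b} = {claimed}", actual == int(claimed)))
    return results


reasoning = (
    "So, 3 sets would have 3 × 5 = 15 basketballs and 3 × 4 = 12 volleyballs. "
    "Adding these to the original numbers, the department now has "
    "15 + 15 = 30 basketballs and 10 + 12 = 22 volleyballs."
)
for step, ok in check_arithmetic(reasoning):
    print(step, "OK" if ok else "MISMATCH")
```

A check like this only confirms the arithmetic inside each step. Whether the steps are the right ones for the question still requires a reader's judgment.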

For tasks that require sequential steps or reasoning, chain-of-thought prompting can guide the AI to follow a structured approach, leading to more accurate and relevant answers. When the AI explains its reasoning step by step, the explanation can also serve as an educational tool, helping users learn how to approach similar problems in the future.