Anthropic's Prompt Engineering Tutorial

(github.com)

203 points | by cjbarber 18 hours ago

12 comments

  • Sammi 4 minutes ago
    Here's my best advice on prompt engineering for hard problems. Always funnel out and then funnel in. Let me explain.

    State your concrete problem and context. Then we funnel out by asking the AI to do a thorough analysis and investigate all the possible options and approaches for solving the issue. Ask it to search the web for all possibly relevant information. Now we start funneling in again by asking it to list the pros and cons of each approach. Finally we ask it to choose the one or two solutions that are most relevant to the problem at hand.

    For easy problems you can skip all of this and just ask directly, because it'll know and it'll answer.

    The issue with harder problems is that if you ask it directly to come up with a solution, it'll just make something up, and then make up reasons for why it'll work. You need to ground it in reality first.

    So you do: concrete context and problem, thorough analysis of options, list pros and cons, and pick a winner.
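
    Roughly, as a multi-turn conversation (a sketch using the Anthropic Python SDK; the model name and the prompt wording are just placeholders, and the web-search step is left out):

        import anthropic

        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        history = []

        def ask(prompt):
            """Send one turn and keep the running conversation as context."""
            history.append({"role": "user", "content": prompt})
            reply = client.messages.create(
                model="claude-sonnet-4-5",  # placeholder; use whatever model you have access to
                max_tokens=2000,
                messages=history,
            ).content[0].text
            history.append({"role": "assistant", "content": reply})
            return reply

        # 1. Concrete context and problem
        ask("Context: <stack, constraints, what we've tried>. Problem: <the concrete issue>.")
        # 2. Funnel out: thorough analysis of all plausible approaches
        ask("Investigate all the possible approaches to this problem, including unusual ones.")
        # 3. Funnel in: pros and cons of each
        ask("List concrete pros and cons of each approach given my constraints.")
        # 4. Pick a winner
        print(ask("Choose the one or two approaches that fit best and explain why."))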

  • mold_aid 15 minutes ago
    "Engineering" here seems rhetorically designed to convince people they're not just writing sentences. With respect "prompt writing" probably sounds bad to the same type of person who thinks there are "soft" skills.
  • jwr 3 hours ago
    I find the word "engineering" used in this context extremely annoying. There is no "engineering" here. Engineering is about applying knowledge, laws of physics, and rules learned over many years to predictably design and build things. This is throwing stuff at the wall to see if it sticks.
    • rr808 36 minutes ago
      I still like the Canadian approach: to hold a title with the word "Engineer" in it, you have to be licensed by the engineering regulator for the province you work in. The US way, where every software dev, mechanic, HVAC installer or plumber is an engineer, is ridiculous.
      • delichon 20 minutes ago
        Disagree. I think it's valid to describe your work as engineering if it is in fact engineering, regardless of credential. If the distinction is important, call it "<credential name> Engineer". But to simply seize the word and say you can't use it until you have this credential is authoritarian, unnecessary, rent-seeking corruption.
        • rr808 13 minutes ago
          Doctors and lawyers are like this. Maybe something like the CPA model, where anyone can be an accountant but you need a certified accountant for anything important.
    • frde_me 1 hour ago
      You could make this same argument about a lot of work that falls onto "engineering" teams.

      There's an implicit assumption that anything an engineer does is engineering (and a deeper assumption that software as a whole is worthy of being called software engineering in the first place)

      • jwr 1 hour ago
        Perhaps. My point is that the word "engineering" describes a specific approach, based on rigor and repeatability.

        If the results of your work depend on a random generator seed, it's not engineering. If you don't have established practices, it's not engineering (hence "software engineering" was always a dubious term).

        Throwing new prompts at a machine with built-in randomness to see if one sticks is DEFINITELY not engineering.

        • fragmede 30 minutes ago
          > Throwing new prompts at a machine with built-in randomness to see if one sticks is DEFINITELY not engineering.

          Where do all the knowledge, laws of physics, and rules learned over many years to predictably design and build things come from, if not from throwing things at the wall, looking at what sticks and what does not, building a model from the differences between what stuck and what did not, and then deriving a theory of stickiness and a set of rules for how things work?

          "Remember kids, the only difference between screwing around and science is writing it down." -Adam Savage

          • galbar 19 minutes ago
            They come from science. Engineering applies laws, concepts and knowledge discovered through science. Engineering and science are not the same; they are different disciplines with different outcome expectations.
    • einrealist 3 hours ago
      I call it "Vibe Prompting".

      Even minor changes to models can render previous prompts useless or invalidate assumptions for new prompts.

    • frobisher 1 hour ago
      I hear you. But what's integration in calculus? :)
  • whatever1 3 hours ago
    In today’s episode of Alchemy for beginners!

    Reminds me of the time I found I could speed up an algorithm by 30% on a benchmark set if I seeded the random number generator with the number 7. Not 8. Not 6. 7.

  • meander_water 45 minutes ago
    Agree with the other commenters here that this doesn't feel like engineering.

    However, Anthropic has done some cool work on model interpretability [0]. If that tooling were exposed through the public API, we could at least start to get a feedback loop going: compare the model's internal states under different prompts and try to tune them systematically.

    [0] https://www.anthropic.com/research/tracing-thoughts-language...

  • raffael_de 3 hours ago
    Should have "(2024)" in the submission title.
  • babblingfish 9 hours ago
    The big unlock for me from reading this is thinking about the order of the output. As in, ask it to produce evidence and indicators before answering a question. Obviously I knew LLMs are probabilistic autocomplete. For some reason, I didn't think to use this for priming.
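
    For example (illustrative wording, not from the tutorial), a prompt that pins the order so the conclusion is conditioned on the evidence the model has just produced:

        PROMPT = """\
        Question: <your question>

        Respond in this order:
        1. Evidence: the specific facts, quotes, or observations you are relying on.
        2. Indicators: what in that evidence points toward or against each candidate answer.
        3. Answer: only now, state your conclusion in one or two sentences.
        """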
    • stingraycharles 4 hours ago
      I typically ask it to start with some short, verbatim quotes of sources it found online (if relevant), as this grounds the context in “real” information rather than hallucinations. It works fairly well in situations where this is relevant (I recently went through a whole session of setting up Cloudflare Zero Trust for our org, and this was very much necessary).
    • beering 8 hours ago
      Note that this is not relevant for reasoning models, since they will think about the problem in whatever order they want before outputting the answer. Since the model can “refer” back to its thinking when writing the final answer, the output order matters less for correctness. That relative robustness is likely why OpenAI is trying to force reasoning onto everyone.
      • adastra22 5 hours ago
        This is misleading if not wrong. A thinking model doesn’t fundamentally work any differently from a non-thinking model. It is still next-token prediction, with the same position independence, and it still suffers from the same context-poisoning issues. It’s just that the “thinking” step injects an instruction to take a moment and consider the situation before acting, as a core system behavior.

        But specialized instructions to weigh alternatives still work better, as the model ends up thinking about how to think, then thinking, then making a choice.

        • simianwords 4 hours ago
          I think you are being misleading as well. Thinking models recursively generate the final “best” prompt for themselves to get the most accurate output. Unless you are genuinely giving new useful information in the prompt, it is kind of useless to structure it one way or another, because reasoning models can generate the intermediate steps that give the best output. The evidence on this is clear: benchmarks show that thinking models are far more performant.
          • zurfer 1 hour ago
            You’re both kind of right. The order is less important for reasoning models, but if you carefully read thinking traces you'll find that the final answer is sometimes not the same as the last intermediate result. On slightly more challenging problems LLMs flip-flop quite a bit, and ordering the output cleverly can uplift the result. That might stop being true for newer or future models, but I iterated quite a bit on this for Sonnet 4.
    • adastra22 8 hours ago
      Furthermore, the opposite behavior is very, very bad. Ask it to give an answer and then justify it, and it will output a randomish reply and then enter bullshit mode rationalizing it.

      Ask it to objectively list pros and cons from a neutral/unbiased perspective and then proclaim an answer, and you’ll get something that is actually thought through.

  • gdudeman 9 hours ago
    This is written for the Claude 3 models (Haiku, Sonnet, Opus). While some lessons will still be relevant today, others will not be useful or necessary on smarter, RL’d models like Sonnet 4.5.

    > Note: This tutorial uses our smallest, fastest, and cheapest model, Claude 3 Haiku. Anthropic has two other models, Claude 3 Sonnet and Claude 3 Opus, which are more intelligent than Haiku, with Opus being the most intelligent.

    • cjbarber 9 hours ago
      Yes, Chapters 3 and 6 are likely less relevant now. Any others? Specifically assuming the audience is someone writing a prompt that’ll be re-used repeatedly or needs to be optimized for accuracy.
  • vincnetas 4 hours ago
    It's one year old. Curious how much of it is irrelevant already. Would be nice to see it updated.
  • CuriouslyC 1 hour ago
    Don't write prompts yourself, use DSPy. That's real prompt "engineering"
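
    Roughly what that looks like, for anyone who hasn't seen it (a sketch; the model string, field names, and example question are just placeholders):

        import dspy

        dspy.configure(lm=dspy.LM("anthropic/claude-3-5-sonnet-20240620"))  # any supported model string

        class Solve(dspy.Signature):
            """Weigh approaches to a problem, then recommend one."""
            problem: str = dspy.InputField()
            pros_and_cons: str = dspy.OutputField()
            recommendation: str = dspy.OutputField()

        solver = dspy.ChainOfThought(Solve)
        result = solver(problem="Our nightly batch job times out past ~10M rows; how should we restructure it?")
        print(result.recommendation)
        # You declare inputs and outputs; DSPy writes the prompt text, and its optimizers can tune it against a metric.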
  • 111yoav 5 hours ago
    Is there an up to date version of this that was written against their latest models?
  • ipnon 3 hours ago
    I really struggle to feel the AGI when I read such things. I understand this is all of a year old, and that we have superhuman results in mathematics, basic science, game playing, and other well-defined fields. But why is it difficult to impossible for LLMs to intuit and deeply comprehend what it is we are trying to coax from them?
    • mkl 44 minutes ago
      > But why is it difficult to impossible for LLMs to intuit and deeply comprehend what it is we are trying to coax from them?

      It's right there in the name. Large language models model language and predict tokens. They are not trained to deeply comprehend, as we don't really know how to do that.

    • jimmcslim 1 hour ago
      Ask yourself "what is intelligence?". Can intelligence at the level of human experience exist without that which we all also (allegedly) have... "consciousness"? What is the source of "consciousness"? Can consciousness be computed?

      Without answers to these questions, I don't think we are ever achieving AGI. At the end of the day, frontier models are just arithmetic, conditionals, and loops.

    • xanderlewis 2 hours ago
      > superhuman results in mathematics

      LLMs mostly spew nonsense if you ask them basic questions on research-level or even master's-degree-level mathematics. I've only ever seen non-mathematicians suggest otherwise, and even the most prominent mathematician advocating for AI, Terry Tao, seems to recognise this too.