Spent 3 weeks and ~$800 prompt-engineering a classifier. A regex would have worked.
Story time. I joined a startup and my first task was to classify support tickets into 12 categories.
I thought: this is a perfect use case for an LLM. I spent three weeks iterating on prompts, trying GPT-3.5 vs GPT-4, adding few-shot examples, adjusting temperature, writing elaborate system prompts with definitions of each category, chain-of-thought reasoning, the works.
I got it up to 91% accuracy. Pretty good! I was proud.
At some point my manager looked over my shoulder and pointed out that the previous engineer had built a regex-based classifier that ran in 0ms, cost literally $0, and had 94% accuracy.
It had been in the codebase the whole time. I had just never looked.
I've been writing unit tests for regex patterns ever since.
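For anyone wondering what that kind of classifier can look like, here is a minimal sketch. It is not the code from that codebase: the category names, the patterns, and the `classify` and `accuracy` helpers are all made up for illustration, and a real version would cover all 12 categories. The idea is just an ordered list of rules where the first match wins, plus a way to measure accuracy on labeled tickets before reaching for anything fancier.

```python
import re

# Hypothetical categories and patterns, invented for illustration only.
# Order matters: the first rule that matches decides the category.
RULES = [
    ("billing", re.compile(r"\b(invoice|refund|charged?|billing)\b", re.IGNORECASE)),
    ("login", re.compile(r"\b(password|log ?in|sign ?in|2fa)\b", re.IGNORECASE)),
    ("shipping", re.compile(r"\b(shipping|delivery|tracking)\b", re.IGNORECASE)),
]
DEFAULT_CATEGORY = "other"


def classify(ticket: str) -> str:
    """Return the first category whose pattern matches, else the default."""
    for category, pattern in RULES:
        if pattern.search(ticket):
            return category
    return DEFAULT_CATEGORY


def accuracy(labeled_tickets) -> float:
    """Fraction of (text, expected_category) pairs classified correctly."""
    pairs = list(labeled_tickets)
    if not pairs:
        return 0.0
    hits = sum(classify(text) == expected for text, expected in pairs)
    return hits / len(pairs)
```

Running `accuracy` over a few hundred hand-labeled tickets gives a baseline number to beat, which is the comparison I should have made on day one.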