Who is going to pay $5 or more per prompt!?

UPDATE: yesterday I spent $150 on prompts testing the high-powered model on my app. Lots of $5-$10 prompts. My biggest prompt? $18.76 😲

I would say it is great at research and resolving things the other models just keep failing on.

For example, I was having a nightmare trying to get the extended thinking mode (my default) to fix a problem, going round in circles. I had probably only spent $10-$15, but it was taking me hours to resolve, as each fix can take several minutes while you discuss it with the agent and then wait for it to do its thing. But the high-powered model came at the problem in a fresh way, did some web research, delved deep and came up with the answer. So it is like using ChatGPT or Claude, but with the advantage that it can “see” your code.

Yes, the cheaper modes can also do web research, and I recommend you enable this in the agent (looking on GitHub, seeking out known issues with plugins, etc.). But the big model somehow felt like it was going the extra mile. And it definitely saved me the time wasted when the cheaper ones keep coming up with bad solutions.

Today I am back to extended thinking mode. And my daily spend will be nearer a few tens of dollars, not $150!

So is the high-powered model worth the extra cost?

  1. Yes, occasionally, if you and the cheaper models are banging your heads against a wall and going round in circles trying to fix stuff. Pull out the big one for a handful of prompts, nail your roadblock, and then switch back down.

  2. But would I use it permanently? No. It is intelligent and thinks hard on big meaty issues. But when using it for standard prompts that aren’t related to fixing or researching horrible problems, I felt it was no smarter than the extended thinking model.

What are other people seeing with the different models?
