I’d love to be able to see my usage while I’m building an app, and to see how much my agent requests are costing, so that I can calibrate my usage without needing to go to a separate page.
It’s better not to look
And remember that the Usage page is 1 hour behind, so it’s not easy to determine accurately which of your most recent prompts cost what.
Haha, likely true.
Ah, I had read that usage had once been easier to access. I wonder if this is why it was moved, perhaps because the delay was confusing?
Also interesting that it’s difficult to correlate what prompting techniques are likely driving cost. I wonder if this information is available?
It’s a challenging thing to quantify. The reason AI coding is so valuable and time-saving is the same reason it’s challenging to identify costs upfront based on a prompt. Coding is a complex web of dependencies that it has to sift through every time.
What I can tell you is that cache matters. A lot of context is cached and saves money as long as you’re in a groove and ‘vibing’ - but if you idle, it auto-purges and has to be rebuilt. You’ll notice that if you start a session, the first prompt will be expensive, but subsequent prompts will be cheaper. Then you go get coffee and yell at your kids, come back, and you have to re-cache, only to be hit with a larger fee.
Take note of these things and adjust your flow accordingly.
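To make the cache point concrete, here is a rough sketch of the pattern described above: the first prompt after a cold start or a long idle gap pays a rebuild cost, and prompts sent while the cache is warm are cheaper. Every number and name here (`cold_cost`, `warm_cost`, `ttl_seconds`, `session_cost`) is a made-up assumption for illustration, not Replit's actual pricing or purge interval.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CacheCostModel:
    # All values are hypothetical placeholders, not real Replit figures.
    cold_cost: float = 1.00     # prompt cost when the context must be rebuilt
    warm_cost: float = 0.25     # prompt cost when the context is still cached
    ttl_seconds: float = 300.0  # idle time after which the cache is purged


def session_cost(prompt_times: Iterable[float],
                 model: Optional[CacheCostModel] = None) -> float:
    """Sum the cost of prompts sent at the given times (seconds, ascending)."""
    model = model or CacheCostModel()
    total = 0.0
    last_time = None
    for t in prompt_times:
        # Cold if this is the first prompt or the idle gap exceeded the TTL.
        is_cold = last_time is None or (t - last_time) > model.ttl_seconds
        total += model.cold_cost if is_cold else model.warm_cost
        last_time = t
    return total


# Four prompts in a steady groove: one cold start, three warm prompts.
steady = session_cost([0, 60, 120, 180])
# Same four prompts with a long coffee break in the middle: two cold starts.
interrupted = session_cost([0, 60, 1000, 1060])
```

Under these invented numbers the interrupted session costs more than the steady one for the same number of prompts, which is the effect to watch for in your own flow.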
The total cost of the current chat used to be at the top of the agent window - you hovered over an icon to see it.
It was simply adding up the live costs of each prompt/rollback - really nothing hard in that, so they had zero technical reason to remove it. The only reason to remove it seems to be to obfuscate how much we’re spending at any one time.
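As a sketch of how little is involved, a running total like that can be kept with a few lines. This is only an illustration of the idea, not how Replit implemented it; the `ChatCostMeter` class, its method names, and the amounts are all hypothetical.

```python
class ChatCostMeter:
    """Hypothetical running total of per-event costs for one chat."""

    def __init__(self) -> None:
        self.events: list[tuple[str, int]] = []

    def record(self, label: str, cost_cents: int) -> None:
        # Store costs as integer cents to avoid floating-point drift.
        self.events.append((label, cost_cents))

    def total_cents(self) -> int:
        return sum(cost for _, cost in self.events)


meter = ChatCostMeter()
meter.record("prompt", 35)    # made-up amounts
meter.record("rollback", 10)
meter.record("prompt", 20)
chat_total = meter.total_cents()
```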
Can you explain your experience on this a bit more, Eric?
I totally understand how cache and the context window build up and up and up over time... until POP, and suddenly Agent is talking gibberish and it is time to kill it and start a new chat.
But auto-purging after a period of silence (and then a costly first prompt as it refreshes its cache) is a new one on me.
For me, it seems to be more predictably in the middle of a long chat that Agent suddenly feels unwell, and it is time to start a new chat.
I learned this from engineers and, after doing so, noticed it in my own app and session flow. This is why long runs with Agent 3 are cheaper than multiple starts and stops with less autonomy. I think.
Hmm. Intriguing. Of course, we have to balance "long runs with Agent 3 are cheaper than multiple starts and stops with less autonomy"
with:
"short, surgical start/stop prompts are more precise and can get us to a working solution quicker than long runs, which might initially seem effective but create apps that don’t work the way we intended, so we then need further prompts to fix them up."
Just sayin’
But all definitely fascinating. I am about to start my first hardcore vibe coding since the autonomy modes were introduced. So I will definitely monitor my workflow closely next week.
I think it would be excellent for Replit users to be given a few free passes a month, with an AI conversation audit to “justify” undoing a particular charge, instead of a unilateral “caveat emptor” policy.
I’ve been spending way too much time on that page counting nickels. Seems like a good recommendation nonetheless.
They get hit with API fees, win, lose, or draw, and so do we.