So....does Agent v2 have a limit or nah?

Started an Agent v2 chat. Came back a while later and it was at $2 and counting. Is there a point at which it stops now or does it just keep spending til it’s out of things to do? What is the method here?

As the people paying for it, we should probably know.

It’s still in early access, so things are changing pretty rapidly. It makes a checkpoint after a logical unit of work or a certain number of steps. It first creates a plan; then, as it approaches its goals, there are reflection steps where it decides whether it’s done, whether it has moved closer, or whether it should backtrack and try something different. Once everything’s finalized for the broad launch, there’ll be a corresponding blog post about pricing and checkpoints with Agent v2.

On first launch it did everything in one checkpoint. With the update, it now makes a checkpoint after a number of steps, depending on reflection. We’re going to get to the point (hopefully soon) where you can choose how deep and how long you want it to go before kicking it off, or stop/redirect it after a checkpoint. For now, you can pause it and start a new chat if you don’t want it to keep going.
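For anyone trying to picture the loop described above, here is a rough sketch in Python. It is not Replit’s actual implementation: `run_agent`, `Verdict`, `steps_per_checkpoint`, and `max_checkpoints` are all made-up names, and the user-chosen checkpoint cap is the "choose how long it goes" feature that was only described as planned.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcomes of a reflection step."""
    DONE = "done"
    CLOSER = "closer"
    BACKTRACK = "backtrack"


def run_agent(plan, reflect, steps_per_checkpoint=3, max_checkpoints=2):
    """Work through a plan, closing a checkpoint every few steps.

    `reflect(step, remaining)` is a stand-in for the agent's reflection;
    it returns a Verdict. `max_checkpoints` is an assumed spending cap.
    """
    queue = list(plan)
    checkpoints = []
    current = []
    while queue and len(checkpoints) < max_checkpoints:
        step = queue.pop(0)
        current.append(step)
        verdict = reflect(step, queue)
        if verdict is Verdict.BACKTRACK:
            # Try the same goal again; the retry still counts as a step.
            queue.insert(0, step + " (retry)")
        if verdict is Verdict.DONE:
            queue.clear()
        if verdict is Verdict.DONE or len(current) >= steps_per_checkpoint:
            checkpoints.append(current)
            current = []
    if current:
        # Plan ran out before the checkpoint filled up.
        checkpoints.append(current)
    return checkpoints


if __name__ == "__main__":
    always_closer = lambda step, remaining: Verdict.CLOSER
    print(run_agent(["plan", "build", "style", "test", "deploy"],
                    always_closer, steps_per_checkpoint=2, max_checkpoints=2))
```

In this sketch, retries count toward the checkpoint size, so repeated backtracking still runs into the cap instead of going on indefinitely.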

Thanks for the clarity Kody.

The major problem I see with V2, which has persisted from the V1 agent, is that it doesn’t seem to understand that mixing its implementation across tech stacks isn’t going to work. For instance, if it does something in Streamlit, it doesn’t seem to realize that it can’t then go ahead and implement another feature in Tailwind… It will try anyway and then spend the rest of your bank account trying to resolve the issues that pop up in the communication between the two. I think that when people complain about the agent getting stuck in a death spiral, this is often the main culprit. Maybe there could be something that lets the agent evaluate whether this is happening… In fact, all the agent really needs to be able to do is say “it seems I cannot resolve this issue”, maybe after 5 or 6 attempts at the same problem, instead of encouraging you to keep trying.
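The "give up after 5 or 6 attempts" idea could look something like the sketch below. This is only an illustration of the suggestion, not anything the agent is known to do; `StuckDetector` and its methods are hypothetical names, and how the agent would decide that two failures are "the same problem" is left as an assumption (here it is just a string key).

```python
from collections import Counter


class StuckDetector:
    """Counts failed attempts per problem and flags when to stop."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts
        self.failures = Counter()

    def record_failure(self, problem):
        """Record a failed attempt; return True once the cap is reached."""
        self.failures[problem] += 1
        return self.failures[problem] >= self.max_attempts

    def record_success(self, problem):
        """Forget a problem once it has been resolved."""
        self.failures.pop(problem, None)


if __name__ == "__main__":
    detector = StuckDetector(max_attempts=5)
    problem = "Streamlit <-> Tailwind integration error"  # made-up key
    for attempt in range(1, 10):
        if detector.record_failure(problem):
            print(f"Stopping after {attempt} attempts: "
                  "it seems I cannot resolve this issue.")
            break
```

The point of keying by problem instead of counting total steps is that a long but productive run is not cut off; only repeated failures on the same issue trigger the stop.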

When the agent first inits a repo, it selects a tech stack and should not deviate from that. It’s one of the 4 options in the lobby. Do you have an example Repl where it used Streamlit and then tried to use Tailwind/JS?

Anything involving it entirely switching stacks is a product bug that we should fix. We’re tracking that it sometimes does a kind of “throws hands in air and tries something new”, but that’s a behavior from the underlying model that we try to suppress. If it’s happening to you, it’d be great if you reported it as a bug so we can replicate it.

I don’t know. I guess I could add you to the Repl; I’m not sure what the best way to showcase it would be, and I wish I had screenshotted the initial plan. I had asked for a simple conference registration landing page, and in the initial plan it actually suggested using Tailwind and Streamlit: Tailwind for the front end and Streamlit to process the registration information. Anyway, halfway through the project I decided that I wanted a registration system where I could log in as an admin to collect the data. It decided to implement that in shadcn, and that’s where the program crashed. The moment I saw the back-end UI, I knew it was switching to shadcn and that the end was near. In the end I just went back to the original Tailwind landing page and embedded a Microsoft form for the registration back end.