I have spent 4 days on a simple weekly sales calculation card, and the agent can’t figure out how to get it to calculate properly, and then can’t get the display to work. I asked it for a very simple data value extraction, and it added a new tab and created 3 test buttons. Can we roll back to the old agents, please?
I have turned off experimental features in my Replit profile, and I’m so much happier. I wish I’d done that 2 weeks ago.
Really? What have you noticed?
I’ve no idea if it’s using Claude 3.5 or 3.7, but I do find that it just does what I tell it and pretty much stays on track during the same chat, rather than wandering off and doing random stuff or undoing what it did earlier. It feels like agent 2 is managing the context window differently and losing some of it over time, which agent 1 somehow got right (a tough ask). Point is, I did a LOAD of dev in Jan/Feb and came to a halt in March with very little progress. I thought it was my app getting complex, but no, it was this agent change. I’ll be very cautious with experimental features from now on. The $100 spend was manageable, but my lost time is irrecoverable.
You nailed it exactly. I got more done in February than in all of March due to the constant need to force the agent to focus. I wish I had worked faster and known more; I would have had my app done a month ago. Shame, it was almost done. I just can’t get the agent to put that last brick on without smashing through the other walls.
Wow. I’m going to revert today. Did you have any issues switching mid project?
No. But make no mistake, agent v1 is not what it used to be. I think the system instructions have changed to suit agent v2. But it is definitely better. Wish I could get a credit for the $100 I wasted this past month, and the $10,000 of my time!
I turned off explorer mode and had a great weekend. I got more done over the weekend than I did in the last 3 weeks. Monday hits and it seems I am back at the slog. I am not sure if they changed the algorithm again, but now the agent is back to doing a lot of actions with no result. Maybe the user load was so low over the weekend that I was able to use more computing cycles?
I think you get v2 irrespective of that setting now. Having said that, I found from around Sunday, it has actually been rather good. My general rule now:
- New pages, give to Agent. It does stuff I didn’t even think of.
- Changes, stick to Assistant. It’s like Agent has a high temperature and top-k, so it gets too creative, whereas Assistant does the job more cleanly, but only a few steps at a time.