Agent 3-va (variable autonomy): Add your feedback

To help Replit teams get rapid feedback as they continue tweaking the agent…

If you have been trying out the new agent 3 improvements (with the variable autonomy options), please add your experiences below - including cost observations:

Personally, I will be trying over the coming week, so don’t have any feedback yet.


PS, “Agent 3-va” is my made-up name. I wonder if we can make it stick :blush:

I’ve been running in Medium mode all day, with the web toggle on and nothing else clicked. I spent the whole day adding moving parts to the app, one by one; total spend was $11.15. Pretty normal $ usage IMO, average spend was probably less than $1 per addition. The agent did have comprehensive md docs to reference, with high-level prompts, which made it seamless, not one hiccup. :exploding_head:

Did not use plan mode, only build mode, as plan just confirmed what I gave it, then charged a small fee to agree with me. So build did all the work from then on.

Build agent was happy to add one thing at a time, giving a detailed explanation of the implementation it had just done. I test, I confirm, it checks and writes a new report, then I come back with a new prompt. So it seems I’m back to normal for now.

2 Likes

Great feedback, and those costs sound like we’re back to the good old days! I look forward to trying it too. A few random questions for you:

  • did you have “app testing” turned off?

  • did you have “high power model” turned off?

  • are you able to tell build mode to “discuss, analyse and review/plan, but don’t change anything until we agree” - like in the old agent2 days? I know in theory this is what “plan” mode is for, but personally I find it too clunky switching between plan and build, so would prefer to stay in build, and just instruct it as needed

  • I was thinking that “low” would be the nearest to agent2. But it sounds like your experience is that “medium” is most like agent2? If so, I wonder what low mode is actually for

  • another thought: if we are in Plan mode, then what does having the different autonomies actually do? Almost feels like Plan/Build are redundant now?

  1. App testing and high power off.
  2. My app planning and prompts are generated outside of Replit in detail, hence me not using the Plan mode today.
  3. I add a single-action prompt to Replit, referencing md docs and workflow. It normally implements in one go. If not, I describe in detail what is visible and what was expected, check the console logs, and try to give as much info back as possible. Once fixed, I always add a follow-up with my personal testing across all platforms; it costs a few cents for the final results, which I then copy and paste out of Replit.
  4. If you plan well enough, you can bypass plan mode with a good prompt.

I don’t understand how anyone can work solely in Replit plan mode when there are so many AI platforms you can use for this same purpose.

1 Like

I don’t. I use a mix of ChatGPT and agent. But in the old agent2 days, I did also ask it to do some planning, code analysis, etc - the main benefit is obvs that it can “see” your code.

I personally dislike the Plan/Build modes they introduced, and had a very nice workflow until they added these.

I haven’t done any real coding with agent 3-va yet. But I decided to take advantage of the new “High” power mode and ask Agent to do a major review of my entire codebase. Total cost $2.38. Once we had agreed on the tasklist it ran for just under 10 minutes.

Note: I didn’t ask it to do it in Max mode, because I simply do not believe in what this mode is about. No way on earth I will let an agent rip through my existing code for 200 mins un-aided (perhaps a brand new app is a different story).

Major flaws:

  • saying it needs to go into Build mode simply to read files, and suggesting it will do changes, even though we agreed (whilst planning in Plan mode) that it would only review. I trusted it, but the flow is very clunky and confusing - more so for any less tech-experienced users. The wording around all of this, with a “start building” button even though the plan it listed only had review-style tasks is amateur stuff, and overall a poor UX

  • $0.08 for a prompt, simply to say “yes I agree to your request not to make changes”. I now believe Replit are charging us for local compute during prompts, not just LLM tokens. Although how my simple question cost $0.08 is beyond me!

Major happiness:

  • well, apart from the fact my app passed its review with flying colours :blush: I felt we might be heading back towards a workflow that actually works. Not perfect, but these agent 3 fixes are definitely in the right direction

  • $2.38 to analyse every line of code in a big scaffold app, including all the configs and native app-building workflow scripts. And understand how it all works and hangs together, and where any issues/security flaws were. I’d say that was value for money

Since I’ve been focused on getting my MVP to GA, I haven’t needed much beyond low mode. I have to say, I’m really impressed with the accuracy-to-cost ratio—it’s been perfect for tying up loose ends and getting my MVP ready for my first customer.

Great work, Replit team!

This request was to audit all nav routes to ensure that they support my read-only access for Role-Based Access Control (RBAC). $1.42 is money well spent for that audit and the route guard adjustments in the BE to support read-only mode.
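For anyone wondering what a read-only route guard might look like, here is a minimal sketch. It is not the poster's code: the role name `read_only`, the helper `is_request_allowed`, and the sample routes are all hypothetical, and it assumes that GET/HEAD/OPTIONS handlers have no side effects.

```python
# Hypothetical sketch of a read-only route guard; names and routes are made up.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}  # assumed to be side-effect free


def is_request_allowed(role: str, method: str) -> bool:
    """Return True if a user with this role may call this HTTP method."""
    if role == "read_only":
        return method.upper() in SAFE_METHODS
    # Assumption: other roles are checked by the rest of the RBAC layer.
    return True


# A tiny "audit": list the routes a read-only user would be blocked from.
routes = [("GET", "/reports"), ("POST", "/reports"), ("DELETE", "/reports/1")]
blocked = [route for route in routes if not is_request_allowed("read_only", route[0])]

for method, path in blocked:
    print(f"read_only blocked: {method} {path}")
```

A real audit would also need to confirm that no GET route writes data, which a method-based check like this cannot see.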

This is a beefier request.

  • 63 actions
  • 2936 lines read
  • 258 lines added and 180 lines removed
  • $3.06 cost
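
To put those numbers in perspective, the figures above can be turned into a few rough unit costs. This is only arithmetic on the quoted totals; it says nothing about how Replit actually bills, and the variable names are my own.

```python
# Figures quoted in the post above; the derived ratios are illustrative only.
actions = 63
lines_read = 2936
lines_added = 258
lines_removed = 180
cost_usd = 3.06

lines_changed = lines_added + lines_removed

cost_per_action = cost_usd / actions
cost_per_changed_line = cost_usd / lines_changed
cost_per_1000_lines_read = cost_usd / lines_read * 1000

print(f"lines changed:             {lines_changed}")
print(f"cost per action:           ${cost_per_action:.3f}")
print(f"cost per changed line:     ${cost_per_changed_line:.4f}")
print(f"cost per 1000 lines read:  ${cost_per_1000_lines_read:.2f}")
```

That works out to roughly 5 cents per action and under 1 cent per changed line for this request.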

1 Like

Agent 3-va feedback so far

I’ve been using agent 3-va (variable autonomy) today. Not so much for coding, but just bringing a few docs up to date (such as replit.md) and auditing some of my codebase.

  • I have High Power Mode off: I will only use this when coding something hugely complex that needs deep research

  • I have App Testing off: I may try this out one day, but I am not sold on this idea, and prefer my own testing for now

  • I have remained in “High” autonomy level - which I’m loving. I feel like I’ve got my old friend Agent 2 back :blush:

  • I used both Plan and Build modes. At times I was in Build mode and asked it to simply review things but make no changes. It complied with this perfectly, and didn’t try to edit any files. I feel like I can trust it again in Build mode, as long as I give clear instructions

Change request for Replit Team

  • “High autonomy level” and “High power model” - using the word “High” in both is confusing - please rename one of them.

Conclusion (early days, with no real coding, but so far I’m very positive)

The vibe feels right. Welcome back Agent 2, my old friend :heart:

1 Like

The vibe is definitely feeling right, wouldn’t you agree?

2 Likes

We’re back, baby.

2 Likes

1 Like

Incredibly grateful to @Gipity-Steve @realfunnyeric and others who may have engaged Replit directly with the community’s concerns. While it may have been 1 step back with Agent 3 launch, I’m hoping we have taken 2 steps forward with variable autonomy and set a good feedback loop (and precedent) with Replit for any future changes.

1 Like

I do believe this to be the case. Cheers, everyone!

2 Likes

I think the whole community pulled together here and made themselves heard.

Replit may have just put another $250m in the bank, but the fallout from agent 3’s launch will have hit them hard, and made them realise they will be nothing without all the great users!

1 Like

Agent 3 in low mode is my new favorite assistant.

2 Likes

Agent 3 in low mode implementing a tenant-level read-only mode for me.


Feature Request: I’d like Agent 3 to have read access to server logs and files while in plan mode. There’s no need to switch to build mode just for the agent to gather this information to construct its plan.

Requiring build mode for simple read access is unnecessarily risky.

1 Like

I wonder how the pricing is working now. Since the posts above made it seem like we were getting back to the Agent 2 feel, I decided to try a very small fix in Medium autonomy mode, without the high power model. It took only 2 minutes, with only 9 code changes, and cost $1.38.

Just fyi: My examples above were low mode.

Initial actions after idle appear to cost more, due to caching; subsequent actions are usually cheaper.

Pay attention to this on your next session and see if after 5 or so minutes of idle time, your next prompt is slightly more expensive as it burns tokens caching, but subsequent, quick follow ups are much less costly.

Something I’m watching.
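
If you want to check this on your own sessions, one rough way is to note the time and cost of each prompt and compare the two groups. The sketch below assumes you record that data by hand; the sample session is made up, the 5-minute threshold comes from the post above, and the gap is measured between prompt timestamps rather than from the end of the agent's previous run.

```python
from statistics import mean

IDLE_THRESHOLD_SECONDS = 5 * 60  # "5 or so minutes", per the post above


def split_by_idle(prompts, idle_threshold=IDLE_THRESHOLD_SECONDS):
    """Split prompt costs into (after_idle, follow_up).

    prompts: list of (timestamp_seconds, cost_usd), sorted by time.
    The first prompt of a session counts as "after idle".
    """
    after_idle, follow_up = [], []
    previous_time = None
    for timestamp, cost in prompts:
        if previous_time is None or timestamp - previous_time >= idle_threshold:
            after_idle.append(cost)
        else:
            follow_up.append(cost)
        previous_time = timestamp
    return after_idle, follow_up


# Made-up sample session: (seconds since start, cost in USD).
session = [(0, 0.20), (60, 0.05), (120, 0.06), (900, 0.18), (960, 0.04)]
after_idle, follow_up = split_by_idle(session)

print(f"avg after idle: ${mean(after_idle):.2f}")
print(f"avg follow-up:  ${mean(follow_up):.2f}")
```

A handful of prompts will not prove anything either way, since prompt size varies a lot, but a consistent gap over several sessions would support the caching theory.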

1 Like