Evals and scorers are the backbone of how Braintrust enables optimization for AI agents. When can we expect a user feedback mechanism for the Replit agent that ties directly into the eval process? This would allow real-world usage insights from Replit users to feed into the scoring and training loops, ultimately helping the Replit team refine the agent more effectively.
It’s surprising that, as a Braintrust customer, Replit doesn’t already have this in place. A feedback-to-eval pipeline seems like a critical feature for quickly identifying and addressing the recurring concerns that many of us have raised here in the forum. Not only would it make it easier for the Replit team to prioritize fixes and improvements, but it would also give users confidence that their input has a tangible impact on the product’s evolution.
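To make the ask concrete, here's a rough sketch of what the capture side of such a pipeline could look like with Braintrust's Python SDK. The project name, score name, and agent stub are placeholders, and I'm going off the documented `logger.log` / `log_feedback` pattern, not anything Replit actually runs:

```python
import uuid
import braintrust

# Hypothetical project name.
logger = braintrust.init_logger(project="replit-agent")

def call_agent(prompt: str) -> str:
    # Stand-in for the real Replit agent call.
    return f"(agent response to: {prompt})"

def run_agent(prompt: str) -> tuple[str, str]:
    """Run the agent, log the interaction, and return (output, request_id)."""
    request_id = str(uuid.uuid4())
    output = call_agent(prompt)
    logger.log(id=request_id, input=prompt, output=output)
    return output, request_id

def record_feedback(request_id: str, thumbs_up: bool, comment: str) -> None:
    """Attach the end user's verdict to the previously logged interaction."""
    logger.log_feedback(
        id=request_id,
        scores={"user_satisfaction": 1 if thumbs_up else 0},
        comment=comment,
        source="external",  # assumption: marks this as end-user feedback
    )
```

Every thumbs-up or thumbs-down in the Replit UI would then land next to the exact interaction that produced it, ready to be scored and triaged.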
Direct, structured feedback integrated into evals would be a win-win for both the Replit team and its user community. Is there a timeline or plan for rolling this out?
Very tricky, because many problems are caused by bad prompting.
Garbage in, garbage out, as the old techie saying goes.
If Replit’s big backend training system had to deal with angry feedback every time the agent did something wrong [because it wasn’t instructed clearly by the user], then I think this would lead to a lot of false positives, where they’d end up modifying the AI rules for the wrong reasons.
But this is not to say you are wrong, @Marshal_Thompson - I just think it is extremely tricky to find the right model for taking, evaluating, and using feedback from users who all have very different degrees of expertise and understanding.
@Gipity-Steve
Interestingly, Braintrust’s entire business model is built around solving exactly the challenge you described.
They position themselves as an observability platform for fine-tuning LLM agents — and, according to their presentation earlier this week, Replit is already a customer.
If you haven’t explored it yet: Braintrust’s solution focuses on improving agent inputs and outputs through curated datasets and automated scorers. Paired with their new Loop feature, the platform should, in theory, give the Replit team the observability needed to pinpoint and address exactly the type of issue you’re talking about, or at the very least clear visibility into it.
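For anyone who hasn't looked at it: a Braintrust eval is essentially a dataset plus a task plus scorers. Here's a minimal sketch; the project name, dataset rows, and the exact-match scorer are invented for illustration, while the `Eval(data, task, scores)` shape follows their SDK:

```python
from braintrust import Eval

def run_agent(prompt: str) -> str:
    # Stand-in for the agent under test.
    return "route registered"

def exact_match(input, output, expected):
    # Trivial automated scorer: 1 if the output matches the curated
    # expectation, else 0. Real scorers would be fuzzier (LLM-judged, etc.).
    return 1 if output == expected else 0

Eval(
    "replit-agent",  # hypothetical project name
    data=lambda: [
        {"input": "Add a /health endpoint", "expected": "route registered"},
    ],
    task=lambda input: run_agent(input),
    scores=[exact_match],
)
```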
That said, without a robust user feedback loop feeding into those evaluations, I’m struggling to see how the Replit team can meaningfully move the needle in this problem space. The observability is only as good as the real-world signals it’s measuring.
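To sketch what "feeding into those evaluations" could mean in practice: promote thumbs-down interactions into a regression dataset that future evals run against. The fetch step is stubbed and the dataset name is made up; `init_dataset` and `insert` follow Braintrust's SDK:

```python
import braintrust

def fetch_thumbs_down():
    # Stand-in for pulling interactions whose user_satisfaction score was 0
    # (e.g. via Braintrust's API or an export job).
    return [{"input": "Fix the login bug", "output": "broke signup instead"}]

# "user-regressions" is a made-up dataset name.
dataset = braintrust.init_dataset(project="replit-agent", name="user-regressions")

for row in fetch_thumbs_down():
    dataset.insert(
        input=row["input"],
        expected=None,  # a human curator fills in the desired behavior later
        metadata={"source": "user_feedback", "bad_output": row["output"]},
    )
```

Without that promotion step, the evals only ever test what the team already thought to test.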