Overcomplicating simple features, Agent can't debug itself

Pretty simple: in some cases, Agent does great at getting an initial prototype up. I just started one yesterday, though, and I’ve had probably 10+ back-and-forths asking it to ‘make sign up work’. It’s a simple, boilerplate sign up / sign in flow.

I’m trying not to manually fix the bug, but both Agent and Assistant are just failing to fix it; they keep iterating back and forth between two (incorrect) solutions. The issue is that they’ve set up the sign up form to trigger a GET request to /users, rather than any registration flow or endpoint. And they refuse to fix it, even with very targeted guidance.
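
To make the bug concrete, here is a minimal sketch of the difference between the two requests. It is not the code from the repl: the `handle` router, the `/register` path, and the in-memory `USERS` store are all hypothetical stand-ins, and the PBKDF2 iteration count is only illustrative. The point is that a GET to /users can only read, so a form wired to it never creates an account.

```python
import hashlib
import os

# Hypothetical in-memory user store: email -> (salt, password_hash).
USERS = {}


def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def handle(method, path, form=None):
    """Tiny stand-in for a web framework's router. Returns (status, body)."""
    form = form or {}
    if method == "GET" and path == "/users":
        # What the broken form was hitting: a read-only listing.
        # Nothing is created here, so sign up can never succeed.
        return 200, sorted(USERS)
    if method == "POST" and path == "/register":
        # What a sign up form should hit (the path name is an assumption).
        email = form.get("email", "").strip().lower()
        password = form.get("password", "")
        if not email or not password:
            return 400, "email and password are required"
        if email in USERS:
            return 409, "account already exists"
        salt = os.urandom(16)
        USERS[email] = (salt, hash_password(password, salt))
        return 201, "created"
    return 404, "not found"
```

Calling `handle("GET", "/users", {...})` leaves `USERS` empty no matter what the form contains, while `handle("POST", "/register", {...})` with an email and password adds the account.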

An interesting case for sure.

2 Likes

I’ve found signup and auth the hardest things to get built.

When I rebuilt my app, I decided to build auth/signup first so that I’d know it worked.

5 Likes

How I ended up getting it working without manually writing the code (I’m trying to explore the limitations of these agents without giving them too much assistance):

  1. I asked Claude/Sonnet 3.5 (separately) to take the auth page Replit Agent made, and make JUST a working sign up page.
  2. I tested that it worked back in Replit.
  3. I asked Replit Agent to add a sign in page, with this instruction:

" okay, ive fixed the signup flow. But, there is no sign in. Please add sign in capability, and DO NOT BREAK THE EXISTING SIGNUP Flow. make a separate component, do whatever you need to NOT modify the existing auth-page signup logic."

That seemed to do it.

I suspect that asking it to build a site with sign in / sign up will fail a significant portion of the time. But this seems like a case where Replit’s engineers could program in a custom flow with some ‘known working’ examples for the agent to draw from, rather than having it use their existing LangGraph flow as if it’s a novel problem space.

2 Likes

Hey there! Coming from no coding background, I actually asked DeepSeek (another AI) to give me a prompt to help guide the AI Assistant better. I rarely use the Agent because it makes massive mistakes and gets stuck. Here’s the prompt, feel free to use it:

To resolve the issues, follow these steps:

  1. Explain Intended Behavior:
    • Describe how the feature should work, including its purpose, inputs, outputs, and interactions with other modules.
  2. Analyze Current Behavior:
    • Review the code and identify inconsistencies, bugs, or deviations from the intended behavior. Highlight any violations of modularity or TDD principles.
  3. Propose a Fix:
    • Suggest a solution that resolves the issues without breaking core functionality. Ensure the fix aligns with modular design and reduces dependencies.
  4. Apply TDD Principles:
    • Outline unit or integration tests to verify the fix, covering edge cases and error handling. Add new tests if existing ones are insufficient.
  5. Validate and Document:
    • Explain how the fix will be validated to prevent regressions. Document changes clearly, updating comments, READMEs, or diagrams as needed.
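
For step 4, the tests the prompt asks for might look something like the sketch below. It assumes a hypothetical `sign_up(users, email, password)` function backed by a plain dict; neither the function nor its validation rules come from any real project in this thread, they are only there so the tests have something to run against.

```python
import unittest


def sign_up(users, email, password):
    """Hypothetical function under test: adds a user or raises ValueError."""
    email = email.strip().lower()
    if "@" not in email:
        raise ValueError("invalid email")
    if len(password) < 8:
        raise ValueError("password too short")
    if email in users:
        raise ValueError("email already registered")
    users[email] = password  # a real app would store a hash, not the password
    return email


class SignUpTests(unittest.TestCase):
    def setUp(self):
        self.users = {}

    def test_creates_user(self):
        result = sign_up(self.users, "Ann@Example.com", "s3cretpass")
        self.assertEqual(result, "ann@example.com")
        self.assertIn("ann@example.com", self.users)

    def test_rejects_duplicate_email(self):
        sign_up(self.users, "ann@example.com", "s3cretpass")
        with self.assertRaises(ValueError):
            sign_up(self.users, "ann@example.com", "otherpass1")

    def test_rejects_bad_input(self):
        bad_inputs = [("not-an-email", "s3cretpass"), ("ann@example.com", "short")]
        for email, password in bad_inputs:
            with self.assertRaises(ValueError):
                sign_up(self.users, email, password)
        # Rejected sign ups must not leave partial records behind.
        self.assertEqual(self.users, {})
```

Saved to a file, these can be run with `python -m unittest` after every change that touches sign up.
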
3 Likes

Thanks for the feedback :slight_smile: There’s a bunch of stuff in the works for improving Agent’s auth tooling. There have been a bunch of reports on it, and it’s part of a big sweep of features coming in about 2 weeks.

Can you link to a repl where you’ve been having one of those auth issues with agent, or send me a DM with it so I can use it as another reference for where agent’s making the auth mistakes?

4 Likes

Sure! Try this one. It’s a good example of what was happening, if you look at the agent logs. You’ll see that I ‘resolved’ it by injecting my own sign up logic separately, then asking it to build sign in without breaking sign up.

https://replit.com/@jjmilburn/SprintHive

1 Like

Agent loves to drop my user table. I’m like, hey, you broke this function, so write yourself some unit tests and run them every time you touch this function. “That’s a great idea. I’ll drop your user table!” :joy: It’s pretty bad. I kinda just want an OAuth template that is resistant to Agent and Assistant chaos. This user management debugging loop is such a waste of time.
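
One way to guard against a dropped user table is to run schema changes inside a transaction and roll back if the table disappears or loses rows. The sketch below assumes SQLite and a table named `users`, neither of which is confirmed by this thread; `run_guarded` and `user_table_snapshot` are hypothetical helper names. It relies on SQLite treating DROP TABLE as transactional, and on the connection being opened with `isolation_level=None` so the explicit BEGIN/COMMIT/ROLLBACK statements are in control.

```python
import sqlite3


def user_table_snapshot(conn):
    """Return (table_exists, row_count) for the users table."""
    exists = bool(conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'users'"
    ).fetchall())
    count = conn.execute("SELECT COUNT(*) FROM users").fetchall()[0][0] if exists else 0
    return exists, count


def run_guarded(conn, change):
    """Run change(conn) in a transaction; roll back if users is dropped or shrinks.

    Assumes conn was opened with isolation_level=None (manual transactions).
    """
    _, before = user_table_snapshot(conn)
    conn.execute("BEGIN")
    try:
        change(conn)
        exists, after = user_table_snapshot(conn)
        if not exists or after < before:
            raise RuntimeError("change would drop or shrink the users table")
        conn.execute("COMMIT")
    except Exception:
        # Undo everything the change did, including a DROP TABLE.
        conn.execute("ROLLBACK")
        raise
```

A change such as `lambda c: c.execute("DROP TABLE users")` passed to `run_guarded` raises `RuntimeError` and leaves the table and its rows intact, while a plain INSERT goes through and is committed.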

3 Likes