I’ve been in this community for about 10-11 months building a lab-grown diamond marketplace. Started with zero infrastructure knowledge, built through AI-assisted vibe coding, and Replit was the reason any of it was possible. I want to share what my setup looks like now because I think there are people here in a similar position who might benefit from seeing one path through it.
This isn’t an “I left Replit” post. I’m still here. I’m on the $20 plan and Replit is still my app host and staging environment. But what sits behind it looks completely different than it did three months ago, and getting here taught me more about what I’d actually built than the eight months before it combined.
Where I was
I have a production TypeScript app that has grown to around 700K lines. The Agent built most of it. Agent 3 specifically was excellent — I want to say that clearly because I know the team reads these. It understood context, it worked with my codebase the way I expected, and for a stretch it felt like exactly the right tool. I got genuinely good output from it.
When Agent 4 rolled out, it broke my workflow in ways I couldn’t absorb. I’d spent months honing how I worked around Agent 3’s behavior — my prompting patterns, my expectations for how it would handle my codebase, the rhythm of how I built with it. Agent 4 had a different temperature entirely. It would go off on tangents, reorganize things that were working, make architectural calls I hadn’t asked for. On a 700K line codebase those tangents weren’t just annoying, they were expensive. I’d burn through agent sessions getting little to no gain, sometimes leaving things worse than where I started.
I don’t think Agent 4 was a bad product. I think the team shipped something designed to serve the broader community and the direction they’re heading. But for someone with a large existing codebase and a workflow built entirely around how the previous version operated, the change was disruptive enough that I needed to solve the problem permanently rather than keep adapting to each new release.
The agent costs were also adding up. I do long focused sessions, usually around four hours, and when you’re deep in a problem you’re not watching the meter. Some months the total across agent fees, database compute, and deployments was over $1,000; other months were worse, $1,200-$1,400. For a solo founder also running a jewelry business with physical overhead, that unpredictability was affecting how I built. I was avoiding improvements because of what they’d cost to implement. Bad place to be.
What I did
I had two 2013 Mac Pro 6,1s sitting around that I’d picked up for $500 total. An 8-core and a 12-core. Both with 64GB RAM expandable to 128GB, PCIe SSDs, dual NICs, and Thunderbolt ports that can run 10TB external storage if I need it. I also had an M1 Mac Mini. My daily driver is an M1 MacBook Pro with 32GB.
I split the roles across them. The office Mac Pro became the primary PostgreSQL database server. The home Mac Pro runs AI compute — vision processing and heavier inference workloads. The Mac Mini runs a local language model for search query parsing via Ollama. The MacBook ties it all together as the development hub.
Everything connects over Tailscale, which is a mesh VPN that took about ten minutes to configure. The database is exposed to my Replit app through a Cloudflare Tunnel — no open ports on any machine, Cloudflare handles DDoS protection, SSL termination, and rate limiting at the edge. Replit just sees a normal database connection string.
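For anyone wanting to replicate the tunnel piece, the shape of it is a cloudflared ingress config on the database machine. This is a sketch, not my actual file — the tunnel ID, paths, and hostname are placeholders:

```yaml
# ~/.cloudflared/config.yml — illustrative sketch, all identifiers are placeholders
tunnel: <tunnel-uuid>
credentials-file: /Users/me/.cloudflared/<tunnel-uuid>.json

ingress:
  # Forward the public hostname to the local Postgres port.
  # Nothing is exposed directly; cloudflared dials out to Cloudflare's edge.
  - hostname: db.example.com
    service: tcp://localhost:5432
  # Required catch-all rule for unmatched requests.
  - service: http_status:404
```

The key property is that the machine makes only outbound connections to Cloudflare, so there are no inbound firewall rules or port forwards to manage.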
Why the database move mattered
I was on the Replit-connected Neon DB. For most apps it’s great. For mine it had become the bottleneck: as far as I could tell it has a 10 GB per month limit that includes usage, I blew through it every month, and I’d rack up $80-$120 in charges, rising as traffic spiked.
A diamond marketplace is search-heavy in a way most apps aren’t. Dozens of filter variables — shape, cut, color, clarity, carat, measurements, optical performance scores, lab, fluorescence, price, availability — combined in every possible way. Multiple joins, range filters across many columns, sorting on computed values. On Neon’s shared tenant infrastructure some of those queries were taking five to ten seconds. That’s not a bad query, that’s hardware contention on a shared platform.
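To make the query shape concrete, here is a minimal sketch of how that kind of filter set compiles into one parameterized Postgres query. The table and column names are my illustrative guesses, not the real schema:

```typescript
// Sketch: compile a set of optional diamond filters into a single
// parameterized WHERE clause. Column names are illustrative only.
type DiamondFilters = {
  shape?: string[];
  caratMin?: number;
  caratMax?: number;
  priceMax?: number;
  color?: string[];
};

function buildDiamondQuery(f: DiamondFilters): { sql: string; params: unknown[] } {
  const clauses: string[] = [];
  const params: unknown[] = [];
  // Push a value and rewrite the "?" into the next numbered placeholder.
  const add = (clause: string, value: unknown) => {
    params.push(value);
    clauses.push(clause.replace("?", `$${params.length}`));
  };
  if (f.shape?.length) add("shape = ANY(?)", f.shape);
  if (f.caratMin !== undefined) add("carat >= ?", f.caratMin);
  if (f.caratMax !== undefined) add("carat <= ?", f.caratMax);
  if (f.priceMax !== undefined) add("price <= ?", f.priceMax);
  if (f.color?.length) add("color = ANY(?)", f.color);
  const where = clauses.length ? ` WHERE ${clauses.join(" AND ")}` : "";
  return {
    sql: `SELECT id, shape, carat, color, price FROM diamonds${where} ORDER BY price ASC`,
    params,
  };
}
```

Every added filter is another range or array predicate on the same table, which is why these queries punish shared-tenant hardware: the planner needs real memory and real I/O headroom to satisfy arbitrary combinations.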
To work around it I’d built a complex caching layer — shared diamond pools, filter variation pools, rollup caches, invalidation logic, all running against a twelve-gigabyte memory budget. Weeks of design work. Brittle. Only partially effective. It made the codebase significantly harder to reason about. I was building infrastructure on top of infrastructure to compensate for constraints that weren’t mine to solve at the application level.
A Mac Pro with 64GB of RAM running PostgreSQL means my entire dataset — 1.3 million diamond records — fits in memory. No disk reads for hot data, no cold starts, no connection limits I don’t control. The day I made the move, the entire caching layer got deleted. Direct queries are fast enough that none of it was necessary. The codebase got dramatically simpler overnight.
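Getting Postgres to actually use that RAM takes a few settings; the defaults assume a small shared box. These values are reasonable starting points for a dedicated 64 GB machine, not my exact configuration:

```
# postgresql.conf sketch for a dedicated 64 GB server — starting points, tune to taste
shared_buffers = 16GB          # large buffer pool; the box does nothing else
effective_cache_size = 48GB    # tells the planner most data will be in OS cache
work_mem = 64MB                # per-sort/per-hash memory for heavy filter queries
max_connections = 200
random_page_cost = 1.1         # PCIe SSD storage, random reads are cheap
```

With the whole dataset resident in memory, a query that was contending for shared disk I/O becomes a pure in-memory scan plus index work.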
For $250 a machine I’m running production PostgreSQL that outperforms what I was paying significantly more for on managed infrastructure. These old trash cans are genuinely underestimated hardware.
The development environment that made it possible
This is the part I think is most relevant for people here.
I set up Claude Code inside VS Code on my MacBook with two MCP servers — one for GitHub, one for SSH. That gives it direct access to my repositories and persistent terminal sessions into all my servers over the mesh VPN simultaneously.
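The wiring for that is a few entries in Claude Code’s MCP configuration. This is a sketch of the shape of it — the SSH server package name in particular is a hypothetical placeholder, and the token is obviously redacted:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    },
    "ssh": {
      "command": "npx",
      "args": ["-y", "<your-ssh-mcp-server>"]
    }
  }
}
```

Because every machine is already on the Tailscale mesh, the SSH server only ever needs private mesh hostnames; nothing in this setup is reachable from the public internet.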
I wrote a document describing everything I needed — connection pooling, database replication, vector database setup, automated backups, services that restart after reboots. Gave it to Claude Code and it SSHed into the machines and worked through each task. When it hit problems it debugged them. When a session dropped it reconnected and picked up where it left off. It coordinated work across machines in different physical locations.
This is fundamentally different from how the Agent works. The Agent operates on your code inside a sandbox. Claude Code with MCP servers operates on your actual infrastructure. It’s the difference between a coding assistant and an agent that can go do real things in a real environment. I’m getting agentic programming across all my machines simultaneously with full version control through GitHub.
The quality of output is better because the context window is larger, it’s working on actual production systems instead of a sandbox, and I’m not paying per interaction so I don’t hesitate to iterate. That last part matters more than I expected. When every back-and-forth costs money you start self-censoring your development process. You skip the extra attempt, you don’t try the alternative approach, you accept “good enough” because “better” has a price tag. Removing that friction changed how I build.
What Replit does now
Replit is my app host and staging environment. I push to GitHub, Replit syncs, I test in preview, I deploy. That’s the scope and it works well for that. The $20 plan covers everything I need from it.
I don’t regret a single month on the higher tiers. The total spend was a crash course in full-stack systems engineering learned in reverse. I’d frame it as a business writeoff and an education that would have cost far more any other way. This platform gave me the ability to ship a real product as someone who doesn’t come from a traditional development background. That’s not a small thing and I don’t take it for granted.
What I’m building with the headroom
With stable infrastructure I’m focused on what I actually wanted to build.
LightScore — a proprietary optical performance grading system. Every diamond in my catalog has a 360-degree video that’s actually hundreds of downloadable image frames. I’m running trained vision models against those frames to score each stone on brilliance, fire, scintillation, and optical symmetry, calibrated to real-world millimeters using the physical dimensions from each diamond’s grading certificate. The processing pipeline runs across the home Mac Pro for inference and the office Mac Pro for storage, with every step writing state to the database so it can pause and resume without losing work.
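The pause-and-resume behavior comes down to tracking per-frame state in the database and only picking up unfinished work on startup. A minimal sketch, with table and field names that are my assumptions rather than the real schema:

```typescript
// Sketch: resumable frame processing. On startup the worker reads back the
// set of already-scored frames and processes only what's left.
type Frame = { diamondId: string; frameIndex: number };

function frameKey(f: Frame): string {
  return `${f.diamondId}:${f.frameIndex}`;
}

// completedKeys would be loaded from the database on startup.
function pendingFrames(allFrames: Frame[], completedKeys: Set<string>): Frame[] {
  return allFrames.filter((f) => !completedKeys.has(frameKey(f)));
}

// After scoring each frame, the worker persists state with an upsert, e.g.:
//   INSERT INTO frame_scores (diamond_id, frame_index, score, status)
//   VALUES ($1, $2, $3, 'done')
//   ON CONFLICT (diamond_id, frame_index) DO UPDATE
//     SET score = EXCLUDED.score, status = 'done';
```

Because every completed frame is durably recorded before the next one starts, a crash or reboot anywhere in the pipeline loses at most one frame of work.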
Natural language search — replacing traditional filter-based diamond search with vector similarity search. AI embeddings for every diamond, stored in Postgres via the pgvector extension. A customer types “something sparkly, eye clean, not too big, under five thousand” and a local language model parses the intent, matches it against the embeddings, and returns results re-ranked by LightScore. The language model runs locally via Ollama. No API calls, no per-query cost, no data leaving my servers.
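The re-ranking step at the end of that flow is simple to sketch: take the nearest-neighbor hits from pgvector and blend similarity with LightScore. The weighting and field names here are illustrative assumptions, not the production formula:

```typescript
// Sketch: blend vector similarity with LightScore for final ordering.
// Both scores are assumed normalized to [0, 1].
type Hit = { id: string; similarity: number; lightScore: number };

function rerank(hits: Hit[], lightScoreWeight = 0.3): Hit[] {
  const blended = (h: Hit) =>
    (1 - lightScoreWeight) * h.similarity + lightScoreWeight * h.lightScore;
  return [...hits].sort((a, b) => blended(b) - blended(a));
}

// The candidate hits would come from a pgvector nearest-neighbor query, e.g.:
//   SELECT id, 1 - (embedding <=> $1) AS similarity
//   FROM diamond_embeddings
//   ORDER BY embedding <=> $1
//   LIMIT 50;
```

The `<=>` operator is pgvector’s cosine distance; fetching a generous candidate set and re-ranking in the app keeps the database query simple while still letting optical quality influence what the customer sees first.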
Both of these would have been impractical to build on managed infrastructure at the cost tier I was operating at. On hardware I already own they’re just compute time.
For anyone recognizing the pattern
If you’re here building something real and you’re starting to feel the tension — costs climbing, the agent making decisions you’re not tracking, infrastructure you couldn’t explain if someone asked — you’re probably at the same inflection point I was.
The answer isn’t to tough it out and the answer isn’t to abandon the platform. It’s to take a weekend, actually understand what you’ve built, and start owning the parts that make sense to own. Replit is still great at what it’s great at. You just don’t need it to be everything once you’ve outgrown that phase.
The tools exist to make that transition without a systems engineering background. I’m proof of that. The gap is smaller than it looks.



