How I got GDPR compliance, and how self-hosting helped me with a near-million-line (still 30% left) production app

Why I Moved a 950,000-Line Flask App from Replit Deployments to Hetzner

Hey Replit community,

I want to share my journey — not to criticize Replit (I still use it every day for development and love many things about it), but to help others understand when self-hosting makes sense, and what workarounds I tried before making the switch.

What I’m Building

I’m building Bidmio — a full ERP/CRM platform for construction businesses, with modules for projects, invoicing, stock management, HR/payroll, accounting, e-invoicing (Peppol BIS), bank payments (ISO 20022), and more. It’s a Flask app with ~950,000 lines of code across Python, HTML, JavaScript, and CSS: 60+ database models, 40+ blueprints, and 13,500+ translation keys in 5 languages.

This isn’t a weekend project. Real businesses use it every day.

The Cold Start Problem

Replit Deployments create a new container for every deploy. For a small app, that’s fine — maybe 5-10 seconds. For my app:

  • 196 seconds (3+ minutes) from deploy to first page load

  • Gunicorn starts → Python parses ~30,000 lines of data files at import time → loads 60+ SQLAlchemy models → registers 40+ blueprints → runs database migrations and seeding

  • During all of this, users see a loading screen

Every. Single. Deploy. 2-3 Per Day.

And it’s not just deploys — if the container goes idle and Replit recycles it, the next visitor triggers the same 3-minute cold start.

The Workarounds I Built (Before Giving Up)

I didn’t leave immediately. I spent weeks building workarounds:

  1. Custom loading page with status polling — A /startup-status endpoint that returns 503 until the app is ready, with a frontend that polls and shows a loading animation. Users at least saw something instead of a blank error.

  2. Lazy imports — Moved ~30,000 lines of Python data (help center articles, glossary terms) from top-level imports to lazy loading. Cut startup from 196s to about 45s.

  3. Background initialization thread — Database migrations, seeding, and health checks run in a background thread so gunicorn can at least accept connections while the app bootstraps.

  4. Retry logic with exponential backoff — Our database (Neon, US region) sometimes wasn’t reachable when the container started. If the background init failed, it failed permanently. I had to add automatic retries.

  5. Database connection timeouts — Added explicit timeouts so a single Neon cold start wouldn’t block all 32 gunicorn threads for 100+ seconds.

  6. Aggressive query optimization — A context processor running on every page was taking 80-100 seconds due to N+1 queries hitting a cold Neon instance. I had to rewrite it with eager loading and caching.
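Workarounds 1 and 3 above can be sketched together in a few lines of Flask: a background thread does the slow bootstrap while a /startup-status endpoint tells the polling frontend when the app is ready. The endpoint path comes from the post; the init helper is a hypothetical placeholder.

```python
# Sketch of workarounds 1 + 3: serve a 503 from /startup-status until a
# background thread finishes the slow bootstrap (migrations, seeding, ...).
import threading

from flask import Flask, jsonify

app = Flask(__name__)
app_ready = threading.Event()


def initialize_app():
    # Hypothetical placeholder for the real bootstrap work, e.g.:
    # run_migrations(); seed_database(); warm_caches()
    app_ready.set()


@app.route("/startup-status")
def startup_status():
    # The frontend polls this endpoint and shows a loading animation on 503.
    if app_ready.is_set():
        return jsonify(status="ready"), 200
    return jsonify(status="starting"), 503


# Start bootstrapping in the background so gunicorn can accept
# connections immediately instead of blocking on migrations/seeding.
init_thread = threading.Thread(target=initialize_app, daemon=True)
init_thread.start()
```

Note that gunicorn still can’t bind until the module finishes importing, which is why this only pays off once the lazy-import work (workaround 2) keeps import time short.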

All of this engineering effort was just to make deployments survivable — not even fast.
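For reference, the retry logic from workaround 4 amounts to something like this minimal sketch; the parameter defaults are my own illustration, not the app’s actual values:

```python
# Retry a flaky startup step (e.g. first connection to a cold Neon
# instance) with exponential backoff instead of failing permanently.
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Call fn(), retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the real error
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay)
```

Wrapping the background init in a call like `retry_with_backoff(initialize_app)` means one unreachable-database error at container start no longer leaves the app permanently broken.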

The Move to Hetzner

On Hetzner (ca. €30/month for the VPS + ca. €7/month for Object Storage), the deployment model is fundamentally different:



| | Replit Deployments | Hetzner VPS |
|---|---|---|
| Deploy process | Destroy old container, build new one from scratch | Upload new code, restart app process |
| Cold start | 45-196 seconds (after our optimizations) | 0 — server is always running |
| Deploy downtime | 1-3 minutes of loading screen | ~5 seconds (process restart) |
| Idle behavior | Container may be recycled → cold start on next visit | Always on, always warm |
| DB latency | Neon US (cross-Atlantic for EU users) | Neon EU (same region) |
| Monthly cost | Usage-based (can spike) | Fixed, predictable |
| Setup effort | Managed (easy) | More setup, full control |

The key difference: Hetzner uploads new code to a running server. When a user loads a page after deploy, the new code is already there and running. No container build, no cold start, no loading page.

Database: Neon US → Neon EU

I also migrated our primary database (the default Replit-provisioned Neon instance) from Neon US to Neon EU. For European users, this cut database latency dramatically — queries stay within Europe instead of crossing the Atlantic.

But I kept a hybrid setup:

  • Neon EU — Primary database for European users (majority of our customer base)

  • Neon US + original Replit deployment — Still running for certain domains serving users outside Europe

This way, everyone gets the best latency for their location.
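One way such domain-based routing can work is a small lookup from the incoming hostname to a regional connection URL. Everything below is a hypothetical sketch; the hostnames and URLs are made up, not Bidmio’s actual configuration:

```python
# Route each incoming domain to the nearest database region.
# All hostnames and connection URLs here are hypothetical examples.
EU_DATABASE_URL = "postgresql://app@db-eu.example.internal/bidmio"
US_DATABASE_URL = "postgresql://app@db-us.example.internal/bidmio"

# Domains served by the EU deployment; everything else stays on Neon US.
EU_DOMAINS = {"bidmio.example", "app.bidmio.example"}


def database_url_for_host(host: str) -> str:
    """Pick the database region closest to the domain a request came in on."""
    hostname = host.split(":")[0].lower()  # strip any port, normalize case
    return EU_DATABASE_URL if hostname in EU_DOMAINS else US_DATABASE_URL
```

In a Flask app this could feed a per-region SQLAlchemy engine cache keyed on `request.host`, so each deployment talks only to the database in its own region.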

When Replit Deployments Are Perfect

I still think Replit Deployments are great for:

  • Smaller apps (under ~50,000 lines)

  • Prototypes and MVPs

  • Apps where a few seconds of cold start doesn’t matter

  • Teams that want zero DevOps overhead

But once your codebase grows to hundreds of thousands of lines and real users depend on instant availability, you’ll hit the same wall I did.

What I Still Use Replit For

I haven’t (YET :D) left Replit — I still use:

  • Replit IDE for all development

  • Replit AI Agent for building features

  • Replit Deployments for non-EU domains

  • The development workflow is still 100% Replit

Replit is incredible for building. For deploying a near-million-line production app to European users, I needed something different.

Hope this helps anyone facing similar scaling decisions.


Could you talk more about your Hetzner setup?

What was migrating off Replit and onto Hetzner like?

Then maybe how you went about taking a plain Hetzner server and configuring it to host your site?

If possible, could you share detailed walkthroughs or guides? This is the type of thing we need to get started here.

Like you said, I’ll never stop using Replit for quick builds. I agree with it all.

Anyone who has tried to actually scale will know the pain you are talking about, and they’ll know the pain of deep iteration costs with the Replit agent.

I’ve seen a few people now migrate their large projects to more robust infrastructure and I’m trying to catalog and eventually compile all of this into a working document that I give to the community.

I’m doing it for free as an investment in my portfolio, not just for fun.

If you’re not able to, because of time or something I completely understand!

Anything you’re willing to share would be extremely helpful.

I’m praying for your success!

Ha, the timing on this is wild. Didn’t coordinate that at all.

The parallels are real though. I was on the default Neon DB and held off on moving to a self-hosted database, but with me updating gigabytes of data every few days, staying just didn’t make sense — the same shared-tenant latency problems on complex queries against a large dataset. I built months of caching infrastructure to compensate, then deleted the entire thing the day I moved to self-hosted PostgreSQL with 64 GB of RAM, where the full dataset fits in memory. Your 196s-to-45s lazy-import work is the same pattern — solid engineering solving a problem that comes with scale, not bad code.

That’s really what it comes down to. Replit and managed services like Neon are great until your dataset or codebase hits a size where shared infrastructure can’t keep up. For me it was 1.3 million diamond records and search queries with dozens of filter combinations. For you it was 950K lines and 60+ models with a 3-minute cold start. Different apps, same inflection point.

Still on Replit for app hosting — they’re actually working with me on the plan situation, which I appreciate; the actual hosting won’t cost me more than the $20 plan anyway. Everything else is self-managed: database, AI compute, storage, all running on my own hardware behind Tailscale and Cloudflare Tunnel.

Good to see someone else talking about this honestly. 950K lines of production ERP serving real businesses is serious work.