Hello everyone,
There’s additional work one must do to ensure websites appear in search-engine and LLM results, as well as to beef up security in the system. Before I publish, I prompt the agent to run a series of scripts to ensure the latest version of my web apps is ready. This is an ever-evolving pre-deployment checklist, because cybersecurity is an ever-moving target.
To replicate something like this, you can use my message below as the prompt for the agent to make a script for you; just be sure to replace anything that says “Google Firebase” with whatever third-party auth method you’re using, OR remove it entirely if you’re using Replit’s auth method. I’d also recommend telling it what you want the trigger to be for running the script, so it can include that in the plan when it builds the scripts for you. My trigger is:
When I tell you, the agent, to “run the predeployment checklist,” you should create a new and fresh task plan confirming what you will do (what the tasks are, what is and isn’t in scope, and which files, if any, will be impacted).
Here’s an example of the task that thus gets generated by the agent:
#(Task Number - Autogenerated by Agent) - Run Pre-Deployment Verification
What & Why
Execute the two-step pre-deployment process before any release:
- Run the automated script that checks production health, SEO, and crawler visibility.
- Work through the full security checklist against the codebase and report findings.
Both steps must pass (or any FAIL/NEEDS REVIEW items must be explicitly accepted with documented justification) before a release can proceed.

Done looks like
- Step 1 (scripts/predeployment-check.sh) is run and all checks report PASS. If any check fails, the issue is identified and either fixed or documented with justification.
- Step 2 (scripts/predeployment-check-security.sh) is worked through in full. Every numbered item in all 10 sections is evaluated and labeled PASS, FAIL, NEEDS REVIEW, or N/A. Each FAIL or NEEDS REVIEW includes: the finding, affected file/line, severity (Critical / High / Medium / Low), and a concrete remediation.
- A summary of the results is written to .local/predeployment-report.md covering both steps.

Out of scope
- Actually deploying/publishing the app (this is verification only).
- Fixing any security issues found (findings are documented; fixes are separate tasks).

Tasks
- Run Step 1 — automated checks — Execute bash scripts/predeployment-check.sh against production and capture the output. If any check fails, document what failed and why.
- Run Step 2 — security checklist — Work through every numbered item in scripts/predeployment-check-security.sh in order, inspecting the codebase for each check. Report PASS / FAIL / NEEDS REVIEW / N/A for each item, with findings and remediations for anything that isn’t a clean PASS.
- Write the report — Summarize both steps in .local/predeployment-report.md with a clear overall verdict (READY TO DEPLOY or BLOCKED).

Relevant files
- scripts/predeployment-check.sh
- scripts/predeployment-check-security.sh
I. SEO and LLM Crawler
After ensuring my website’s pages are all properly indexed in Google Search Console, I want to make sure that—given any recent changes made in dev (preview)—pre-rendering works, the site is discoverable by LLM/AI crawlers, and it meets standard search-engine indexing requirements. The script runs four curl checks against my sites:
1. SEO metadata on the homepage
Fetches the homepage and checks that the raw HTML contains a <title> tag and an og:description meta tag. This confirms the page isn’t being served as an empty <div id="root"> shell — i.e. the content is pre-rendered server-side so search engines can read it without running JavaScript.
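A minimal sketch of that first check, assuming a bash shell; the function name `check_prerender` is my own, not from the author’s script, and the domain in the usage comment is a placeholder.

```shell
# check_prerender reads raw HTML on stdin and fails unless both a
# <title> tag and an og:description meta tag are present — i.e. the
# page was pre-rendered, not served as an empty JS shell.
check_prerender() {
  local html
  html=$(cat)
  echo "$html" | grep -q "<title>" || { echo "FAIL: no <title> tag"; return 1; }
  echo "$html" | grep -q "og:description" || { echo "FAIL: no og:description"; return 1; }
  echo "PASS: homepage HTML is pre-rendered"
}

# Usage against production (replace example.com with your domain):
# curl -s https://example.com/ | check_prerender
```

An empty `<div id="root">` shell would fail both greps, which is exactly the failure mode this check exists to catch.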
2. /llms.txt returns content
Checks that this file exists, returns HTTP 200, and has more than 100 bytes of content. llms.txt is an emerging convention (similar to robots.txt) that gives AI crawlers and large language models a structured, plain-text summary of what your site is and does. It’s how tools like ChatGPT’s browsing mode or Perplexity can accurately describe my product.
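That check might look something like this (a sketch, not the author’s actual script; `check_llms_txt` is a hypothetical helper and example.com a placeholder):

```shell
# check_llms_txt takes the HTTP status and the body byte count
# (captured from curl) and fails unless the file returned 200
# with more than 100 bytes of content.
check_llms_txt() {
  [ "$1" = "200" ] || { echo "FAIL: /llms.txt returned HTTP $1"; return 1; }
  [ "$2" -gt 100 ] || { echo "FAIL: /llms.txt is only $2 bytes"; return 1; }
  echo "PASS: /llms.txt looks populated"
}

# Usage (replace example.com with your domain):
# status=$(curl -s -o /tmp/llms.txt -w '%{http_code}' https://example.com/llms.txt)
# bytes=$(wc -c < /tmp/llms.txt)
# check_llms_txt "$status" "$bytes"
```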
3. /sitemap.xml is valid
Checks that the sitemap returns HTTP 200 and contains a <urlset> element, which is the required root tag for a valid XML sitemap. Google and other search engines use this to discover and index all pages efficiently.
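A sketch of the sitemap check, under the same assumptions (my own function name, placeholder domain; `curl -sf` handles the HTTP 200 requirement by failing on non-2xx responses):

```shell
# check_sitemap reads XML on stdin and fails unless it contains a
# <urlset> root element — the required root tag for a valid sitemap.
check_sitemap() {
  grep -q "<urlset" || { echo "FAIL: no <urlset> root element"; return 1; }
  echo "PASS: sitemap has a <urlset> root"
}

# Usage (replace example.com with your domain):
# curl -sf https://example.com/sitemap.xml | check_sitemap
```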
4. Googlebot gets pre-rendered content
Re-fetches the homepage but this time spoofing the User-Agent as Googlebot, then counts how many times my title and description appear in the response. It expects each word to appear more than 5 times. This verifies that the pre-rendering layer (whatever serves bots differently from regular users) is actually working — a misconfigured renderer might serve Googlebot the empty JS shell, which would gut my search rankings.
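The bot-rendering check above can be sketched as follows — again an illustration under my own naming, with a placeholder domain and search word; the real script checks the site’s actual title and description words:

```shell
# check_bot_render counts occurrences of a word in the HTML on stdin
# and fails unless it appears more than 5 times — a proxy for "the
# pre-renderer served real content to the bot, not an empty shell."
check_bot_render() {
  local word=$1 count
  count=$(grep -o -i "$word" | wc -l | tr -d ' ')
  [ "$count" -gt 5 ] || { echo "FAIL: '$word' appears only $count times"; return 1; }
  echo "PASS: '$word' appears $count times"
}

# Usage — spoof the Googlebot User-Agent (replace domain and word):
# curl -s -A "Googlebot/2.1 (+http://www.google.com/bot.html)" https://example.com/ \
#   | check_bot_render "myproduct"
```

The `-A` flag is what makes this a distinct check from the first one: it exercises whatever layer serves bots differently from regular users.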
II. Security Check
The automated scanners (Semgrep and HoundDog.ai) cover maybe 25–30% of the script — mostly the OWASP surface in Section 10 and secrets/PII detection. The rest is Firebase-specific auth logic, my app’s own data isolation patterns, GDPR compliance flows, and operational readiness, none of which Semgrep or HoundDog can reason about. The checklist is largely complementary, not duplicative, so you may want to tailor it if you’re using an auth method other than Replit’s, are operating where GDPR matters, or want to harden your own operational readiness.
What Semgrep (SAST — static code patterns) covers:
- It can catch unescaped output going into innerHTML (XSS)
- It can detect hardcoded secrets and API keys in source
- It has rules for missing security headers, SSRF patterns, and SQL injection
- It may flag req.query.api_key patterns (Section 2.6)

What HoundDog (data privacy/PII flow) covers:
- It traces PII flowing into logs, third-party calls, or push notification payloads
- It can catch email/phone/name appearing in places it shouldn’t

What neither tool covers (the bulk of the checklist):
- Firebase-specific business logic — whether your app actually calls verifyIdToken, checks the right claims, or has a revocation flow. Semgrep has no Firebase-specific rules for this.
- Whether a specific userId filter is present on every query in your particular schema — too app-specific for generic SAST.
- The legal consent table being append-only and API-enforced — pure business logic.
- The full GDPR account deletion cascade order and data export compliance — runtime and policy, not static patterns.
- SLOs, alert thresholds, health check behavior — infrastructure/runtime, not static analysis.
- Connection pooling config, CDN setup, background job trigger mechanisms.
Here’s a table of what the security portion of the script checks:
| Section | What it checks |
|---|---|
| 1 – Firebase Auth | Token verification (exp/aud/iss claims), multi-provider linking safety, revocation on deletion |
| 2 – Dual API Auth | Web Bearer vs mobile x-api-key scoping, key rotation policy, hashed key storage, header-only transport |
| 3 – Schema & Mass Assignment | Writable field allowlists, ORM userId scoping on every query |
| 4 – Barcode/SVG | Input validation, SVG XSS escaping, response headers, offer text sanitization |
| 5 – FCM Notifications | No PII/codes in push payloads, token cleanup on deletion, FCM key in secrets |
| 6 – Legal Consent | Append-only consent table, API-layer enforcement, server-side timestamps |
| 7 – GDPR/Data Isolation | IDOR via missing userId filters, full account deletion cascade, data export (Art. 15) |
| 8 – Observability | No PII in logs, alert signals, SLOs, health check endpoint |
| 9 – Replit Hardening | No secrets in source, connection pooling, CDN for static assets, background job triggers |
| 10 – OWASP Pre-flight | Verbose errors, header leakage, rate limiting, SSRF, CSP, CORS, HSTS, no PII in URLs, race conditions |