> From WordPress to Astro
How we ripped jsonld.com out of a 10-year-old WordPress install and rebuilt it as a static Astro site. Same URLs, same SEO, zero PHP.
- size: 7.1K
- path: /from-wordpress-to-astro/
JSONLD.COM has been a WordPress site since 2015 -- GeneratePress child theme, Yoast SEO, a handful of custom PHP. For a reference site where the content barely changes, that's a lot of moving parts: a database, a PHP runtime, plugin updates, security patches, caching layers. All to serve what is fundamentally a folder of static pages full of code examples.
So we ripped it out.
the shape of the old site
62 published URLs. 51 pages. 11 posts. 46 of the pages had JSON-LD snippets embedded as GitHub Gists. Two custom tools -- a JSON-LD generator (v2 was already client-side JavaScript) and a lat/long converter (PHP + jQuery, calling Google Geocoding server-side). Yoast SEO was doing the heavy lifting on page metadata: title tags, meta descriptions, canonical links, a full schema graph per page.
Nothing about it was bad. But none of it needed to be dynamic.
the goal
Every URL on the old site returns the same content on the new site, every SEO tag is preserved, and no user notices a thing except the redesign. URL parity was the non-negotiable line: technical SEOs will audit this site the day it launches, and a broken link is a broken link.
phase 1: recon
Inventoried the OpenLiteSpeed vhost config, the active theme, the DB schema, the custom PHP, the two tools. Surprise find: the JSON-LD generator was already a self-contained ES module -- 1,100 lines of client-side JavaScript dropped into a PHP template. We could lift it into Astro as-is. The lat/long tool was the opposite -- server-side PHP making HTTP calls to Google. That one needed a full rewrite to pure client-side fetch().
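A minimal sketch of what the client-side lat/long lookup might look like, not the code that shipped. It assumes Google's Geocoding JSON endpoint and reads only the `status` and `results[0].geometry.location` fields; the function names and the injectable `Fetcher` type are made up for illustration. A key used in the browser is visible to anyone, so it would normally be restricted by HTTP referrer.

```typescript
// Sketch only: names here are illustrative, not the site's actual code.

interface LatLng {
  lat: number;
  lng: number;
}

// The small slice of the Geocoding API response this sketch reads.
interface GeocodeResponse {
  status: string;
  results: { geometry: { location: LatLng } }[];
}

// Anything that fetches a URL and exposes json(), e.g. (u) => fetch(u) in a browser.
type Fetcher = (url: string) => Promise<{ json(): Promise<unknown> }>;

const GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json";

// Build the request URL; URLSearchParams handles the escaping.
function buildGeocodeUrl(address: string, apiKey: string): string {
  const params = new URLSearchParams({ address, key: apiKey });
  return `${GEOCODE_ENDPOINT}?${params.toString()}`;
}

// Return the first match, or null when the API reports no usable result.
function parseGeocodeResponse(body: GeocodeResponse): LatLng | null {
  if (body.status !== "OK" || body.results.length === 0) return null;
  return body.results[0].geometry.location;
}

// Glue: one HTTP call, no server in between.
async function geocode(
  address: string,
  apiKey: string,
  fetcher: Fetcher,
): Promise<LatLng | null> {
  const response = await fetcher(buildGeocodeUrl(address, apiKey));
  return parseGeocodeResponse((await response.json()) as GeocodeResponse);
}
```

Passing the fetcher in keeps the URL building and response parsing testable without touching the network.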
phase 2: extracting the content
Rather than parse WordPress's database, we wrote a Node script that crawled the live site. For each URL it fetched the rendered HTML, then used cheerio to pull out:
- <title>, meta description, canonical, robots
- Every og:* and twitter:* tag
- Every <script type="application/ld+json"> block (including Yoast's multi-node graph)
- The .entry-content body HTML
The body stayed as HTML, not Markdown -- too much content (Gutenberg divs, Gist <script> tags, custom notice boxes) would've lost fidelity in conversion. Astro's content collections happily accept JSON with a raw HTML body field.
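For illustration, here is one shape such a JSON record could take, with a type guard to check a parsed file before it goes into the collection. The field names (`bodyHtml`, `jsonLd`, `meta`) are assumptions, not the actual schema used on the site.

```typescript
// Hypothetical record shape: one JSON file per crawled URL.
interface PageRecord {
  url: string;
  title: string;
  description: string | null; // null when the old page had no meta description
  canonical: string | null;
  robots: string | null;
  meta: Record<string, string>; // og:* and twitter:* tags, keyed by property name
  jsonLd: string[]; // raw text of every application/ld+json block
  bodyHtml: string; // .entry-content kept as raw HTML, not Markdown
}

function isStringOrNull(value: unknown): value is string | null {
  return value === null || typeof value === "string";
}

// True only when every field is present with the expected type.
function isPageRecord(value: unknown): value is PageRecord {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;

  const meta = r.meta;
  if (typeof meta !== "object" || meta === null || Array.isArray(meta)) return false;
  const metaMap = meta as Record<string, unknown>;
  const metaOk = Object.keys(metaMap).every((k) => typeof metaMap[k] === "string");

  const jsonLd = r.jsonLd;
  const jsonLdOk = Array.isArray(jsonLd) && jsonLd.every((b) => typeof b === "string");

  return (
    typeof r.url === "string" &&
    typeof r.title === "string" &&
    isStringOrNull(r.description) &&
    isStringOrNull(r.canonical) &&
    isStringOrNull(r.robots) &&
    metaOk &&
    jsonLdOk &&
    typeof r.bodyHtml === "string"
  );
}
```

Storing the JSON-LD blocks as raw strings rather than parsed objects is one way to keep them byte-for-byte identical to what the old site emitted.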
Edge case: six pages had hierarchical URLs like /event/concert/ that the old site canonicalized from flat slugs like /concert/. A recursive SQL query against wp_posts pulled the parent/child tree and fed it into the extractor. The Astro catch-all route then emits those pages at the nested paths, and an .htaccess rule 301-redirects the flat slugs.
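The same parent/child walk can be sketched in TypeScript. This is an illustration, not the script we ran: it assumes rows already pulled from wp_posts with the `ID`, `post_name`, and `post_parent` columns renamed to `id`, `slug`, and `parentId`, where a `parentId` of 0 means top-level.

```typescript
// One row per page, mirroring wp_posts (ID, post_name, post_parent).
interface PostRow {
  id: number;
  slug: string;
  parentId: number; // 0 = no parent
}

// Walk each row up to its root and join the slugs into a nested path.
function buildPaths(rows: PostRow[]): Map<number, string> {
  const byId = new Map<number, PostRow>();
  for (const row of rows) byId.set(row.id, row);

  const paths = new Map<number, string>();
  for (const row of rows) {
    const segments: string[] = [];
    const seen = new Set<number>(); // guards against a cyclic parent chain
    let current: PostRow | undefined = row;
    while (current && !seen.has(current.id)) {
      seen.add(current.id);
      segments.unshift(current.slug);
      current = current.parentId === 0 ? undefined : byId.get(current.parentId);
    }
    paths.set(row.id, `/${segments.join("/")}/`);
  }
  return paths;
}

// For every child page, map its flat slug to the nested path (the 301 source and target).
function flatRedirects(rows: PostRow[]): Map<string, string> {
  const paths = buildPaths(rows);
  const redirects = new Map<string, string>();
  for (const row of rows) {
    if (row.parentId === 0) continue; // top-level pages are already flat
    const target = paths.get(row.id);
    if (target) redirects.set(`/${row.slug}/`, target);
  }
  return redirects;
}
```

With an `event` row (id 1, no parent) and a `concert` row (id 2, parent 1), `flatRedirects` maps `/concert/` to `/event/concert/`, which is the pair a redirect rule needs.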
phase 3: astro + tailwind 4
One dynamic route handles all 58 default pages. Home, blog, category archive, generator, lat/long, contact, and a test page each get their own dedicated route. The layout injects GA + GTM + the sitewide Person JSON-LD on every page, so the metadata is owned by code instead of by a plugin.
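Roughly, the catch-all route needs a list of `{ params, props }` objects, one per page. The helper below is a hedged sketch of that mapping, written as a plain function so it runs outside Astro; in a file like `[...slug].astro` its result would be what `getStaticPaths()` returns. `PageEntry` and `dedicatedUrls` are hypothetical names.

```typescript
// Hypothetical minimal entry; the real collection carries more fields.
interface PageEntry {
  url: string; // e.g. "/event/concert/"
  title: string;
  bodyHtml: string;
}

interface StaticPath {
  params: { slug: string };
  props: { entry: PageEntry };
}

// Everything without a dedicated route becomes one entry for the catch-all.
function toStaticPaths(entries: PageEntry[], dedicatedUrls: Set<string>): StaticPath[] {
  return entries
    .filter((entry) => !dedicatedUrls.has(entry.url))
    .map((entry) => ({
      // Rest params take the path without leading or trailing slashes.
      params: { slug: entry.url.replace(/^\/+|\/+$/g, "") },
      props: { entry },
    }));
}
```

Keeping the dedicated URLs in an explicit set means a page can be moved to its own route by adding one entry, without touching the catch-all.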
phase 4: design -- terminal brutalist x retro-futuristic
System monospace stack (no web fonts, zero network cost). Phosphor green + amber accents on deep near-black in dark, warm cream paper in light. CRT scanlines as a subtle CSS overlay. Every page opens with a shell prompt showing the URL as $PWD. The /examples/ grid renders each schema type as an @type identifier. Gist embeds get wrapped in a fake terminal-window chrome via a .gist::before pseudo-element. exit 0 in the footer.
The audience is developers and SEOs. They view-source. So the rendered HTML itself is pretty-printed at build time via prettier, with JSON-LD blocks indented 2 spaces. The source of any page on this site should read like a manual, not like minified output.
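To show just the JSON-LD part of that idea, here is a small stand-in that does not use prettier: it finds each `application/ld+json` block with a regular expression and re-serializes it with `JSON.stringify(..., null, 2)`. The function name is invented, and a regex is a shortcut that is only reasonable because the build output is machine-generated and predictable.

```typescript
// Matches <script type="application/ld+json"> ... </script>, capturing open tag, body, close tag.
const JSON_LD_BLOCK =
  /(<script\b[^>]*type=["']application\/ld\+json["'][^>]*>)([\s\S]*?)(<\/script>)/gi;

// Re-indent every JSON-LD block to 2 spaces; leave anything unparseable untouched.
function prettyPrintJsonLd(html: string): string {
  return html.replace(
    JSON_LD_BLOCK,
    (match: string, open: string, body: string, close: string) => {
      try {
        const formatted = JSON.stringify(JSON.parse(body), null, 2)
          // Keep a "<" inside a string value from ever closing the script tag early.
          .replace(/</g, "\\u003c");
        return `${open}\n${formatted}\n${close}`;
      } catch {
        return match; // not valid JSON: do not risk changing it
      }
    },
  );
}
```

Parsing and re-serializing changes whitespace only; keys, values, and key order come out as they went in.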
phase 5: seo audit before shipping
Wrote a custom audit script that runs two passes: Mode A reads the content collection directly, Mode B crawls the built site. Caught 20 pages missing meta descriptions (pre-existing on WP, not a migration regression) and one title too long. Fixed them in a single patch. Live audit post-deploy: zero errors, zero warnings.
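A stripped-down sketch of the Mode A idea, covering only the two problems mentioned above. The rule names, the `AuditTarget` shape, and the 60-character title limit are all assumptions for illustration; the real script's checks and thresholds are not shown here.

```typescript
// Hypothetical slice of a content-collection record that the audit reads.
interface AuditTarget {
  url: string;
  title: string;
  description: string | null;
}

interface AuditIssue {
  url: string;
  rule: "missing-description" | "title-too-long";
  detail: string;
}

// Assumed limit; pick whatever your own SEO guidelines say.
const MAX_TITLE_LENGTH = 60;

// Check every page and collect issues instead of stopping at the first one.
function auditPages(pages: AuditTarget[], maxTitle: number = MAX_TITLE_LENGTH): AuditIssue[] {
  const issues: AuditIssue[] = [];
  for (const page of pages) {
    if (page.description === null || page.description.trim() === "") {
      issues.push({ url: page.url, rule: "missing-description", detail: "no meta description" });
    }
    if (page.title.length > maxTitle) {
      issues.push({
        url: page.url,
        rule: "title-too-long",
        detail: `${page.title.length} chars, limit ${maxTitle}`,
      });
    }
  }
  return issues;
}
```

A Mode B pass could reuse the same function by building `AuditTarget` objects from the crawled HTML instead of from the collection.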
phase 6: cleanup + go-live
One bash script at scripts/pre-live/go-live-cleanup.sh git rms every WordPress file (21 paths including wp-config.php with the database password), moves the uploads directory out of the WP tree into public/, swaps robots.txt from staging to production, regenerates llms.txt + llms-full.txt, and rebuilds.
Then: mv /var/www/html /var/www/html-wp-old-$(date +%s), cp -r dist /var/www/html, reload OpenLiteSpeed. About 30 seconds of downtime, masked by Cloudflare's edge cache.
results
- Same URLs. 62 live URLs, 100% match against the old Yoast sitemap. Six flat-slug pages 301 to their hierarchical targets.
- Same SEO. Every Yoast-generated JSON-LD graph preserved verbatim. Every canonical, OG tag, Twitter card carried over. GA/GTM untouched.
- Lighthouse 99. Static HTML served from OpenLiteSpeed on the same Vultr box. No CDN origin shield yet -- just static files.
- Smaller. 25MB of source, versus WP's 250MB install plus a 38MB MySQL dump.
- Faster builds. npm run build takes about 23 seconds for 63 pages plus a full site-wide pretty-printer pass.
- Git-native. Every page, every snippet, every metadata change is a diff now. The site has a changelog.
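The URL-parity claim in the list above boils down to a set comparison. A hedged sketch, assuming you already have the old sitemap's URLs and the new build's URLs as two arrays of strings (how they are collected is left out):

```typescript
// Reduce an absolute URL or a bare path to its pathname so both lists compare equally.
function toPath(url: string): string {
  // The base is only used when `url` is a bare path; the host is a placeholder.
  return new URL(url, "https://placeholder.invalid").pathname;
}

interface ParityReport {
  missing: string[]; // in the old sitemap, absent from the new build
  extra: string[]; // in the new build, never in the old sitemap
}

function compareUrlSets(oldUrls: string[], newUrls: string[]): ParityReport {
  const want = new Set(oldUrls.map(toPath));
  const have = new Set(newUrls.map(toPath));
  return {
    missing: Array.from(want).filter((p) => !have.has(p)).sort(),
    extra: Array.from(have).filter((p) => !want.has(p)).sort(),
  };
}
```

A 100% match means `missing` is empty; entries in `extra` are not errors by themselves, just pages the old sitemap never listed.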
images
WordPress scattered uploads across /wp-content/uploads/{year}/{month}/ -- a deep tree with 331 files across a dozen year subdirectories. We flattened the whole thing into /img/ (every filename was already unique across the tree) and 301-redirected the old paths so Google's image index doesn't blink. Future plan is to re-sort by year once the site starts collecting more visual content; for now it's a flat folder and that's fine -- jsonld.com isn't image-heavy, the content is code.
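Flattening is only safe if no two files share a name, so a sketch of the mapping step should check for that rather than assume it. The function below is illustrative (its name and return shape are invented); it takes the old upload paths and returns the old-to-new redirect pairs plus any filenames that collide.

```typescript
interface FlattenResult {
  redirects: Map<string, string>; // old /wp-content/uploads/... path -> new /img/... path
  collisions: string[]; // filenames that appear under more than one old path
}

function flattenUploads(oldPaths: string[], targetDir: string = "/img/"): FlattenResult {
  const redirects = new Map<string, string>();
  const firstSeen = new Map<string, string>(); // filename -> first old path using it
  const collisions = new Set<string>();

  for (const oldPath of oldPaths) {
    const name = oldPath.split("/").pop();
    if (!name) continue; // path ended in "/", nothing to move

    const previous = firstSeen.get(name);
    if (previous !== undefined && previous !== oldPath) {
      collisions.add(name); // two different files would land on the same new path
      continue;
    }
    firstSeen.set(name, oldPath);
    redirects.set(oldPath, `${targetDir}${name}`);
  }

  return { redirects, collisions: Array.from(collisions).sort() };
}
```

An empty `collisions` array is the programmatic version of "every filename was already unique across the tree".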
what's next
The Gist embeds still load from GitHub at runtime. They're the obvious next thing to rip out -- every JSON-LD example becomes a file in src/data/snippets/ and gets rendered inline via Astro's Shiki highlighter. We've got a side-by-side preview showing what that looks like next to the current Gist embed.
After that, the contact form wires to n8n. Eventually the blog posts (currently JSON with HTML bodies) convert to Markdown for easier authoring.
But the core move is done: jsonld.com is now a folder of HTML files, served from the same box that was running a full LAMP stack a few hours ago.