Advanced Proxy

Dynamic rendering for bots using Puppeteer, deployed on a VPS or as a Cloudflare Worker.

When to Use the Proxy

The proxy handles requests that weren’t prerendered at build time:

  • Dynamic routes - Blog posts fetched from a CMS
  • Search results - /search?q=...
  • User-specific pages - Content depends on the visitor, so it can't be prerendered
  • Frequently updated pages - Stock prices, live data

Architecture

Request
   │
   ├─► Bot? ──► Puppeteer ──► Rendered HTML ──► Cache ──► Response
   │
   └─► Human ──► 302 Redirect ──► Live site
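
The flow above can be sketched as a pure decision function. This is a minimal sketch, not the shipped proxy.js: decide(), its return shape, and the shortened bot list are hypothetical. The 405 status for non-GET requests and the use of siteUrl as the redirect target are assumptions.

```javascript
// Shortened list for illustration; the real list is configurable.
const BOTS = ['googlebot', 'bingbot']

// Hypothetical helper: decides what the proxy should do with a request.
function decide({ method, userAgent, path }, siteUrl) {
  // Only GET requests are processed (the 405 status is an assumption).
  if (method !== 'GET') return { action: 'reject', status: 405 }

  const ua = (userAgent || '').toLowerCase()
  const bot = BOTS.some(token => ua.includes(token))

  // Bots get rendered HTML (served from cache when available).
  if (bot) return { action: 'render', path }

  // Humans are redirected to the live site (assumed here to be siteUrl).
  return { action: 'redirect', status: 302, location: new URL(path, siteUrl).href }
}
```

Keeping the decision separate from the HTTP framework makes it easy to reuse in both the Express server and the Worker.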

Bot Detection

The proxy checks the User-Agent header against a list of known crawler tokens, using a case-insensitive substring match:

const BOT_LIST = [
  'googlebot', 'bingbot', 'slurp', 'duckduckbot',
  'baiduspider', 'yandexbot', 'twitterbot',
  'facebookexternalhit', 'discordbot', 'linkedinbot'
]

function isBot(ua) {
  const lower = (ua || '').toLowerCase()
  return BOT_LIST.some(token => lower.includes(token))
}

To change the list, set proxy.botList in ssr.config.js:

// ssr.config.js
export default {
  proxy: {
    botList: ['googlebot', 'bingbot', 'my-custom-bot'],
  }
}
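
One way the configured list could feed the matcher is sketched below. createBotMatcher() is a hypothetical name, and the sketch assumes a configured botList replaces the default list rather than extending it; check proxy.js for the actual behavior.

```javascript
const DEFAULT_BOTS = ['googlebot', 'bingbot', 'slurp', 'duckduckbot']

// Hypothetical factory: builds an isBot() matcher from a config object.
// Assumption: a configured botList replaces the default list entirely.
function createBotMatcher(config = {}) {
  const list = config.proxy?.botList ?? DEFAULT_BOTS
  const tokens = list.map(token => token.toLowerCase())

  return ua => {
    const lower = (ua || '').toLowerCase()
    return tokens.some(token => lower.includes(token))
  }
}

const isBot = createBotMatcher({
  proxy: { botList: ['googlebot', 'bingbot', 'my-custom-bot'] },
})

isBot('My-Custom-Bot/1.0')  // true
isBot('DuckDuckBot/1.1')    // false: not in the custom list
```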

VPS Deployment

1. Install

npm install express puppeteer
cp init/scripts/proxy.js ./scripts/

2. Configure

// ssr.config.js
export default {
  siteUrl: 'https://example.com',
  proxy: {
    url: 'https://proxy.example.com',
    targetUrl: 'https://origin.example.com',
    secret: process.env.PRESTRUCT_SECRET,
    botList: ['googlebot', 'bingbot'],
  }
}

3. Run

export PRESTRUCT_SECRET=your-secret
export PORT=3000
node scripts/proxy.js

4. Process Manager

# Requires pm2 (npm install -g pm2)
pm2 start scripts/proxy.js --name prestruct-proxy

Cloudflare Worker Deployment

Requirements

  • Workers Paid plan
  • @cloudflare/puppeteer

wrangler.toml

# The browser binding is a single table, not an array of tables
[browser]
binding = "BROWSER"

[[kv_namespaces]]
binding = "CACHE"
id = "your-kv-id"

[vars]
PRESTRUCT_TARGET_URL = "https://example.com"

Deploy

wrangler secret put PRESTRUCT_SECRET
wrangler deploy

Caching

Cache Key

The cache key is the SHA-256 hash of the URL path plus the query string.

TTL

  • VPS: 24 hours by default (configurable)
  • Worker: set via the CACHE_TTL_SECONDS variable
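
The expiry rule can be sketched as a small in-memory cache. TtlCache is a hypothetical class for illustration, not the proxy's actual store; the clock is injectable so expiry is easy to test.

```javascript
const DEFAULT_TTL_SECONDS = 24 * 60 * 60  // 24 hours, the VPS default

// Hypothetical in-memory cache with a per-entry time-to-live.
class TtlCache {
  constructor(ttlSeconds = DEFAULT_TTL_SECONDS, now = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000
    this.now = now
    this.entries = new Map()
  }

  set(key, html) {
    this.entries.set(key, { html, storedAt: this.now() })
  }

  get(key) {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(key)  // expired: the caller should re-render
      return undefined
    }
    return entry.html
  }
}
```

On the Worker, Workers KV can expire entries on its own through the expirationTtl option of put(), so a hand-written check like this may not be needed there.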

Cache Busting

To force a re-render of a cached page, send the secret in the x-prestruct-refresh header:

curl -H "x-prestruct-refresh: your-secret" \
  https://proxy.example.com/page/
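
On the server side, the check might look like the sketch below. isRefreshAuthorized() and safeEqual() are hypothetical names, and headers is assumed to be a plain object with lower-cased names (as in Node's req.headers).

```javascript
// Compares two strings without exiting early on the first mismatch.
// Note: the length check still reveals whether the lengths differ.
function safeEqual(a, b) {
  if (a.length !== b.length) return false
  let diff = 0
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i)
  }
  return diff === 0
}

// Hypothetical check: true only when the header carries the configured secret.
function isRefreshAuthorized(headers, secret) {
  const supplied = headers['x-prestruct-refresh']
  // An unset secret never authorizes a refresh.
  if (typeof supplied !== 'string' || !secret) return false
  return safeEqual(supplied, secret)
}
```

Rejecting when the secret is unset avoids the case where an empty header would match an empty PRESTRUCT_SECRET.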

targetUrl vs siteUrl

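siteUrl is the public URL of the site. targetUrl is the origin that Puppeteer actually loads, so it can point at a different host, such as a local dev server or a staging deployment:

proxy: {
  targetUrl: 'http://localhost:5173',  // Local dev
  // or:
  // targetUrl: 'https://staging.example.com',  // Staging
}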

Security

  • Cache refresh requires a header value matching PRESTRUCT_SECRET
  • Only GET requests are processed
  • URLs are validated before being passed to Puppeteer
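
As an illustration of the last point, the sketch below resolves the request path against targetUrl and rejects anything that would leave that origin. resolveTarget() is a hypothetical name and the specific rules are assumptions; the actual validation in proxy.js may differ.

```javascript
// Hypothetical validator: returns the absolute URL to render, or null to reject.
function resolveTarget(requestPath, targetUrl) {
  // Only origin-relative paths are accepted.
  if (typeof requestPath !== 'string' || !requestPath.startsWith('/')) return null

  const base = new URL(targetUrl)
  let url
  try {
    url = new URL(requestPath, base)
  } catch {
    return null  // unparseable input
  }

  // A path such as '//other.example/x' resolves to a different host; reject it.
  if (url.origin !== base.origin) return null
  return url.href
}

resolveTarget('/blog/post?x=1', 'https://origin.example.com')
// 'https://origin.example.com/blog/post?x=1'
resolveTarget('//evil.example/x', 'https://origin.example.com')
// null
```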

Browser Pooling

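The VPS proxy reuses a single shared Chromium instance and relaunches it if it has disconnected:

import puppeteer from 'puppeteer'

// Shared across all requests; null until the first launch
let browser = null

async function getBrowser() {
  // browser.connected is a getter in recent Puppeteer releases
  // (older releases expose browser.isConnected() instead)
  if (!browser || !browser.connected) {
    browser = await puppeteer.launch({
      headless: true,
      // Disables the Chromium sandbox; often needed when running as root or in
      // containers, so only render pages you trust
      args: ['--no-sandbox']
    })
  }
  return browser
}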
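
With the version above, requests that arrive while the first launch is still in progress can each start their own Chromium. A possible variant, sketched here under stated assumptions, shares the in-flight launch promise instead. createBrowserPool() is a hypothetical name, and launch is any async function returning an object with a boolean connected property (for example () => puppeteer.launch({ headless: true })).

```javascript
// Hypothetical factory: returns a getBrowser() that launches at most once at a time.
function createBrowserPool(launch) {
  let pending = null  // in-flight or settled launch promise

  return async function getBrowser() {
    while (true) {
      if (!pending) {
        const attempt = launch()
        pending = attempt
        // A failed launch must not poison the pool for later callers.
        attempt.catch(() => {
          if (pending === attempt) pending = null
        })
      }

      const current = pending
      const browser = await current  // rejects to the caller if launch failed
      if (browser.connected) return browser

      // Dead browser: clear it once, then loop to relaunch.
      if (pending === current) pending = null
    }
  }
}
```

Concurrent callers await the same promise, so a crash or a cold start triggers a single relaunch rather than one per request.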

Troubleshooting

Issue                    Solution
Browser crash            Chromium relaunches automatically on the next request
Navigation timeout       Check that targetUrl is reachable; increase the timeout
Cache not invalidating   Verify the x-prestruct-refresh value matches PRESTRUCT_SECRET
Worker cold starts       Consider a paid plan