ask-charles-api

Serverless backend for the “Ask Charles” chatbot on charlescook.me. Runs as a single Vercel serverless function. It calls Anthropic for chat completions and captures $ai_generation events to PostHog for AI observability.

What it does

Deploy

  1. cd ask-charles-api
  2. npm install
  3. npx vercel link (create a new Vercel project, point at this folder)
  4. Set environment variables in the Vercel dashboard (or via vercel env add):
    • ANTHROPIC_API_KEY — required
    • POSTHOG_API_KEY — required for AI observability
    • POSTHOG_HOST — https://eu.i.posthog.com if you’re on EU, otherwise https://us.i.posthog.com
    • ANTHROPIC_MODEL — optional, defaults to claude-opus-4-7. For cheaper/faster chat use claude-sonnet-4-6.
    • ALLOWED_ORIGINS — optional, comma-separated list of allowed CORS origins. Defaults include charlescook.me and localhost:4000.
  5. npx vercel deploy --prod
  6. Update ask_charles_api in the site’s _config.yml to the deployed URL (e.g. https://ask-charles-api.vercel.app)
  7. Push the site, redeploy Pages, done.
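The environment variables from step 4 could be resolved along these lines. This is a minimal sketch, not the actual code in api/chat.ts: `resolveConfig`, `isOriginAllowed`, and `ChatConfig` are hypothetical names, the exact origin strings (scheme and port) are guesses, and it assumes that an explicit ALLOWED_ORIGINS replaces the defaults and that POSTHOG_HOST falls back to the US host.

```typescript
// Hypothetical config helper; the real api/chat.ts may read these differently.
type Env = Record<string, string | undefined>;

interface ChatConfig {
  anthropicApiKey: string;
  model: string;
  posthogHost: string;
  allowedOrigins: string[];
}

// The default model comes from the list above. The exact origin strings
// (scheme, port) and the US host fallback are assumptions.
const DEFAULT_MODEL = "claude-opus-4-7";
const DEFAULT_POSTHOG_HOST = "https://us.i.posthog.com";
const DEFAULT_ORIGINS = ["https://charlescook.me", "http://localhost:4000"];

function resolveConfig(env: Env): ChatConfig {
  const anthropicApiKey = env.ANTHROPIC_API_KEY;
  if (!anthropicApiKey) {
    throw new Error("ANTHROPIC_API_KEY is required");
  }
  // Split the comma-separated list, trimming whitespace and dropping empties.
  const configured = (env.ALLOWED_ORIGINS ?? "")
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  return {
    anthropicApiKey,
    model: env.ANTHROPIC_MODEL || DEFAULT_MODEL,
    posthogHost: env.POSTHOG_HOST || DEFAULT_POSTHOG_HOST,
    // Assumption: an explicit list replaces the defaults instead of extending them.
    allowedOrigins: configured.length > 0 ? configured : DEFAULT_ORIGINS,
  };
}

// True only when the request's Origin header matches an allowed origin exactly.
function isOriginAllowed(origin: string | undefined, config: ChatConfig): boolean {
  return origin !== undefined && config.allowedOrigins.includes(origin);
}
```

In a handler this would be called once with `process.env`, and a request whose Origin header fails `isOriginAllowed` would be rejected before any Anthropic call is made.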

Local dev

cp .env.example .env
# fill in the keys
npm run dev

Then in the main repo, temporarily set ask_charles_api: http://localhost:3000 in _config.yml, run bundle exec jekyll serve, open /#ask-charles.

Iterating on the prompt

The system prompt is in api/chat.ts (SYSTEM_PROMPT). When you change it, bump SYSTEM_PROMPT_VERSION so PostHog can attribute traces to revisions and you can compare prompt versions side-by-side.
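One way the version tag could travel with each trace is sketched below. The `$ai_*` property names follow PostHog's LLM analytics event schema; everything else is illustrative: `buildGenerationEvent`, `GenerationInfo`, the `system_prompt_version` property name, and the placeholder version value are assumptions, not what api/chat.ts necessarily does.

```typescript
// Placeholder value; bump it whenever SYSTEM_PROMPT changes.
const SYSTEM_PROMPT_VERSION = "v1";

// Hypothetical summary of one completed Anthropic call.
interface GenerationInfo {
  distinctId: string;
  traceId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  latencySeconds: number;
}

// Matches the object shape that posthog-node's capture() accepts.
interface CaptureMessage {
  distinctId: string;
  event: string;
  properties: Record<string, string | number>;
}

function buildGenerationEvent(info: GenerationInfo, promptVersion: string): CaptureMessage {
  return {
    distinctId: info.distinctId,
    event: "$ai_generation",
    properties: {
      $ai_trace_id: info.traceId,
      $ai_provider: "anthropic",
      $ai_model: info.model,
      $ai_input_tokens: info.inputTokens,
      $ai_output_tokens: info.outputTokens,
      $ai_cache_read_input_tokens: info.cacheReadInputTokens,
      $ai_latency: info.latencySeconds,
      // Custom property (name is an assumption) used to group traces by prompt revision.
      system_prompt_version: promptVersion,
    },
  };
}
```

The returned object can be passed to a PostHog client's `capture` call. Filtering or breaking down by the custom property in PostHog then separates traces by prompt revision.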

For a smoother iteration loop, move the prompt to PostHog’s prompt management once you’ve got a baseline you like — that way non-engineers can iterate on the persona without redeploying.

Cost notes

claude-opus-4-7 (default) is the most capable model, but it costs $5/$25 per 1M input/output tokens. For a public chatbot you’ll likely want claude-sonnet-4-6 ($3/$15) or claude-haiku-4-5 ($1/$5). Set ANTHROPIC_MODEL=claude-sonnet-4-6 and redeploy.

The system prompt is large on purpose — both for voice fidelity AND so prompt caching has a meaningful prefix to cache. After the first call, subsequent calls within the 5-minute TTL pay ~10% of the input price on the cached portion. Watch $ai_cache_read_input_tokens in PostHog to verify it’s working.
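To get a feel for what caching saves, the arithmetic can be sketched as follows. The prices are the ones quoted above and the 0.1 multiplier stands in for the "~10%" cache-read rate. The function name, the usage shape, and the token counts in the usage note are hypothetical, and the sketch ignores any surcharge on the first call that writes the cache.

```typescript
interface Pricing {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

// Prices as quoted in the cost notes above.
const PRICING: Record<string, Pricing> = {
  "claude-opus-4-7": { inputPerMTok: 5, outputPerMTok: 25 },
  "claude-sonnet-4-6": { inputPerMTok: 3, outputPerMTok: 15 },
  "claude-haiku-4-5": { inputPerMTok: 1, outputPerMTok: 5 },
};

// Cached input tokens are billed at roughly 10% of the input price.
const CACHE_READ_MULTIPLIER = 0.1;

interface Usage {
  uncachedInputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
}

// Estimated USD cost of one call that reads its prefix from the cache.
function estimateCostUsd(model: string, usage: Usage): number {
  const price = PRICING[model];
  if (!price) throw new Error(`No pricing known for model: ${model}`);
  const uncached = usage.uncachedInputTokens * price.inputPerMTok;
  const cached = usage.cachedInputTokens * price.inputPerMTok * CACHE_READ_MULTIPLIER;
  const output = usage.outputTokens * price.outputPerMTok;
  return (uncached + cached + output) / 1_000_000;
}
```

As an illustration with made-up numbers: a claude-sonnet-4-6 call with a 4,000-token cached system prompt, 200 uncached input tokens, and 300 output tokens comes to about $0.0063, versus about $0.0171 if the same 4,200 input tokens were all billed at the full rate.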

Wiring up the rest of the PostHog AI suite

This backend captures $ai_generation events. To unlock the rest of the LLM Analytics product: