ask-charles-api
Serverless backend for the “Ask Charles” chatbot on charlescook.me.
Runs as a single Vercel serverless function. Talks to Anthropic for chat completion and
captures $ai_generation events to PostHog for AI observability.
What it does
- One endpoint, POST /api/chat, that takes { messages, distinctId, sessionId, traceId }
- Streams the Claude response back as newline-delimited JSON ({type:"delta",text:"..."})
- Captures a $ai_generation event to PostHog with input/output/tokens/latency/cost/cache stats
- Tags every event with prompt_version so you can A/B compare prompt revisions in PostHog
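On the client side, the newline-delimited stream can be consumed roughly like this. This is a minimal sketch, not the site's actual client code: NdjsonBuffer and collectDeltaText are hypothetical names, and only the "delta" line type is taken from this README. Any other line type is skipped.

```typescript
// Shape of one streamed line. Only the "delta" type is documented in this
// README; any other type is treated as unknown and skipped.
type StreamLine = { type: string; text?: string };

// Accumulates raw text chunks and returns the complete JSON lines seen so far.
// Chunk boundaries need not align with newlines, so a partial trailing
// line is buffered until the rest of it arrives.
class NdjsonBuffer {
  private pending = "";

  push(chunk: string): StreamLine[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is "" (chunk ended on a newline) or a partial line.
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as StreamLine);
  }
}

// Joins the text of all delta lines from a sequence of chunks.
function collectDeltaText(chunks: string[]): string {
  const buffer = new NdjsonBuffer();
  let text = "";
  for (const chunk of chunks) {
    for (const parsed of buffer.push(chunk)) {
      if (parsed.type === "delta" && typeof parsed.text === "string") {
        text += parsed.text;
      }
    }
  }
  return text;
}
```

In a browser you would feed push() with decoded chunks from the response body reader and append each delta to the chat bubble as it arrives.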
Deploy
1. cd ask-charles-api
2. npm install
3. npx vercel link (create a new Vercel project, point at this folder)
4. Set environment variables in the Vercel dashboard (or via vercel env add):
   - ANTHROPIC_API_KEY: required
   - POSTHOG_API_KEY: required for AI observability
   - POSTHOG_HOST: https://eu.i.posthog.com if you’re on EU, otherwise https://us.i.posthog.com
   - ANTHROPIC_MODEL: optional, defaults to claude-opus-4-7. For cheaper/faster chat use claude-sonnet-4-6.
   - ALLOWED_ORIGINS: optional, comma-separated list of allowed CORS origins. Defaults include charlescook.me and localhost:4000.
5. npx vercel deploy --prod
6. Update ask_charles_api in the site’s _config.yml to the deployed URL (e.g. https://ask-charles-api.vercel.app)
7. Push the site, redeploy Pages, done.
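For reference, the environment variables above could be read along these lines. This is a sketch under assumptions, not a copy of api/chat.ts: loadConfig and isOriginAllowed are hypothetical helpers, the default origin schemes are guessed, the US host is assumed as the fallback, and ALLOWED_ORIGINS is assumed to replace the defaults rather than extend them.

```typescript
type Env = Record<string, string | undefined>;

interface ApiConfig {
  anthropicApiKey: string;
  posthogApiKey: string | undefined; // only needed for AI observability
  posthogHost: string;
  model: string;
  allowedOrigins: string[];
}

// Assumption: the README only says the defaults include charlescook.me and
// localhost:4000; the exact schemes below are guesses.
const DEFAULT_ALLOWED_ORIGINS = ["https://charlescook.me", "http://localhost:4000"];

// Splits a comma-separated list, trimming whitespace and dropping empties.
function parseOrigins(raw: string | undefined): string[] {
  if (!raw) return DEFAULT_ALLOWED_ORIGINS;
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

function loadConfig(env: Env): ApiConfig {
  const anthropicApiKey = env.ANTHROPIC_API_KEY;
  if (!anthropicApiKey) throw new Error("ANTHROPIC_API_KEY is required");
  return {
    anthropicApiKey,
    posthogApiKey: env.POSTHOG_API_KEY,
    // Assumption: fall back to the US host when POSTHOG_HOST is unset.
    posthogHost: env.POSTHOG_HOST ?? "https://us.i.posthog.com",
    model: env.ANTHROPIC_MODEL ?? "claude-opus-4-7",
    allowedOrigins: parseOrigins(env.ALLOWED_ORIGINS),
  };
}

// True when the request's Origin header matches an allowed origin exactly.
function isOriginAllowed(config: ApiConfig, requestOrigin: string | undefined): boolean {
  return requestOrigin !== undefined && config.allowedOrigins.includes(requestOrigin);
}
```

In the function itself you would call loadConfig(process.env) once at startup, so a missing key fails fast instead of surfacing as a confusing 500 on the first chat request.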
Local dev
cp .env.example .env
# fill in the keys
npm run dev
Then in the main repo, temporarily set ask_charles_api: http://localhost:3000 in
_config.yml, run bundle exec jekyll serve, open /#ask-charles.
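To smoke-test the local endpoint without the site, you can build a request by hand. A minimal sketch: the four body fields come from this README, but the { role, content } message shape and the buildChatRequest helper are assumptions for illustration.

```typescript
// Assumed message shape; the README does not spell out what `messages` holds.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  messages: ChatMessage[];
  distinctId: string;
  sessionId: string;
  traceId: string;
}

// Builds the URL and fetch options for POST /api/chat.
function buildChatRequest(baseUrl: string, body: ChatRequestBody) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

Pass the result to fetch(url, init) against http://localhost:3000 while npm run dev is running, then read the streamed lines from the response body.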
Iterating on the prompt
The system prompt is in api/chat.ts (SYSTEM_PROMPT). When you change it, bump
SYSTEM_PROMPT_VERSION so PostHog can attribute traces to revisions and you can
compare prompt versions side-by-side.
For a smoother iteration loop, move the prompt to PostHog’s prompt management once you’ve got a baseline you like — that way non-engineers can iterate on the persona without redeploying.
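The version bump only helps if the version travels with every event. A sketch of how that tagging might look, with placeholder constant values and a hypothetical buildGenerationProperties helper; $ai_trace_id and $ai_model are PostHog's LLM analytics property names, and prompt_version is the custom property this README describes.

```typescript
// Placeholder values; the real constants live in api/chat.ts.
const SYSTEM_PROMPT = "You are Charles...";
const SYSTEM_PROMPT_VERSION = "v2";

interface GenerationInfo {
  traceId: string;
  model: string;
}

// Builds the properties for one $ai_generation event, always stamping
// the current prompt version so revisions can be compared later.
function buildGenerationProperties(info: GenerationInfo): Record<string, string> {
  return {
    $ai_trace_id: info.traceId,
    $ai_model: info.model,
    prompt_version: SYSTEM_PROMPT_VERSION,
  };
}
```

Keeping the prompt and its version next to each other in the file makes it harder to change one and forget the other.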
Cost notes
claude-opus-4-7 (default) is the most capable but costs $5/$25 per 1M input/output tokens. For a public
chatbot you’ll likely want claude-sonnet-4-6 ($3/$15) or claude-haiku-4-5 ($1/$5).
Set ANTHROPIC_MODEL=claude-sonnet-4-6 and redeploy.
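To get a feel for the difference, here is a rough per-request estimate using the prices quoted above. It is a sketch that reads each pair as input/output price per 1M tokens and ignores caching; estimateCostUsd is a hypothetical helper, not part of the API.

```typescript
// Prices in USD per 1M tokens, as quoted in this README.
const PRICES_PER_MILLION_USD = new Map<string, { input: number; output: number }>([
  ["claude-opus-4-7", { input: 5, output: 25 }],
  ["claude-sonnet-4-6", { input: 3, output: 15 }],
  ["claude-haiku-4-5", { input: 1, output: 5 }],
]);

// Estimates the cost of one request, ignoring prompt caching.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES_PER_MILLION_USD.get(model);
  if (!price) throw new Error(`No price listed for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}
```

For a request with 2,000 input tokens and 500 output tokens, this gives about $0.0225 on claude-opus-4-7 and about $0.0135 on claude-sonnet-4-6.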
The system prompt is large on purpose — both for voice fidelity AND so prompt
caching has a meaningful prefix to cache. After the first call, subsequent calls
within the 5-minute TTL pay ~10% of the input price on the cached portion. Watch
$ai_cache_read_input_tokens in PostHog to verify it’s working.
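For illustration, this is roughly how a system prompt is marked cacheable in the Anthropic Messages API, plus the arithmetic behind the ~10% figure. Whether api/chat.ts builds the request exactly this way is an assumption; buildCachedSystem and inputCostUsd are hypothetical helpers, and the 0.1 multiplier is the approximate cache-read rate mentioned above.

```typescript
// Wraps the prompt in a system content block with a cache breakpoint, so the
// prefix up to and including this block can be served from the cache.
function buildCachedSystem(prompt: string) {
  return [
    { type: "text" as const, text: prompt, cache_control: { type: "ephemeral" as const } },
  ];
}

// Input cost for one call where part of the prompt was read from the cache.
// cacheReadMultiplier defaults to 0.1, i.e. ~10% of the normal input price.
function inputCostUsd(
  pricePerMillionUsd: number,
  uncachedTokens: number,
  cacheReadTokens: number,
  cacheReadMultiplier = 0.1,
): number {
  const billableTokens = uncachedTokens + cacheReadTokens * cacheReadMultiplier;
  return (billableTokens * pricePerMillionUsd) / 1_000_000;
}
```

With a 4,000-token cached prompt and 200 fresh tokens at $3 per 1M, the input cost comes to about $0.0018 instead of $0.0126 for the same call without a cache hit.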
Wiring up the rest of the PostHog AI suite
This backend captures $ai_generation events. To unlock the rest of the LLM
Analytics product:
- Sentiment classification: works automatically once you have events flowing. Open the Traces or Generations tab in PostHog.
- Evaluations: set up LLM-as-judge evals in PostHog → LLM analytics → Evaluations. Good first eval: “does this sound like Charles?” with the style rules as criteria.
- Playground: open any generation from Traces in the playground to iterate on the prompt without changing code.
- Prompt management: replace the hardcoded SYSTEM_PROMPT with a fetched prompt from PostHog. Then non-eng can iterate.
- Session replay link: already wired. $ai_session_id flows from the client (sessionStorage) and distinctId matches the PostHog distinct_id, so traces auto-link to replays once session recording is enabled site-wide.
- A/B test prompt versions: wrap the system prompt choice in a PostHog feature flag and split traffic between two SYSTEM_PROMPT_VERSION values.
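The A/B split in the last item could be sketched like this. The variant keys, versions, and prompt text are all placeholders, and pickPromptVariant is a hypothetical helper. The flag value itself would come from PostHog, for example from posthog-node's getFeatureFlag(key, distinctId), which is left out here so the selection logic stays testable on its own.

```typescript
interface PromptVariant {
  version: string; // sent as prompt_version on every event
  prompt: string;
}

// Placeholder variants keyed by feature-flag variant name.
const CONTROL_VARIANT: PromptVariant = { version: "v1", prompt: "You are Charles..." };
const PROMPT_VARIANTS = new Map<string, PromptVariant>([
  ["control", CONTROL_VARIANT],
  ["test", { version: "v2", prompt: "You are Charles, and you keep answers short..." }],
]);

// Maps a multivariate flag value to a prompt variant. Anything unexpected
// (flag disabled, flag not loaded, unknown variant) falls back to control.
function pickPromptVariant(flagValue: string | boolean | undefined): PromptVariant {
  if (typeof flagValue !== "string") return CONTROL_VARIANT;
  return PROMPT_VARIANTS.get(flagValue) ?? CONTROL_VARIANT;
}
```

Falling back to control means a PostHog outage or a misnamed variant degrades to the baseline prompt rather than breaking the chat.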