AI-Assisted Career Curation
Collecting fifteen years of career artifacts, curating them through a bronze-silver-gold pipeline, catching AI errors before they became public, and deploying the result as a live interview chatbot.
The Problem
I had fifteen years of career artifacts scattered across formats — annual reviews, accomplishments docs, presentations, old resumes, bios, project descriptions. Over forty files in total. Some lived in Google Docs, some in Confluence, some in folders I hadn't opened in years. Individually, each one captured a moment. Together, they told a story I'd never actually assembled.
The obvious move: feed everything to an AI and let it synthesize a portfolio. That doesn't work. Not because the AI can't write — it writes fluently, confidently, and fast. The problem is that confidence doesn't correlate with accuracy. An LLM will blend attributions across companies, hallucinate specific tools you never used, and present fabricated details with the same tone as verified facts. When the output represents your professional reputation, "close enough" isn't acceptable.
The real question wasn't whether AI could help — it was how to design a process that captured AI's speed while maintaining the accuracy standards of content that would be public, permanent, and attached to my name. That's a data quality problem, not a writing problem.
The Platform
Before curating content, I needed somewhere to put it. I scaffolded the entire site infrastructure with Claude Code in a single session — a Turborepo monorepo running four Next.js 15 apps with a shared Tailwind v4 theme, an MDX content pipeline using next-mdx-remote and gray-matter, and automated deploys through Vercel. The architecture decisions that might have taken a few evenings of reading docs happened in the same conversation as the scaffold.
The most important piece of infrastructure wasn't the framework — it was CLAUDE.md, a living document at the repo root that serves as persistent context across AI sessions. It captures architecture conventions, process guidelines, source trust rules, and hard-won lessons. Without it, every new session with Claude starts from zero. With it, the AI picks up where the last session left off. That pattern — designing for context preservation — became a theme of the entire project.
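A sketch of what such a file can look like, with section headings drawn from the categories above; the bullet contents here are illustrative summaries of points made in this case study, not quotes from the actual file:

```markdown
# CLAUDE.md: persistent context for AI sessions

## Architecture conventions
- Turborepo monorepo; four Next.js apps share one Tailwind v4 theme package.
- Content flows through the MDX pipeline (next-mdx-remote + gray-matter).

## Process guidelines
- Silver curation always runs: identify bronze sources, clarifying Q&A,
  confidentiality check, curated draft, human review.

## Source trust rules
- Treat AI-synthesized reports as a first draft: trust structure,
  verify specifics.
- The consolidated resume is a bronze source, not ground truth.

## Lessons learned
- A hallucinated tool from one synthesis propagated into two silver
  files before review caught it. Verify tool names against Q&A answers.
```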
The Medallion Model: Bronze, Silver, Gold
I borrowed a concept from data engineering: the medallion architecture. Raw data enters at bronze, gets cleaned and structured at silver, and reaches publication quality at gold. Each tier has different rules, different quality gates, and different people responsible.
Bronze is raw capture. No editing, no curation, no quality judgment. I collected everything — forty-plus files including annual self-reviews, accomplishments documents, presentation decks, a consolidated resume, Q&A transcripts from clarifying sessions with Claude, and an AI-synthesized report from Atlassian's Rovo agent that crawled my Confluence history. The rule at bronze: capture everything, trust nothing.
Silver is themed narrative. Instead of organizing chronologically (which is how resumes work but not how portfolio content is consumed), I organized by theme: AI platform work, team transformation, product mindset, engineering practices, coaching philosophy, financial impact. Six themed files total, each curated through a structured process with Claude Code and verified by me. Silver is where AI errors get caught — or don't.
Gold is publish-ready. The case studies on this site, the resume page, the content you're reading right now. Gold has the strictest gates: metric verification, confidentiality review, and the question that matters most — would I stand behind this sentence in a job interview?
The per-theme process followed the same pattern each time: identify relevant bronze sources, run a clarifying Q&A session to fill gaps, check for confidentiality issues, write the curated narrative, then human review. That last step is where the model earns its keep — or reveals its limits.
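The tiers and their gates can be sketched as a small promotion checklist. The field names and gate logic below are a minimal illustration of the rules described above, not code from the repo:

```typescript
// Medallion tiers as a promotion checklist (illustrative field names).
type Tier = "bronze" | "silver" | "gold";

interface Artifact {
  tier: Tier;
  humanReviewed: boolean;        // required to leave bronze
  metricsVerified: boolean;      // required to reach gold
  confidentialityCleared: boolean; // required to reach gold
}

// An artifact advances one tier at a time, and only when the gates
// for the next tier are satisfied.
function canPromote(a: Artifact): boolean {
  switch (a.tier) {
    case "bronze":
      // Bronze -> silver: the curated narrative must pass human review.
      return a.humanReviewed;
    case "silver":
      // Silver -> gold adds metric verification and confidentiality review.
      return a.humanReviewed && a.metricsVerified && a.confidentialityCleared;
    case "gold":
      // Gold is terminal: nothing to promote to.
      return false;
  }
}
```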
Source Trust Hierarchy
Not all source material is equally reliable, and AI doesn't know the difference. I established a four-tier hierarchy:
- My direct Q&A answers — verbatim, self-correcting in real time. Most reliable.
- My authored documents — annual reviews, accomplishments docs. Reliable, but written for a specific audience rather than for comprehensive accuracy.
- The consolidated resume — a bronze source, not ground truth. Mine had at least one factual error that survived years of use.
- AI-synthesized reports — useful for gap analysis, dangerous for specifics. The Rovo synthesis introduced a hallucinated tool. It appeared authoritative. It was wrong.
The principle: treat AI synthesis like a junior analyst's first draft. Trust the structure, verify the specifics.
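That hierarchy can be encoded directly, so that when two sources disagree on a fact, the conflict resolves toward the more trusted one. The tier labels here are shorthand for the list above; the ranking function is a sketch:

```typescript
// Four-tier source trust hierarchy. Lower index = higher trust.
const TRUST_ORDER = [
  "qa_answer",    // verbatim Q&A answers: most reliable
  "authored_doc", // annual reviews, accomplishments docs
  "resume",       // a bronze source, not ground truth
  "ai_synthesis", // useful for structure, dangerous for specifics
] as const;

type SourceType = (typeof TRUST_ORDER)[number];

// When two sources disagree on a fact, prefer the one that appears
// earlier in the hierarchy.
function moreTrusted(a: SourceType, b: SourceType): SourceType {
  return TRUST_ORDER.indexOf(a) <= TRUST_ORDER.indexOf(b) ? a : b;
}
```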
Errors Caught
Every error below was caught by human review — not by the AI self-correcting, not by a second AI pass, not by automated checks.
The hallucination that propagated. The Rovo/Confluence synthesis listed Micrometer as part of my observability stack. It's not — we use Datadog, Portkey, Opik, and Pub/Sub into BigQuery. But the AI had no reason to doubt an authoritative-looking source, so the hallucination appeared in two separate silver files before I caught it during review.
The right fact, wrong company. A silver draft attributed my remote-first meeting policy to Porch. I established that practice at Brinks, pre-COVID. Porch was already fully remote when I joined — there was no policy to establish. This is the sneakiest category of AI error: it reads correctly, sounds plausible, and only fails if you were actually there.
Two more: a silver file placed part of my global team in Europe (a team that never existed — traced back to an error in my own resume, not an AI invention), and another described a framework as adopted in production when my team had only facilitated another team's use of it. "Facilitated" and "adopted" are different claims.
Four errors, one pass. That's the argument for keeping humans in the loop — not because AI is bad at writing, but because AI is bad at knowing when it's wrong.
The Conversation Is an Artifact
The most valuable content in this project didn't come from any document. It came from Q&A exchanges — my verbatim answers to Claude's clarifying questions during silver curation. When Claude asked about the decision process for building Prediction Hub vs. buying VertexAI, my answer included vendor lock-in concerns, observability gaps, and internal politics I'd never written down. That answer, captured in a bronze Q&A file, became the foundation for a case study section. Without intentional capture, it would have evaporated when the conversation window closed.
AI conversations are ephemeral by default. I designed for preservation at three levels: Q&A transcripts in bronze, devlog entries during working sessions, and CLAUDE.md as persistent context across sessions. Together they ensure that working with AI leaves a trail, not just an output.
From Pipeline to Product
A curation pipeline only proves its value if the output actually gets used. The interview chatbot on the homepage is that proof — an inline panel where recruiters can ask questions about my experience, grounded in the verified artifacts from the pipeline.
The chatbot's system prompt is the pipeline's output: the resume, case studies, and silver narratives — roughly 35,000 tokens of curated, verified content loaded at build time. No RAG needed; the full context fits in GPT-4o-mini's window with room to spare. The model answers from verified facts rather than generating from its training data, which is exactly the trust boundary the pipeline was designed to enforce.
Deploying it required the same thinking as any production AI system, even at personal-project scale. Rate limiting to prevent abuse. Input validation to reject oversized or malformed messages. Prompt injection defense in the system prompt. Security headers on the endpoint. The scale is small; the checklist is the same one you'd run for an enterprise deployment.
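As one example from that checklist, here is a minimal in-memory rate limiter of the kind the endpoint needs. The window size, request cap, and function names are assumptions for illustration, not the deployed values:

```typescript
// Sliding-window rate limiter keyed by client IP (illustrative limits).
const WINDOW_MS = 60_000;  // 1-minute window
const MAX_REQUESTS = 10;   // per IP per window
const hits = new Map<string, number[]>();

function allowRequest(ip: string, now = Date.now()): boolean {
  // Keep only timestamps still inside the window.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_REQUESTS) {
    hits.set(ip, recent);
    return false; // over the limit: reject
  }
  recent.push(now);
  hits.set(ip, recent);
  return true;
}
```

An in-memory map is only a sketch; a serverless deployment would need shared storage for the counters, since each instance keeps its own memory.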
The chatbot also surfaced a validation bug on day two — the input length check was applying to all messages in the conversation history, including assistant responses. Since the assistant regularly generates responses exceeding 500 characters, the second question in any conversation would fail. A reminder that AI systems need the same testing discipline as any other production code, and that "it works on the first message" is not sufficient test coverage.
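The bug and its fix can be sketched like this. The 500-character limit comes from the incident above; the message shape and function names are illustrative:

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

const MAX_USER_CHARS = 500;

// Buggy version: validates every message in the history, so any
// assistant reply longer than 500 characters fails the whole request
// from the second question onward.
function validateBuggy(messages: ChatMessage[]): boolean {
  return messages.every((m) => m.content.length <= MAX_USER_CHARS);
}

// Fixed version: the length limit applies only to user input;
// assistant responses are the model's own output and routinely run long.
function validateFixed(messages: ChatMessage[]): boolean {
  return messages
    .filter((m) => m.role === "user")
    .every((m) => m.content.length <= MAX_USER_CHARS);
}
```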
The Meta-Layer
This project has a recursive quality worth naming. The infrastructure was scaffolded with Claude Code. The content was curated through a pipeline designed collaboratively with Claude. The documentation was written in working sessions with Claude. The interview chatbot was built and hardened with Claude. And this case study — about using AI to curate a career portfolio and deploy it as an interactive product — is itself evidence of the skill it describes.
That's not clever circularity. It's the point. Knowing how to work with AI — when to trust output, when to verify, how to design quality gates, how to preserve context, how to take curated content and put it into production — is a distinct professional skill. It's the same set of instincts that govern production AI systems: observability (can you see what the AI did?), trust boundaries (which sources can you rely on?), human-in-the-loop design (where does automation stop and judgment begin?), security posture (what happens when someone tries to break it?). The scale is different. The thinking is the same.
The medallion model, the source trust hierarchy, the four errors caught, the chatbot deployment — none of these required building novel software. They required treating AI collaboration as a process design problem and following through to production. That's the skill this case study demonstrates, and it's the skill that matters most as AI becomes a standard part of how knowledge work gets done.