AI engineer building reliable AI products.

Founding engineer building AI products across voice agents, evaluation systems, semantic search, and multi-tenant SaaS. I care about the part where models become reliable products. Currently leading engineering at HyrecruitAI.

03 · Engineering notes

Systems worth reading into

Writing archive

A tight set of technical essays and case-study notes on search, recommendations, self-hosted systems, and relevance loops, showing how I think through products end to end.

Hybrid search without hand-waving

How I would structure Azure AI Search for real products: BM25, vector retrieval, semantic ranker, filters, scoring profiles, and evaluation loops that keep relevance honest.

Read the search blueprint

The home ARR stack

A practical map of a self-hosted media automation stack: Jellyfin, Jellyseerr, Sonarr, Radarr, indexers, download clients, naming rules, and failure boundaries.

Read the stack notes

Flash: talent search at scale

A case-study-style breakdown of Flash, a talent search and recommendation platform built around candidate signals, recruiter workflows, ranking, and scale.

Open the case study

Ranking loops that learn

Why search quality is not just a model choice. The product needs feedback capture, judgment labels, exploration, offline evals, and a way to improve without surprising users.

Read the relevance essay

04 · Writing

Selected writing

Full archive

Posts on the parts of shipping AI that don't fit in a tweet — guardrails, voice pipelines, semantic caching, LLM evals, embeddings, and the multi-tenant SaaS plumbing.

05 · Now

Current focus

Building real-time intelligent systems day to day: voice agents that hold a coherent interview turn by turn with response latency under 800 ms, LLM evaluation engines with grounded rubrics, and the guardrail layers that keep them safe in production.

This quarter the focus is evaluation infrastructure: moving past single-prompt scoring into multi-pass rubric pipelines, then measuring the consistency gap against human raters. In parallel, I am rebuilding the semantic caching layer to cut LLM inference costs without degrading quality.

On the side, I am writing more about the unglamorous parts of shipping AI: guardrails, embedding precision, voice latency budgets, and the multi-tenant SaaS plumbing that lets these systems run for thousands of paying users. New posts land on the blog most weeks.

Reading: Designing Data-Intensive Applications (re-read, this time with vector databases in mind) and the OpenAI Realtime API release notes — voice latency is the next ceiling worth pushing.

Open to senior AI / product engineering conversations, deep technical collaborations, talks, and writing opportunities.

06 · Talk shop

Open to senior AI / product engineer roles.

Strongest fit: teams building real-time agents, LLM evaluation systems, AI guardrails, or developer-facing AI infrastructure where product quality and systems judgment both matter.