Technical deep dives
Last updated: April 5, 2026
This section is for people who want to understand the system at an engineering level. How does vector search find relevant chunks when the query has no keyword overlap with the rulebook? Why two tiers instead of one? How does the pipeline handle ten languages without ten separate models? Each article focuses on one technical dimension and can be read independently.
Architecture and search
System architecture — The service topology: what runs where, how services communicate, what each one does.
Vector search & embeddings — Why semantic search works where keyword search fails, and the role of 768-dimensional vectors.
How semantic search works: finding meaning, not keywords — A deeper look at embeddings, similarity, and why HNSW indexing matters at this scale.
The RAG pipeline in detail — Retrieval-augmented generation broken down: retrieval, ranking, context assembly, synthesis.
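As a rough illustration of the idea behind the search articles above, the sketch below ranks chunks by cosine similarity between vectors rather than by shared words. Everything in it is an assumption made for illustration: the three-dimensional toy vectors stand in for the 768-dimensional embeddings, the chunk texts and the query are invented, and the brute-force scan stands in for the HNSW index, which the sketch does not implement.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors):
    """Score every chunk against the query and sort best-first (brute force)."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in chunk_vectors.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy embeddings: a real model would produce these vectors,
# and they would have 768 dimensions rather than 3.
chunks = {
    "A player may trade resources on their turn.": [0.9, 0.1, 0.0],
    "Shuffle the deck before dealing.": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the query "Can I swap goods with someone?",
# which shares no keywords with either chunk.
query_vector = [0.8, 0.2, 0.1]

ranked = rank_chunks(query_vector, chunks)
best_chunk = ranked[0][1]
```

With these toy values the trading rule ranks first even though the query and the chunk have no words in common, because their vectors point in nearly the same direction. An approximate index such as HNSW exists to avoid scoring every chunk the way `rank_chunks` does here.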
Answer generation
The two-tier answer system — Why most questions get a fast Tier 1 answer and some get escalated to Tier 2, and how the decision is made.
Inside the prompt architecture — How synthesis prompts are composed: game facts injection, question category routing, source formatting.
Autonomous quality optimisation — The nightly process that evaluates recent answers with an independent judge model and proposes improvements to the prompt templates.
Multilingual and limitations
10 languages, one pipeline — The single-pipeline approach to multilingual support and its trade-offs.
What the system can't see: limits of PDF extraction — Honest account of what Tika can and cannot extract from a rulebook PDF.
Who this is for
You don't need to be a software engineer to read these, but they assume you are comfortable with technical vocabulary. If you want the user-facing view, start with How it works instead.