Coordinating Multiple AI Models
I established tri-model routing on Vandoko in late February 2026. It is now standard across all my active projects. The basic idea is simple: different models are better at different things. The execution took some work to get right.
The Model Assignments
Claude Opus 4.6 handles orchestration, multi-step implementation, architecture decisions, and testing. It is the primary model for most development work. When a task requires holding a lot of context across multiple files and making judgment calls about structure, Claude is the right tool.
Codex (gpt-5.3-codex) handles backend analysis, architecture review, and debugging. It is particularly good at reading a backend codebase and identifying where something went wrong. I invoke it through the MCP integration or the Codex CLI depending on context.
Gemini 2.5 Pro handles frontend UI and UX design, shadcn registry research, and visual polish decisions. There is an important constraint here: the Gemini 2.5 Pro MCP server is broken, returning empty content due to a thinking-token bug. I always use the CLI for Gemini Pro: gemini -m gemini-2.5-pro -p "prompt". Gemini 2.5 Flash works through MCP and handles quick lookups.
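The Pro-via-CLI constraint is easy to encode in a thin wrapper so it never gets forgotten mid-session. This is a sketch under my own assumptions: the function names are mine, and it only shows the shape of the CLI invocation, not the MCP path that Flash uses.

```python
import subprocess

# The Gemini 2.5 Pro MCP server returns empty content (thinking-token
# bug), so Pro requests must go through the CLI. Flash is fine over MCP.

def gemini_pro_cmd(prompt: str) -> list[str]:
    """Build the CLI invocation for Gemini 2.5 Pro (never MCP)."""
    return ["gemini", "-m", "gemini-2.5-pro", "-p", prompt]

def run_gemini_pro(prompt: str) -> str:
    # Shells out to the gemini CLI; requires it on PATH.
    result = subprocess.run(
        gemini_pro_cmd(prompt), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Keeping the command construction in one place means the "never Pro through MCP" rule lives in code rather than in my memory.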
How Context Stays Consistent
All three models share the same Obsidian vault. Each project has an ai-memory.md file that any model can read and write. Cross-project patterns live in a Knowledge/ directory. Attribution is tracked through a last-updated-by frontmatter field.
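For illustration, an ai-memory.md file opens with frontmatter along these lines. The last-updated-by field is the real attribution mechanism described above; the other field and the body text are placeholders I am inventing for the example, not the actual schema.

```markdown
---
project: vandoko
last-updated-by: codex
---

Findings, decisions, and open questions from each session go below
the frontmatter, where any of the three models can read them.
```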
This means when Codex does a debugging session on the NestJS backend and documents a finding, Claude can read that finding in the next session without me summarizing it manually. The vault is the shared memory layer.
The CLI scripts in Tool Bag handle environment setup. start-session.ps1 loads the right MCP profile for the project. handoff.ps1 generates a JSON package that can be passed to any model to pick up where the last session left off.
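A handoff package along these lines would carry enough for any model to resume. To be clear, the field names and values here are illustrative guesses at the shape, not the real schema that handoff.ps1 emits.

```json
{
  "project": "vandoko",
  "previous_model": "claude-opus",
  "summary": "Auth guard fixed; refresh-token tests still failing",
  "open_items": ["refresh token expiry test", "rate limit config"],
  "memory_file": "Projects/vandoko/ai-memory.md"
}
```

The useful property is that the package points back into the vault rather than duplicating it, so the receiving model reads current state instead of a stale summary.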
When to Use Which
The routing rules emerged from making the wrong call and noticing the cost.
Sending a complex multi-file refactor to Gemini Flash is slow and produces worse results than Claude Opus. Sending a "what does this API response structure look like" question to Claude Opus is overkill when Gemini Flash handles it in seconds for a fraction of the cost.
Frontend UI decisions that involve Tailwind class choices, component animation timing, or visual hierarchy are better with Gemini Pro. It thinks about these problems differently than Claude does, and the outputs look better. Backend schema design, service layer architecture, or debugging a type error across a monorepo goes to Claude or Codex.
The hard rule: never use Gemini 2.5 Pro through MCP. Always the CLI. This took one broken session to learn, and I documented it immediately.
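The rules above reduce to a small lookup table. A minimal sketch, where the task labels and model identifiers are my shorthand for this post, not anything the tooling actually reads:

```python
# Routing table for the tri-model setup. Unknown task types default to
# the primary model, since heavy reasoning work is the common case.
ROUTES = {
    "multi_file_refactor": "claude-opus",
    "architecture": "claude-opus",
    "backend_debugging": "codex",
    "frontend_ui": "gemini-2.5-pro",    # CLI only, never MCP
    "quick_lookup": "gemini-2.5-flash", # MCP works fine here
}

def route(task_type: str) -> str:
    """Pick a model for a task, defaulting to the orchestrator."""
    return ROUTES.get(task_type, "claude-opus")
```

The default matters: when I am unsure, falling back to Claude costs more but fails less, which matches the "wrong call, notice the cost" pattern that produced these rules.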
Tool Bag as Infrastructure
The routing setup lives in Tool Bag, not in any individual project. MCP profiles define which servers each project loads. The vandoko.json profile loads different servers than the martech.json profile. Each profile is a JSON file that maps server names to their configurations.
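A profile file might look roughly like this. The server names, commands, and config keys below are invented for illustration; the real profiles map whatever servers each project actually needs.

```json
{
  "name": "vandoko",
  "servers": {
    "obsidian-vault": { "command": "node", "args": ["vault-server.js"] },
    "gemini-flash": { "command": "npx", "args": ["gemini-mcp"] }
  }
}
```

Because every project resolves its servers through a profile like this, swapping or adding a server is one edit in Tool Bag rather than one edit per repo.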
This separation keeps project repos clean and makes it easy to add a new MCP server to all projects at once by updating the profile in Tool Bag.
The benchmark work I did in March 2026 ran against the skills in Tool Bag. After fixing the P0 and P1 issues, the portfolio of 41 active skills scored an 88.7 average across six dimensions. That infrastructure pays back across every project that uses it.