Engineering

Lessons from Our Routing Layer

Not all tasks belong to the same model. How we built a dispatcher that picks the right one.

Emily Osei

Parley

Senior Technical Writer


Modern AI systems rarely rely on a single model anymore. In production environments, the real constraint is not intelligence but efficiency: cost, latency, reliability, and predictability. This is why most serious systems evolve toward multi-model architectures.

We learned this the hard way while building a routing layer that dynamically assigns tasks across a fleet of models. The intuition was simple: small models should handle simple tasks, large models should handle complex reasoning, and everything in between should be optimized for cost-performance tradeoffs.

In practice, the system quickly became more nuanced.

Some tasks that look “hard” are actually format-constrained and are solved better by smaller models. Other tasks that look “simple” require contextual reasoning that only larger models can reliably handle. This breaks the naive assumption that complexity maps directly to model size.

Our routing layer evolved into a decision system that considers:

  • input complexity signals

  • required output structure

  • latency budget

  • historical success rates per model

  • cost sensitivity per request type
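To make the list above concrete, here is a minimal sketch of how those five signals might combine into one decision. It is an illustration under stated assumptions, not our production router: the names `ModelProfile`, `Request`, and `route`, the 0 to 1 scales, the scoring formula, and the two example models are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelProfile:
    """Hypothetical description of one model in the fleet."""
    name: str
    cost_per_call: float       # relative cost units (assumed)
    p95_latency_ms: float      # observed tail latency
    max_complexity: float      # highest complexity score (0..1) we trust it with
    structured_output: bool    # reliably follows strict output formats
    success_rate: dict = field(default_factory=dict)  # request type -> 0..1


@dataclass
class Request:
    """Hypothetical routing signals extracted from an incoming task."""
    request_type: str
    complexity: float          # input complexity signal, 0..1
    needs_structure: bool      # required output structure
    latency_budget_ms: float   # latency budget
    cost_sensitivity: float    # 0..1, how strongly cost counts against a model


def route(request: Request, models: list) -> ModelProfile:
    """Filter on hard constraints, then rank by success rate minus cost."""
    eligible = [
        m for m in models
        if m.p95_latency_ms <= request.latency_budget_ms
        and m.max_complexity >= request.complexity
        and (m.structured_output or not request.needs_structure)
    ]
    if not eligible:
        # Assumed fallback: no model fits, so use the most capable one.
        return max(models, key=lambda m: m.max_complexity)

    max_cost = max(m.cost_per_call for m in models)

    def score(m: ModelProfile) -> float:
        # Unknown request types get a neutral prior of 0.5 (an assumption).
        success = m.success_rate.get(request.request_type, 0.5)
        return success - request.cost_sensitivity * (m.cost_per_call / max_cost)

    return max(eligible, key=score)


# Two made-up models for illustration only.
SMALL = ModelProfile("small", 1.0, 200.0, 0.4, True,
                     {"extract": 0.95, "analysis": 0.60})
LARGE = ModelProfile("large", 10.0, 1500.0, 1.0, True,
                     {"extract": 0.96, "analysis": 0.92})

choice = route(Request("extract", 0.2, True, 500.0, 0.5), [SMALL, LARGE])
print(choice.name)
```

Hard constraints (latency, complexity ceiling, output structure) act as filters, while historical success and cost act as a ranking. That split keeps a cheap model from winning on score alone when it cannot meet the request's requirements.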

One of the most important realizations was that routing errors are more expensive than weak model outputs. A slightly worse answer from the right model is almost always better than a strong answer from the wrong model, for example one that arrives after the latency budget has expired or ignores the required output structure.

Over time, we also discovered that routing is not static. It is a learning system. We continuously re-train routing heuristics based on observed failures, drift in model behavior, and changes in cost-performance curves across providers.
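One simple way to keep per-model success rates current is an exponential moving average, which weights recent outcomes more heavily so that drift in model behavior shows up quickly. The sketch below uses that technique as a stand-in; it is not a description of how our heuristics are actually retrained, and `SuccessTracker`, `alpha`, and `prior` are hypothetical names and parameters.

```python
class SuccessTracker:
    """Tracks a success rate per (model, request type) pair.

    Uses an exponential moving average: an assumed technique chosen
    for illustration, where alpha sets how fast old outcomes fade.
    """

    def __init__(self, alpha: float = 0.1, prior: float = 0.5):
        self.alpha = alpha    # weight given to the newest observation
        self.prior = prior    # starting estimate for unseen pairs
        self._rates = {}

    def record(self, model: str, request_type: str, succeeded: bool) -> None:
        """Fold one observed outcome into the running estimate."""
        key = (model, request_type)
        old = self._rates.get(key, self.prior)
        outcome = 1.0 if succeeded else 0.0
        self._rates[key] = (1 - self.alpha) * old + self.alpha * outcome

    def rate(self, model: str, request_type: str) -> float:
        """Return the current estimate, or the prior if nothing was recorded."""
        return self._rates.get((model, request_type), self.prior)


tracker = SuccessTracker(alpha=0.5, prior=0.5)
tracker.record("small", "extract", True)    # 0.5 -> 0.75
tracker.record("small", "extract", False)   # 0.75 -> 0.375
print(tracker.rate("small", "extract"))
```

A larger `alpha` reacts faster to a model regression but is noisier; a smaller one is steadier but slower to notice drift. The estimates from a tracker like this could feed the historical success signal that the router consults.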

Ultimately, the system stops behaving like “choosing a model” and starts behaving like “coordinating intelligence.” That shift is where most of the real gains come from.

Legal operations software for modern law firms. Bring in clients, run matters, and grow your practice, all in one place.
