5 FHIR Terminology Servers That Handle $expand at Production Scale

$expand is the operation that separates polished demos from production-ready terminology servers. Expanding a small intent-of-use ValueSet looks easy. Expanding a million-concept SNOMED CT ValueSet under burst load is where a FHIR terminology server either holds up or quietly times out. Picking the right server for that workload is one of the harder calls in a modern healthcare stack.

Below are five FHIR terminology servers known to handle $expand cleanly at production scale. For context, see the complete guide to FHIR terminology services for modern healthcare; the deeper FHIR walkthroughs for practitioners cover more on the operational side.

What Production Scale Means Here

For this article, production scale means the FHIR terminology server can:

Expand a 100k-concept ValueSet under realistic concurrency without spiking memory.
Serve hundreds of $expand calls per second for smaller ValueSets with consistent p99 latency.
Recover from a code system version bump without a long re-index window.
Return paginated results that clients can stream rather than buffer.

That is a higher bar than a default, untuned deployment usually clears.
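The last criterion, paginated results that a client can stream, can be sketched from the client side. This is a minimal sketch, not any particular server's client library. It relies on the standard `count` and `offset` parameters of $expand and on `expansion.total` in the response. The `fetch_page` hook, the in-memory `make_fake_server` stand-in, and the example.org URLs are assumptions made for illustration, and the loop assumes a flat (non-nested) expansion.

```python
from typing import Callable, Dict, Iterator

# Hypothetical transport hook: takes $expand query parameters and returns the
# parsed JSON ValueSet. In real use it would wrap an HTTP GET to
# [base]/ValueSet/$expand.
FetchPage = Callable[[Dict[str, str]], dict]


def iter_expansion(fetch_page: FetchPage, value_set_url: str,
                   page_size: int = 1000) -> Iterator[dict]:
    """Yield expansion.contains entries page by page instead of buffering."""
    offset = 0
    while True:
        params = {
            "url": value_set_url,
            "count": str(page_size),   # always pin the page size
            "offset": str(offset),
        }
        expansion = fetch_page(params).get("expansion", {})
        page = expansion.get("contains", [])
        yield from page
        offset += len(page)
        total = expansion.get("total")
        # Stop on an empty page, or once the reported total has been reached.
        if not page or (total is not None and offset >= total):
            break


def make_fake_server(n_concepts: int) -> FetchPage:
    """In-memory stand-in for a terminology server (illustration only)."""
    concepts = [{"system": "http://example.org/cs", "code": f"c{i}"}
                for i in range(n_concepts)]

    def fetch_page(params: Dict[str, str]) -> dict:
        offset, count = int(params["offset"]), int(params["count"])
        return {
            "resourceType": "ValueSet",
            "expansion": {
                "total": n_concepts,
                "offset": offset,
                "contains": concepts[offset:offset + count],
            },
        }

    return fetch_page
```

Because `iter_expansion` is a generator, the caller holds at most one page in memory at a time. A real client would also want a timeout and a cap on the number of pages in case a server ignores `offset`.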

The 5 Servers Worth Considering

Snowstorm. Built around SNOMED CT and tuned for large $expand operations. The pagination behavior is one of the better implementations in the open-source community.

Ontoserver. Mature commercial server with strong telemetry for spotting $expand slowness and good defaults for the high-cardinality case.

Aidbox Terminology. Server-side $expand that benefits from sitting next to the FHIR store. Handles repeated expansions of the same ValueSet efficiently with a built-in cache.

HAPI Terminology with a tuned cache layer. The base server is reliable; the cache is what makes the production-scale $expand case work.

Firely Server Terminology. Performs well on profile-driven ValueSets and handles the version-pinning case cleanly, which matters when multiple code system versions are live.
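Version pinning, mentioned in the last entry, is something a client can request explicitly regardless of which server is chosen. The sketch below builds an $expand URL using the standard `system-version` parameter, whose value has the form `system|version`. The base URL, the ValueSet URL, and the SNOMED CT version URI are illustrative assumptions, not values from any of the servers above.

```python
from urllib.parse import urlencode

SNOMED = "http://snomed.info/sct"


def expand_url(base: str, value_set_url: str, system: str, version: str,
               count: int = 1000, offset: int = 0) -> str:
    """Build a GET URL for ValueSet/$expand pinned to one code system version."""
    query = urlencode({
        "url": value_set_url,
        # system-version takes "system|version"; urlencode escapes the pipe.
        "system-version": f"{system}|{version}",
        "count": count,
        "offset": offset,
    })
    return f"{base.rstrip('/')}/ValueSet/$expand?{query}"


# Example values are assumptions chosen for illustration.
url = expand_url(
    "https://tx.example.org/fhir/",
    "http://example.org/fhir/ValueSet/demo",
    SNOMED,
    "http://snomed.info/sct/900000000000207008/version/20240101",
    count=500,
)
```

Pinning the version in the request makes the result reproducible while more than one code system version is loaded, instead of depending on whichever version the server treats as its default.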

For a focused comparison of two of these, HAPI terminology vs Snowstorm walks through the tradeoffs in production stacks.

Where $expand Tends to Break

A few patterns that often cause $expand outages:

Implicit pagination. The client asks for a million-concept ValueSet and the server tries to return it in one response. Always paginate; always pin the page size.
Stale cache after a release. The cache still holds the previous version's expansion, so requests against the new version are either served stale results or miss the cache entirely. Bust the cache on every version bump.
Subsumption-heavy ValueSets. ValueSets that walk the SNOMED CT hierarchy can be expensive; cache aggressively and consider precomputing the result.
Cold-start latency. The first expansion after a server restart is slow. Warm the cache as part of the rollout.
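The stale-cache and cold-start patterns can both be addressed by a cache that includes the code system version in its key and can be warmed before taking traffic. The following is a minimal application-level sketch under that assumption, not the built-in cache of any server listed above. `ExpansionCache` and the `expand` callable it wraps are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ExpansionCache:
    """Cache expansions keyed by (ValueSet URL, code system version)."""

    def __init__(self, expand: Callable[[str, str], List[str]]):
        # `expand` is a hypothetical function that calls the terminology
        # server and returns the expanded codes for one ValueSet and version.
        self._expand = expand
        self._store: Dict[Tuple[str, str], List[str]] = {}

    def get(self, value_set_url: str, system_version: str) -> List[str]:
        key = (value_set_url, system_version)
        if key not in self._store:
            # A new version produces a new key, so the old version's
            # expansion can never be returned for it.
            self._store[key] = self._expand(value_set_url, system_version)
        return self._store[key]

    def warm(self, value_set_urls: Iterable[str], system_version: str) -> None:
        """Pre-populate the cache, for example during rollout."""
        for url in value_set_urls:
            self.get(url, system_version)

    def evict_other_versions(self, current_version: str) -> int:
        """Drop entries for every version except the current one."""
        stale = [key for key in self._store if key[1] != current_version]
        for key in stale:
            del self._store[key]
        return len(stale)
```

Keying on the version makes a version bump a cache miss rather than a stale hit, and `warm` moves the slow first expansion out of the request path. Eviction then only reclaims memory and is no longer needed for correctness.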

A Production Test That Stays Honest

A short test that exposes most $expand problems in an afternoon:

Pick the largest ValueSet the system needs to expand.
Run a load test with realistic concurrency and watch p50, p95, p99.
Restart the server mid-test and watch what happens to the latency curve.
Bump the code system version and rerun the test.
Verify the paginated response is correct on the last page, not just the first.

A FHIR terminology server that survives all five checks is a strong candidate for production. One that fails the restart or version-bump tests usually needs additional operational work before it is safe to put in front of real clinical workflows.
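Two of these checks are easy to get subtly wrong by hand: the latency percentiles and the last-page verification. The helpers below are a sketch of both, assuming latencies have already been collected in milliseconds by whatever load tool is in use and that each page has been reduced to a list of codes. The function names are illustrative, and the percentile uses the nearest-rank method, which is one of several common definitions.

```python
import math
from typing import Dict, List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the p50, p95 and p99 latencies for one test run."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}


def check_pages(pages: List[List[str]], expected_total: int) -> List[str]:
    """Return the problems found in a paginated expansion; empty means OK."""
    problems = []
    codes = [code for page in pages for code in page]
    if len(codes) != expected_total:
        problems.append(
            f"expected {expected_total} codes, got {len(codes)}")
    if len(set(codes)) != len(codes):
        problems.append("duplicate codes across pages")
    if pages and not pages[-1]:
        problems.append("last page is empty")
    return problems
```

Running `summarize` separately on the samples before and after the restart or the version bump makes the change in the latency curve visible as numbers. `check_pages` catches a short or duplicated final page, which the first page alone would not reveal.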

$expand is the operation that punishes the wrong terminology server choice. Pick a server that has been tested at the relevant scale and the rest of the system stops being shaped around terminology limitations.
