A leaderboard sounds like a simple feature. Sort players by score. Display the top N. Show each player their rank. The implementation takes an afternoon for small player counts. At 1 million concurrent users — all potentially playing, scoring, and checking their rank simultaneously — the naive implementation collapses under read load within minutes.
Here's how you architect leaderboards for real scale, and why each decision matters.
The read/write imbalance
The first thing to understand about leaderboard traffic: reads outnumber writes by roughly 50:1 in typical games. Players check their rank constantly. Score updates happen at match end. In a game with 1M concurrent users, you might get 100,000 score write operations per minute but 5,000,000+ rank read operations per minute. The architecture needs to be optimized for reads, not writes.
Sorted sets in an in-memory data store are the standard approach for the hot tier. Operations for score update, rank query, and range retrieval are all O(log N) — fast even at large player counts. A single node can comfortably handle the write throughput of 100,000 score updates per minute. The read problem requires different thinking.
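To make the sorted-set operations concrete, here is a toy stand-in written with Python's stdlib `bisect` module. The class and method names are illustrative (they mirror Redis's ZADD/ZREVRANK/ZREVRANGE semantics but are not Redis); a production store uses a skip list so that insertion is also O(log N), whereas a Python list insert shifts elements in O(N).

```python
import bisect

class Leaderboard:
    """Toy sorted set: ranks sort best-score-first, ties break by player id.
    Illustrative sketch only -- Redis uses a skip list internally."""

    def __init__(self):
        self._score = {}     # player_id -> current score
        self._entries = []   # sorted list of (-score, player_id)

    def update_score(self, player_id, score):
        # Analogous to ZADD: replace any old entry, then insert the new one.
        old = self._score.get(player_id)
        if old is not None:
            i = bisect.bisect_left(self._entries, (-old, player_id))
            self._entries.pop(i)  # O(N) shift here; a skip list avoids this
        self._score[player_id] = score
        bisect.insort(self._entries, (-score, player_id))

    def rank(self, player_id):
        # Analogous to ZREVRANK (1-based): a single O(log N) search.
        s = self._score[player_id]
        return bisect.bisect_left(self._entries, (-s, player_id)) + 1

    def top(self, n):
        # Analogous to ZREVRANGE 0 n-1 WITHSCORES.
        return [(pid, -neg) for neg, pid in self._entries[:n]]
```

The same three operations (score update, rank query, range retrieval) are all the primitives the rest of the architecture needs.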
Why you can't just cache the leaderboard
The obvious optimization: cache the full leaderboard. Serve reads from cache. Invalidate on score changes. This works until you realize that at 1M concurrent, score changes are constant. If you invalidate the cache on every score update, and there are 100,000 score updates per minute, your cache is effectively never warm. You're doing full leaderboard reads from the sorted set on every request.
The solution is time-bounded consistency. The global leaderboard doesn't need to be real-time accurate to the second. A 30-second-stale global leaderboard is completely acceptable to players — they're not refreshing the top-10,000 list every 5 seconds expecting it to change. Pre-compute and cache the global leaderboard on a 30-second refresh cycle. Serve all global leaderboard reads from cache. Update the cache in the background.
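A minimal sketch of that time-bounded cache, with the compute function and clock injected so the refresh behavior is visible (both parameter names are illustrative, not from any specific library). In production the recompute would run in a background job rather than inline on a read.

```python
import time

class CachedGlobalBoard:
    """Serve a pre-computed snapshot; recompute only when it is older
    than refresh_s. Sketch only -- production refreshes in the background."""

    def __init__(self, compute_fn, refresh_s=30.0, clock=time.monotonic):
        self._compute = compute_fn    # builds the full top-N list
        self._refresh_s = refresh_s
        self._clock = clock
        self._snapshot = None
        self._stamp = float("-inf")   # force a compute on first read

    def top(self):
        now = self._clock()
        if now - self._stamp >= self._refresh_s:
            self._snapshot = self._compute()  # inline here; background in prod
            self._stamp = now
        return self._snapshot
```

Every global-leaderboard read between refreshes is a dictionary lookup, not a range scan, which is what turns 5M+ reads per minute into a solved problem.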
Player rank reads are different. Players do expect their own rank to be relatively current. Handle personal rank queries against the live sorted set, not the pre-computed cache. The query is a single O(log N) rank lookup — fast on its own. And because each player queries their own rank, that read volume is spread across player IDs rather than concentrated on the top of the leaderboard.
Segmented leaderboards and why they help
Global leaderboards are impressive in theory. In practice, most players care far more about their rank within a meaningful comparison group — their friends, their region, their bracket. Segmented leaderboards serve this need and are dramatically cheaper to compute.
A regional leaderboard for APAC has maybe 200,000 entries instead of 1,000,000. Pre-computation takes a fraction of the time. The sorted set is smaller, queries are faster, cache warm time is shorter. Friends leaderboards are smaller still — typically 50–200 entries — and can be computed on-demand per player without pre-computation.
The architecture that works at scale: maintain global score data in a primary sorted set. Pre-compute regional and bracket leaderboards from this on configurable refresh cycles (more frequent for smaller segments). Serve friends leaderboards as on-demand queries against the global set with a player-ID filter. Cache everything except personal rank queries.
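The two cheap segment types can be sketched as plain filters over the global score data. Assuming the global data is available as a player-to-score map with a separate region tag per player (both structures are illustrative):

```python
def friends_leaderboard(scores, player_id, friend_ids):
    """On-demand friends board: filter the global score map to the friend
    group and sort. With 50-200 entries this is cheap enough that no
    pre-computation is needed. Sketch only; names are illustrative."""
    group = {player_id} | set(friend_ids)
    return sorted(((pid, scores[pid]) for pid in group if pid in scores),
                  key=lambda e: (-e[1], e[0]))

def regional_leaderboard(scores, regions, region):
    """Pre-computable segment: same score data, filtered by region tag,
    refreshed on its own cycle rather than per request."""
    return sorted(((pid, s) for pid, s in scores.items()
                   if regions.get(pid) == region),
                  key=lambda e: (-e[1], e[0]))
```

The difference between the two is only where the cost lands: friends boards are computed per request, regional boards per refresh cycle.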
Handling score events at scale
At 100,000 score write operations per minute, writes need to be handled asynchronously. If game servers write scores directly to your leaderboard service at match end, a peak event (large tournament, seasonal event) can create a write spike that your leaderboard service can't absorb synchronously.
Score updates should flow through an event queue. Game servers publish score events. Leaderboard workers consume from the queue and apply updates to the sorted set. Workers can scale horizontally — add more workers to drain the queue faster. The sorted set write is the bottleneck, but at 100,000 updates/minute spread across multiple worker threads, a reasonably sized deployment handles it without issue.
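A minimal sketch of the consumer side, using Python's stdlib `queue.Queue` and threads as a stand-in for a durable queue (Kafka, SQS, or similar) and a worker pool. The function and parameter names are illustrative; the point is that workers drain independently and scale by count.

```python
import queue
import threading

def run_score_workers(events, apply_update, n_workers=4):
    """Drain queued score events with a pool of workers, applying each
    event to the sorted set via apply_update. Sketch only -- a real
    deployment consumes from a durable, long-lived queue."""
    q = queue.Queue()
    for ev in events:
        q.put(ev)

    def worker():
        while True:
            try:
                player_id, score = q.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            apply_update(player_id, score)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return q.qsize()  # remaining queue depth; 0 means fully drained
```

The returned queue depth is exactly the metric the next section says to alert on: a depth that grows instead of draining means you need more workers.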
Queue depth is your early warning signal. If your queue depth is growing instead of draining at peak, you're underprovisioned on leaderboard workers. Alert on this before it becomes player-visible latency on rank updates.
Season resets and the thundering herd
Leaderboard resets — end of season, weekly resets, event completions — are your highest-risk operations. At reset time, every active player wants to check their final rank and their new starting rank simultaneously. You can see 10–20x normal read traffic in a 5-minute window around reset events.
Handle resets with a two-phase approach: archive the completed leaderboard (write to cold storage as a point-in-time snapshot), then atomically swap in a fresh empty sorted set. The archive operation is slow — do it before the swap, not simultaneously. The swap itself is fast. Players reading during the swap get either the old leaderboard or the new empty one — both are valid states.
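The two phases can be sketched as follows. Here `archive_store` is a dict standing in for cold storage (object storage in production), and the class shape is illustrative; the essential property is that the slow snapshot write completes before the swap, and the swap itself is a single reference change.

```python
import json

class LeaderboardService:
    """Two-phase season reset sketch. The board dict stands in for the
    live sorted set; archive_store stands in for cold storage."""

    def __init__(self):
        self.board = {}  # player_id -> score

    def reset_season(self, archive_store, season_id):
        # Phase 1: point-in-time snapshot, written to cold storage.
        # This is the slow part, and it finishes before the swap.
        snapshot = dict(self.board)
        archive_store[season_id] = json.dumps(snapshot)
        # Phase 2: atomic swap to a fresh empty board. Readers see either
        # the old board or the new empty one -- both are valid states.
        self.board = {}
```

In a Redis deployment the swap is typically done by pointing reads at a new key (e.g. renaming or switching the active key name) rather than reassigning a reference, but the ordering guarantee is the same.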
Pre-warm your cache for the post-reset state before announcing the reset in-game. If you notify players of the reset via in-game message and they rush to check leaderboards, you want the cache already populated for the new season, not being computed under peak read load.
What "real-time" actually means for leaderboards
Players don't need millisecond-accurate global rankings. They need their own rank to update within a reasonable window after a score change, and they need the global leaderboard to reflect real competition. 30-second global refresh cycles and sub-5-second personal rank update latency cover 95% of player expectations without requiring an architecture that fights against the physics of distributed data systems.
Leaderboard infrastructure that handles tournament day
GameStack's leaderboard layer supports real-time score ingestion, segmented rankings, and season reset operations. Built for the spike, not the average.
Explore the platform