Shard Sizing: The Single Most Important Elasticsearch Decision
Too few shards and you can't scale out; too many and the cluster state eats your heap. The middle is narrower than you think.
A shard is a Lucene index. Every shard costs heap (for mappings, field data, segment metadata) and every shard is a unit of replication. Getting the count right is the single highest-impact decision in running Elasticsearch.
The numbers that matter
- Target shard size: 10–50 GiB for search-heavy, up to 65 GiB for log ingest.
- Shards per node: budget 20 shards per GiB of heap (so a 32 GiB heap node handles ~600 shards comfortably).
- Heap per node: cap at 31 GiB — above that you lose compressed ordinary object pointers.
Too many shards
Cluster state is held in memory on every node. Ten thousand shards means tens of thousands of mapping entries replicated around the cluster — master nodes spend their time gossiping instead of working. Symptoms: slow index creation, long master election, memory pressure with tiny data volume.
Too few shards
You can't parallelise. Since shards are the unit of search, a 2 TiB index in 4 shards gives you 4-way fan-out max; even a 50-node cluster will only use 4 of those nodes for that query.
ILM and rollover — the modern answer
Stop pre-sizing a giant index. Use Index Lifecycle Management with a rollover at, say, 50 GiB or 30 days:
PUT _ilm/policy/logs
{
"policy": {
"phases": {
"hot": { "actions": { "rollover": { "max_size": "50gb", "max_age": "30d" } } },
"warm": { "min_age": "30d", "actions": { "forcemerge": { "max_num_segments": 1 } } },
"cold": { "min_age": "90d", "actions": { "freeze": {} } },
"delete": { "min_age": "365d", "actions": { "delete": {} } }
}
}
}
You write to an alias, ES rotates underlying indices at the size/age you pick, old tiers migrate to cheaper nodes. Shard count emerges from the policy instead of being guessed up front.
Do you even need replicas?
In a log-analytics cluster with reliable source ingestion, replicas mostly exist for availability. Running the hot tier at number_of_replicas=1 and warm/cold at 0 is a common way to halve storage spend without changing durability guarantees.