Shard Sizing: The Single Most Important Elasticsearch Decision

Too few shards and you can't scale out; too many and the cluster state eats your heap. The middle is narrower than you think.

August 18, 20258 min readElasticsearchOperations

A shard is a Lucene index. Every shard costs heap (for mappings, field data, segment metadata) and every shard is a unit of replication. Getting the count right is the single highest-impact decision in running Elasticsearch.

The numbers that matter

Target shard size: 10–50 GiB for search-heavy, up to 65 GiB for log ingest.
Shards per node: budget 20 shards per GiB of heap (so a 32 GiB heap node handles ~600 shards comfortably).
Heap per node: cap at 31 GiB — above that you lose compressed ordinary object pointers.

Too many shards

Cluster state is held in memory on every node. Ten thousand shards means tens of thousands of mapping entries replicated around the cluster — master nodes spend their time gossiping instead of working. Symptoms: slow index creation, long master election, memory pressure with tiny data volume.

Too few shards

You can't parallelise. Since shards are the unit of search, a 2 TiB index in 4 shards gives you 4-way fan-out max; even a 50-node cluster will only use 4 of those nodes for that query.

ILM and rollover — the modern answer

Stop pre-sizing a giant index. Use Index Lifecycle Management with a rollover at, say, 50 GiB or 30 days:

PUT _ilm/policy/logs
{
  "policy": {
    "phases": {
      "hot":    { "actions": { "rollover": { "max_size": "50gb", "max_age": "30d" } } },
      "warm":   { "min_age": "30d", "actions": { "forcemerge": { "max_num_segments": 1 } } },
      "cold":   { "min_age": "90d", "actions": { "freeze": {} } },
      "delete": { "min_age": "365d", "actions": { "delete": {} } }
    }
  }
}

You write to an alias, ES rotates underlying indices at the size/age you pick, old tiers migrate to cheaper nodes. Shard count emerges from the policy instead of being guessed up front.

Do you even need replicas?

In a log-analytics cluster with reliable source ingestion, replicas mostly exist for availability. Running the hot tier at number_of_replicas=1 and warm/cold at 0 is a common way to halve storage spend without changing durability guarantees.

Shard Sizing: The Single Most Important Elasticsearch Decision

The numbers that matter

Too many shards

Too few shards

ILM and rollover — the modern answer

Do you even need replicas?

6 replies// weighed in

More from this topic

Inverted Index 101: How Elasticsearch Actually Finds Things

Mapping Explosion: Why Your Cluster Is Eating RAM for Breakfast

Scoring & Relevance: BM25, Boosting, and Rescoring That Works