
Kafka Streams vs Flink vs Plain Consumer: Choosing the Right Layer

Three legitimate ways to process a Kafka topic — which one fits your shape of problem.

October 24, 2025 · 9 min read · Kafka · Streaming

Every Kafka-based pipeline eventually faces the same fork: should I use a plain KafkaConsumer, Kafka Streams, or a dedicated engine like Flink? The honest answer is that each is a different product for a different class of problem.

Plain consumer

A pool of threads, each running its own KafkaConsumer instance (the client is not thread-safe). You control every byte. Good when:

  • The work is stateless or the state lives in an external store.
  • You want a single language (Java/Go/Python) and zero new infrastructure.
  • Latency budget is tight and you don't want the overhead of a DSL.

You'll be the one implementing retries, DLQs, graceful shutdown, and windowing. That's fine if those rarely change.
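
As a rough illustration of the plumbing a plain consumer leaves to you, here is a minimal sketch of bounded retries with a dead-letter fallback. It deliberately leaves out the Kafka client: in a real service `handle` would be called for each record returned by `KafkaConsumer.poll`, and the dead-letter sink would typically be a producer writing to a DLQ topic. `RetryingHandler`, `maxAttempts`, and the two callbacks are hypothetical names, not part of any Kafka API.

```java
import java.util.function.Consumer;

// Hypothetical helper: retries a record a fixed number of times, then
// hands it to a dead-letter sink instead of blocking the partition.
final class RetryingHandler<T> {
    private final Consumer<T> processor;   // business logic, may throw
    private final Consumer<T> deadLetter;  // e.g. a producer send to a DLQ topic
    private final int maxAttempts;

    RetryingHandler(Consumer<T> processor, Consumer<T> deadLetter, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.processor = processor;
        this.deadLetter = deadLetter;
        this.maxAttempts = maxAttempts;
    }

    // Returns true if the record was processed, false if it was dead-lettered.
    boolean handle(T record) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                processor.accept(record);
                return true;
            } catch (RuntimeException e) {
                // A real consumer would log and back off here before retrying.
            }
        }
        deadLetter.accept(record);
        return false;
    }
}
```

Committing offsets only after `handle` returns keeps delivery at-least-once; that step, along with backoff and shutdown handling, is omitted here.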

Kafka Streams

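A library that runs inside your JVM application. State is local (RocksDB by default) with a changelog topic as the backup. The DSL gives you joins and windowed aggregations out of the box, and exactly-once processing is a config switch (processing.guarantee) rather than something you build yourself.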

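// paymentSerde and summarySerde are application-defined Serde instances.
StreamsBuilder b = new StreamsBuilder();
b.stream("payments", Consumed.with(Serdes.String(), paymentSerde))
 .filter((k, v) -> v.amount() > 0)
 .groupByKey()
 .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
 .aggregate(Summary::empty, Summary::fold,  // fold must fit Aggregator: (key, payment, summary)
     Materialized.<String, Summary, WindowStore<Bytes, byte[]>>as("summary-store")
         .withValueSerde(summarySerde))
 .toStream()
 // The key is now Windowed<String>, so the sink needs explicit serdes.
 .to("payment-summary", Produced.with(
     WindowedSerdes.timeWindowedSerdeFrom(String.class, Duration.ofMinutes(5).toMillis()),
     summarySerde));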
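
To make the windowing step concrete: tumbling windows in Kafka Streams are aligned to the epoch, so a record's window follows from its timestamp alone. The sketch below reproduces that bucketing in plain Java without the Streams API. `TumblingWindows`, `windowStart`, and `sumByWindow` are hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.util.Map;
import java.util.TreeMap;

final class TumblingWindows {
    // Start of the epoch-aligned window that contains timestampMs.
    // Windows are [start, start + size): lower bound inclusive, upper exclusive.
    static long windowStart(long timestampMs, Duration size) {
        long sizeMs = size.toMillis();
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    // Sums amounts per window start; each event is a {timestampMs, amount} pair.
    static Map<Long, Long> sumByWindow(long[][] events, Duration size) {
        Map<Long, Long> totals = new TreeMap<>();
        for (long[] e : events) {
            totals.merge(windowStart(e[0], size), e[1], Long::sum);
        }
        return totals;
    }
}
```

With a 5-minute size, timestamps 0 through 299,999 ms share one window and 300,000 ms opens the next.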

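Best when your app is already on the JVM and each instance's share of the state fits comfortably on its local disk (every instance keeps its own RocksDB stores for the partitions it owns; there is no shared, cluster-wide state store).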
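
Exactly-once in Kafka Streams is opt-in through configuration rather than on by default. Here is a minimal sketch of the relevant settings, written with plain string keys so it does not depend on the Streams jar. The application id, broker address, and state directory are placeholder values, `StreamsSettings` is a hypothetical name, and `exactly_once_v2` assumes Kafka Streams 3.0 or newer.

```java
import java.util.Properties;

final class StreamsSettings {
    static Properties build() {
        Properties p = new Properties();
        p.put("application.id", "payment-summary-app");   // placeholder; also prefixes changelog topic names
        p.put("bootstrap.servers", "localhost:9092");     // placeholder broker address
        p.put("processing.guarantee", "exactly_once_v2"); // the default is at_least_once
        p.put("num.standby.replicas", "1");               // keep a warm copy of local state on another instance
        p.put("state.dir", "/var/lib/kafka-streams");     // placeholder; where the RocksDB files live
        return p;
    }
}
```

These properties would be passed to `new KafkaStreams(topology, props)` alongside the built topology.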

Flink

A cluster of its own. Savepoints, event-time processing with watermarks, SQL, and real distributed state. Pick Flink when:

  • State is large enough that you want it sharded and checkpointed to object storage.
  • You have multiple sources and sinks beyond Kafka.
  • Event-time correctness (late data, watermarks) is load-bearing to the business.
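
To show what watermarks and late data mean in practice, here is a simplified model of a bounded-out-of-orderness watermark in plain Java. It mirrors the idea behind Flink's built-in strategy of that name but does not use the Flink API. `WatermarkTracker` and its methods are hypothetical, and treating any event at or below the watermark as late is a simplification: Flink decides lateness per window and supports a configurable allowed lateness.

```java
import java.time.Duration;

final class WatermarkTracker {
    private final long maxOutOfOrdernessMs;
    private long maxTimestampMs;

    WatermarkTracker(Duration maxOutOfOrderness) {
        this.maxOutOfOrdernessMs = maxOutOfOrderness.toMillis();
        // Chosen so the initial watermark is Long.MIN_VALUE without underflow.
        this.maxTimestampMs = Long.MIN_VALUE + maxOutOfOrdernessMs + 1;
    }

    // Watermark: no event with a timestamp at or below this value is expected anymore.
    long currentWatermark() {
        return maxTimestampMs - maxOutOfOrdernessMs - 1;
    }

    // Returns true if the event arrived on time, false if it is late.
    boolean onEvent(long eventTimestampMs) {
        boolean late = eventTimestampMs <= currentWatermark();
        maxTimestampMs = Math.max(maxTimestampMs, eventTimestampMs);
        return !late;
    }
}
```

The bound is the trade-off knob: a larger value tolerates more disorder but delays every window result by the same amount.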

A quick heuristic

  1. Stateless or externally-stateful → plain consumer.
  2. Stateful, JVM, single-team → Kafka Streams.
  3. Stateful, multi-source, event-time correctness → Flink.
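
The heuristic above can be sketched as a small decision function. This is one possible reading, not a rule the post states precisely: it treats either multi-source or event-time needs as enough to pick Flink, checks that before the Kafka Streams rule, and returns `UNDECIDED` for combinations the three rules do not cover. `LayerChooser` and all parameter names are hypothetical.

```java
final class LayerChooser {
    enum Layer { PLAIN_CONSUMER, KAFKA_STREAMS, FLINK, UNDECIDED }

    static Layer choose(boolean stateful, boolean stateIsExternal, boolean jvm,
                        boolean singleTeam, boolean multiSource, boolean eventTimeCritical) {
        // Rule 1: stateless or externally-stateful.
        if (!stateful || stateIsExternal) {
            return Layer.PLAIN_CONSUMER;
        }
        // Rule 3 (assumption: either condition is enough).
        if (multiSource || eventTimeCritical) {
            return Layer.FLINK;
        }
        // Rule 2: stateful, JVM, single-team.
        if (jvm && singleTeam) {
            return Layer.KAFKA_STREAMS;
        }
        // Not covered by the heuristic, e.g. stateful work in a non-JVM service.
        return Layer.UNDECIDED;
    }
}
```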

Don't pick the fanciest one by default. Every one of these engines is someone's full-time job to operate.


Reader Discussion

8 replies

  1. Highlighted by author
    Tuấn Phạm 🇻🇳 HCMC · Staff Engineer · Tiki Data Platform · Story

    min.insync.replicas=2 with acks=all is the standard combo. We used to leave min.insync.replicas at its default of 1, and a single broker rolling restart cost us 4 messages. As luck would have it, they were payment-confirm events. I sat writing the postmortem from 2 a.m. to 7 a.m.

    Oct 25, 2025·1 day later
    • Minh Le · Author

      By the time you realize the default is 1, it's already too late. Thanks for sharing, bro. I'll add a warning box to the post.

      Oct 25, 2025
  2. Quốc Anh · Backend Lead · Finhay · Agrees

    The part on segment files is explained very concisely. We keep forgetting that log.segment.bytes and log.retention.bytes are two different things; retention wasn't kicking in because the segment hadn't rolled yet, which is exactly the rule of thumb at the end of the post.

    Oct 26, 2025·2 days later
  3. Amelia Brooks · Distributed Systems · Pushback

    tiny nit but acks=0 is not literally fire-and-forget at the protocol layer — the producer still writes to its socket buffer. The 'forget' is the broker side. Pedantic but bites people in metrics dashboards.

    Nov 02, 2025·1 week later
  4. Priya Ramaswamy · SRE · Checkout Infra · From experience

    Static membership single-handedly cut our deploy-induced p95 lag from ~28s to under 2s. group.instance.id is the most under-rated config in the entire client lib. We tell every new SRE about it on day one now.

    Oct 28, 2025·4 days later
  5. Jakub Nowak · Backend Engineer · Pushback

    small pushback — cooperative sticky is great in theory but mixing it with a 2.4 broker we still had on legacy gave us a 6h partial outage where some consumers thought they owned 0 partitions and others owned everything. compatibility matrix is not a footnote, it's load bearing

    Oct 31, 2025·1 week later·edited
  6. Léa Dubois · SRE · Asks

    any chance you'd publish these as a PDF collection? would love to print and read offline on flights. screen-fatigue is real.

    Oct 30, 2025·6 days later
  7. Ahmed Rahman · Full Stack · Kind words

    concise + opinionated = my favourite kind of engineering post. so many blogs hedge every claim into mush. give me the spicy take with the receipts. more please.

    Oct 25, 2025·1 day later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.

Comments seeded · live discussion via email