Skip to content

Kafka Streams

wireform-kafka-streams is a Haskell library for building streaming applications on Apache Kafka. You write topologies as ordinary Haskell values and run them inside your service. Kafka handles durability and ordering; the library handles state stores, joins, windowing, exactly-once semantics, and rebalancing.

New to Kafka Streams? QuickstartWhat is Kafka Streams?Your first topology

Have a specific problem?

Coming from Java Kafka Streams? The API mirrors the Java client. Check the Riffle extensions for features the Java client doesn’t have.

Five self-contained parts. Run against an in-process test driver. No external Kafka broker needed.

  1. What is Kafka Streams?: The mental model and vocabulary
  2. Your first topology: Read from one topic, write to another
  3. Stateful processing: Count words and query the results
  4. Joins and tables: Enrich a stream with reference data
  5. Going to production: Eight things to set up before deploying
When you need to…Read this
Call external HTTP/SQL/GRPC APIsEnrichment guide
Scale past your partition countScaling and rebalancing
Deploy in Kubernetes without losing stateRunning in containers
Write to Postgres/S3 with exactly-once semanticsExactly-once across systems
Roll out a new topology versionTopology evolution
Understand why IQ reads don’t match writesVisibility versus ACID databases
Set up monitoring and alertsObservability
Handle an incidentRunbooks

Optional extensions for advanced use cases. These solve problems that standard Kafka Streams doesn’t address.

  • Async I/O: Call slow external APIs without blocking processing. The operator handles concurrency, timeouts, retries, and exactly-once semantics automatically.

  • Snapshot stores: Recover large state stores quickly. Instead of replaying hours of changelog to rebuild state, restore from a recent checkpoint.

  • 2PC sinks: Write to Postgres, S3, or HTTP endpoints with exactly-once semantics. Uses two-phase commit to keep Kafka and external systems in sync.

  • Watermark coordinator: Handle streams with very different data rates. Prevents windows from stalling when one source goes idle.

  • Key-group routing: Scale your application past your topic’s partition count. Useful when you need more parallelism than your input topics provide.

See Extended features for details.

  • A topology is a typed Haskell value (Topology input output) composed with Control.Category.(>>>). Like a function input -> output, it transforms streams. Common pattern: Topology Void () for topologies that read from and write to Kafka topics.
  • The runtime is a library, not a cluster. Your service contains the topology; scaling means running more processes in the same consumer group.
  • State stores live next to your service (local disk or memory), backed by Kafka changelog topics for durability and standby tasks for fast failover.