Glossary
This glossary defines every term used across the Kafka Streams documentation. Use it when you encounter unfamiliar terminology.
Terms are organized alphabetically. Each definition includes the module where the concept lives and links to deeper explanations. Other pages in this documentation link directly to specific glossary entries. For example, the Visibility page links to event time when first introducing the concept.
ACID
Atomicity, Consistency, Isolation, Durability: the four classical guarantees of a SQL database transaction. Streaming systems give you a different bundle of guarantees, and the Visibility versus ACID databases page works through the mismatch.
Alignment group
A Riffle concept: a set of watermark sources whose
watermarks should not diverge by more than a configured bound. A
fast source whose watermark out-paces the group’s slowest member
by more than agBound is backpressured: the runtime pauses
fetching from it until the slowest member catches up. Lives in
Kafka.Streams.Watermark.AlignmentGroup.
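As a rough illustration of the rule above, the pause decision comes down to comparing each source's watermark with the group's minimum. This sketch is not the AlignmentGroup API: SourceId, Millis, and sourcesToPause are hypothetical names, and the bound argument stands in for agBound.

```haskell
-- Hypothetical types for illustration; not the library's API.
type SourceId = String
type Millis   = Integer

-- | Sources whose watermark leads the slowest member by more than the bound.
-- These are the ones a runtime would stop fetching from.
sourcesToPause :: Millis -> [(SourceId, Millis)] -> [SourceId]
sourcesToPause _ [] = []
sourcesToPause bound wms =
  let slowest = minimum (map snd wms)
  in  [ src | (src, wm) <- wms, wm - slowest > bound ]
```

With a 1000 ms bound and watermarks a = 5000, b = 3000, c = 4000, only a is paused: c leads b by exactly the bound, which this sketch treats as still in range.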
applicationId
The consumer-group identity and the prefix every internal topic (changelog, repartition) inherits. Changing it is a fresh-start deploy, not a rollout. See Topology evolution.
Assignor
The component that decides which member of a consumer group owns which partition (and, in Streams, which task). The built-in default is cooperative-sticky: it prefers to keep tasks on their current owner across rebalances, and revocations are incremental rather than stop-the-world. See also KIP-848.
Async I/O operator
A Riffle Prim that runs the user’s IO action on a
bounded worker pool instead of on the stream thread.
Provides backpressure, EOS-correct offsets, ordered or unordered
output, per-request timeout / retry, and explicit failure policy.
Full walkthrough in Enrichment via external systems.
At-least-once
The default processing guarantee. Every record is processed at least once, and therefore possibly more than once after a rebalance or fault. External side effects should be idempotent.
Backpressure
The mechanism by which a slow downstream stage signals an upstream stage to slow down. In Kafka Streams the consumer poll loop is naturally backpressured by the processor it feeds: a full inbox blocks new fetches. In Riffle async I/O, a full in-flight queue blocks the stream thread on enqueue.
Broker
A node in a Kafka cluster: the server-side process that stores topic logs and serves produce / fetch requests. Multiple brokers form a cluster; one is the controller (under KRaft mode) that coordinates metadata.
CDC (Change Data Capture)
A pattern (and protocol family) where each row-level change in an
upstream database (insert, update, delete with before / after
images) is captured and published downstream. Debezium and AWS DMS
are the canonical implementations. Riffle’s Kafka.Streams.Sources.CDC
ships a primitive for materialising CDC feeds into KTables with
snapshot/streaming-phase awareness and key-aware compaction.
Changelog topic
The internal Kafka topic the framework writes a state
store’s updates to. On instance loss, the state is
rebuilt by replaying this topic from offset 0 (or, with snapshot
stores, from the last snapshot’s offset).
Convention: <applicationId>-<storeName>-changelog.
Co-partitioning
The requirement that two topics being joined share the same
partition count and the same key partitioner, so each (key, A)
record and its matching (key, B) record land on the same task.
Streams validates co-partitioning at startup; mismatches throw
during topology validation.
Cogroup
A Kafka Streams DSL operator that lets you aggregate multiple
input streams into one output table with a single aggregator
function per stream and a single shared store. The Haskell port is
cogroup / addCogrouped / aggregateCogrouped in
Kafka.Streams.Cogroup.
Commit cycle
The orchestrated sequence the runtime runs every
commitIntervalMs (default 30 s) to make a batch of work durable.
Under EOS-V2 with Riffle 2PC sinks the seven steps are:
beginTxn → flush → commitOffsets → preCommit2PC → commitTxn → commit2PC → storeCommit. See Kafka.Streams.Runtime.EOS.runCommitCycle
and the Exactly-once page.
Consumer group
A set of Kafka consumers sharing the same group.id (in Streams,
the applicationId) that the broker
coordinator collectively assigns partitions to. Each partition is
owned by at most one member at a time.
Cooperative-sticky
See Assignor.
CQRS (Command Query Responsibility Segregation)
An architectural pattern in which writes go to one model (commands) and reads come from another (queries), connected by an asynchronous projection. Kafka Streams + downstream query store is a natural CQRS implementation; the trade-off discussion lives in Visibility versus ACID databases.
DLQ (Dead-letter queue)
A topic (or other sink) that receives records the runtime couldn’t
process. Riffle’s bounded suppress supports a dead-letter
overflow policy; processing.exception.handler (KIP-1033) supports
a DLQ disposition for records that throw.
DispatchMode
The Riffle StreamsConfig knob that picks which
Kafka.Streams.Runtime.WorkerPool constructor the runtime uses.
Three values: DispatchPartition (parity default: explicit
per-worker partition ownership), DispatchHashed (parity hashing
by (topic, partition)), DispatchKeyGroup (Riffle key-group
routing). See Scaling.
DSL
Domain-specific language. In Kafka Streams, the
typed combinator API (stream, mapValues, filter, groupBy,
count, …) as opposed to the lower-level Processor
API. The library exposes two DSL surfaces: the
Free-Arrow Free DSL (Kafka.Streams.Topology.Free) and the
imperative builder DSL (Kafka.Streams.KStream, etc.).
EOS / EOS-V2 / EOS-V3
Exactly-once semantics. EOS-V2 (KIP-447, KIP-129) is
the transactional-producer-plus-TxnOffsetCommit story that
ensures records written by Streams and consumer offsets advance
atomically. EOS-V3 / KIP-892 adds transactional state stores so a
store’s writes commit in lockstep with the producer transaction.
The processingGuarantee config selects between
AtLeastOnceP and ExactlyOnceP.
Emit policy
Riffle generalisation of the JVM Streams EmitStrategy
(KIP-825). A first-class EmitPolicy value any windowed /
stateful operator can consume: EmitOnUpdate,
EmitOnWindowClose, EmitOnCount n, or EmitCustom with a
user predicate. Lives in Kafka.Streams.EmitPolicy.
Event time
The timestamp a record’s producer attaches to it, marking when the underlying business event happened. Contrast with processing time, the wall-clock time at which the runtime saw the record. The pair drives all of windowing, watermarks, and grace periods.
Event-time TTL
Riffle KV-store wrapper: state expires based on the coordinated
watermark, not on wall-clock. Lives in
Kafka.Streams.State.KeyValue.TTL. Pair with
ttlClockFromCoordinator for the coordinator-driven clock.
Fenced producer
A producer the broker has rejected because a newer producer with
the same transactional.id was observed. Surfaces as
ProducerFencedException / InvalidProducerEpochException. Almost
always means two instances are running for the same (applicationId, taskId). See the zombie-producer
runbook.
Foreign-key join
A KTable-KTable join keyed not on the record key but on a
caller-supplied extractor of the left value. Streams handles the
subscription-token bookkeeping. Haskell port:
foreignKeyJoinKTable / leftForeignKeyJoinKTable in
Kafka.Streams.ForeignKeyJoin.
Free arrow / FreeArrow
A free construction over the Arrow typeclass: the typed DSL is
represented as a value, then interpreted / optimised / compiled at
the boundary. Kafka.Streams.Topology.Free builds a
FreeArrow Prim AST that the compiler walks. The benefit:
topologies are first-class values you can inspect, snapshot, and
optimise before they run.
GADT (Generalized Algebraic Data Type)
A Haskell ADT whose constructors can refine the type parameters of
the resulting value. Used by Prim in
Kafka.Streams.Topology.Free to encode the input / output types
of each operator at the type level.
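To make the refinement concrete, here is a small self-contained GADT in the same spirit. The Op type and its constructors are hypothetical and much simpler than the library's Prim; the point is only that each constructor pins down the input and output types, so the interpreter needs no runtime type checks.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical operator type for illustration; not the library's Prim.
data Op a b where
  MapValues :: (a -> b)    -> Op a b  -- changes the element type
  Filter    :: (a -> Bool) -> Op a a  -- constructor fixes output = input
  Then      :: Op a b -> Op b c -> Op a c

-- | Interpret an operator chain over a list standing in for a stream.
-- Matching on Filter tells the type checker that b is a in that branch.
runOp :: Op a b -> [a] -> [b]
runOp (MapValues f) = map f
runOp (Filter p)    = filter p
runOp (Then f g)    = runOp g . runOp f
```

For instance, `runOp (Then (Filter even) (MapValues (* 10))) [1, 2, 3, 4]` keeps the even elements and scales them, while an ill-typed chain such as a Filter on strings followed by a numeric MapValues is rejected at compile time.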
GlobalKTable
A KTable replicated in full on every instance, not partitioned
across the group. Used for small reference data (currency rates,
country lookups) where you want zero-network-cost lookups. Loaded
from a Kafka topic via globalTable.
Grace period
The amount of time a window stays open accepting late
records past its end. A 1-hour tumbling window with a 10-minute
grace closes 70 minutes after the window’s start in event time.
Configured via withGracePeriod / withSessionGracePeriod.
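The arithmetic in the example above can be written out as a short sketch. The Window record and both functions are hypothetical helpers, not part of the library, and all times are event-time minutes.

```haskell
-- Hypothetical window record; all times are event-time minutes.
data Window = Window
  { windowStart :: Integer
  , windowSize  :: Integer
  }

-- | Event time at which the window stops accepting late records.
closeTime :: Integer -> Window -> Integer
closeTime grace w = windowStart w + windowSize w + grace

-- | Whether the window still accepts records at the given stream time.
-- Treating the close instant itself as closed is an assumption of this sketch.
isOpen :: Integer -> Window -> Integer -> Bool
isOpen grace w streamTime = streamTime < closeTime grace w
```

For a 60-minute window starting at 0 with a 10-minute grace, `closeTime 10 (Window 0 60)` is 70, matching the 70-minute figure in the definition.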
Heartbeat
The periodic message a consumer group member sends the coordinator to confirm liveness. Under KIP-848, heartbeats also carry subscription metadata and receive reconciliation deltas.
Hopping window
A fixed-size window that advances by a step smaller than its size, so adjacent windows overlap. A 5-minute hopping window with a 1-minute advance produces a new window every minute, each containing the last 5 minutes’ worth of records.
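One way to see the overlap is to compute which windows a single record falls into. The function below is a hypothetical helper, not a library call; it assumes a positive advance, non-negative timestamps, and windows aligned to multiples of the advance from time zero.

```haskell
-- | Start times (ascending) of every window [start, start + size) that
-- contains the timestamp. Assumes advance > 0, ts >= 0, and windows aligned
-- to multiples of the advance from time zero.
hoppingWindowStarts :: Integer -> Integer -> Integer -> [Integer]
hoppingWindowStarts size advance ts =
  let latest = ts - (ts `mod` advance)  -- newest window starting at or before ts
      covers start = start >= 0 && start + size > ts
  in  reverse (takeWhile covers (iterate (subtract advance) latest))
```

With a 5-minute size and a 1-minute advance, a record at minute 7 belongs to the five windows starting at minutes 3 through 7. Setting the advance equal to the size gives exactly one window per record.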
Idempotent / idempotency
An operation that has the same effect when invoked multiple times
as when invoked once. PUT /resource/42 with the same body is
idempotent; POST /create-order typically isn’t. Critical for
external side effects under at-least-once or for
the recovery path of a two-phase commit sink.
Idempotency token
A stable per-record identifier (usually the upstream (topic, partition, offset) tuple) written to a state store
to deduplicate side effects on replay. The pattern: on
each record, check the store; if absent, fire the side effect,
then write the token. See Enrichment.
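The check, fire, write sequence can be sketched as follows. An in-memory IORef of a Set stands in for the state store, and Token and oncePerToken are hypothetical names; a real deployment would use a durable, changelogged store.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import qualified Data.Set as Set

-- Hypothetical token type: (topic, partition, offset).
type Token = (String, Int, Integer)

-- | Run the side effect only if the token is absent, then record the token.
-- Returns True when the effect actually fired.
oncePerToken :: IORef (Set.Set Token) -> Token -> IO () -> IO Bool
oncePerToken seen tok effect = do
  done <- Set.member tok <$> readIORef seen
  if done
    then pure False
    else do
      effect                              -- fire the side effect first
      modifyIORef' seen (Set.insert tok)  -- then write the token
      pure True
```

A crash between the effect and the token write still replays the effect once on recovery, which is why the pattern complements an idempotent target rather than replacing it.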
Idle source
A watermark source that hasn’t produced records for
longer than its IdleAfter threshold. The
coordinator excludes idle sources from
the min-watermark computation so a quiet partition doesn’t stall
downstream windows.
Interactive query (IQ)
Read-only access to a live state store from
outside the stream thread. Lives in
Kafka.Streams.InteractiveQueries. The query layer is your code;
the library exposes the typed handles + cross-instance discovery
metadata (KIP-535).
Internal topic
A Kafka topic the framework auto-creates and owns:
changelog topics for state stores,
repartition topics for keyed shuffles. Their
names derive from the applicationId and the stable
name of the owning operator.
Join
An operator that combines records from two streams (or a stream and a table). Five shapes: stream-stream (windowed), stream-table, table-table, stream-GlobalKTable, and foreign-key.
KafkaStreams
The runtime handle. newKafkaStreams constructs it from a
topology + config; startKafkaStreams runs it. closeKafkaStreams
drains and shuts down. Most live operations
(setStateListener, addStreamThread, pauseKafkaStreams, etc.)
take a KafkaStreams value.
Key-group
A Riffle abstraction: a fixed routing space (default 128) into
which record keys hash, decoupling parallelism
from partition count. Lives in
Kafka.Streams.Runtime.KeyGroup. See Scaling.
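A minimal sketch of the idea, under loud assumptions: toyHash is a placeholder hash rather than whatever the runtime uses, and the contiguous-range mapping from key-group to worker is one plausible scheme, not a description of Kafka.Streams.Runtime.KeyGroup. All three function names are hypothetical.

```haskell
import Data.Char (ord)
import Data.List (foldl')

-- | Placeholder string hash (always non-negative); an assumption of this sketch.
toyHash :: String -> Int
toyHash = foldl' (\h c -> (h * 31 + ord c) `mod` 1000003) 7

-- | Key-group for a key, given the size of the routing space (for example 128).
keyGroupOf :: Int -> String -> Int
keyGroupOf total key = toyHash key `mod` total

-- | Assumed mapping: split the key-group space into contiguous ranges,
-- one range per worker.
workerFor :: Int -> Int -> Int -> Int
workerFor total workers kg = kg * workers `div` total
```

Because a key's group depends only on the key and the fixed routing space, changing the worker count moves whole key-groups between workers without rehashing any key.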
KGroupedStream / KGroupedTable
Intermediate types produced by groupBy / groupByKey on a
stream / table. They’re the input to aggregation operators
(count, reduce, aggregate).
KIP
Kafka Improvement Proposal: the Apache Kafka design-review process. Specific KIPs that come up by name across these docs:
| KIP | What |
|---|---|
| KIP-295 | topology.optimization config knob (reuse-source-topics, merge-repartitions, single-store-self-join) |
| KIP-307 | Stable, deterministic processor names |
| KIP-418 | Named branches (Branched.*) |
| KIP-441 | Probing rebalance for warmup-ready standbys |
| KIP-447 | EOS-V2 producer-per-instance with TxnOffsetCommit |
| KIP-535 | Cross-instance IQ discovery (StreamsMetadata, KeyQueryMetadata) |
| KIP-591 | default.dsl.store config |
| KIP-825 | First-class EmitStrategy |
| KIP-848 | Next-gen consumer-group protocol: broker-side incremental reconciliation |
| KIP-892 | EOS-V3 transactional state stores |
| KIP-924 | In-process TaskAssignor plug-in |
| KIP-925 | Rack-aware assignment strategy |
| KIP-1033 | First-class processing-exception handler with DLQ disposition |
KStream
A Kafka Streams record stream: an append-only sequence of (key, value, timestamp, headers) tuples. Compare with KTable.
KTable
A Kafka Streams changelog stream interpreted as a key-value table: later records overwrite earlier ones for the same key. Materialised into a state store (in-memory or RocksDB). GlobalKTable is the fully-replicated variant.
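The table interpretation can be modelled as a left fold over the changelog. This sketch uses a plain Data.Map in place of a state store; materialise is a hypothetical name, and a Nothing value models a Kafka tombstone (a null value that deletes the key).

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- | Fold a changelog into its table view: later records win,
-- and a Nothing value (tombstone) removes the key.
materialise :: Ord k => [(k, Maybe v)] -> Map.Map k v
materialise = foldl' step Map.empty
  where
    step table (k, Just v)  = Map.insert k v table
    step table (k, Nothing) = Map.delete k table
```

Feeding it the records a=1, b=2, a=3, then a tombstone for b leaves a table holding only a=3.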
KRaft
Kafka’s Raft-based metadata protocol that replaced ZooKeeper for cluster coordination. The library’s integration tests spin up a KRaft-mode broker; the streams runtime is KRaft-agnostic at the client layer.
Linearisable
A consistency model in which every operation appears to take effect atomically at some point between its invocation and its response. The Kafka Streams in-memory stores’ single-key reads are linearisable; iterators are eager snapshots, not linearisable across mutations.
Little’s law
The relation L = λW: the average number of items in a system
equals the arrival rate times the average time in the system.
Used in Enrichment
to size async-I/O worker pools: workers ≈ throughput × latency.
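As a worked instance of the sizing rule, the helper below rounds the product up to a whole worker count. workersNeeded is a hypothetical name, and the figures in the note are illustrative only.

```haskell
-- | Workers needed to sustain a throughput (requests per second)
-- at a given mean latency (seconds), rounded up.
workersNeeded :: Double -> Double -> Int
workersNeeded throughputPerSec latencySec =
  ceiling (throughputPerSec * latencySec)
```

At 1000 requests per second and 0.25 s mean latency, about 250 requests are in flight at any moment, so `workersNeeded 1000 0.25` is 250. Treat the result as a floor and add headroom for latency spikes.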
Log compaction
A Kafka topic-level retention policy that keeps the latest value per key indefinitely (rather than truncating by time or size). Changelog topics use compaction so state recovery from offset 0 remains bounded by unique-key count, not by total record count.
Materialized
The DSL knob that controls which state store
backend an aggregation, join, or table-source uses. Picks the
store builder and the serdes. Mirrors the JVM
Materialized builder.
Member epoch / rebalance epoch
KIP-848 versioning. Member epoch bumps every time a member’s owned assignment changes; rebalance epoch bumps every time the group-wide target changes. Stale-epoch heartbeats must reconcile before continuing.
Object store
Riffle’s abstraction over S3 / GCS / Azure Blob / a filesystem.
Snapshot stores write their snapshot blobs through an
ObjectStoreClient and read them back on recovery. Lives in
Kafka.Streams.Runtime.ObjectStore.
Offset
The position of a record within a partition. The consumer commits offsets to the broker so a restart resumes from the right place. Under EOS, offsets advance only when the commit cycle commits.
Orphan internal topic
A changelog or repartition
topic on the broker that doesn’t correspond to any store /
operator in the current topology. Usually a sign that a previous
deploy renamed something. Detected by
Kafka.Streams.Observability.OrphanTopics.detectOrphans. See the
runbook.
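The detection idea can be sketched with the naming conventions this glossary gives for changelog and repartition topics. Every function here is a hypothetical helper; the real detectOrphans may take different inputs and apply stricter matching.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Naming conventions from the Changelog topic and Repartition topic entries.
changelogTopic :: String -> String -> String
changelogTopic appId store = appId ++ "-" ++ store ++ "-changelog"

repartitionTopic :: String -> String -> String
repartitionTopic appId node = appId ++ "-" ++ node ++ "-repartition"

-- | Does this topic name look like an internal topic of the application?
isInternal :: String -> String -> Bool
isInternal appId topic =
  (appId ++ "-") `isPrefixOf` topic
    && ("-changelog" `isSuffixOf` topic || "-repartition" `isSuffixOf` topic)

-- | Internal-looking topics on the broker that the current topology
-- does not expect.
findOrphans :: String -> [String] -> [String] -> [String]
findOrphans appId expected onBroker =
  [ t | t <- onBroker, isInternal appId t, t `notElem` expected ]
```

If the topology for application `orders` expects only `orders-counts-changelog` but the broker also holds `orders-totals-changelog`, the latter is reported, while topics of other applications and plain input topics are ignored.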
Parallelism
The number of concurrent units of work the runtime can run. In
parity Streams: bounded by numStreamThreads × instances, capped
by the source-topic partition count. Under Riffle
key-groups: bounded by kgcTotal (default 128),
independent of partition count.
Partition
A subset of a Kafka topic’s data, distributed across brokers and consumed in parallel. Records with the same key always land on the same partition (under the default partitioner). The partition is the unit of consumer-group assignment.
Pipeline
A newtype in Kafka.Streams.Pipeline wrapping a -> IO b with
Category, Arrow, ArrowChoice, Functor, and Applicative
instances. Used to build reusable, named topology fragments that
compose with (>>>). The ArrowChoice instance gives you the
ROP (+++) / (|||) combinators
over Either.
Pre-commit drain
Riffle hook (ProcessorContext.ctxRegisterPreCommitDrain) that
lets an async operator block the commit cycle
until its in-flight queue is empty. Used by asyncMapValues etc.
to make offset commits EOS-correct.
Prim
The GADT constructor type in Kafka.Streams.Topology.Free that
represents one operator in the typed AST. Every DSL combinator
(mapValues, filter, groupBy, …) corresponds to a Prim.
Probing rebalance
The cadence at which the runtime re-issues a rebalance to promote
standby tasks that are within
acceptableRecoveryLag. Lives in
Kafka.Streams.Runtime.ProbingRebalance. Default cadence: 10
minutes (probingRebalanceIntervalMs).
Processing guarantee
AtLeastOnceP vs ExactlyOnceP in StreamsConfig.processingGuarantee.
Determines whether the producer is transactional, whether offsets
commit through TxnOffsetCommit, and whether state stores use the
transactional buffer. See EOS.
Processing time
The wall-clock time the runtime saw a record. Contrast with event time.
Processor API
The lower-level Kafka Streams API: write a Processor /
FixedKeyProcessor directly with process + init callbacks,
access state stores by name, schedule punctuators.
The DSL ultimately compiles to processor-API calls.
Punctuator
A scheduled callback inside a processor. Two clocks:
WallClockTimePunctuation (every N ms of real time) and
StreamTimePunctuation (every N ms of event time
advance). Used for time-driven emits, idle-window detection,
cache eviction.
Rebalance
The process by which a consumer group reassigns partition / task ownership across its members in response to a membership change. Under the classic protocol it was stop-the-world; under KIP-848 it’s incremental per-task with no double-ownership at any point.
Reconciliation
KIP-848 term for the diff between a member’s currently-owned
assignment and its target assignment. Reconciliation carries
rAdd and rRemove sets; a losing member acknowledges its
rRemove before the gaining member sees the task in rAdd.
Defined in Kafka.Streams.Runtime.RebalanceProtocol.
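The diff itself is plain set arithmetic, sketched below. The Diff record and the reconcile function are hypothetical; toAdd and toRemove play the roles that rAdd and rRemove play in the definition above, and the acknowledgement ordering is not modelled.

```haskell
import qualified Data.Set as Set

-- Hypothetical diff record for illustration.
data Diff t = Diff
  { toAdd    :: Set.Set t  -- tasks in the target but not yet owned
  , toRemove :: Set.Set t  -- tasks owned but no longer in the target
  } deriving (Eq, Show)

-- | Diff between the currently-owned assignment and the target assignment.
reconcile :: Ord t => Set.Set t -> Set.Set t -> Diff t
reconcile owned target = Diff
  { toAdd    = target `Set.difference` owned
  , toRemove = owned `Set.difference` target
  }
```

A member that owns tasks 0, 1, 2 with a target of 1, 2, 3 gets a diff that removes task 0 and adds task 3; tasks 1 and 2 are untouched, which is what makes the protocol incremental.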
Remote KV
A Riffle KV-store backend with no local state: every get / put
is a network call against a remote store (FoundationDB / TiKV /
DynamoDB shape). Node restart is a metadata operation. Lives in
Kafka.Streams.State.KeyValue.Remote.
Repartition
The operation of re-keying a stream and re-publishing it to an
internal topic so downstream stateful operators see records
co-partitioned. Performed by repartition / through /
implicit auto-insert (optAutoInsertRepartition).
Repartition topic
The internal topic a repartition operator
writes to. Convention: <applicationId>-<nodeName>-repartition.
Replay
Re-processing records the runtime already saw, after a fault or rebalance rewinds the consumer to a prior committed offset. Under at-least-once this is the normal recovery path; under EOS it still happens but the duplicate output is aborted.
Railway-oriented programming
A pattern, popularised by Scott Wlaschin’s F# write-up, for
modelling pipelines of fallible operations as two parallel
tracks: a success track and a failure track. Each stage either
advances the value on the success track or routes it to the
failure track; failure short-circuits all downstream success
stages cleanly. Kafka Streams’ DeserializationHandler /
ProcessingExceptionHandler / ProductionHandler /
AsyncFailurePolicy / SinkOutcome are all track-switch
surfaces; the Pipeline newtype’s ArrowChoice
instance is the explicit ROP combinator set. Full mapping in
Railway-oriented programming with streams.
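The two-track shape can be demonstrated with base's Control.Arrow alone. In this sketch Kleisli IO stands in for the Pipeline newtype (both wrap a -> IO b); the stage names are hypothetical, and the library's own combinators may differ in detail.

```haskell
import Control.Arrow (Kleisli (..), arr, right, (>>>), (|||))
import Text.Read (readMaybe)

-- Stand-in for the Pipeline newtype: a function a -> IO b.
type Stage a b = Kleisli IO a b

-- | Switch point: Left is the failure track, Right the success track.
parseAmount :: Stage String (Either String Int)
parseAmount = arr $ \s ->
  maybe (Left ("not a number: " ++ s)) Right (readMaybe s)

-- | A success-track stage; `right` lets failures pass through untouched.
double :: Stage Int Int
double = arr (* 2)

-- | Merge both tracks back into one output with (|||).
render :: Stage (Either String Int) String
render = arr ("failed: " ++) ||| arr (("ok: " ++) . show)

pipeline :: Stage String String
pipeline = parseAmount >>> right double >>> render
```

Running `runKleisli pipeline "21"` takes the success track through `double`, while `runKleisli pipeline "x"` skips it and arrives at `render` on the failure track.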
runCommitCycle
The orchestrator function in Kafka.Streams.Runtime.EOS that
drives one commit cycle. Takes an EOSCoordinator,
a flush body, and a getter for offsets-to-commit; returns
CommitSucceeded/CommitAborted/CommitFatal.
Schema Registry
Confluent’s per-subject schema-versioning service. The library’s
serdes (Kafka.Streams.Serde.Avro, Kafka.Streams.Serde.JsonSchema,
Kafka.Streams.Serde.Protobuf) speak the Confluent envelope and
the registrySerdeChecked wrapper enforces compatibility-mode
checks at construction time.
SchemaVersioned store
Riffle KV-store wrapper that tags every write with a
SchemaVersion and migrates reads forward through a chain of
SchemaMigration callbacks. burnInMigrate rewrites older
entries in-place with resumable progress. Lives in
Kafka.Streams.State.KeyValue.SchemaVersioned.
SinkTxnId
The Riffle identifier for one in-flight transaction on a
two-phase commit sink. Made stable
across restarts (typically applicationId-instanceId-cycleCounter)
so recovery can correlate prepared txns with their producer
counterparts.
Snapshot store
A Riffle KV-store backend that incrementally writes snapshots of
its state to an object store. Recovery time
becomes O(time-since-last-snapshot) rather than O(state-size).
Lives in Kafka.Streams.State.KeyValue.Snapshot.
Stable name
A deterministic, build-stable name for an operator node. Mirrors
the JVM’s KIP-307 generator: KSTREAM-MAPVALUES-0000000007. The
name is part of the deployment contract because internal
topics inherit from it. Lives in
Kafka.Streams.Topology.StableNames.
Standby task
A warm replica of an active task’s state, maintained by
tailing the active’s changelog. Promotion on
instance loss becomes metadata-only if the standby is within
acceptableRecoveryLag. Riffle’s SnapshotPointer mode lets a
standby hold only (snapshotId, advancedTo) rather than a full
replica.
State store
The persistent (or in-memory) key-value, window, or session store
attached to a stateful operator. Backends: in-memory + RocksDB
(via +rocksdb); Riffle adds snapshot, tiered, remote, and
versioned variants. Read-only IQ access via
Kafka.Streams.InteractiveQueries.
StreamsConfig
The top-level runtime configuration record. Mirrors the Java
StreamsConfig properties. Defined in Kafka.Streams.Config.
StreamsMetadata
KIP-535 record: what each instance in the group looks like to its peers (host:port, owned partitions per store, standby partitions per store). Used by an external IQ proxy to route key-level queries to the right host.
StreamTime
The per-task event-time clock: running max of extracted timestamps on records this task has seen. Replaced by the coordinated watermark where Riffle opts in.
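The running maximum is a one-line scan. streamTimes is a hypothetical helper that returns the clock value after each record, given the extracted timestamps in arrival order.

```haskell
-- | Stream time after each record: the running maximum of timestamps seen.
streamTimes :: [Integer] -> [Integer]
streamTimes = scanl1 max
```

For arrival order 5, 3, 8, 7, 10 the clock reads 5, 5, 8, 8, 10: out-of-order records never move it backwards.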
Stream thread
The OS thread that drives one consumer + N workers in this runtime. It is also the thread that runs user-supplied processor code, so nothing in user code should block it for long. (This differs from the JVM Streams “one stream thread per consumer” model; see the README for the comparison.)
Subscription metadata
Bytes a consumer group member attaches to its
JoinGroup request. Streams uses it to advertise
application.server, owned standby tasks,
and the assignor’s per-member state.
Suppress
The DSL operator that holds emissions back until a condition is
met: typically “until window closes” or “until time limit
expires”. Riffle adds a bounded variant with explicit
BufferOverflowPolicy (DropOldestSilently, ShutdownWhenFull,
suppressWindowedShed to DLQ).
Task
The unit of parallelism. One task owns one
partition of one subtopology + its local state stores.
TaskId identifies a (subtopology, partition) pair.
TaskAssignor
KIP-924 in-process plug-in for the leader-side assignment path.
Replaces the built-in cooperative-sticky assignor. The runtime
constructs an ApplicationState from the live view and invokes
taAssign.
Tiered KV
A Riffle KV-store backend with a hot tier (local in-memory or
RocksDB) + a cold tier (S3 / GCS). Reads probe hot, fall through
to cold, and promote. Eviction decides which entries demote when
the hot tier exceeds its budget. Lives in
Kafka.Streams.State.KeyValue.Tiered.
Topology
The compiled, validated graph the runtime executes. Topology i o is a type with two parameters, like a function i -> o.
- Input type i: What stream type enters the topology. Void means the topology pulls from sources (Kafka topics), not from other code.
- Output type o: What stream type exits the topology. () means the topology pushes to sinks (Kafka topics), not to other code.
Built from the DSL value of type Topology Void () (or other combinations), or directly via the imperative Kafka.Streams.Topology builder. Validated by validateTopology before the runtime starts.
TopicNameExtractor
KIP-303 sink-side dynamic-topic-name extractor. Lets one sink
route records to different topics based on the record. The library
exposes it via toExtracted + TopicNameExtractor. Useful when
you want “dynamic topology” without actually mutating the
compiled topology.
Transactional producer
A Kafka producer with a transactional.id that supports
beginTransaction / commitTransaction / abortTransaction /
sendOffsetsToTransaction. The foundation of EOS-V2.
Tumbling window
A fixed-size, non-overlapping window. A 5-minute tumbling window produces a new window every 5 minutes; each record belongs to exactly one window.
Two-phase commit sink
A Riffle sink interface that commits external-system
writes atomically with the Kafka commit cycle.
Five operations: tpsStage (per-record buffer), tpsPrepare
(promote batch to “prepared”), tpsCommit (atomically make
visible), tpsAbort (discard), tpsRecover (resolve
half-committed txns on restart). Lives in
Kafka.Streams.Sinks.TwoPhase.
WAL (Write-ahead log)
A durable log of intended changes written before the changes are applied to the underlying store. Under Riffle snapshot stores the changelog topic is the WAL between snapshots, not the sole source of truth.
Watermark
A timestamp t with which the runtime asserts that no record
arriving later will carry an event time earlier than t. Lets
downstream operators decide when a window can close, when state
can be expired, when a stream-stream join’s window is finalised. Derived from a
WatermarkGenerator (MonotonicTimestamps, BoundedOutOfOrderness,
CustomGenerator).
Watermark coordinator
The Riffle per-StreamsApp component that aggregates per-source
watermarks into the effective watermark (= min of live, non-idle
sources), handles idle-source detection, and
enforces alignment-group backpressure. Lives
in Kafka.Streams.Watermark.
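The aggregation rule, minimum over non-idle sources, can be sketched as follows. The Source record, its field names, and both functions are hypothetical; srcIdleAfter plays the role of the IdleAfter threshold, and alignment-group backpressure is left out.

```haskell
-- Hypothetical per-source state; all times in milliseconds.
data Source = Source
  { srcWatermark :: Integer  -- latest watermark reported by the source
  , srcLastSeen  :: Integer  -- wall-clock time of its last record
  , srcIdleAfter :: Integer  -- idle threshold
  }

-- | A source is idle once it has been quiet for longer than its threshold.
isIdle :: Integer -> Source -> Bool
isIdle now s = now - srcLastSeen s > srcIdleAfter s

-- | Effective watermark: the minimum over non-idle sources,
-- or Nothing when every source is idle (or there are none).
effectiveWatermark :: Integer -> [Source] -> Maybe Integer
effectiveWatermark now sources =
  case [ srcWatermark s | s <- sources, not (isIdle now s) ] of
    [] -> Nothing
    ws -> Just (minimum ws)
```

With two active sources at 9000 and 8000 and an idle one stuck at 2000, the effective watermark is 8000 rather than 2000, so the quiet source does not stall downstream windows.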
Window
A bounded slice of event time over which a
stateful operator aggregates. Four shapes: tumbling,
hopping, sliding, and session. Each has its own
builder in Kafka.Streams.Window.
Glossary maintenance
If a term appears in the docs and isn’t defined here, add it. Prefer one-paragraph definitions; link out to deeper coverage rather than restating it.