II. Workflow overview
Reference · live · workflow:real-time-streaming-health-check
Real-Time Streaming Health Check overview
Monitors the health of real-time streaming pipelines across Kafka, Flink, and Spark Streaming deployments. The check compares consumer lag against per-topic thresholds, verifies that checkpoints and offset commits are fresh, detects schema registry compatibility violations before they cause deserialization failures, alerts on partition skew and rebalance storms, validates exactly-once integrity through watermark tracking, and triggers automated remediation for common failure modes (stuck consumers, memory pressure, backpressure cascades). Outputs are streaming health dashboards, lag trend reports, and incident trigger logs. Pipeline development is out of scope.
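As a rough illustration of three of the checks named above (per-topic lag thresholds, offset commit freshness, and partition skew), the following minimal sketch evaluates a batch of offset samples. Everything in it is an assumption made for illustration: the `PartitionSample` record shape, the `check_stream_health` function, its parameters, and the heuristics (a commit counts as stale only when the partition still has lag; skew is the maximum partition lag divided by the mean). How samples are collected from Kafka, Flink, or Prometheus is not shown and is not specified by this workflow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PartitionSample:
    """One partition's offsets at sample time (hypothetical record shape)."""
    topic: str
    partition: int
    log_end_offset: int         # latest offset written to the partition
    committed_offset: int       # last offset committed by the consumer group
    last_commit_epoch_s: float  # wall-clock time of that commit, in seconds


def check_stream_health(
    samples: List[PartitionSample],
    lag_thresholds: Dict[str, int],
    default_lag_threshold: int,
    max_commit_age_s: float,
    skew_ratio: float,
    now_epoch_s: float,
) -> List[str]:
    """Return human-readable findings; an empty list means no issue found."""
    by_topic: Dict[str, List[PartitionSample]] = {}
    for sample in samples:
        by_topic.setdefault(sample.topic, []).append(sample)

    findings: List[str] = []
    for topic in sorted(by_topic):
        parts = sorted(by_topic[topic], key=lambda p: p.partition)
        # Lag per partition; clamp at zero in case offsets were sampled out of order.
        lags = [max(p.log_end_offset - p.committed_offset, 0) for p in parts]
        total_lag = sum(lags)

        # 1. Consumer lag against the per-topic threshold (with a default fallback).
        limit = lag_thresholds.get(topic, default_lag_threshold)
        if total_lag > limit:
            findings.append(f"{topic}: total lag {total_lag} exceeds threshold {limit}")

        # 2. Commit freshness: flag partitions that have pending work but no
        #    recent commit (assumed heuristic for a stuck consumer).
        for part, lag in zip(parts, lags):
            age = now_epoch_s - part.last_commit_epoch_s
            if lag > 0 and age > max_commit_age_s:
                findings.append(
                    f"{topic}[{part.partition}]: last commit {age:.0f}s ago with lag {lag}"
                )

        # 3. Partition skew: worst partition lag relative to the topic's mean lag.
        mean_lag = total_lag / len(lags)
        if mean_lag > 0 and max(lags) / mean_lag > skew_ratio:
            findings.append(f"{topic}: partition skew, max lag {max(lags)} vs mean {mean_lag:.1f}")

    return findings
```

A scheduled run could pass the findings on to whatever produces the incident trigger log. The thresholds and skew ratio would need tuning per topic and are not derived from this reference.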
Attributes
displayName
Real-Time Streaming Health Check
workflowKind
operational
triggerType
scheduled
typicalCadence
daily
complexity
single-team
Outgoing edges
applies_to_domain (2)
- domain:data-engineering · Domain · Data Engineering
- domain:observability · Domain · Observability
involves_role (3)
- role:platform-engineer · Role
- role:data-scientist · Role · Data Scientist
- role:debugger · Role · Debugger
performed_by_org_unit (2)
- org-unit:data-platform-team · OrgUnit · Data Platform Team
- org-unit:engineering · OrgUnit · Engineering
requires_skill_area (2)
- skill-area:kafka-stream-processing · SkillArea · Kafka Stream Processing
- skill-area:observability-pipeline · SkillArea · Observability Pipeline
triggers_responsibility (2)
- responsibility:data-quality-monitoring · Responsibility · Data quality monitoring
- responsibility:on-call-handoff · Responsibility · On-call handoff
Incoming edges
follows_workflow (1)
- stack-profile:stream-processing · StackProfile · Stream Processing Stack (Kafka, Flink, Schema Registry, Prometheus)