Kobayashi — animated data flow

[Animated diagram: ingest and storage. Legend: 1 s write, 1 m write, 1 h write, query result, aging file, expired file (unlink).]

Ingest path: meters → collector (TLS, persistent) → streaming tier (Kafka producer, 1 s blocks) → Kafka → Kafka consumer (emits 1 s, 1 m, and 1 h blocks) → Kobayashi backend router → Riak.

SSD cluster, 1 s data: bitcask_expiry_secs = 20 (Bitcask config); file rotation every ~4 s.
HDD cluster, 1 m + 1 h data: EventHorizon tombstones (same Bitcask mod → same unlink behavior); file rotation every ~12 s; expiry ≈ 60 s.
Petrie query

Find 45 minutes of total traffic seen on meters 1, 2, 226, and 301, starting 18 hours ago, broken down by minute and peer IP, and retain the top 10 per minute by the ratio of retransmits to packets.

get volume_1s_meter_ip [
  meter in {1, 2, 226, 301};
  epochMillis from -18h for 45m;
]
categorize
  sum(ingress)            as ingress,
  sum(egress)             as egress,
  sum(ingressPackets + egressPackets) as packets,
  sum(retransmits)        as retransmits,
  mean(appRttUsec/1000) as appRttMs
by
  epochMillis/60/1000*60*1000 as ts_minute, ip
retain
  top 10
  per ts_minute
  on retransmits/packets
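
The categorize and retain stages of the query above can be read as a group-by followed by a per-minute top-N. The following Python sketch is only an illustration of that reading, not Petrie's implementation. It assumes that `epochMillis/60/1000*60*1000` uses integer division (so it truncates to the start of the minute) and that groups with zero packets rank with a ratio of 0.0. The function names and the row layout are hypothetical.

```python
from collections import defaultdict


def minute_bucket(epoch_millis: int) -> int:
    """Truncate a millisecond timestamp to the start of its minute.

    Mirrors epochMillis/60/1000*60*1000, assuming integer division.
    """
    return epoch_millis // 60 // 1000 * 60 * 1000


def categorize_and_retain(rows, top_n=10):
    """Group rows by (ts_minute, ip), then keep the top_n groups per minute
    ranked by retransmits/packets, highest ratio first."""
    sums = defaultdict(lambda: {"packets": 0, "retransmits": 0})
    for row in rows:
        key = (minute_bucket(row["epochMillis"]), row["ip"])
        sums[key]["packets"] += row["ingressPackets"] + row["egressPackets"]
        sums[key]["retransmits"] += row["retransmits"]

    per_minute = defaultdict(list)
    for (ts_minute, ip), agg in sums.items():
        # Assumption: a group with no packets ranks with ratio 0.0.
        ratio = agg["retransmits"] / agg["packets"] if agg["packets"] else 0.0
        per_minute[ts_minute].append((ip, ratio))

    return {
        ts_minute: sorted(groups, key=lambda g: g[1], reverse=True)[:top_n]
        for ts_minute, groups in per_minute.items()
    }
```

Only the packets and retransmits sums are modeled here, since those are the two columns the retain clause ranks on; the remaining aggregates would follow the same pattern.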
[Animated diagram: query path.]

Cube storage: volume_1s_meter_ip, laid out as major-time × meter-partition; each cell is ≈ 80 KB of pre-blocked tuples. The highlighted region is the 45 min query window.

Step 1: the planner computes the key set. Major/minor/partition alignment → deterministic block keys. Here: 15 keys (limit 60), so the query is accepted.

Query path in Kobayashi: Petrie parser (get · categorize · retain) → QueryPlanner (cost = number of keys, preflight reject) → Multiget (streams as keys land; reads visible blocks, see §3) → customer dashboard (line chart / CSV export over an arbitrary time range).
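
The planner step (deterministic block keys, cost equal to the number of keys, preflight reject) might look roughly like the sketch below. It is written under stated assumptions: the source gives neither the major-block duration nor the partitioning scheme, so `major_ms`, the modulo partitioning, `BlockKey`, `plan_keys`, `preflight`, and `QueryRejected` are all hypothetical. Only the limit of 60 keys comes from the diagram, and the example values are not meant to reproduce its 15 keys.

```python
from dataclasses import dataclass

KEY_LIMIT = 60  # from the diagram: "15 keys (limit 60)"


class QueryRejected(Exception):
    """Raised when a query would touch more keys than the limit allows."""


@dataclass(frozen=True)
class BlockKey:
    cube: str
    major_start_ms: int
    partition: int


def plan_keys(cube, meters, start_ms, duration_ms, major_ms, partitions):
    """Compute the block keys covering a time window for a set of meters.

    Assumption: a meter maps to partition (meter % partitions), and major
    blocks start at multiples of major_ms. The real alignment rules are
    not described in the source.
    """
    first_major = start_ms // major_ms * major_ms  # align down to a block start
    end_ms = start_ms + duration_ms
    parts = sorted({meter % partitions for meter in meters})
    return [
        BlockKey(cube, major_start, part)
        for major_start in range(first_major, end_ms, major_ms)
        for part in parts
    ]


def preflight(keys, limit=KEY_LIMIT):
    """Cost is the number of keys; reject before any read is issued."""
    if len(keys) > limit:
        raise QueryRejected(f"{len(keys)} keys exceeds limit of {limit}")
    return keys
```

Because the keys depend only on the window, the meters, and the alignment parameters, the same query always yields the same key list, which is what lets the cost be checked before the multiget starts.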

What you're watching. Pre-blocked write tuples leave the meters, transit the collector, and reach the streaming tier, which acts as the Kafka producer. The Kafka topic feeds a single Kafka consumer that emits three resolution-tagged block streams in parallel: 1 s, 1 m, and 1 h. All three streams reach Kobayashi, which routes by major-block size: 1 s goes to the SSD Riak cluster, 1 m and 1 h to the HDD cluster.

Each cluster's Bitcask data files appear in section 2, newest at the top of each stack. Files turn amber as they approach their retention horizon and red the moment they cross it, at which point the modified Bitcask unlinks the whole file in a single filesystem operation. The 1 s tier reaches that state via bitcask_expiry_secs aging; the 1 m and 1 h tiers reach it via EventHorizon tombstones. The outcome is the same, and both paths rely on one storage-engine modification.

Click Issue Petrie query to send a query through the parser, planner, and multiget stages. The multiget fans out reads against the visible files (briefly highlighted), and a result point lands in the dashboard chart. Chart values follow a sum of sine waves, so the series has an organic, seasonal shape.
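
The routing rule and the file color states described above can be summarized in a short sketch. It is a model of the animation only, not of Kobayashi or Bitcask internals. The expiry values (20 s and about 60 s) are the ones shown in the diagram; `AMBER_FRACTION`, the point at which a file starts to show as aging, is a made-up threshold, and `route`, `file_state`, and `FileState` are hypothetical names.

```python
from enum import Enum

# Expiry values shown in the animation: 20 s (SSD), roughly 60 s (HDD).
EXPIRY_SECS = {"ssd": 20, "hdd": 60}

# Assumption: files show as "aging" for the last quarter of their lifetime.
AMBER_FRACTION = 0.75


class FileState(Enum):
    FRESH = "fresh"
    AGING = "aging"      # amber: approaching the retention horizon
    EXPIRED = "expired"  # red: crossed the horizon, whole file is unlinked


def route(resolution: str) -> str:
    """Route a block by resolution: 1 s to SSD, 1 m and 1 h to HDD."""
    if resolution == "1s":
        return "ssd"
    if resolution in ("1m", "1h"):
        return "hdd"
    raise ValueError(f"unknown resolution: {resolution!r}")


def file_state(age_secs: float, cluster: str) -> FileState:
    """Classify a data file by its age relative to the cluster's expiry."""
    expiry = EXPIRY_SECS[cluster]
    if age_secs >= expiry:
        return FileState.EXPIRED
    if age_secs >= expiry * AMBER_FRACTION:
        return FileState.AGING
    return FileState.FRESH
```

For example, a 15-second-old file on the SSD cluster would classify as aging under these assumptions, while a file of the same age on the HDD cluster would still be fresh.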