Imagine your favorite grocery store has thousands of messages flying around every second — price changes, new orders, customer complaints, scanner beeps. How does it all get organized? Meet Apache Kafka and ZooKeeper!
FreshMart is a big grocery chain. Every second, hundreds of events happen.
Without a system like Kafka, every department would have to directly talk to every other department — the inventory system would need a direct wire to the sales system, the complaint system, the displays... It's like every person in school having their own personal intercom to every other person. Total chaos! Kafka acts like the school's PA system — one broadcast, everyone who needs it hears it.
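To put rough numbers on the intercom problem, here is a small sketch. It assumes every system needs to talk to every other system in the "direct wire" setup, and that each system needs only one connection in the "PA system" setup. The function names are made up for illustration.

```python
def direct_links(n: int) -> int:
    """Wires needed if each of n systems connects directly to every other one."""
    return n * (n - 1) // 2


def broker_links(n: int) -> int:
    """Wires needed if each of n systems connects only to a shared broker."""
    return n


for systems in (4, 6, 10):
    print(f"{systems} systems: {direct_links(systems)} direct wires "
          f"vs {broker_links(systems)} broker connections")
```

The direct-wire count grows roughly with the square of the number of systems, which is why adding "just one more department" gets painful quickly.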
Think of ZooKeeper as the store manager at FreshMart. They don't do the actual work — they coordinate who does what.
Kafka is like a massive bulletin board at FreshMart's back office. Anyone can post a message, and any department can read it at any time. Reading a message does not take it down: it stays posted until Kafka's retention period expires.
Producer: A system that sends messages to Kafka. In our store, the POS scanner produces a "sale happened" message every time someone checks out.
Consumer: A system that reads messages from Kafka. The inventory system reads sale messages to know when to reorder stock.
Topic: A named channel for messages. Like folders in a filing cabinet: sales, inventory, complaints.
Broker: A server that stores messages. FreshMart uses 3 brokers and keeps copies of its messages on more than one of them, so if one breaks, the others keep running.
Partition: Each topic is split into partitions (slices). Like splitting a pizza: multiple people can eat at once, making it faster!
Offset: Each message gets a position number (offset) within its partition. Consumers remember their offset so they know where they left off reading.
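To see how these pieces fit together, here is a toy model in plain Python. It is an in-memory sketch, not the real Kafka client API: `ToyTopic`, `produce`, and `consume` are invented names, and the CRC32 hash is only a stand-in for however a real producer picks a partition.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a Kafka topic (NOT the real Kafka API)."""

    def __init__(self, name: str, num_partitions: int = 3):
        self.name = name
        # Each partition is an append-only list; the list index is the offset.
        self.partitions = [[] for _ in range(num_partitions)]
        # committed[(group, partition)] -> next offset that group will read
        self.committed = defaultdict(int)

    def produce(self, key: str, value: str) -> tuple[int, int]:
        """Append a message; the same key always lands in the same partition."""
        partition = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[partition].append(value)
        return partition, len(self.partitions[partition]) - 1

    def consume(self, group: str, partition: int) -> list[str]:
        """Return this group's unread messages, then remember the new offset."""
        start = self.committed[(group, partition)]
        messages = self.partitions[partition][start:]
        self.committed[(group, partition)] = start + len(messages)
        return messages


sales = ToyTopic("pos-sales", num_partitions=1)
sales.produce("store-1", "sale: milk")
sales.produce("store-1", "sale: apples")
print(sales.consume("inventory", 0))  # ['sale: milk', 'sale: apples']
print(sales.consume("inventory", 0))  # [] (nothing new since last read)
print(sales.consume("analytics", 0))  # ['sale: milk', 'sale: apples']
```

Notice that the analytics reader still gets both messages after inventory has read them. Each reader keeps its own offset, and reading never removes anything from the list.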
Each department's messages live in their own topic. Here are all of FreshMart's Kafka topics:
| Topic Name | Data Source | Message Rate | Who Reads It | Example Message |
|---|---|---|---|---|
| pos-sales | 🧾 Checkout Scanners | ~50/min | Inventory, Analytics, Loyalty | |
| inventory-updates | 🏷️ Warehouse Scanners | ~20/min | Displays, Vendors, Ordering System | |
| customer-complaints | 😤 App + Help Desk | ~3/min | Manager Alerts, CRM, Analytics | |
| vendor-deliveries | 🚚 Dock Scanners | ~5/hour | Inventory, Accounting, Ordering | |
| display-updates | 🖥️ Price Management | ~10/min | All Digital Signs | |
| daily-sales-report | 📊 Analytics Engine | 1/day | Management, HQ, Finance | |
ETL is like cooking a meal from raw ingredients. You collect ingredients (Extract), prepare them (Transform), then serve them (Load).
Pull raw data from all sources: POS scanners, inventory sensors, complaint forms, vendor APIs. Kafka delivers these messages in real time.
Clean, combine, and reshape data. Convert timestamps, merge sale + product tables, flag low-stock alerts, calculate daily totals.
Put the clean, ready data into databases, dashboards, or reporting systems — where humans can use it to make decisions.
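The three ETL steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the message fields (`sku`, `dept`, `qty`, `unit_price`), the sample values, and the plain dictionary standing in for a real database.

```python
import json
from collections import defaultdict

# Hypothetical raw messages, shaped the way a sales topic might carry them.
RAW_MESSAGES = [
    '{"sku": "APPLE", "dept": "Produce", "qty": 3, "unit_price": 0.50}',
    '{"sku": "MILK", "dept": "Dairy", "qty": 1, "unit_price": 3.20}',
    '{"sku": "KALE", "dept": "Produce", "qty": 2, "unit_price": 1.25}',
]


def extract(raw_messages):
    """Extract: parse each raw JSON string into a dictionary."""
    return [json.loads(message) for message in raw_messages]


def transform(sales):
    """Transform: add up revenue per department."""
    totals = defaultdict(float)
    for sale in sales:
        totals[sale["dept"]] += sale["qty"] * sale["unit_price"]
    return {dept: round(total, 2) for dept, total in totals.items()}


def load(totals, warehouse):
    """Load: write the totals into the destination (a dict stands in for a table)."""
    warehouse.update(totals)


warehouse = {}
load(transform(extract(RAW_MESSAGES)), warehouse)
print(warehouse)  # {'Produce': 4.0, 'Dairy': 3.2}
```

Keeping the three steps as separate functions mirrors the cooking analogy: you can swap the ingredients (a different source) or the plate (a different database) without touching the recipe in the middle.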
Example: the DW_SALES_FACT table is updated and the dashboard refreshes. The manager sees "Produce: +12% vs yesterday." A vendor email is queued for an apple reorder.
The inventory system tracks every product's quantity. When milk drops below 20 gallons, it fires an alert. This feeds the reorder system AND the digital dairy-aisle display automatically.
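The low-stock rule could look something like this. It is a minimal sketch: the 20-gallon milk threshold comes from the example above, while the function name, the alert fields, and the idea of a per-SKU threshold table are assumptions.

```python
# Threshold from the milk example; a real system would have one per product.
LOW_STOCK_THRESHOLD = {"MILK": 20}  # gallons


def apply_sale(stock: dict, sku: str, qty: int):
    """Decrement stock for one sale; return an alert dict if stock is now low."""
    stock[sku] -= qty
    threshold = LOW_STOCK_THRESHOLD.get(sku)
    if threshold is not None and stock[sku] < threshold:
        return {"type": "low-stock", "sku": sku, "on_hand": stock[sku]}
    return None


stock = {"MILK": 21}
print(apply_sale(stock, "MILK", 1))  # None (20 gallons is not below 20)
print(apply_sale(stock, "MILK", 1))  # {'type': 'low-stock', 'sku': 'MILK', 'on_hand': 19}
```

In this sketch the alert would fire again on every later sale while stock stays low. A real system would probably remember that it already alerted, but that detail is left out here.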
Every checkout beep sends a message. Multiple consumers read this same message: inventory decrements stock, analytics counts revenue, loyalty system adds points.
When a customer reports rotten strawberries in the app, a message fires. If 3+ complaints about the same SKU hit in 30 min, it auto-alerts the store manager.
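One way to implement the "3+ complaints in 30 minutes" rule is a sliding window per SKU. This is a sketch under assumptions: the `ComplaintMonitor` class is invented for illustration, timestamps are plain seconds, and complaints are assumed to arrive in time order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30 * 60  # 30 minutes
ALERT_THRESHOLD = 3       # complaints about the same SKU


class ComplaintMonitor:
    """Tracks recent complaint times per SKU (hypothetical helper)."""

    def __init__(self):
        self.recent = defaultdict(deque)  # sku -> timestamps inside the window

    def record(self, sku: str, timestamp: float) -> bool:
        """Record one complaint; return True if the manager should be alerted."""
        window = self.recent[sku]
        window.append(timestamp)
        # Drop complaints that are older than 30 minutes.
        while timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) >= ALERT_THRESHOLD


monitor = ComplaintMonitor()
print(monitor.record("STRAWBERRY", 0))     # False
print(monitor.record("STRAWBERRY", 600))   # False
print(monitor.record("STRAWBERRY", 1200))  # True (3 complaints within 30 min)
```

A deque fits here because old complaints only ever fall off the front and new ones only ever arrive at the back.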
When the delivery truck's barcode is scanned at the dock, Kafka records it. Inventory auto-updates. The accounting system also reads this to match invoices.
The electronic price signs CONSUME messages from Kafka. When a sale is triggered in the pricing system, every sign in the produce aisle updates within seconds — no human needed!
The analytics engine is both a consumer AND a producer! It reads from sales, transforms the data, then produces a summary to the daily-sales-report topic for management.
Here's how everything connects. ZooKeeper coordinates Kafka, which manages all the messages flowing between every system.