Threadwork started as a simple threaded discussion tool. But users wanted more: they wanted to see who was online, who was typing, when messages were read. These 'simple' features turned into our biggest engineering challenge. Real-time isn't about WebSockets - it's about state synchronization.
We evaluated Redux, Jotai, and Zustand for state management. Redux felt like overkill for our needs. Jotai's atomic model didn't map well to our highly relational data. Zustand hit the sweet spot: minimal boilerplate, excellent TypeScript support, and easy integration with React's concurrent features.
The Presence System
Showing who's online seems trivial: track connections, broadcast status changes. The complexity emerges in edge cases. What happens when a user opens multiple tabs? When their network flickers? When they go idle?
We implemented a heartbeat system with debounced status broadcasts. Clients send heartbeats every 30 seconds. The server tracks the last heartbeat per user, not per connection, so multiple tabs collapse into a single presence record. Status changes (online/away/offline) are broadcast only when the aggregate state changes, not on every heartbeat.
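A minimal sketch of per-user aggregation, not Threadwork's actual server code. It assumes each tab reports its own 'online' or 'away' status with every heartbeat, and that a user counts as gone after three missed heartbeats (90 seconds). The names PresenceTracker, heartbeat, and sweep are hypothetical, as is the timeout value.

```typescript
type Status = "online" | "away" | "offline";

// Assumption: three missed 30-second heartbeats mean the user is gone.
const HEARTBEAT_TIMEOUT_MS = 90_000;

interface UserRecord {
  lastOnlineAt: number; // latest heartbeat from any tab that reported "online"
  lastAwayAt: number;   // latest heartbeat from any tab that reported "away"
  status: Status;       // last status that was broadcast
}

class PresenceTracker {
  private readonly users = new Map<string, UserRecord>();
  private readonly broadcast: (userId: string, status: Status) => void;

  constructor(broadcast: (userId: string, status: Status) => void) {
    this.broadcast = broadcast;
  }

  /** Record a heartbeat from any one of the user's tabs. */
  heartbeat(userId: string, tabStatus: "online" | "away", now: number): void {
    let rec = this.users.get(userId);
    if (!rec) {
      rec = { lastOnlineAt: -Infinity, lastAwayAt: -Infinity, status: "offline" };
      this.users.set(userId, rec);
    }
    if (tabStatus === "online") rec.lastOnlineAt = now;
    else rec.lastAwayAt = now;
    this.refresh(userId, rec, now);
  }

  /** Run on a timer to catch users whose heartbeats stopped arriving. */
  sweep(now: number): void {
    for (const [userId, rec] of this.users) this.refresh(userId, rec, now);
  }

  // The most active recent report wins; broadcast only when the aggregate changes.
  private refresh(userId: string, rec: UserRecord, now: number): void {
    const next: Status =
      now - rec.lastOnlineAt <= HEARTBEAT_TIMEOUT_MS ? "online"
      : now - rec.lastAwayAt <= HEARTBEAT_TIMEOUT_MS ? "away"
      : "offline";
    if (next === rec.status) return;
    rec.status = next;
    this.broadcast(userId, next);
  }
}
```

With this shape, an idle second tab cannot flip an active user to 'away', and a network flicker shorter than the timeout produces no broadcast at all. The trade-off is that a transition to 'away' or 'offline' can lag by up to one timeout window.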
Idle detection uses the Page Visibility API combined with mouse/keyboard activity tracking. After 5 minutes of no activity, users transition to 'away'. After 30 minutes, they appear offline. This provides useful presence information without being creepy about user monitoring.
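The idle rules can be sketched on the client roughly as follows. The 5-minute and 30-minute thresholds come from the text above; the IdleMonitor class, the choice of events, and the injected clock are assumptions made for illustration. The small structural interfaces stand in for the browser's document object so the sketch also runs outside a browser.

```typescript
type Status = "online" | "away" | "offline";

const AWAY_AFTER_MS = 5 * 60_000;     // away after 5 minutes without activity
const OFFLINE_AFTER_MS = 30 * 60_000; // offline after 30 minutes without activity

function statusFromIdle(idleMs: number): Status {
  if (idleMs >= OFFLINE_AFTER_MS) return "offline";
  return idleMs >= AWAY_AFTER_MS ? "away" : "online";
}

// Minimal shape of what the monitor needs; in a browser, `document` is the intended argument.
interface ListenerTarget {
  addEventListener(type: string, listener: () => void): void;
}
interface VisibilityTarget extends ListenerTarget {
  readonly visibilityState: string;
}

class IdleMonitor {
  private lastActivityAt: number;
  private readonly now: () => number;

  constructor(doc: VisibilityTarget, now: () => number = Date.now) {
    this.now = now;
    this.lastActivityAt = now();
    const touch = (): void => { this.lastActivityAt = this.now(); };

    // Only a timestamp is stored: no keystrokes, coordinates, or content.
    for (const type of ["mousemove", "keydown", "pointerdown"]) {
      doc.addEventListener(type, touch);
    }
    // Page Visibility API: returning to the tab counts as activity, hiding it does not.
    doc.addEventListener("visibilitychange", () => {
      if (doc.visibilityState === "visible") touch();
    });
  }

  status(): Status {
    return statusFromIdle(this.now() - this.lastActivityAt);
  }
}
```

A heartbeat loop could call status() every 30 seconds and include the result in its payload.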
Optimistic Updates Done Right
Sending a message should feel instant. But we can't just fire-and-forget - users need to know whether their message was actually delivered. Our solution: optimistic updates with status indicators.
When a user sends a message, it immediately appears in their UI with a 'sending' indicator. The actual send happens asynchronously. On success, the indicator updates to 'sent'. On failure, it shows an error with a retry option. The message never disappears - that would be jarring.
Zustand's middleware pattern made this clean. We created a 'pending actions' slice that tracks in-flight operations. The UI subscribes to both the messages slice and the pending slice, rendering the appropriate state.
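The two-slice idea can be sketched with plain state transitions, shown here without Zustand so the example stays self-contained. Each function returns a new state object, so it could be wrapped in a Zustand set((state) => ...) call. All names (ChatState, beginSend, resolveSend, deliveryStatus) are hypothetical, and the sketch assumes the client generates the message id and the server keeps it.

```typescript
type PendingStatus = "sending" | "failed";
type DeliveryStatus = PendingStatus | "sent";

interface Message {
  id: string; // assumption: generated on the client and preserved by the server
  body: string;
}

interface ChatState {
  messages: Record<string, Message>;      // messages slice
  pending: Record<string, PendingStatus>; // pending-actions slice, keyed by message id
}

/** Show the message immediately and mark it in flight. Also used for retries. */
function beginSend(state: ChatState, msg: Message): ChatState {
  return {
    messages: { ...state.messages, [msg.id]: msg },
    pending: { ...state.pending, [msg.id]: "sending" },
  };
}

/** Settle an in-flight send. The message itself is never removed, even on failure. */
function resolveSend(state: ChatState, id: string, ok: boolean): ChatState {
  const pending = { ...state.pending };
  if (ok) delete pending[id];
  else pending[id] = "failed";
  return { ...state, pending };
}

/** What the UI renders: derived from both slices. */
function deliveryStatus(state: ChatState, id: string): DeliveryStatus | undefined {
  if (!(id in state.messages)) return undefined;
  return state.pending[id] ?? "sent";
}

/** Glue code: `store` and `transport` are stand-ins for the real store and network layer. */
async function sendMessage(
  store: { get(): ChatState; set(next: ChatState): void },
  msg: Message,
  transport: (msg: Message) => Promise<void>,
): Promise<void> {
  store.set(beginSend(store.get(), msg));
  try {
    await transport(msg);
    store.set(resolveSend(store.get(), msg.id, true));
  } catch {
    store.set(resolveSend(store.get(), msg.id, false));
  }
}
```

Because resolveSend reads the latest state rather than a snapshot captured before the request, other updates that arrive while a send is in flight are not overwritten when it settles.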
Typing Indicators at Scale
Typing indicators seem simple: when someone types, tell everyone. But naive implementations create storms of WebSocket messages. If every keystroke sends an event and the server fans each one out to the other participants, a thread with 50 people can generate thousands of 'user is typing' deliveries per minute.
We batch and throttle. Clients send a 'typing' event at most once every 2 seconds while the user is actively typing. The server aggregates by thread and broadcasts a list of currently-typing users, not individual events. Clients use local timers to clear stale typing indicators.
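Both halves can be sketched in a few lines. The 2-second interval is from the text above; the 5-second expiry, the createTypingNotifier and TypingAggregator names, and the injected clock are assumptions for illustration.

```typescript
const TYPING_INTERVAL_MS = 2_000; // at most one "typing" event per 2 seconds
const TYPING_TTL_MS = 5_000;      // assumption: an indicator goes stale after 5 seconds of silence

/** Client side: call the returned function on every keystroke; `send` fires at most once per interval. */
function createTypingNotifier(send: () => void, now: () => number = Date.now): () => void {
  let lastSentAt = -Infinity;
  return function onKeystroke(): void {
    const t = now();
    if (t - lastSentAt < TYPING_INTERVAL_MS) return; // throttled
    lastSentAt = t;
    send();
  };
}

/** Server side: collapse individual events into one list of typing users per thread. */
class TypingAggregator {
  // threadId -> (userId -> time of that user's latest typing event)
  private readonly threads = new Map<string, Map<string, number>>();

  record(threadId: string, userId: string, now: number): void {
    let users = this.threads.get(threadId);
    if (!users) {
      users = new Map<string, number>();
      this.threads.set(threadId, users);
    }
    users.set(userId, now);
  }

  /** The list to broadcast for a thread; stale entries are pruned first. */
  snapshot(threadId: string, now: number): string[] {
    const users = this.threads.get(threadId);
    if (!users) return [];
    for (const [userId, at] of users) {
      if (now - at >= TYPING_TTL_MS) users.delete(userId);
    }
    if (users.size === 0) this.threads.delete(threadId);
    return [...users.keys()].sort();
  }
}
```

With 50 participants all typing, the server receives at most 25 events per second for the thread and can broadcast one list per tick instead of relaying each event individually.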
Message Threading and AI Summaries
Threadwork's core innovation is treating every conversation as a structured thread, not a chat stream. Messages belong to threads. Threads belong to spaces. This hierarchy maps naturally to Zustand's nested state.
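As an illustration of that hierarchy, here is one possible nested state shape with an immutable update. The field names and the appendMessage helper are hypothetical; the article does not show Threadwork's real schema.

```typescript
interface Message {
  id: string;
  authorId: string;
  body: string;
}

interface Thread {
  id: string;
  title: string;
  messages: Message[];
}

interface Space {
  id: string;
  name: string;
  threads: Record<string, Thread>;
}

interface WorkspaceState {
  spaces: Record<string, Space>;
}

/**
 * Append a message to a thread without mutating the previous state.
 * Only the objects on the path space -> thread -> messages are copied,
 * so subscribers to other spaces and threads see unchanged references.
 */
function appendMessage(
  state: WorkspaceState,
  spaceId: string,
  threadId: string,
  msg: Message,
): WorkspaceState {
  const space = state.spaces[spaceId];
  const thread = space?.threads[threadId];
  if (!space || !thread) return state; // unknown target: leave state untouched

  return {
    spaces: {
      ...state.spaces,
      [spaceId]: {
        ...space,
        threads: {
          ...space.threads,
          [threadId]: { ...thread, messages: [...thread.messages, msg] },
        },
      },
    },
  };
}
```

Preserving references for untouched branches is what lets a selector-based store skip re-rendering components that watch other threads.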
AI summaries run asynchronously when threads reach certain length thresholds. We use streaming responses so users see the summary build in real time. The summary includes extracted decisions, action items, and key discussion points.
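A rough sketch of the two moving parts: deciding when to summarize, and folding streamed chunks into state. The threshold values and the chunk format are invented for illustration, since the article specifies neither; only the three summary sections come from the text above.

```typescript
// Hypothetical thresholds: the article only says "certain length thresholds".
const SUMMARY_THRESHOLDS: readonly number[] = [25, 50, 100];

/** Highest threshold crossed when a thread grows from `before` to `after` messages, if any. */
function crossedThreshold(before: number, after: number): number | undefined {
  let crossed: number | undefined;
  for (const t of SUMMARY_THRESHOLDS) {
    if (before < t && t <= after) crossed = t;
  }
  return crossed;
}

type SummarySection = "decisions" | "actionItems" | "keyPoints";

// Assumed wire format for streamed chunks, not a real API.
type SummaryChunk =
  | { kind: "delta"; section: SummarySection; text: string }
  | { kind: "done" };

interface SummaryState {
  sections: Record<SummarySection, string>;
  complete: boolean;
}

const emptySummary: SummaryState = {
  sections: { decisions: "", actionItems: "", keyPoints: "" },
  complete: false,
};

/** Fold one streamed chunk into the summary; the UI re-renders after each call. */
function applyChunk(state: SummaryState, chunk: SummaryChunk): SummaryState {
  if (chunk.kind === "done") return { ...state, complete: true };
  return {
    ...state,
    sections: {
      ...state.sections,
      [chunk.section]: state.sections[chunk.section] + chunk.text,
    },
  };
}
```

Keeping applyChunk pure means a dropped connection mid-stream leaves a partial but consistent summary with complete still false, which the UI can show as interrupted rather than finished.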
Lessons Learned
Real-time features are 10% WebSocket code and 90% state synchronization challenges. The hardest bugs weren't connection issues - they were race conditions between optimistic updates and server responses, or stale state from reconnection scenarios.
Zustand's simplicity was crucial. When debugging real-time state, you need to understand exactly what's happening. Complex state management adds cognitive overhead at exactly the wrong moment. Keep it simple, make it observable, test the edge cases obsessively.