Our team’s Scrum process relies heavily on an in-house real-time retrospective board. But over the last few Sprints, a critical technical pain point became impossible to ignore: network fluctuations or brief backend service restarts would sever the frontend WebSocket connection, causing users to lose their input and disrupting the entire team’s flow. Worse yet, the UI provided virtually no feedback on the connection status, leaving team members to refresh the page and simply hope for the connection to be restored. A seemingly simple WebSocket connection had revealed its fragility in a production environment.
The initial implementation was dangerously naive:
// A simplified version of our old implementation
import React, { useEffect, useState } from 'react';
function OldFlakyBoard() {
  const [messages, setMessages] = useState([]);
  const ws = new WebSocket('wss://api.example.com/retrospective');
  useEffect(() => {
    ws.onmessage = (event) => {
      const newMessage = JSON.parse(event.data);
      setMessages(prev => [...prev, newMessage]);
    };
    ws.onclose = () => {
      // What to do here? Just log it?
      console.error('WebSocket disconnected.');
    };
    return () => {
      ws.close();
    };
  }, []);
  // ... render logic
}
The problems with this code are obvious: it handles no edge cases. The socket is created in the component body, so every re-render opens another connection, and once the onclose event fires, the connection is dead for good unless the user manually refreshes. This is completely unacceptable in a high-availability collaborative scenario.
Conception and Design: A Robust Communication Abstraction Layer
The root of the problem was the tight coupling of WebSocket management logic with our UI components and the lack of a clear state machine to describe the connection’s lifecycle. To solve this, my vision was to create an independent, reusable React Hook: useWebSocket.
This Hook had to meet several core requirements:
- State Machine Management: Explicitly manage the four connection states: CONNECTING, OPEN, CLOSING, and CLOSED.
- Automatic Reconnection: When the connection drops unexpectedly, it must automatically attempt to reconnect using an Exponential Backoff strategy to avoid overwhelming a recovering server with DDoS-like requests.
- Message Buffering: Messages sent by the user during a disconnect should not be discarded. They must be queued and sent sequentially once the connection is re-established.
- Clean API: Provide a simple interface to components, exposing the connection status, the latest received message, and a safe sendMessage method.
- Visual State Feedback: Use Sass/SCSS to strongly link the connection state to UI elements, giving users immediate and clear visual feedback.
For our tech stack, we decided against introducing third-party libraries like socket.io. While they offer out-of-the-box reconnection features, we wanted 100% control over the reconnection logic and message buffering strategy, while also keeping our stack lean. The native WebSocket API, combined with the powerful state management capabilities of React Hooks, was more than sufficient to build the resilience layer we needed.
Setting Up a Mock Server Environment
Before starting the frontend implementation, we needed a WebSocket server that could simulate an unstable network environment. A simple one can be quickly built using Node.js and the ws library. The key feature is that it can be manually restarted to test the client’s reconnection logic.
server.js:
// A simple WebSocket server for testing resilience
const WebSocket = require('ws');
const PORT = process.env.PORT || 8080;
const wss = new WebSocket.Server({ port: PORT });
let clientCounter = 0;
wss.on('connection', (ws) => {
  clientCounter++;
  const clientId = clientCounter;
  console.log(`[Server] Client #${clientId} connected.`);
  // Broadcast to all clients
  const broadcast = (message) => {
    wss.clients.forEach((client) => {
      if (client.readyState === WebSocket.OPEN) {
        client.send(JSON.stringify(message));
      }
    });
  };
  broadcast({
    type: 'user_join',
    payload: { id: clientId, count: wss.clients.size },
    timestamp: new Date().toISOString()
  });
  ws.on('message', (message) => {
    try {
      const parsedMessage = JSON.parse(message);
      console.log(`[Server] Received from Client #${clientId}:`, parsedMessage);
      // Echo back the message with server timestamp
      const response = {
        ...parsedMessage,
        meta: {
          serverTimestamp: new Date().toISOString(),
          processedBy: 'main_server'
        }
      };
      broadcast(response);
    } catch (error) {
      console.error(`[Server] Error parsing message from Client #${clientId}:`, error);
    }
  });
  ws.on('close', () => {
    console.log(`[Server] Client #${clientId} disconnected.`);
    broadcast({
      type: 'user_leave',
      payload: { id: clientId, count: wss.clients.size },
      timestamp: new Date().toISOString()
    });
  });
  ws.on('error', (error) => {
    console.error(`[Server] WebSocket error for Client #${clientId}:`, error);
  });
});
console.log(`[Server] WebSocket server started on ws://localhost:${PORT}`);
console.log('[Server] Press CTRL+C to stop the server.');
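process.on('SIGINT', () => {
  console.log('\n[Server] Shutting down gracefully...');
  // Close client sockets explicitly: recent ws releases (8.x) no longer do this
  // inside wss.close(), so the callback below would otherwise wait on open clients.
  // Code 1001 ("going away") is not a normal closure, so clients will try to reconnect.
  wss.clients.forEach((client) => client.close(1001, 'Server shutting down'));
  wss.close(() => {
    console.log('[Server] All connections closed.');
    process.exit(0);
  });
});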
This server is basic but perfectly adequate for testing. We can start it with node server.js, stop it with CTRL+C, and restart it at will during testing.
Core Implementation: The useWebSocket Custom Hook
This is the heart of the entire solution. We’ll create a useWebSocket.js file to house all our connection management logic.
// src/hooks/useWebSocket.js
import { useState, useEffect, useRef, useCallback } from 'react';
// Define connection states as constants for clarity and to avoid magic numbers.
// The values mirror the native WebSocket readyState constants.
export const ReadyState = {
  CONNECTING: 0,
  OPEN: 1,
  CLOSING: 2,
  CLOSED: 3,
};
const useWebSocket = (url, userOptions = {}) => {
  // Merge with the defaults so callers can override a single option
  const options = { retry: 5, retryInterval: 3000, ...userOptions };
  const [lastMessage, setLastMessage] = useState(null);
  const [readyState, setReadyState] = useState(ReadyState.CLOSED);
  // Use useRef to hold WebSocket instance, message queue, and retry logic state.
  // This prevents re-renders from recreating them.
  const ws = useRef(null);
  const messageQueue = useRef([]);
  const retryCount = useRef(0);
  const sendMessage = useCallback((message) => {
    const formattedMessage = JSON.stringify(message);
    // Check the socket itself rather than the readyState state variable,
    // which can lag one render behind the real connection.
    if (ws.current && ws.current.readyState === ReadyState.OPEN) {
      ws.current.send(formattedMessage);
    } else {
      // If the connection is not open, buffer the message.
      console.warn('[useWebSocket] Connection not open. Buffering message:', message);
      messageQueue.current.push(formattedMessage);
    }
  }, []);
  // Handle of the pending reconnect timer, so it can be cancelled during cleanup
  const retryTimer = useRef(null);
  const connect = useCallback(() => {
    if (ws.current && ws.current.readyState !== ReadyState.CLOSED) {
      // Prevent multiple connection attempts
      return;
    }
    setReadyState(ReadyState.CONNECTING);
    const socket = new WebSocket(url);
    ws.current = socket;
    socket.onopen = () => {
      console.log(`[useWebSocket] Connection opened to ${url}`);
      setReadyState(ReadyState.OPEN);
      retryCount.current = 0; // Reset retry counter on successful connection
      // Flush message queue on successful connection
      if (messageQueue.current.length > 0) {
        console.log(`[useWebSocket] Flushing ${messageQueue.current.length} buffered messages.`);
        messageQueue.current.forEach((msg) => socket.send(msg));
        messageQueue.current = []; // Clear the queue
      }
    };
    socket.onmessage = (event) => {
      // In a real project, you'd likely want more complex logic here,
      // perhaps a reducer to manage a list of messages.
      // For this hook, we just expose the last message.
      setLastMessage(event);
    };
    socket.onerror = (error) => {
      console.error('[useWebSocket] WebSocket error:', error);
      // The onclose event will be fired subsequently, which handles reconnection.
    };
    socket.onclose = (event) => {
      if (ws.current !== socket) {
        // This socket was detached by the cleanup below; ignore its late close event.
        return;
      }
      console.warn(`[useWebSocket] Connection closed. Code: ${event.code}, Reason: ${event.reason}`);
      setReadyState(ReadyState.CLOSED);
      // Only attempt to reconnect if the closure was unexpected.
      // 1000 is a normal closure.
      if (event.code !== 1000 && retryCount.current < options.retry) {
        retryCount.current++;
        const timeout = options.retryInterval * Math.pow(2, retryCount.current - 1); // Exponential backoff
        console.log(`[useWebSocket] Attempting to reconnect in ${timeout / 1000}s (Attempt ${retryCount.current}/${options.retry}).`);
        retryTimer.current = setTimeout(connect, timeout);
      } else if (retryCount.current >= options.retry) {
        console.error(`[useWebSocket] Reached max retry attempts (${options.retry}). Giving up.`);
      }
    };
  }, [url, options.retry, options.retryInterval]);
  // The main effect to initiate and clean up the connection
  useEffect(() => {
    retryCount.current = 0; // Start each mount with a fresh retry budget
    connect();
    // Cleanup function to close the connection when the component unmounts
    return () => {
      retryCount.current = options.retry + 1; // Prevent reconnection on unmount
      clearTimeout(retryTimer.current); // Cancel any pending reconnect attempt
      if (ws.current) {
        const socket = ws.current;
        ws.current = null; // Detach so a re-mount (e.g. React StrictMode) can open a new socket
        setReadyState(ReadyState.CLOSING);
        socket.close(1000, 'Component unmounting');
        console.log('[useWebSocket] Connection closed on component unmount.');
      }
    };
    // The dependency array should be stable. connect is wrapped in useCallback.
  }, [connect, options.retry]);
  return { sendMessage, lastMessage, readyState };
};
export default useWebSocket;
Several key considerations went into this Hook’s design:
- The Central Role of useRef: The WebSocket instance (ws.current), message queue (messageQueue.current), and retry counter (retryCount.current) are all stored in refs created with useRef. This is critical because changes to a ref do not trigger a component re-render. It acts like an instance variable, persisting its state across renders, making it perfect for managing these non-UI states.
- The Necessity of useCallback: sendMessage and connect are wrapped in useCallback to ensure their references remain stable between renders. This is important for performance optimization and preventing the useEffect hook from firing unnecessarily.
- Exponential Backoff Strategy: The delay passed to setTimeout, options.retryInterval * Math.pow(2, retryCount.current - 1), is the core of our exponential backoff. With the default 3000 ms interval, the first retry happens after 3 seconds, the second after 6, the third after 12, and so on. This gives the server time to recover and prevents the client from wasting resources.
- Graceful Shutdown: In the useEffect cleanup function, we explicitly close the socket with close(1000, ...). Status code 1000 signifies a normal closure, which prevents the reconnection logic in the onclose handler from triggering. Setting retryCount beyond the maximum serves as a double-check.
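To make the backoff arithmetic concrete, here is a minimal sketch of the delay calculation in isolation. The helper name backoffDelay is hypothetical (the hook computes the value inline), and the 3000 ms base is assumed from the hook's default retryInterval.

```javascript
// Hypothetical helper: delay in ms before the given retry attempt (1-based),
// using the same formula as the hook: base * 2^(attempt - 1).
function backoffDelay(attempt, baseInterval = 3000) {
  return baseInterval * Math.pow(2, attempt - 1);
}

// Delays for the default budget of five attempts.
const schedule = [1, 2, 3, 4, 5].map((n) => backoffDelay(n));
console.log(schedule); // [ 3000, 6000, 12000, 24000, 48000 ]
```

With the defaults, the client therefore spends about 93 seconds waiting in total (plus the time each failed attempt itself takes) before it gives up.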
State Visualization: The Role of Sass/SCSS
With the robust connection logic in place, the next step is to provide clear visual feedback to the user. We’ll create a StatusIndicator component whose styling is driven by the connection state, with the per-state rules generated by Sass.
src/components/StatusIndicator.js:
import React from 'react';
import './StatusIndicator.scss';
import { ReadyState } from '../hooks/useWebSocket';
const StatusIndicator = ({ readyState }) => {
  const statusTextMap = {
    [ReadyState.CONNECTING]: 'Connecting...',
    [ReadyState.OPEN]: 'Connected',
    [ReadyState.CLOSING]: 'Closing...',
    [ReadyState.CLOSED]: 'Disconnected',
  };
  const getStatusString = (state) => {
    switch (state) {
      case ReadyState.CONNECTING: return 'connecting';
      case ReadyState.OPEN: return 'open';
      case ReadyState.CLOSING: return 'closing';
      case ReadyState.CLOSED: return 'closed';
      default: return 'unknown';
    }
  };
  const statusString = getStatusString(readyState);
  return (
    <div className="status-indicator" data-status={statusString}>
      <div className="status-indicator__light"></div>
      <span className="status-indicator__text">{statusTextMap[readyState] || 'Unknown Status'}</span>
    </div>
  );
};
export default StatusIndicator;
The key here is passing the connection state to the DOM via the data-status attribute. Now, Sass can leverage this attribute to apply different styles.
src/components/StatusIndicator.scss:
// Define a map for status colors for easy maintenance
$status-colors: (
  connecting: #f39c12, // orange
  open: #2ecc71, // green
  closing: #e67e22, // dark orange
  closed: #e74c3c // red
);
.status-indicator {
  display: flex;
  align-items: center;
  padding: 8px 12px;
  border-radius: 4px;
  background-color: #34495e;
  color: #ecf0f1;
  font-family: sans-serif;
  font-size: 14px;
  transition: background-color 0.3s ease;
  &__light {
    width: 12px;
    height: 12px;
    border-radius: 50%;
    margin-right: 8px;
    transition: background-color 0.3s ease, box-shadow 0.3s ease;
    background-color: #7f8c8d; // Default color
  }
  // Loop through the map to generate styles for each status
  @each $status, $color in $status-colors {
    &[data-status='#{$status}'] {
      .status-indicator__light {
        background-color: $color;
        box-shadow: 0 0 8px 0 rgba($color, 0.7);
      }
    }
  }
  // Add specific animations for connecting state
  &[data-status='connecting'] {
    .status-indicator__light {
      animation: pulse 1.5s infinite ease-in-out;
    }
  }
}
@keyframes pulse {
  0% {
    transform: scale(0.9);
    opacity: 0.7;
  }
  50% {
    transform: scale(1.1);
    opacity: 1;
  }
  100% {
    transform: scale(0.9);
    opacity: 0.7;
  }
}
The advantages of using Sass here are on full display:
- $status-colors Map: Defining states and colors in one place makes theme changes or color adjustments trivial.
- @each Loop: Instead of manually writing repetitive CSS rules for each state, the @each loop automatically generates all the [data-status] selectors, leading to cleaner and more maintainable code.
- Animation: A simple pulse animation is added for the connecting state. This subtle dynamic effect significantly improves the user experience, letting the user know the system is actively working.
Integrating into the Retrospective Board Component
Now, we can integrate the useWebSocket Hook and the StatusIndicator component into our main application component.
src/App.js:
import React, { useState, useEffect } from 'react';
import useWebSocket, { ReadyState } from './hooks/useWebSocket';
import StatusIndicator from './components/StatusIndicator';
const WS_URL = 'ws://localhost:8080';
function App() {
  const [messages, setMessages] = useState([]);
  const [inputValue, setInputValue] = useState('');
  const { sendMessage, lastMessage, readyState } = useWebSocket(WS_URL, {
    retry: 10,
    retryInterval: 5000
  });
  useEffect(() => {
    if (lastMessage !== null) {
      // Parse the data and add it to our message list
      const data = JSON.parse(lastMessage.data);
      setMessages((prev) => [...prev, data]);
    }
  }, [lastMessage]);
  const handleSendMessage = () => {
    if (inputValue.trim() === '') return;
    const message = {
      type: 'retrospective_item',
      payload: {
        text: inputValue,
        author: 'User'
      },
      timestamp: new Date().toISOString()
    };
    console.log('[App] Sending message:', message);
    sendMessage(message);
    setInputValue('');
  };
  return (
    <div className="app-container">
      <header>
        <h1>Real-time Scrum Retrospective Board</h1>
        <StatusIndicator readyState={readyState} />
      </header>
      <main className="message-area">
        {messages.map((msg, idx) => (
          <div key={idx} className={`message ${msg.type}`}>
            <pre>{JSON.stringify(msg, null, 2)}</pre>
          </div>
        ))}
      </main>
      <footer>
        <input
          type="text"
          value={inputValue}
          onChange={(e) => setInputValue(e.target.value)}
          // onKeyDown replaces the deprecated onKeyPress event
          onKeyDown={(e) => e.key === 'Enter' && handleSendMessage()}
          placeholder="Type your feedback..."
          // Disable input when not connected, a good UX practice
          disabled={readyState !== ReadyState.OPEN}
        />
        <button onClick={handleSendMessage} disabled={readyState !== ReadyState.OPEN}>
          Send
        </button>
      </footer>
    </div>
  );
}
export default App;
The entire application is now far more robust. You can try running the frontend app and the backend server, then manually stop (CTRL+C) and restart the server. You’ll observe:
- The UI’s status indicator will transition from “Connected” (green) to “Disconnected” (red), and then to “Connecting…” (orange, with a pulsing animation).
- During the disconnect, the input field and send button will be disabled. If your logic allowed input, messages you send would be buffered by the useWebSocket hook.
- As soon as the server restarts and becomes available, the status indicator will turn “Connected” again, and any buffered messages will be sent automatically.
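The buffer-then-flush behaviour described here can be modelled without React or a real socket. The following is a minimal sketch, not the hook itself: createBufferedSender and the transport object (with an isOpen flag and a send method) are hypothetical names introduced only for illustration.

```javascript
// Hypothetical, framework-free model of the hook's buffering behaviour.
// `transport` is an assumed object exposing `isOpen` and `send(payload)`.
function createBufferedSender(transport) {
  const queue = [];
  return {
    send(message) {
      const payload = JSON.stringify(message);
      if (transport.isOpen) {
        transport.send(payload);
      } else {
        queue.push(payload); // Hold the payload until the connection is back
      }
    },
    flush() {
      // Send queued payloads in the order they were queued (FIFO)
      while (transport.isOpen && queue.length > 0) {
        transport.send(queue.shift());
      }
    },
    get pending() {
      return queue.length;
    },
  };
}

// Usage with a fake transport that records what was sent.
const sent = [];
const fakeTransport = { isOpen: false, send: (payload) => sent.push(payload) };
const sender = createBufferedSender(fakeTransport);
sender.send({ text: 'first' });
sender.send({ text: 'second' });
fakeTransport.isOpen = true; // Simulate the reconnect
sender.flush();
console.log(sent); // [ '{"text":"first"}', '{"text":"second"}' ]
```

Because the queue is drained front to back, messages reach the server in the order the user wrote them, which is the ordering guarantee the requirements asked for.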
Visualizing the Connection State Machine
To better understand the logical flow inside the useWebSocket hook, we can map out its state transition diagram using Mermaid.js.
stateDiagram-v2
    [*] --> CLOSED: Initial State
    CLOSED --> CONNECTING: connect() called or retry triggered
    CONNECTING --> OPEN: onopen event
    CONNECTING --> CLOSED: onclose/onerror event (connection failed)
    OPEN --> CLOSING: component unmounts or disconnect() called
    OPEN --> CLOSED: onclose/onerror event (connection dropped)
    CLOSING --> CLOSED: onclose event
    note right of OPEN
        Message queue is flushed.
        Messages are sent directly.
    end note
    note left of CLOSED
        If closure was unexpected,
        a timer for reconnection
        is set with exponential backoff.
        Outgoing messages are buffered.
    end note
This diagram clearly illustrates how the connection transitions between states and the key events that trigger these changes. Visual aids like this are invaluable during technical design reviews with the team.
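The same transitions can also be written down as data, which makes them easy to assert against in unit tests. This is a sketch under the assumption that the diagram is the complete list of legal transitions; TRANSITIONS and canTransition are hypothetical names and are not part of the hook.

```javascript
// Allowed transitions, transcribed from the state diagram.
const TRANSITIONS = {
  CLOSED: ['CONNECTING'],
  CONNECTING: ['OPEN', 'CLOSED'],
  OPEN: ['CLOSING', 'CLOSED'],
  CLOSING: ['CLOSED'],
};

// Hypothetical helper: true if the diagram permits moving from `from` to `to`.
function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

console.log(canTransition('CLOSED', 'CONNECTING')); // true
console.log(canTransition('CLOSED', 'OPEN')); // false
```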
Limitations and Future Optimization Paths
This solution effectively solved the core pain point our team faced during Scrum retrospectives, but it’s not perfect. From a senior engineer’s perspective, there are still edge cases and iterative improvements to consider:
- Lack of a Message Acknowledgment Mechanism: The current implementation is “fire and forget.” While we have client-side buffering, a message can still be lost if the server crashes after receiving it but before processing it. A complete solution would require the server to send an ACK/NACK for each message, allowing the client to safely remove it from the buffer queue. This would significantly increase complexity on both the client and server.
- Risk of a “Thundering Herd” Effect: If a server outage causes a large number of clients to disconnect simultaneously, they may all attempt to reconnect within the same time window when the server comes back online. Exponential backoff spaces out each client’s own attempts, but clients that dropped at the same moment still retry in lockstep, so each reconnection wave could overwhelm the server. Introducing a random jitter to the timeout (timeout = baseInterval * 2^n + random(0, 1000)) can mitigate this issue.
- Abstracting a Shared Connection: Currently, useWebSocket creates a new WebSocket connection for each component that calls it. In a more complex application, multiple components might need to share a single WebSocket connection. This would require lifting the connection management logic into a React Context or a dedicated state management library (like Zustand or Redux), turning useWebSocket into a consumer of that shared connection.
- Heartbeat Detection: Some network intermediaries (like NAT gateways or firewalls) may close TCP connections that have been idle for too long. Implementing a client-server heartbeat (ping/pong) mechanism can maintain the connection’s liveness and detect “zombie connections” faster than TCP keep-alives.
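The jitter idea from the list above could look like the sketch below. The function name backoffWithJitter, its parameters, and the injectable random source are assumptions made for illustration; the exponent follows the hook's attempt - 1 convention rather than the n in the formula.

```javascript
// Hypothetical helper: exponential backoff plus up to `maxJitter` ms of random delay.
// `random` is injectable so the function can be exercised deterministically.
function backoffWithJitter(attempt, baseInterval = 3000, maxJitter = 1000, random = Math.random) {
  const exponential = baseInterval * Math.pow(2, attempt - 1);
  const jitter = Math.floor(random() * maxJitter);
  return exponential + jitter;
}

console.log(backoffWithJitter(1, 3000, 1000, () => 0)); // 3000
console.log(backoffWithJitter(3, 3000, 1000, () => 0.5)); // 12500
```

With the default Math.random source, two clients that disconnect at the same instant will usually pick different delays, which spreads the reconnection wave over up to a second.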
Despite these potential areas for improvement, the current implementation is a massive step forward. It introduces a robust, predictable, and user-friendly real-time communication layer to our frontend application, ensuring our Scrum process proceeds smoothly. This journey from identifying a pain point to solving it through layered abstraction and meticulous state management is, in itself, a valuable engineering practice.