How would you design a highly performant web app that renders large data tables with real-time updates?
Virtualize rows (and columns if wide), paginate/window the data, push heavy sort/filter/aggregation to the server or a Web Worker, apply real-time updates as targeted patches with batching/throttling, and keep the UI responsive with memoization and stable references.
A large data table with real-time updates stresses both rendering (too many cells) and update throughput (constant changes). The design has to handle both without jank.
1. Rendering — never render the whole table
- Row virtualization — render only the visible window + overscan (`@tanstack/react-virtual`). DOM stays at ~30 rows regardless of dataset size.
- Column virtualization too if the table is very wide.
- Sticky headers/columns layered on top of the virtualized body.
- Pagination or windowed fetching — don't even load all rows; fetch pages/cursors on demand or load incrementally.
- Variable row heights → measurement cache + position index.
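The core windowing math behind row virtualization can be sketched in a few lines. This is a minimal illustration assuming fixed-height rows; the function and parameter names are made up, and real libraries like `@tanstack/react-virtual` layer measurement caches and overscan tuning on top of the same idea.

```typescript
interface Range {
  start: number; // index of first row to render
  end: number;   // index one past the last row to render
}

// Compute which rows fall in the viewport, plus `overscan` rows on each
// side so fast scrolling doesn't show blank gaps.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 5
): Range {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount, last + overscan),
  };
}
```

Each rendered row is then absolutely positioned at `index * rowHeight` inside a spacer element of height `rowCount * rowHeight`, so the scrollbar behaves as if every row existed while the DOM stays bounded.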
2. Data operations — keep heavy work off the main thread
Sorting/filtering/grouping/aggregating large datasets synchronously freezes the UI.
- Server-side sort/filter/paginate when the dataset is large or shared — the server returns just the page you need. Best default for "millions of rows."
- Web Worker for client-side ops on big in-memory datasets.
- If on the main thread, chunk and yield, and debounce filter inputs.
- Memoize derived data; use `useDeferredValue` so typing in a filter stays responsive.
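If the work must stay on the main thread, "chunk and yield" looks roughly like the sketch below: process a slice of rows per task and yield to the event loop between slices so the browser can paint. The function name and chunk size are illustrative; in a browser you might yield via `requestIdleCallback` rather than `setTimeout`.

```typescript
// Filter a large array a slice at a time, yielding between slices so a
// long filter doesn't block rendering.
async function filterInChunks<T>(
  rows: readonly T[],
  predicate: (row: T) => boolean,
  chunkSize = 10_000
): Promise<T[]> {
  const out: T[] = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    const end = Math.min(i + chunkSize, rows.length);
    for (let j = i; j < end; j++) {
      if (predicate(rows[j])) out.push(rows[j]);
    }
    // Give the browser a chance to paint between chunks.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return out;
}
```

For genuinely large datasets, the same loop belongs in a Web Worker (communicating via `postMessage`) or on the server, where yielding isn't needed at all.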
3. Real-time updates — the hard part
Updates arriving constantly can cause a re-render storm.
- Targeted patches — update only the changed rows/cells, keyed by id; never replace the whole dataset (which would re-render everything).
- Batch & throttle — coalesce bursts of updates into one render per frame/interval instead of one render per message.
- Normalize the data (`{ [id]: row }`) so a patch is an O(1) lookup, not an array scan.
- Off-screen updates are cheap — virtualization means only visible changed rows actually re-render; off-screen patches just update the store.
- Subtle UX — flash/highlight changed cells, but don't reorder rows out from under the user mid-scroll unless they asked to sort.
- WebSocket/SSE for the stream, with reconnection + resync (refetch current page on reconnect).
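The batching and normalization ideas above can be combined in a small sketch. This is illustrative, not a real library API: incoming patches are coalesced by id, and `flush()` is what you would schedule once per animation frame (here it is called manually so the behavior is easy to see).

```typescript
type Row = { id: string } & Record<string, unknown>;

// Normalized store + patch coalescing: a burst of messages becomes one
// flush, and later patches to the same row merge with earlier ones.
class PatchBatcher {
  private store: Record<string, Row> = {};
  private pending = new Map<string, Partial<Row>>();

  // Called per incoming WebSocket message: O(1), no render triggered.
  enqueue(id: string, patch: Partial<Row>): void {
    this.pending.set(id, { ...this.pending.get(id), ...patch });
  }

  // Called once per frame/interval: applies the whole burst and returns
  // the changed ids so only those rows' subscribers re-render.
  flush(): string[] {
    const changed = [...this.pending.keys()];
    for (const [id, patch] of this.pending) {
      this.store[id] = { ...this.store[id], ...patch, id };
    }
    this.pending.clear();
    return changed;
  }

  get(id: string): Row | undefined {
    return this.store[id];
  }
}
```

Note that a hundred price ticks for the same row between two frames collapse into a single store write and a single re-render of that row.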
4. React-level performance
- Memoize rows (`React.memo`) keyed by id so unchanged rows skip re-render.
- Stable references for callbacks/props passed into rows.
- Selector-based store (Zustand/Redux) so a cell update notifies only that row's subscribers.
- Avoid inline functions/objects in the row render path.
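A toy per-id subscription store shows why selector-based stores limit the re-render blast radius; the class and method names here are invented for illustration, where a real app would use Zustand/Redux selectors wired into React via `useSyncExternalStore`.

```typescript
// Per-id subscriptions: setting one row notifies only that row's
// listeners, so every other memoized <Row> is untouched.
class RowStore<T> {
  private rows = new Map<string, T>();
  private listeners = new Map<string, Set<(row: T) => void>>();

  subscribe(id: string, fn: (row: T) => void): () => void {
    if (!this.listeners.has(id)) this.listeners.set(id, new Set());
    this.listeners.get(id)!.add(fn);
    return () => this.listeners.get(id)!.delete(fn); // unsubscribe
  }

  set(id: string, row: T): void {
    this.rows.set(id, row);
    this.listeners.get(id)?.forEach((fn) => fn(row));
  }

  get(id: string): T | undefined {
    return this.rows.get(id);
  }
}
```

A `<Row id="x" />` component would subscribe only to its own id's slice, so a patch to row "y" never reaches it.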
5. Putting it together
Server: paginated + sorted + filtered query ─┐
WebSocket: row patches ───────────────────────┤→ normalized store {id: row}
│
useVirtualizer → renders visible window ───────┘
└ memoized <Row> subscribes to its own id's slice
        └ batched/throttled update flush

The framing
"Two pressures: rendering and update throughput. Rendering — virtualize rows/columns and page the data so DOM size is constant. Throughput — apply real-time changes as targeted, batched patches against a normalized store, so only visible changed rows re-render. Heavy sort/filter goes server-side or to a Web Worker. Plus row memoization and stable references. The principle is: bound the DOM, bound the work per frame, and patch surgically."
Follow-up questions
- How do you stop a stream of real-time updates from causing a re-render storm?
- When do you push sort/filter to the server vs a Web Worker vs the main thread?
- Why normalize the row data, and how does it help with patches?
- How do real-time row updates interact with virtualization?
Common mistakes
- Rendering all rows/cells and freezing the browser.
- Replacing the whole dataset on every update instead of patching by id.
- One re-render per incoming message instead of batching.
- Sorting/filtering huge datasets synchronously on the main thread.
- Not memoizing rows, so any update re-renders every visible row.
Performance considerations
- Virtualization caps DOM and reconciliation cost.
- Normalized store + targeted patches make updates O(1) and limit re-render blast radius.
- Batching/throttling bounds renders-per-second.
- Server-side or worker-side data ops keep the main thread free.
Edge cases
- An update for a row that's currently off-screen (cheap — just patch the store).
- A real-time update that changes sort order while the user is scrolling.
- Reconnection — resync the current page after missed updates.
- Variable row heights with live content changes.
Real-world examples
- Trading dashboards, observability tables, analytics grids with live-updating rows.
- `@tanstack/react-virtual` + a normalized Zustand store + batched WebSocket patches.