Streaming: is it worth it?

Streaming is transmitting data as it’s created. Most pertinently, on the web, you can stream HTML to a browser as your server generates it.

The opposite of streaming is buffering: The server generates a page’s full HTML and only then sends the HTML to the browser.

In theory, streaming HTML is straightforward. In practice, it creates ordering dependencies: some information, such as a page <title>, has to be sent early in the response, but the code that generates it may run late.

This post dives into these tradeoffs.

Benefits

For the fastest user experience, every millisecond counts. Every millisecond spent waiting on something is a millisecond wasted for the client.

In an ideal world, the moment a request comes in, the server should begin sending HTML. There are large portions of a page that are known before the database or backends return any data. For example, the page scaffolding likely contains HTML that remains constant between responses, like <script>s, styles, metadata, and the general site layout. If your server already knows this data, why not send it immediately?
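As a rough illustration of sending the known parts first, here is a minimal sketch. The names renderPage and loadProduct, the /site.css URL, and the markup itself are assumptions made up for this example, not part of any particular framework.

```typescript
// Minimal HTML escaping for text content (illustrative, not exhaustive).
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical renderer: yields HTML chunks in the order they can be sent.
async function* renderPage(
  loadProduct: () => Promise<{ name: string }>
): AsyncGenerator<string> {
  // Start the slow backend call right away, but do not wait for it yet.
  const productPromise = loadProduct();

  // The scaffolding is the same for every response, so flush it immediately.
  yield '<!doctype html><html><head><meta charset="utf-8">' +
    '<link rel="stylesheet" href="/site.css"></head><body><header>My Shop</header>';

  // Only the data-dependent part waits for the backend.
  const product = await productPromise;
  yield `<main><h1>${escapeHtml(product.name)}</h1></main>`;

  yield "</body></html>";
}
```

In a Node server, each yielded chunk would typically be handed to the response's write() method as soon as it is produced, so the browser can start fetching the stylesheet while the database is still working.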

Additionally, browsers can render partial HTML. This means the user can see and interact with the page as it streams in, rather than waiting for the full page to arrive in one big chunk at the end. If the full page can’t be instant, streaming can still improve the user experience with progress updates.

Streaming also reduces server overhead, because servers don’t have to buffer entire pages. By incrementally flushing page data to browsers, servers keep memory pressure low, which lets them process more requests and lowers hosting costs. Streams also play nicely with Node’s backpressure handling.

Streaming is the right thing to do for your users and your wallet.

The basic requirement of frameworks

If you build sites with a frontend framework, that framework needs to support streaming before you can stream HTML.

Few modern frameworks support HTML streaming, and even fewer do it well.

Modern frontend frameworks were built for client-side rendering, so they mostly deal with the DOM. But the DOM API isn’t conducive to streaming; it’s closer to random access. You can get SSR by simulating a DOM on the server[1] and running the framework in that, but such solutions will not produce streams or optimal performance.

The bare minimum needed for a framework is a rendering pipeline that directly creates an HTML stream, skipping an intermediate DOM/DOM-like step. This requires a linear execution of the application, roughly aligned with how the HTML output orders its tags. Such a rendering pipeline needs to be designed into the framework: it can’t easily be added afterwards.

More and more frameworks are gaining streaming capabilities, but basic streaming support is not enough; there are additional constraints and complications to consider.

Streaming constraints

Every response starts with a request. Here, we assume that each request needs a unique response from a backend data source — which I’m going to call “the database” for simplicity, but it could just as easily be API responses or anything else your frontend application server has to ask another computer for. (If no unique backend data is required, then the response would always be the same. It would be better to cache such a response publicly on a CDN.)

The basic problem: We want to start streaming immediately, even before the database responds. For a framework to support that, it needs to support some form of asynchronous rendering. The standardized way for JS to wait on data is Promises; so a framework must be able to pause rendering until a promise completes, then continue rendering.
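To make "pause rendering until a promise completes, then continue" concrete, here is a small sketch of a promise-aware template. The html tag function is hypothetical and written only for this example; it does not escape interpolated values, which a real renderer would need to do.

```typescript
// Hypothetical tagged template: streams the literal parts of the template in
// document order and pauses only where an interpolated value is still pending.
async function* html(
  strings: TemplateStringsArray,
  ...values: Array<string | Promise<string>>
): AsyncGenerator<string> {
  for (let i = 0; i < strings.length; i++) {
    // Literal markup is already known, so it can go out immediately.
    if (strings[i]) yield strings[i];
    // Awaiting a plain string continues at once; awaiting a Promise pauses here.
    if (i < values.length) yield await values[i];
  }
}
```

With a template such as html`<h1>Hi</h1><p>${greeting}</p>`, everything up to the opening <p> is yielded before greeting resolves, and the closing </p> follows once it does.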

Meta-frameworks and other data providers

While frameworks can handle the actual stream rendering, meta-frameworks are responsible for data coordination and fetching. How a meta-framework does this has implications for streaming. For example, some meta-frameworks fully resolve the database call before passing the result to the rendering framework for asynchronous rendering. Such solutions are not ideal as the server must wait before commencing streaming of the response.

Redux, the most common companion of React, also has severe difficulties providing stream-friendly data. Other data providers such as GraphQL and the React Context API can also give you streaming footguns.

For each of the streaming complications below, we’ll look at how meta-frameworks might solve it, and how that choice can have a huge impact on your site’s streaming performance.

Status code

HTTP requires a response to begin with its status code, before any headers or body. This is hard for two reasons:

Deciding between a 404 or a 200 may depend on the response from the database. That means we either waste time waiting for the database to resolve, or we guess the status code and hope nothing goes wrong.
A 200 status may depend on successfully executing the application; otherwise, the response may be 500. But the whole point of streaming is to send data as the application executes. We don’t want to wait for full execution to know that no error was thrown.

In practice, it’s reasonable to cheat by sending 200 and continuing, but such decisions must be up to the developer; only the developer knows when such tradeoffs are acceptable.
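One way to leave that decision to the developer is an explicit policy per route. The sketch below is only an assumption about how such a facility could look; StatusPolicy and resolveStatus are invented names, and a null result stands in for "record not found".

```typescript
// Hypothetical per-route policy: commit to 200 right away, or wait for the data.
type StatusPolicy = "optimistic" | "wait";

// Decides which status code to commit to before the first body byte is sent.
async function resolveStatus<T>(
  policy: StatusPolicy,
  data: Promise<T | null>
): Promise<number> {
  if (policy === "optimistic") {
    // The developer accepts the risk: send 200 now and start streaming.
    // Later failures of `data` must be handled by whoever renders it.
    return 200;
  }
  try {
    // The developer prefers an accurate status: block until the lookup settles.
    return (await data) === null ? 404 : 200;
  } catch {
    // The lookup itself failed, so report a server error.
    return 500;
  }
}
```

A product page where a wrong 200 would hurt search indexing might choose "wait", while a dashboard behind a login could reasonably choose "optimistic".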

This is the status information that’s helpful mid-stream:

Anything 1XX is already mid-response information, so it’s all good.
2XX is the success case so we probably don’t have to worry.
3XX responses can contain bodies, but nowadays almost never do. If you do need a mid-HTML redirect, there’s <meta http-equiv=refresh content="0;…"> or <script>location.href = '…'</script> or both.
4XX is the missing, important part of this story. Some aren’t a problem (400 Bad Request happens before response processing even begins), but important errors like 404 are.
5XX semantics can be conveyed with stream error mechanisms that vary with the HTTP version.
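The mid-HTML redirect mentioned in the 3XX item can be sketched as a small helper that emits both fallbacks. The function name midStreamRedirect is hypothetical, and the escaping shown is a minimal assumption about what is needed, not a complete sanitizer.

```typescript
// Builds markup that redirects the browser after streaming has already begun.
function midStreamRedirect(url: string): string {
  // Escape the URL for use inside a double-quoted HTML attribute.
  const attr = url
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  // JSON.stringify yields a valid JS string literal; escaping "<" keeps a
  // literal "</script>" in the URL from closing the script element early.
  const js = JSON.stringify(url).replace(/</g, "\\u003c");
  return (
    `<meta http-equiv="refresh" content="0;url=${attr}">` +
    `<script>location.href = ${js}</script>`
  );
}
```

Emitting both means the redirect still works when scripts are disabled, while the script version usually fires sooner when they are not.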

Your meta-framework must have facilities to let the developer make these choices — and many do not. Many meta-frameworks require that the database fully responds before rendering.

<head>

A page <head> contains a lot of metadata — most obviously, the <title>. The <title> is problematic for two reasons:

It needs to be streamed relatively early in the response.
It usually depends on a database response, as in <title>Details for $product</title>.

These requirements are problematic for solutions that let application components set the <title> (like Helmet), because streaming must pause until the components responsible for the title finish. Such an architecture fundamentally breaks streaming: any data dependency can block the stream, even if the underlying framework can stream without problems.

In practice, this means that a meta-framework must let you send the <head> before later components finish fetching and resolving data. Again, many do not.

Styling

Like the <title>, styling is another chunk of data that’s usually in the <head>. Some frameworks like styled-components and Next.js make recommendations for CSS approaches. However, many others choose not to have opinions on styling, which lets third-party solutions flourish as developers try different approaches.

For example: CSS-in-JS (such as emotion) is a very popular approach. But if that approach means all components need to fully render before the <style> tags can be output, then that breaks streaming, as the framework is forced to buffer the whole response.

It turns out that <style>/<link rel=stylesheet> tags do not need to be in the <head>; they can be anywhere in the HTML. To be more specific, a component’s styles only need to be available before the first rendered instance of that component. This means components can render as they stream in without having to be blocked by CSS for components later in the page.

However, placing styles inline like this creates other problems.

Let’s assume that a component creates a <style> tag, and we render two instances of the component. For size reasons, identical styles are deduped: The first component instance gets the <style> tag, and the second instance does not. But when the application runs on the client, the first component instance may be behind an if statement and possibly omitted. Omitting that instance will also remove the styles for the other, ruining its visual appearance. For this reason, the framework needs to move the styles to the first possible render point — it cannot make CSS “some other library’s problem.” If a framework doesn’t have styling as a core primitive, it will complicate streaming.
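The server-side dedupe described above might look like the following sketch. StyleRegistry and the data-component attribute are hypothetical names chosen for illustration; the point is that the second instance silently depends on the first one having emitted the styles.

```typescript
// Hypothetical per-response registry that emits each component's CSS once.
class StyleRegistry {
  private emitted = new Set<string>();

  // Returns the component HTML, prefixed with its <style> tag only the first
  // time this component is rendered during the current response.
  render(component: string, css: string, html: string): string {
    if (this.emitted.has(component)) return html;
    this.emitted.add(component);
    return `<style data-component="${component}">${css}</style>${html}`;
  }
}
```

Because only the first instance carries the <style> tag, removing that instance on the client also removes the styles for every later one, which is why the framework has to relocate the tag when the application boots.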

Out-of-order streaming

If you think about it, styling is a form of out-of-order streaming. While streaming, some information (such as styles) is missing, but we continue rendering without it. Once the styles become available, we insert them midstream. On application boot, the framework moves the styles to the correct location, fixing the ordering on the client.

What if we could use the same trick for other kinds of data?

Let’s take the <title> again as an example. We could just render a generic placeholder <title> and continue streaming. Once title data becomes available, we can stream a tiny <script> that updates the page title to the desired value.
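The tiny <script> for the title could be generated with a helper like this one. The name titleUpdateScript is an assumption for this sketch, and it presumes a generic <title> was already sent in the <head>.

```typescript
// Builds the small script that replaces the placeholder title mid-stream.
function titleUpdateScript(title: string): string {
  // JSON.stringify gives a safe JS string literal; escaping "<" prevents the
  // title text from terminating the script element early.
  const js = JSON.stringify(title).replace(/</g, "\\u003c");
  return `<script>document.title = ${js}</script>`;
}
```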

But it’s not just small items like titles. We could apply the same trick to large parts coming from the database. We could render <label>Loading… <progress></progress></label> and continue streaming the rest of the application. Once the data becomes available, a small <script> can be streamed to update the original document location with the resolved data. For example, Marko includes the <await> core tag to render markup asynchronously using a Promise.
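Without framework support, the same placeholder-then-backfill idea can be sketched by hand. Everything named here (renderOutOfOrder, the reviews data, the element ids) is invented for the example, and this is not how Marko’s <await> tag is implemented.

```typescript
// Minimal HTML escaping for text content (illustrative, not exhaustive).
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical out-of-order renderer for one slow section of a page.
async function* renderOutOfOrder(reviews: Promise<string[]>): AsyncGenerator<string> {
  // The placeholder goes out immediately, with an id so it can be found later.
  yield '<section id="reviews"><label>Loading… <progress></progress></label></section>';

  // Content after the placeholder is not blocked by the pending data.
  yield "<footer>Contact us</footer>";

  const items = await reviews;
  const list = items.map((r) => `<li>${escapeHtml(r)}</li>`).join("");

  // The resolved markup arrives in an inert <template>; the script then swaps
  // it in place of the placeholder's children.
  yield `<template id="reviews-data"><ul>${list}</ul></template>` +
    "<script>document.getElementById('reviews').replaceChildren(" +
    "document.getElementById('reviews-data').content)</script>";
}
```

The footer reaches the browser before the reviews do, even though it comes later in the document, and the final chunk puts the reviews where they belong.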

Out-of-order streaming can send as much of the response to the client as possible without halting for missing data. Instead, we render a placeholder and continue pushing “static” or already-resolved portions. Once data resolves, we can backfill the missing parts.

Of course, such backfilling is much easier with HTML frameworks such as Qwik.

Conclusion

Streaming is hard because it requires that all of the parts in the system support it. It’s not just framework support, but also meta-framework support, and the developer must have the right tools to do it. If any one part is not ready, the stream clogs. (The most common blocking source is that the data is not yet available.) Ideally, we could do out-of-order streaming with the help of JavaScript, but it helps if frameworks are designed with out-of-order streaming in mind.

1. react-dom/server has a big hierarchical tree of element objects that emit and receive events from each other. That’s a simulated Document Object Model, even if it’s not the official spec.