Streaming JSON in just 200 lines of JavaScript

I was continuing my exploration of React Server Components when I stumbled upon this article about progressive JSON. Dan Abramov describes a technique for streaming JSON from a server to a client in chunks, allowing the client to start rendering parts of the data before the entire payload has been received. This can significantly improve perceived performance, especially for large datasets. So I started wondering how much effort it would take to implement something like that. It turned out to be a fun exercise, and I ended up with a small (~200 LOC) library called Streamson. This post is about how I built it.

The idea

The idea behind progressive JSON streaming is to send parts of the JSON data as soon as they are available, rather than waiting for the entire JSON structure to be ready. This is particularly useful for large datasets or when the data is being generated on the fly. For the parts that are not ready yet, we send placeholders that the client later replaces with the actual data when it arrives. For example:

{
  "user": {
    "id": 1,
    "name": "John Doe",
    "posts": [
      { "id": 101, "title": "First Post", "content": "..." },
      { "id": 102, "title": "Second Post", "content": "..." }
    ]
  }
}

Let's say that we have the information about the user right away, but the posts are being fetched from a database and will take some time. Instead of waiting for the posts to be ready, we can send a placeholder for the posts field:

{
  "user": {
    "id": 1,
    "name": "John Doe",
    "posts": "_$1"
  }
}

Then when the posts are ready, we can send them as a separate chunk:

{
  "_$1": [
    { "id": 101, "title": "First Post", "content": "..." },
    { "id": 102, "title": "Second Post", "content": "..." }
  ]
}

Something on the client side needs to be able to handle these placeholders and replace them with the actual data when it arrives.

Implementing the server

Let's start with a simple function that will accept the server response object (our gateway to the client) and a data object that we want to send:

function serve(res, data) {
  res.setHeader("Content-Type", "application/x-ndjson; charset=utf-8");
  res.setHeader("Transfer-Encoding", "chunked");

  // sending chunks to the client
  res.write(JSON.stringify(...) + "\n");
  res.write(JSON.stringify(...) + "\n");

  // when done
  res.end();
}

There are a couple of interesting things to note here:

  1. We are using the application/x-ndjson content type. NDJSON (Newline Delimited JSON) is a convenient format for streaming JSON data, where each line is a valid JSON object. This allows us to send multiple JSON objects in a single response, separated by newlines.

  2. We are using the Transfer-Encoding: chunked header. This tells the client that the response will be sent in chunks and that it should not expect a Content-Length header. It also keeps the connection alive until we call res.end(). The example below shows what such a stream looks like on the wire.
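
Here's what the stream for the user/posts example could look like, one JSON object per line. The { i, c } chunk shape and the _$0 root id are assumptions that match the send function shown later in this post:

{"i":"_$0","c":{"user":{"id":1,"name":"John Doe","posts":"_$1"}}}
{"i":"_$1","c":[{"id":101,"title":"First Post","content":"..."},{"id":102,"title":"Second Post","content":"..."}]}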

Next, we need to chunk our data. We can do this by traversing the data object and replacing parts of it with placeholders. When we encounter a part that needs to be sent later (a promise), we can store it in a queue and send it as a separate chunk when it's ready. Here's the function that I used:

function normalize(value) {
  function walk(node) {
    if (isPromise(node)) {
      const id = getId();
      registerPromise(node, id);
      return id;
    }
    if (Array.isArray(node)) {
      return node.map((item) => walk(item));
    }
    if (node && typeof node === "object") {
      const out = {};
      for (const [key, val] of Object.entries(node)) {
        out[key] = walk(val);
      }
      return out;
    }
    return node;
  }
  return walk(value);
}

This function recursively walks through the data object. When it encounters a promise, it generates a unique placeholder ID, registers the promise for later resolution, and returns the placeholder. For arrays and objects, it recursively processes their elements or properties. Primitive values are returned as-is.
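
The isPromise and getId helpers are not shown above, but they are small. Here's a minimal sketch, assuming placeholder ids follow the _$N format from the earlier example:

let nextId = 0;

// generates placeholder ids like "_$0", "_$1", ...
function getId() {
  return `_$${nextId++}`;
}

// duck typing: anything with a .then function is treated as a promise
function isPromise(value) {
  return !!value && typeof value.then === "function";
}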

The registerPromise function stores the promise along with its placeholder ID in a queue. When the promise resolves, we can send the resolved data as a separate chunk to the client.

let promises = [];

function registerPromise(promise, id) {
  promises.push({ promise, id });
  promise.then((value) => {
    send(id, value);
  }).catch((err) => {
    console.error("Error resolving promise for path", err);
    send(id, { error: "promise error", timeoutMs: TIMEOUT });
  });
}

The send function is responsible for sending the resolved data to the client:

function send(id, value) {
  res.write(JSON.stringify({ i: id, c: normalize(value) }) + "\n");
  promises = promises.filter((p) => p.id !== id);
  if (promises.length === 0) res.end();
}

The send function writes a new chunk to the response, containing the placeholder ID and the normalized value. Normalizing the value again means that any nested promises inside it get their own placeholders and are registered for later chunks. It then removes the resolved promise from the queue, and if there are no more pending promises, it ends the response, effectively closing the connection to the client.
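
Putting the pieces together, the serve skeleton from the beginning can now write the normalized root object as the first chunk. A minimal sketch, assuming send and the promises queue close over res (in a real implementation they would live inside serve) and that the root chunk gets a placeholder id of its own:

function serve(res, data) {
  res.setHeader("Content-Type", "application/x-ndjson; charset=utf-8");
  res.setHeader("Transfer-Encoding", "chunked");

  // the root object goes out immediately; pending promises become placeholders
  res.write(JSON.stringify({ i: "_$0", c: normalize(data) }) + "\n");

  // if nothing was deferred, we are already done
  if (promises.length === 0) res.end();
}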

The full implementation can be found here.

To close this part off, here's an example of an object that we can send from the server:

const data = {
  user: {
    id: 1,
    name: "John Doe",
    posts: fetchPostsFromDatabase(), // returns a promise
  },
};

async function fetchPostsFromDatabase() {
  const posts = await database.query("SELECT * FROM posts WHERE userId = 1");
  return posts.map((post) => ({
    id: post.id,
    title: post.title,
    content: post.content,
    comments: fetchCommentsForPost(post.id), // returns a promise
  }));
}

Notice how each post has a comments field that is also a promise. This means the comments will be sent as separate chunks after the posts chunk, as sketched below.
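
On the wire, that means the posts chunk itself contains fresh placeholders, which later chunks fill in. The ids and the comment shape here are illustrative:

{"i":"_$1","c":[{"id":101,"title":"First Post","content":"...","comments":"_$2"},{"id":102,"title":"Second Post","content":"...","comments":"_$3"}]}
{"i":"_$2","c":[{"id":9001,"text":"Great post!"}]}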

Implementing the client

On the client side, we need to handle the incoming chunks and replace the placeholders with the actual data. We can use the Fetch API to make a request to the server and read the response as a stream. Every time we encounter a placeholder, we replace it with a promise that will be resolved when the actual data arrives. The key logic looks like this:

// placeholder id -> resolve function, shared with the walk function below
const promises = new Map();

async function connect(endpoint) {
  try {
    const res = await fetch(endpoint);
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    async function process() {
      let done = false;
      while (!done) {
        const { value, done: readerDone } = await reader.read();
        done = readerDone;
        if (value) {
          // a single read() may deliver several NDJSON lines, or a partial one,
          // so we buffer the bytes and only parse complete lines
          buffer += decoder.decode(value, { stream: true });
          const lines = buffer.split("\n");
          buffer = lines.pop(); // keep the incomplete tail for the next read
          for (const line of lines) {
            if (!line) continue;
            try {
              const chunk = JSON.parse(line);
              chunk.c = walk(chunk.c);
              if (promises.has(chunk.i)) {
                promises.get(chunk.i)(chunk.c);
                promises.delete(chunk.i);
              }
            } catch (e) {
              console.error(`Error parsing chunk.`, e);
            }
          }
        }
      }
    }
    process();
  } catch (e) {
    console.error(e);
    throw new Error(`Failed to fetch data from Streamson endpoint ${endpoint}`);
  }
}

The process function reads the response stream and buffers incoming bytes until it has complete newline-delimited lines. Each line is parsed as JSON, and the walk function is called to replace any placeholders with promises. If a chunk carries data for a previously registered placeholder ID, the corresponding promise is resolved with the received data. The key moment is reader.read(), which lets us wait for new data.

Here's the walk function that replaces placeholders with promises:

function walk(node) {
  if (isPromisePlaceholder(node)) {
    return new Promise((done) => {
      promises.set(node, done);
    });
  }
  if (Array.isArray(node)) {
    return node.map((item) => walk(item));
  }
  if (node && typeof node === "object") {
    const out = {};
    for (const [key, val] of Object.entries(node)) {
      out[key] = walk(val);
    }
    return out;
  }
  return node;
}

function isPromisePlaceholder(val) {
  return typeof val === "string" && /^_\$\d+$/.test(val);
}

It is similar to the server-side normalize function. When it encounters a placeholder, it returns a new promise that will be resolved later, when the actual data arrives. For arrays and objects, it recursively processes their elements or properties. Primitive values are returned as-is. Of course, the IDs must match those generated on the server side.

Again, the full implementation can be found here. Both files combined are 155 lines of code 😎.

Of course there is an npm package. Meet Streamson! 👨

Streaming JSON in chunks with placeholders is an interesting technique for improving the perceived performance of web applications, especially when dealing with large datasets or on-the-fly generated data. By sending parts of the data as soon as they are available, we can allow the client to start rendering content earlier, leading to a better user experience.

All you need is control over both the server and the client, and about 200 lines of JavaScript code.

I decided to package the code as an npm library called Streamson. It can be installed via npm install streamson. On the server you can use it like this:

import { serve } from "streamson";
import express from "express";

const app = express();
const port = 5009;

app.get("/data", async (req, res) => {
  const myData = {
    title: "My Blog",
    description: "A simple blog example using Streamson",
    posts: getBlogPosts(), // this returns a Promise
  }
  serve(res, myData);
});

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`);
});

The client is basically 1KB of JavaScript that you can download from https://unpkg.com/streamson@latest/dist/streamson.min.js. Once you include it in your page, you'll get a global Streamson function that you can use like this:

const request = Streamson("/data");

const data = await request.get();
console.log(data.title); // "My Blog"

const posts = await request.get('posts');
console.log(posts); // Array of blog posts
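
Since nested placeholders become promises too (that's what the walk function produces), each post's comments field should be awaitable as well. This assumes the posts carry a comments promise, like in the server example earlier:

const comments = await posts[0].comments;
console.log(comments); // Array of comments, resolved from a later chunk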

Happy streaming!
