WTF Is Code Extraction

March 3, 2023

Written By Miško Hevery

We are full-stack developers! That means we write both client and server code. But where should we place the server and client code? Conventional wisdom says that we should put them in different files.

Except it is not so simple: we also have code that runs on both the server and the client. After all, we do server-side rendering (SSR), so most of our client code also runs on the server.

I want to challenge the conventional wisdom and convince you that there is an existing trend of putting server and client code together and that it is better. Let’s call it: “code collocation.”

Next.js/Remix/SolidJS: it is already happening

The idea of placing server and client code together is not new, and it is already happening.

[Image: Next.js code example showing how well-known exports can be tree-shaken out of client bundles.]

Look at the above Next.js code. Notice the getStaticProps function. getStaticProps executes on the server only, whereas the default exported component executes on the client (and on the server as part of SSR/SSG).

Because most of the code executes in both locations, I don’t think it makes much sense to separate it into different files; what Next.js is doing here is a better developer experience.

Next.js is only one of many frameworks that do this. Most meta-frameworks have some mechanism for it based on export extraction.

But we have a problem to solve. We need to provide code to the server and code to the client, and as of right now, server code can’t access the DOM API and client code can’t read server dependencies such as databases. So there needs to be a way to separate the code.

The act of separating the code and creating server and client code bundles is called code extraction. Three strategies, from the most basic to the most advanced, are:

  • Export extraction
  • Function extraction
  • Closure extraction

Let’s dive into them.

Export extraction is a way to remove server code from the client bundle by relying on the bundler’s tree-shaking behavior.

export const wellKnownName = () => {
  // server-only code.
  // There is NO reference to `wellKnownName` anywhere
  // in the client code.
}

export const ComponentA = () => {
  // client (and sometimes server) code.
  // This code is imported from other locations in the codebase.
}

A tree-shaker starts at a root (your application's main() function) and recursively traverses the references from there. Anything reachable is placed in the bundle. Anything that is not reachable is thrown away.

The ComponentA is reachable from the main() method and is therefore retained in the client bundle. (If it were not, why would you have it in your code base?)

The wellKnownName, on the other hand, is not reachable from main() and is therefore removed from the bundle. The reason I named it wellKnownName in this example is that the exported name is not arbitrary. It is a name that the framework expects and can call using reflection, which is why we refer to it as a well-known export.

So if we bundle our codebase with tree-shaking ON, we end up with a client bundle. If we bundle it with tree-shaking OFF, we end up with all of the code, which is what the server uses.
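
The reachability rule can be sketched in a few lines. This is a toy model under stated assumptions, not how any real bundler is implemented: the SymbolGraph type, the shake function, and the symbol names are made up for illustration, and real tree-shakers also have to reason about side effects.

```typescript
// Toy model of tree-shaking: each symbol lists the symbols it references.
// All names here are hypothetical; real bundlers work on ASTs, not on a map.
type SymbolGraph = Record<string, string[]>;

function shake(graph: SymbolGraph, root: string): Set<string> {
  const reachable = new Set<string>();
  const pending = [root];
  while (pending.length > 0) {
    const name = pending.pop()!;
    if (reachable.has(name)) continue;
    reachable.add(name);
    // Follow every reference of the current symbol.
    for (const ref of graph[name] ?? []) pending.push(ref);
  }
  return reachable;
}

const graph: SymbolGraph = {
  main: ["ComponentA"],
  ComponentA: ["formatDate"],
  formatDate: [],
  wellKnownName: ["queryDatabase"], // nothing on the client references this
  queryDatabase: [],
};

// Tree-shaking ON: only what main() can reach ends up in the client bundle.
const clientBundle = shake(graph, "main");
// Tree-shaking OFF: everything is kept, which is what the server loads.
const serverBundle = new Set(Object.keys(graph));
```

Here wellKnownName and queryDatabase never appear in clientBundle because nothing reachable from main refers to them, while serverBundle keeps all five symbols.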

The common pattern for export extraction is that the frameworks use it to load data for the route. The server function produces data that the client component consumes. So a closer example looks like this:

export const wellKnownLoader = () => {
  return {data: "some-data"};
}

export const MyComponent = ({data}) => {
  return <span>{data}</span>;
}

To put it differently, what is happening is this:

<MyComponent {...wellKnownLoader()} />

The framework calls the wellKnownLoader and passes the return value to the component. The critical thing to understand is that you are not allowed to write that code! If you did, it would force the bundler to include the wellKnownLoader in the client-side bundle, and this would be bad because wellKnownLoader probably imports server-only code, such as a call to a database.
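
A minimal sketch of what the framework might do on the server, assuming a hypothetical renderRoute helper and a component that returns an HTML string instead of JSX so the example stays dependency-free. Neither the RouteModule shape nor renderRoute comes from any real framework.

```typescript
// Hypothetical shape of a route module: one well-known loader export
// plus a component. The component returns a string here to avoid JSX.
interface RouteModule<T> {
  wellKnownLoader?: () => T;
  MyComponent: (props: T) => string;
}

const routeModule: RouteModule<{ data: string }> = {
  wellKnownLoader: () => ({ data: "some-data" }),
  MyComponent: ({ data }) => `<span>${data}</span>`,
};

// Server-side only: the framework looks the loader up by its well-known
// name, so application code never references it directly.
function renderRoute<T>(mod: RouteModule<T>): string {
  const loader = mod["wellKnownLoader"];
  if (!loader) throw new Error("route module has no wellKnownLoader export");
  return mod.MyComponent(loader());
}

const html = renderRoute(routeModule);
```

Because only renderRoute (which lives in the server-side framework code) touches wellKnownLoader, the client bundle has no reference that would keep the loader alive.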

But we need a way to assert that the correct type-information is flowing between the wellKnownLoader and the MyComponent, so we typically write something like this:

export const wellKnownLoader = () => {
  return someData;
}

export const MyComponent = ({data}: ReturnType<typeof wellKnownLoader>) => {
  return <span>{data}</span>;
}

The critical bit is ReturnType. This allows us to refer to the type information of the wellKnownLoader without referring to the wellKnownLoader. WAT? You see, TypeScript runs first, and TypeScript erases all the type references. So even though there is a type reference to wellKnownLoader there is no value reference. This bit is crucial because it allows us to refer to server types without causing the bundler to include the server code.
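
To make the erasure concrete, here is a small sketch. The loader body and the string-returning component are simplifications made up for this example; the comment about compiled output describes what TypeScript's type erasure produces in general, not the output of a specific build.

```typescript
// Server-only function (imagine it talks to a database).
const wellKnownLoader = () => ({ data: "some-data", count: 3 });

// Type-only reference: `typeof` in a type position is erased by the
// compiler, so no runtime reference to wellKnownLoader is created here.
type LoaderData = ReturnType<typeof wellKnownLoader>;

// The component returns a string instead of JSX to stay dependency-free.
const MyComponent = ({ data, count }: LoaderData): string =>
  `<span>${data} x${count}</span>`;

// After compilation, MyComponent is roughly:
//   const MyComponent = ({ data, count }) => `<span>${data} x${count}</span>`;
// which mentions wellKnownLoader nowhere, so a tree-shaker can drop it.
const rendered = MyComponent({ data: "a", count: 2 });
```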

In summary, we rely on well-known exports to keep code available on the server while dropping it from the client bundle.

Export extraction is nice; what could be better than that? Well, there are two limitations of export-extraction:

  1. It has to be a well-known name.
  2. We have to remember to flow the type manually.

Let’s dive deep into the implications of both.

The fact that we have a well-known name is a problem because it means we can only have one server function per file and because only the framework can call that function. Wouldn’t it be nice to have multiple server functions per file and NOT be limited to having only the framework call the function and hand us the data? For example, it would be nice to be able to call server code from a user interaction (think RPC).

The second problem is that we have to manually flow the type, and we could, in theory, pass the wrong type there. Nothing prevents us from doing so, as in this example.

export const wellKnownLoader = () => {
  return someData;
}

export const MyComponent = ({data}: WRONG_TYPE_HERE) => {
  return <span>{data}</span>;
}

So what we really want is this:

export const myDataLoader = () => {
  // SERVER CODE
  return dataFromDatabase();
}

export const MyComponent = () => {
  // No need to flow type manually. Can't get this wrong.
  const data = myDataLoader();

  return (
    <button onClick={() => {
        ((count) => {
          // SERVER CODE
          updateDatabase(count)
        })(1); 
      }}>
      {data}
    </button>
  );
}

Except that this breaks tree-shaking: all of the server code is now reachable from the client, so it is included in the client bundle, and the client will try to execute server code, which will blow up.

So how could we “mark” some code as “server”?

So there are two parts to this problem:

  1. Marking the “server” code
  2. Transforming the code into something which can separate the server code from the client code.

If we could turn this problem into the previous export-extraction problem, we would know how to separate the server and client code. Enter a marker function!

A marker function is a function that allows us to label a piece of code for transformation.

Let’s rewrite the above code with SERVER() as a marker function:

export const myDataLoader = SERVER(() => {
  // SERVER CODE
  return dataFromDatabase();
});

export const MyComponent = () => {
  const data = myDataLoader();

  return (
    <button onClick={() => {
        SERVER((count) => {
          // SERVER CODE
          updateDatabase(count)
        })(1); 
      }}>
      {data}
    </button>
  );
}

Notice that we wrapped the server code in a SERVER() function. We can now write an AST transform that looks for the SERVER() function and translates it to something like this:

/*#__PURE__*/ SERVER_REGISTER('ID123', () => {
  return dataFromDatabase();
});

/*#__PURE__*/ SERVER_REGISTER('ID456', (count) => {
  updateDatabase(count)
});

export const myDataLoader = SERVER_PROXY('ID123');

export const MyComponent = () => {
  const data = myDataLoader();

  return (
    <button onClick={() => {
        SERVER_PROXY('ID456')(1); 
      }}>
      {data}
    </button>
  );
}

Our AST transform did a few things:

  1. It moved the code from the SERVER() call into a new top-level location.
  2. It assigned a unique id to each SERVER() function.
  3. It wrapped the moved code in SERVER_REGISTER().
  4. It gave each SERVER_REGISTER() call a /*#__PURE__*/ annotation.
  5. It replaced the SERVER() marker with SERVER_PROXY() and the unique id.

So let’s unpack this.

First, the /*#__PURE__*/ annotation is critical because it tells the bundler that the call has no side effects and can be dropped when its result is unused. This is how we get to remove the server code from the client bundle.

Second, the AST transform has moved the code from an inlined position to a top-level position, where it becomes subject to tree shaking.

Third, we have registered the moved function with the framework using the SERVER_REGISTER() function.

Fourth, we allow the framework to provide a SERVER_PROXY() function, which allows it to “bridge” the client and server code through some form of RPC, fetch, etc.
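
To see how the register and proxy halves could fit together, here is a rough in-process sketch. It is an assumption-heavy illustration: the registry map, the handleCall function, and the JSON round trip are stand-ins for whatever RPC mechanism a real framework would use, and the registered function bodies are placeholders for real database calls.

```typescript
// Hypothetical runtime for the transform above. Names mirror the article's
// SERVER_REGISTER / SERVER_PROXY; the transport is an in-process stand-in
// for whatever RPC mechanism a real framework would use.
type ServerFn = (...args: any[]) => unknown;

const registry = new Map<string, ServerFn>();

function SERVER_REGISTER(id: string, fn: ServerFn): void {
  registry.set(id, fn);
}

// "Server side" of the bridge: look the function up by id and call it.
function handleCall(id: string, argsJson: string): string {
  const fn = registry.get(id);
  if (!fn) throw new Error(`unknown server function: ${id}`);
  const args = JSON.parse(argsJson) as unknown[];
  // JSON cannot encode undefined, so map a missing result to null.
  return JSON.stringify(fn(...args) ?? null);
}

// "Client side" of the bridge: only the id and serialized args cross over.
function SERVER_PROXY(id: string): (...args: unknown[]) => unknown {
  return (...args) => JSON.parse(handleCall(id, JSON.stringify(args)));
}

// What the transformed module would contain (bodies are placeholders):
const updates: number[] = [];
SERVER_REGISTER("ID123", () => "row-from-database");
SERVER_REGISTER("ID456", (count: number) => {
  updates.push(count);
});

const myDataLoader = SERVER_PROXY("ID123");
const data = myDataLoader();
SERVER_PROXY("ID456")(1);
```

In a client bundle the SERVER_REGISTER calls would be gone, so the proxy could not call handleCall directly; it would have to send the id and arguments over the network instead.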

Voilà! We can now sprinkle server code in the client and have the correct types flow through our system. Victory!

Well, we could do better. As of right now, we have hard-coded the AST transform to only recognize SERVER(). What if we could have a whole vocabulary of these marker functions? worker(), server(), log() and so on? Better yet, what if the developer could create their own? So we need a way for any function to trigger the transformation.

Enter the $ suffix. What if any function suffixed with $ (as in ___$()) could trigger the above AST transform? Here is an example with webWorker$().

import {webWorker$} from 'my-cool-framework';

export default function () {
  return (
    <button onClick={async () => {
        console.log(
          'browser',
          await webWorker$(() => {
            console.log('web-worker');
            return 42;
          })
        );
      }}>
      click
    </button>
  );
}

Would become:

import {webWorkerProxy, webWorkerRegister} from 'my-cool-framework';

/*#__PURE__*/ webWorkerRegister('id123', () => {
  console.log('web-worker');
  return 42;
});

export default function () {
  return (
    <button onClick={async () => {
        console.log('browser', await webWorkerProxy('id123'));
      }}>
      click
    </button>
  );
}

Now the AST transform allows the developer to create any marker function and have the developer assign their own meaning to it. All you have to do is export ___$, ___Register, and ___Proxy function, and you can create your own cool code! A marker function that runs code on the server, on a web worker, or…imagine the possibilities.
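
One way a transform could recognize markers and derive the companion names is sketched here. The markerNames helper and its return shape are hypothetical; the only thing taken from the text is the ___$ / ___Register / ___Proxy naming convention.

```typescript
// Hypothetical helper a transform could use to decide whether a call
// expression's callee is a marker function, and which runtime exports
// the generated code should import in its place.
interface MarkerNames {
  marker: string;
  register: string;
  proxy: string;
}

function markerNames(callee: string): MarkerNames | null {
  // A marker is any identifier with at least one character before a
  // trailing `$`, e.g. `server$` or `webWorker$`.
  const match = /^([A-Za-z_][A-Za-z0-9_]*)\$$/.exec(callee);
  if (!match) return null;
  const base = match[1];
  return { marker: callee, register: `${base}Register`, proxy: `${base}Proxy` };
}

const worker = markerNames("webWorker$");
const plain = markerNames("console");
```

With this convention the transform does not need a hard-coded list of markers: any imported function whose name ends in $ qualifies, and ordinary identifiers such as console are left alone.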

Well, there is one possibility that does not work. What if we want to have a function that lazy loads code?

import {lazy$} from 'my-cool-framework';
import {invokeLazyCode} from 'someplace';

export default function () {
  return (
    <button onClick={async () => lazy$(() => invokeLazyCode())()}>
      click
    </button>
  );
}

The problem with lazy$() is that there is no way for it to get a hold of the code because the tree-shaker threw it away! So what we need is a slightly different strategy. Let’s restructure the code so that we can implement lazy$(). This will require moving our code to a different file for lazy loading instead of marking it as /*#__PURE__*/ for tree-shaking.

FILE: hash123.js

import {invokeLazyCode} from 'someplace';

export const id456 = () => invokeLazyCode();

FILE: original file

import {lazyProxy} from 'my-cool-framework';

export default function () {
  return (
    <button onClick={async () => lazyProxy('./hash123.js', 'id456')()}>
      click
    </button>
  );
}

With this setup, it is possible for the lazyProxy() function to lazy load the code because the tree shaker did not throw it away; instead, it just put it in a different file. Now it is up to the function to decide what it wants to do with it.

The second benefit of this approach is that we no longer need to rely on /*#__PURE__*/ to throw away our code. We move the code to a different location and let the runtime decide whether it should be loaded.

Lastly, we no longer need the ___Register() function, as the runtime can decide and load the function on the server if needed.

OK, the above is pretty sweet! It lets you create some superb DX through the marker functions. So what could be better?

Well, this code will not work!

import {lazy$} from 'my-cool-framework';
import {invokeLazyCode} from 'someplace';

export default function () {
  const [state] = useStore();
  return (
    <button onClick={async () => lazy$(() => invokeLazyCode(state))()}>
      click
    </button>
  );
}

The problem is that the lazy$(() => invokeLazyCode(state)) closes over state. So when it gets extracted into a new file, it creates an unresolved reference.

import {invokeLazyCode} from 'someplace';

export const id234 = () => invokeLazyCode(state); // ERROR: `state` undefined

But fear NOT! There is a solution to this as well. Let’s generate code like this instead.

FILE: hash123.js

import {invokeLazyCode} from 'someplace';
import {lazyLexicalScope} from 'my-cool-framework';

export const id456 = () => {
  const [state] = lazyLexicalScope(); // <==== IMPORTANT BIT
  invokeLazyCode(state);
}

FILE: original file

import {lazyProxy} from 'my-cool-framework';

export default function () {
  const [state] = useStore();
  return (
    <button onClick={async () => lazyProxy('./hash123.js', 'id456', [state])()}>
      click
    </button>
  );
}

The two things to notice are:

  1. When the compiler extracts the closure, it notices which variables the closure captures. It then gives the framework a chance to restore those variables by inserting lazyLexicalScope().
  2. When the compiler generates the lazyProxy() call, it passes the captured variables in the same order, like so: [state].

The above two changes allow the underlying framework to marshal the closed-over variables to the new location. In other words, we can now lazy load closures! 🤯 (If your mind is not 🤯, then you have not been paying attention!)

Let’s say we want to implement lazy$(). What would we have to do? Surprisingly little.

export function lazy$<ARGS extends Array<unknown>, RET>(
  fn: (...args: ARGS) => RET
): (...args: ARGS) => Promise<RET> {
  return async (...args) => fn.apply(null, args);
}

let _lexicalScope: Array<unknown> = [];
export function lazyLexicalScope<SCOPE extends Array<unknown>>(): SCOPE {
  return _lexicalScope as SCOPE;
}

export function lazyProxy<
  ARGS extends Array<unknown>,
  RET,
  SCOPE extends Array<unknown>
>(
  file: string,
  symbolName: string,
  lexicalScope: SCOPE
): (...args: ARGS) => Promise<RET> {
  return async (...args) => {
    const module = await import(file);
    const ref = module[symbolName];
    const previousLexicalScope = _lexicalScope;
    try {
      _lexicalScope = lexicalScope;
      return ref.apply(null, args);
    } finally {
      _lexicalScope = previousLexicalScope;
    }
  };
}

OK, how about server$() that can invoke code on the server?

export function server$<ARGS extends Array<unknown>, RET>(
  fn: (...args: ARGS) => RET
): (...args: ARGS) => Promise<RET> {
  return async (...args) => fn.apply(null, args);
}

let _lexicalScope: Array<unknown> = [];
export function serverLexicalScope<SCOPE extends Array<unknown>>(): SCOPE {
  return _lexicalScope as SCOPE;
}

export function serverProxy<
  ARGS extends Array<unknown>,
  RET,
  SCOPE extends Array<unknown>
>(
  file: string,
  symbolName: string,
  lexicalScope: SCOPE
): (...args: ARGS) => Promise<RET> {
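  // BUILD TIME SWITCH: pick the implementation, then build the proxy with it.
  const impl = import.meta.SERVER ? serverImpl : clientImpl;
  return impl<ARGS, RET, SCOPE>(file, symbolName, lexicalScope);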
}

function serverImpl<
  ARGS extends Array<unknown>,
  RET,
  SCOPE extends Array<unknown>
>(
  file: string,
  symbolName: string,
  lexicalScope: SCOPE
): (...args: ARGS) => Promise<RET> {
  return async (...args) => {
    const module = await import(file);
    const ref = module[symbolName];
    const previousLexicalScope = _lexicalScope;
    try {
      _lexicalScope = lexicalScope;
      return ref.apply(null, args);
    } finally {
      _lexicalScope = previousLexicalScope;
    }
  };
}

function clientImpl<
  ARGS extends Array<unknown>,
  RET,
  SCOPE extends Array<unknown>
>(
  file: string,
  symbolName: string,
  lexicalScope: SCOPE
): (...args: ARGS) => Promise<RET> {
  return async (...args) => {
    const res = await fetch("/api/" + file + "/" + symbolName, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        args,
        lexicalScope,
      }),
    });
    return res.json();
  };
}

Pretty powerful stuff!
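
The client half above posts the arguments and the captured scope to /api/<file>/<symbol>. A rough sketch of the matching server half might look like the following. Everything here is an assumption made for illustration: handleRpc, the modules table (a stand-in for import(file) so the sketch runs without real files), and the path parsing. A real implementation would also need authentication and validation of what the client sends.

```typescript
// Hypothetical server half of the RPC bridge.
type Extracted = (...args: any[]) => unknown;

let _lexicalScope: unknown[] = [];
function serverLexicalScope<SCOPE extends unknown[]>(): SCOPE {
  return _lexicalScope as SCOPE;
}

// What the compiler might emit for a closure that captured `timestamp`.
const hash123 = new Map<string, Extracted>();
hash123.set("id456", (label: string) => {
  const [timestamp] = serverLexicalScope<[number]>();
  return `${label}:${timestamp}`;
});

// Stand-in for the module system: file name -> exported symbols.
const modules = new Map<string, Map<string, Extracted>>();
modules.set("./hash123.js", hash123);

function handleRpc(path: string, body: string): string {
  // Path layout matches the client: "/api/" + file + "/" + symbolName.
  const rest = path.slice("/api/".length);
  const split = rest.lastIndexOf("/");
  const file = rest.slice(0, split);
  const symbolName = rest.slice(split + 1);
  const ref = modules.get(file)?.get(symbolName);
  if (!ref) throw new Error(`no such function: ${file}#${symbolName}`);

  const { args, lexicalScope } = JSON.parse(body) as {
    args: unknown[];
    lexicalScope: unknown[];
  };
  const previous = _lexicalScope;
  try {
    _lexicalScope = lexicalScope; // restore the captured variables
    return JSON.stringify(ref(...args) ?? null);
  } finally {
    _lexicalScope = previous;
  }
}

const response = handleRpc(
  "/api/./hash123.js/id456",
  JSON.stringify({ args: ["clicked"], lexicalScope: [1700000000000] })
);
```

Note that the scope is swapped in only for the synchronous part of the call, which is why the generated closure reads serverLexicalScope() on its first line.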

So far, we have shown you how to extract simple functions and closures so that they can run on the server or be lazy loaded. The results are pretty powerful. Now, what if you could take this to the extreme and build a framework from the ground up that takes these ideas and incorporates them everywhere? Well, Qwik is just such a framework, and it allows some amazing things in terms of lazy-loading, lazy-execution, and mixing of server/client code. Check out this example:

import { component$ } from "@builder.io/qwik";
import { routeLoader$ } from "@builder.io/qwik-city";
import { server$ } from "./temp";

export const useMyData = routeLoader$(() => {
  // ALWAYS RUNS ON SERVER
  console.log("SERVER", "fetch data");
  return { msg: "hello world" };
});

export default component$(() => {
  // RUNS ON SERVER OR CLIENT AS NEEDED
  const data = useMyData();
  return (
    <>
      <div>{data.value.msg}</div>
      <button
        onClick$={async () => {
          // RUNS ALWAYS ON CLIENT
          const timestamp = Date.now();
          const value = await server$(() => {
            // ALWAYS RUNS ON SERVER
            console.log("SERVER", timestamp);
            return "OK";
          })();
          console.log("CLIENT", value);
        }}
      >
        click
      </button>
    </>
  );
});

Look how seamless it is to mix server and client code, and it is all thanks to code extraction.

The obvious question is, won’t that make it easy to leak secrets? In the simplified form shown here, yes. The actual implementation is a bit more complex to ensure that secrets aren’t sent to the client, but that is for another article.

We are already mixing server/client code in a single file in existing technologies using the export extraction pattern. But the solutions are limited. The new frontier of technologies will allow you to mix the code even further through function extraction and closure extraction. This can be done in a way that allows the developer to create their own marker functions and take advantage of code splitting like never before.
