The Superpower of Babel: How We Saved 16% on Our Bundle Size

Omri Lavi

Intro

Web developers are all too familiar with the challenge of optimizing bundle sizes to ensure smooth, fast-loading web applications. Recently, we tackled this issue head-on by developing a unique Babel loader that decreased our bundle sizes by up to 16%. But our journey was not just about the end result. Throughout the development process, we gained valuable insights into the planning, design, and implementation of Babel loaders, as well as the benefits and limitations of using Babel in web development. In this article, we’ll share the lessons we learned and the solutions we discovered as we worked to optimize our web applications. Join us as we explore the challenges of bundle optimization and the potential benefits of using a custom Babel loader.

The Challenge (AKA a Crazy Idea)

Managing bundle sizes can be a daunting task for web developers, especially when files with constant values contribute to the bulk of the bundle. We faced a similar issue, where large files containing only constant values were being added to the bundle, even if only a few values were actually being used. While tree shaking seemed like a potential solution, we quickly realized that it doesn’t work with internal object properties. Let’s examine a sample file to better understand the issue at hand:

// config.js
export default {
 SERVICE1: {
   project_id: "some_id",
   url: "some_url",
   //...
 },
 SERVICE2: {
   api_url: "service2 url",
   //...
 },
 //... many more keys ...
}

When a large key-value object, such as SERVICE1, is referenced even once in a single Webpack entry point, the entire object is included in the compilation’s output. In many cases, other top-level sibling properties, like SERVICE2, are also included, resulting in a bundle that may contain thousands of unnecessary values. This can significantly impact the performance of the web application. To address this issue, we came up with a simple solution: instead of including the entire object, we inlined only the necessary values. By doing so, we were able to reduce the size of our bundle and improve the overall performance of our web application. 

Show Me an Example Already! 👀

Let’s take a look at a sample file that imports the config.js file and references a specific value, followed by our plugin’s output for that file. As you can see, the plugin inlines only the necessary values, resulting in a much smaller and more efficient bundle.

// Original version
import { SERVICE1 } from './config';
console.log(SERVICE1.url);

// Plugin's output
console.log("some_url");

After inlining the referenced values, the runtime behavior of the optimized code is indistinguishable from the original code. This goes further than tree shaking: instead of keeping a trimmed-down object in the bundle, it removes the large objects completely and leaves only the values that are actually used. Our goal was to create a tool that would automate the inlining process and provide a simple and effective way to reduce the size of our bundles.

Babel, We Choose You!

We knew we needed a tool that would let us do the following:

  1. Go through all the files.
  2. Identify references to files that can be inlined.
  3. Inline the references.
  4. Remove the imports for the files that were inlined.

We initially considered using build-time tools, which typically process all files in the bundle. However, we also needed to parse and manipulate the Abstract Syntax Trees (ASTs) of the files in order to inline the referenced values. While Webpack can handle parsing and manipulating imports, it is not very effective at manipulating the ASTs of the files. In contrast, Babel is a powerful tool for AST manipulation. 
Another consideration was that we might migrate from Webpack to another build tool in the future, and we didn’t want to couple the solution to Webpack. We preferred marrying ourselves to Babel over Webpack.

For these reasons we chose to implement the solution as a Babel plugin, which will be used along with babel-loader.
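
The four steps listed above can be sketched as a small Babel plugin. This is a minimal illustration under stated assumptions, not the actual plugin: the names primitivesInlinerSketch, transformWithSketch, and PURE_EXPORTS are hypothetical, the exports of the inlined file are hard-coded instead of being loaded from disk, and only string values reached through named imports are handled. Anything else throws, in the spirit of failing loudly on unknown cases.

```javascript
// Hypothetical stand-in for the evaluated exports of an inlinable file.
// A real plugin would compute these by loading the file from disk.
const PURE_EXPORTS = {
  "./config.inlined.pure": { SERVICE1: { url: "some_url" } },
};

// A Babel plugin is a function that receives the Babel API and returns a visitor.
function primitivesInlinerSketch({ types: t }) {
  return {
    visitor: {
      ImportDeclaration(importPath) {
        // Step 2: identify imports of files that can be inlined.
        const pureExports = PURE_EXPORTS[importPath.node.source.value];
        if (!pureExports) return;

        for (const specifier of importPath.get("specifiers")) {
          if (!specifier.isImportSpecifier() || !t.isIdentifier(specifier.node.imported)) {
            throw specifier.buildCodeFrameError("This sketch supports only named imports.");
          }
          const exported = pureExports[specifier.node.imported.name];
          const binding = importPath.scope.getBinding(specifier.node.local.name);

          for (const reference of binding.referencePaths) {
            const member = reference.parentPath;
            const isDirectAccess =
              member.isMemberExpression() &&
              member.node.object === reference.node &&
              !member.node.computed;
            if (!isDirectAccess) {
              throw reference.buildCodeFrameError(
                "Only direct, non-computed property accesses can be inlined."
              );
            }
            const value = exported?.[member.node.property.name];
            if (typeof value !== "string") {
              throw member.buildCodeFrameError("This sketch inlines only string values.");
            }
            // Step 3: replace the reference with the literal value.
            member.replaceWith(t.stringLiteral(value));
          }
        }
        // Step 4: remove the import of the inlined file.
        importPath.remove();
      },
    },
  };
}

// Runs the sketch on a source string (step 1 would call this for every file).
function transformWithSketch(source) {
  const babel = require("@babel/core"); // assumes @babel/core is installed
  return babel.transformSync(source, {
    babelrc: false,
    configFile: false,
    plugins: [primitivesInlinerSketch],
  }).code;
}
```

Calling transformWithSketch on a file that imports SERVICE1 from "./config.inlined.pure" and logs SERVICE1.url should leave only the console.log call with the string literal, with the import gone.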

Our North Stars ✨

The following principles guided us throughout the development of the plugin:

Safety

Our top priority while developing the plugin was to ensure that it would not break anything. Since the plugin would be removing references, imports, and even entire files from the original code, debugging any issues caused by the plugin could have been challenging. For this reason, we avoided making any assumptions about the code while developing the plugin. If there was a scenario in the plugin’s flow that we weren’t sure how to handle, we preferred throwing an error. For example, if we encounter an unexpected type of node while traversing a file’s AST, we throw an error instead of trying to handle the case. Additionally, to increase our confidence, we relied heavily on automated validation scripts. By following this principle of safety, we were able to ensure that the plugin would be reliable and stable.

Incremental Changes

When we started developing the plugin, we already had a clear idea of which files we wanted to inline. Instead of trying to inline everything at once, which would have been risky and could cause unforeseen issues, we took an agile approach. We enabled a gradual opt-in process by inlining only files with the “.inlined.pure.js” extension. To start inlining a file, all we had to do was change its extension. This gave us complete control over which cases we handled and which parts could potentially break during the migration process.

The beauty of this approach was that we could expand the scope of supported scenarios. We started with a small range of scenarios and threw errors for any scenario we couldn’t handle. Each time we migrated a new file with a new scenario, we made small changes to the plugin to support it. This allowed us to progress quickly with confidence.

Developer Experience

To ensure our developers have the best experience possible, we have included all necessary validations within the plugin. This means developers don’t need to remember any rules for using the plugin, as it can handle the required assertions for them. This has made the plugin’s adoption much easier and more straightforward. To further enhance the developer experience, we have also included detailed instructions for all errors that may arise, along with the use of built-in Babel features to generate more informative compilation failure messages.

Unexpected dynamic import or require of a pure-inline file "./config.inlined.pure" from file "/path/to/file.js".

Files that are purely inlined are only allowed to be imported using a regular ES6 import, not with require or dynamic imports.

   4 |
 > 5 | const { SERVICE1 } = require("./config.inlined.pure");
     |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   6 |

Performance

Simply put, we wanted our plugin to be fast.

Overcoming the Challenges

Developing the plugin was a complex task that presented us with several challenges. Here are the main obstacles we faced and how we managed to overcome them.

Challenge #1. Understanding the Limitations

We understood early on that pure files are very special, and they have unique limitations:

Side Effects

Pure files cannot have any side effects, since their contents will be removed during the compilation. We focused on two types of side effects that are forbidden in Pure files:  global references and importing non-pure files.

Global References

Global references are references to global objects such as “window” or “navigator”. These could implicitly mean a side effect, either by altering some state, or by relying on an external resource that may change at runtime. Pure files need to be entirely static, and their contents must not rely on anything which is not pure.

Detecting global references was easy using Babel. We traverse the code, and visit each node of type “ReferencedIdentifier”. If the identifier has no binding in the current scope, it means it is unbound and thus a global reference.

The implementation looks like this:

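const babel = require("@babel/core");
const isEqual = require("lodash/isEqual"); // assuming isEqual comes from lodash

// Collects the names of all referenced identifiers that have no binding in scope.
async function getIdentifiersWithoutBindings(code, filename) {
 const refs = new Set();
 await babel.transformAsync(code, {
   filename, // lets Babel report errors against the right file
   plugins: [
     () => ({
       visitor: {
         ReferencedIdentifier: babelPath => {
           const name = babelPath.node.name;
           // Passing true (noGlobals) keeps built-in globals from counting as bound.
           if (!babelPath.scope.hasBinding(name, true)) {
             refs.add(name);
           }
         }
       }
     })
   ]
 });
 return refs;
}

// Note: this part must run inside an async function (or an ES module).
const [originalRefs, transpiledRefs] = await Promise.all([
   getIdentifiersWithoutBindings(originalContent, originalFile),
   getIdentifiersWithoutBindings(transpiledContent, transpiledFile)
]);

// isEqual compares the two Sets by their contents.
const hasDifferentRefs = !isEqual(originalRefs, transpiledRefs);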

This implementation is not bulletproof, since it collects some extra identifiers which may not be global references. However, since we collect the identifiers the same way on the original and the transpiled code, the comparison between the two is still reliable.

Importing Non-pure Files

Ideally, we would have wanted to forbid any import in Pure files, to be extra safe regarding side effects. However, we had some Pure files that depended on other Pure files, and we didn’t want to force major changes in the codebase. Allowing imports of other Pure files was a good solution, since we knew that other Pure files shouldn’t have side effects either.

Direct References Only

To keep things simple, we only allowed direct references to primitive values coming from Pure files. Non-primitive values, like objects, were not possible to inline, since they could potentially represent something stateful. For example, consider the following Pure file:

// configuration.inlined.pure.js
module.exports = {
 SERVICE1: {
   apiUrl: "some_url"
 }
}

And the file that is consuming it:

// index.js
import { SERVICE1 } from './configuration.inlined.pure';
const obj = SERVICE1; // invalid - SERVICE1 is an object, and can't be inlined.
console.log(obj.apiUrl);

// instead, do this:
console.log(SERVICE1.apiUrl);

By only allowing direct references to values from Pure files, we eliminated an entire class of edge cases that would have required advanced AST traversals. This kept the plugin’s implementation much simpler.

First Plugin to Run

Since we only detect static imports, we needed the primitives-inliner plugin to run first among the Babel plugins. Other plugins may transform import or require statements, so by running first, our plugin sees the code as it was written, which minimizes the scenarios we need to support.

Static Imports Only

Like many other AST-based tools, we could not detect dynamic requires or imports easily. Since we actually removed Pure modules from the bundle’s output, it would have been dangerous to have dynamic connections to them. We decided it would be good enough to only support static imports. 

import { SERVICE1 } from './config.inlined.pure'; // OK

const { SERVICE2 } = await import('./config.inlined.pure'); // NOT OK
const { SERVICE3 } = require('./config.inlined.pure'); // NOT OK

In 99% of the cases, it wasn’t a problem, since the imports were static anyway. 

Challenge #2. CommonJS vs ESM

The plugin works by identifying references to Pure files, extracting the referenced keys, computing their values, and replacing the original references with those values. To compute a Pure file’s exports, we use a simple require() call. However, we faced a limitation when dealing with Pure files written in ESM format, as Node.js doesn’t support loading them this way by default. To address this, we wrapped the require() call with try/catch and examined the error thrown. If the error indicated that the file is in ESM format, we transpiled it to CommonJS format using Babel (with @babel/plugin-transform-modules-commonjs), and then ran the transpiled code to retrieve its exports. This required a custom implementation, which is beyond the scope of this article.
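
To give a rough idea of the fallback flow, here is a simplified sketch. It is not our actual implementation: the function names are made up for illustration, the list of error signatures is an assumption that depends on the Node.js version, and the Babel transpilation step is passed in as a hypothetical callback so that the sketch stays self-contained.

```javascript
// Decides whether an error thrown by require() suggests an ESM file.
// The exact error varies between Node.js versions, so this list is an assumption.
function looksLikeEsmError(error) {
  if (error && error.code === "ERR_REQUIRE_ESM") return true;
  return (
    error instanceof SyntaxError &&
    /Cannot use import statement|Unexpected token 'export'/.test(error.message)
  );
}

// Evaluates CommonJS source text and returns its module.exports.
function evaluateCommonJs(source) {
  const module = { exports: {} };
  const run = new Function("module", "exports", "require", source);
  run(module, module.exports, require);
  return module.exports;
}

// transpileToCommonJs is a hypothetical callback: (filename) => CommonJS source text.
function loadPureExports(filename, transpileToCommonJs) {
  try {
    return require(filename);
  } catch (error) {
    if (!looksLikeEsmError(error)) throw error; // unknown failure: fail loudly
    return evaluateCommonJs(transpileToCommonJs(filename));
  }
}
```

Errors that do not look ESM-related are rethrown as they are, which matches the principle of throwing on anything unexpected.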

Challenge #3. Computed Properties

In JavaScript, you may access an object’s properties in two ways: as static properties (e.g. myObject.myKey) or as computed properties (e.g. myObject['myKey']). Computed properties were a big issue for us. Let’s say we have a Pure file named config.inlined.pure.js :

module.exports = {
 SERVICE1: {
   apiUrl: "some_url"
 }
}

And the consuming file, index.js:

import { SERVICE1 } from "./config.inlined.pure";

const property = 'apiUrl';
const apiUrl = SERVICE1[property]; // <--
console.log(apiUrl);

Even though it’s a simple case, it would be very difficult to understand what value is being referenced from config.inlined.pure.js without running the file. 

For this reason, we decided to support only non-computed property accesses in references of Pure files. 

/path/to/file.js: Accessing computed values of constant files is not supported. Please use static with dot-notations, for example - "foo.bar" instead of "foo['bar']"
     7 |
     8 | const property = "apiUrl";
  >  9 | const apiUrl = SERVICE1[property]; // <--
       |                ^^^^^^^^^^^^^^^^^^
    10 | console.log(apiUrl);

It may sound too strict, but in practice it’s not a real limitation. The referenced values are practically always known upfront, and are almost never computed at runtime.

Challenge #4. Validating the Pure Files

As mentioned before, we wanted the plugin to perform all the required validations. Among these, there were limitations on the Pure files themselves (e.g. no side effects). Our initial implementation was as follows:

  1. All .js files were loaded via Webpack with babel-loader.
  2. Thus, our Babel plugin ran on all .js files:
    1. If the file is a Pure file, validate it as a Pure file.
    2. Otherwise, treat it as a regular file, and perform the inlining (as described earlier)

At first this approach seemed to work well. However, we quickly discovered a serious issue in production builds. 

When Webpack loads a file via Babel, it determines the file’s dependencies from the transpiled output, not from the original source. Why does this matter? Because our plugin has just removed the imports of the Pure files from that output. This means that the Pure files are no longer loaded via Babel, so our plugin never gets the chance to validate their content.

Our solution was rather simple. Our plugin “knew” which Pure files were being removed. So just before we remove the imports of the files, we perform the validations on them. For example, in this file:

import { SERVICE1 } from "./config.inlined.pure";

console.log(SERVICE1.apiUrl);

When the plugin removes the import statement of config.inlined.pure.js, it first validates its content as a Pure file, and only then removes the import. 

The validation itself is performed with an ad-hoc Babel traversal on the Pure file. We also cached the pure files that were already validated in this build, to prevent the same file from being validated multiple times if it is imported from many places.

This way we were sure that any Pure file that was being referenced in the compilation, had to be valid (otherwise, an error would have been thrown).
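
The caching part can be sketched in a few lines. The names createCachedValidator and validateFile are hypothetical. validateFile stands for whatever performs the actual Babel traversal and throws when the file is not a valid Pure file.

```javascript
const path = require("path");

// validateFile is a hypothetical callback that throws if the file is not a valid Pure file.
function createCachedValidator(validateFile) {
  const validated = new Set(); // absolute paths that already passed validation

  return function validateOnce(filename) {
    const absolutePath = path.resolve(filename);
    if (validated.has(absolutePath)) return;
    validateFile(absolutePath); // throws on an invalid file, so failures are never cached
    validated.add(absolutePath);
  };
}
```

Because a file is added to the cache only after validation succeeds, an invalid Pure file keeps failing on every import that references it.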

Challenge #5. Tests

Testing the plugin was not a simple task. Writing unit tests was easy – but how could we know that the plugin works as it should? We needed some kind of an end-to-end testing mechanism. 

After researching various testing mechanisms, we discovered “Exec Tests.” These tests involve transpiling “test cases” of the plugin and running the output.

To create Exec Tests for the Pure plugin, we wrote a series of test files that included various Pure modules and references to those modules. We then ran these test files through Webpack with the Pure plugin installed and compared the output to the expected results.

This method of testing allowed us to ensure that the plugin was correctly inlining Pure modules and removing references to them, without any unintended side effects.

Let’s see an example. Let’s say we have a Pure file named config.inlined.pure.js :

module.exports = {
 SERVICE1: {
   apiUrl: "some_url"
 }
}

And a consumer index.js file:

import { SERVICE1 } from "./config.inlined.pure";

output.push(SERVICE1.apiUrl);  // <--

Please notice the reference to the global variable “output”. During the test, we transpile the file index.js with our plugin, and receive the code:

output.push("some_url");

Then, we take this code, and wrap it with a function that has an array in its scope, named output:

const result = new Function(`
   const output = [];
   ${code}
   return output;
`)();

By doing that, the array result holds all the strings that we injected in the original code. 

Why is this useful? Because we can run the same file with and without the primitives-inliner and compare the contents of the two result arrays. If the contents are identical, we can be confident that the inlined code behaves the same as the original for that test case.

By writing such Exec Tests we were able to make changes in the plugin with high confidence, while also adding new test cases easily.

Challenge #6. Webpack Aliases

The next challenge we had was to make sure we resolve paths in the exact same way as Webpack. If we resolve the aliases differently, we risk getting the wrong value being referenced and inlined. This could have been a serious issue, and we couldn’t risk any mistake.

Webpack uses enhanced-resolve behind the scenes. The options in Webpack’s “resolve” property are passed to a new instance of enhanced-resolve, and this instance is used whenever a path needs to be resolved. By passing the same “resolve” configuration to the primitives-inliner plugin, we get the exact same behavior as Webpack.

This way we were able to handle Pure files that are imported using aliases as well.
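
As an illustration, a resolver with the same options could be created roughly like this. It is a sketch that assumes the enhanced-resolve package is installed. The function name createWebpackLikeResolver is hypothetical, and so are the alias and paths in the usage note.

```javascript
// Builds a synchronous resolver from the same options object that is passed
// to Webpack's `resolve` key (alias, extensions, modules, and so on).
function createWebpackLikeResolver(resolveOptions) {
  const { create } = require("enhanced-resolve"); // assumes enhanced-resolve is installed
  const resolveSync = create.sync(resolveOptions);

  // importerDir: directory of the importing file. request: the import source string.
  return (importerDir, request) => resolveSync(importerDir, request);
}
```

For example, with the hypothetical options { alias: { "@config": "/project/src/config" }, extensions: [".js"] }, the request "@config/services.inlined.pure" would be looked up under /project/src/config, just as Webpack would do with the same settings.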

Challenge #7. Safe Incremental Migration

We had many files in our codebase that we wanted to convert to Pure files. We referred to this process as “Purifying the file”, and it usually involved two steps. First, we would rename the file, adding the “.inlined.pure.js” extension to it. Then, we needed to update all the imports for this file to point to the new file name. As it turned out, purifying our main codebase (AKA The Monolith) was not a simple task. Here’s a short list of the issues we had:

Problem: Barrel Files

Some files in the codebase were structured in a Barrel file pattern, where multiple files are imported and re-exported from a single index file. This pattern made it difficult to determine which files were actually referencing a specific Pure file. Instead of importing the Pure files directly, many files were importing the Barrel files, which in turn were importing and re-exporting the Pure files.

Problem: “Popular” Files

Some of the files that the team wanted to inline were used extensively throughout the codebase, with potentially hundreds of import statements pointing to them. For instance, there was a file that contained all the feature flag configurations, which was used in many different files. Renaming such files to add the “.inlined.pure.js” extension and updating their import statements to point directly to them could be a daunting task, as it was not clear which files referenced them directly and which used them through a barrel file pattern. Additionally, updating all the import statements could potentially break the code due to the plugin’s implementation. Therefore, we had to be careful and methodical in our approach to purifying these files.

Problem: Confidence

We wanted to be sure that we didn’t leave “dead” references to files that were being purified. Meaning, we wanted to avoid cases where the import statement of a file was removed, but some of the references to it were not removed. This would have caused the worst types of bugs. In such scenarios, we’ll only discover it when running the code. Since we use the plugin only in production builds, it means we would have discovered these bugs only in actual production environments – which is not acceptable.

The Solution: A Smart Script

We developed a script to automate the process of purifying files which consisted of two main parts: code modification and validation. The modification part involved identifying all direct imports of the file and its associated barrel files and replacing the imports of the barrel files with direct imports of the underlying files. This method successfully handled 99% of the imports. The second part of the script was the validation phase where we ensured that the plugin was functioning as expected for the newly purified files. We transpiled the updated files with Babel using the primitives-inliner plugin and then compared the global references of the original file before transpilation with the global references of the transpiled code. If the number of global references was different, it indicated that our plugin removed an import statement without changing the referenced imported values, which was an issue that needed to be addressed.

For example, consider the following index.js file:

import { CONST1 } from "./config1.inlined.pure";
import { CONST2 } from "./config2.inlined.pure";

console.log(CONST1);
console.log(CONST2);

It has a single global reference to the “console” object. If for some reason it had more than a single global reference after transpiling the file with our plugin, it necessarily meant that our plugin removed the import statement without inlining the value. If such a scenario happens, the script throws an error with all the details we need for debugging.
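
The comparison and the error it raises can be sketched like this. The function names are hypothetical, and the two sets are assumed to be collected in the same way for the original and the transpiled code.

```javascript
// Returns the names that appear in only one of the two sets.
function diffGlobalReferences(originalRefs, transpiledRefs) {
  const added = [...transpiledRefs].filter(name => !originalRefs.has(name));
  const removed = [...originalRefs].filter(name => !transpiledRefs.has(name));
  return { added, removed };
}

// Throws a descriptive error if transpilation changed the global references.
function assertSameGlobalReferences(filename, originalRefs, transpiledRefs) {
  const { added, removed } = diffGlobalReferences(originalRefs, transpiledRefs);
  if (added.length > 0 || removed.length > 0) {
    throw new Error(
      `Global references changed in "${filename}". ` +
        `Added: [${added.join(", ")}]. Removed: [${removed.join(", ")}].`
    );
  }
}
```

In the example above, if the import of CONST2 were removed without inlining its value, the transpiled set would contain both “console” and “CONST2”, and the error would list CONST2 as an added global reference.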

The script was a huge part of the plugin’s success. Without it, “purifying” our entire codebase would have taken a very long time.

Summary

In this article we’ve talked about the different challenges we had while developing this Babel plugin, and how we overcame them. Overall, we used this plugin to “Purify” ~175 files. We needed to update 2780 files that used these purified files, while updating more than 4200 import statements. This was a long process, but it allowed us to save up to 16% of each of our bundles. For us, it was a fun and unique experience, and we learned a lot. 

Using Babel is truly a superpower, and it can enable amazing things. We hope that this article will help you understand some of the complexities involving developing a Babel plugin. 

Omri Lavi
Client Infrastructure Tech Lead @ monday.com