Design
The Query Engine (Rust)
The core engine is written in Rust, and is designed to support multiple languages. It is responsible for:
- Parsing source code into an AST.
- Creating graphs from source code.
- Responding to queries from the adapters.
- Managing the cache.
turbo-tasks
The core engine is based on turbo-tasks, an incremental computation framework written in Rust that supports persistent caching.
Note that the cache includes the task graph itself. So if you change the code, the affected tasks are re-computed, but the task graph itself is restored from the cache instead of being rebuilt from scratch.
This is not common in other incremental computation frameworks.
Caching the task graph
turbo-tasks is designed to support huge codebases. If you work for a huge company, you may know that disk I/O becomes the bottleneck for initial startup.
Although parsing is very fast and embarrassingly parallel, there are too many files to parse just to draw the dependency graph.
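To make the idea concrete, here is a minimal sketch of a cache that re-parses a file only when its fingerprint changes. It is not the turbo-tasks API: ImportCache, imports_for, and the fingerprint parameter (for example an mtime or a content hash supplied by the caller) are hypothetical names, and the cache lives in memory where a real implementation would persist it to disk.

```rust
use std::collections::HashMap;

/// Cached parse result for one file (hypothetical layout).
struct Entry {
    /// Fingerprint the imports were computed from, e.g. an mtime or content hash.
    fingerprint: u64,
    /// Import specifiers found in the file.
    imports: Vec<String>,
}

/// In-memory stand-in for a persistent cache. This is an illustration only,
/// not how turbo-tasks stores its task graph.
#[derive(Default)]
pub struct ImportCache {
    entries: HashMap<String, Entry>,
}

impl ImportCache {
    /// Returns the imports of `path`. `parse` is called only when there is no
    /// entry yet or the stored fingerprint differs from `fingerprint`.
    pub fn imports_for<F>(&mut self, path: &str, fingerprint: u64, parse: F) -> &[String]
    where
        F: FnOnce() -> Vec<String>,
    {
        let stale = self
            .entries
            .get(path)
            .map_or(true, |entry| entry.fingerprint != fingerprint);
        if stale {
            let imports = parse();
            self.entries
                .insert(path.to_string(), Entry { fingerprint, imports });
        }
        &self.entries[path].imports
    }
}
```

With a cache like this, a second startup with unchanged fingerprints never invokes the parser, so the dependency graph can be drawn without re-parsing every file.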
node-file-trace
You can easily auto-detect ESM imports and exports, but for users, that is not enough. The behavior of a program may depend on other kinds of files.
For example, it may read a SQL file and execute it. It may read a JSON file and parse it. It may read a .env file and load the environment variables.
Of course, taskend provides APIs to explicitly declare these dependencies, but that is not convenient for users.
Vercel did a great job creating node-file-trace, a tool that detects all the dependencies of a Node.js program.
We use it. There's already a Rust implementation of it, so we can easily use it from our Rust code.
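As a toy illustration of why non-code files matter, the sketch below scans source text for string literals that look like asset paths. This is not how node-file-trace works (it performs real static analysis); asset_references and ASSET_EXTENSIONS are hypothetical names, and the scanner ignores escape sequences and comments.

```rust
/// File extensions treated as non-code dependencies in this sketch (an assumption).
const ASSET_EXTENSIONS: [&str; 3] = [".sql", ".json", ".env"];

/// Collects the contents of quoted string literals in `source` that end with
/// one of `ASSET_EXTENSIONS`, in order of first appearance, without duplicates.
pub fn asset_references(source: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();
    let mut chars = source.chars();
    while let Some(c) = chars.next() {
        if c == '"' || c == '\'' || c == '`' {
            // Read up to the matching closing quote. Escapes are ignored,
            // which is good enough for an illustration.
            let literal: String = chars.by_ref().take_while(|&x| x != c).collect();
            let is_asset = ASSET_EXTENSIONS.iter().any(|ext| literal.ends_with(*ext));
            if is_asset && !found.contains(&literal) {
                found.push(literal);
            }
        }
    }
    found
}
```

Given a file that calls fs.readFileSync("./queries/users.sql") and require('./config.json'), this returns both paths, neither of which shows up in the ESM import graph.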
Dynamically sized compile unit
For ECMAScript, we can easily track all dependencies of a file. But for other languages, it is not easy.
So we have various kinds of compile units. For example, for a Rust project using cargo, a crate is a compile unit.
The compile unit is the base unit of dependency queries and test execution.
This means that if you change a file in a cargo project, the crate containing it and all crates that depend on it will be re-compiled and re-tested.
You may think this is not efficient, but it's about correctness, and it's not that bad.
For example, the CI time of the SWC project is largely dominated by the time spent on the swc_plugin_runner crate.
But it does not change frequently, even indirectly. So with taskend, we can skip the tests of the swc_plugin_runner crate in most cases, and that's a huge win.
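The rule "a change re-tests the changed compile unit and everything that depends on it" can be sketched as a breadth-first walk over reversed dependency edges. This is a minimal sketch, not taskend's implementation: affected_units and the graph representation are hypothetical, and the unit names in the usage note are made up.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Returns `changed` plus every compile unit that depends on it, directly or
/// transitively. `graph` maps each unit to the units it depends on directly.
pub fn affected_units<'a>(
    graph: &HashMap<&'a str, Vec<&'a str>>,
    changed: &'a str,
) -> BTreeSet<&'a str> {
    // Invert the edges: unit -> units that depend on it.
    let mut dependents: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for (unit, deps) in graph {
        for dep in deps {
            dependents.entry(*dep).or_default().push(*unit);
        }
    }

    // Breadth-first walk over the inverted edges, starting at the change.
    let mut affected = BTreeSet::new();
    let mut queue = VecDeque::from([changed]);
    while let Some(unit) = queue.pop_front() {
        // `insert` returns false for units already visited, which also
        // guards against revisiting shared dependents.
        if affected.insert(unit) {
            if let Some(users) = dependents.get(unit) {
                queue.extend(users.iter().copied());
            }
        }
    }
    affected
}
```

For a made-up graph where cli depends on parser and plugin_runner, and both of those depend on core, a change in parser affects only parser and cli. The expensive plugin_runner tests are skipped, and they run again only when plugin_runner or core changes.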
Special thanks
- turbo-tasks (Vercel)
- node-file-trace (Vercel)
- haetae