# Building a custom web-tree-sitter

Tree-sitter parsers often use external C scanners, and those scanners sometimes use functions in the C standard library. For this to work in a WASM environment, web-tree-sitter needs to have anticipated which stdlib functions will need to be available. If a tree-sitter parser uses stdlib function X, but X is not included in [this list of exports](https://github.com/tree-sitter/tree-sitter/blob/master/lib/binding_web/exports.json), the parser will fail to work and will throw an error whenever it hits a code path that uses the rogue function.

For this reason, Pulsar builds a custom web-tree-sitter. Every time someone tries to integrate a new tree-sitter parser into a Pulsar grammar, they might find that the parser relies on some stdlib function we haven’t included yet, in which case they can let us know and we’ll be able to update our web-tree-sitter build so that it can export that function.

Pulsar will need to do this until [tree-sitter#949](https://github.com/tree-sitter/tree-sitter/issues/949) is addressed in some way.

## Check out the modified branch for the version we’re targeting

At time of writing, Pulsar was targeting web-tree-sitter version 0.20.7, so a branch exists [on our fork](https://github.com/pulsar-edit/tree-sitter/tree/v0-20-7-modified) called `v0-20-7-modified`. That branch contains a modified `exports.json` file and a modified script for building web-tree-sitter.

When we target a newer version of web-tree-sitter, a similar branch should be created against the corresponding upstream tag. The commits that were applied on the previous modified branch should be able to be cherry-picked onto the new one rather easily.

## Add whatever methods are needed to `exports.json`

For instance, tree-sitter-ruby introduced a new dependency on the C stdlib function `iswupper` a while back, and web-tree-sitter doesn’t export that one by default.
So we can add the line

```
"_iswupper",
```

in an appropriate place in `exports.json`, then rebuild web-tree-sitter so that the WASM-built version of the tree-sitter-ruby parser has that function available to it.

If a third-party tree-sitter grammar needs something more esoteric, our default position should be to add it to the build. If the export results in a major change in file size or (somehow) performance, then the change can be discussed.

## Run `script/build-wasm` from the root

To build web-tree-sitter for a particular version, make sure you’re using the appropriate version of Emscripten. [This document](https://github.com/sogaiu/ts-questions/blob/master/questions/which-version-of-emscripten-should-be-used-for-the-playground/README.md) is useful for matching up tree-sitter versions with Emscripten versions.

The default `build-wasm` script performs minification with terser. That’s easy enough to turn off (and we do), but even without minification, Emscripten generates a JS file that doesn’t have line breaks or indentation. We fix this by running `js-beautify` as a final step.

Pulsar, as a desktop app, doesn’t gain a _lot_ from minification, and ultimately it’s better to have a source file that the user can more easily debug if necessary. And it makes the next change a bit easier:

## Add a warning message

When a parser tries to use a stdlib function that isn’t exported by web-tree-sitter, the error that’s thrown is not very useful. So we try to detect when that scenario is going to happen and insert a warning in the console to help users who might otherwise be befuddled.

This may be automated in the future, but for now you can modify `tree-sitter.js` to include the `checkForAsmVersion` function:

```js
var Module = typeof Module !== "undefined" ? Module : {};
var TreeSitter = function() {

  // Warn when a parser asks for a symbol that this build does not export.
  function checkForAsmVersion(prop) {
    if (!(prop in Module['asm'])) {
      console.warn(`Warning: parser wants to call function ${prop}, but it is not defined. If parsing fails, this is probably the reason why. Please report this to the Pulsar team so that this parser can be supported properly.`);
    }
  }

  var initPromise;
  var document = typeof window == "object" ? {
    currentScript: window.document.currentScript
  } : null;
```

You can then search for this line

```js
if (!resolved) resolved = resolveSymbol(prop, true);
```

and add the following line right below it:

```js
checkForAsmVersion(prop);
```

The line in question is [generated by Emscripten](https://github.com/emscripten-core/emscripten/blob/67ebee3261629f7e3c2bd24b61098af0c730d8d9/src/library_dylink.js#L699), so if it changes in the future, you should be able to look up its equivalent in the correct version of Emscripten.

## Copy it to `vendor`

Under `lib/binding_web` you’ll find the built files `tree-sitter.js` and `tree-sitter.wasm`. Copy both to Pulsar’s `vendor/tree-sitter` directory.

Relaunch Pulsar and do a smoke test with a couple of existing grammars to make sure you didn’t break anything.
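
As a complement to the smoke test, it may help to compare the functions a compiled parser imports against the entries in `exports.json`. The following is a minimal Node.js sketch, not part of Pulsar’s tooling. It assumes that `exports.json` is a flat JSON array of symbol names and that each C function appears there with a leading underscore (as in the `"_iswupper"` example above). The function names `findMissingExports`, `listFunctionImports`, and `checkParser` are hypothetical.

```javascript
const fs = require("fs");

// Pure helper: return the imported names that have no "_name" entry in the
// exports list. Assumption: exports.json prefixes each symbol with "_".
function findMissingExports(importNames, exportedSymbols) {
  const exported = new Set(exportedSymbols);
  return importNames.filter((name) => !exported.has(`_${name}`));
}

// Read the function imports declared by a compiled parser .wasm file.
// Non-function imports (memory, table, globals) are skipped.
function listFunctionImports(wasmPath) {
  const compiled = new WebAssembly.Module(fs.readFileSync(wasmPath));
  return WebAssembly.Module.imports(compiled)
    .filter((entry) => entry.kind === "function")
    .map((entry) => entry.name);
}

// Hypothetical entry point: both paths are supplied by the caller.
function checkParser(wasmPath, exportsJsonPath) {
  const exportedSymbols = JSON.parse(fs.readFileSync(exportsJsonPath, "utf8"));
  return findMissingExports(listFunctionImports(wasmPath), exportedSymbols);
}
```

Calling `checkParser` with the path to a parser’s `.wasm` file and the path to `lib/binding_web/exports.json` returns the candidate names to add. Treat the result as a hint rather than a verdict: a parser may import symbols that the runtime provides by other means, so some reported names could be false positives.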