Contributing
Code of Conduct
Contributors to Tree-sitter should abide by the Contributor Covenant.
Developing Tree-sitter
Prerequisites
To make changes to Tree-sitter, you should have:
- A C compiler, for compiling the core library and the generated parsers.
- A Rust toolchain, for compiling the Rust bindings, the highlighting library, and the CLI.
- Node.js and NPM, for generating parsers from grammar.js files.
- Either Emscripten, Docker, or Podman, for compiling the library to Wasm.
Building
Clone the repository:
git clone https://github.com/tree-sitter/tree-sitter
cd tree-sitter
Optionally, build the Wasm library. If you skip this step, then the tree-sitter playground command will require an internet
connection. If you have Emscripten installed, this will use your emcc compiler. Otherwise, it will use Docker or Podman:
cd lib/binding_web
npm install # or your JS package manager of choice
npm run build
Build the Rust libraries and the CLI:
cargo build --release
This will create the tree-sitter CLI executable in the target/release folder.
If you want to install the tree-sitter CLI on your system, you can run:
cargo install --path crates/cli
If you're going to be in a fast iteration cycle and would like the CLI to build faster, you can use the release-dev profile:
cargo build --profile release-dev
# or
cargo install --path crates/cli --profile release-dev
Testing
Before you can run the tests, you need to fetch some upstream grammars that are used for testing:
cargo xtask fetch-fixtures
To test any changes you've made to the CLI, you can regenerate these parsers using your current CLI code:
cargo xtask generate-fixtures
Then you can run the tests:
cargo xtask test
Similarly, to test the Wasm binding, you need to compile these parsers to Wasm:
cargo xtask generate-fixtures --wasm
cargo xtask test-wasm
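The steps above can be chained in a small wrapper. This is only a sketch and not part of the repository: the names run, test_core, and test_wasm and the DRY_RUN variable are made up for illustration, and only the cargo xtask subcommands come from this page.

```shell
# Hypothetical helpers; run, test_core, test_wasm, and DRY_RUN are
# illustrative names, not part of Tree-sitter.

# Print the command, then execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Fetch the fixture grammars, regenerate the parsers with the current CLI
# code, then run the tests. Extra arguments are passed to `cargo xtask test`.
test_core() {
  run cargo xtask fetch-fixtures &&
  run cargo xtask generate-fixtures &&
  run cargo xtask test "$@"
}

# Compile the fixture parsers to Wasm and test the Wasm binding
# (assumes the fixtures have already been fetched).
test_wasm() {
  run cargo xtask generate-fixtures --wasm &&
  run cargo xtask test-wasm
}

# Usage:
#   test_core                # full run
#   test_core -l javascript  # forward flags to `cargo xtask test`
#   DRY_RUN=1 test_wasm      # only print the commands
```

Each step is joined with &&, so a failed fetch or generate step stops the sequence before the tests run against stale fixtures.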
Wasm Stdlib
The tree-sitter Wasm stdlib can be built via xtask:
cargo xtask build-wasm-stdlib
This command looks for the Wasi SDK indicated by the TREE_SITTER_WASI_SDK_PATH
environment variable. If you don't have the binary, it can be downloaded from wasi-sdk's releases
page. Similarly, this command also looks for the wasm-opt tool from binaryen indicated by the TREE_SITTER_BINARYEN_PATH
environment variable. wasm-opt and the rest of the binaryen tool suite can be downloaded from the project's releases
page. Note that any change to crates/language/wasm/** requires rebuilding the tree-sitter Wasm stdlib via cargo xtask build-wasm-stdlib.
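Because the build depends on two environment variables, a quick pre-flight check can catch a missing or mistyped path before the build starts. The function below is a hypothetical sketch, not part of the repository. This page does not say whether each variable should name an install directory or a specific binary, so the check only confirms that the path exists. The paths in the usage comment are placeholders.

```shell
# Hypothetical pre-flight check (check_wasm_stdlib_env is an illustrative name).
# Succeeds only if both variables are set and point at an existing path.
check_wasm_stdlib_env() {
  for var in TREE_SITTER_WASI_SDK_PATH TREE_SITTER_BINARYEN_PATH; do
    # Indirect lookup of the variable named by $var (POSIX-compatible).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "$var is not set" >&2
      return 1
    fi
    if [ ! -e "$value" ]; then
      echo "$var points to a missing path: $value" >&2
      return 1
    fi
  done
  return 0
}

# Usage (placeholder paths, adjust to your installation):
#   export TREE_SITTER_WASI_SDK_PATH=/path/to/wasi-sdk
#   export TREE_SITTER_BINARYEN_PATH=/path/to/binaryen
#   check_wasm_stdlib_env && cargo xtask build-wasm-stdlib
```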
Debugging
The test script has a number of useful flags. You can list them all by running cargo xtask test -h.
Here are some of the main flags:
If you want to run a specific unit test, pass its name (or part of its name) as an argument:
cargo xtask test test_does_something
You can run the tests under the debugger (either lldb or gdb) using the -g flag:
cargo xtask test -g test_does_something
Part of the Tree-sitter test suite involves parsing the corpus tests for several languages and performing randomized edits
to each example in the corpus. If you just want to run the tests for a particular language, you can pass the -l flag.
Additionally, if you want to run a particular example from the corpus, you can pass the -e flag:
cargo xtask test -l javascript -e Arrays
If you are using lldb to debug the C library, tree-sitter provides custom pretty printers for several of its types.
You can enable these helpers by importing them:
(lldb) command script import /path/to/tree-sitter/lib/lldb_pretty_printers/tree_sitter_types.py
Published Packages
The main tree-sitter/tree-sitter repository contains the source code for
several packages that are published to package registries for different languages:
- Rust crates on crates.io:
  - tree-sitter: A Rust binding to the core library
  - tree-sitter-config: User configuration of the command-line tool
  - tree-sitter-cli: The command-line tool
  - tree-sitter-generate: The parser-generation library
  - tree-sitter-highlight: The syntax-highlighting library
  - tree-sitter-language: A language type for grammars to interact with the core library
  - tree-sitter-loader: The parser building and loading library
  - tree-sitter-tags: The syntax-tagging library
- JavaScript modules on npmjs.com:
  - web-tree-sitter: A Wasm-based JavaScript binding to the core library
  - tree-sitter-cli: The command-line tool
There are also several other dependent repositories that contain other published packages:
- tree-sitter/node-tree-sitter: Node.js bindings to the core library, published as tree-sitter on npmjs.com.
- tree-sitter/py-tree-sitter: Python bindings to the core library, published as tree-sitter on PyPI.org.
- tree-sitter/go-tree-sitter: Go bindings to the core library, published as tree_sitter on pkg.go.dev.
Release workflow
Tree-sitter follows semver (pre-1.0.0) with a dual development strategy:
- All development happens on the master branch, i.e., any new PR (bugfix or feature) must target master.
- Applicable bugfixes and minor improvements are backported to the latest release-0.x branch, where x is the latest minor version. This can be automated by adding the ci:backport release-0.x label to the PR. No new features or breaking changes should be backported. This ensures that patch releases are drop-in replacements that are always safe to update to (i.e., downstream users should never have to check patch versions for input or output changes).
Important: All crates within the project (see above) are versioned in lockstep with the exception of tree-sitter-language, which is versioned independently and only bumped when necessary.
Minor releases
Minor releases (0.x.0) are made from the master branch following these steps:
- Create a "release v0.x.0" PR on master and apply the ci:check release label to check for possible issues, in particular whether the language crate needs to be bumped.
- If the check release workflow indicates that it is needed, bump the patch version of the language crate. Important: This must be higher than the last published version on crates.io, which may be from a patch release! (I.e., the version may need to be bumped by two or more.)
- Once the PR is merged, tag the commit accordingly: tag v0.x.0 and push via git push --tags (maintainers only). This will trigger the release and publish workflows.
- Edit the GitHub release to include release notes (auto-generated, if nothing else).
- Create a new release-0.x branch and push to GitHub.
- Bump all version numbers (except for the language crate) to 0.{x+1}.0, e.g., via cargo xtask bump-version 0.{x+1}.0 and committing as "release: start working on 0.{x+1}.0". (This is important to be able to distinguish prerelease nightly builds from the last release by the --version output.) Note: this requires tooling for all involved languages, including npm and zig. Double-check that all versions are bumped correctly by grepping for the old version!
On average, minor releases should happen 1-3 times a year.
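The version arithmetic in the last step can be sketched as a small shell helper. The function name next_minor_version and the version numbers are hypothetical. Only cargo xtask bump-version and the advice to grep for the old version come from the steps above.

```shell
# Hypothetical helper: given the version just released (0.x.0 or 0.x.y),
# print the next development version 0.{x+1}.0.
next_minor_version() {
  minor="${1#0.}"        # strip the leading "0."   e.g. "25.0"
  minor="${minor%%.*}"   # keep only the minor part e.g. "25"
  echo "0.$((minor + 1)).0"
}

# Usage (0.25.0 is an example version, not a real release reference):
#   old=0.25.0
#   new="$(next_minor_version "$old")"
#   cargo xtask bump-version "$new"
#   git grep -nF "$old"   # inspect any remaining hits for missed bumps
```

The fixed-string search (-F) keeps the dots in the version from matching arbitrary characters, so the grep does not report unrelated numbers.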
Patch releases
Patch releases (0.x.y) are made from the release-0.x branch following these steps:
- Bump all version numbers (except for the language crate) to 0.x.y as described above.
- Create a "release v0.x.y" PR on release-0.x and apply the ci:check release label.
- If the check release workflow indicates that it is needed, bump the patch version of the language crate.
- Once the PR is merged, tag the commit accordingly: tag v0.x.y and push via git push --tags (maintainers only). This will trigger the release and publish workflows.
- Edit the GitHub release to include release notes (auto-generated, if nothing else).
On average, patch releases should happen every six weeks.
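The branch a release is cut from follows directly from its version number: 0.x.0 comes from master, and 0.x.y with y greater than zero comes from release-0.x. A minimal sketch of that rule, using a hypothetical function name and example versions:

```shell
# Hypothetical helper: print the branch a given release version is made from.
release_source_branch() {
  case "$1" in
    0.*.0) echo "master" ;;             # minor release, e.g. 0.25.0
    0.*.*) echo "release-${1%.*}" ;;    # patch release, e.g. 0.25.3
    *)
      echo "unexpected version: $1" >&2 # only pre-1.0.0 versions are handled
      return 1
      ;;
  esac
}

# Usage:
#   git switch "$(release_source_branch 0.25.3)"
```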
Developing Documentation
Our current static site generator for documentation is mdBook, with a little bit of custom JavaScript to handle
the playground page. Most of the documentation is written in Markdown, including this file! You can find these files
at docs/src. If you'd like to submit a PR to improve the documentation, navigate to the page you'd like to
edit and hit the edit icon at the top right of the page.
Prerequisites for Local Development
To run and iterate on the docs locally, you need the mdbook CLI tool, which can be installed with:
cargo install mdbook
You might have noticed we have some fancy admonitions sprinkled throughout the documentation, like the note above.
These are created using mdbook-admonish, a preprocessor for mdBook. As such, this is also
a requirement for developing the documentation locally. To install it, run:
cargo install mdbook-admonish
Once you've installed it, you can begin using admonitions in your markdown files. See the reference for more information.
Spinning it up
Now that you've installed the prerequisites, you can run the following command to start a local server:
cd docs
mdbook serve --open
mdbook has a live-reload feature, so any changes you make to the markdown files will be reflected in the browser after
a short delay. Once you've made a change that you're happy with, you can submit a PR with your changes.
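To avoid reinstalling tools that are already present, the two prerequisites can be checked first. The helper below is a hypothetical sketch. It assumes each crate installs a binary with the same name as the crate (mdbook and mdbook-admonish).

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage: install only what is missing, then start the local server.
#   for tool in $(missing_tools mdbook mdbook-admonish); do
#     cargo install "$tool"
#   done
#   cd docs && mdbook serve --open
```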
Improving the Playground
The playground page is a little more complicated, but if you know some basic JavaScript and CSS you should be able to make
changes. The playground code can be found in docs/src/assets/js/playground.js, and its corresponding CSS
at docs/src/assets/css/playground.css. The playground uses CodeMirror as its editor,
and the tree-sitter module is fetched from here. This module, along with the Wasm module and Wasm parsers, lives in the
.github.io repo.