Getting Started

Dependencies

To develop a Tree-sitter parser, there are two dependencies that you need to install:

  • A JavaScript runtime: Tree-sitter grammars are written in JavaScript, and Tree-sitter uses a JavaScript runtime (Node.js by default) to interpret them. The runtime's command (node by default) must be in one of the directories in your PATH.

  • A C compiler: Tree-sitter generates parsers written in C. To run and test these parsers with the tree-sitter parse or tree-sitter test commands, you must have a C/C++ compiler installed. Tree-sitter looks for these compilers in the standard places for each platform.
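As a quick sanity check before going further, the sketch below reports whether the two dependencies are visible on your PATH. It assumes a POSIX shell; check_dep is a hypothetical helper, and cc, gcc and clang are only common compiler names, not necessarily the ones Tree-sitter itself searches for.

```shell
# Report whether a command is available on PATH (hypothetical helper).
check_dep() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# The JavaScript runtime (node is the default).
check_dep node

# Common C compiler names; finding any one of them is usually enough.
for compiler in cc gcc clang; do
  check_dep "$compiler"
done
```

A "missing" line for node, or for all three compiler names, suggests that the corresponding dependency still needs to be installed or added to PATH.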

Installation

To create a Tree-sitter parser, you need to use the tree-sitter CLI. You can install the CLI in a few different ways:

  • Build the tree-sitter-cli Rust crate from source using cargo, the Rust package manager. This works on any platform. See the contributing docs for more information.

  • Install the tree-sitter-cli Rust crate from crates.io using cargo. You can do so by running the following command: cargo install tree-sitter-cli --locked

  • Install the tree-sitter-cli Node.js module using npm, the Node package manager. This approach is fast, but only works on certain platforms, because it relies on pre-built binaries.

  • Download a binary for your platform from the latest GitHub release, and put it into a directory on your PATH.
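Whichever method you choose, it may help to confirm that the CLI is reachable before continuing. The following is a minimal sketch for a POSIX shell; check_cli is a hypothetical helper name, and the only CLI feature it relies on is the --version flag.

```shell
# Print the installed CLI version, or a hint if the binary is not on PATH.
check_cli() {
  if command -v tree-sitter >/dev/null 2>&1; then
    tree-sitter --version
  else
    echo "tree-sitter not found on PATH"
  fi
}

check_cli
```

If the hint is printed, the directory that contains the binary is probably not listed in your PATH.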

Project Setup

The preferred convention is to name the parser repository "tree-sitter-" followed by the name of the language, in lowercase.

mkdir tree-sitter-${LOWER_PARSER_NAME}
cd tree-sitter-${LOWER_PARSER_NAME}

Note that the LOWER_ prefix here indicates the lowercase form of the language name.
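If your language name contains uppercase letters, the lowercase form can be derived in the shell instead of typed by hand. This is a sketch assuming a POSIX shell; PARSER_NAME, its value "MyLang" and REPO_DIR are hypothetical names chosen for illustration.

```shell
# Hypothetical language name; replace it with your own.
PARSER_NAME="MyLang"

# Lowercase the name, as the naming convention asks.
LOWER_PARSER_NAME=$(printf '%s' "$PARSER_NAME" | tr '[:upper:]' '[:lower:]')

# Build the repository directory name and print it.
REPO_DIR="tree-sitter-${LOWER_PARSER_NAME}"
echo "$REPO_DIR"   # prints tree-sitter-mylang
```

The resulting value can then be passed to mkdir and cd in place of the literal directory name.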

Init

Once you've installed the tree-sitter CLI tool, you can start setting up your project, which will allow your parser to be used from multiple languages.

# This will prompt you for input
tree-sitter init

The init command creates several files in the project. Among them should be a file called grammar.js with the following contents:

/**
 * @file PARSER_DESCRIPTION
 * @author PARSER_AUTHOR_NAME PARSER_AUTHOR_EMAIL
 * @license PARSER_LICENSE
 */

/// <reference types="tree-sitter-cli/dsl" />
// @ts-check

module.exports = grammar({
  name: 'LOWER_PARSER_NAME',

  rules: {
    // TODO: add the actual grammar rules
    source_file: $ => 'hello'
  }
});

Note that the placeholders shown above are replaced with the corresponding data you provided in the init subcommand's prompts.

To learn more about this command, check the reference page.

Generate

Next, run the following command:

tree-sitter generate

This will generate the C code required to parse this trivial language.

You can test this parser by creating a source file with the contents "hello" and parsing it:

echo 'hello' > example-file
tree-sitter parse example-file

Alternatively, in Windows PowerShell:

"hello" | Out-File example-file -Encoding utf8
tree-sitter parse example-file

This should print the following:

(source_file [0, 0] - [1, 0])

You now have a working parser.
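The bracketed pairs in that output are zero-based [row, column] positions for the start and end of the node. The end is [1, 0] rather than [0, 5] because echo writes a trailing newline after "hello". To turn this manual check into a repeatable one, a small script can compare the CLI output against the expected tree. The sketch below assumes a POSIX shell and assumes that tree-sitter parse prints exactly the single line shown above; check_tree and EXPECTED are hypothetical names.

```shell
# Expected tree for the file written by: echo 'hello' > example-file
EXPECTED='(source_file [0, 0] - [1, 0])'

# Compare a parse result against the expected tree (hypothetical helper).
check_tree() {
  if [ "$1" = "$EXPECTED" ]; then
    echo "ok"
  else
    echo "unexpected tree: $1"
    return 1
  fi
}

# Only run the real parser when the CLI and the example file are present.
if command -v tree-sitter >/dev/null 2>&1 && [ -f example-file ]; then
  check_tree "$(tree-sitter parse example-file)"
fi
```

For anything beyond a single file, the tree-sitter test command mentioned earlier is likely the better fit.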

Finally, look back at the triple-slash and @ts-check comments in grammar.js; these tell your editor to provide documentation and type information as you edit your grammar. For these to work, you must download Tree-sitter's TypeScript API from npm into a node_modules directory in your project:

npm install # or your package manager of choice

To learn more about the generate command, check the reference page.