Getting Started

Building the Library

To build the library on a POSIX system, run make in the Tree-sitter directory. This creates a static library called libtree-sitter.a, as well as dynamic libraries.

Alternatively, you can incorporate the library in a larger project's build system by adding one source file to the build. This source file needs two directories to be in the include path when compiled:

source file:

  • tree-sitter/lib/src/lib.c

include directories:

  • tree-sitter/lib/src
  • tree-sitter/lib/include

The Basic Objects

There are four main types of objects involved when using Tree-sitter: languages, parsers, syntax trees, and syntax nodes. In C, these are called TSLanguage, TSParser, TSTree, and TSNode.

  • A TSLanguage is an opaque object that defines how to parse a particular programming language. The code for each TSLanguage is generated by Tree-sitter. Many languages are already available in separate git repositories within the Tree-sitter GitHub organization and the Tree-sitter grammars GitHub organization. See the next section for how to create new languages.

  • A TSParser is a stateful object that can be assigned a TSLanguage and used to produce a TSTree based on some source code.

  • A TSTree represents the syntax tree of an entire source code file. It contains TSNode instances that indicate the structure of the source code. It can also be edited and used to produce a new TSTree in the event that the source code changes.

  • A TSNode represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children.

An Example Program

Here's an example of a simple C program that uses the Tree-sitter JSON parser.

// Filename - test-json-parser.c

#include <assert.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <tree_sitter/api.h>

// Declare the `tree_sitter_json` function, which is
// implemented by the `tree-sitter-json` library.
const TSLanguage *tree_sitter_json(void);

int main(void) {
  // Create a parser.
  TSParser *parser = ts_parser_new();

  // Set the parser's language (JSON in this case).
  ts_parser_set_language(parser, tree_sitter_json());

  // Build a syntax tree based on source code stored in a string.
  const char *source_code = "[1, null]";
  TSTree *tree = ts_parser_parse_string(
    parser,
    NULL,
    source_code,
    strlen(source_code)
  );

  // Get the root node of the syntax tree.
  TSNode root_node = ts_tree_root_node(tree);

  // Get some child nodes.
  TSNode array_node = ts_node_named_child(root_node, 0);
  TSNode number_node = ts_node_named_child(array_node, 0);

  // Check that the nodes have the expected types.
  assert(strcmp(ts_node_type(root_node), "document") == 0);
  assert(strcmp(ts_node_type(array_node), "array") == 0);
  assert(strcmp(ts_node_type(number_node), "number") == 0);

  // Check that the nodes have the expected child counts.
  assert(ts_node_child_count(root_node) == 1);
  assert(ts_node_child_count(array_node) == 5);
  assert(ts_node_named_child_count(array_node) == 2);
  assert(ts_node_child_count(number_node) == 0);

  // Print the syntax tree as an S-expression.
  char *string = ts_node_string(root_node);
  printf("Syntax tree: %s\n", string);

  // Free all of the heap-allocated memory.
  free(string);
  ts_tree_delete(tree);
  ts_parser_delete(parser);
  return 0;
}

This program requires three components to build:

  1. The Tree-sitter C API from tree_sitter/api.h (requiring tree-sitter/lib/include in our include path)
  2. The Tree-sitter library (libtree-sitter.a)
  3. The JSON grammar's source code, which we compile directly into the binary

To compile against the static library and run the program:

clang                                   \
  -I tree-sitter/lib/include            \
  test-json-parser.c                    \
  tree-sitter-json/src/parser.c         \
  tree-sitter/libtree-sitter.a          \
  -o test-json-parser
./test-json-parser

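When using dynamic linking, you'll need to ensure the shared library is discoverable at run time through LD_LIBRARY_PATH or your system's equivalent environment variable. The linker must also be able to find the library at build time: if it is not installed in a standard location, add a -L flag pointing at its directory. Here's how to compile with dynamic linking: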

clang                                   \
  -I tree-sitter/lib/include            \
  test-json-parser.c                    \
  tree-sitter-json/src/parser.c         \
  -ltree-sitter                         \
  -o test-json-parser
./test-json-parser