Getting Started
Building the Library
To build the library on a POSIX system, run make in the Tree-sitter directory. This creates a static library called libtree-sitter.a, as well as dynamic libraries.
Alternatively, you can incorporate the library in a larger project's build system by adding one source file to the build. This source file needs two directories to be in the include path when compiled:
Source file:
- tree-sitter/lib/src/lib.c
Include directories:
- tree-sitter/lib/src
- tree-sitter/lib/include
The Basic Objects
There are four main types of objects involved when using Tree-sitter: languages, parsers, syntax trees, and syntax nodes.
In C, these are called TSLanguage, TSParser, TSTree, and TSNode.
- A TSLanguage is an opaque object that defines how to parse a particular programming language. The code for each TSLanguage is generated by Tree-sitter. Many languages are already available in separate git repositories within the Tree-sitter GitHub organization and the Tree-sitter grammars GitHub organization. See the next section for how to create new languages.
- A TSParser is a stateful object that can be assigned a TSLanguage and used to produce a TSTree based on some source code.
- A TSTree represents the syntax tree of an entire source code file. It contains TSNode instances that indicate the structure of the source code. It can also be edited and used to produce a new TSTree in the event that the source code changes.
- A TSNode represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings, and children.
An Example Program
Here's an example of a simple C program that uses the Tree-sitter JSON parser.
// Filename - test-json-parser.c
#include <assert.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <tree_sitter/api.h>

// Declare the `tree_sitter_json` function, which is
// implemented by the `tree-sitter-json` library.
const TSLanguage *tree_sitter_json(void);

int main() {
  // Create a parser.
  TSParser *parser = ts_parser_new();

  // Set the parser's language (JSON in this case).
  ts_parser_set_language(parser, tree_sitter_json());

  // Build a syntax tree based on source code stored in a string.
  const char *source_code = "[1, null]";
  TSTree *tree = ts_parser_parse_string(
    parser,
    NULL,
    source_code,
    strlen(source_code)
  );

  // Get the root node of the syntax tree.
  TSNode root_node = ts_tree_root_node(tree);

  // Get some child nodes.
  TSNode array_node = ts_node_named_child(root_node, 0);
  TSNode number_node = ts_node_named_child(array_node, 0);

  // Check that the nodes have the expected types.
  assert(strcmp(ts_node_type(root_node), "document") == 0);
  assert(strcmp(ts_node_type(array_node), "array") == 0);
  assert(strcmp(ts_node_type(number_node), "number") == 0);

  // Check that the nodes have the expected child counts.
  // The array has 5 children in total ("[", 1, ",", null, "]"),
  // of which only 2 are named nodes.
  assert(ts_node_child_count(root_node) == 1);
  assert(ts_node_child_count(array_node) == 5);
  assert(ts_node_named_child_count(array_node) == 2);
  assert(ts_node_child_count(number_node) == 0);

  // Print the syntax tree as an S-expression.
  char *string = ts_node_string(root_node);
  printf("Syntax tree: %s\n", string);

  // Free all of the heap-allocated memory.
  free(string);
  ts_tree_delete(tree);
  ts_parser_delete(parser);
  return 0;
}
This program requires three components to build:
- The Tree-sitter C API from tree_sitter/api.h (requiring tree-sitter/lib/include in our include path)
- The Tree-sitter library (libtree-sitter.a)
- The JSON grammar's source code, which we compile directly into the binary
clang \
  -I tree-sitter/lib/include \
  test-json-parser.c \
  tree-sitter-json/src/parser.c \
  tree-sitter/libtree-sitter.a \
  -o test-json-parser
./test-json-parser
When using dynamic linking, you'll need to ensure the shared library is discoverable through LD_LIBRARY_PATH or your system's equivalent environment variable. Here's how to compile with dynamic linking:
clang \
  -I tree-sitter/lib/include \
  test-json-parser.c \
  tree-sitter-json/src/parser.c \
  -ltree-sitter \
  -o test-json-parser
./test-json-parser