Package 'treesitter'

Title: Bindings to 'Tree-Sitter'
Description: Provides bindings to 'Tree-sitter', an incremental parsing system for programming tools. 'Tree-sitter' builds concrete syntax trees for source files of any language, and can efficiently update those syntax trees as the source file is edited. It also includes a robust error recovery system that provides useful parse results even in the presence of syntax errors.
Authors: Davis Vaughan [aut, cre], Posit Software, PBC [cph, fnd], Tree-sitter authors [cph] (Tree-sitter C library)
Maintainer: Davis Vaughan <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2024-06-25 03:40:16 UTC
Source: CRAN

Is x a language?

Description

Use is_language() to determine if an object has a class of "tree_sitter_language".

Usage

is_language(x)

Arguments

x

⁠[object]⁠

An object.

Value

  • TRUE if x is a "tree_sitter_language".

  • FALSE otherwise.

Examples

language <- treesitter.r::language()
is_language(language)

Is x a node?

Description

Checks if x is a tree_sitter_node or not.

Usage

is_node(x)

Arguments

x

⁠[object]⁠

An object.

Value

TRUE if x is a tree_sitter_node, otherwise FALSE.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

is_node(node)

is_node(1)

Is x a parser?

Description

Checks if x is a tree_sitter_parser or not.

Usage

is_parser(x)

Arguments

x

⁠[object]⁠

An object.

Value

TRUE if x is a tree_sitter_parser, otherwise FALSE.

Examples

language <- treesitter.r::language()
parser <- parser(language)

is_parser(parser)

is_parser(1)

Is x a query?

Description

Checks if x is a tree_sitter_query or not.

Usage

is_query(x)

Arguments

x

⁠[object]⁠

An object.

Value

TRUE if x is a tree_sitter_query, otherwise FALSE.

Examples

source <- "(identifier) @id"
language <- treesitter.r::language()

query <- query(language, source)

is_query(query)

is_query(1)

Is x a tree?

Description

Checks if x is a tree_sitter_tree or not.

Usage

is_tree(x)

Arguments

x

⁠[object]⁠

An object.

Value

TRUE if x is a tree_sitter_tree, otherwise FALSE.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)

is_tree(tree)

is_tree(1)
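
The predicates on these pages can be combined to validate input at the top of a helper function. The following is only a sketch: `as_node()` is a hypothetical helper written for illustration and is not part of the package.

```r
library(treesitter)

# Hypothetical helper (not part of treesitter): accept either a tree or a
# node and always hand back a node.
as_node <- function(x) {
  if (is_tree(x)) {
    return(tree_root_node(x))
  }
  if (is_node(x)) {
    return(x)
  }
  stop("`x` must be a tree or a node.")
}

language <- treesitter.r::language()
parser <- parser(language)
tree <- parser_parse(parser, "1 + foo")

is_node(as_node(tree))
is_node(as_node(tree_root_node(tree)))
```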

Language field count

Description

Get the number of fields contained within a language.

Usage

language_field_count(x)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

Value

A single double value.

Examples

language <- treesitter.r::language()
language_field_count(language)

Language field identifiers

Description

Get the integer field identifier for a field name. If you are going to be using a field name repeatedly, it is often a little faster to use the corresponding field identifier instead.

Usage

language_field_id_for_name(x, name)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

name

⁠[character]⁠

The language field names to look up field identifiers for.

Value

An integer vector the same length as name containing:

  • The field identifier for the field name, if known.

  • NA, if the field name was not known.

See Also

language_field_name_for_id()

Examples

language <- treesitter.r::language()
language_field_id_for_name(language, "lhs")

Language field names

Description

Get the field name for a field identifier.

Usage

language_field_name_for_id(x, id)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

id

⁠[integer]⁠

The language field identifiers to look up field names for.

Value

A character vector the same length as id containing:

  • The field name for the field identifier, if known.

  • NA, if the field identifier was not known.

See Also

language_field_id_for_name()

Examples

language <- treesitter.r::language()
language_field_name_for_id(language, 1)
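
As a further sketch, the two lookups can be chained to round trip a field name through its identifier. The field name "lhs" comes from the R grammar used in the examples; "not_a_field" is a made-up name chosen to show the documented NA result.

```r
library(treesitter)

language <- treesitter.r::language()

# Look up the id once, then map it back to its name
id <- language_field_id_for_name(language, "lhs")
name <- language_field_name_for_id(language, id)
name

# The lookup accepts a character vector; unknown names come back as NA
ids <- language_field_id_for_name(language, c("lhs", "not_a_field"))
is.na(ids)
```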

Language name

Description

Extract a language object's language name.

Usage

language_name(x)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

Value

A string.

Examples

language <- treesitter.r::language()
language_name(language)

Language state advancement

Description

Get the next state in the grammar.

Usage

language_next_state(x, state, symbol)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

state, symbol

⁠[integer]⁠

Vectors of equal length containing the current state and symbol information.

Value

A single integer representing the next state.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to function definition
node <- node_child(node, 1)
node <- node_child(node, 3)
node

state <- node_parse_state(node)
symbol <- node_grammar_symbol(node)

# Function definition symbol
language_symbol_name(language, 85)

# Next state (this is all grammar dependent)
language_next_state(language, state, symbol)

Language state count

Description

Get the number of states traversable within a language.

Usage

language_state_count(x)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

Value

A single double value.

Examples

language <- treesitter.r::language()
language_state_count(language)

Language symbol count

Description

Get the number of symbols contained within a language.

Usage

language_symbol_count(x)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

Value

A single double value.

Examples

language <- treesitter.r::language()
language_symbol_count(language)

Language symbols

Description

Get the integer symbol ID for a particular node name. Can be useful for exploring the grammar.

Usage

language_symbol_for_name(x, name, ..., named = TRUE)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

name

⁠[character]⁠

The names to look up symbols for.

...

These dots are for future extensions and must be empty.

named

⁠[logical]⁠

Should named or anonymous nodes be looked up? Recycled to the size of name.

Value

An integer vector the same size as name containing either:

  • The integer symbol ID of the node name, if known.

  • NA if the node name was not known.

See Also

language_symbol_name()

Examples

language <- treesitter.r::language()
language_symbol_for_name(language, "identifier")

Language symbol names

Description

Get the name for a particular language symbol ID. Can be useful for exploring a grammar.

Usage

language_symbol_name(x, symbol)

Arguments

x

⁠[tree_sitter_language]⁠

A tree-sitter language object.

symbol

⁠[positive integer]⁠

The language symbols to look up names for.

Value

A character vector the same length as symbol containing:

  • The name of the symbol, if known.

  • NA, if the symbol was not known.

See Also

language_symbol_for_name()

Examples

language <- treesitter.r::language()
language_symbol_name(language, 1)

Node descendant count

Description

Returns the number of descendants of this node, including this node in the count.

Usage

node_descendant_count(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single double.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Top level program node
node_descendant_count(node)

# The whole `<-` binary operator node
node <- node_child(node, 1)
node_descendant_count(node)

# Just the literal `<-` operator itself
node <- node_child_by_field_name(node, "operator")
node_descendant_count(node)

Get a child's field name by index

Description

node_field_name_for_child() returns the field name for the ith child, considering both named and anonymous nodes.

Nodes themselves don't know their own field names, because they don't know if they are fields or not. You must have access to their parents to query their field names.

Usage

node_field_name_for_child(x, i)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

i

⁠[integer(1)]⁠

The index of the child to get the field name for.

Value

The field name for the ith child of x, or NA_character_ if that child doesn't exist.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)
node

# Get the field name of the first few children (note that anonymous children
# are considered)
node_field_name_for_child(node, 1)
node_field_name_for_child(node, 2)

# 10th child doesn't exist, this returns `NA_character_`
node_field_name_for_child(node, 10)
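
Because field names are looked up through the parent, one way to list all of them is to loop over the child indices. This is a minimal sketch that assumes the R grammar from treesitter.r, where the children of a binary operator carry field names.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

tree <- parser_parse(parser, "fn <- function() { 1 + 1 }")
node <- node_child(tree_root_node(tree), 1)

# Ask the parent for the field name of each of its children in turn
fields <- vapply(
  seq_len(node_child_count(node)),
  function(i) node_field_name_for_child(node, i),
  character(1)
)
fields
```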

Get a node's underlying language

Description

node_language() returns the tree-sitter language object underlying a node.

Usage

node_language(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A tree_sitter_language object.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node_language(node)

Get a node's parent

Description

node_parent() looks up the tree and returns the current node's parent.

Usage

node_parent(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

The parent node of x or NULL if there is no parent.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Parent of a root node is `NULL`
node_parent(node)

node_function <- node |>
  node_child(1) |>
  node_child(3)

node_function

node_parent(node_function)
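
Since the parent of the root node is NULL, repeated calls to node_parent() can be used to collect a node's ancestors. The loop below is an illustrative sketch, not a function provided by the package.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

tree <- parser_parse(parser, "fn <- function() { 1 + 1 }")
node <- tree_root_node(tree) |>
  node_child(1) |>
  node_child(3)

# Collect the type of every ancestor, nearest first, stopping at the root
ancestors <- character()
parent <- node_parent(node)
while (!is.null(parent)) {
  ancestors <- c(ancestors, node_type(parent))
  parent <- node_parent(parent)
}
ancestors
```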

"Raw" S-expression

Description

node_raw_s_expression() returns the "raw" s-expression as seen by tree-sitter. Most of the time, node_show_s_expression() provides a better view of the tree, but occasionally it can be useful to see exactly what the underlying C library is using.

Usage

node_raw_s_expression(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single string containing the raw s-expression.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node_raw_s_expression(node)

Pretty print a node's s-expression

Description

node_show_s_expression() prints a nicely formatted s-expression to the console. It powers the print methods of nodes and trees.

Usage

node_show_s_expression(
  x,
  ...,
  max_lines = NULL,
  show_anonymous = TRUE,
  show_locations = TRUE,
  show_parentheses = TRUE,
  dangling_parenthesis = TRUE,
  color_parentheses = TRUE,
  color_locations = TRUE
)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

...

These dots are for future extensions and must be empty.

max_lines

⁠[double(1) / NULL]⁠

An optional maximum number of lines to print. If the maximum is hit, then ⁠<truncated>⁠ will be printed at the end.

show_anonymous

⁠[bool]⁠

Should anonymous nodes be shown? If FALSE, only named nodes are shown.

show_locations

⁠[bool]⁠

Should node locations be shown?

show_parentheses

⁠[bool]⁠

Should parentheses around each node be shown?

dangling_parenthesis

⁠[bool]⁠

Should the ⁠)⁠ parenthesis "dangle" on its own line? If FALSE, it is appended to the line containing the last child. This can be useful for conserving space.

color_parentheses

⁠[bool]⁠

Should parentheses be colored? Printing large s-expressions is faster if this is set to FALSE.

color_locations

⁠[bool]⁠

Should locations be colored? Printing large s-expressions is faster if this is set to FALSE.

Value

x invisibly.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function(a, b = 2) { a + b + 2 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node_show_s_expression(node)

node_show_s_expression(node, max_lines = 5)

# This is more like a typical abstract syntax tree
node_show_s_expression(
  node,
  show_anonymous = FALSE,
  show_locations = FALSE,
  dangling_parenthesis = FALSE
)

Node symbol

Description

node_symbol() returns the symbol id of the current node as an integer.

Usage

node_symbol(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single integer.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Top level program node
node_symbol(node)

# The whole `<-` binary operator node
node <- node_child(node, 1)
node_symbol(node)

# Just the literal `<-` operator itself
node <- node_child_by_field_name(node, "operator")
node_symbol(node)

Get a node's underlying text

Description

node_text() returns the document text underlying a node.

Usage

node_text(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single string containing the node's text.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node |>
  node_child(1) |>
  node_child_by_field_name("rhs") |>
  node_text()

Node type

Description

node_type() returns the "type" of the current node as a string.

This is a very useful function for making decisions about how to handle the current node.

Usage

node_type(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single string.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Top level program node
node_type(node)

# The whole `<-` binary operator node
node <- node_child(node, 1)
node
node_type(node)

# Just the literal `<-` operator itself
node <- node_child_by_field_name(node, "operator")
node
node_type(node)

Generate a TreeCursor iterator

Description

node_walk() creates a TreeCursor starting at the current node. You can use it to "walk" the tree more efficiently than using node_child() and other similar node functions.

Usage

node_walk(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A TreeCursor object.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

cursor <- node_walk(node)

cursor$goto_first_child()
cursor$goto_first_child()
cursor$node()
cursor$goto_next_sibling()
cursor$node()

Get a node's child by index

Description

These functions return the ith child of x.

  • node_child() considers both named and anonymous children.

  • node_named_child() considers only named children.

Usage

node_child(x, i)

node_named_child(x, i)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

i

⁠[integer(1)]⁠

The index of the child to return.

Value

The ith child node of x or NULL if there is no child at that index.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Starts with `program` node for the whole document
node

# Navigate to first child
node <- node_child(node, 1)
node

# Note how the named variant skips the anonymous operator node
node_child(node, 2)
node_named_child(node, 2)

# OOB indices return `NULL`
node_child(node, 5)

Get a node's child by field id or name

Description

These functions return children of x by field id or name.

  • node_child_by_field_id() retrieves a child by field id.

  • node_child_by_field_name() retrieves a child by field name.

Use language_field_id_for_name() to get the field id for a field name.

Usage

node_child_by_field_id(x, id)

node_child_by_field_name(x, name)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

id

⁠[integer(1)]⁠

The field id of the child to return.

name

⁠[character(1)]⁠

The field name of the child to return.

Value

A child of x, or NULL if no matching child can be found.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)
node

# Get the field name of the first child
name <- node_field_name_for_child(node, 1)
name

# Now get the child again by that field name
node_child_by_field_name(node, name)

# If you need to look up by field name many times, you can look up the
# more direct field id first and use that instead
id <- language_field_id_for_name(language, name)
id

node_child_by_field_id(node, id)

# Returns `NULL` if no matching child
node_child_by_field_id(node, 10000)

Get a node's child count

Description

These functions return the number of children of x.

  • node_child_count() considers both named and anonymous children.

  • node_named_child_count() considers only named children.

Usage

node_child_count(x)

node_named_child_count(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single integer, the number of children of x.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)
node

# Note how the named variant doesn't count the anonymous operator node
node_child_count(node)
node_named_child_count(node)

Get a node's children

Description

These functions return the children of x within a list.

  • node_children() considers both named and anonymous children.

  • node_named_children() considers only named children.

Usage

node_children(x)

node_named_children(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

The children of x as a list.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)
node

# Note how the named variant doesn't include the anonymous operator node
node_children(node)
node_named_children(node)
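
node_children() also lends itself to simple recursive traversals. The sketch below gathers the text of every leaf node, depth first; `leaf_text()` is a hypothetical helper, and the sketch assumes that a node without children yields an empty list.

```r
library(treesitter)

# Hypothetical helper: depth-first walk that gathers the text of every leaf
leaf_text <- function(node) {
  children <- node_children(node)
  if (length(children) == 0L) {
    return(node_text(node))
  }
  unlist(lapply(children, leaf_text))
}

language <- treesitter.r::language()
parser <- parser(language)
tree <- parser_parse(parser, "1 + foo")

leaves <- leaf_text(tree_root_node(tree))
leaves
```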

Node descendants

Description

These functions return the smallest node within this node that spans the given range of bytes or points. If the ranges are out of bounds, or no smaller node can be determined, the input is returned.

Usage

node_descendant_for_byte_range(x, start, end)

node_named_descendant_for_byte_range(x, start, end)

node_descendant_for_point_range(x, start, end)

node_named_descendant_for_point_range(x, start, end)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

start, end

⁠[integer(1) / tree_sitter_point]⁠

For the byte range functions, start and end bytes to search within.

For the point range functions, start and end points created by point() to search within.

Value

A node.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# The whole `<-` binary operator node
node <- node_child(node, 1)
node

# The byte range points to a location in the word `function`
node_descendant_for_byte_range(node, 7, 9)
node_named_descendant_for_byte_range(node, 7, 9)

start <- point(0, 14)
end <- point(0, 15)

node_descendant_for_point_range(node, start, end)
node_named_descendant_for_point_range(node, start, end)

# OOB returns the input
node_descendant_for_byte_range(node, 25, 29)

Get the first child that extends beyond the given byte offset

Description

These functions return the first child of x that extends beyond the given byte offset. Note that byte is a 0-indexed offset.

  • node_first_child_for_byte() considers both named and anonymous nodes.

  • node_first_named_child_for_byte() considers only named nodes.

Usage

node_first_child_for_byte(x, byte)

node_first_named_child_for_byte(x, byte)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

byte

⁠[integer(1)]⁠

The byte to start the search from.

Note that byte is 0-indexed!

Value

A new node, or NULL if there is no node past the byte offset.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)
node

# `fn {here}<- function()`
node_first_child_for_byte(node, 3)
node_first_named_child_for_byte(node, 3)

# Past any node
node_first_child_for_byte(node, 100)

Node grammar types and symbols

Description

  • node_grammar_type() gets the node's type as it appears in the grammar, ignoring aliases.

  • node_grammar_symbol() gets the node's symbol (the type as a numeric id) as it appears in the grammar, ignoring aliases. This should be used in language_next_state() rather than node_symbol().

Usage

node_grammar_type(x)

node_grammar_symbol(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

See Also

node_type(), node_symbol()

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Examples for these functions are highly specific to the grammar,
# because they rely on the placement of `alias()` calls in the grammar.
node_grammar_type(node)
node_grammar_symbol(node)

Node byte and point accessors

Description

These functions return information about the location of x in the document. The byte, row, and column locations are all 0-indexed.

  • node_start_byte() returns the start byte.

  • node_end_byte() returns the end byte.

  • node_start_point() returns the start point, containing a row and column location within the document. Use accessors like point_row() to extract the row and column positions.

  • node_end_point() returns the end point, containing a row and column location within the document. Use accessors like point_row() to extract the row and column positions.

  • node_range() returns a range object that contains all of the above information. Use accessors like range_start_point() to extract individual pieces from the range.

Usage

node_start_byte(x)

node_end_byte(x)

node_start_point(x)

node_end_point(x)

node_range(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

  • node_start_byte() and node_end_byte() return a single numeric value.

  • node_start_point() and node_end_point() return single points.

  • node_range() returns a range.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)

# Navigate to function definition node
node <- node_child(node, 3)
node

node_start_byte(node)
node_end_byte(node)

node_start_point(node)
node_end_point(node)

node_range(node)
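
To illustrate the 0-indexed byte positions, the sketch below slices the original string with the node's start and end bytes and compares the result with node_text(). It assumes that the end byte is exclusive, as in the tree-sitter C library, and that the text is ASCII so that bytes and characters line up.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree) |>
  node_child(1) |>
  node_child(3)

start <- node_start_byte(node)
end <- node_end_byte(node)

# `substr()` is 1-indexed and inclusive, so shift the 0-indexed start by one.
# The (assumed exclusive) end byte then already points at the last character.
slice <- substr(text, start + 1, end)
identical(slice, node_text(node))
```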

Node metadata

Description

These functions return metadata about the current node.

  • node_is_named() reports if the current node is named or anonymous.

  • node_is_missing() reports if the current node is MISSING, i.e. if it was implied through error recovery.

  • node_is_extra() reports if the current node is an "extra" from the grammar.

  • node_is_error() reports if the current node is an ERROR node.

  • node_has_error() reports if the current node is an ERROR node, or if any descendants of the current node are ERROR or MISSING nodes.

Usage

node_is_named(x)

node_is_missing(x)

node_is_extra(x)

node_is_error(x)

node_has_error(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

TRUE or FALSE.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node <- node_child(node, 1)

fn <- node_child(node, 1)
operator <- node_child(node, 2)

fn
node_is_named(fn)

operator
node_is_named(operator)

# Examples of `TRUE` cases for these are a bit hard to come up with, because
# they are dependent on the exact state of the grammar and the error recovery
# algorithm
node_is_missing(node)
node_is_extra(node)
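
A common use of node_has_error() is to check a whole document by calling it on the root node. In this sketch `has_syntax_error()` is a hypothetical helper, and the incomplete expression `1 +` is assumed to trigger error recovery in the R grammar.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

# Hypothetical helper: does the parsed text contain any ERROR or MISSING node?
has_syntax_error <- function(text) {
  tree <- parser_parse(parser, text)
  node_has_error(tree_root_node(tree))
}

has_syntax_error("1 + 1")
has_syntax_error("1 +")
```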

Node parse states

Description

These are advanced functions that return information about the internal parse states.

  • node_parse_state() returns the parse state of the current node.

  • node_next_parse_state() returns the parse state after this node.

See language_next_state() for more information.

Usage

node_parse_state(x)

node_next_parse_state(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A single integer representing a parse state.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

node <- node_child(node, 1)

# Parse states are grammar dependent
node_parse_state(node)
node_next_parse_state(node)

Node sibling accessors

Description

These functions return siblings of the current node, i.e. if you looked "left" or "right" from the current node rather than "up" (parent) or "down" (child).

  • node_next_sibling() and node_next_named_sibling() return the next sibling.

  • node_previous_sibling() and node_previous_named_sibling() return the previous sibling.

Usage

node_next_sibling(x)

node_next_named_sibling(x)

node_previous_sibling(x)

node_previous_named_sibling(x)

Arguments

x

⁠[tree_sitter_node]⁠

A node.

Value

A sibling node, or NULL if there is no sibling node.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Navigate to first child
node <- node_child(node, 1)

# Navigate to function definition node
node <- node_child(node, 3)
node

node_previous_sibling(node)

# Skip anonymous operator node
node_previous_named_sibling(node)

# There isn't one!
node_next_sibling(node)
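
Because these functions return NULL when there is no sibling, they can drive a simple loop. The following sketch starts at the first child of the binary operator and steps right until the siblings run out.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

tree <- parser_parse(parser, "fn <- function() { 1 + 1 }")
node <- tree_root_node(tree) |>
  node_child(1) |>
  node_child(1)

# Step right until there is no next sibling, recording each node's type
types <- character()
while (!is.null(node)) {
  types <- c(types, node_type(node))
  node <- node_next_sibling(node)
}
types
```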

Create a new parser

Description

parser() constructs a parser from a tree-sitter language object. You can use parser_parse() to parse language specific text with it.

Usage

parser(language)

Arguments

language

⁠[tree_sitter_language]⁠

A language object.

Value

A new parser.

Examples

language <- treesitter.r::language()
parser <- parser(language)
parser

text <- "1 + foo"
tree <- parser_parse(parser, text)
tree

Parser adjustments

Description

  • parser_set_language() sets the language of the parser. This is usually done for you by parser().

  • parser_set_timeout() sets an optional timeout used when calling parser_parse() or parser_reparse(). If the timeout is hit, an error occurs.

  • parser_set_included_ranges() sets an optional list of ranges that are the only locations considered when parsing. The ranges are created by range().

Usage

parser_set_language(x, language)

parser_set_timeout(x, timeout)

parser_set_included_ranges(x, included_ranges)

Arguments

x

⁠[tree_sitter_parser]⁠

A parser.

language

⁠[tree_sitter_language]⁠

A language.

timeout

⁠[double(1)]⁠

A single whole number corresponding to a timeout in microseconds to use when parsing.

included_ranges

⁠[list_of<tree_sitter_range>]⁠

A list of ranges constructed by range(). These are the only locations that will be considered when parsing.

An empty list can be used to clear any existing ranges so that the parser will again parse the entire document.

Value

A new parser.

Examples

language <- treesitter.r::language()
parser <- parser(language)
parser_set_timeout(parser, 10000)
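
Since the documented return value is a new parser, it seems safest to keep the result rather than rely on the original object being modified. A minimal sketch, assuming 10000 microseconds is ample time for a short string:

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

# Keep the returned parser; the documented return value is a new parser
parser <- parser_set_timeout(parser, 10000)

tree <- parser_parse(parser, "1 + foo")
tree
```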

Parse or reparse text

Description

  • parser_parse() performs an initial parse of text, a string typically containing contents of a file. It returns a tree for further manipulations.

  • parser_reparse() performs a fast incremental reparse. text is typically a slightly modified version of the original text with a new "edit" applied. The position of the edit is described by the byte and point arguments to this function. The tree argument corresponds to the original tree returned by parser_parse().

All bytes and points should be 0-indexed.

Usage

parser_parse(x, text, ..., encoding = "UTF-8")

parser_reparse(
  x,
  text,
  tree,
  start_byte,
  start_point,
  old_end_byte,
  old_end_point,
  new_end_byte,
  new_end_point,
  ...,
  encoding = "UTF-8"
)

Arguments

x

⁠[tree_sitter_parser]⁠

A parser.

text

⁠[string]⁠

The text to parse.

...

These dots are for future extensions and must be empty.

encoding

⁠[string]⁠

The expected encoding of the text. Either "UTF-8" or "UTF-16".

tree

⁠[tree_sitter_tree]⁠

The original tree returned by parser_parse(). Components of the tree will be reused to perform the incremental reparse.

start_byte, start_point

⁠[double(1) / tree_sitter_point]⁠

The starting byte and starting point of the edit location.

old_end_byte, old_end_point

⁠[double(1) / tree_sitter_point]⁠

The old ending byte and old ending point of the edit location.

new_end_byte, new_end_point

⁠[double(1) / tree_sitter_point]⁠

The new ending byte and new ending point of the edit location.

Value

A new tree.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)
tree

text <- "1 + bar(foo)"
parser_reparse(
  parser,
  text,
  tree,
  start_byte = 4,
  start_point = point(0, 4),
  old_end_byte = 7,
  old_end_point = point(0, 7),
  new_end_byte = 12,
  new_end_point = point(0, 12)
)
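
The byte offsets in the example above can also be derived from the pieces of the edit. The sketch below assumes a single-line ASCII document, so that columns equal byte offsets; the variable names `prefix`, `old`, and `new` are introduced here only for illustration.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

prefix <- "1 + "
old <- "foo"
new <- "bar(foo)"

tree <- parser_parse(parser, paste0(prefix, old))

# Bytes are 0-indexed, so the edit starts at the byte length of the prefix.
# Converted to double to match the documented argument type.
start_byte <- as.numeric(nchar(prefix, type = "bytes"))
old_end_byte <- start_byte + nchar(old, type = "bytes")
new_end_byte <- start_byte + nchar(new, type = "bytes")

# Everything is on row 0 and ASCII here, so columns equal byte offsets
new_tree <- parser_reparse(
  parser,
  paste0(prefix, new),
  tree,
  start_byte = start_byte,
  start_point = point(0, start_byte),
  old_end_byte = old_end_byte,
  old_end_point = point(0, old_end_byte),
  new_end_byte = new_end_byte,
  new_end_point = point(0, new_end_byte)
)
new_tree
```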

Points

Description

  • point() creates a new tree-sitter point.

  • point_row() and point_column() access a point's row and column value, respectively.

  • is_point() determines whether or not an object is a point.

Note that points are 0-indexed. This is typically the easiest form to work with, since row and column information provided by third-party libraries is usually already 0-indexed. It is also consistent with bytes, which are 0-indexed as well and are often provided alongside their corresponding points.

Usage

point(row, column)

point_row(x)

point_column(x)

is_point(x)

Arguments

row

⁠[double(1)]⁠

A 0-indexed row to place the point at.

column

⁠[double(1)]⁠

A 0-indexed column to place the point at.

x

⁠[tree_sitter_point]⁠

A point.

Value

  • point() returns a new point.

  • point_row() and point_column() return a single double.

  • is_point() returns TRUE or FALSE.

Examples

x <- point(1, 2)

point_row(x)
point_column(x)

is_point(x)
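
When positions come from a 1-indexed source, such as the line and column shown in many editors, they need to be shifted before creating a point. `point_from_one_indexed()` below is a hypothetical helper, shown only as a sketch.

```r
library(treesitter)

# Hypothetical helper: convert a 1-indexed (line, column) pair into a
# 0-indexed tree-sitter point.
point_from_one_indexed <- function(line, column) {
  point(line - 1, column - 1)
}

x <- point_from_one_indexed(1, 5)
point_row(x)
point_column(x)
```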

Queries

Description

query() lets you specify a query source string for use with query_captures() and query_matches(). The source string works somewhat like capture groups in regular expressions: you write out a pattern that matches a node in a tree, and then you "capture" parts of that pattern with @name tags. The captures are the values returned by query_captures() and query_matches(). There is also a series of predicates that can be used to further refine the query. Those are described in the query_matches() help page.

Read the tree-sitter documentation to learn more about the query syntax.

Usage

query(language, source)

Arguments

language

⁠[tree_sitter_language]⁠

A language.

source

⁠[string]⁠

A query source string.

Value

A query.

Examples

# This query looks for binary operators where the left hand side is an
# identifier named `fn`, and the right hand side is a function definition.
# The operator can be `<-` or `=` (technically it can also be things like
# `+` as well in this example).
source <- '(binary_operator
  lhs: (identifier) @lhs
  operator: _ @operator
  rhs: (function_definition) @rhs
  (#eq? @lhs "fn")
)'

language <- treesitter.r::language()

query <- query(language, source)

text <- "
  fn <- function() {}
  fn2 <- function() {}
  fn <- 5
  fn = function(a, b, c) { a + b + c }
"
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

query_matches(query, node)
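
For a single simple pattern, query_captures() may be more convenient than query_matches(). The sketch below assumes that its result is a list with parallel `name` and `node` elements; check the query_captures() help page for the exact structure.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

query <- query(language, "(identifier) @id")

tree <- parser_parse(parser, "a + b")
node <- tree_root_node(tree)

# Assumption: the result is a list with parallel `name` and `node` elements
captures <- query_captures(query, node)
captures$name
texts <- vapply(captures$node, node_text, character(1))
texts
```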

Query accessors

Description

  • query_pattern_count() returns the number of patterns in a query.

  • query_capture_count() returns the number of captures in a query.

  • query_string_count() returns the number of string literals in a query.

  • query_start_byte_for_pattern() returns the byte where the ith pattern starts in the query source.

Usage

query_pattern_count(x)

query_capture_count(x)

query_string_count(x)

query_start_byte_for_pattern(x, i)

Arguments

x

⁠[tree_sitter_query]⁠

A query.

i

⁠[double(1)]⁠

The ith pattern to extract the start byte for.

Value

  • query_pattern_count(), query_capture_count(), and query_string_count() return a single double count value.

  • query_start_byte_for_pattern() returns a single double for the start byte if there was an ith pattern, otherwise it returns NA.

Examples

source <- '(binary_operator
  lhs: (identifier) @lhs
  operator: _ @operator
  rhs: (function_definition) @rhs
  (#eq? @lhs "fn")
)'
language <- treesitter.r::language()

query <- query(language, source)

query_pattern_count(query)
query_capture_count(query)
query_string_count(query)

text <- "
  fn <- function() {}
  fn2 <- function() {}
  fn <- 5
  fn <- function(a, b, c) { a + b + c }
"
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

query_matches(query, node)
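
The example above does not exercise query_start_byte_for_pattern(). A minimal sketch with a two-pattern query follows. It assumes that `i` counts patterns starting from 1, which is not stated on this page, and relies on the documented NA result when there is no ith pattern.

```r
library(treesitter)

language <- treesitter.r::language()

# Two patterns in one query source, one per line
source <- "(identifier) @id
(function_definition) @fn"

query <- query(language, source)

query_pattern_count(query)

# Start byte of each pattern within `source`
# (assumption: `i` counts patterns from 1)
query_start_byte_for_pattern(query, 1)
query_start_byte_for_pattern(query, 2)

# There is no third pattern, so this is documented to return NA
query_start_byte_for_pattern(query, 3)
```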

Query matches and captures

Description

These two functions execute a query on a given node, and return the captures of the query for further use. Both functions return the same information, just structured differently depending on your use case.

  • query_matches() returns the captures first grouped by pattern, and further grouped by match within each pattern. This is useful if you include multiple patterns in your query.

  • query_captures() returns a flat list of captures ordered by their node location in the original text. This is normally the easiest structure to use if you have a single pattern without any alternations that would benefit from having individual captures split by match.

Both also return the capture name, i.e. the ⁠@name⁠ you specified in your query.

Usage

query_matches(x, node, ..., range = NULL)

query_captures(x, node, ..., range = NULL)

Arguments

x

⁠[tree_sitter_query]⁠

A query.

node

⁠[tree_sitter_node]⁠

A node to run the query over.

...

These dots are for future extensions and must be empty.

range

⁠[tree_sitter_range / NULL]⁠

An optional range to restrict the query to.

Predicates

There are 3 core types of predicates supported:

  • ⁠#eq? @capture "string"⁠

  • ⁠#eq? @capture1 @capture2⁠

  • ⁠#match? @capture "regex"⁠

Each of these predicates can also be inverted with a ⁠not-⁠ prefix, i.e. ⁠#not-eq?⁠ and ⁠#not-match?⁠.

String double quotes

The underlying tree-sitter predicate parser requires that strings supplied in a query use double quotes, i.e. "string" not 'string'. If you try to use single quotes, you will get a query error.

⁠#match?⁠ regex

The regex support provided by ⁠#match?⁠ is powered by grepl().

Escapes are a little tricky to get right within these match regex strings. To use something like \s in the regex, the literal text \\s must appear in the query source, so that tree-sitter unescapes the doubled backslash and you end up with just \s in the regex that is applied. This requires putting two literal backslash characters in the R string itself, which can be accomplished with either "\\\\s" or a raw string like r'["\\s"]', which is typically a little easier.

You can also write your queries in a separate file (typically called queries.scm) and read them into R. This is more straightforward because you can write something like (#match? @id "^\\s$") directly and it will be read in correctly.
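
As a minimal sketch of these escaping rules, the block below writes one query source both ways and checks that the two R strings are identical before running it. It assumes R 4.0 or later for raw string support; the identifiers in the parsed text are made up for illustration, and node_text() is used to show what was captured.

```r
library(treesitter)

language <- treesitter.r::language()

# The same query source written two ways. Both contain the literal text
# `\\d`, which is reduced to `\d` before the regex is applied.
escaped <- "((identifier) @id (#match? @id \"^x\\\\d$\"))"
raw <- r"[((identifier) @id (#match? @id "^x\\d$"))]"
identical(escaped, raw)

query <- query(language, raw)
node <- text_parse("x1 + xy + x2 + y3", language)

# Keep only identifiers made of `x` followed by a single digit
captures <- query_captures(query, node)
vapply(captures$node, node_text, character(1))
```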

Examples

text <- "
foo + b + a + ab
and(a)
"

source <- "(identifier) @id"

language <- treesitter.r::language()

query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# A flat, ordered list of captures is the most useful structure here,
# since we only have 1 pattern!
captures <- query_captures(query, node)
captures$node
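
For comparison, here is a hedged sketch of query_matches() with a two-pattern query, followed by query_captures() restricted with the `range` argument. It assumes `call` is the node type the R grammar uses for function calls, and that node_range() is an acceptable way to build the range; the exact set of captures returned for a restricted range is not asserted here.

```r
library(treesitter)

language <- treesitter.r::language()

text <- "
foo + b + a + ab
and(a)
"

# Two patterns: identifiers starting with `a`, and all calls
source <- '
((identifier) @id (#match? @id "^a"))
(call) @call
'

query <- query(language, source)
node <- text_parse(text, language)

# One element per pattern, each holding one element per match
matches <- query_matches(query, node)
length(matches)
lengths(matches)

# Restrict the flat captures to the range of the second top-level
# expression, `and(a)`
second <- node_child(node, 2)
captures <- query_captures(query, node, range = node_range(second))
captures$name
```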

Ranges

Description

  • range() creates a new tree-sitter range.

  • range_start_byte() and range_end_byte() access a range's start and end bytes, respectively.

  • range_start_point() and range_end_point() access a range's start and end points, respectively.

  • is_range() determines whether or not an object is a range.

Note that the bytes and points used in ranges are 0-indexed.

Usage

range(start_byte, start_point, end_byte, end_point)

range_start_byte(x)

range_start_point(x)

range_end_byte(x)

range_end_point(x)

is_range(x)

Arguments

start_byte, end_byte

⁠[double(1)]⁠

0-indexed bytes for the start and end of the range, respectively.

start_point, end_point

⁠[tree_sitter_point]⁠

0-indexed points for the start and end of the range, respectively.

x

⁠[tree_sitter_range]⁠

A range.

Value

  • range() returns a new range.

  • range_start_byte() and range_end_byte() return a single double.

  • range_start_point() and range_end_point() return a point().

  • is_range() returns TRUE or FALSE.

See Also

node_range()

Examples

x <- range(5, point(1, 3), 7, point(1, 5))
x

range_start_byte(x)
range_end_byte(x)

range_start_point(x)
range_end_point(x)

is_range(x)
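
In practice, ranges usually come from a node rather than being built by hand. The following sketch uses node_range() (listed under See Also) on a small two-line snippet to illustrate the 0-indexing. It assumes the root node's children are exactly the two top-level expressions.

```r
library(treesitter)

language <- treesitter.r::language()

# Two top-level expressions on separate lines
node <- text_parse("x + 1\ny", language)

# `y` is the second child of the root node
rng <- node_range(node_child(node, 2))
is_range(rng)

# Bytes and points are 0-indexed: `y` starts at byte 6, on row 1, column 0
range_start_byte(rng)
point_row(range_start_point(rng))
point_column(range_start_point(rng))
```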

Parse a snippet of text

Description

text_parse() is a convenience utility for quickly parsing a small snippet of text using a particular language and getting access to its root node. It is meant for demonstration purposes. If you need to reparse the text after an edit has been made, create a full parser with parser() and use parser_parse() instead.

Usage

text_parse(x, language)

Arguments

x

⁠[string]⁠

The text to parse.

language

⁠[tree_sitter_language]⁠

The language to parse with.

Value

A root node.

Examples

language <- treesitter.r::language()
text <- "map(xs, function(x) 1 + 1)"

# Note that this directly returns the root node, not the tree
text_parse(text, language)
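
To make the shortcut concrete, this sketch places text_parse() next to the long form it stands in for. The variable names are illustrative only, and node_text() is used to compare the two roots.

```r
library(treesitter)

language <- treesitter.r::language()
text <- "map(xs, function(x) 1 + 1)"

# Shortcut: straight to the root node
root_quick <- text_parse(text, language)

# Long form: keep the parser and tree around for later reparsing
parser <- parser(language)
tree <- parser_parse(parser, text)
root_full <- tree_root_node(tree)

# Both roots describe the same text
identical(node_text(root_quick), node_text(root_full))
```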

Retrieve the root node of the tree

Description

tree_root_node() is the entry point for accessing nodes within a specific tree. It returns the "root" of the tree, from which you can use other ⁠node_*()⁠ functions to navigate around.

Usage

tree_root_node(x)

Arguments

x

⁠[tree_sitter_tree]⁠

A tree.

Value

A node.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Trees and nodes have a similar print method, but you can
# only use other `node_*()` functions on nodes.
tree
node

node |>
  node_child(1) |>
  node_children()

Retrieve an offset root node

Description

tree_root_node_with_offset() is similar to tree_root_node(), but the returned root node's position has been shifted by the given number of bytes, rows, and columns.

This function allows you to parse a subset of a document with parser_parse() as if it were a self-contained document, but then later access the syntax tree in the coordinate space of the larger document.

Note that the underlying text within x is not what you are offsetting into. Instead, you should assume that the text you provided to parser_parse() already contained the entire subset of the document you care about, and the offset you are providing is how far into the document the beginning of text is.

Usage

tree_root_node_with_offset(x, byte, point)

Arguments

x

⁠[tree_sitter_tree]⁠

A tree.

byte, point

⁠[double(1), tree_sitter_point]⁠

A byte and point offset combination.

Value

An offset root node.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function() { 1 + 1 }"
tree <- parser_parse(parser, text)

# If `text` was the whole document, you can just use `tree_root_node()`
node <- tree_root_node(tree)

# If `text` represents a subset of the document, use
# `tree_root_node_with_offset()` to be able to get positions in the
# coordinate space of the original document.
byte <- 5
point <- point(5, 0)
node_offset <- tree_root_node_with_offset(tree, byte, point)

# The position of `fn` if you treat `text` as the whole document
node |>
  node_child(1) |>
  node_child(1)

# The position of `fn` if you treat `text` as a subset of a larger document
node_offset |>
  node_child(1) |>
  node_child(1)
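
The offsets do not have to be hard-coded. As a sketch under the assumption that the parsed text is preceded by a known header in the larger document, the byte offset can be computed from that header. The `header` string here is hypothetical, and node_start_byte() and node_start_point() are used to inspect the shifted position.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)

# A hypothetical two-line document; only the second line is parsed
header <- "# not parsed\n"
text <- "fn <- function() { 1 + 1 }"

tree <- parser_parse(parser, text)

# `text` begins right after `header`: that many bytes in, on row 1, column 0
byte <- as.numeric(nchar(header, type = "bytes"))
node_offset <- tree_root_node_with_offset(tree, byte, point(1, 0))

# Positions are now reported relative to the full document
node_start_byte(node_offset)
node_start_point(node_offset)
```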

Generate a TreeCursor iterator

Description

tree_walk() creates a TreeCursor starting at the root node. You can use it to "walk" the tree more efficiently than using node_child() and other similar node functions.

Usage

tree_walk(x)

Arguments

x

⁠[tree_sitter_tree]⁠

A tree.

Value

A TreeCursor object.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)

cursor <- tree_walk(tree)

cursor$goto_first_child()
cursor$goto_first_child()
cursor$node()
cursor$goto_next_sibling()
cursor$node()

Tree accessors

Description

  • tree_text() retrieves the text that the tree was parsed with.

  • tree_language() retrieves the language that the tree was parsed with.

  • tree_included_ranges() retrieves the tree's included_ranges that were provided to parser_set_included_ranges(). Note that if no ranges were provided originally, then this still returns a default that always covers the entire document.

Usage

tree_included_ranges(x)

tree_text(x)

tree_language(x)

Arguments

x

⁠[tree_sitter_tree]⁠

A tree.

Value

  • tree_text() returns a string.

  • tree_language() returns a tree_sitter_language.

  • tree_included_ranges() returns a list of range() objects.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "1 + foo"
tree <- parser_parse(parser, text)

tree_text(tree)
tree_language(tree)
tree_included_ranges(tree)

Tree cursors

Description

TreeCursor is an R6 class that allows you to walk a tree in a more efficient way than calling ⁠node_*()⁠ functions like node_child() repeatedly.

You can also create a cursor more elegantly with node_walk() and tree_walk().

Value

R6 object representing the tree cursor.

Methods

Public methods


Method new()

Create a new tree cursor.

Usage
TreeCursor$new(node)
Arguments
node

⁠[tree_sitter_node]⁠

The node to start walking from.


Method reset()

Reset the tree cursor to a new root node.

Usage
TreeCursor$reset(node)
Arguments
node

⁠[tree_sitter_node]⁠

The node to start walking from.


Method node()

Get the current node that the cursor points to.

Usage
TreeCursor$node()

Method field_name()

Get the field name of the current node.

Usage
TreeCursor$field_name()

Method field_id()

Get the field id of the current node.

Usage
TreeCursor$field_id()

Method descendant_index()

Get the descendant index of the current node.

Usage
TreeCursor$descendant_index()

Method goto_parent()

Go to the current node's parent.

Returns TRUE if a parent was found, and FALSE if not.

Usage
TreeCursor$goto_parent()

Method goto_next_sibling()

Go to the current node's next sibling.

Returns TRUE if a sibling was found, and FALSE if not.

Usage
TreeCursor$goto_next_sibling()

Method goto_previous_sibling()

Go to the current node's previous sibling.

Returns TRUE if a sibling was found, and FALSE if not.

Usage
TreeCursor$goto_previous_sibling()

Method goto_first_child()

Go to the current node's first child.

Returns TRUE if a child was found, and FALSE if not.

Usage
TreeCursor$goto_first_child()

Method goto_last_child()

Go to the current node's last child.

Returns TRUE if a child was found, and FALSE if not.

Usage
TreeCursor$goto_last_child()

Method depth()

Get the depth of the current node.

Usage
TreeCursor$depth()

Method goto_first_child_for_byte()

Move the cursor to the first child of its current node that extends beyond the given byte offset.

Returns TRUE if a child was found, and FALSE if not.

Usage
TreeCursor$goto_first_child_for_byte(byte)
Arguments
byte

⁠[double(1)]⁠

The byte to move the cursor past.


Method goto_first_child_for_point()

Move the cursor to the first child of its current node that extends beyond the given point.

Returns TRUE if a child was found, and FALSE if not.

Usage
TreeCursor$goto_first_child_for_point(point)
Arguments
point

⁠[tree_sitter_point]⁠

The point to move the cursor past.

Examples

language <- treesitter.r::language()
parser <- parser(language)

text <- "fn <- function(a, b) { a + b }"

tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

cursor <- TreeCursor$new(node)

cursor$node()
cursor$goto_first_child()
cursor$goto_first_child()
cursor$node()
cursor$goto_next_sibling()
cursor$node()
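
The TRUE/FALSE return values of the goto methods make it possible to visit every node with a single cursor. The sketch below is one way to write a depth-first (pre-order) walk. `collect_types()` is a hypothetical helper, not part of the package, and node_type() is used to label each visited node.

```r
library(treesitter)

language <- treesitter.r::language()
node <- text_parse("fn <- function(a, b) { a + b }", language)

# Collect every node type in depth-first (pre-order) order.
# `collect_types()` is a hypothetical helper, not part of the package.
collect_types <- function(cursor) {
  types <- character()
  repeat {
    types <- c(types, node_type(cursor$node()))

    # Descend first
    if (cursor$goto_first_child()) {
      next
    }

    # Otherwise move sideways, climbing until a sibling exists
    while (!cursor$goto_next_sibling()) {
      if (!cursor$goto_parent()) {
        # Back at the starting node with nothing left to visit
        return(types)
      }
    }
  }
}

types <- collect_types(TreeCursor$new(node))
types
```

Reusing one cursor avoids creating a fresh node object for every node_child() call, which is the efficiency gain the class description refers to.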