Trivia

Trivia is Forte's system for analyzing whitespace structure within text nodes. It breaks a text node's content into typed tokens (leading whitespace, words, inner whitespace, and trailing whitespace). This is useful for building linters that detect inconsistent indentation, formatting tools that normalize spacing, and analysis passes that inspect content structure.

#Getting Trivia

The getTrivia method on TextNode returns an array of Trivia tokens parsed from the node's content. The result is computed lazily on the first call and then cached, so repeated calls do not re-parse:

<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\Trivia;
use Forte\Facades\Forte;

$doc = Forte::parse('<div> Hello World </div>');

// Locate the text node inside the <div>.
$text = $doc->find(fn ($n) => $n instanceof TextNode);
$trivia = $text->getTrivia();

count($trivia); // 5

The five tokens for " Hello World " are leading whitespace (" "), word ("Hello"), inner whitespace (" "), word ("World"), and trailing whitespace (" ").
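
To make the four-way split concrete, here is a minimal plain-PHP sketch of how a string could be broken into the same kinds of tokens. The sketchTrivia function and its array shape are hypothetical and are not Forte's implementation; in particular, treating a whitespace-only string as a single leading token is an assumption.

```php
<?php

// Hypothetical helper, not part of Forte: splits a string into
// kind/content/offset entries using the four kinds described above.
function sketchTrivia(string $text): array
{
    if ($text === '') {
        return [];
    }

    // Alternate runs of whitespace and non-whitespace, capturing byte offsets.
    preg_match_all('/\s+|\S+/', $text, $matches, PREG_OFFSET_CAPTURE);

    $tokens = [];
    $last = count($matches[0]) - 1;

    foreach ($matches[0] as $i => [$content, $offset]) {
        if (preg_match('/^\S/', $content) === 1) {
            $kind = 'Word';
        } elseif ($i === 0) {
            // Assumption: a whitespace-only string counts as leading.
            $kind = 'LeadingWhitespace';
        } elseif ($i === $last) {
            $kind = 'TrailingWhitespace';
        } else {
            $kind = 'InnerWhitespace';
        }

        $tokens[] = ['kind' => $kind, 'content' => $content, 'offset' => $offset];
    }

    return $tokens;
}

$tokens = sketchTrivia(' Hello World ');

count($tokens);        // 5
$tokens[3]['content']; // "World"
$tokens[3]['offset'];  // 7
```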

Only TextNode exposes trivia. Other node types, such as EchoNode or DirectiveNode, do not support trivia analysis.

#Trivia Kinds

Each Trivia token has a kind property from the TriviaKind enum that describes its role in the text:

| Kind | Description |
| --- | --- |
| LeadingWhitespace | Whitespace at the very start of the text |
| Word | Non-whitespace content |
| InnerWhitespace | Whitespace between words |
| TrailingWhitespace | Whitespace at the very end of the text |

Every Trivia also carries its raw content string and its byte offset within the text node:

<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\TriviaKind;
use Forte\Facades\Forte;

$doc = Forte::parse('<p> Hello </p>');

$text = $doc->find(fn ($n) => $n instanceof TextNode);
$trivia = $text->getTrivia();

// First token: the single space before "Hello".
$trivia[0]->kind; // TriviaKind::LeadingWhitespace
$trivia[0]->content; // " "
$trivia[0]->offset; // 0

// Second token: the word itself.
$trivia[1]->kind; // TriviaKind::Word
$trivia[1]->content; // "Hello"

#Whitespace Analysis

Trivia tokens provide methods for classifying the type of whitespace they contain. These are particularly useful for linting tools that enforce indentation style:

<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\TriviaKind;
use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse(" Hello");

$leading = $trivia[0];
$leading->kind; // TriviaKind::LeadingWhitespace
$leading->isSpaceOnly(); // true
$leading->isTabOnly(); // false
$leading->isMixedWhitespace(); // false
$leading->isNewlineOnly(); // false

The isMixedWhitespace method checks whether a token contains two or more different types of whitespace characters (spaces, tabs, newlines). It only applies to LeadingWhitespace and TrailingWhitespace tokens, returning false for other kinds:

<?php

use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse("\t Hello");

$trivia[0]->isMixedWhitespace(); // true (tab + space)
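
The "two or more different types" rule can be illustrated with a small plain-PHP classifier. This is a sketch under assumptions, not Forte's code: sketchWhitespaceType is a hypothetical name, it treats \r and \n alike as "newline", and it does not model the restriction to leading and trailing tokens.

```php
<?php

// Hypothetical classifier, not Forte's implementation. It mirrors the rule
// stated above: "mixed" means two or more of space, tab, and newline appear.
function sketchWhitespaceType(string $whitespace): string
{
    $present = [];

    if (str_contains($whitespace, ' ')) {
        $present[] = 'space';
    }
    if (str_contains($whitespace, "\t")) {
        $present[] = 'tab';
    }
    // Assumption: \r and \n both count as the single type "newline".
    if (strpbrk($whitespace, "\r\n") !== false) {
        $present[] = 'newline';
    }

    return match (count($present)) {
        0 => 'none',
        1 => $present[0],
        default => 'mixed',
    };
}

sketchWhitespaceType("  ");   // "space"
sketchWhitespaceType("\t ");  // "mixed" (tab + space)
```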

You can also use TriviaParser::parse() directly on any string, not just text extracted from a document. This is useful when analyzing content independently of the AST.
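
As a rough idea of how a lint pass might build on these methods, the helper below collects every token that reports mixed whitespace. findMixedWhitespace is a hypothetical name; it relies only on the kind, offset, and isMixedWhitespace() members described above, plus the standard name property that every PHP enum case has.

```php
<?php

// Hypothetical lint helper, not part of Forte. It accepts the array returned
// by getTrivia() or TriviaParser::parse() and reports mixed-whitespace tokens.
function findMixedWhitespace(array $trivia): array
{
    $problems = [];

    foreach ($trivia as $token) {
        // Per the description above, isMixedWhitespace() is false for Word
        // and InnerWhitespace tokens, so no separate kind check is made.
        if ($token->isMixedWhitespace()) {
            $problems[] = [
                'offset' => $token->offset,
                'kind' => $token->kind->name,
            ];
        }
    }

    return $problems;
}
```

With Forte installed, passing the result of TriviaParser::parse("\t Hello") would be expected to yield one entry for the leading token, in line with the example above.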

#Newline Counting

The getNewlineCount method counts newlines in a trivia token, normalizing all line ending styles (\r\n, \r, \n) before counting. The hasMultipleNewlines method returns true when two or more newlines are present, which typically indicates a blank line:

<?php

use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse("\n\n Hello\n");

// The leading token is "\n\n ": two newlines followed by a space.
$trivia[0]->getNewlineCount(); // 2
$trivia[0]->hasMultipleNewlines(); // true
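
Normalizing before counting matters because a Windows line ending (\r\n) is two characters but one line break. A minimal plain-PHP sketch of that rule follows; sketchNewlineCount is a hypothetical stand-in, not Forte's implementation.

```php
<?php

// Hypothetical stand-in for the counting rule described above.
function sketchNewlineCount(string $whitespace): int
{
    // Replace \r\n first so it becomes one \n, then convert any lone \r.
    $normalized = str_replace(["\r\n", "\r"], "\n", $whitespace);

    return substr_count($normalized, "\n");
}

sketchNewlineCount("\n\n ");     // 2
sketchNewlineCount("\r\n\r\n");  // 2, not 4
```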

TextNode provides convenience methods for counting newlines at the boundaries without inspecting individual trivia tokens:

<?php

use Forte\Ast\TextNode;
use Forte\Facades\Forte;

$doc = Forte::parse("<div>\n\n Hello\n</div>");

$text = $doc->find(
    fn ($n) => $n instanceof TextNode && $n->hasSignificantContent()
);

$text->countLeadingNewlines(); // 2
$text->countTrailingNewlines(); // 1

These methods check only the first and last trivia tokens respectively. If the first token is not LeadingWhitespace, countLeadingNewlines returns 0.
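
The boundary rule for the leading side can be restated as a short helper written against the documented Trivia members. This is a sketch, not Forte's source: sketchLeadingNewlines is a hypothetical name, and returning 0 for an empty token list is an assumption.

```php
<?php

// Hypothetical restatement of the rule above, using only the documented
// kind property and getNewlineCount() method.
function sketchLeadingNewlines(array $trivia): int
{
    $first = $trivia[0] ?? null;

    // Assumption: no tokens at all is treated like "no leading whitespace".
    if ($first === null || $first->kind->name !== 'LeadingWhitespace') {
        return 0;
    }

    // Only the first token is inspected; later tokens never contribute.
    return $first->getNewlineCount();
}
```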

#Trivia Properties

Each Trivia token exposes utility methods for working with its position and content:

| Method | Returns | Description |
| --- | --- | --- |
| getLength() | int | Byte length of the content |
| getEndOffset() | int | offset + getLength() |
| isEmpty() | bool | true if content is an empty string |
| contains(string) | bool | Substring check on content |

<?php

use Forte\Ast\Trivia\TriviaParser;

// Two leading spaces, so the first token is two bytes long.
$trivia = TriviaParser::parse("  Hello  ");

$trivia[0]->getLength(); // 2
$trivia[0]->getEndOffset(); // 2
$trivia[0]->isEmpty(); // false
$trivia[1]->contains('ell'); // true
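
Because lengths and offsets are measured in bytes, multibyte characters make a token longer than its visible character count. The plain-PHP sketch below shows the arithmetic; sketchLength and sketchEndOffset are hypothetical stand-ins for getLength() and getEndOffset(), and the offset of 2 is an arbitrary example value.

```php
<?php

// Hypothetical stand-ins for getLength() and getEndOffset(); not Forte's code.
function sketchLength(string $content): int
{
    // strlen() counts bytes, not characters.
    return strlen($content);
}

function sketchEndOffset(int $offset, string $content): int
{
    return $offset + sketchLength($content);
}

$word = "h\u{00E9}llo"; // "héllo": five characters, six bytes in UTF-8

sketchLength($word);       // 6
sketchEndOffset(2, $word); // 8
```

When slicing the original text with these values, byte-based functions such as substr() are the matching choice rather than mb_substr().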

#See also