Trivia
Trivia is Forte's system for analyzing whitespace structure within text nodes. It breaks a text node's content into typed tokens (leading whitespace, words, inner whitespace, and trailing whitespace). This is useful for building linters that detect inconsistent indentation, formatting tools that normalize spacing, and analysis passes that inspect content structure.
#Getting Trivia
The getTrivia method on TextNode returns an array of Trivia tokens parsed from the node's content. The result is computed on the first call and cached, so repeated calls do not re-parse:
```php
<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\Trivia;
use Forte\Facades\Forte;

$doc = Forte::parse('<div> Hello World </div>');

$text = $doc->find(fn ($n) => $n instanceof TextNode);
$trivia = $text->getTrivia();

count($trivia); // 5
```
The five tokens for " Hello World " are leading whitespace (" "), word ("Hello"), inner whitespace (" "), word ("World"), and trailing whitespace (" ").
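To make that breakdown concrete, here is a minimal standalone sketch that produces the same five-token split in plain PHP. It is not Forte's implementation: the `sketchTrivia` function and its array shape are hypothetical, and the kind names are plain strings rather than TriviaKind cases.

```php
<?php

/**
 * Hypothetical helper, not part of Forte. Splits text into alternating
 * whitespace and non-whitespace runs and labels each run by position.
 *
 * @return list<array{kind: string, content: string, offset: int}>
 */
function sketchTrivia(string $text): array
{
    // Each match is either a whitespace run or a non-whitespace run.
    // PREG_OFFSET_CAPTURE yields [content, byteOffset] pairs.
    preg_match_all('/\s+|\S+/', $text, $matches, PREG_OFFSET_CAPTURE);

    $runs = $matches[0];
    $last = count($runs) - 1;
    $tokens = [];

    foreach ($runs as $i => [$content, $offset]) {
        if (preg_match('/^\s/', $content) !== 1) {
            $kind = 'Word';
        } elseif ($i === 0) {
            // Assumption: a whitespace-only string is labeled as leading here.
            $kind = 'LeadingWhitespace';
        } elseif ($i === $last) {
            $kind = 'TrailingWhitespace';
        } else {
            $kind = 'InnerWhitespace';
        }

        $tokens[] = ['kind' => $kind, 'content' => $content, 'offset' => $offset];
    }

    return $tokens;
}

$tokens = sketchTrivia(' Hello World ');

count($tokens);                   // 5
array_column($tokens, 'kind');
// ['LeadingWhitespace', 'Word', 'InnerWhitespace', 'Word', 'TrailingWhitespace']
```

The sketch only mirrors the categories described above. How Forte handles edge cases such as whitespace-only text may differ.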
Only TextNode exposes trivia. Other node types like EchoNode or DirectiveNode do not have trivia analysis.
#Trivia Kinds
Each Trivia token has a kind property from the TriviaKind enum that describes its role in the text:
| Kind | Description |
|---|---|
| LeadingWhitespace | Whitespace at the very start of the text |
| Word | Non-whitespace content |
| InnerWhitespace | Whitespace between words |
| TrailingWhitespace | Whitespace at the very end of the text |
Every Trivia also carries its raw content string and its byte offset within the text node:
```php
<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\TriviaKind;
use Forte\Facades\Forte;

$doc = Forte::parse('<p> Hello </p>');

$text = $doc->find(fn ($n) => $n instanceof TextNode);
$trivia = $text->getTrivia();

$trivia[0]->kind; // TriviaKind::LeadingWhitespace
$trivia[0]->content; // " "
$trivia[0]->offset; // 0

$trivia[1]->kind; // TriviaKind::Word
$trivia[1]->content; // "Hello"
```
#Whitespace Analysis
Trivia tokens provide methods for classifying the type of whitespace they contain. These are particularly useful for linting tools that enforce indentation style:
```php
<?php

use Forte\Ast\TextNode;
use Forte\Ast\Trivia\TriviaKind;
use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse(" Hello");

$leading = $trivia[0];
$leading->kind; // TriviaKind::LeadingWhitespace
$leading->isSpaceOnly(); // true
$leading->isTabOnly(); // false
$leading->isMixedWhitespace(); // false
$leading->isNewlineOnly(); // false
```
The isMixedWhitespace method checks whether a token contains two or more different types of whitespace characters (spaces, tabs, newlines). It only applies to LeadingWhitespace and TrailingWhitespace tokens, returning false for other kinds:
```php
<?php

use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse("\t Hello");

$trivia[0]->isMixedWhitespace(); // true (tab + space)
```
You can also use TriviaParser::parse() directly on any string, not just text extracted from a document. This is useful when analyzing content independently of the AST.
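The "two or more different types" rule can be illustrated on a raw string without any Forte classes. The sketch below is an assumption about how such a check might look, not the library's code: `whitespaceClasses` and `isMixed` are hypothetical names, and it requires PHP 8 for `str_contains`.

```php
<?php

/**
 * Hypothetical helper, not part of Forte. Lists which whitespace
 * classes (space, tab, newline) occur in the given string.
 *
 * @return list<string>
 */
function whitespaceClasses(string $ws): array
{
    $classes = [];

    if (str_contains($ws, ' ')) {
        $classes[] = 'space';
    }
    if (str_contains($ws, "\t")) {
        $classes[] = 'tab';
    }
    // Treat both \r and \n as the "newline" class.
    if (strpbrk($ws, "\r\n") !== false) {
        $classes[] = 'newline';
    }

    return $classes;
}

/** Hypothetical: true when at least two whitespace classes are present. */
function isMixed(string $ws): bool
{
    return count(whitespaceClasses($ws)) >= 2;
}

isMixed("\t ");  // true  (tab + space)
isMixed("    "); // false (spaces only)
isMixed("\n\t"); // true  (newline + tab)
```

A linter enforcing a single indentation style could flag any leading token for which a check like this returns true.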
#Newline Counting
The getNewlineCount method counts newlines in a trivia token, normalizing all line ending styles (\r\n, \r, \n) before counting. The hasMultipleNewlines method returns true when two or more newlines are present, which typically indicates a blank line:
```php
<?php

use Forte\Ast\Trivia\TriviaParser;

$trivia = TriviaParser::parse("\n\n Hello\n");

$trivia[0]->getNewlineCount(); // 2
$trivia[0]->hasMultipleNewlines(); // true
```
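Normalizing before counting matters because a Windows line ending (\r\n) is two bytes but one line break. The following standalone sketch shows one way to do that normalization. The `countNewlines` function is hypothetical and is not taken from Forte's source.

```php
<?php

/**
 * Hypothetical helper, not part of Forte. Counts line breaks after
 * normalizing \r\n and bare \r to \n.
 */
function countNewlines(string $text): int
{
    // str_replace applies the searches in order, so \r\n collapses to \n
    // before any remaining bare \r is converted.
    $normalized = str_replace(["\r\n", "\r"], "\n", $text);

    return substr_count($normalized, "\n");
}

countNewlines("\n\n ");      // 2
countNewlines("\r\n\r\n");   // 2, not 4
countNewlines("\r\n\n\r");   // 3
```

Without the normalization step, counting \r and \n separately would report a blank line for every single Windows-style line break.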
TextNode provides convenience methods for counting newlines at the boundaries without inspecting individual trivia tokens:
```php
<?php

use Forte\Ast\TextNode;
use Forte\Facades\Forte;

$doc = Forte::parse("<div>\n\n Hello\n</div>");

$text = $doc->find(
    fn ($n) => $n instanceof TextNode && $n->hasSignificantContent()
);

$text->countLeadingNewlines(); // 2
$text->countTrailingNewlines(); // 1
```
These methods check only the first and last trivia tokens respectively. If the first token is not LeadingWhitespace, countLeadingNewlines returns 0.
#Trivia Properties
Each Trivia token exposes utility methods for working with its position and content:
| Method | Returns | Description |
|---|---|---|
| getLength() | int | Byte length of the content |
| getEndOffset() | int | offset + getLength() |
| isEmpty() | bool | true if content is an empty string |
| contains(string) | bool | Substring check on content |
```php
<?php

use Forte\Ast\Trivia\TriviaParser;

// Two leading spaces, so the first token has length 2.
$trivia = TriviaParser::parse("  Hello ");

$trivia[0]->getLength(); // 2
$trivia[0]->getEndOffset(); // 2
$trivia[0]->isEmpty(); // false
$trivia[1]->contains('ell'); // true
```
#See also
- Basic Nodes: TextNode and other node types
- Source Positions: Byte offsets and line numbers
- Traversal: Find and iterate text nodes in a document