1 | 1 | r[lex.whitespace]  | 
2 | 2 | # Whitespace  | 
3 | 3 | 
 
  | 
 | 4 | +r[whitespace.syntax]  | 
 | 5 | +```grammar,lexer  | 
 | 6 | +@root WHITESPACE ->  | 
 | 7 | +    // end of line  | 
 | 8 | +      LF  | 
 | 9 | +    | U+000B // vertical tabulation  | 
 | 10 | +    | U+000C // form feed  | 
 | 11 | +    | CR  | 
 | 12 | +    | U+0085 // Unicode next line  | 
 | 13 | +    | U+2028 // Unicode LINE SEPARATOR  | 
 | 14 | +    | U+2029 // Unicode PARAGRAPH SEPARATOR  | 
 | 15 | +    // Ignorable Code Point  | 
 | 16 | +    | U+200E // Unicode LEFT-TO-RIGHT MARK  | 
 | 17 | +    | U+200F // Unicode RIGHT-TO-LEFT MARK  | 
 | 18 | +    // horizontal whitespace  | 
 | 19 | +    | TAB  | 
 | 20 | +    | U+0020  // space ' '  | 
 | 21 | +
  | 
 | 22 | +TAB -> U+0009  // horizontal tab ('\t')  | 
 | 23 | +
  | 
 | 24 | +LF -> U+000A  // line feed ('\n')  | 
 | 25 | +
  | 
 | 26 | +CR -> U+000D  // carriage return ('\r')  | 
 | 27 | +```  | 
 | 28 | + | 
4 | 29 | r[lex.whitespace.intro]  | 
5 | 30 | Whitespace is any non-empty string containing only characters that have the  | 
6 |  | -[`Pattern_White_Space`] Unicode property, namely:  | 
7 |  | - | 
8 |  | -- `U+0009` (horizontal tab, `'\t'`)  | 
9 |  | -- `U+000A` (line feed, `'\n'`)  | 
10 |  | -- `U+000B` (vertical tab)  | 
11 |  | -- `U+000C` (form feed)  | 
12 |  | -- `U+000D` (carriage return, `'\r'`)  | 
13 |  | -- `U+0020` (space, `' '`)  | 
14 |  | -- `U+0085` (next line)  | 
15 |  | -- `U+200E` (left-to-right mark)  | 
16 |  | -- `U+200F` (right-to-left mark)  | 
17 |  | -- `U+2028` (line separator)  | 
18 |  | -- `U+2029` (paragraph separator)  | 
 | 31 | +[`Pattern_White_Space`] Unicode property.  | 
19 | 32 | 
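
As an illustration of the rule above, the code points named in the `WHITESPACE` grammar can be checked with a small predicate. This is a minimal sketch, not how any particular lexer implements it; the names `is_pattern_white_space` and `is_whitespace` are hypothetical helpers introduced here for the example.

```rust
/// Hypothetical helper: returns true if `c` is one of the code points
/// listed in the WHITESPACE grammar rule (the `Pattern_White_Space` set).
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        // end of line
        '\u{000A}' // line feed ('\n')
        | '\u{000B}' // vertical tabulation
        | '\u{000C}' // form feed
        | '\u{000D}' // carriage return ('\r')
        | '\u{0085}' // next line
        | '\u{2028}' // line separator
        | '\u{2029}' // paragraph separator
        // ignorable code points
        | '\u{200E}' // left-to-right mark
        | '\u{200F}' // right-to-left mark
        // horizontal whitespace
        | '\u{0009}' // horizontal tab ('\t')
        | '\u{0020}' // space (' ')
    )
}

/// Hypothetical helper: whitespace is any non-empty string made up
/// only of `Pattern_White_Space` characters.
fn is_whitespace(s: &str) -> bool {
    !s.is_empty() && s.chars().all(is_pattern_white_space)
}
```

Under this sketch, `is_whitespace(" \t\r\n")` holds, while `is_whitespace("")` and `is_whitespace(" a ")` do not. Note that this set differs from the one used by the standard library's `char::is_whitespace`, which follows the Unicode `White_Space` property: `U+00A0` (no-break space) has `White_Space` but not `Pattern_White_Space`, and `U+200E` is the reverse.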
 
  | 
20 | 33 | r[lex.whitespace.token-sep]  | 
21 | 34 | Rust is a "free-form" language, meaning that all forms of whitespace serve only  | 