Commit c532b8b

Add additional type URLs chars to text-format spec.
Change the text-format spec to allow multiple slashes, most URI "sub-delimiters" and hex-escapes in type URLs prefixes of Any names. PiperOrigin-RevId: 839174308
1 parent 99e1552 commit c532b8b

1 file changed: +23 -7 lines changed

content/reference/protobuf/textformat-spec.md

Lines changed: 23 additions & 7 deletions
@@ -11,10 +11,10 @@ documentation using the syntax specified in
 [ISO/IEC 14977 EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form).
 
 {{% alert title="Note" color="note" %}}
-This is a draft spec reverse-engineered from the C++ text format
+This is a draft spec originally reverse-engineered from the C++ text format
 [implementation](https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/text_format.cc)
-and may change based on further discussion and review. While an effort has been
-made to keep text formats consistent across supported languages,
+It has evolved and may change further based on discussion and review. While an
+effort has been made to keep text formats consistent across supported languages,
 incompatibilities are likely to exist. {{% /alert %}}
 
 ## Example {#example}
@@ -98,6 +98,17 @@ hex = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
 | "a" | "b" | "c" | "d" | "e" | "f" ;
 ```
 
+A limited set of URL characters following
+[RFC 3986: Uniform Resource Identifier (URI)](https://www.rfc-editor.org/rfc/rfc3986#appendix-A):
+
+```
+url_unreserved = letter | dec | "-" | "." | "~" | "_"
+url_sub_delim = "!" | "$" | "&" | "'" | "(" | ")"
+| "*" | "+" | "," | ";" | "="
+url_pct_encoded = "%" hex hex
+url_char = url_unreserved | url_sub_delim | url_pct_encoded
+```
+
 ### Whitespace and Comments {#whitespace}
 
 ```
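
The `url_char` production added in this hunk can be checked against concrete strings with a small recognizer. The following is a minimal Python sketch, not part of the commit. It assumes that `letter` covers the ASCII letters, that `dec` covers the ASCII digits, and that `hex` accepts both upper- and lower-case digits; the function name `is_url_char_run` is hypothetical.

```python
import re

# Assumptions (not confirmed by the hunk shown above): `letter` is A-Z / a-z,
# `dec` is 0-9, and `hex` accepts both upper- and lower-case hex digits.
URL_UNRESERVED = r"[A-Za-z0-9\-._~]"
URL_SUB_DELIM = r"[!$&'()*+,;=]"
URL_PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
URL_CHAR = rf"(?:{URL_UNRESERVED}|{URL_SUB_DELIM}|{URL_PCT_ENCODED})"

_URL_CHARS_RE = re.compile(rf"{URL_CHAR}+")


def is_url_char_run(text: str) -> bool:
    """Return True if `text` is a non-empty sequence of url_char productions."""
    return _URL_CHARS_RE.fullmatch(text) is not None
```

Under these assumptions, `is_url_char_run("type.googleapis.com")` and `is_url_char_run("a%2Fb")` hold, while `"a/b"`, `"a:b"`, and a truncated escape such as `"a%2"` are rejected, because `/`, `:`, and a bare `%` are not `url_char` productions.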
@@ -263,17 +274,22 @@ Fields that are part of the containing message use simple `Identifiers` as
 names.
 [`Extension`](/programming-guides/proto2#extensions) and
 [`Any`](/programming-guides/proto3#any) field names are
-wrapped in square brackets and fully-qualified. `Any` field names are prefixed
-with a qualifying domain name, such as `type.googleapis.com/`.
+wrapped in square brackets and fully-qualified. `Any` field names are URI suffix
+references, meaning that they are prefixed with a qualifying authority and an
+optional path, such as `type.googleapis.com/`.
 
 ```
 FieldName = ExtensionName | AnyName | IDENT ;
 ExtensionName = "[", TypeName, "]" ;
-AnyName = "[", Domain, "/", TypeName, "]" ;
+AnyName = "[", UrlPrefix, "/", TypeName, "]" ;
 TypeName = IDENT, { ".", IDENT } ;
-Domain = IDENT, { ".", IDENT } ;
+UrlPrefix = url_char, { url_char | "/" } ;
 ```
 
+Text format serializers should not write any whitespace characters between the
+brackets of an `ExtensionName` or `AnyName`. Parsers should trim any whitespace
+characters before processing an `ExtensionName` or `AnyName`.
+
 Regular fields and extension fields can have scalar or message values. `Any`
 fields are always messages. Example:
 
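
To illustrate how the revised `AnyName` production separates the URL prefix from the type name, here is a minimal Python sketch. It is not the reference parser. It assumes `IDENT` is a letter or underscore followed by letters, digits, or underscores, that `hex` accepts both cases, and that "trim" means stripping leading and trailing whitespace inside the brackets (a parser that also drops interior whitespace would be another reading). The names `AnyName` and `parse_any_name` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumptions: IDENT is [A-Za-z_][A-Za-z0-9_]*, and hex digits may be either case.
_URL_CHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})"
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# AnyName body: UrlPrefix, "/", TypeName. TypeName cannot contain "/", so the
# separator is always the last slash in the name.
_ANY_NAME_RE = re.compile(
    rf"(?P<prefix>{_URL_CHAR}(?:{_URL_CHAR}|/)*)"
    rf"/(?P<type_name>{_IDENT}(?:\.{_IDENT})*)"
)


class AnyName(NamedTuple):
    url_prefix: str
    type_name: str


def parse_any_name(field_name: str) -> Optional[AnyName]:
    """Parse "[prefix/type.Name]"; return None if it is not an AnyName."""
    if not (field_name.startswith("[") and field_name.endswith("]")):
        return None
    # Assumed reading of "trim": strip whitespace at both ends of the body.
    inner = field_name[1:-1].strip()
    match = _ANY_NAME_RE.fullmatch(inner)
    if match is None:
        return None
    return AnyName(match["prefix"], match["type_name"])
```

With this sketch, `parse_any_name("[type.googleapis.com/foo.Bar]")` yields the prefix `type.googleapis.com` and the type name `foo.Bar`, and a multi-segment prefix such as `[example.com/a/b/foo.Bar]` yields the prefix `example.com/a/b`. A name without a slash, such as `[foo.Bar]`, returns `None` because it matches `ExtensionName` rather than `AnyName`.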