Commit d42d4b0

New Security Considerations (#1618)
* New Security Considerations
* Clarify reference resolution cycle warning
* Minor change to wording about URI normalization
* Change "vocabulary" to "extension"
1 parent c07e1a7 commit d42d4b0

1 file changed

+123
-26
lines changed

specs/jsonschema-core.md

Lines changed: 123 additions & 26 deletions
@@ -2036,32 +2036,129 @@ SHOULD use the terms defined by this document to do so.
20362036

20372037
## Security Considerations {#security}
20382038

2039-
Both schemas and instances are JSON values. As such, all security considerations
2040-
defined in [RFC 8259][rfc8259] apply.
2041-
2042-
Instances and schemas are both frequently written by untrusted third parties, to
2043-
be deployed on public Internet servers. Implementations should take care that
2044-
the parsing and evaluating against schemas does not consume excessive system
2045-
resources. Implementations MUST NOT fall into an infinite loop.
2046-
2047-
A malicious party could cause an implementation to repeatedly collect a copy of
2048-
a very large value as an annotation. Implementations SHOULD guard against
2049-
excessive consumption of system resources in such a scenario.
2050-
2051-
Servers MUST ensure that malicious parties cannot change the functionality of
2052-
existing schemas by uploading a schema with a pre-existing or very similar
2053-
`$id`.
2054-
2055-
Individual JSON Schema extensions are liable to also have their own security
2056-
considerations. Consult the respective specifications for more information.
2057-
2058-
Schema authors should take care with `$comment` contents, as a malicious
2059-
implementation can display them to end-users in violation of a spec, or fail to
2060-
strip them if such behavior is expected.
2061-
2062-
A malicious schema author could place executable code or other dangerous
2063-
material within a `$comment`. Implementations MUST NOT parse or otherwise take
2064-
action based on `$comment` contents.
2039+
While schemas and instances are not always represented as JSON text, they are
2040+
defined in terms of the JSON data model. As such, the security considerations
2041+
defined in [RFC 8259][rfc8259] may still apply in environments where text-based
2042+
representations are used, particularly those considerations related to parsing,
2043+
number precision, and structural limitations.
2044+
2045+
Schemas and instances are frequently authored by untrusted parties.
2046+
Implementations that accept or evaluate such inputs may be exposed to several
2047+
classes of attack, particularly denial-of-service (DoS) by means of resource
2048+
exhaustion.
2049+
2050+
### Nested `anyOf`/`oneOf`
2051+
2052+
One risk for resource exhaustion in JSON Schema arises from the nested use of
2053+
`anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
2054+
is typically manageable, nesting them causes the number of evaluation paths to
2055+
grow exponentially.
2056+
2057+
For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
2058+
options, results in 25 evaluation paths. Adding a third level increases this to
2059+
125, and so on. Attackers can exploit this by crafting schemas that force
2060+
validators to explore a large number of branches.
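
The growth described above can be checked with a small sketch. This is an illustration only, not part of the specification text: `nested_one_of` and `count_leaf_paths` are hypothetical helper names, and the count is the worst case where every branch has to be explored.

```python
def nested_one_of(breadth: int, depth: int) -> dict:
    """Build a schema with `depth` levels of `oneOf`, each with `breadth` options."""
    if depth == 0:
        return {"type": "string"}
    return {"oneOf": [nested_one_of(breadth, depth - 1) for _ in range(breadth)]}


def count_leaf_paths(schema) -> int:
    """Count the leaf branches reachable through nested `oneOf`/`anyOf`."""
    if not isinstance(schema, dict):
        return 1  # boolean schemas have no combinators
    branches = schema.get("oneOf") or schema.get("anyOf")
    if not branches:
        return 1
    return sum(count_leaf_paths(sub) for sub in branches)


for depth in (1, 2, 3):
    print(depth, count_leaf_paths(nested_one_of(5, depth)))
```

With a breadth of 5, depths 2 and 3 give the 25 and 125 paths mentioned in the text.
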
2061+
2062+
This evaluation explosion is particularly dangerous when each path involves
2063+
expensive work such as collecting large annotations or evaluating complex
2064+
regular expressions. These effects multiply across paths and can result in
2065+
excessive CPU or memory consumption, leading to denial-of-service.
2066+
2067+
Implementations that evaluate untrusted schema are encouraged to take steps to
2068+
mitigate these threats with measures such as bounding combinator keyword depth
2069+
and breadth, limiting memory used for annotation collection, and guarding
2070+
against resource-intensive validations such as pathological regexes.
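
One way to bound combinator depth and breadth is to inspect a schema before evaluating it. The following is a minimal sketch under stated assumptions: the function name, the exception type, and the default limits are all hypothetical, and the walk is deliberately conservative (it visits every nested value, including non-schema positions such as `const`, and it does not follow `$ref`).

```python
COMBINATORS = ("anyOf", "oneOf")


class SchemaLimitError(ValueError):
    """Raised when a schema exceeds the configured combinator limits."""


def check_combinator_limits(schema, max_depth=3, max_breadth=16, _depth=0):
    """Reject schemas whose `anyOf`/`oneOf` nesting is too deep or too wide."""
    if isinstance(schema, list):
        for item in schema:
            check_combinator_limits(item, max_depth, max_breadth, _depth)
        return
    if not isinstance(schema, dict):
        return
    for key, value in schema.items():
        if key in COMBINATORS and isinstance(value, list):
            if _depth + 1 > max_depth:
                raise SchemaLimitError(f"combinator nesting deeper than {max_depth}")
            if len(value) > max_breadth:
                raise SchemaLimitError(f"{key} has more than {max_breadth} subschemas")
            for sub in value:
                check_combinator_limits(sub, max_depth, max_breadth, _depth + 1)
        else:
            # Same combinator depth: keep walking into properties, items, etc.
            check_combinator_limits(value, max_depth, max_breadth, _depth)
```

A real implementation would probably also need to account for references and for limits on annotation memory, which this sketch leaves out.
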
2071+
2072+
### Dynamic References
2073+
2074+
The paper ["The Complexity of JSON Schema: Undecidable, Expensive, Yet
2075+
Tractable" (Caroni et al., 2024)](https://doi.org/10.1145/3632891) has shown
2076+
that validation in the presence of dynamic references is PSPACE-complete. The
2077+
paper describes a method for replacing dynamic references with static ones, but
2078+
doing so can cause the size of the schema to grow exponentially. Implementations
2079+
should be aware of this risk and may wish to implement the method described in
2080+
the paper or impose limits on dynamic reference resolution.
2081+
2082+
### Infinite Loops and Cycles
2083+
2084+
Infinite loops can occur when evaluating schemas that produce cycles during
2085+
reference resolution. These cycles may involve multiple schemas. Not all
2086+
recursive schemas create loops, but implementations are advised to detect these
2087+
cycles and terminate evaluation when they are encountered.
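
A common way to tell a harmless recursive schema from a true loop is to track which (schema location, instance location) pairs are currently being evaluated: re-entering the same pair means no progress has been made. The sketch below assumes this approach; `CycleGuard`, `ReferenceCycleError`, and `follow_refs` are hypothetical names, and `follow_refs` only follows bare `$ref` chains in an in-memory mapping.

```python
from contextlib import contextmanager


class ReferenceCycleError(RuntimeError):
    """Raised when reference resolution revisits a pair it is already evaluating."""


class CycleGuard:
    def __init__(self):
        self._active = set()

    @contextmanager
    def enter(self, schema_location: str, instance_location: str):
        key = (schema_location, instance_location)
        if key in self._active:
            raise ReferenceCycleError(f"cycle at {schema_location!r}")
        self._active.add(key)
        try:
            yield
        finally:
            self._active.discard(key)


def follow_refs(schemas: dict, start: str, instance_location: str = "", guard=None) -> str:
    """Follow `$ref` from `start` and return the location of the final target."""
    if guard is None:
        guard = CycleGuard()
    with guard.enter(start, instance_location):
        target = schemas[start].get("$ref")
        if target is None:
            return start
        return follow_refs(schemas, target, instance_location, guard)
```

A recursive schema that descends into a child of the instance changes `instance_location` on each step, so it does not trip the guard; two schemas that reference each other at the same instance location do.
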
2088+
2089+
### Schema Identity and Collisions
2090+
2091+
Schemas may declare an `$id` to identify themselves or have embedded schemas
2092+
that declare an `$id`. An attacker may attempt to register a schema with an
2093+
`$id` that collides with a previously registered schema, or that differs only by
2094+
case, encoding, or other URI normalization quirks. Such collisions could result
2095+
in overwriting or shadowing of trusted schemas.
2096+
2097+
Implementations should consider rejecting schemas that have identifiers
2098+
(including embedded schema identifiers) that conflict with registered schemas
2099+
and should apply any URI normalization and comparison logic consistently to
2100+
detect and prevent conflicts.
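
As an illustration of consistent normalization, the sketch below applies a few of the syntax-based steps from RFC 3986 before comparing identifiers. It is an assumption-laden example, not a complete normalizer: it drops userinfo, does not remove dot segments, and treats path case as significant. `normalize_id`, `collect_ids`, and `IdRegistry` are hypothetical names.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
_PCT = re.compile(r"%[0-9a-fA-F]{2}")


def normalize_id(uri: str) -> str:
    """Lowercase scheme and host, drop default ports and empty fragments,
    and uppercase percent-encoding hex digits."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: restore the brackets urlsplit removed
        host = f"[{host}]"
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = _PCT.sub(lambda m: m.group(0).upper(), parts.path)
    if netloc and not path:
        path = "/"
    query = _PCT.sub(lambda m: m.group(0).upper(), parts.query)
    return urlunsplit((scheme, netloc, path, query, parts.fragment))


def collect_ids(schema, base: str = ""):
    """Yield the absolute `$id` of `schema` and of every embedded subschema."""
    if isinstance(schema, dict):
        raw = schema.get("$id")
        if isinstance(raw, str):
            base = urljoin(base, raw)
            yield base
        for value in schema.values():
            yield from collect_ids(value, base)
    elif isinstance(schema, list):
        for item in schema:
            yield from collect_ids(item, base)


class IdRegistry:
    def __init__(self):
        self._ids = set()

    def register(self, schema) -> None:
        """Record all identifiers of `schema`, rejecting any conflict."""
        keys = [normalize_id(i) for i in collect_ids(schema)]
        for key in keys:
            if key in self._ids or keys.count(key) > 1:
                raise ValueError(f"conflicting schema identifier: {key}")
        self._ids.update(keys)
```

Checking every identifier before recording any of them keeps a rejected schema from leaving partial entries behind.
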
2101+
2102+
### External Schema Resolution
2103+
2104+
JSON Schema implementations are expected to resolve external references using a
2105+
local registry. Although the specification allows for dynamic retrieval
2106+
(`https:` to fetch schemas over HTTP, or `file:` to read schemas from disk),
2107+
this behavior is discouraged unless it's intrinsic to the use case, such as with
2108+
JSON Hyper-Schema.
2109+
2110+
Resolving schemas dynamically introduces several security concerns, each of
2111+
which can be mitigated by limiting or controlling resolution behavior. A tightly
2112+
scoped schema resolution policy significantly reduces the attack surface,
2113+
especially when validating untrusted data.
2114+
2115+
Implementations are advised to disable dynamic retrieval by default and limit
2116+
external schema resolution to the local registry unless dynamic retrieval is
2117+
explicitly enabled. If enabled, they should consider limiting the number of
2118+
dynamic retrievals a validation can perform and defining timeouts on dynamic
2119+
retrievals to reduce the risk of resource exhaustion.
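
The policy described above (registry first, dynamic retrieval off unless enabled, a retrieval budget, and a timeout) could look like the following sketch. `SchemaResolver`, `RetrievalError`, and the `fetch` callable are hypothetical; the default limits are placeholders rather than recommended values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


class RetrievalError(Exception):
    """Raised when a schema cannot be resolved under the current policy."""


@dataclass
class SchemaResolver:
    registry: Dict[str, dict] = field(default_factory=dict)
    # `fetch(uri, timeout_seconds)` returns a schema. None means dynamic
    # retrieval is disabled, which is the default.
    fetch: Optional[Callable[[str, float], dict]] = None
    max_retrievals: int = 10
    timeout_seconds: float = 5.0
    _retrievals: int = field(default=0, init=False)

    def resolve(self, uri: str) -> dict:
        if uri in self.registry:
            return self.registry[uri]
        if self.fetch is None:
            raise RetrievalError(f"{uri} is not registered and dynamic retrieval is disabled")
        if self._retrievals >= self.max_retrievals:
            raise RetrievalError("dynamic retrieval budget exhausted")
        self._retrievals += 1
        return self.fetch(uri, self.timeout_seconds)
```

Creating one resolver per validation run makes the budget apply to a single validation. A `fetch` function for HTTP could wrap `urllib.request.urlopen(uri, timeout=timeout_seconds)`, so that the timeout is enforced by the transport.
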
2120+
2121+
#### HTTP(S) Specific Threats
2122+
2123+
Allowing schema references to resolve over HTTP or HTTPS introduces several
2124+
threats:
2125+
2126+
* **Denial of Service (DoS)**: Validation may hang or become slow if a
2127+
referenced schema URL is slow to respond or never returns.
2128+
* **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
2129+
internal-only services using hostnames like localhost or private IPs.
2130+
Implementations are advised to restrict HTTP schema retrieval to a
2131+
configurable allowlist of trusted domains.
2132+
* **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
2133+
or change between validations. If network retrieval is allowed,
2134+
implementations are advised to only allow retrieval over HTTPS unless
2135+
specifically configured to allow unsecured transport.
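
The allowlist and HTTPS-only advice in the list above can be sketched as a check performed before any request is made. `is_retrieval_allowed` and its parameters are hypothetical names. Note the assumption that an exact hostname match is sufficient: a hostname allowlist does not by itself protect against an allowed name resolving to an internal address.

```python
from urllib.parse import urlsplit


def is_retrieval_allowed(url: str, allowed_hosts: set, allow_insecure: bool = False) -> bool:
    """Return True only for HTTPS (or explicitly permitted HTTP) URLs whose
    host is on the allowlist. `allowed_hosts` holds lowercase hostnames."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        if not allow_insecure:
            return False
    elif parts.scheme != "https":
        return False
    host = parts.hostname  # already lowercased by urlsplit
    if host is None:
        return False
    # `localhost` and IP literals are rejected unless explicitly listed.
    return host.rstrip(".") in allowed_hosts


allowed = {"schemas.example.com"}
print(is_retrieval_allowed("https://schemas.example.com/a.json", allowed))
print(is_retrieval_allowed("https://localhost/internal", allowed))
```
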
2136+
2137+
#### File System Specific Threats
2138+
2139+
Allowing resolution from the local filesystem (`file:` URIs) raises different
2140+
issues:
2141+
2142+
* **Information Disclosure**: Malicious schemas may access sensitive files on
2143+
the system. Implementations should consider restricting filesystem access to
2144+
a specific schema directory tree.
2145+
* **Cross-Context Access**: A schema fetched from HTTP may try to reference a
2146+
schema on the filesystem. Implementations are advised to allow resolving
2147+
`file:` references only when the referencing schema was itself loaded from the
2148+
file system, similar to same-origin policies in web browsers.
2149+
* **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
2150+
host-specific filesystem details in two ways: through the `$id` itself or
2151+
through schema locations in validation output. Implementations are advised to
2152+
reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
2153+
internally, implementations are advised to sanitize them (for example, by
2154+
converting them to relative URIs) to avoid exposing host filesystem structure
2155+
to users.
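
Two of the mitigations in the list above, confining `file:` resolution to a schema directory tree and sanitizing locations before they reach users, can be sketched as follows. The function names and the `schema_root` parameter are hypothetical, and the example assumes POSIX-style paths.

```python
import os
from urllib.parse import urlsplit
from urllib.request import url2pathname


def resolve_file_uri(uri: str, schema_root: str) -> str:
    """Map a `file:` URI to a real path, refusing anything outside `schema_root`."""
    parts = urlsplit(uri)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError(f"not a local file URI: {uri}")
    root = os.path.realpath(schema_root)
    # realpath resolves symlinks and `..` segments before the containment check.
    path = os.path.realpath(url2pathname(parts.path))
    if os.path.commonpath([root, path]) != root:
        raise PermissionError(f"{uri} is outside the schema directory")
    return path


def sanitize_location(path: str, schema_root: str) -> str:
    """Express `path` relative to `schema_root` so output hides host directories."""
    relative = os.path.relpath(os.path.realpath(path), os.path.realpath(schema_root))
    return relative.replace(os.sep, "/")
```

Resolving symlinks before comparing matters here: a purely textual prefix check would accept a path such as `<root>/../secret`.
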
2156+
2157+
### Extension-Specific Risks
2158+
2159+
Third-party JSON Schema extensions may introduce additional risks. Implementers
2160+
are advised to consult the specifications of any extensions they support and
2161+
take into account their security considerations as well.
20652162

20662163
## IANA Considerations
20672164
