@@ -2036,32 +2036,129 @@ SHOULD use the terms defined by this document to do so.
## Security Considerations {#security}
- Both schemas and instances are JSON values. As such, all security considerations
- defined in [RFC 8259][rfc8259] apply.
-
- Instances and schemas are both frequently written by untrusted third parties, to
- be deployed on public Internet servers. Implementations should take care that
- the parsing and evaluating against schemas does not consume excessive system
- resources. Implementations MUST NOT fall into an infinite loop.
-
- A malicious party could cause an implementation to repeatedly collect a copy of
- a very large value as an annotation. Implementations SHOULD guard against
- excessive consumption of system resources in such a scenario.
-
- Servers MUST ensure that malicious parties cannot change the functionality of
- existing schemas by uploading a schema with a pre-existing or very similar
- `$id`.
-
- Individual JSON Schema extensions are liable to also have their own security
- considerations. Consult the respective specifications for more information.
-
- Schema authors should take care with `$comment` contents, as a malicious
- implementation can display them to end-users in violation of a spec, or fail to
- strip them if such behavior is expected.
-
- A malicious schema author could place executable code or other dangerous
- material within a `$comment`. Implementations MUST NOT parse or otherwise take
- action based on `$comment` contents.
+ While schemas and instances are not always represented as JSON text, they are
+ defined in terms of the JSON data model. As such, the security considerations
+ defined in [RFC 8259][rfc8259] may still apply in environments where text-based
+ representations are used, particularly those considerations related to parsing,
+ number precision, and structural limitations.
+
+ Schemas and instances are frequently authored by untrusted parties.
+ Implementations that accept or evaluate such inputs may be exposed to several
+ classes of attack, particularly denial-of-service (DoS) by means of resource
+ exhaustion.
+
+ ### Nested `anyOf`/`oneOf`
+
+ One source of resource exhaustion in JSON Schema is the nested use of
+ `anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
+ is typically manageable, nesting them causes the number of evaluation paths to
+ grow exponentially.
+
+ For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
+ options, results in 25 evaluation paths. Adding a third level increases this to
+ 125, and so on. Attackers can exploit this by crafting schemas that force
+ validators to explore a large number of branches.
+
+ This evaluation explosion is particularly dangerous when each path involves
+ expensive work such as collecting large annotations or evaluating complex
+ regular expressions. These effects multiply across paths and can result in
+ excessive CPU or memory consumption, leading to denial-of-service.
+
+ Implementations that evaluate untrusted schemas are encouraged to take steps to
+ mitigate these threats with measures such as bounding combinator keyword depth
+ and breadth, limiting memory used for annotation collection, and guarding
+ against resource-intensive validations such as pathological regexes.
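
As an illustration of bounding combinator depth and breadth, the following is a minimal sketch and not part of the proposed text. The function name `check_combinator_limits` and the default limits are assumptions chosen for the example. It walks the raw schema document without following references, so it may also count keywords that appear inside non-schema values such as `const`, which errs on the side of rejecting.

```python
from typing import Any

COMBINATORS = ("anyOf", "oneOf")  # the keywords discussed in this section


def check_combinator_limits(
    schema: Any, max_depth: int = 3, max_breadth: int = 8, _depth: int = 0
) -> None:
    """Raise ValueError if anyOf/oneOf nesting or fan-out exceeds the limits.

    Hypothetical helper: the limits are illustrative defaults, not values
    taken from the specification.
    """
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key in COMBINATORS and isinstance(value, list):
                if _depth + 1 > max_depth:
                    raise ValueError(f"{key} nested deeper than {max_depth}")
                if len(value) > max_breadth:
                    raise ValueError(f"{key} has more than {max_breadth} subschemas")
                for subschema in value:
                    check_combinator_limits(
                        subschema, max_depth, max_breadth, _depth + 1
                    )
            else:
                # Any other keyword keeps the current combinator depth.
                check_combinator_limits(value, max_depth, max_breadth, _depth)
    elif isinstance(schema, list):
        for item in schema:
            check_combinator_limits(item, max_depth, max_breadth, _depth)
```

A caller would run this check once, before evaluation, on every schema received from an untrusted source.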
+
+ ### Dynamic References
+
+ The paper ["The Complexity of JSON Schema: Undecidable, Expensive, Yet
+ Tractable" (Caroni et al., 2024)](https://doi.org/10.1145/3632891) has shown
+ that validation in the presence of dynamic references is PSPACE-complete. The
+ paper describes a method for replacing dynamic references with static ones, but
+ doing so can cause the size of the schema to grow exponentially. Implementations
+ should be aware of this risk and may wish to implement the method described in
+ the paper or impose limits on dynamic reference resolution.
+
+ ### Infinite Loops and Cycles
+
+ Infinite loops can occur when evaluating schemas that produce cycles during
+ reference resolution. These cycles may involve multiple schemas. Not all
+ recursive schemas create loops, but implementations are advised to detect these
+ cycles and terminate evaluation when they are encountered.
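
One way to separate harmless recursion from a true loop is to look only at keywords that apply a subschema to the same instance location. The sketch below is an illustration under stated assumptions, not the method prescribed by the text: the names `has_in_place_cycle` and `resolve_pointer` are hypothetical, only same-document JSON Pointer references (`#` or `#/...`) are followed, and only cycles reachable in place from the root are reported.

```python
from typing import Any

# Keywords that apply subschemas to the same instance location.
IN_PLACE_LISTS = ("allOf", "anyOf", "oneOf")
IN_PLACE_SINGLES = ("not", "if", "then", "else")


def resolve_pointer(root: Any, ref: str) -> Any:
    """Resolve '#' or '#/a/b' against root (percent-decoding is not handled)."""
    node = root
    for raw in ref[1:].split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def has_in_place_cycle(root, schema=None, active=None, done=None) -> bool:
    """Return True if evaluation can revisit a schema without moving in the instance."""
    schema = root if schema is None else schema
    active = [] if active is None else active   # schemas on the current path
    done = set() if done is None else done      # ids of schemas known to be safe
    if not isinstance(schema, dict) or id(schema) in done:
        return False
    if any(schema is seen for seen in active):
        return True
    active.append(schema)
    targets = []
    ref = schema.get("$ref")
    if isinstance(ref, str) and (ref == "#" or ref.startswith("#/")):
        targets.append(resolve_pointer(root, ref))
    for keyword in IN_PLACE_LISTS:
        value = schema.get(keyword)
        if isinstance(value, list):
            targets.extend(value)
    for keyword in IN_PLACE_SINGLES:
        if keyword in schema:
            targets.append(schema[keyword])
    found = any(has_in_place_cycle(root, t, active, done) for t in targets)
    active.pop()
    if not found:
        done.add(id(schema))
    return found
```

A schema that recurses through `properties` descends into the instance on every step and is not flagged, while two definitions that reference each other directly are.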
+
+ ### Schema Identity and Collisions
+
+ Schemas may declare an `$id` to identify themselves or have embedded schemas
+ that declare an `$id`. An attacker may attempt to register a schema with an
+ `$id` that collides with a previously registered schema, or that differs only by
+ case, encoding, or other URI normalization quirks. Such collisions could result
+ in overwriting or shadowing of trusted schemas.
+
+ Implementations should consider rejecting schemas that have identifiers
+ (including embedded schema identifiers) that conflict with registered schemas
+ and should apply any URI normalization and comparison logic consistently to
+ detect and prevent conflicts.
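
The sketch below illustrates one possible form of consistent normalization before comparison. It is an assumption-laden example: `normalize_id` and `SchemaRegistry` are hypothetical names, only scheme and host case, default ports, percent-escape case, and empty fragments are normalized, userinfo is dropped, and embedded `$id` values are not collected.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_id(uri: str) -> str:
    """Return a comparison key for an absolute `$id` (partial normalization only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: restore the brackets removed by urlsplit
        host = f"[{host}]"
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    # Percent-escapes are case-insensitive, so use uppercase hex digits.
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


class SchemaRegistry:
    """Hypothetical registry that refuses identifiers that normalize identically."""

    def __init__(self) -> None:
        self._schemas: dict[str, dict] = {}

    def register(self, schema: dict) -> None:
        key = normalize_id(schema["$id"])
        if key in self._schemas:
            raise ValueError(f"$id collides with a registered schema: {key}")
        self._schemas[key] = schema
```

Because lookups and registrations would share `normalize_id`, two identifiers that differ only in the normalized aspects cannot shadow one another.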
+
+ ### External Schema Resolution
+
+ JSON Schema implementations are expected to resolve external references using a
+ local registry. Although the specification allows for dynamic retrieval
+ (`https:` to fetch schemas over HTTPS, or `file:` to read schemas from disk),
+ this behavior is discouraged unless it is intrinsic to the use case, such as with
+ JSON Hyper-Schema.
+
+ Resolving schemas dynamically introduces several security concerns, each of
+ which can be mitigated by limiting or controlling resolution behavior. A tightly
+ scoped schema resolution policy significantly reduces the attack surface,
+ especially when validating untrusted data.
+
+ Implementations are advised to disable dynamic retrieval by default and limit
+ external schema resolution to the local registry unless dynamic retrieval is
+ explicitly enabled. If enabled, they should consider limiting the number of
+ dynamic retrievals a validation can perform and defining timeouts on dynamic
+ retrievals to reduce the risk of resource exhaustion.
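
The policy described above (registry first, retrieval off by default, a per-validation budget, and a timeout) could be sketched as follows. The class name `BoundedResolver`, the `fetch` callable, and the default limits are assumptions made for illustration; the actual network code is injected by the caller and is expected to honor the timeout it receives.

```python
from typing import Callable, Optional

# Hypothetical signature: fetch(uri, timeout_seconds) -> parsed schema
Fetch = Callable[[str, float], dict]


class BoundedResolver:
    """Resolve from a local registry; retrieve dynamically only if enabled."""

    def __init__(
        self,
        registry: dict,
        fetch: Optional[Fetch] = None,  # None means dynamic retrieval is disabled
        max_retrievals: int = 5,
        timeout_s: float = 2.0,
    ) -> None:
        self.registry = registry
        self.fetch = fetch
        self.max_retrievals = max_retrievals
        self.timeout_s = timeout_s
        self.retrievals = 0

    def resolve(self, uri: str) -> dict:
        if uri in self.registry:
            return self.registry[uri]
        if self.fetch is None:
            raise LookupError(f"not in local registry, retrieval disabled: {uri}")
        if self.retrievals >= self.max_retrievals:
            raise RuntimeError("dynamic retrieval budget exhausted")
        self.retrievals += 1
        schema = self.fetch(uri, self.timeout_s)
        self.registry[uri] = schema  # later lookups are served locally
        return schema
```

Creating one resolver per validation keeps the budget scoped to that validation rather than to the process.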
+
+ #### HTTP(S) Specific Threats
+
+ Allowing schema references to resolve over HTTP or HTTPS introduces several
+ threats:
+
+ * **Denial of Service (DoS)**: Validation may hang or become slow if a
+   referenced schema URL is slow to respond or never returns.
+ * **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
+   internal-only services using hostnames like localhost or private IPs.
+   Implementations are advised to restrict HTTP schema retrieval to a
+   configurable allowlist of trusted domains.
+ * **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
+   or change between validations. If network retrieval is allowed,
+   implementations are advised to only allow retrieval over HTTPS unless
+   specifically configured to allow unsecured transport.
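
The allowlist and HTTPS-only advice in the list above could be combined into a single pre-flight check, sketched here under assumptions: `is_retrieval_allowed` is a hypothetical name, the allowlist is a set of exact hostnames, and the check inspects only the URI text. A name on the allowlist can still resolve to an internal address, so checking the resolved address is a separate step that this sketch does not cover.

```python
import ipaddress
from urllib.parse import urlsplit


def is_retrieval_allowed(
    uri: str, allowed_hosts: set, allow_insecure: bool = False
) -> bool:
    """Return True only for allowlisted hosts over an acceptable scheme."""
    parts = urlsplit(uri)
    if parts.scheme == "http":
        if not allow_insecure:
            return False  # unsecured transport must be enabled explicitly
    elif parts.scheme != "https":
        return False
    host = parts.hostname or ""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # not an IP literal
    if address is not None and (
        address.is_private or address.is_loopback or address.is_link_local
    ):
        return False  # refuse internal IP literals even if allowlisted
    return host in allowed_hosts
```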
+
+ #### File System Specific Threats
+
+ Allowing resolution from the local filesystem (`file:` URIs) raises different
+ issues:
+
+ * **Information Disclosure**: Malicious schemas may access sensitive files on
+   the system. Implementations should consider restricting filesystem access to
+   a specific schema directory tree.
+ * **Cross-Context Access**: A schema fetched from HTTP may try to reference a
+   schema on the filesystem. Implementations are advised to allow resolving
+   `file:` references only when the referencing schema was itself loaded from the
+   file system, similar to same-origin policies in web browsers.
+ * **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
+   host-specific filesystem details in two ways: through the `$id` itself or
+   through schema locations in validation output. Implementations are advised to
+   reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
+   internally, implementations are advised to sanitize them (for example, by
+   converting them to relative URIs) to avoid exposing host filesystem structure
+   to users.
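
The three mitigations in the list above can be sketched together. This is an illustrative example for POSIX-style paths only: `resolve_file_uri` and `public_location` are hypothetical names, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_file_uri(uri: str, schema_root: str, referrer_scheme: str) -> Path:
    """Map a file: URI to a path inside schema_root, or raise."""
    if referrer_scheme != "file":
        # Same-origin style rule: only file-loaded schemas may reference files.
        raise PermissionError("file: reference from a non-file schema")
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"not a file: URI: {uri}")
    root = Path(schema_root).resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = Path(unquote(parts.path)).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError("path escapes the schema directory tree")
    return candidate


def public_location(path: Path, schema_root: str) -> str:
    """Return a root-relative location suitable for validation output."""
    return path.relative_to(Path(schema_root).resolve()).as_posix()
```

Reporting `public_location(...)` instead of the absolute path keeps host directory names out of validation output.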
+
+ ### Extension-Specific Risks
+
+ Third-party JSON Schema extensions may introduce additional risks. Implementers
+ are advised to consult the specifications of any extensions they support and
+ take into account their security considerations as well.
## IANA Considerations