Skip to content

Query var normalization should be hardened to avoid infinite space #1467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
westonruter opened this issue Aug 13, 2024 · 1 comment
Open
Labels
[Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature

Comments

@westonruter
Copy link
Member

westonruter commented Aug 13, 2024

The post_name for an od_url_metrics post is comprised of an MD5-hash (via od_get_url_metrics_slug()) of the normalized query vars as returned by od_get_normalized_query_vars(). The normalization is done to avoid creating more od_url_metrics posts than are needed.

For example, when a 404 is returned the URL could contain anything. A request for /?p=123456789 results in query vars including array( 'p' => 123456789 ) even when the post doesn't exist. Every 404 response should be the same, so there is no need to have separate od_url_metrics posts for each 404 response. (In fact, is_404() should perhaps be included in the conditions used by od_can_optimize_response(), although in practice this will be the case since od_get_cache_purge_post_id() will be null.) This is already the case for search queries (when is_search()), since the URL space for searches is by definition unconstrained.) But the situation gets complicated with other types or archive queries. For example, hour, minute, and second are all valid public query vars to get posts by date. Or you can add tag and cat query vars with arbitrary comma-separated lists of tags and categories to create more possible permutations of query vars that will result in unique slug for an od_url_metrics post. How should this be handled?

Note that od_get_normalized_query_vars() isn't sorting the array keys, which is also needed for normalization, $wp->query_vars is populated by iterating over public_query_vars so in practice this shouldn't be a problem.

To guard against infinite permutations in query vars, we should perhaps disable optimization if pretty permalinks aren't enabled and/or if any public query vars are added via $_GET on top of the pretty permalink.

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Aug 13, 2024
@github-project-automation github-project-automation bot moved this to Not Started/Backlog 📆 in WP Performance 2025 Aug 13, 2024
@westonruter westonruter moved this from Not Started/Backlog 📆 to Definition ✏️ in WP Performance 2025 Aug 13, 2024
@westonruter
Copy link
Member Author

This becomes a non-issue if unauthenticated end-users are not allowed to submit URL Metrics (cf. #1311). Or at least unauthenticated users could be prevented from submitting URL Metrics except for a subset of discrete pretty permalink URLs, e.g. homepage, singular post, page, author archives, term archive pages, and so on. The key is to not allow any URLs which have query vars in GET parameters, because that's what leads to the infinite URL space, as you can practically have an infinite number of URLs with different permutations. Although, come to think of it, the permutations would have to result in actual posts being returned so that there is no 404, since detection and optimization are disabled when is_404(). Therefore, this concern may be overblown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
Status: Definition ✏️
Development

No branches or pull requests

1 participant