@sebastian-nagel
Contributor

The pattern used to match CSS-embedded URLs places no limit on URL length, so it matches URLs of any length and can cause a Java stack overflow (see commoncrawl#12).

This PR fixes the issue and adds a unit test that reproduces the problem and verifies the fix.
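
For illustration only, here is a minimal sketch of what bounding the match length can look like with java.util.regex. It is not the pattern from this PR: the class name BoundedCssUrl, the limit MAX_URL_LENGTH = 2048, and the regex itself are assumptions made for the example. The sketch also excludes whitespace from the URL even when it is quoted, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the class touched by this PR.
final class BoundedCssUrl {
    // Assumed upper bound on URL length; the real limit may differ.
    static final int MAX_URL_LENGTH = 2048;

    // url( ... ) with an optional quote on each side. The URL part is a
    // plain character class with a bounded quantifier, so a single match
    // can never consume more than MAX_URL_LENGTH characters.
    static final Pattern CSS_URL = Pattern.compile(
        "url\\(\\s*['\"]?([^'\"()\\s]{1," + MAX_URL_LENGTH + "})['\"]?\\s*\\)",
        Pattern.CASE_INSENSITIVE);

    // Returns every URL found in the given CSS text, in document order.
    static List<String> extract(String css) {
        List<String> urls = new ArrayList<>();
        Matcher m = CSS_URL.matcher(css);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(extract("a { background: url(img/a.png) }"));
    }
}
```

With this sketch, a url(...) whose content exceeds the assumed limit is simply not matched, rather than being processed at full length.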

@ato
Member

ato commented Oct 25, 2019

Looks like this patch also disallows whitespace within the URL? Under the old pattern, url('foo bar') matched, but under the new pattern it does not. According to MDN's documentation, whitespace should be allowed if the URL is quoted:

Quotes are required if the URL includes parentheses, whitespace, or quotes, unless these characters are escaped, or if the address includes control characters above 0x7e.
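
One way to honor that rule while keeping a length limit is to give quoted and unquoted URLs separate alternatives. The following is a rough sketch under assumptions, not the pattern from this PR: the class name QuotedCssUrl, the limit of 2048 characters, and the regex are all hypothetical, and escaped characters inside the URL are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the quoting rule; not code from this PR.
final class QuotedCssUrl {
    // Three alternatives, each bounded to an assumed 2048 characters:
    //   group 1: double-quoted, may contain whitespace and parentheses
    //   group 2: single-quoted, may contain whitespace and parentheses
    //   group 3: unquoted, no whitespace, quotes, or parentheses
    static final Pattern CSS_URL = Pattern.compile(
        "url\\(\\s*(?:\"([^\"]{1,2048})\"|'([^']{1,2048})'|([^'\"()\\s]{1,2048}))\\s*\\)",
        Pattern.CASE_INSENSITIVE);

    // Returns the first URL in the CSS text, or null if nothing matches.
    static String first(String css) {
        Matcher m = CSS_URL.matcher(css);
        if (!m.find()) {
            return null;
        }
        // Exactly one of the three groups participates in a match.
        for (int g = 1; g <= 3; g++) {
            if (m.group(g) != null) {
                return m.group(g);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(first("a { background: url('foo bar') }"));
    }
}
```

In this sketch url('foo bar') matches with the URL foo bar, while the unquoted url(foo bar) does not match.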
