Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redirect doesn't work with urls containing non-ASCII symbols #116

Open
chmllr opened this issue Jun 13, 2012 · 10 comments
Open

redirect doesn't work with urls containing non-ASCII symbols #116

chmllr opened this issue Jun 13, 2012 · 10 comments

Comments

@chmllr
Copy link

chmllr commented Jun 13, 2012

The noir.response/redirect doesn't work with urls containing non-ASCII symbols:

(redirect "/motörhead")

won't be matched by the corresponding defpage "/motörhead" ... declaration, because the browser is redirected to /mot%F6rhead.

What helps is to encode the url with ring.util.codec/url-encode. I don't really understand which component is more responsible for this: noir, composure or ring, so maybe it's not noir's bug?

@Raynes
Copy link
Member

Raynes commented Jun 14, 2012

@ibdknox I've got no clue where to begin with this one. Please advise when you have time.

@ifesdjeen
Copy link

technically, /motörhead is not a valid url, so match for /mot%F6rhead is a valid thing.

All the routes (pre-routes, noir-routes and post-routes) are merged together and matching is performed by Clout, which is a dependency for Compojure (https://github.com/weavejester/clout).

I think the case that's quite close to yours (although it's more for keywords rather than fixed path) is covered in specs, too, you can check it here:
https://github.com/weavejester/clout/blob/master/test/clout/test/core.clj#L46-57

@chmllr
Copy link
Author

chmllr commented Jun 14, 2012

@ifesdjeen thanks! Well, these tests solely check the bindings of urls to keywords, but surely not the matching. I checked out the tests and tested the matching:

(deftest utf8-route-test1
  (is (= (route-matches "/motörhead" (request :get "/mot%F6rhead")) {})))

(deftest utf8-route-test2
  (is (= (route-matches "/mot%F6rhead" (request :get "/motörhead")) {})))

(deftest utf8-route-test3
  (is (= (route-matches "/motörhead" (request :get "/motörhead")) {})))

And, as I expected, the first two tests fail. The third test shows, that clout is able to gecognize non-ASCII symbols.

Moreover, concerning your note, that /motörhead is not a valid URL. Assuming, that some component (clout? noir?) makes an automatic translation to base-<whatever>, when I use non-latin letters, e.g.:

(redirect "/тест")

this will be redirected to /???? and this is definitely wrong, isn't it?

@ifesdjeen
Copy link

Yup, they should have failed :) at least it seemed so form the other examples.

The last (тест) one is rather related to UTF-8. I'm not absolutely sure how they encode/convert things inside, But Non-ascii chars should get transcoded to ?.

@chmllr
Copy link
Author

chmllr commented Jun 14, 2012

The problem with "/тест" I have, is that it works when I apply the before mentioned url-encode function to it specifying UTF-8 as encoding. But my file encoding is already UTF-8, so I'm pretty confused now, what's now a bug and what's not, and shouldn't this conversion be applied by default?

@ifesdjeen
Copy link

It's quite hard for me to tell what is a bug and what's not :) probably even harder than it is for you to say so)

My thoughts on that would be: a) urls should are ascii-encoded by default, so I would not rely on utf-to-ascii conversion b) special ASCII symbols should be url-encoded, and I would not rely on library to encode it for me

I may be wrong though)

@chmllr
Copy link
Author

chmllr commented Jun 14, 2012

If you mean with "ascii-encoded", that the only used alphabet is ASCII, then this is a great idea, but doesn't always work IRL :) E.g., a common practice for a blog engine is to create links to a blog post out of the post title (e.g., blog.com/2012/06/12/this-is-a-post-title). When I only use ASCII characters, how should I create titles for non-ASCII languages? (Sure, I could generate some ugly hashes, but c'mon its 2012...).

Anyway, currently I do url-encode the title and everything seems to work. I'm only wondering whether noir should do this for me (since redirect is a high-level function of a web framework), or at least it should be mentioned in the API, that it only works for encoded URLs and not for any text.

@ifesdjeen
Copy link

Titles for non-ascii languages get utf-8 converted and then url-encoded :)

@ibdknox
Copy link
Member

ibdknox commented Jun 14, 2012

Are you setting your JVM encoding?

:jvm-opts ["-Dfile.encoding=utf-8"]

@chmllr
Copy link
Author

chmllr commented Jun 14, 2012

@ibdknox yep, this was my first idea too. Unfortunately, it didn't help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants