How to properly deal with emoji / 4 byte utf8? #348
Comments
@osheroff Yes, I think this is #223. Depending on one's reading of the JSON spec (or even which version of it, as there are multiple by now), Jackson's behavior is either what is specified as expected (per the original one), or possibly not. But unfortunately allowing alternate output is rather non-trivial due to the way Java's internal representation (UTF-16, where characters outside the BMP take two 16-bit chars) interacts with input/output buffer boundaries.
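To make the buffer-boundary point concrete, here is a small sketch using only the JDK (no Jackson; the class name `SurrogateDemo` is made up for illustration). It shows that the robot emoji from this issue occupies two Java chars but encodes to a single 4-byte UTF-8 sequence, so an encoder that has only seen the first char at the end of a buffer cannot emit anything for it yet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jackson.
class SurrogateDemo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes the string takes when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+1F916 ROBOT FACE, written as its surrogate pair.
        String robot = "\uD83E\uDD16";

        System.out.println(charCount(robot));      // 2 chars (a surrogate pair)
        System.out.println(codePointCount(robot)); // 1 code point
        System.out.println(utf8Length(robot));     // 4 bytes in UTF-8

        // The two halves are only meaningful together.
        System.out.println(Character.isHighSurrogate(robot.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(robot.charAt(1)));  // true
    }
}
```

If the two chars land on opposite sides of a buffer boundary, the writer has to carry the first half over to the next chunk, which is roughly the complication described above.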
gotcha. is there an easy way to get at or reproduce the utter basics of json encoding, leaving multi-byte chars alone? my input is mostly-trusted, as mysql should probably have sanitized out invalid chars.
@osheroff You mean to pass content as is? If you know what you are doing, you can use
@cowtowncoder cool, thanks for the info. actually even ruby parses the escape sequence just fine (I was just confused because it has its own thanks again for your time!
(fwiw, if anyone ever finds this thread again; I was not checking for surrogate pairs before doing
@osheroff Glad you can make it work!
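For anyone landing here later, the following is a minimal sketch of what "the utter basics of json encoding, leaving multi-byte chars alone" might look like with a surrogate-pair check. It is not the code from this thread (the original call is not shown above) and not a Jackson API. The class and method names (`RawJsonStrings`, `quote`) are hypothetical. It assumes the result is later written out as UTF-8.

```java
// Hypothetical helper, not part of Jackson: quotes a Java String as a JSON
// string literal, escaping only what JSON requires and leaving emoji as-is.
class RawJsonStrings {
    static String quote(String s) {
        StringBuilder out = new StringBuilder(s.length() + 2);
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // Valid surrogate pair: copy both halves together, unescaped.
                out.append(c).append(s.charAt(++i));
            } else if (Character.isSurrogate(c)) {
                // Lone surrogate: not encodable as UTF-8, so emit an escape.
                out.append(String.format("\\u%04X", (int) c));
            } else {
                switch (c) {
                    case '"':  out.append("\\\""); break;
                    case '\\': out.append("\\\\"); break;
                    case '\n': out.append("\\n");  break;
                    case '\r': out.append("\\r");  break;
                    case '\t': out.append("\\t");  break;
                    default:
                        if (c < 0x20) {
                            // Remaining control characters must be escaped in JSON.
                            out.append(String.format("\\u%04X", (int) c));
                        } else {
                            out.append(c);
                        }
                }
            }
        }
        out.append('"');
        return out.toString();
    }

    public static void main(String[] args) {
        // The emoji passes through untouched; only the quotes are added.
        System.out.println(quote("We are the robots.\uD83E\uDD16"));
    }
}
```

The design choice here is to treat a high surrogate and its following low surrogate as one unit, so the pair is never split or escaped halfway. Only unpaired surrogates fall back to an escape.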
Hi, I write https://github.com/zendesk/maxwell, and am having trouble with emoji characters in my json output.
If I send the string "We are the robots.🤖🤖🤖🤖" through the system, the output I get out of jackson is odd. I get:
I have varying degrees of success parsing this json. ruby barfs, chrome and scala and python appear to be fine, but I'd prefer going to 4 byte utf8 if possible.
I suspect this is the same issue as #223. If there's a workaround other than just a +1 to that discussion, it'd be great to know.
Thanks!
-ben