Bug description
Velox currently doesn't escape non-ASCII Unicode characters when casting to JSON, while Presto does.
Presto:
presto> select U&'\+01F64F';
_col0
-------
🙏
(1 row)
presto> select cast(U&'\+01F64F' as json);
_col0
----------------
"\uD83D\uDE4F"
(1 row)
Velox:
testCastToJson<StringView>(VARCHAR(), {"\U0001F64F"}, {"\"\\ud83d\\ude4f\""});
at 0: expected "\ud83d\ude4f", but got "🙏"
Velox uses folly::json::escapeString to cast strings to JSON. This function accepts configuration options, including:
// If true, non-ASCII utf8 characters would be encoded as \uXXXX:
// - if the code point is in [U+0000..U+FFFF] => encode as a single \uXXXX
// - if the code point is > U+FFFF => encode as 2 UTF-16 surrogate pairs.
bool encode_non_ascii{false};
With encode_non_ascii enabled, the only remaining difference from Presto is that folly::json::escapeString emits lowercase hex digits, while Presto emits uppercase.
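For reference, here is a minimal standalone sketch of the output format Presto produces: non-ASCII code points escaped as \uXXXX with uppercase hex digits, and code points above U+FFFF written as a UTF-16 surrogate pair. It is not Velox or folly code. The helper names appendEscapedUnit and toJsonUpperHex are hypothetical, the input is assumed to be valid UTF-8, and only quotes and backslashes are handled among the ASCII escapes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Velox or folly): appends one UTF-16 code
// unit as \uXXXX using uppercase hex digits, matching Presto's output.
void appendEscapedUnit(std::string& out, uint16_t unit) {
  char buf[7]; // 6 characters for \uXXXX plus the terminating NUL.
  std::snprintf(buf, sizeof(buf), "\\u%04X", static_cast<unsigned>(unit));
  out += buf;
}

// Hypothetical helper: wraps a UTF-8 string in JSON quotes and escapes every
// non-ASCII code point. Assumes valid UTF-8; control characters are passed
// through unchanged to keep the sketch short.
std::string toJsonUpperHex(const std::string& utf8) {
  std::string out = "\"";
  std::size_t i = 0;
  while (i < utf8.size()) {
    const unsigned char c = static_cast<unsigned char>(utf8[i]);
    if (c < 0x80) {
      if (c == '"' || c == '\\') {
        out += '\\';
      }
      out += static_cast<char>(c);
      ++i;
      continue;
    }
    // Number of continuation bytes implied by the lead byte.
    const int extra = (c >= 0xF0) ? 3 : (c >= 0xE0) ? 2 : 1;
    // Payload bits of the lead byte: 5, 4, or 3 bits for 2-, 3-, 4-byte forms.
    uint32_t cp = c & (0x3F >> extra);
    for (int k = 1; k <= extra && i + k < utf8.size(); ++k) {
      cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
    }
    i += static_cast<std::size_t>(extra) + 1;
    if (cp <= 0xFFFF) {
      appendEscapedUnit(out, static_cast<uint16_t>(cp));
    } else {
      // Split into a high and a low surrogate.
      cp -= 0x10000;
      appendEscapedUnit(out, static_cast<uint16_t>(0xD800 + (cp >> 10)));
      appendEscapedUnit(out, static_cast<uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  out += '"';
  return out;
}
```

In Velox itself, a smaller change would presumably be to enable encode_non_ascii and then uppercase the hex digits of each emitted escape sequence, rather than reimplementing the escaping as above.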
CC: @zacw7 @aditi-pandit @amitkdutta @kagamiori @kevinwilfong
Related: FasterXML/jackson-core#717
System information
n/a
Relevant logs
No response