Commit 8e1aeb6

firestore: add an example to the comment in compareUtf8Strings() (#7113)
1 parent 50cd4f3 commit 8e1aeb6

1 file changed (+16 -0 lines changed)

firebase-firestore/src/main/java/com/google/firebase/firestore/util/Util.java

Lines changed: 16 additions & 0 deletions
@@ -99,6 +99,22 @@ public static int compareUtf8Strings(String left, String right) {
     // used to represent code points greater than 0xFFFF which have 4-byte UTF-8 representations
     // and are lexicographically greater than the 1, 2, or 3-byte representations of code points
     // less than or equal to 0xFFFF.
+    //
+    // An example of why Case 2 is required is comparing the following two Unicode code points:
+    //
+    // |-----------------------|------------|---------------------|-----------------|
+    // | Name                  | Code Point | UTF-8 Encoding      | UTF-16 Encoding |
+    // |-----------------------|------------|---------------------|-----------------|
+    // | Replacement Character | U+FFFD     | 0xEF 0xBF 0xBD      | 0xFFFD          |
+    // | Grinning Face         | U+1F600    | 0xF0 0x9F 0x98 0x80 | 0xD83D 0xDE00   |
+    // |-----------------------|------------|---------------------|-----------------|
+    //
+    // A lexicographical comparison of the UTF-8 encodings of these code points would order
+    // "Replacement Character" _before_ "Grinning Face" because 0xEF is less than 0xF0. However, a
+    // direct comparison of the UTF-16 code units, as would be done in case 1, would erroneously
+    // produce the _opposite_ ordering, because 0xFFFD is _greater than_ 0xD83D. As it turns out,
+    // this relative ordering holds for all comparisons of UTF-16 code points requiring a surrogate
+    // pair with those that do not.
     final int length = Math.min(left.length(), right.length());
     for (int i = 0; i < length; i++) {
       final char leftChar = left.charAt(i);
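
The ordering mismatch described in the added comment can be reproduced with a small standalone program. The sketch below is illustrative only and is not the body of compareUtf8Strings(): the class Utf8OrderDemo and its three helper methods are hypothetical names introduced here. It compares the two code points from the table by UTF-8 bytes, by UTF-16 code units (which is what String.compareTo does), and by code point. Note that the disagreement can only arise when a surrogate pair meets a non-surrogate code unit in the range 0xE000 to 0xFFFF, since code units below 0xD800 sort before the surrogates in both encodings.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Util.java.
final class Utf8OrderDemo {

  // Lexicographic comparison of the UTF-8 encodings, treating each byte as unsigned.
  static int compareUtf8Bytes(String left, String right) {
    byte[] l = left.getBytes(StandardCharsets.UTF_8);
    byte[] r = right.getBytes(StandardCharsets.UTF_8);
    int length = Math.min(l.length, r.length);
    for (int i = 0; i < length; i++) {
      int cmp = Integer.compare(l[i] & 0xFF, r[i] & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(l.length, r.length);
  }

  // Lexicographic comparison of UTF-16 code units; String.compareTo works this way.
  static int compareCodeUnits(String left, String right) {
    return Integer.signum(left.compareTo(right));
  }

  // Comparison by Unicode code point, which agrees with UTF-8 byte order
  // for well-formed strings.
  static int compareCodePoints(String left, String right) {
    int i = 0;
    while (i < left.length() && i < right.length()) {
      int leftCodePoint = left.codePointAt(i);
      int rightCodePoint = right.codePointAt(i);
      if (leftCodePoint != rightCodePoint) {
        return Integer.compare(leftCodePoint, rightCodePoint);
      }
      // Equal code points occupy the same number of chars in both strings.
      i += Character.charCount(leftCodePoint);
    }
    return Integer.compare(left.length(), right.length());
  }

  public static void main(String[] args) {
    String replacementCharacter = "\uFFFD"; // U+FFFD
    String grinningFace = "\uD83D\uDE00"; // U+1F600 as a surrogate pair

    // Negative: 0xEF is less than 0xF0.
    System.out.println("UTF-8 bytes:       "
        + compareUtf8Bytes(replacementCharacter, grinningFace));
    // Positive: 0xFFFD is greater than 0xD83D.
    System.out.println("UTF-16 code units: "
        + compareCodeUnits(replacementCharacter, grinningFace));
    // Negative: 0xFFFD is less than 0x1F600.
    System.out.println("Code points:       "
        + compareCodePoints(replacementCharacter, grinningFace));
  }
}
```

Encoding both strings to byte arrays on every comparison allocates memory, which is one plausible reason to compare chars in place and special-case surrogates instead, as the surrounding comment describes.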
