We're encoding U+E78D as 0x83 0x36 0xCB 0x32, which looks wrong. That is neither 0xA6 0xD9 nor 0x84 0x31 0x82 0x36, the sequences listed in L2/23-003R [1]. If we round-trip our byte sequence back to a code point, it decodes to U+E82E, which is an entirely different PUA character.

[1] https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf
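To make the size of the mismatch concrete, here is a minimal sketch that converts a GB18030 four-byte sequence to its linear pointer and back, using the arithmetic from the WHATWG Encoding Standard's gb18030 encoder and decoder. The function names are hypothetical and this is not WebKit's code. It deliberately leaves out the ranges-table lookup that maps a pointer to a code point.

```python
# Sketch only: GB18030 four-byte sequence <-> linear pointer arithmetic.
# Function names are illustrative, not taken from WebKit.

def four_byte_pointer(seq: bytes) -> int:
    """Return the linear pointer for a GB18030 four-byte sequence."""
    b1, b2, b3, b4 = seq  # raises ValueError unless exactly 4 bytes
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed GB18030 four-byte sequence")
    return ((b1 - 0x81) * 12600      # 10 * 126 * 10
            + (b2 - 0x30) * 1260     # 126 * 10
            + (b3 - 0x81) * 10
            + (b4 - 0x30))

def pointer_to_four_bytes(pointer: int) -> bytes:
    """Inverse of four_byte_pointer."""
    b1, rest = divmod(pointer, 12600)
    b2, rest = divmod(rest, 1260)
    b3, b4 = divmod(rest, 10)
    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])

ours = four_byte_pointer(bytes([0x83, 0x36, 0xCB, 0x32]))      # what we emit
expected = four_byte_pointer(bytes([0x84, 0x31, 0x82, 0x36]))  # per L2/23-003R
print(ours, expected, expected - ours)
```

The sequence we emit has pointer 33502 and the four-byte sequence from L2/23-003R has pointer 39076. They are 5574 positions apart, so this does not look like an off-by-one in splitting the pointer into bytes.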
<rdar://problem/110952885>
Created attachment 466741: encoder observation
Created attachment 466742: decoder observation
Pull request: https://github.com/WebKit/WebKit/pull/15413
Committed 265633@main (0a2e4563bda0): <https://commits.webkit.org/265633@main> Reviewed commits have been landed. Closing PR #15413 and removing active labels.