Bug 258251 - REGRESSION(264918@main): GB18030 encoding isn't hooked up correctly
Summary: REGRESSION(264918@main): GB18030 encoding isn't hooked up correctly
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Alex Christensen
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2023-06-17 16:04 PDT by Myles C. Maxfield
Modified: 2023-06-29 15:38 PDT

See Also:


Attachments
encoder observation (885 bytes, text/html)
2023-06-17 16:06 PDT, Myles C. Maxfield
decoder observation (370 bytes, text/html)
2023-06-17 16:06 PDT, Myles C. Maxfield

Description Myles C. Maxfield 2023-06-17 16:04:59 PDT
We're encoding U+E78D as 0x83 0x36 0xCB 0x32, which seems totally wrong: it is neither 0xA6 0xD9 nor 0x84 0x31 0x82 0x36, the two sequences listed in L2/23-003R [1]. If we round-trip our byte sequence back to a code point, it decodes to U+E82E, which is a completely different PUA character.

[1] https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf
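
For reference, the four-byte arithmetic behind these sequences can be sketched as below. This is a minimal Python sketch of the four-byte pointer formula described in the WHATWG Encoding Standard's gb18030 decoder and encoder, not WebKit's actual code. The function names `four_byte_pointer` and `pointer_to_four_bytes` are hypothetical, and the ranges table that maps a pointer to a code point is deliberately left out.

```python
def four_byte_pointer(b1: int, b2: int, b3: int, b4: int) -> int:
    """Return the gb18030 four-byte pointer for the bytes b1..b4.

    Follows the pointer formula in the WHATWG Encoding Standard.
    The function name is hypothetical, chosen for this sketch.
    """
    # Four-byte sequences have the shape 81-FE, 30-39, 81-FE, 30-39.
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed gb18030 four-byte sequence")
    return ((b1 - 0x81) * (10 * 126 * 10)
            + (b2 - 0x30) * (10 * 126)
            + (b3 - 0x81) * 10
            + (b4 - 0x30))


def pointer_to_four_bytes(pointer: int) -> bytes:
    """Inverse of four_byte_pointer: split a pointer back into four bytes."""
    if pointer < 0:
        raise ValueError("pointer must be non-negative")
    b1, rest = divmod(pointer, 10 * 126 * 10)
    b2, rest = divmod(rest, 10 * 126)
    b3, b4 = divmod(rest, 10)
    if b1 > 0xFE - 0x81:
        raise ValueError("pointer is outside the four-byte range")
    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])


# The sequence WebKit produced for U+E78D, per the description above.
observed = four_byte_pointer(0x83, 0x36, 0xCB, 0x32)   # 33502
# The four-byte sequence listed in L2/23-003R.
expected = four_byte_pointer(0x84, 0x31, 0x82, 0x36)   # 39076

print(observed, expected, expected - observed)
print(pointer_to_four_bytes(expected).hex(" "))
```

The two pointers differ by several thousand positions, so the observed bytes are not a near miss of the listed four-byte sequence. Turning a pointer into a code point additionally requires the gb18030 ranges index, which this sketch leaves out.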
Comment 1 Radar WebKit Bug Importer 2023-06-17 16:05:16 PDT
<rdar://problem/110952885>
Comment 2 Myles C. Maxfield 2023-06-17 16:06:41 PDT
Created attachment 466741
encoder observation
Comment 3 Myles C. Maxfield 2023-06-17 16:06:50 PDT
Created attachment 466742
decoder observation
Comment 4 Alex Christensen 2023-06-29 10:32:28 PDT
Pull request: https://github.com/WebKit/WebKit/pull/15413
Comment 5 EWS 2023-06-29 15:38:17 PDT
Committed 265633@main (0a2e4563bda0): <https://commits.webkit.org/265633@main>

Reviewed commits have been landed. Closing PR #15413 and removing active labels.