Bug 260139

Summary: WebKit doesn't follow UAX14 line breaking rules for EX AL case
Product: WebKit Reporter: Makoto Kato <m_kato>
Component: Text Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: brian, fantasai.bugs, karlcow, mmaxfield, simon.fraser, webkit-bug-importer, zalan
Priority: P2 Keywords: BrowserCompat, InRadar
Version: Safari 16   
Hardware: All   
OS: All   

Description Makoto Kato 2023-08-14 05:11:54 PDT
This is the same issue as https://bugs.chromium.org/p/chromium/issues/detail?id=1472702. I found it while Gecko is moving to UAX14 for its line breaking rules.

Step
====

1. Open data:text/html,<div%20style="width:1px">!ABC</div>

Result
======

"!ABC" is one line

EXPECTED RESULT
===============
"!" is first line, "ABC" is second line.

According to Unicode's UAX14 (https://www.unicode.org/reports/tr14/):
- LB2 Never break at the start of text.
- LB13 Do not break before ‘]’ or ‘!’ or ‘;’ or ‘/’, even after spaces.
- LB31 Break everywhere else.

So the string "!ABC" should be "EX ÷ AL × AL × AL" in the notation of https://www.unicode.org/reports/tr14/#Definitions. But Blink and WebKit seem to treat it as "EX × AL × AL × AL". This behavior does not conform to UAX14.
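
The reasoning above can be checked with a small sketch. This is only a toy model, not WebKit's or ICU's implementation: it knows two line break classes (EX for "!" and "?", AL for ASCII letters) and applies only LB13 and LB31 from the list above, plus LB28 ("do not break between alphabetics"), which is the UAX14 rule that keeps "ABC" together. LB2 holds implicitly because only positions between two characters are examined. The function names `line_break_class`, `render`, and `wrap_narrow` are made up for illustration.

```python
def line_break_class(ch: str) -> str:
    """Toy classifier covering only the classes needed for this report."""
    if ch in "!?":
        return "EX"
    if ch.isascii() and ch.isalpha():
        return "AL"
    raise ValueError(f"class of {ch!r} is not modelled in this sketch")


def break_marks(text: str) -> list[str]:
    """Return one mark per adjacent pair: '×' (no break) or '÷' (break allowed)."""
    classes = [line_break_class(c) for c in text]
    marks = []
    for before, after in zip(classes, classes[1:]):
        if after == "EX":
            marks.append("×")  # LB13: do not break before '!'
        elif before == "AL" and after == "AL":
            marks.append("×")  # LB28: do not break between alphabetics
        else:
            marks.append("÷")  # LB31: break everywhere else
    return marks


def render(text: str) -> str:
    """Format the result in the UAX14 notation, e.g. 'EX ÷ AL'."""
    if not text:
        return ""
    classes = [line_break_class(c) for c in text]
    out = classes[0]
    for mark, cls in zip(break_marks(text), classes[1:]):
        out += f" {mark} {cls}"
    return out


def wrap_narrow(text: str) -> list[str]:
    """Break at every opportunity, as a 1px-wide container would force."""
    if not text:
        return []
    lines = [text[0]]
    for mark, ch in zip(break_marks(text), text[1:]):
        if mark == "÷":
            lines.append(ch)
        else:
            lines[-1] += ch
    return lines


print(render("!ABC"))       # EX ÷ AL × AL × AL
print(wrap_narrow("!ABC"))  # ['!', 'ABC']
```

Under these assumptions the sketch produces the expected result from this report ("!" on the first line, "ABC" on the second), whereas the observed behavior corresponds to treating the first pair as "×".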

For 7-bit characters, WebKit uses its own line break table (https://searchfox.org/wubkat/rev/cd25edd92284ea5ea247483e66b404c0774949b2/Source/WebCore/rendering/BreakLines.cpp#53). But this table isn't compatible with UAX14, so the WebKit team should update it to match UAX14.
Comment 1 Radar WebKit Bug Importer 2023-08-21 05:12:12 PDT
<rdar://problem/114188466>