Bug 256716

Summary: DOMXPath- failing (fn-lang.html) due to U+212A handling
Product: WebKit Reporter: Ahmad Saleem <ahmad.saleem792>
Component: XML Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: annevk, ap, cdumez, darin, mmaxfield, rniwa, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: BrowserCompat, InRadar, WPTImpact
Version: Safari Technology Preview   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://github.com/whatwg/dom/issues/1199
Attachments:
Description Flags
CSS selector test none

Description Ahmad Saleem 2023-05-12 10:02:25 PDT
Hi Team,

While going through WPT test cases, I came across one sub-test failure:

Test Case - https://wpt.fyi/results/domxpath/fn-lang.html?label=experimental&label=master&aligned

Link - http://wpt.live/domxpath/fn-lang.html

___________

After researching a bit when this was added and against which bugs, it seems to be a U+212A handling issue at either the CSS or the DOM level.

Just raising this so we can fix it.

Thanks!
Comment 1 Alexey Proskuryakov 2023-05-14 12:05:37 PDT
// U+212A should match to ASCII 'k'.
// XPath 1.0 says:
// ... such that the attribute value is equal to the argument ignoring that suffix
// of the attribute value and ignoring case.
// XPath 3.1 says:
// ... true if and only if, based on a caseless default match as specified in
// section 3.13 of The Unicode Standard,
testFirstChild('lang("ko")', '<root><match xml:lang="&#x212A;o"/></root>');

-------

So this test says that U+212A KELVIN SIGN should match the letter "k" in the lang attribute, which seems weird, at least initially.

The Unicode Standard reference here is obsolete, but it is almost certainly about "caseless matching" in https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf, which indeed has this behavior. The rest of the Web platform does not, AFAIK. Notably, even Chrome doesn't have this behavior for CSS selector matching.

I think that this is a bug in the standard, and Chrome is internally inconsistent, so we shouldn't be following it. CC'ing some folks who've done more recent work in this area to weigh in.
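
For illustration, here is a minimal sketch of the difference between the two kinds of matching discussed above. This is not WebKit code, and the helper names are made up for this example. It uses String.prototype.toLowerCase() as a stand-in for Unicode caseless matching, which is only an approximation of the full case folding the Unicode Standard describes, but it is enough to show the KELVIN SIGN behavior:

```javascript
// Hypothetical helper: fold only ASCII A-Z to a-z and leave everything else alone.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 0x20));
}

// ASCII case-insensitive comparison, the kind used for language matching elsewhere.
function equalsAsciiCaseInsensitive(a, b) {
  return asciiLowercase(a) === asciiLowercase(b);
}

// Approximation of Unicode caseless matching via Unicode-aware lowercasing.
// Assumption: toLowerCase() is close enough to case folding for this demo.
function equalsUnicodeLowercased(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

const kelvinO = "\u212Ao"; // U+212A KELVIN SIGN followed by "o"

console.log(equalsUnicodeLowercased(kelvinO, "ko"));    // true: U+212A lowercases to "k"
console.log(equalsAsciiCaseInsensitive(kelvinO, "ko")); // false: U+212A is not ASCII
console.log(equalsAsciiCaseInsensitive("KO", "ko"));    // true
```

With the Unicode-aware comparison the test's expectation holds, while with the ASCII-only comparison it does not, which is the inconsistency described in this comment.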
Comment 2 Alexey Proskuryakov 2023-05-14 12:05:56 PDT
Created attachment 466349 [details]
CSS selector test
Comment 3 Anne van Kesteren 2023-05-14 23:47:22 PDT
Looking at the source of the test, it seems this was done based on an XPath 3.1 clarification, but indeed we use ASCII case-insensitive comparison for language matching across the web platform. Let's change this test and add a clarification to DOM/HTML with regard to what kind of matching the XPath lang() function uses.
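
As a rough sketch of what lang() matching could look like with ASCII case-insensitive comparison, following the XPath 1.0 wording quoted in comment 1 (equal ignoring case, or equal after ignoring a suffix that starts with "-"). The function names are hypothetical and not taken from any engine or spec text, and the sketch assumes the effective xml:lang value has already been resolved for the context node:

```javascript
// Hypothetical helper: fold only ASCII A-Z to a-z.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 0x20));
}

// Sketch of lang(arg) against an already-resolved xml:lang value.
function langMatches(xmlLangValue, arg) {
  const value = asciiLowercase(xmlLangValue);
  const wanted = asciiLowercase(arg);
  // Either an exact match, or a match after dropping a "-..." suffix.
  return value === wanted || value.startsWith(wanted + "-");
}

console.log(langMatches("KO-KR", "ko"));    // true: case folded, "-KR" suffix ignored
console.log(langMatches("kor", "ko"));      // false: "r" is not a "-" suffix
console.log(langMatches("\u212Ao", "ko"));  // false: KELVIN SIGN is not folded to "k"
```

Under this reading, the sub-test in fn-lang.html would expect no match for xml:lang="&#x212A;o".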
Comment 4 Radar WebKit Bug Importer 2023-05-19 10:03:20 PDT
<rdar://problem/109571692>
Comment 5 Ahmad Saleem 2024-03-29 16:36:09 PDT
Modifying the test upstream in WPT - https://github.com/web-platform-tests/wpt/pull/45436