| Summary: | REGRESSION (iOS 16.4 Public Beta): WebGL app jetsams quickly on iOS device | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Kurt Revis <krevis> |
| Component: | WebGL | Assignee: | Nobody <webkit-unassigned> |
| Status: | RESOLVED DUPLICATE | ||
| Severity: | Critical | CC: | dino, djg, hypertree, kbr, kkinnunen, kpiddington, krevis, mattwindwer |
| Priority: | P2 | ||
| Version: | Safari 16 | ||
| Hardware: | iPhone / iPad | ||
| OS: | iOS 16 | ||
| Attachments: | |||
|
Description
Kurt Revis
2023-03-01 10:46:33 PST
Created attachment 465244
Jetsam log from iPhone device
Created attachment 465245
Screen capture of simulator showing memory growth
Created attachment 465246
Output of `vmmap` of GPU process in simulator, 16.4 (bad)
Created attachment 465247
Output of `vmmap` of GPU process in simulator, 16.0 (good)
I'm attempting to bisect this with open-source WebKit, running in the simulator. It's difficult because normal event handling doesn't work, so I had to hack up the web site to simulate panning and zooming. Sorry, no public URL for that.

I'm not seeing the bug on ToT (e1dfe8ee), but it does happen in safari-7615.1.18-branch, which I'm guessing is the closest equivalent to what's in iOS 16.4.

This change looked potentially related: https://github.com/WebKit/WebKit/pull/10028 (ANGLE Metal program memory cache is unbounded). But reverting that from ToT didn't make the problem come back, so it's likely something else.

Also: on safari-7615.1.18-branch, the bug happens when running in either the 16.2 or 16.4 simulator. So this appears to be a WebKit regression in 16.4, not anything else in the system.

WebKit-7615.1.18.10: Bad
WebKit-7615.1.18.100.1: Bad
Current safari-7616.1.4-branch (9fb625): Good
WebKit-7616.1.1 through 7616.1.2.2: Bad, but not as bad. GPU memory grows, but not as fast.
WebKit-7616.1.3.1: Good.

That indicates that a fix has happened. I'm concerned about whether it will get submitted to iOS 16.4, though. If it is only going into 16.5+, then we'll have a lot of unhappy users for a few months.

Next step: find when this actually broke, in hopes of finding a workaround.

I can confirm this also happens to our web app (AvNav) in the 16.4 beta (we use WebGL for 3D vector maps). The app dies within a minute or two of use, out of memory in the WebAssembly module. This makes the beta unusable for us in spite of many great things like SIMD and the Home Screen App improvements. I can't imagine Safari shipping with such a bug, so let's hope it's fixed in the next beta. But it's very disappointing that this keeps happening: a ghastly bug that makes my device useless for testing. I may skip betas in future; not worth it.

Looks like the fix came with recent ANGLE changes.
Relevant recent commits on main:
404173 "Update ANGLE to 2023-02-08"
ab998e "ANGLE Metal program memory cache is unbounded"
b37b92 "Update ANGLE to 2023-02-14"
68fbc8 "Update ANGLE to 2023-02-20"

Before 404173, the situation is very bad -- the GPU process grows in memory very quickly.
404173 was an improvement but not a full fix. It's still bad but grows more slowly.
ab998e does not appear to affect this case.
b37b92 caused memory usage to become stable.
68fbc8 does not appear to affect this case.

This was broken by:

commit a24fae7ed823c34753ed685eec782fbddb20cd30
Author: Dan Glastonbury <djg@apple.com>
Date: Wed Nov 23 20:27:07 2022 -0800
[ANGLE] Update ANGLE to 2022-11-14 (85c98a92bb763452133bd7b4580d80625bb2c75d)
https://bugs.webkit.org/show_bug.cgi?id=248069
rdar://problem/102498441

There were 335 changes in ANGLE, 12 mentioning Metal.

Thanks for the investigation, sorry you had to go through it.

> Before 404173, the situation is very bad -- the GPU process grows in memory very quickly.

The original root cause was:
968041b54 Metal: Optimized BufferSubData per device
That contained two issues: a massive memory leak and a by-design memory leak.

> 404173 was an improvement but not a full fix. It's still bad but grows more slowly.

So here we fix the massive memory leak:
9a6c90c8f Reland "Metal: Avoid leaking buffers for GPU access for non-discrete"

> b37b92 caused memory usage to become stable.

And here we revert the original by-design memory leak:
ee64836f7 Revert "Metal: Optimized BufferSubData per device"

WebKit-7616.1.2.2 was built two weeks ago (Feb 17, 2023) and had the bug: https://github.com/WebKit/WebKit/commits/WebKit-7616.1.2.2
16.4 beta 1 was released on Thursday, Feb 16, and included the bug.
The first good build happened just 4 days later, on Feb 21, 2023: https://github.com/WebKit/WebKit/commits/WebKit-7616.1.3

`git log --oneline WebKit-7616.1.2.2..WebKit-7616.1.3` includes 355 commits, including the two critical ones Kimmo is referring to:

b37b92a67793 Update ANGLE to 2023-02-14 (2bcf94cc0b577225f7b925dd0cd1ed03541e30db)
https://bugs.webkit.org/show_bug.cgi?id=252237
rdar://problem/105447176
This includes this revert: ee64836f7 Revert "Metal: Optimized BufferSubData per device"

And this fix:
ab998e353f10 ANGLE Metal program memory cache is unbounded
https://bugs.webkit.org/show_bug.cgi?id=251915
rdar://105174119

Later tags 1.3.1 (Feb 23, 2023) and 1.3.2 (Feb 24, 2023) have just one unrelated fix and re-versioning.

16.4 beta 2 was released on Tuesday, Feb 28, and still has the bug, meaning it's not based off of WebKit-7616.1.3.

Which beta will include the 1.3.x tagged code? Hard to tell. But I see a March 1 commit (yesterday) to update the Safari UA to 16.4: https://github.com/WebKit/WebKit/commit/3c220716e9d6cba24dc20c174b43aceb94051934
So I suspect the 16.4 release will contain the fix. Let's hope it arrives as early as the next beta, next week.

Kimmo: Thanks for the info -- that was what I was guessing, but I hadn't verified the individual ANGLE commits yet. Looking forward to a fix in a later 16.4 build.

Marking this as a duplicate of the fix bug. I can let you know in this bug once the fix is available for testing.

*** This bug has been marked as a duplicate of bug 252237 ***

This appears to be much improved, and might even be totally fixed, in 16.4 Beta 3. Thanks for the fix.

It's hard to observe memory usage on the real device, but we're definitely not jetsamming instantly like we were in beta 1 and 2. Unfortunately the simulator runtime hasn't been updated for Beta 3, so we can't use that as a way to observe things, but we'll live. |
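The good/bad results reported in this thread can be tabulated to bound the range in which the fix landed. The following is only a minimal sketch: the helper names `version_key` and `find_transition` are hypothetical, the statuses are copied from the comments above (with `WebKit-7616.1.1` and `WebKit-7616.1.2.2` standing in for the reported range "7616.1.1 through 7616.1.2.2"), and it assumes that tags sort chronologically by their dotted numeric components, which is not guaranteed for branch builds.

```python
# Results reported in this thread, keyed by WebKit tag.
# "bad-slow" marks builds where GPU memory still grew, but more slowly.
RESULTS = {
    "WebKit-7615.1.18.10": "bad",
    "WebKit-7615.1.18.100.1": "bad",
    "WebKit-7616.1.1": "bad-slow",
    "WebKit-7616.1.2.2": "bad-slow",
    "WebKit-7616.1.3.1": "good",
}


def version_key(tag):
    """Turn 'WebKit-7616.1.2.2' into (7616, 1, 2, 2) for numeric ordering."""
    return tuple(int(part) for part in tag.split("-", 1)[1].split("."))


def find_transition(results):
    """Return (last_bad_tag, first_good_tag) in version order (assumed chronological)."""
    last_bad = None
    for tag in sorted(results, key=version_key):
        if results[tag] == "good":
            return last_bad, tag
        last_bad = tag
    return last_bad, None


last_bad, first_good = find_transition(RESULTS)
print(f"last bad: {last_bad}, first good: {first_good}")
```

The returned pair is the range worth inspecting commit by commit, for example with the `git log --oneline` command quoted in the thread.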